Hi,

My question is:

Is there a reason not to store all field attributes in one place (*.fnm)?

A Lucene Field has these attributes:

  private boolean storeTermVector = false;
  private boolean storeOffsetWithTermVector = false; 
  private boolean storePositionWithTermVector = false;
  private boolean omitNorms = false;
  private boolean isStored = false;
  private boolean isIndexed = true;
  private boolean isTokenized = true;
  private boolean isBinary = false;
  private boolean isCompressed = false;

Some of them are stored as a one-byte bit mask
in the field infos file (*.fnm):

isIndexed  (IS_INDEXED)
storeTermVector  (STORE_TERMVECTOR)
storePositionWithTermVector  (STORE_POSITIONS_WITH_TERMVECTOR)
storeOffsetWithTermVector (STORE_OFFSET_WITH_TERMVECTOR)
omitNorms  (OMIT_NORMS)
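
To illustrate what I mean by a one-byte bit mask, here is a rough sketch.
The class name FnmBits and the concrete bit values are assumptions made
for this example; I have not copied them from the Lucene source.

```java
final class FnmBits {
    // Hypothetical bit values, chosen for illustration only.
    static final byte IS_INDEXED = 0x1;
    static final byte STORE_TERMVECTOR = 0x2;
    static final byte STORE_POSITIONS_WITH_TERMVECTOR = 0x4;
    static final byte STORE_OFFSET_WITH_TERMVECTOR = 0x8;
    static final byte OMIT_NORMS = 0x10;

    // Pack the five per-field flags into a single byte.
    static byte pack(boolean isIndexed,
                     boolean storeTermVector,
                     boolean storePositionWithTermVector,
                     boolean storeOffsetWithTermVector,
                     boolean omitNorms) {
        byte bits = 0;
        if (isIndexed) bits |= IS_INDEXED;
        if (storeTermVector) bits |= STORE_TERMVECTOR;
        if (storePositionWithTermVector) bits |= STORE_POSITIONS_WITH_TERMVECTOR;
        if (storeOffsetWithTermVector) bits |= STORE_OFFSET_WITH_TERMVECTOR;
        if (omitNorms) bits |= OMIT_NORMS;
        return bits;
    }

    // Test whether a given flag is set in a packed byte.
    static boolean isSet(byte bits, byte flag) {
        return (bits & flag) != 0;
    }
}
```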

Other attributes are stored as a bit mask in the data file (*.fdt):

tokenized
binary
compressed


All of the attributes could be stored together as a variable-length bit field (like a VInt).
Then you would have all pieces of information about fields in one place:

binary format -> *.fnm
java object -> FieldInfo
read/write -> FieldInfos
bit mask definitions -> Field or FieldInfo

The nine booleans in Field could be replaced by a single bit mask (int).
Storing the attributes as a variable-length bit mask in *.fnm
would also keep the binary format intact if more attributes are added in the future.
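
A minimal sketch of the idea, not a patch: the class FieldAttributes, the
bit assignments and the method names below are all hypothetical. The mask
is written VInt-style, seven bits per byte with the low-order bits first,
and the high bit of each byte signals that another byte follows.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class FieldAttributes {
    // Hypothetical bit assignments covering all nine attributes.
    static final int IS_INDEXED = 1 << 0;
    static final int STORE_TERMVECTOR = 1 << 1;
    static final int STORE_POSITIONS_WITH_TERMVECTOR = 1 << 2;
    static final int STORE_OFFSET_WITH_TERMVECTOR = 1 << 3;
    static final int OMIT_NORMS = 1 << 4;
    static final int IS_STORED = 1 << 5;
    static final int IS_TOKENIZED = 1 << 6;
    static final int IS_BINARY = 1 << 7;
    static final int IS_COMPRESSED = 1 << 8;

    // Write the mask with a variable number of bytes (VInt-style).
    static void writeMask(OutputStream out, int mask) throws IOException {
        while ((mask & ~0x7F) != 0) {
            out.write((mask & 0x7F) | 0x80); // low 7 bits, continuation bit set
            mask >>>= 7;
        }
        out.write(mask); // last byte, continuation bit clear
    }

    // Read a mask written by writeMask.
    static int readMask(InputStream in) throws IOException {
        int mask = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside attribute mask");
            }
            mask |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return mask;
    }

    // Replacement for the individual boolean getters.
    static boolean has(int mask, int flag) {
        return (mask & flag) != 0;
    }
}
```

With these assignments, a mask that only uses the low seven bits still
takes a single byte on disk, and a second byte appears only once a higher
bit such as IS_BINARY or IS_COMPRESSED is set.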

I feel like it would make the code clearer and easier to understand.

Any comments?

Robert






