Hi, I've tried to summarize the discussion so far:

My proposal was to move the tokenized/binary/compressed bits from *.fdt (field values) to *.fnm (field definitions). That would make the intent of the code handling field attributes much clearer and reduce the complexity of the code (you'll find details in my first posting). As a tradeoff, one would lose the possibility of storing the tokenized/binary/compressed attributes of a field on a per-document basis; instead they would be stored as global attributes of a field.

The other consequences of this refactoring would be:
-- the binary format of *.fdt will change
-- simpler code for writing/reading field attributes

The following are not consequences:
-- you do not need to know all field definitions at the start; it would still be possible to add new fields to documents at any time
-- the handling of field norms will not change

There are some proposals which go even further:
1. make the field infos file (*.fnm) a single per-index file
2. make the field infos file human-readable
3. optimize merging of 1-document segments (http://issues.apache.org/jira/browse/LUCENE-211)

While 3. is a completely different topic, the first two may be worth discussing. Concerning the second point, I'm personally reluctant, because it opens the discussion of which format to choose, and those discussions too often end in choosing XML, which would require the whole XML bloat to be linked into any C library implementing Lucene.

Robert
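
P.S. A rough sketch of what keeping the attribute bits once per field definition could look like. The class name FieldDef, the bit values, and the "name plus one flags byte" layout are all made up for illustration; this is not the actual *.fnm format, only one possible shape of the idea.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical field definition: attributes are stored once per field, not per document. */
final class FieldDef {
    // Assumed bit layout, chosen for this sketch only.
    static final int TOKENIZED  = 0x01;
    static final int BINARY     = 0x02;
    static final int COMPRESSED = 0x04;

    final String name;
    final int flags;

    FieldDef(String name, int flags) {
        this.name = name;
        this.flags = flags;
    }

    boolean isTokenized()  { return (flags & TOKENIZED) != 0; }
    boolean isBinary()     { return (flags & BINARY) != 0; }
    boolean isCompressed() { return (flags & COMPRESSED) != 0; }

    /** Writes the field name followed by a single flags byte. */
    void write(DataOutputStream out) throws IOException {
        out.writeUTF(name);
        out.writeByte(flags);
    }

    /** Reads back what write() produced. */
    static FieldDef read(DataInputStream in) throws IOException {
        String name = in.readUTF();
        int flags = in.readUnsignedByte();
        return new FieldDef(name, flags);
    }

    /** Serializes a definition to memory and parses it again. */
    static FieldDef roundTrip(FieldDef def) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                def.write(out);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return read(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, the reader of the field values would no longer decode flags per stored value; it would look up the field's definition once and ask isBinary() or isCompressed() there. New fields could still be appended to the definitions at any time.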