Currently, when merging segments, every document is parsed and then
rewritten, since the field numbers may differ between the segments
(compressed data is not uncompressed in the latest versions).
It would seem that in many (if not most) uses of Lucene, the set of
fields stored within each document in an index is relatively static,
probably changing for all documents added after some point X, if at all.
Why not compare the field dictionaries of the segments being merged
and, if they are the same, just copy the binary data directly?
In the common case this should be a vast improvement.
Has anyone worked on anything like this? Am I missing something?
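
To make the idea concrete, here is a rough sketch in Java. It does not
use any of Lucene's actual classes or its real stored-fields format:
Segment, sameFieldNumbering, mergedFieldNames, renumber and mergeStored
are hypothetical names, and the document encoding (repeated
[fieldNumber, length, payload] records) is a toy one invented for the
example. It only shows the shape of the check: if all segments number
their fields identically, pass the bytes through untouched; otherwise
fall back to rewriting the field numbers.

```java
import java.util.ArrayList;
import java.util.List;

public class RawMergeSketch {

    // Hypothetical stand-in for a segment. fieldNames.get(n) is the name of
    // field number n; each stored doc is a toy encoding of repeated
    // [fieldNumber, length, payload...] records (NOT Lucene's real format).
    static final class Segment {
        final List<String> fieldNames;
        final List<byte[]> docs;

        Segment(List<String> fieldNames, List<byte[]> docs) {
            this.fieldNames = fieldNames;
            this.docs = docs;
        }
    }

    // True when every segment assigns the same numbers to the same names.
    static boolean sameFieldNumbering(List<Segment> segments) {
        for (Segment s : segments) {
            if (!s.fieldNames.equals(segments.get(0).fieldNames)) {
                return false;
            }
        }
        return true;
    }

    // Merged dictionary: field names in order of first appearance.
    static List<String> mergedFieldNames(List<Segment> segments) {
        List<String> merged = new ArrayList<>();
        for (Segment s : segments) {
            for (String name : s.fieldNames) {
                if (!merged.contains(name)) {
                    merged.add(name);
                }
            }
        }
        return merged;
    }

    // Slow path: walk the records and rewrite each field number so it
    // refers to the merged dictionary instead of the segment's own.
    static byte[] renumber(byte[] doc, List<String> from, List<String> to) {
        byte[] out = doc.clone();
        int pos = 0;
        while (pos < out.length) {
            out[pos] = (byte) to.indexOf(from.get(out[pos]));
            pos += 2 + out[pos + 1]; // skip number, length and payload
        }
        return out;
    }

    // Fast path when the dictionaries match: the original byte arrays are
    // passed through without being parsed at all.
    static List<byte[]> mergeStored(List<Segment> segments) {
        boolean raw = sameFieldNumbering(segments);
        List<String> merged = mergedFieldNames(segments);
        List<byte[]> out = new ArrayList<>();
        for (Segment s : segments) {
            for (byte[] doc : s.docs) {
                out.add(raw ? doc : renumber(doc, s.fieldNames, merged));
            }
        }
        return out;
    }
}
```

In a real merge the fast path would presumably be a bulk copy between
the stored-fields files rather than a list of arrays, but the decision
itself is just the one comparison up front.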
Robert Engels