Sort of (if I understand you).

Eventually the segments (after merging) converge to having the same fields in the same order.

New segments are mostly merged only with other new segments (which probably have the same fields).

When a "newer" segment is merged with an "older" one, you will not be able to optimize the process (some complex change/mapping code might be able to do a better job than the current brute-force read all / write all method).

If the fields were always kept sorted, you would have a better chance of the field dictionaries of the various segments matching up.
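
To make that concrete, here is a rough sketch of the idea. It is not Lucene's actual FieldInfos code: the class FieldNumbering and its methods are hypothetical and only model how field names get mapped to field numbers. It compares numbering by first-encounter order with numbering by sorted field name, for two segments that saw the same fields in a different order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only; this is NOT Lucene's FieldInfos.
// It models nothing but the assignment of field numbers to field names.
final class FieldNumbering {

    // Number fields in the order they are first encountered in a segment.
    static Map<String, Integer> byEncounterOrder(List<String> fieldsSeen) {
        Map<String, Integer> numbers = new LinkedHashMap<>();
        for (String name : fieldsSeen) {
            // The next free number equals the current map size.
            numbers.putIfAbsent(name, numbers.size());
        }
        return numbers;
    }

    // Number fields by sorted field name, ignoring the encounter order.
    static Map<String, Integer> bySortedName(List<String> fieldsSeen) {
        Map<String, Integer> numbers = new LinkedHashMap<>();
        for (String name : new TreeSet<>(fieldsSeen)) {
            numbers.put(name, numbers.size());
        }
        return numbers;
    }

    // Assumption of this sketch: two segments can be merged "as the same"
    // only when their name-to-number mappings are identical.
    static boolean sameNumbering(Map<String, Integer> a, Map<String, Integer> b) {
        return a.equals(b);
    }
}
```

With one segment that saw title, body, date and another that saw date, title, body, the encounter-order numberings differ, while the sorted numberings are identical. Note that sorting only helps when the segments contain the same set of fields; a segment that is missing a field would still get different numbers for every field that sorts after the gap.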

At least for us, our fields dictionaries are VERY static, and constant across all documents (we partition different document types into separate indexes), so this optimization is a big help.


On Nov 2, 2007, at 1:40 PM, Yonik Seeley wrote:

On 11/1/07, robert engels <[EMAIL PROTECTED]> wrote:
I have looked into modifying FieldInfos to keep the fields sorted by
field name, so the user would not be forced to add the fields in the
same order.

Sparse documents are really not a problem, since after the first
merge of that document it will pick up the other fields from the other
segments, after which it will merge "as the same".

Only when the field numbers happen to match up though, right?
There could be number mismatches far after the first merge, depending
on what fields were encountered first in those segments.

Aside: renumbering fields is another area where using byte counts
instead of char counts should really speed things up.

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


