Re: possible segment merge improvement?

Michael McCandless Thu, 01 Nov 2007 02:05:27 -0800

"robert engels" <[EMAIL PROTECTED]> wrote:

> Why not check the fields dictionary for the segments being merged,
> and if the same, just copy the binary data directly?


+1

Although Lucene, unlike e.g. KinoSearch, has no global field
schema/semantics, I think for many apps the fields are in fact static.

In KinoSearch, merging of stored fields & term vectors is always a
fast concatenation of the entry for each document, whereas Lucene
must, in general, re-interpret/re-number all fields on the doc.  In
fact I think that KinoSearch stores field names directly in the index
(i.e., not field numbers).

If we make this change to Lucene then for those apps that effectively
have a static field schema (because all docs always have matching
fields), we can get the same performance that KinoSearch always gets
during its merging of stored fields & term vectors.  For all other
apps we must continue to re-interpret each field on each document.
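
To make the idea concrete, here is a rough sketch of the check and the two
merge paths. It is not Lucene's merge code: SegmentMergeSketch, Segment,
StoredField, sameFieldNumbering and merge are all hypothetical names, and
a "segment" is modelled as a plain in-memory list instead of index files.
It assumes at least one segment is passed in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model for illustration; none of these types are Lucene classes.
final class SegmentMergeSketch {

    /** One stored field: a segment-local field number and its value. */
    static final class StoredField {
        final int fieldNumber;
        final String value;

        StoredField(int fieldNumber, String value) {
            this.fieldNumber = fieldNumber;
            this.value = value;
        }
    }

    /** A segment: fieldNames.get(n) names field number n; each doc is a list of stored fields. */
    static final class Segment {
        final List<String> fieldNames;
        final List<List<StoredField>> docs;

        Segment(List<String> fieldNames, List<List<StoredField>> docs) {
            this.fieldNames = fieldNames;
            this.docs = docs;
        }
    }

    /** True when all segments map the same field names to the same numbers. */
    static boolean sameFieldNumbering(List<Segment> segments) {
        List<String> first = segments.get(0).fieldNames; // assumes a non-empty list
        for (Segment s : segments) {
            if (!s.fieldNames.equals(first)) {
                return false;
            }
        }
        return true;
    }

    static Segment merge(List<Segment> segments) {
        List<List<StoredField>> mergedDocs = new ArrayList<>();
        if (sameFieldNumbering(segments)) {
            // Fast path: field numbers already agree, so documents are appended untouched.
            for (Segment s : segments) {
                mergedDocs.addAll(s.docs);
            }
            return new Segment(new ArrayList<>(segments.get(0).fieldNames), mergedDocs);
        }
        // Slow path: build a merged name table, then re-number every field of every document.
        List<String> mergedNames = new ArrayList<>();
        for (Segment s : segments) {
            int[] remap = new int[s.fieldNames.size()];
            for (int old = 0; old < remap.length; old++) {
                String name = s.fieldNames.get(old);
                int merged = mergedNames.indexOf(name);
                if (merged < 0) {
                    merged = mergedNames.size();
                    mergedNames.add(name);
                }
                remap[old] = merged;
            }
            for (List<StoredField> doc : s.docs) {
                List<StoredField> rewritten = new ArrayList<>();
                for (StoredField f : doc) {
                    rewritten.add(new StoredField(remap[f.fieldNumber], f.value));
                }
                mergedDocs.add(rewritten);
            }
        }
        return new Segment(mergedNames, mergedDocs);
    }
}
```

In a real index the fast path would be a raw byte copy of each document's
entry; appending the untouched lists stands in for that here. The slow path
is the per-field re-numbering that apps without a static schema would
still pay for.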

Mike
