Yeah, I've heard rumblings about this issue before. I can't remember what patch changed it though - one of Mike M's I think?
On Tue, Jun 30, 2009 at 8:40 PM, Chris Hostetter <hossman_luc...@fucit.org>wrote: > > Hmmm... i'm not an expert on the internals of indexing, and i don't use > FieldSelectors much, but this seems like a pretty big bug to me ... or at > the very least: a change in behavior that completely eliminates the value > of LOAD_AND_BREAK. > > https://issues.apache.org/jira/browse/LUCENE-1727 > > > > : The Lucene FAQ says... > : > : What is the order of fields returned by Document.fields()? > : * Fields are returned in the same order they were added to the document. > : (now getFields() as fields is deprecated) > : > : However I think this may no longer be the case in 2.4 > : > : We are indexing documents in a specific order so that we can > LOAD_AND_BREAK out of our FieldSelector as early as possible. > : i.e. we have typically 50 indexed fields for a document, but when we are > loading results with .doc(), we know we only need 4 of them. > : > : So, our code ensures that these are added to the index first - and once > the 4th field is loaded we break out of the selector. > : > : This speeds us up by an order of magnitude. > : > : > : > : However, we are finding that our field selector is processing fields in > alphabetical order, not order of addition. This means that we'd have to > rename our fields to 'aaa..' in order to guarantee they'd be processed > first. > : > : > : I think, but am not sure, that this bit of code causes the problem (as > spotted in > http://www.mail-archive.com/java-user@lucene.apache.org/msg24105.html). > : It seems to have been introduced in version 2.4 (fields are in addition > order in 2.3.2) > : > : DocFieldProcessorPerThread.java: > : > : // If we are writing vectors then we must visit > : // fields in sorted order so they are written in > : // sorted order. TODO: we actually only need to > : // sort the subset of fields that have vectors > : // enabled; we could save [small amount of] CPU > : // here. > : quickSort(fields, 0, fieldCount-1); > : > : > : This appears to sort fields into alphabetical order. > : > : Assuming that implementing the TODO would keep them in order of addition > (and just keep vectors fields themselves sorted) - is it worth raising a > JIRA to fix this ? > : > : > : regards, > : > : matt > : > : > : > : > : _________________________________________________________________ > : Get the best of MSN on your mobile > : http://clk.atdmt.com/UKM/go/147991039/direct/01/ > > > > -Hoss > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >