On Dec 16, 2008, at 10:00 AM, Patrick Johnstone wrote:


I'm using Lucene via Solr and recently upgraded from an early-summer nightly build to the released version of Solr 1.3 (which seems to use something in the neighborhood of Lucene 2.3). I'm posting this here because I believe that my issue is with Lucene, not Solr.

Do you know which version of Lucene was in that earlier build of Solr (the one from this summer)? If you open up the JAR, it should be in the Manifest. With that info, I can go back and look at that revision of Lucene. I'm guessing that it was at least 2.3 as well, but I'm not sure.



After the upgrade, I noticed that the order of fields being returned for documents had changed. Previously, the order of fields being returned was the same as the order in which they were added to the document (which is what's stated in the FAQ and other places I came across, but not specifically spelled out in the Javadoc). Now, the fields always seem to come back in lexicographic order by field name.

I would agree these are contradictory. I've always understood the contract to be that fields are returned in the order they were added. Still, it doesn't seem like something I would rely on.




I believe (but am by no means sure) that this is being caused by the
following bit of code in DocFieldProcessorPerThread.java:

   // If we are writing vectors then we must visit
   // fields in sorted order so they are written in
   // sorted order.  TODO: we actually only need to
   // sort the subset of fields that have vectors
   // enabled; we could save [small amount of] CPU
   // here.
   quickSort(fields, 0, fieldCount-1);

   for(int i=0;i<fieldCount;i++)
     fields[i].consumer.processFields(fields[i].fields, fields[i].fieldCount);

This code seems to sort the fields of the document into order before processing them. If this is true, then the original field order is lost and can't ever be recovered. (None of the fields in our index use TermVectors.)

My questions are:  Is my reading of this behavior correct?  ...and...

I believe it is, but it would be good to have Mike comment, as he wrote this code.



if so:  Is there some other way to get the original order back?

Not sure yet. There seems to be a need for the sort when writing the term vectors.



The application that I'm building took some advantage of the fact that the fields were returned in the original order (because the order had some meaning) and it may be difficult for me to work around this change.


Can you describe your use case a bit more? Perhaps we can brainstorm some alternatives, just to give you options.
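
One alternative, just as a rough sketch and not something Lucene or Solr does for you: record the order of the field names yourself in one extra stored field at index time, then use that value to reorder the fields after retrieval. The sketch below uses plain Java collections rather than the Lucene API. The class name, the "_field_order" field name, and both helper methods are made up for illustration, and it assumes each field name appears once per document and contains no commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FieldOrderSketch {
    // Hypothetical name of an extra stored field that holds the original order.
    static final String ORDER_FIELD = "_field_order";

    // Index time: join the field names in the order they were added.
    // The result would be stored as the value of ORDER_FIELD.
    static String encodeOrder(List<String> namesInAddOrder) {
        return String.join(",", namesInAddOrder);
    }

    // Retrieval time: put the returned field names back into the recorded order.
    static List<String> restoreOrder(List<String> returnedNames, String encodedOrder) {
        List<String> restored = new ArrayList<>();
        for (String name : encodedOrder.split(",")) {
            if (returnedNames.contains(name)) {
                restored.add(name);
            }
        }
        // Fields that were never recorded keep their returned order at the end.
        for (String name : returnedNames) {
            if (!restored.contains(name)) {
                restored.add(name);
            }
        }
        return restored;
    }

    public static void main(String[] args) {
        List<String> added = Arrays.asList("title", "author", "body");
        String encoded = encodeOrder(added);

        // Simulates the fields coming back sorted by name.
        List<String> returned = Arrays.asList("author", "body", "title");

        System.out.println(restoreOrder(returned, encoded)); // prints [title, author, body]
    }
}
```

This costs one small stored field per document and needs a reindex, so whether it is workable depends on your use case.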

-Grant

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ










