[ https://issues.apache.org/jira/browse/LUCENE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erick Erickson resolved LUCENE-1372.
------------------------------------
Resolution: Won't Fix
SPRING_CLEANING_2013 JIRAS
Defining the sort order on MV (multi-valued) fields has always seemed like one
of those features that is more trouble than it's worth. One can define a
predictable order, but its usefulness to the user is questionable.
> Proposal: introduce more sensible sorting when a doc has multiple values for
> a term
> -----------------------------------------------------------------------------------
>
> Key: LUCENE-1372
> URL: https://issues.apache.org/jira/browse/LUCENE-1372
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 2.3.2
> Reporter: Paul Cowan
> Priority: Minor
> Attachments: LUCENE-1372-MultiValueSorters.patch,
> lucene-multisort.patch
>
>
> At the moment, FieldCacheImpl produces somewhat disconcerting results when
> sorting on a field for which multiple values exist for one document. For
> example, imagine a field "fruit" which is added to a document multiple times,
> with the values as follows:
> doc 1: {"apple"}
> doc 2: {"banana"}
> doc 3: {"apple", "banana"}
> doc 4: {"apple", "zebra"}
> If one sorts on the field "fruit", the loop in
> FieldCacheImpl.stringsIndexCache.createValue() (and similarly for the other
> methods in the various FieldCacheImpl caches) does the following:
>   while (termDocs.next()) {
>     retArray[termDocs.doc()] = t;
>   }
> which means that we loop over the terms in their natural order and, for each
> one, overwrite retArray[doc] with that term for every document containing it.
> Effectively, this overwriting means that a string sort in this circumstance
> will sort by the lexicographically LAST term, so the docs above will
> effectively be sorted as if they had the single values ("apple", "banana",
> "banana", "zebra"), which is nonintuitive. Changing this to sort on the first
> term in the TermEnum seems relatively trivial and low-overhead; while it's
> not perfect (it's not locale-aware, for example), the behaviour seems much more
> sensible to me. Interested to see what people think.
> Patch to follow.
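
The difference between the two behaviours described in the issue can be
illustrated with a small standalone sketch. It does not use Lucene's API: the
sorted map below is only a stand-in for iterating a TermEnum in natural order,
and the class and method names (MultiValueSortDemo, lastTermWins,
firstTermWins) are hypothetical. It is not the attached patch, just one way the
proposed change might look.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class; simulates the cache-filling loop without Lucene.
class MultiValueSortDemo {

    // Stand-in for the term dictionary: terms in natural order, each mapped
    // to the ids of the documents that contain it (the "fruit" example).
    static SortedMap<String, int[]> index() {
        SortedMap<String, int[]> terms = new TreeMap<>();
        terms.put("apple", new int[] {1, 3, 4});
        terms.put("banana", new int[] {2, 3});
        terms.put("zebra", new int[] {4});
        return terms;
    }

    // Behaviour described in the issue: each term overwrites earlier ones,
    // so a document ends up with its lexicographically last term.
    static String[] lastTermWins(SortedMap<String, int[]> terms, int maxDoc) {
        String[] retArray = new String[maxDoc];
        for (Map.Entry<String, int[]> e : terms.entrySet()) {
            for (int doc : e.getValue()) {
                retArray[doc] = e.getKey();
            }
        }
        return retArray;
    }

    // Proposed behaviour: keep the first term seen for each document.
    static String[] firstTermWins(SortedMap<String, int[]> terms, int maxDoc) {
        String[] retArray = new String[maxDoc];
        for (Map.Entry<String, int[]> e : terms.entrySet()) {
            for (int doc : e.getValue()) {
                if (retArray[doc] == null) {
                    retArray[doc] = e.getKey();
                }
            }
        }
        return retArray;
    }

    public static void main(String[] args) {
        // Slot 0 is unused because the example's doc ids start at 1.
        System.out.println(Arrays.toString(lastTermWins(index(), 5)));
        System.out.println(Arrays.toString(firstTermWins(index(), 5)));
    }
}
```

With the four example documents, the overwriting loop yields the sort keys
apple, banana, banana, zebra for docs 1 to 4, while the first-term variant
yields apple, banana, apple, apple. The only cost of the variant is one null
check per posting.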