[
https://issues.apache.org/jira/browse/LUCENE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628513#action_12628513
]
Paul Smith commented on LUCENE-1372:
------------------------------------
bq. I'm not following this argument. Will it be less silly when {zebra,apple}
sorts before {banana} ?
Well, at the presentation layer I don't think you'd present it like that (we
don't). We'd sort the list of attributes so that it would appear as
"apple,zebra".
> Proposal: introduce more sensible sorting when a doc has multiple values for
> a term
> -----------------------------------------------------------------------------------
>
> Key: LUCENE-1372
> URL: https://issues.apache.org/jira/browse/LUCENE-1372
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Affects Versions: 2.3.2
> Reporter: Paul Cowan
> Priority: Minor
> Attachments: lucene-multisort.patch
>
>
> At the moment, FieldCacheImpl has somewhat disconcerting values when sorting
> on a field for which multiple values exist for one document. For example,
> imagine a field "fruit" which is added to a document multiple times, with the
> values as follows:
> doc 1: {"apple"}
> doc 2: {"banana"}
> doc 3: {"apple", "banana"}
> doc 4: {"apple", "zebra"}
> If one sorts on the field "fruit", the loop in
> FieldCacheImpl.stringsIndexCache.createValue() (and similarly for the other
> methods in the various FieldCacheImpl caches) does the following:
>   while (termDocs.next()) {
>     retArray[termDocs.doc()] = t;
>   }
> which means that we look over the terms in their natural order and, on each
> one, overwrite retArray[doc] with the value for each document with that term.
> Effectively, this overwriting means that a string sort in this circumstance
> will sort by the LAST term lexicographically, so the docs above will
> effectively be sorted as if they had the single values ("apple", "banana",
> "banana", "zebra"), which is nonintuitive. To change this to sort on the first
> term in the TermEnum seems relatively trivial and low-overhead; while it's
> not perfect (it's not locale-aware, for example) the behaviour seems much more
> sensible to me. Interested to see what people think.
> Patch to follow.
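
For readers following along, here is a minimal, self-contained sketch of the
difference described above. It does not use Lucene at all and is not the
attached patch: the class name, the method names, and the TreeMap standing in
for the TermEnum/TermDocs iteration are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MultiValueSortSketch {

    // Hypothetical stand-in for the index: term -> ids of the docs containing it.
    // A TreeMap iterates its keys in natural (lexicographic) order, which mimics
    // walking the terms of a field in order.
    static TreeMap<String, List<Integer>> exampleIndex() {
        TreeMap<String, List<Integer>> index = new TreeMap<>();
        index.put("apple", Arrays.asList(1, 3, 4));
        index.put("banana", Arrays.asList(2, 3));
        index.put("zebra", Arrays.asList(4));
        return index;
    }

    // Behaviour described in the issue: each later term overwrites the slot,
    // so a doc ends up with its lexicographically LAST term.
    static String[] lastTermWins(TreeMap<String, List<Integer>> index, int maxDoc) {
        String[] retArray = new String[maxDoc];
        for (Map.Entry<String, List<Integer>> e : index.entrySet()) {
            for (int doc : e.getValue()) {
                retArray[doc] = e.getKey();
            }
        }
        return retArray;
    }

    // One possible way to get the proposed behaviour (an assumption, not the
    // patch): only fill a slot that is still empty, so the FIRST term is kept.
    static String[] firstTermWins(TreeMap<String, List<Integer>> index, int maxDoc) {
        String[] retArray = new String[maxDoc];
        for (Map.Entry<String, List<Integer>> e : index.entrySet()) {
            for (int doc : e.getValue()) {
                if (retArray[doc] == null) {
                    retArray[doc] = e.getKey();
                }
            }
        }
        return retArray;
    }

    public static void main(String[] args) {
        TreeMap<String, List<Integer>> index = exampleIndex();
        int maxDoc = 5; // doc ids 1..4 from the issue; slot 0 is unused
        System.out.println(Arrays.toString(lastTermWins(index, maxDoc)));
        // prints [null, apple, banana, banana, zebra]
        System.out.println(Arrays.toString(firstTermWins(index, maxDoc)));
        // prints [null, apple, banana, apple, apple]
    }
}
```

With the first-term variant, doc 4 ({"apple", "zebra"}) sorts alongside the
other "apple" docs instead of after doc 2 ({"banana"}), which matches what a
presentation layer showing "apple,zebra" would lead a user to expect.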