[
https://issues.apache.org/jira/browse/LUCENE-4622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542947#comment-13542947
]
Michael McCandless commented on LUCENE-4622:
--------------------------------------------
I think the limitation here is I cannot specify 2 sort fields (like I can in
Lucene), right? I.e., I would sort by count (descending), then by label
(ascending), and then the top K selection and sorting of the final top K facets
would be "correct". And a second limitation is that "sort by label" can be
costly in general because you'd have to resolve each ord -> label whenever the
primary sort was equal.
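To make that cost concrete, here is a minimal sketch of a top-K selection
that sorts by count (descending), then by label (ascending), and resolves
labels only when two counts tie. It is not the TopKFacetsResultHandler code:
the class name, the counts/labels arrays standing in for the taxonomy, and
the labelLookups counter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only; none of these names come from the Lucene facet module.
class TopKByCountThenLabel {
    // Counts ord -> label resolutions so the extra work is visible.
    static int labelLookups = 0;

    // Stand-in for a taxonomy lookup (assumption: labels[ord] is the label).
    static String resolve(String[] labels, int ord) {
        labelLookups++;
        return labels[ord];
    }

    // Returns the ords of the top k entries, best first.
    static List<Integer> topK(int[] counts, String[] labels, int k) {
        // The queue head is the weakest entry kept so far: lowest count,
        // and among equal counts the greatest label.
        PriorityQueue<Integer> pq = new PriorityQueue<>((a, b) -> {
            if (counts[a] != counts[b]) {
                return Integer.compare(counts[a], counts[b]);
            }
            // Tie on count: this is where the label lookups are paid for.
            return resolve(labels, b).compareTo(resolve(labels, a));
        });
        for (int ord = 0; ord < counts.length; ord++) {
            pq.add(ord);
            if (pq.size() > k) {
                pq.poll(); // drop the weakest entry
            }
        }
        List<Integer> result = new ArrayList<>();
        while (!pq.isEmpty()) {
            result.add(0, pq.poll()); // weakest comes out first, so prepend
        }
        return result;
    }
}
```

With the four labels from the issue (Lisa, Frank, Susan, Bob in indexing
order, counts 2, 1, 1, 1) and k = 3, this keeps Lisa, Bob, Frank, whereas a
tie-break by ord would keep Lisa, Frank, Susan. Every comparison between two
count-1 entries costs two label lookups, which is what gets expensive when
many labels share the same count.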
It's true Lucene tie breaks by docID, but then the app can specify multiple
sort fields so that a dup result really in fact looks like a dup to the user as
well and then the tie-break doesn't matter much.
Anyway, given that the app can just sort after-the-fact, and given the cost of
sorting-by-label, I think we shouldn't fix this for now ... we can revisit if
the issue ever arises in a real app.
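For reference, sorting after the fact on the app side could look roughly
like the sketch below. LabelAndCount and FacetResort are hypothetical names,
not classes from the facet module; the app would fill the list from whatever
the facet results give it. Note that this only reorders the K facets that
were already selected, so it fixes the display order but not which tied
facets made it into the top K.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only; not part of the Lucene facet API.
class FacetResort {
    // Hypothetical holder for one facet child as the app sees it.
    static final class LabelAndCount {
        final String label;
        final int count;

        LabelAndCount(String label, int count) {
            this.label = label;
            this.count = count;
        }
    }

    // Count descending, then label ascending.
    static final Comparator<LabelAndCount> BY_COUNT_THEN_LABEL =
        Comparator.comparingInt((LabelAndCount f) -> f.count).reversed()
                  .thenComparing(f -> f.label);

    // Returns a re-sorted copy; the input list is left untouched.
    static List<LabelAndCount> resort(List<LabelAndCount> topK) {
        List<LabelAndCount> sorted = new ArrayList<>(topK);
        sorted.sort(BY_COUNT_THEN_LABEL);
        return sorted;
    }
}
```

Applied to the children in the issue's example, this yields Lisa (2),
Bob (1), Frank (1), Susan (1). The labels are already resolved for the top K
at this point, so the sort adds no extra ord -> label lookups.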
> TopKFacetsResultHandler should tie break sort by label not ord?
> ---------------------------------------------------------------
>
> Key: LUCENE-4622
> URL: https://issues.apache.org/jira/browse/LUCENE-4622
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/facet
> Reporter: Michael McCandless
>
> EG I now get these facets:
> {noformat}
> Author (5)
> Lisa (2)
> Frank (1)
> Susan (1)
> Bob (1)
> {noformat}
> The primary sort is by count, but secondary is by ord (= order in which they
> were indexed), which is not really understandable/transparent to the end
> user. I think it'd be best if we could do tie-break sort by label ...
> But talking to Shai, this seems hard/costly to fix, because when visiting the
> facet ords to collect the top K, we don't currently resolve to label, and in
> the worst case (say my example had a million labels with count 1) that's a
> lot of extra label lookups ...