[ 
https://issues.apache.org/jira/browse/SOLR-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804701#comment-14804701
 ] 

Hoss Man commented on SOLR-6168:
--------------------------------

bq. The intention of the FieldComparator.value API is that you would use it to 
look up a value after all collection is complete, e.g. to fill the fields to 
create the FieldDoc for this hit, and so the comparator would naturally not be 
changing it anymore.

ok, but that's not clear/enforced anywhere in the API itself -- if you are only 
suppose to fetch back all the slots after the search is complete, why support 
fetching the individual slots (and compareValues()) at all?

bq. But I don't really understand what SOLR-6168 is trying to do ... maybe 
there is another way to do it w/o abusing this API?

In a nutshell the CollapseQParser is another style of "group collapsing" where 
it picks a single best representative doc for each group on the fly as docs are 
collected -- currently it does this only based on a single field via DocValues, 
SOLR-6168 is about making this work with any arbitrary "Sort" object.  The code 
in the patch (in SOLR-6168) is analogous to the type of logic that 
FieldValueHitQueue does, but instead of tracking the "N best docs" (according 
to the sort) in a single priority queue (which re-orders on the fly), it tracks 
the *1* best doc for every "group" as it makes a single pass collection uses 
value() + compareValues to know when a "better" doc has been found for that 
group.

(The most trivial implementation would be to use a FieldValueHitQueue of size 
'1' for each group, but this approach unwinds the logic a track the values 
directly.  When i was first looking at this approach, i initially considered 
using a FieldComparator "slot" for each group, but the number of groups isn't 
neccessarly known in advance and there's no way to "grow" a FieldComparator)

It doesn't feel like an abuse of the FieldComparator API to do this ... every 
other Comparator works fine as is.  As i mentioned I could always put hacks in 
the caller code (ColapseQParser) to do instanceof/casting/deepCopy when the 
FieldComparator returns BytesRefs -- but fixing Term(Ord)ValComparator.value to 
return "safe" objects seemed saner.

---

Forgetting SOLR-6168, even if you asssume that all callers should know that  
FieldComparator.value is only intended to be usable after the end of a search, 
the BytesRefs returned by Term(Ord)ValComparator.value don't really seem to be 
safe for a caller to hang on to (for later use searchAfter, etc...) in any 
situation where the FieldComparator itself might get re-used -- ie: re-using a 
FieldValueHitQueue (or Collector that uses FieldValueHitQueue) in a subsequent 
search.

what protects the user from those values being changed out from under them in 
that situation?


> enhance collapse QParser so that "group head" documents can be selected by 
> more complex sort options
> ----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6168
>                 URL: https://issues.apache.org/jira/browse/SOLR-6168
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.7.1, 4.8.1
>            Reporter: Umesh Prasad
>            Assignee: Joel Bernstein
>         Attachments: CollapsingQParserPlugin-6168.patch.1-1stcut, 
> SOLR-6168-group-head-inconsistent-with-sort.patch, SOLR-6168.patch, 
> SOLR-6168.patch
>
>
> The fundemental goal of this issue is add additional support to the 
> CollapseQParser so that as an alternative to the existing min/max localparam 
> options, more robust sort syntax can be used to sort on multiple criteria 
> when selecting the "group head" documents used to represent each collapsed 
> group.
> Since support for arbitrary, multi-clause, sorting is almost certainly going 
> to require more RAM then the existing min/max functionaly, this new 
> functionality should be in addition to the existing min/max localparam 
> implementation, not a replacement of it.
> (NOTE: early comments made in this jira may be confusing in historical 
> context due to the way this issue was originally filed as a bug report)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to