[ 
https://issues.apache.org/jira/browse/LUCENE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Harwood updated LUCENE-6066:
---------------------------------
    Description: This issue provides a new collector for situations where a 
client doesn't want more than N matches for any given key (e.g. no more than 5 
products from any one retailer in a marketplace). In these circumstances a 
document that was previously thought of as competitive during collection has to 
be removed from the final PQ and replaced with another doc (e.g. a retailer who 
already has 5 matches in the PQ receives a 6th match which is better than his 
previous ones). This requires a new remove method on the existing PriorityQueue 
class.  (was: It would be useful to be able to remove existing elements from a 
PriorityQueue. 
The proposal is that a linear scan is performed to find the element being 
removed and then the end element in heap[size] is swapped into this position to 
perform the delete. The method downHeap() is then called to shuffle the 
replacement element back down the array but the existing downHeap method must 
be modified to allow picking up an entry from any point in the array rather 
than always assuming the first element (which is its only current mode of 
operation).
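
The removal steps described above can be sketched in Java roughly as follows. 
This is a minimal illustration on a 1-indexed min-heap of ints, not the 
attached patch: the class name RemovableMinHeap and its methods are 
hypothetical, and elements are matched by value rather than by object identity. 
The description mentions only downHeap(); the sketch also tries an upward sift 
first, because the element moved from the end of the array can be smaller than 
the parent of the vacated slot when the two sit in different subtrees. That 
extra step is an assumption of this sketch.

```java
/** Hypothetical 1-indexed binary min-heap of ints with removal from any slot. */
final class RemovableMinHeap {
    private final int[] heap; // slot 0 is unused, so elements live in heap[1..size]
    private int size;

    RemovableMinHeap(int capacity) {
        heap = new int[capacity + 1];
    }

    int size() {
        return size;
    }

    void add(int value) {
        if (size == heap.length - 1) throw new IllegalStateException("heap full");
        heap[++size] = value;
        upHeap(size);
    }

    /** Removes and returns the smallest element. */
    int pop() {
        if (size == 0) throw new IllegalStateException("heap empty");
        int result = heap[1];
        heap[1] = heap[size--];
        if (size > 0) downHeap(1);
        return result;
    }

    /** Linear scan for the value; returns false if it is not present. */
    boolean remove(int value) {
        for (int i = 1; i <= size; i++) {
            if (heap[i] == value) {
                heap[i] = heap[size]; // move the end element into the hole
                size--;
                if (i <= size) {
                    // Assumption of this sketch: sift up first, then down if nothing moved.
                    if (!upHeap(i)) downHeap(i);
                }
                return true;
            }
        }
        return false;
    }

    /** Sifts the element at origPos towards the root; returns true if it moved. */
    private boolean upHeap(int origPos) {
        int i = origPos;
        int node = heap[i];
        int parent = i >>> 1;
        while (parent > 0 && node < heap[parent]) {
            heap[i] = heap[parent];
            i = parent;
            parent = i >>> 1;
        }
        heap[i] = node;
        return i != origPos;
    }

    /** Sifts the element at slot i towards the leaves, starting from any slot. */
    private void downHeap(int i) {
        int node = heap[i];
        int child = i << 1;
        if (child < size && heap[child + 1] < heap[child]) child++;
        while (child <= size && heap[child] < node) {
            heap[i] = heap[child];
            i = child;
            child = i << 1;
            if (child < size && heap[child + 1] < heap[child]) child++;
        }
        heap[i] = node;
    }
}
```

Adding 1, 10, 2, 11, 12, 3, 4 and then removing 11 moves 4 into a slot whose 
parent is 10, which is the case a downward-only sift would leave unrepaired.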

A working JavaScript model of the proposal with animation is available here: 
http://jsfiddle.net/grcmquf2/22/ 

In tests the modified version of "downHeap" produces the same results as the 
existing implementation but adds the ability to push down from any point.

An example use case that requires remove is where a client doesn't want more 
than N matches for any given key (e.g. no more than 5 products from any one 
retailer in a marketplace). In these circumstances a document that was 
previously thought of as competitive has to be removed from the final PQ and 
replaced with another doc (e.g. a retailer who already has 5 matches in the PQ 
receives a 6th match which is better than his previous ones). This particular 
process is managed by a special "DiversifyingPriorityQueue" which wraps the 
main PriorityQueue and could be contributed as part of another issue if there 
is interest in that. )
        Summary: Collector that manages diversity in search results  (was: New 
"remove" method in PriorityQueue)

> Collector that manages diversity in search results
> --------------------------------------------------
>
>                 Key: LUCENE-6066
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6066
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>            Reporter: Mark Harwood
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: LUCENE-PQRemoveV3.patch
>
>
> This issue provides a new collector for situations where a client doesn't 
> want more than N matches for any given key (e.g. no more than 5 products from 
> any one retailer in a marketplace). In these circumstances a document that 
> was previously thought of as competitive during collection has to be removed 
> from the final PQ and replaced with another doc (e.g. a retailer who already 
> has 5 matches in the PQ receives a 6th match which is better than his 
> previous ones). This requires a new remove method on the existing 
> PriorityQueue class.
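
The collection behaviour described above might look roughly like the following 
Java sketch. It is not the attached patch and does not use Lucene's own 
PriorityQueue: the names DiversifiedTopK, Hit, maxHits and maxPerKey are 
hypothetical, and java.util.PriorityQueue stands in for the queue because its 
remove(Object) already provides the linear-scan removal that the issue asks 
for. The sketch assumes maxHits and maxPerKey are both at least 1.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical sketch: keep the best maxHits hits, at most maxPerKey per key. */
final class DiversifiedTopK {
    static final class Hit {
        final String key;
        final int doc;
        final float score;

        Hit(String key, int doc, float score) {
            this.key = key;
            this.doc = doc;
            this.score = score;
        }
    }

    // Lowest score first, so peek() returns the weakest hit.
    private static final Comparator<Hit> BY_SCORE =
        (Hit a, Hit b) -> Float.compare(a.score, b.score);

    private final int maxHits;
    private final int maxPerKey;
    private final PriorityQueue<Hit> global = new PriorityQueue<>(BY_SCORE);
    private final Map<String, PriorityQueue<Hit>> perKey = new HashMap<>();

    DiversifiedTopK(int maxHits, int maxPerKey) {
        this.maxHits = maxHits;
        this.maxPerKey = maxPerKey;
    }

    void collect(Hit hit) {
        PriorityQueue<Hit> sameKey =
            perKey.computeIfAbsent(hit.key, k -> new PriorityQueue<Hit>(BY_SCORE));
        if (sameKey.size() >= maxPerKey) {
            // Key is at its cap: the new hit must beat that key's weakest hit.
            Hit worstForKey = sameKey.peek();
            if (hit.score <= worstForKey.score) return;
            sameKey.poll();
            global.remove(worstForKey); // removal from the middle of the main queue
        } else if (global.size() >= maxHits) {
            // Queue is full: the new hit must beat the weakest hit overall.
            Hit worst = global.peek();
            if (hit.score <= worst.score) return;
            global.poll();
            perKey.get(worst.key).remove(worst);
        }
        sameKey.add(hit);
        global.add(hit);
    }

    /** Returns the collected hits, best score first. */
    List<Hit> results() {
        List<Hit> out = new ArrayList<>(global);
        out.sort(BY_SCORE.reversed());
        return out;
    }
}
```

With maxHits = 3 and maxPerKey = 2, a third and better hit for retailer "A" 
displaces A's weakest hit rather than the weakest hit overall, which is the 
situation the new remove method is meant to support.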



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
