[ 
https://issues.apache.org/jira/browse/LUCENE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219901#comment-14219901
 ] 

Mark Harwood commented on LUCENE-6066:
--------------------------------------

An analogy might be making a compilation album of 1967's top hit records:

1) A vanilla Lucene query's results might look like a "Best of the Beatles" 
album - no diversity
2) A grouping query would produce "The 10 top-selling artists of 1967 - some 
killer and quite a lot of filler"
3) A "diversified" query would be the top 20 hit records of that year - with a 
max of 3 Beatles hits to maintain diversity

> New "remove" method in PriorityQueue
> ------------------------------------
>
>                 Key: LUCENE-6066
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6066
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>            Reporter: Mark Harwood
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: LUCENE-PQRemoveV1.patch
>
>
> It would be useful to be able to remove existing elements from a 
> PriorityQueue. 
> The proposal is that a linear scan is performed to find the element being 
> removed and then the end element in heap[size] is swapped into this position 
> to perform the delete. The method downHeap() is then called to shuffle the 
> replacement element back down the array but the existing downHeap method must 
> be modified to allow picking up an entry from any point in the array rather 
> than always assuming the first element (which is its only current mode of 
> operation).
> A working javascript model of the proposal with animation is available here: 
> http://jsfiddle.net/grcmquf2/22/ 
> In tests the modified version of "downHeap" produces the same results as the 
> existing impl but adds the ability to push down from any point.
> An example use case that requires remove is where a client doesn't want more 
> than N matches for any given key (e.g. no more than 5 products from any one 
> retailer in a marketplace). In these circumstances a document that was 
> previously thought of as competitive has to be removed from the final PQ and 
> replaced with another doc (eg a retailer who already has 5 matches in the PQ 
> receives a 6th match which is better than his previous ones). This particular 
> process is managed by a special "DiversifyingPriorityQueue" which wraps the 
> main PriorityQueue and could be contributed as part of another issue if there 
> is interest in that. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to