[
https://issues.apache.org/jira/browse/LUCENE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219901#comment-14219901
]
Mark Harwood commented on LUCENE-6066:
--------------------------------------
An analogy might be making a compilation album of 1967's top hit records:
1) A vanilla Lucene query's results might look like a "Best of the Beatles"
album - no diversity
2) A grouping query would produce "The 10 top-selling artists of 1967 - some
killer and quite a lot of filler"
3) A "diversified" query would be the top 20 hit records of that year - with a
max of 3 Beatles hits to maintain diversity
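
To make option 3 concrete, here is a minimal sketch of a "diversified" top-N selection. It is only an illustration: the names DiversifiedTopN, Hit, select, n and maxPerKey are hypothetical and not part of Lucene, and it sorts a full in-memory list rather than collecting hits through a PriorityQueue as a real collector would.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Lucene.
final class DiversifiedTopN {

    /** A scored hit plus the key (e.g. the artist) used to enforce diversity. */
    static final class Hit {
        final String key;
        final String title;
        final double score;

        Hit(String key, String title, double score) {
            this.key = key;
            this.title = title;
            this.score = score;
        }
    }

    /**
     * Returns up to n hits in descending score order, keeping at most
     * maxPerKey hits for any single key.
     */
    static List<Hit> select(List<Hit> hits, int n, int maxPerKey) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Double.compare(b.score, a.score));

        Map<String, Integer> countPerKey = new HashMap<>();
        List<Hit> result = new ArrayList<>();
        for (Hit hit : sorted) {
            if (result.size() == n) {
                break; // the album is full
            }
            int seen = countPerKey.getOrDefault(hit.key, 0);
            if (seen < maxPerKey) {
                // This key is still under its quota, so the hit is kept.
                countPerKey.put(hit.key, seen + 1);
                result.add(hit);
            }
        }
        return result;
    }
}
```

In the album analogy, select(hits, 20, 3) would give the top 20 records of the year with no more than 3 from any one artist.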
> New "remove" method in PriorityQueue
> ------------------------------------
>
> Key: LUCENE-6066
> URL: https://issues.apache.org/jira/browse/LUCENE-6066
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/query/scoring
> Reporter: Mark Harwood
> Priority: Minor
> Fix For: 5.0
>
> Attachments: LUCENE-PQRemoveV1.patch
>
>
> It would be useful to be able to remove existing elements from a
> PriorityQueue.
> The proposal is that a linear scan is performed to find the element being
> removed and then the end element in heap[size] is swapped into this position
> to perform the delete. The method downHeap() is then called to shuffle the
> replacement element back down the array, but the existing downHeap() method
> must be modified to allow picking up an entry from any point in the array
> rather than always assuming the first element (which is its only current mode
> of operation).
> A working JavaScript model of the proposal, with animation, is available here:
> http://jsfiddle.net/grcmquf2/22/
> In tests, the modified version of downHeap() produces the same results as the
> existing implementation but adds the ability to push down from any point.
> An example use case that requires remove is where a client doesn't want more
> than N matches for any given key (e.g. no more than 5 products from any one
> retailer in a marketplace). In these circumstances a document that was
> previously thought of as competitive has to be removed from the final PQ and
> replaced with another doc (e.g. a retailer that already has 5 matches in the PQ
> receives a 6th match that is better than its previous ones). This particular
> process is managed by a special "DiversifyingPriorityQueue" which wraps the
> main PriorityQueue and could be contributed as part of another issue if there
> is interest in that.
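
The remove operation described in the issue can be sketched roughly as below. This is a standalone model under stated assumptions, not the attached patch and not Lucene's PriorityQueue: RemovableMinHeap is a hypothetical class holding ints in a 1-based array, with the least element at heap[1]. One step goes beyond what the description states: the sketch also tries to sift the replacement element up, because the element taken from heap[size] can be smaller than the parent of the slot it lands in, and downHeap alone would then leave the heap out of order.

```java
import java.util.Arrays;

// Hypothetical stand-in for a priority queue; not Lucene's PriorityQueue class.
final class RemovableMinHeap {
    // 1-based binary min-heap: heap[1] is the least element, heap[size] the last.
    private int[] heap = new int[16];
    private int size = 0;

    int size() {
        return size;
    }

    void add(int value) {
        if (size + 1 == heap.length) {
            heap = Arrays.copyOf(heap, heap.length * 2);
        }
        heap[++size] = value;
        upHeap(size);
    }

    /** Removes and returns the least element. */
    int pop() {
        if (size == 0) {
            throw new IllegalStateException("empty heap");
        }
        int top = heap[1];
        heap[1] = heap[size--];
        if (size > 0) {
            downHeap(1);
        }
        return top;
    }

    /** Linear scan for value; on a match, swap in the end element and repair the heap. */
    boolean remove(int value) {
        for (int i = 1; i <= size; i++) {
            if (heap[i] == value) {
                heap[i] = heap[size--];
                // If the removed slot was the last one, nothing needs repairing.
                if (i <= size && !upHeap(i)) {
                    downHeap(i);
                }
                return true;
            }
        }
        return false;
    }

    /** Sifts the element at start up; returns true if it moved. */
    private boolean upHeap(int start) {
        int i = start;
        int node = heap[i];
        int parent = i >>> 1;
        while (parent > 0 && node < heap[parent]) {
            heap[i] = heap[parent];
            i = parent;
            parent = i >>> 1;
        }
        heap[i] = node;
        return i != start;
    }

    /** Sifts the element at start down; works from any position, not only the root. */
    private void downHeap(int start) {
        int i = start;
        int node = heap[i];
        while (true) {
            int child = i << 1;
            if (child > size) {
                break;
            }
            if (child + 1 <= size && heap[child + 1] < heap[child]) {
                child++; // pick the smaller of the two children
            }
            if (heap[child] >= node) {
                break;
            }
            heap[i] = heap[child];
            i = child;
        }
        heap[i] = node;
    }
}
```

The scan makes remove O(n) in the queue size, which is usually acceptable for the small top-N queues used when collecting hits.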