[
https://issues.apache.org/jira/browse/LUCENE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Harwood updated LUCENE-6066:
---------------------------------
Description: This issue provides a new collector for situations where a
client doesn't want more than N matches for any given key (e.g. no more than 5
products from any one retailer in a marketplace). In these circumstances a
document that was previously thought of as competitive during collection has to
be removed from the final PQ and replaced with another doc (eg a retailer who
already has 5 matches in the PQ receives a 6th match which is better than his
previous ones). This requires a new remove method on the existing PriorityQueue
class. (was: It would be useful to be able to remove existing elements from a
PriorityQueue.
The proposal is that a linear scan is performed to find the element being
removed and then the end element in heap[size] is swapped into this position to
perform the delete. The method downHeap() is then called to shuffle the
replacement element back down the array but the existing downHeap method must
be modified to allow picking up an entry from any point in the array rather
than always assuming the first element (which is its only current mode of
operation).
A working javascript model of the proposal with animation is available here:
http://jsfiddle.net/grcmquf2/22/
In tests the modified version of "downHeap" produces the same results as the
existing impl but adds the ability to push down from any point.
An example use case that requires remove is where a client doesn't want more
than N matches for any given key (e.g. no more than 5 products from any one
retailer in a marketplace). In these circumstances a document that was
previously thought of as competitive has to be removed from the final PQ and
replaced with another doc (eg a retailer who already has 5 matches in the PQ
receives a 6th match which is better than his previous ones). This particular
process is managed by a special "DiversifyingPriorityQueue" which wraps the
main PriorityQueue and could be contributed as part of another issue if there
is interest in that. )
Summary: Collector that manages diversity in search results (was: New
"remove" method in PriorityQueue)
> Collector that manages diversity in search results
> --------------------------------------------------
>
> Key: LUCENE-6066
> URL: https://issues.apache.org/jira/browse/LUCENE-6066
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/query/scoring
> Reporter: Mark Harwood
> Priority: Minor
> Fix For: 5.0
>
> Attachments: LUCENE-PQRemoveV3.patch
>
>
> This issue provides a new collector for situations where a client doesn't
> want more than N matches for any given key (e.g. no more than 5 products from
> any one retailer in a marketplace). In these circumstances a document that
> was previously thought of as competitive during collection has to be removed
> from the final PQ and replaced with another doc (eg a retailer who already
> has 5 matches in the PQ receives a 6th match which is better than his
> previous ones). This requires a new remove method on the existing
> PriorityQueue class.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]