[jira] [Commented] (LUCENE-8950) FieldComparators Should Not Maintain Implicit PQs

2019-08-14 Thread Atri Sharma (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907093#comment-16907093
 ] 

Atri Sharma commented on LUCENE-8950:
-

{quote}you would like to introduce a sub class of FieldComparator that hides 
the fact that it maintains an implicit PQ, and make simple comparators extend 
this sub class instead of FieldComparator directly?
{quote}
Yes, exactly.

 

Thanks for validating – I will work on a PR now.

> FieldComparators Should Not Maintain Implicit PQs
> -
>
> Key: LUCENE-8950
> URL: https://issues.apache.org/jira/browse/LUCENE-8950
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>
> While doing some perf tests, I realised that FieldComparators inherently 
> maintain implicit priority queues for maintaining the sorted order of 
> documents for the given sort order. This is wasteful especially in the case 
> of a multi feature sort order and a large number of hits requested.
>  
> We should change this to have FieldComparators maintain only the top and 
> bottom values, and use them as barriers to compare



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8950) FieldComparators Should Not Maintain Implicit PQs

2019-08-14 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907090#comment-16907090
 ] 

Adrien Grand commented on LUCENE-8950:
--

OK I got confused with the notion of deprecation, which suggests to me that we 
need to find a replacement, but reading your last message, my understanding is 
that you would like to introduce a sub class of FieldComparator that hides the 
fact that it maintains an implicit PQ, and make simple comparators extend this 
sub class instead of FieldComparator directly? I would be +1 to giving it a try 
and making the change if the perf hit is negligible.

> FieldComparators Should Not Maintain Implicit PQs
> -
>
> Key: LUCENE-8950
> URL: https://issues.apache.org/jira/browse/LUCENE-8950
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>
> While doing some perf tests, I realised that FieldComparators inherently 
> maintain implicit priority queues for maintaining the sorted order of 
> documents for the given sort order. This is wasteful especially in the case 
> of a multi feature sort order and a large number of hits requested.
>  
> We should change this to have FieldComparators maintain only the top and 
> bottom values, and use them as barriers to compare



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8950) FieldComparators Should Not Maintain Implicit PQs

2019-08-14 Thread Atri Sharma (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907075#comment-16907075
 ] 

Atri Sharma commented on LUCENE-8950:
-

{quote}This looks like a duplicate of LUCENE-8878?
{quote}
Not necessarily – 8878 targets refactoring the API to be simpler, whereas this 
Jira only targets removing the necessary condition that FieldComparators 
maintain their own priority queues. I believe this Jira compliments 8878.
{quote}I think all of us agree on the fact that it would be nice to have a 
simpler FieldComparator API. The challenge is that we don't want to trade too 
much efficiency. For instance the API you are proposing wouldn't work well with 
geo-distance sorting since it would require computing the actual distance for 
every new document, while the current implementation tries to be smart to first 
check a bounding box, and then compute a sort key that compares like the actual 
distance but is much cheaper to compute
{quote}
Agreed, that is precisely why I suggested deprecating compare (slot, slot) 
instead of removing it completely. The idea is that comparators that require 
access to an internal PQ for whatever reasons are free to do so, but it should 
not be mandatory, and future comparators should not take on this dependency 
without understanding the tradeoffs

> FieldComparators Should Not Maintain Implicit PQs
> -
>
> Key: LUCENE-8950
> URL: https://issues.apache.org/jira/browse/LUCENE-8950
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>
> While doing some perf tests, I realised that FieldComparators inherently 
> maintain implicit priority queues for maintaining the sorted order of 
> documents for the given sort order. This is wasteful especially in the case 
> of a multi feature sort order and a large number of hits requested.
>  
> We should change this to have FieldComparators maintain only the top and 
> bottom values, and use them as barriers to compare



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8950) FieldComparators Should Not Maintain Implicit PQs

2019-08-14 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907069#comment-16907069
 ] 

Adrien Grand commented on LUCENE-8950:
--

This looks like a duplicate of LUCENE-8878?

I think all of us agree on the fact that it would be nice to have a simpler 
FieldComparator API. The challenge is that we don't want to trade too much 
efficiency. For instance the API you are proposing wouldn't work well with 
geo-distance sorting since it would require computing the actual distance for 
every new document, while the current implementation tries to be smart to first 
check a bounding box, and then compute a sort key that compares like the actual 
distance but is much cheaper to compute (see discussion on LUCENE-8878 for more 
details).

> FieldComparators Should Not Maintain Implicit PQs
> -
>
> Key: LUCENE-8950
> URL: https://issues.apache.org/jira/browse/LUCENE-8950
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>
> While doing some perf tests, I realised that FieldComparators inherently 
> maintain implicit priority queues for maintaining the sorted order of 
> documents for the given sort order. This is wasteful especially in the case 
> of a multi feature sort order and a large number of hits requested.
>  
> We should change this to have FieldComparators maintain only the top and 
> bottom values, and use them as barriers to compare



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8950) FieldComparators Should Not Maintain Implicit PQs

2019-08-14 Thread Atri Sharma (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906978#comment-16906978
 ] 

Atri Sharma commented on LUCENE-8950:
-

I confess I do not have a very clean idea as to how this can be implemented: 
the typical usages of FieldComparator mandate that the user maintain a list of 
slots into the FieldComparator, which can implicitly be as bad in terms of size 
as the queue itself. FieldComparator provides a convenient API to allow 
comparisons between two values of the type maintained in the queue, which can 
form the basis of this observation.

 

Here is the first cut of proposal that I have in mind:

1) Deprecate compare(slot, slot) so that new implementations do not depend on 
this method, but rather use compare(T val, T val).

2) Start with some comparators (Numeric comparators?), get rid of the implicit 
priority queue and make the user maintain those values.

3) Make Numeric comparators track only the top and bottom values, as needed.

 

Note that I am treating NumericComparators as the starting point/example, but 
the approach should extend for other comparators as well.

 

With [https://github.com/apache/lucene-solr/pull/831,] getting values out of 
leaf comparators should be easy, so the logical step after this PR is to depend 
on compare (val, val) more than we rely on compare (slot, slot).

 

Happy to receive feedback and alternate proposals

> FieldComparators Should Not Maintain Implicit PQs
> -
>
> Key: LUCENE-8950
> URL: https://issues.apache.org/jira/browse/LUCENE-8950
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Priority: Major
>
> While doing some perf tests, I realised that FieldComparators inherently 
> maintain implicit priority queues for maintaining the sorted order of 
> documents for the given sort order. This is wasteful especially in the case 
> of a multi feature sort order and a large number of hits requested.
>  
> We should change this to have FieldComparators maintain only the top and 
> bottom values, and use them as barriers to compare



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org