[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-25 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556397#comment-16556397
 ] 

Hoss Man commented on LUCENE-8060:
--

{quote}On the other hand I'm concerned that the "nothing" approach is not very 
usable in practice as it is hard to build a UI with pagination, which I see is 
a very common need. ...
{quote}
Oh, right – because if we default to "0" (or "-1" or whatever means "don't 
track at all") then the consumer of TopDocs doesn't even know if there should be 
a "next" link. Yeah, I guess some arbitrary finite positive integer is the 
least bad option.
{quote}... maybe we could add a setter on IndexSearcher?
{quote}
Yeah i dunno... that just feels kind of weird to me – i guess i have two straw 
man concerns about that approach...
 # why have a {{setDefaultNumTotalHitsToTrack(int)}} just for this concept, and 
not a setter for all the other collector concepts that we currently have 
defaults for in the simple search/searchAfter methods (like {{Sort sort}} , 
{{boolean doDocScores}} , {{boolean doMaxScore}} , etc...)
 ** do we want to go down the route of an {{IndexSearcherConfig}} ?
 # this seems like it introduces divergent "intermediate APIs" for users to 
learn about that might frustrate them down the road...
 ** Today, the first time you build an app you just call something like 
{{IndexSearcher.search(myQuery, 100, mySort)}} and you're happy, and then later 
if you decide you want to do something more complicated you read the docs and 
learn about Collectors and you start using the builder methods to create 
collectors that solve common problems, and if you get to the point that those 
don't meet your needs you already understand the collector concept and you can 
write your own (composing existing ones)
 ** if there are IndexSearcher setters that change the default behavior of the 
{{search()}} methods, then that becomes the path of least resistance that 
intermediate users go down to do slightly more complex things, but if they 
reach the point where they want to do something new that doesn't have a setter, 
they have to "start over" learning about how to do searches, and read up on 
writing/composing collectors w/o having used the out of the box collector 
builders yet.

> Require users to tell us whether they need total hit counts
> ---
>
> Key: LUCENE-8060
> URL: https://issues.apache.org/jira/browse/LUCENE-8060
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Fix For: master (8.0)
>
>
> We are getting optimizations when hit counts are not required (sorted 
> indexes, MAXSCORE, short-circuiting of phrase queries) but our users won't 
> benefit from them unless we disable exact hit counts by default or we require 
> them to tell us whether hit counts are required.
> I think making hit counts approximate by default is going to be a bit trappy, 
> so I'm rather leaning towards requiring users to tell us explicitly whether 
> they need total hit counts. I can think of two ways to do that: either by 
> passing a boolean to the IndexSearcher constructor or by adding a boolean to 
> all methods that produce TopDocs instances. I like the latter better but I'm 
> open to discussion or other ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-25 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556323#comment-16556323
 ] 

Adrien Grand commented on LUCENE-8060:
--

I'd like to avoid IndexSearcher doing "nothing" or "everything". "everything" 
has the downside that it makes queries slow. On the other hand I'm concerned 
that the "nothing" approach is not very usable in practice as it is hard to 
build a UI with pagination, which I see is a very common need. I wouldn't like 
that simple use-cases can't use the simple search() methods on IndexSearcher 
and need to create collectors manually.

It is true that a value of 1000 or 10,000 feels arbitrary and might not be 
ideal for everybody depending on data volumes or use-case, but maybe we could 
add a setter on IndexSearcher?




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-25 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556268#comment-16556268
 ] 

Hoss Man commented on LUCENE-8060:
--

{quote}I think as long as totalHits is renamed/replaced to force a compilation 
error and draw attention to the need to use a Collector if you want to control 
if/how-much the total number of hits is accurately recorded, it's fine to 
hardcode a default in the IndexSearcher methods that return TopDocs directly ... 
i would go so far as to suggest that in that situation, hardcoding 
maxTotalHits/minExactTotalHits to "0" (ie: don't bother trying to track exactly 
at all) would be fine.
{quote}
To elaborate, my thinking is that having the "simple" IndexSearcher APIs use a 
default of "nothing" (or "everything", ie: 
{{maxTotalHitsToTrack=Integer.MAX_VALUE}} ) seems much easier to 
explain/understand to new users, regardless of their index size/usecases, than 
some arbitrary positive number like "10,000".

But better still – let's assume:
 * we deprecate/remove {{TopDocs.totalHits}}
 ** replace it with a {{TopDocs.getTotalHits()}}
 * we add an {{int maxTotalHitsToTrack}} option on the collectors (builders)
 ** document it such that any positive number means "track accurate hit count 
up to this amount, after that just stop"
 ** document everything such that if {{maxTotalHitsToTrack}} is set to a 
negative number then {{TopDocs.getTotalHits()}} will throw an 
{{IllegalStateException}}.

...then i would suggest that the IndexSearcher methods that return TopDocs 
directly should default to using {{maxTotalHitsToTrack=-1}} .. so any attempt 
to use the "simple" apis makes it really clear it doesn't support hit tracking.
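
A minimal sketch of the accessor semantics proposed above, in plain Java rather than Lucene code. The class name {{HitTracker}} and the {{onHit()}} method are hypothetical, and treating "0" as "count nothing, but don't throw" is an assumption the proposal does not spell out:

```java
// Hypothetical sketch, not Lucene code: models the proposed maxTotalHitsToTrack contract.
final class HitTracker {
    private final int maxTotalHitsToTrack;
    private int counted;

    HitTracker(int maxTotalHitsToTrack) {
        this.maxTotalHitsToTrack = maxTotalHitsToTrack;
    }

    // Called once per matching document; stops counting once the limit is reached.
    void onHit() {
        if (maxTotalHitsToTrack >= 0 && counted < maxTotalHitsToTrack) {
            counted++;
        }
    }

    // A negative limit means hit tracking is unsupported, so asking for the count fails loudly.
    int getTotalHits() {
        if (maxTotalHitsToTrack < 0) {
            throw new IllegalStateException("total hits are not tracked");
        }
        return counted;
    }

    public static void main(String[] args) {
        HitTracker capped = new HitTracker(3);
        for (int i = 0; i < 5; i++) {
            capped.onHit();
        }
        System.out.println(capped.getTotalHits()); // 3: counting stopped at the limit

        try {
            new HitTracker(-1).getTotalHits();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```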




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-25 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556214#comment-16556214
 ] 

Adrien Grand commented on LUCENE-8060:
--

{quote}I don't know a lot about how the current estimation code works
{quote}
It assumes that the density of matches is the same in the whole index. So if 
docs are collected exactly until doc id 1000 and there are 1M documents in the 
index, it just multiplies the number of collected documents by 1000. This is 
often a bad estimate and we have no idea of how large the error is.
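
A minimal sketch of that extrapolation, in plain Java rather than the actual Lucene estimation code. The class, method, and parameter names are hypothetical:

```java
// Hypothetical sketch, not Lucene code: extrapolates a hit count from a partial scan.
final class HitCountEstimate {
    /**
     * @param hitsCollected matches seen before collection stopped
     * @param docsVisited   number of doc ids scanned before stopping
     * @param maxDoc        total number of documents in the index
     */
    static long extrapolate(long hitsCollected, long docsVisited, long maxDoc) {
        if (docsVisited <= 0 || docsVisited > maxDoc) {
            throw new IllegalArgumentException("docsVisited must be in (0, maxDoc]");
        }
        // Assumes matches are spread uniformly over doc ids, which is often not the case.
        return Math.round((double) hitsCollected * maxDoc / docsVisited);
    }

    public static void main(String[] args) {
        // 50 hits in the first 1,000 of 1,000,000 docs: the count is scaled by 1000.
        System.out.println(extrapolate(50, 1_000, 1_000_000)); // 50000
    }
}
```

Nothing in this computation says how far off the result is, which is the problem described above.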
{quote}would that even be possible?
{quote}
I'm not aware of ways to get good estimates for queries that match many 
documents efficiently, especially conjunctions. So the error bound would be 
terrible in those cases I'm afraid. Maybe we could give a lower bound and an 
upper bound, or an enum that would say whether the hit count is accurate or a 
lower bound of the actual hit count.
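
The enum idea could look roughly like this. It is only a sketch in plain Java; {{HitCount}}, {{Accuracy}} and {{describe()}} are hypothetical names, not an existing API:

```java
// Hypothetical sketch: a hit count paired with a flag saying how to interpret it.
final class HitCount {
    enum Accuracy { EXACT, LOWER_BOUND }

    final long value;
    final Accuracy accuracy;

    HitCount(long value, Accuracy accuracy) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be >= 0");
        }
        this.value = value;
        this.accuracy = accuracy;
    }

    // Renders the count in a way that does not overstate what is known.
    String describe() {
        return accuracy == Accuracy.EXACT
            ? value + " hits"
            : "at least " + value + " hits";
    }

    public static void main(String[] args) {
        System.out.println(new HitCount(42, Accuracy.EXACT).describe());           // 42 hits
        System.out.println(new HitCount(10_000, Accuracy.LOWER_BOUND).describe()); // at least 10000 hits
    }
}
```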
{quote}i would go so far as to suggest that in that situation, 
hardcoding maxTotalHits/minExactTotalHits to "0" (ie: don't bother trying to 
track exactly at all) would be fine.
{quote}
OK. Thanks for the feedback!




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-25 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556071#comment-16556071
 ] 

Hoss Man commented on LUCENE-8060:
--

I don't know a lot about how the current estimation code works, but would it be 
worthwhile to capture info in the TopDocs about how accurate/confident the 
totalHits is, and instead of an {{int maxTotalHits}} option make it an {{int 
minExactTotalHits}} option?

So if a caller specifies {{minExactTotalHits=5000}} and the resulting 
{{TopDocs.totalHits == 42}} then there would also be a 
{{TopDocs.totalHitsAccuracy == 1.0D}} because we're 100% confident that that's 
the number of hits ... but if {{TopDocs.totalHits == 4200}} then maybe 
{{TopDocs.totalHitsAccuracy == 0.1D}} or maybe {{TopDocs.totalHitsAccuracy == 
0.9D}} depending on how confident the estimation is.

would that even be possible?



bq. Regarding integration in IndexSearcher, I am thinking of 3 ideas:

I think as long as {{totalHits}} is renamed/replaced to force a compilation 
error and draw attention to the need to use a Collector if you want to control 
if/how-much the total number of hits is accurately recorded, it's fine to 
hardcode a default in the IndexSearcher methods that return TopDocs directly ... 
i would go so far as to suggest that in that situation, hardcoding 
maxTotalHits/minExactTotalHits to "0" (ie: don't bother trying to track exactly 
at all) would be fine.





[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-25 Thread Adrien Grand (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555890#comment-16555890
 ] 

Adrien Grand commented on LUCENE-8060:
--

Our current hit count estimations are terrible. I don't think any user would 
want to rely on them or even display them in a UI. Problem is that hit counts 
are useful from a UI perspective, for instance for pagination, or to improve 
the user experience by giving users a sense of how many matches there are and 
giving confidence in the search engine by showing the user that there is a lot 
of content that matches their query.

I think an ok trade-off that would address the two above use-cases would be to 
only count up to a certain hit count? For instance if you allow users to 
paginate up to page 10 and have 20 hits per page, you only need to count up to 
200 hits to know how many pages to display. Similarly if your end goal is only 
to show users that you have lots of content, you could only count up to eg. 
10,000 matches and show something like "more than 10,000 hits" in the UI if 
that number is reached. In both cases, this should help keep the counting 
overhead contained so that it doesn't end up being the bottleneck of query 
processing?
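
The pagination arithmetic above can be sketched as follows, in plain Java. The class and method names are hypothetical, and the sketch assumes the hit count it receives was already capped by whatever did the counting:

```java
// Hypothetical sketch: how far counting needs to go to drive a paginated UI.
final class CappedPagination {
    // Hits worth counting when the UI exposes at most maxPages pages.
    static int countingCap(int maxPages, int hitsPerPage) {
        return Math.multiplyExact(maxPages, hitsPerPage);
    }

    // Pages to render for a hit count that was capped at countingCap(...).
    static int pagesToShow(int cappedCount, int hitsPerPage) {
        return (cappedCount + hitsPerPage - 1) / hitsPerPage; // ceiling division
    }

    public static void main(String[] args) {
        int cap = countingCap(10, 20);
        System.out.println(cap);                   // 200
        System.out.println(pagesToShow(137, 20));  // 7
        System.out.println(pagesToShow(cap, 20));  // 10
    }
}
```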

I believe both TopScoreDocCollector and TopFieldCollector could easily be 
changed in order to replace `boolean trackTotalHits` with something like `int 
maxTotalHits` and we would stop counting after visiting maxTotalHits documents?

Regarding integration in IndexSearcher, I am thinking of 3 ideas:
 - hardcode a value for this parameter, maybe 10,000 and rename 
TopDocs.totalHits to make sure users get a compile error
 - add a parameter to the search() methods to require users to pass a 
maxTotalHits
 - add a required constructor argument to IndexSearcher that would affect all 
search() methods

We could also make the top docs collectors just compute a ScoreDoc[] (ie. no 
total hits) and require users to compute the hit count separately, but I'm 
concerned that it would make simple usage of Lucene harder.

Opinions?




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2018-07-17 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546522#comment-16546522
 ] 

ASF subversion and git services commented on LUCENE-8060:
-

Commit d730c8b214bd8b659aa92011e7a8d455af535382 in lucene-solr's branch 
refs/heads/master from [~jpountz]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d730c8b ]

LUCENE-8060: Remove usage of TopDocs#totalHits that should really be 
IndexSearcher#count.

Many tests were written before we introduced IndexSearcher#count and used
`searcher.search(query, 1).totalHits` to get the number of matches of a query
rather than `searcher.count(query)`.





[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263491#comment-16263491
 ] 

Adrien Grand commented on LUCENE-8060:
--

At first I thought it would be less user-friendly, but the methods I was 
thinking of in order to get approximate counts (basically assuming that the hit 
ratio over the whole index is the same as the ratio observed before collection 
terminated) would be of low quality, so maybe it's better to not return any hit 
count at all.




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263175#comment-16263175
 ] 

Shai Erera commented on LUCENE-8060:


What if we conceptually remove {{TopDocs.totalHits}} and if users require that, 
they can chain their Collector with {{TotalHitCountCollector}}? We can also add 
that boolean as a sugar to {{IndexSearcher.search()}} API.

If we're OK w/ removing {{TopDocs.totalHits}}, and users getting a compilation 
error (that's easy to fix), then that's an easy option/change. Or... we 
deprecate it, but have the simple IndexSearcher.search() APIs still compute it 
(by chaining this collector), and let users who'd like to optimize use the 
search() API which takes a Collector.

Just a thought...




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263148#comment-16263148
 ] 

Robert Muir commented on LUCENE-8060:
-

+1, I think something like that is a better tradeoff. I don't think it needs to 
be so verbose though. It may be enough to just change it to "hitCount" or 
similar and separately add a boolean indicating if it is exact.




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263133#comment-16263133
 ] 

Adrien Grand commented on LUCENE-8060:
--

I'd like to change defaults too but I'm worried that since there won't be a 
compilation error, users will blindly upgrade and miss the fact that we did 
this change, regardless of how well we document this change. Or we should at 
least do something like renaming {{TopDocs.totalHits}} to 
{{TopDocs.totalHitsApproximate}}?




[jira] [Commented] (LUCENE-8060) Require users to tell us whether they need total hit counts

2017-11-22 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263111#comment-16263111
 ] 

Robert Muir commented on LUCENE-8060:
-

I don't think it should be mandatory, it should be the default... it's a search 
engine.
