[ 
https://issues.apache.org/jira/browse/LUCENE-7590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15745855#comment-15745855
 ] 

Shai Erera commented on LUCENE-7590:
------------------------------------

bq. Instead of using a NOOP_COLLECTOR, you could throw a 
CollectionTerminatedException

OK, good idea.
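
To make the suggestion concrete, here is a minimal, self-contained sketch of the pattern. It does not use Lucene itself: {{Segment}}, {{SumCollector}} and {{search}} are hypothetical stand-ins, and the nested {{CollectionTerminatedException}} is a local copy of the idea, not the real class. The assumption is that the search loop treats the exception as "skip the rest of this segment", which as far as I know is how {{IndexSearcher}} handles it.

```java
import java.util.Arrays;
import java.util.List;

class EarlyTerminationSketch {
    // Local stand-in for Lucene's CollectionTerminatedException (illustration only).
    static class CollectionTerminatedException extends RuntimeException {}

    // Simplified stand-in for a per-segment collector.
    interface LeafCollector {
        void collect(int doc);
    }

    // Hypothetical segment: values is null when the segment has no values for the field.
    static class Segment {
        final long[] values;
        Segment(long[] values) { this.values = values; }
    }

    // Hypothetical collector that sums the field over all collected documents.
    static class SumCollector {
        long sum;
        int count;

        LeafCollector getLeafCollector(Segment seg) {
            if (seg.values == null) {
                // Nothing to compute here: signal that this segment can be skipped,
                // instead of returning a no-op collector.
                throw new CollectionTerminatedException();
            }
            return doc -> {
                sum += seg.values[doc];
                count++;
            };
        }
    }

    // Simplified search loop: every document of every segment "matches".
    static void search(List<Segment> segments, SumCollector collector) {
        for (Segment seg : segments) {
            try {
                LeafCollector leaf = collector.getLeafCollector(seg);
                for (int doc = 0; doc < seg.values.length; doc++) {
                    leaf.collect(doc);
                }
            } catch (CollectionTerminatedException e) {
                // Move on to the next segment.
            }
        }
    }
}
```

With segments {{[1, 2]}}, one without values, and {{[3]}}, the collector only ever sees the three documents that have values, and the segment without values costs a single exception rather than a per-document no-op call.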

bq. By the way, in such cases I think we should still increase the missing 
count?

I am not sure. {{missing}} represents all the documents that matched the query 
but have no value for that DV field. When {{getLeafCollector}} is called, 
though, we don't yet know whether the query will match any documents in that 
segment (I think?), so updating {{missing}} at that point might be confusing. 
E.g., I'd expect that if anyone chained {{TotalHitCountCollector}} with 
{{DocValuesStatsCollector}}, then {{totalHits = stats.count() + 
stats.missing()}}. I am open to discussing it, I'm just not sure I always want 
to update {{missing}} with {{context.reader().numDocs()}} ...
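
A small sketch of the invariant I have in mind, under the assumption that {{missing}} is only ever incremented for documents that were actually collected. Everything here is hypothetical and self-contained ({{Stats}}, {{collect}}, the {{Long[]}} standing in for a doc-values field where {{null}} means "no value"); it is not the code in the patch.

```java
class MissingCountSketch {
    // Hypothetical stats holder; the patch exposes count() and missing() accessors instead.
    static class Stats {
        int count;    // matched documents that have a value
        int missing;  // matched documents that have no value
        long sum;
    }

    /**
     * Feeds the matched documents of one segment into stats and returns the
     * number of hits, i.e. what a hit-counting collector chained alongside
     * would report for the same documents.
     */
    static int collect(Long[] values, int[] matchedDocs, Stats stats) {
        int totalHits = 0;
        for (int doc : matchedDocs) {
            totalHits++;
            Long value = values[doc];
            if (value == null) {
                stats.missing++;        // matched, but no value for the field
            } else {
                stats.count++;
                stats.sum += value;
            }
        }
        return totalHits;
    }
}
```

Because both counters are only touched per collected document, {{totalHits == count + missing}} holds by construction. Adding the segment's {{numDocs()}} to {{missing}} up front would break that equality whenever the query matches fewer documents than the segment contains.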

bq. Can we avoid making DocValuesIterator public?

I did not find a way, since it's part of the {{DocValuesStats.init()}} API, and 
I think users should be able to provide their own {{Stats}} impl, e.g. if they 
want to compute something on a {{BinaryDocValues}} field.

Here too, I'd love to get more ideas though. I tried to avoid implementing N 
collectors, one for each DV type, that would share a large portion of the code. 
But if you feel strongly against making {{DocValuesIterator}} public, maybe 
separate collectors are what we should do ...
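
For discussion, a rough sketch of the trade-off: one stats base class that is generic over the iterator type, so that a single accumulation loop serves numeric and binary fields alike. All names below ({{ValuesIterator}}, {{NumericValues}}, {{BinaryValues}}, {{Stats}}, {{collectAll}}) are hypothetical stand-ins and not the API in the attached patch. The point is only that the common iterator type has to be visible to anyone writing their own {{Stats}} subclass.

```java
class GenericStatsSketch {
    // Stand-in for the common iterator type (the role DocValuesIterator plays).
    interface ValuesIterator {
        boolean advanceExact(int doc); // true if doc has a value
    }

    interface NumericValues extends ValuesIterator { long longValue(); }
    interface BinaryValues extends ValuesIterator { byte[] binaryValue(); }

    // Shared bookkeeping lives here once, whatever the value type is.
    static abstract class Stats<I extends ValuesIterator> {
        int count;
        int missing;
        I values;

        void init(I values) { this.values = values; }

        void accumulate(int doc) {
            if (values.advanceExact(doc)) {
                count++;
                doAccumulate();
            } else {
                missing++;
            }
        }

        abstract void doAccumulate();
    }

    static class MaxLongStats extends Stats<NumericValues> {
        long max = Long.MIN_VALUE;
        void doAccumulate() { max = Math.max(max, values.longValue()); }
    }

    // A user-defined stat over a binary field: total number of bytes seen.
    static class TotalBytesStats extends Stats<BinaryValues> {
        long totalBytes;
        void doAccumulate() { totalBytes += values.binaryValue().length; }
    }

    // The single "collector" loop shared by all value types.
    static <I extends ValuesIterator> void collectAll(Stats<I> stats, I values, int maxDoc) {
        stats.init(values);
        for (int doc = 0; doc < maxDoc; doc++) {
            stats.accumulate(doc);
        }
    }

    // Test helpers: array-backed iterators where null means "no value".
    static NumericValues numeric(Long[] vals) {
        return new NumericValues() {
            int cur = -1;
            public boolean advanceExact(int doc) { cur = doc; return vals[doc] != null; }
            public long longValue() { return vals[cur]; }
        };
    }

    static BinaryValues binary(byte[][] vals) {
        return new BinaryValues() {
            int cur = -1;
            public boolean advanceExact(int doc) { cur = doc; return vals[doc] != null; }
            public byte[] binaryValue() { return vals[cur]; }
        };
    }
}
```

The alternative is to drop the type parameter and write one collector per DV type, which keeps the iterator type private at the cost of duplicating the count/missing bookkeeping in each of them.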

> Add DocValues statistics helpers
> --------------------------------
>
>                 Key: LUCENE-7590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7590
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/misc
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-7590.patch, LUCENE-7590.patch, LUCENE-7590.patch, 
> LUCENE-7590.patch, LUCENE-7590.patch
>
>
> I think it can be useful to have DocValues statistics helpers that allow 
> users to query for the min/max/avg etc. stats of a DV field. In this issue 
> I'd like to cover numeric DV, but there's no reason not to add them for other 
> DV types too.



