[ 
https://issues.apache.org/jira/browse/SOLR-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144067#comment-14144067
 ] 

Hoss Man commented on SOLR-6541:
--------------------------------


Setting aside point #2 & #3 for a moment (since they seem to be purely about 
API visibility)...

bq. 1. Accumulate methods should not return stats specific numbers (it is 
generic). ...

I'm not sure that i understand this sentence -- it seems to be refering to the 
DocValuesStats.accumSingle and DocValuesStats.accumMulti, and the fact that 
they return the number of docs "missing" values.  but i don't understand:

a) in what way are you saying these methods are "generic" ?  ... they aren't 
part of any API contract, they're just package private static methods.

b) how is the missing count a "stats specific number" ? ... at first glance, i 
thought you ment it was being accumulated in the AbstractStatsValues base class 
(or code tightly coupled with it) when really it should be associated with code 
in a specific subclass (similar to how subclasses like "NumericStatsValues" 
adds stats like "sum" and "mean" that don't make sense for string & enum 
fields) -- but even if that's hwat you ment, that doens't make sense given that 
"missing" is part of the AbstractStatsValues contract.

bq. 4. We don't need intermediate maps to accumulate missing counts. ...

At a glance this seems like a good idea to me -- but it seems to straight 
forward it makes me suspicious of why it wasn't done that way in the first 
place?

Looking at the comments in SOLR-6452, this seems to have been very deliberate...

bq. As an optimization, why don't you make the missingStats a Map<Integer, 
Integer> and use the ords as keys instead of the terms. That way you don't need 
to do the lookupOrd for all docs, and you do it only once per term in the 
accumulateMissing() method. 


> Enhancement for SOLR-6452 StatsComponent "missing" stat won't work with 
> docValues=true and indexed=false
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-6541
>                 URL: https://issues.apache.org/jira/browse/SOLR-6541
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.10, Trunk
>            Reporter: Vitaliy Zhovtyuk
>            Priority: Minor
>             Fix For: 5.0, Trunk
>
>         Attachments: SOLR-6541.patch
>
>
> This issue is refactoring of solution provided in SOLR-6452 StatsComponent 
> "missing" stat won't work with docValues=true and indexed=false.
> I think the following points need to be addressed:
> 1. Accumulate methods should not return stats specific numbers (it is 
> generic). Attached solution with container class. Also made them private 
> scoped.
> Returning just missing fields from accumulate methods does not allow you to 
> extend it with additional counts field, therefore i propose to leave void.
> 2. Reduced visibility of fields in FieldFacetStats.
> 3. Methods FieldFacetStats#accumulateMissing and 
> FieldFacetStats#accumulateTermNum does not throw any IO exception
> 4. We don't need intermediate maps to accumulate missing counts. Method  
> org.apache.solr.handler.component.FieldFacetStats#facetMissingNum 
> can be changed to work directly on StatsValues structure and removed 
> org.apache.solr.handler.component.FieldFacetStats#accumulateMissing. 
> We don't need to have it in 2 phases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to