[ 
https://issues.apache.org/jira/browse/SOLR-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180712#comment-14180712
 ] 

Hoss Man commented on SOLR-6349:
--------------------------------

Xu:

I haven't had a chance to review your patch, but it sounds like awesome 
progress -- in response to your specific question...

bq. do we consider calcDistinct is a local parameter like min/max, or it should 
be the top-level parameter. (Which keeps the existing behavior for things like 
stats.field=foo&stats.calcDistinct=true)

...the key is that we need to support both.  as i mentioned previously...

bq. deprecate stats.calcdistinct but use it as a default for the new 
corresponding localparam(s)

and note the pseudo code i posted in my last comment...

{code}
if (statsInResponse.isEmpty()) {
  statsInResponse.addAll(LEGACY_DEFAULT_STATS); // static final EnumSet
  // static {} built EnumSet looping over LEGACY_DEFAULT_STATS
  statsToCalculate.addAll(LEGACY_DEFAULT_STATS_DEPENDS);
}
if (params.getFieldBool(f, STATS_CALCDISTINCT, false)) { // top level req params
  statsInResponse.add(Stat.calcDistinct);
  statsToCalculate.add(Stat.calcDistinct);
}
{code}

to give some concrete examples, if we assume that "foo" is a numeric field, 
(where {{stats.field=foo}} currently returns 
min/max/missing/sum/count/mean/sumOfSquares/stddev) then these are the results 
you should get in various situations...


{noformat}
1) stats.field=foo
or stats.field=foo&stats.calcDistinct=false
or stats.field=foo&f.foo.stats.calcDistinct=false
or stats.field={!min=true max=true missing=true sum=true count=true mean=true sumOfSquares=true stddev=true}foo
or stats.field={!min=true max=true missing=true sum=true count=true mean=true sumOfSquares=true stddev=true}foo&f.foo.stats.calcDistinct=false
or stats.field={!min=true max=true missing=true sum=true count=true mean=true sumOfSquares=true stddev=true calcDistinct=false}foo&f.foo.stats.calcDistinct=true

=> min + max + missing + sum + count + mean + sumOfSquares + stddev

----

2) stats.field=foo&stats.calcDistinct=true
or stats.field=foo&f.foo.stats.calcDistinct=true
or stats.field={!min=true max=true missing=true sum=true count=true mean=true sumOfSquares=true stddev=true calcDistinct=true}foo
or stats.field={!min=true max=true missing=true sum=true count=true mean=true sumOfSquares=true stddev=true calcDistinct=true}foo&stats.calcDistinct=false
or stats.field={!min=true max=true missing=true sum=true count=true mean=true sumOfSquares=true stddev=true calcDistinct=true}foo&f.foo.stats.calcDistinct=false

=> min + max + missing + sum + count + mean + sumOfSquares + stddev + calcDistinct

----

3) stats.field={!calcDistinct=true}foo
or stats.field={!calcDistinct=true}foo&stats.calcDistinct=false
or stats.field={!calcDistinct=true}foo&f.foo.stats.calcDistinct=false

=> calcDistinct

----

4) stats.field={!min=true}foo&stats.calcDistinct=true
or stats.field={!min=true calcDistinct=true}foo&stats.calcDistinct=false
or stats.field={!min=true calcDistinct=true}foo&f.foo.stats.calcDistinct=false

=> min + calcDistinct

{noformat}
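
for what it's worth, here is a rough, self-contained sketch of the precedence rules those examples imply. it is not taken from any of the attached patches: the class name {{StatsSelectionSketch}}, the {{Stat}} enum, and the methods {{statsInResponse}} and {{describe}} are all hypothetical, and the local params are assumed to have already been parsed into a plain Map. the {{legacyCalcDistinct}} argument stands in for the already-resolved value of stats.calcDistinct / f.foo.stats.calcDistinct. the one assumption that goes beyond the pseudo code above is that an explicit calcDistinct local param (true or false) beats the request-level param, which is what the last example in case 1 needs.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch only; names are illustrative, not Solr's actual API.
class StatsSelectionSketch {

  // Assumed enum of the stats named in the examples above.
  enum Stat { min, max, missing, sum, count, mean, sumOfSquares, stddev, calcDistinct }

  // What stats.field=foo returns today for a numeric field (min through stddev).
  static final EnumSet<Stat> LEGACY_DEFAULT_STATS = EnumSet.range(Stat.min, Stat.stddev);

  /**
   * @param localParams        already-parsed local params, e.g. {min=true}
   * @param legacyCalcDistinct resolved stats.calcDistinct / f.foo.stats.calcDistinct
   */
  static EnumSet<Stat> statsInResponse(Map<String, String> localParams,
                                       boolean legacyCalcDistinct) {
    EnumSet<Stat> result = EnumSet.noneOf(Stat.class);

    // Collect every stat explicitly enabled via a local param.
    for (Stat s : Stat.values()) {
      if (Boolean.parseBoolean(localParams.get(s.name()))) {
        result.add(s);
      }
    }

    // No stat explicitly requested: fall back to the legacy defaults.
    if (result.isEmpty()) {
      result.addAll(LEGACY_DEFAULT_STATS);
    }

    // The legacy param is only a default; an explicit local param wins.
    boolean localSaysNothing = !localParams.containsKey(Stat.calcDistinct.name());
    if (localSaysNothing && legacyCalcDistinct) {
      result.add(Stat.calcDistinct);
    }
    return result;
  }

  // Renders a stat set in the "a + b + c" notation used in the examples.
  static String describe(EnumSet<Stat> stats) {
    StringJoiner joiner = new StringJoiner(" + ");
    for (Stat s : stats) {
      joiner.add(s.name());
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    // case 3: stats.field={!calcDistinct=true}foo
    System.out.println(describe(statsInResponse(Map.of("calcDistinct", "true"), false)));
    // case 4: stats.field={!min=true}foo&stats.calcDistinct=true
    System.out.println(describe(statsInResponse(Map.of("min", "true"), true)));
  }
}
```

run as-is, the two calls in {{main}} should print "calcDistinct" and "min + calcDistinct", matching cases 3 and 4. this only models what ends up in the response; the dependent stats needed for distrib computation (statsToCalculate in the pseudo code) are left out on purpose.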

does that make sense?


> LocalParams for enabling/disabling individual stats
> ---------------------------------------------------
>
>                 Key: SOLR-6349
>                 URL: https://issues.apache.org/jira/browse/SOLR-6349
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6349-tflobbe.patch, SOLR-6349-tflobbe.patch, 
> SOLR-6349-tflobbe.patch, SOLR-6349-xu.patch, SOLR-6349-xu.patch, 
> SOLR-6349___bad_idea_broken.patch
>
>
> Stats component currently computes all stats (except for one) every time 
> because they are relatively cheap, and in some cases dependent on each other 
> for distrib computation -- but if we start layering stats on other things it 
> becomes unnecessarily expensive to compute all the stats when they just want 
> the "sum" (and it will definitely become excessively verbose in the 
> responses).  
> The plan here is to use local params to make this configurable.  All of the 
> existing stat options could be modeled as a simple boolean param, but future 
> params (like percentiles) might take in a more complex param value...
> Example:
> {noformat}
> stats.field={!min=true max=true percentiles='99,99.999'}price
> stats.field={!mean=true}weight
> {noformat}


