[ 
https://issues.apache.org/jira/browse/SOLR-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6349:
---------------------------
    Attachment: SOLR-6349.patch

some more udpates to the patch...

* StatsComponentTest
** undid an odd calcDistinct param change in 
testFieldStatisticsDocValuesAndMultiValuedIntegerFacetStats that shouldn't 
affect the test goal
*** want to ensure the behavior in this test isn't broken by changes
** fixed testFieldStatisticsDocValuesAndMultiValuedDouble
*** was doing stats.field twice in same request diff ways, but only checking one
*** changed to do 2 explicit requests and assert results are the same
*** added in canary assert for future numeric stats
** testIndividualStatLocalParams
*** added a canary assert to protect us against new stats in the future w/o 
updating the test
**** canary helped catch that we weren't testing calcdistinct in these 
permutations
*** added some sanity checks of localparams with 'false' values inspired by bug 
i found in testCalcDistinctStats
**** see question & nocommit (below)
*** added comment explaining point of isShard queries as best i understanding, 
see question & nocommit (below)
*** fixed asserts to play nice with calcdistinct excentricities
** iterateParaCombination
*** kind of hard to understand what this is doing and how it works because of 
recursive nature
*** definitely need to replace magic number "8" since that is brittle against 
future stats and already doesn't jive with num of legal Stat params (missing 
calcdistinct)
*** in general, i want to refactor this to replace it with commons-math's 
Combinations class - i'll look into that tomorow
** testCalcDistinctStats
*** part of the importance here is in the equivilence relationships - so i 
refactored each of the equivilent asesrt conditions to be a single assert 
inside a loop over the params.
**** this helps protect us against someone later thinking it's okay to change 
one assert w/o changing all of the equivilences
*** also simplified asserts to be less brittle: assert if expected stat keys 
are in response or not -- not number of stat keys returned (which might change 
in future if more default stats added -- in the case of these asserts, that 
isn't really relevant, what we care about here is the behavior of interaction 
of the various ways to request calcdistinct)
*** added some more permutations to each of the equivilences sets
**** this lead to discovring a semantics question so far undiscussed as far as 
what if only one stat is specified in a local param, but it's value is 'false' 
(see below)
*** fixed some mistakes in formatting params (eg: 
"f.a_i.stats.calcdistinct=false)"="true")
*** these changes let me eliminate the unused expectedStats and expectedType 
maps from this method (there were virtually unused cut/paste cruft prior to 
these changes anyway)
* TestDistributedSearch
** added some asserts that the (distrib) handling of calcdistinct works a 
variety of ways
* DistributedFacetPivotSmallTest
** small update to assert that when stats hang off pivots the local params 
correctly control which stats are returned.

----

New Questions...

* see nocommit in StatsComponentTest.testIndividualStatLocalParams - why the 
double loop here?{noformat}
   // whitebox test: explicitly ask for isShard=true with an individual stat
   //
   // :nocommit: why loop over every stat to get it's deps and then loop over 
them? ...
   // ... isn't it enough to loop over the set of all known stats?
   for (Stat statAsk:expectedStats.keySet()) {
     EnumSet <Stat> statAskEvaluate = statAsk.getDistribDeps();

     for (Stat stat : statAskEvaluate){
{noformat}

* how should these two queries behave...{noformat}
a) stats = true & stats.field = {!key=k min='false'}a_i

b) stats = true & stats.field = {!key=k min='false'}a_i & stats.calcdistinct = 
true
{noformat}...my gut says that for (a) the stats result set should be completley 
empty; and for (b) only countDistinct and distinctValues should be returned; -- 
because in both cases because the _use_ of localparams regardless of values 
should indicate that the implicit set of default stats should be ignored, and 
only explicitly requested stats should be returned -- but the only explicity 
mentioned stat via localparams is deliberately disabled with 'false'.  at the 
moment however, both of these cases returns all of the implicit default stats 
(see nocommits in tests) -- we'll need to fix this



> LocalParams for enabling/disabling individual stats
> ---------------------------------------------------
>
>                 Key: SOLR-6349
>                 URL: https://issues.apache.org/jira/browse/SOLR-6349
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6349-tflobbe.patch, SOLR-6349-tflobbe.patch, 
> SOLR-6349-tflobbe.patch, SOLR-6349-xu.patch, SOLR-6349-xu.patch, 
> SOLR-6349-xu.patch, SOLR-6349-xu.patch, SOLR-6349.patch, SOLR-6349.patch, 
> SOLR-6349.patch, SOLR-6349___bad_idea_broken.patch
>
>
> Stats component currently computes all stats (except for one) every time 
> because they are relatively cheap, and in some cases dependent on eachother 
> for distrib computation -- but if we start layering stats on other things it 
> becomes unnecessarily expensive to compute all the stats when they just want 
> the "sum" (and it will definitely become excessively verbose in the 
> responses).  
> The plan here is to use local params to make this configurable.  All of the 
> existing stat options could be modeled as a simple boolean param, but future 
> params (like percentiles) might take in a more complex param value...
> Example:
> {noformat}
> stats.field={!min=true max=true percentiles='99,99.999'}price
> stats.field={!mean=true}weight
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to