[ 
https://issues.apache.org/jira/browse/SOLR-6351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6351:
---------------------------
    Attachment: SOLR-6351.patch


My main focus the last day or so has been reviewing PivotFacetHelper & 
PivotFacetValue with an eye towards simplifying the amount of redundent code 
between them and StatsComponent.  Some details posted below but one key thing i 
wanted to point out...

Even as (relatively) familiar as i am with the exsting Pivot code, it took me a 
long time to understand how PivotFacetHelper.getStats + PivotListEntry.STATS 
were working in the case of leaf level pivot values -- short answer: 
PivotFacetHelper.getStats totally ignores the Enum value of 
PivotListEntry.STATS and uses "0" (something PivotFacetHelper.getPivots also 
does that i've never noticed before).  

Given that we plan to add more data to pivots in issues like SOLR-4212 & 
SOLR-6353, i really wanted to come up with a pattern for dealing with this that 
was less likeely to trip people up when looking at the code.



{panel:title=Changes in this patch}


* StatsComponent
** refactored out tiny little reusable "unwrapStats" utility
** refactored out reusable "convertToResponse" utility
*** i was hoping this would help encapsulate & simplify the way the count==0 
rules are applied, to make top level consistent with pivots, but that lead me 
down a rabbit hole of pain as far as testing and backcompat and solrj - so i 
just captured it in a 'force' method param.
*** But at least now, the method is consistently called everywhere that outputs 
stats, so if/when we change the rules for how "empty" stats are returned (see 
comments in SOLR-6349) we won't need to audit/change multiple pieces of code, 
we can just focus on callers of this method
** Added a StatsInfo.getStatsField(key) method for use by 
PivotFacetHelper.mergeStats so it wouldn't need to constantly loop over every 
possible stats.field

* PivotFacetValue
** removed an unneccessary level of wrapping arround the Map<String,StatsValues>
** switched to using StatsComponent.convertToResponse directly instead of 
PivotFacetHelper.convertStatsValuesToNamedList

* PivotListEntry
** renamed "index" to "minIndex"
** added an extract method that knows how to correctly deal with the diff 
between "optional" entries that may exist starting at the minIndex, and 
mandatory entires (field,value,count) that *must* exist at the expected index.

* PivotFacetHelper
** changed the various "getFoo" methods to use PivotListEntry.FOO.extract
*** these methods now exact mainly just for convinience with the Object casting
*** this also ment the "retrieve" method could be removed
** simplified mergeStats via:
*** StatsComponent.unwrapStats
*** StatsInfo.getStatsField
** mergeStats javadocs
** removed convertStatsValuesToNamedList

* PivotFacetProcessor
** switch using StatsComponent.convertToResponse

* TestCloudPivots
** update nocommit comment regarding 'null' actualStats based on pain 
encountered working on StastComponent.convertToResponse 
*** added some more sanity check assertions in this case as well

* DistributedFacetPivotSmallTest
** added doTestPivotStatsFromOneShard to account for an edge case in merging 
that occured to me while reviewing PivotFacetHelper.mergeStats 
*** this fails because of how +/-Infinity are treated as the min/max - i'll 
working on fixing this next
*** currently commented out + has some nocommits to beef up this test w/other 
types

* merged my working changes with Vitaliy's additions (but have not yet actually 
reviewed the new tests)...
** FacetPivotSmallTest
** DistributedFacetPivotSmallAdvancedTest
** PivotFacetValue.getStatsValues ... allthough it's not clear to me yet what 
purpose/value this adds?


{panel}


> Let Stats Hang off of Pivots (via 'tag')
> ----------------------------------------
>
>                 Key: SOLR-6351
>                 URL: https://issues.apache.org/jira/browse/SOLR-6351
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, 
> SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, 
> SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch, 
> SOLR-6351.patch, SOLR-6351.patch, SOLR-6351.patch
>
>
> he goal here is basically flip the notion of "stats.facet" on it's head, so 
> that instead of asking the stats component to also do some faceting 
> (something that's never worked well with the variety of field types and has 
> never worked in distributed mode) we instead ask the PivotFacet code to 
> compute some stats X for each leaf in a pivot.  We'll do this with the 
> existing {{stats.field}} params, but we'll leverage the {{tag}} local param 
> of the {{stats.field}} instances to be able to associate which stats we want 
> hanging off of which {{facet.pivot}}
> Example...
> {noformat}
> facet.pivot={!stats=s1}category,manufacturer
> stats.field={!key=avg_price tag=s1 mean=true}price
> stats.field={!tag=s1 min=true max=true}user_rating
> {noformat}
> ...with the request above, in addition to computing the min/max user_rating 
> and mean price (labeled "avg_price") over the entire result set, the 
> PivotFacet component will also include those stats for every node of the tree 
> it builds up when generating a pivot of the fields "category,manufacturer"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to