[ 
https://issues.apache.org/jira/browse/SOLR-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-6354:
---------------------------
    Attachment: SOLR-6354.patch


Much progress -- now actually supports stats over arbitrary functions.

* promoted StatsField to a top level public class
** added accessors for the SchemaField, ValueSource, and calcDistinct props
* changed the API for StatsValuesFactory.createStatsValues to take in the 
StatsField directly
** affected callers: DocValuesStats, UnInvertedField.getStats, FieldFacetStats
** propagate StatsField all the way down to the StatsValues constructors
* changed AbstractStatsValues.accumulate(BytesRef) & setNextReader to 
conditionally check whether we have a FieldType or not
** this seemed more straightforward and less complicated to understand than my 
initial idea of refactoring out new base classes, because of the code 
duplication that would be needed in the concrete leaf level classes (ie: 
NumericValueSourceStatsValues & NumericSchemaFieldStatsValues), which would need 
to have different parents (AbstractSchemaFieldStatsValues vs 
AbstractStatsValues) but would still collect the same types of stats in largely 
the same way.
* more progress on tests 
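
To make the trade-off in the accumulate(BytesRef) bullet concrete, here is a rough sketch of the "one class with a conditional check" shape, as opposed to two parallel base classes. It is not the code in the patch: the class name StatsAccumulator, its methods, and the optional decode callback (standing in for "do we have a FieldType?") are all hypothetical, and the behavior when no field type is present is an assumption made for illustration.

```python
from typing import Callable, Optional


class StatsAccumulator:
    """Hypothetical collector; illustrates the design, not Solr's real classes."""

    def __init__(self, decode: Optional[Callable[[bytes], float]] = None):
        # decode plays the role of the FieldType: present for schema fields,
        # absent (None) when the values come from an arbitrary function.
        self.decode = decode
        self.count = 0
        self.total = 0.0

    def accumulate_value(self, value: float) -> None:
        # Shared path: the stats are collected the same way for both cases.
        self.count += 1
        self.total += value

    def accumulate_raw(self, raw: bytes) -> None:
        # Conditional check instead of a separate class hierarchy.
        # Assumption: raw terms only make sense when a field type can decode them.
        if self.decode is None:
            raise ValueError("raw terms need a field type to decode them")
        self.accumulate_value(self.decode(raw))

    def mean(self) -> Optional[float]:
        return self.total / self.count if self.count else None


# Field-backed stats: raw bytes are decoded before being accumulated.
field_stats = StatsAccumulator(decode=lambda raw: float(raw.decode("ascii")))
field_stats.accumulate_raw(b"2.5")

# Function-backed stats: values arrive already computed, no field type needed.
func_stats = StatsAccumulator()
func_stats.accumulate_value(2.5)
```

Both instances share every line of the accumulation logic, which is the duplication the two-hierarchy alternative would have had to repeat in each leaf class.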

Still some more nocommits, but they're mostly just around some more robust test 
coverage and adding javadocs to some simple methods.

Might be ready to commit tomorrow.

----

[~crawdaddy78] - would love to hear any feedback you have if you get a chance 
to try this out.


> Support stats over functions
> ----------------------------
>
>                 Key: SOLR-6354
>                 URL: https://issues.apache.org/jira/browse/SOLR-6354
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Hoss Man
>         Attachments: SOLR-6354.patch, SOLR-6354.patch, TstStatsComponent.java
>
>
> The majority of the logic in StatsValuesFactory for dealing with stats over 
> fields just uses the ValueSource API.  There's very little reason we can't 
> generalize this to support computing aggregate stats over any arbitrary 
> function (or the scores from an arbitrary query).
> Example...
> {noformat}
> stats.field={!func key=mean_rating mean=true}prod(user_rating,pow(editor_rating,2))
> {noformat}
> ...would mean that we can compute a conceptual "rating" for each doc by 
> multiplying the user_rating field by the square of the editor_rating field, 
> and then we'd compute the mean of that "rating" across all docs in the set 
> and return it as "mean_rating"
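
For anyone who wants to sanity check what that example request is meant to return, the arithmetic can be sketched outside of Solr. This is only an illustration of the described semantics: the helper names (rating, mean_over) and the three sample docs are made up, and nothing here uses Solr APIs.

```python
from typing import Callable, Iterable, Mapping, Optional

Doc = Mapping[str, float]


def rating(doc: Doc) -> float:
    # Mirrors prod(user_rating,pow(editor_rating,2)) from the example request.
    return doc["user_rating"] * doc["editor_rating"] ** 2


def mean_over(docs: Iterable[Doc], fn: Callable[[Doc], float]) -> Optional[float]:
    # Evaluate the function once per doc and average the results.
    total = 0.0
    count = 0
    for doc in docs:
        total += fn(doc)
        count += 1
    return total / count if count else None


# Made-up sample docs, purely for illustration.
docs = [
    {"user_rating": 2.0, "editor_rating": 3.0},  # 2 * 3^2 = 18
    {"user_rating": 4.0, "editor_rating": 1.0},  # 4 * 1^2 = 4
    {"user_rating": 1.0, "editor_rating": 2.0},  # 1 * 2^2 = 4
]

mean_rating = mean_over(docs, rating)  # (18 + 4 + 4) / 3
```

The per-doc value is never stored anywhere; it only exists while the aggregate is being computed, which is why the function can stand in for a field.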



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

