[
https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303217#comment-14303217
]
Elran Dvir commented on SOLR-5972:
----------------------------------
Hi all,
This patch adds a new statistics result for a field: existInDoc. It
returns the number of documents in which the field has a value (is not missing).
For multivalued fields, existInDoc is calculated inside the class
UnInvertedField.
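To make the intended semantics concrete, here is a minimal sketch in plain Java. It is not Solr code and does not use UnInvertedField or DocValuesStats; the class and method names are hypothetical, and each document is modeled simply as the list of values it holds for the field.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of the patch or of Solr.
class ExistInDocSketch {

    /** Counts the documents in which the field has at least one value. */
    static long existInDoc(List<List<String>> fieldValuesPerDoc) {
        long count = 0;
        for (List<String> values : fieldValuesPerDoc) {
            if (values != null && !values.isEmpty()) {
                count++; // counted once per document, however many values it holds
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("a", "b"),           // multivalued: still one document
                Collections.<String>emptyList(),   // field missing in this document
                Arrays.asList("c"));
        System.out.println(existInDoc(docs));
    }
}
```

The point of the sketch is that a multivalued document contributes 1 to existInDoc, not the number of values it contains.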
As of Solr 4.10, there is a fix for stats calculation on multivalued fields
that have docValues. The class handling that case is DocValuesStats.
I would also like to support the existInDoc calculation for multivalued
docValues fields.
How should I change DocValuesStats to support this?
Thanks.
> new statistics facet capabilities to StatsComponent facet - limit, sort and
> missing.
> ------------------------------------------------------------------------------------
>
> Key: SOLR-5972
> URL: https://issues.apache.org/jira/browse/SOLR-5972
> Project: Solr
> Issue Type: New Feature
> Reporter: Elran Dvir
> Attachments: SOLR-5972.patch, SOLR-5972.patch
>
>
> I thought it would be very useful to allow limiting and sorting the
> StatsComponent facet response.
> I chose to implement it in StatsComponent rather than the Analytics component
> because Analytics doesn't support distributed queries yet.
> The default for limit is -1 - returns all facet values.
> The default for sort is no sorting.
> The default for missing is true.
> So if you use the stats component exactly as before, the response won't
> change.
> If you ask for sort or limit, the missing facet value will be last, as in
> regular facets.
> Sort types supported: min, max, sum and countdistinct for stats fields, and
> count and index for facet fields (all sort types are lower cased).
> Sort directions asc and desc are supported.
> Sorting by multiple fields is supported.
> Our example use case will be employees' monthly salaries:
> The following query returns the 10 most "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary
> sum desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 least "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary
> sum asc&f.employee_name.stats.facet.limit=10"
> The following query returns the employee that got the highest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary
> max desc&f.employee_name.stats.facet.limit=1"
> The following query returns the employee that got the lowest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary
> min asc&f.employee_name.stats.facet.limit=1"
> The following query returns the first 10 employees in lexicographic order:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name
> index asc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees that have worked for the longest
> period:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name
> count desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees whose salaries vary the most:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary
> countdistinct desc&f.employee_name.stats.facet.limit=10"
> Attached a patch implementing this in StatsComponent.
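As a rough illustration of what a "salary sum desc" sort with a limit is described as returning, here is a minimal sketch in plain Java. It is not the patch's implementation and uses no Solr classes; the class name, method name, and sample data are hypothetical, and handling of the missing facet value is not modeled. A negative limit is treated as "return all facet values", matching the stated default of -1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the patch or of Solr.
class StatsFacetSortSketch {

    /**
     * Orders facet values (employee names) by the sum of their salaries,
     * descending, and keeps the first `limit` of them. A negative limit
     * keeps all facet values.
     */
    static List<String> topBySum(Map<String, double[]> salariesByEmployee, int limit) {
        // Compute the "sum" statistic per facet value.
        final Map<String, Double> sums = new HashMap<>();
        for (Map.Entry<String, double[]> e : salariesByEmployee.entrySet()) {
            double sum = 0;
            for (double salary : e.getValue()) {
                sum += salary;
            }
            sums.put(e.getKey(), sum);
        }

        // Sort facet values by that statistic, largest first.
        List<String> names = new ArrayList<>(sums.keySet());
        names.sort((a, b) -> Double.compare(sums.get(b), sums.get(a)));

        // Apply the limit, if any.
        if (limit < 0 || limit >= names.size()) {
            return names;
        }
        return new ArrayList<>(names.subList(0, limit));
    }

    public static void main(String[] args) {
        Map<String, double[]> salaries = new HashMap<>();
        salaries.put("alice", new double[] {100, 200});
        salaries.put("bob", new double[] {500});
        salaries.put("carol", new double[] {50, 50});
        System.out.println(topBySum(salaries, 2));
    }
}
```

Swapping the summed value for a per-employee min, max, or distinct count would give the other stats sort types in the same way.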