[ 
https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303217#comment-14303217
 ] 

Elran Dvir commented on SOLR-5972:
----------------------------------

Hi all,

This patch adds a new statistics result for a field: existInDoc. It 
returns the number of documents in which the field has a value (is not missing).
For multivalued fields, existInDoc is calculated inside the class 
UnInvertedField.  
As of Solr 4.10 there is a fix for the stats calculation of multivalued fields 
that use docValues. The class handling it is DocValuesStats.
I also want to support the existInDoc calculation for multivalued docValues 
fields.
How should I change DocValuesStats to support this?
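
To make the intended semantics concrete, here is a minimal sketch in plain 
Python. It is not Solr code and does not touch UnInvertedField or 
DocValuesStats; the function name exist_in_doc and the dict-based documents 
are assumptions made for illustration only. It shows the one property that 
matters for multivalued fields: a document is counted once, however many 
values the field holds.

```python
def exist_in_doc(docs, field):
    """Count documents in which `field` has at least one value.

    Illustrative only: `docs` is a list of plain dicts, not Solr documents.
    """
    count = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # field is missing from this document
        if isinstance(value, (list, tuple)) and len(value) == 0:
            continue  # multivalued field with no values is treated as missing
        count += 1  # counted once, regardless of how many values there are
    return count


docs = [
    {"id": 1, "tags": ["a", "b"]},  # two values, still one document
    {"id": 2, "tags": []},          # no values: missing
    {"id": 3},                      # field absent: missing
    {"id": 4, "tags": ["c"]},
]
print(exist_in_doc(docs, "tags"))  # prints 2
```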

Thanks.



> new statistics facet capabilities to StatsComponent facet - limit, sort and 
> missing.
> ------------------------------------------------------------------------------------
>
>                 Key: SOLR-5972
>                 URL: https://issues.apache.org/jira/browse/SOLR-5972
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Elran Dvir
>         Attachments: SOLR-5972.patch, SOLR-5972.patch
>
>
> I thought it would be very useful to allow limiting and sorting the 
> StatsComponent facet response.
> I chose to implement it in StatsComponent rather than the Analytics component 
> because Analytics doesn't support distributed queries yet. 
> The default for limit is -1, which returns all facet values.
> The default for sort is no sorting.
> The default for missing is true.
> So if you use the stats component exactly as before, the response does not 
> change.
> If you ask for sort or limit, the missing facet value comes last, as in 
> regular faceting.
> Sort types supported: min, max, sum and countdistinct for stats fields, and 
> count and index for facet fields (all sort types are lowercase).
> Sort directions asc and desc are supported.
> Sorting by multiple fields is supported.
> Our example use case is employees' monthly salaries.
> The following query returns the 10 most "expensive" employees: 
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 least "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum asc&f.employee_name.stats.facet.limit=10"
> The following query returns the employee who received the highest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary max desc&f.employee_name.stats.facet.limit=1"
> The following query returns the employee who received the lowest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary min asc&f.employee_name.stats.facet.limit=1"
> The following query returns the first 10 employees in lexicographic order:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name index asc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees who have worked for the longest 
> period:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name count desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees whose salaries vary the most:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary countdistinct desc&f.employee_name.stats.facet.limit=10"
> Attached a patch implementing this in StatsComponent.
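
The example queries in the description contain spaces inside the sort value, 
so they need URL encoding before being sent. The following is a minimal 
sketch of building such a query string in Python. The helper name 
stats_facet_query is hypothetical, and the f.<field>.stats.facet.sort and 
f.<field>.stats.facet.limit parameters are the ones proposed in this issue's 
patch, so they are only understood by a Solr build with the patch applied.

```python
from urllib.parse import parse_qs, urlencode


def stats_facet_query(stats_field, facet_field, sort, limit, q="*:*"):
    """Build an encoded query string for a sorted, limited stats facet.

    Parameter names follow the SOLR-5972 description; this helper itself
    is illustrative and not part of Solr or any client library.
    """
    prefix = f"f.{facet_field}.stats.facet"
    params = [
        ("q", q),
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
        (f"{prefix}.sort", sort),        # e.g. "salary sum desc"
        (f"{prefix}.limit", str(limit)),
    ]
    # urlencode escapes reserved characters and turns spaces into '+'.
    return urlencode(params)


# The 10 most "expensive" employees, as in the first example query.
query = stats_facet_query("salary", "employee_name", "salary sum desc", 10)
print(query)

# Decoding the string recovers the original parameter values.
decoded = parse_qs(query)
print(decoded["f.employee_name.stats.facet.sort"])  # prints ['salary sum desc']
```

The resulting string would be appended to a select handler URL of your own 
Solr instance; the host and core name are deployment specific and are left 
out here.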



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
