[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15692801#comment-15692801 ]

Lyubov Romanchuk commented on SOLR-5972:

Hi all,

I have attached a patch for multi-valued docValues fields.

Best regards,
Lyuba

> new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
>
>                 Key: SOLR-5972
>                 URL: https://issues.apache.org/jira/browse/SOLR-5972
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Elran Dvir
>         Attachments: SOLR-5972.patch, SOLR-5972.patch, SOLR-5972_multivalue_docvalue.patch
>
> I thought it would be very useful to enable limiting and sorting the StatsComponent facet response.
> I chose to implement it in the StatsComponent rather than the Analytics component because Analytics doesn't support distributed queries yet.
> The default for limit is -1, which returns all facet values.
> The default for sort is no sorting.
> The default for missing is true.
> So if you use the StatsComponent exactly as before, the response is unchanged.
> If you ask for sort or limit, the missing facet value is returned last, as in regular faceting.
> Sort types supported: min, max, sum and countdistinct for stats fields, and count and index for facet fields (all sort types are lower case).
> Sort directions asc and desc are supported.
> Sorting by multiple fields is supported.
> Our example use case is employees' monthly salaries:
> The following query returns the 10 most "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 least "expensive" employees:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum asc&f.employee_name.stats.facet.limit=10"
> The following query returns the employee that got the highest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary max desc&f.employee_name.stats.facet.limit=1"
> The following query returns the employee that got the lowest salary ever:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary min asc&f.employee_name.stats.facet.limit=1"
> The following query returns the first 10 employees in lexicographic order:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name index asc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees that have worked for the longest period:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name count desc&f.employee_name.stats.facet.limit=10"
> The following query returns the 10 employees whose salaries vary the most:
> "q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary countdistinct desc&f.employee_name.stats.facet.limit=10"
> Attached a patch implementing this in StatsComponent.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
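The example queries in the issue description contain spaces and special characters, so in practice they need URL encoding. The following is a minimal Python sketch of how such a request could be assembled. The helper name `stats_facet_query` and the base URL are hypothetical, and the `f.<field>.stats.facet.sort` and `f.<field>.stats.facet.limit` parameters are the ones proposed by this patch, so they only work on a Solr build with the patch applied.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port and core name for a real installation.
BASE_URL = "http://localhost:8983/solr/collection1/select"


def stats_facet_query(stats_field, facet_field, sort, limit, base_url=BASE_URL):
    """Build a StatsComponent request using the sort/limit parameters proposed in SOLR-5972."""
    params = [
        ("q", "*:*"),
        ("stats", "true"),
        ("stats.field", stats_field),
        ("stats.facet", facet_field),
        # Per-field parameters proposed by the patch; not available without it.
        (f"f.{facet_field}.stats.facet.sort", sort),
        (f"f.{facet_field}.stats.facet.limit", str(limit)),
    ]
    # urlencode percent-encodes special characters and turns spaces into '+'.
    return base_url + "?" + urlencode(params)


# The 10 most "expensive" employees, as in the first example query.
url = stats_facet_query("salary", "employee_name", "salary sum desc", 10)
print(url)
```

Passing `"salary max desc"` with a limit of 1 would correspond to the "highest salary ever" example in the same way.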
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303217#comment-14303217 ]

Elran Dvir commented on SOLR-5972:

Hi all,

This patch contains a new statistics result for a field: existInDoc. It returns the number of documents in which the field has a value (is not missing).
For multi-valued fields, existInDoc is calculated inside the class UnInvertedField.
Since Solr 4.10, there has been a fix for the stats calculation of multi-valued fields that use docValues. The class handling this case is DocValuesStats.
I want to support the existInDoc calculation for multi-valued docValues fields as well.
How should I change DocValuesStats to support this?

Thanks.
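To make the definition of existInDoc concrete, here is a small Python sketch of the counting rule described in the comment: a document counts once if the field has at least one value, however many values it holds. It uses plain dictionaries as stand-in documents and a hypothetical function name `exist_in_doc`; it is only an illustration of the definition and does not reflect how UnInvertedField or DocValuesStats are implemented.

```python
def exist_in_doc(docs, field):
    """Count the documents in which `field` has at least one value."""
    count = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # field is missing from this document
        if isinstance(value, (list, tuple)) and len(value) == 0:
            continue  # multi-valued field with no values also counts as missing
        count += 1  # one per document, regardless of the number of values
    return count


# Stand-in documents: "salary" is multi-valued and missing from the last one.
docs = [
    {"employee_name": "a", "salary": [100, 120]},
    {"employee_name": "b", "salary": [90]},
    {"employee_name": "c"},
]

print(exist_in_doc(docs, "salary"))
```

In this sample the field holds three values in total, but only two documents contain it, which is the distinction between a value count and existInDoc.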
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091573#comment-14091573 ]

Hoss Man commented on SOLR-5972:

Elran: I appreciate that you've put a lot of work into trying to improve the {{stats.facet}} feature of StatsComponent, but personally I don't think it's really wise for us to be pursuing multiple divergent sets of facet code in Solr.

The existing StatsComponent faceting code has always felt like a kludge to me, and it has never worked as well or gotten as much developer attention as the FacetComponent.

I think in the long run, implementing things like SOLR-6351 to let people _combine_ StatsComponent with FacetComponent, and deprecating {{stats.facet}} completely, will make a lot more sense and be a lot more powerful.
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083916#comment-14083916 ]

Elran Dvir commented on SOLR-5972:

I attached a newer patch that fixes the calculation of existInDoc for multi-valued fields.