[
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911606#comment-13911606
]
Brett Hoerner commented on SOLR-2242:
-------------------------------------
[~shalinmangar], this ticket took a turn towards approximate counts using
probabilistic data structures (specifically HyperLogLog). That's to support
fast approximate unique counts in systems like SolrCloud where each shard could
have hundreds of millions of unique values. It sounds like
{{stats.calcDistinct=true}} does the "correct, but slow" thing?
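
To illustrate why a distributed distinct count is the hard part, here is a minimal sketch (plain Python, not Solr code; the function names and sample values are made up for illustration). Per-shard distinct counts cannot simply be added, because the same value may live on several shards. An exact answer requires merging the actual value sets, which is what becomes expensive at hundreds of millions of values per shard.

```python
def naive_sum(shard_values):
    """Add up each shard's own distinct count (overcounts shared values)."""
    return sum(len(set(values)) for values in shard_values)


def exact_distinct(shard_values):
    """Merge every shard's values into one set, then count.

    Correct, but every distinct value has to be shipped to the coordinator.
    """
    merged = set()
    for values in shard_values:
        merged.update(values)
    return len(merged)


# Hypothetical field values held by two shards; "b" and "c" appear on both.
shards = [["a", "b", "c"], ["b", "c", "d"]]

print(naive_sum(shards))       # 6
print(exact_distinct(shards))  # 4
```

A sketch structure such as HyperLogLog replaces the merged set with a small fixed-size summary per shard, trading exactness for bounded memory and network cost.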
> Get distinct count of names for a facet field
> ---------------------------------------------
>
> Key: SOLR-2242
> URL: https://issues.apache.org/jira/browse/SOLR-2242
> Project: Solr
> Issue Type: New Feature
> Components: Response Writers
> Affects Versions: 4.0-ALPHA
> Reporter: Bill Bell
> Priority: Minor
> Fix For: 4.7
>
> Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch,
> SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch,
> SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch,
> SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
>
>
> A request with facet.field=<name of field> returns the list of distinct
> values for that field with their match counts. This is normal behavior. This
> patch also tells you how many distinct values there are (# of rows in that
> list). Use it with facet.limit=-1 and facet.mincount=1.
> The feature is called "namedistinct". Here is an example:
> Parameters:
> facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turns
> on distinct counting of terms
> facet.field - the field whose terms are counted
> It creates a new section in the facet section of the response...
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">...</lst>
> <lst name="facet_numTerms">
> <lst name="localhost:8983/solr/">
> <int name="price">14</int>
> </lst>
> <lst name="localhost:8080/solr/">
> <int name="price">14</int>
> </lst>
> </lst>
> <lst name="facet_dates"/>
> <lst name="facet_ranges"/>
> </lst>
> OR with no sharding-
> <lst name="facet_numTerms">
> <int name="price">14</int>
> </lst>
> {code}
> Several people use this to get the group.field count (the # of groups).
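
As a rough sketch of how a client might read the facet_numTerms section shown above (this assumes the response shape from this patch exactly as quoted, which is not a committed Solr API; the helper name read_num_terms and the sample strings are illustrative only):

```python
import xml.etree.ElementTree as ET

# Sample responses trimmed from the issue description above.
SHARDED = """<lst name="facet_counts">
  <lst name="facet_numTerms">
    <lst name="localhost:8983/solr/">
      <int name="price">14</int>
    </lst>
    <lst name="localhost:8080/solr/">
      <int name="price">14</int>
    </lst>
  </lst>
</lst>"""

UNSHARDED = """<lst name="facet_numTerms">
  <int name="price">14</int>
</lst>"""


def read_num_terms(xml_text):
    """Return {shard_name: {field: count}}; the unsharded form uses key ""."""
    root = ET.fromstring(xml_text)
    if root.get("name") == "facet_numTerms":
        section = root
    else:
        section = root.find(".//lst[@name='facet_numTerms']")
    if section is None:
        return {}

    result = {}
    # Unsharded form: <int> elements sit directly under facet_numTerms.
    direct = {el.get("name"): int(el.text) for el in section.findall("int")}
    if direct:
        result[""] = direct
    # Sharded form: one nested <lst> per shard.
    for shard in section.findall("lst"):
        result[shard.get("name")] = {
            el.get("name"): int(el.text) for el in shard.findall("int")
        }
    return result


print(read_num_terms(SHARDED))
print(read_num_terms(UNSHARDED))
```

Note that in the sharded form the counts are reported per shard, so the client still has to decide how to combine them.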