[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580945#comment-14580945 ]

Shalin Shekhar Mangar commented on SOLR-2242:

I think this can be closed now that SOLR-6968 has added support for fast approximate cardinality estimation.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0-ALPHA
Reporter: Bill Bell
Priority: Minor
Fix For: 4.9, Trunk
Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

Parameters:
facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms
facet.field - the field to count the terms

It creates a new section in the facet section...

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=false&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">...</lst>
  <lst name="facet_numTerms">
    <lst name="localhost:8983/solr/">
      <int name="price">14</int>
    </lst>
    <lst name="localhost:8080/solr/">
      <int name="price">14</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>

OR with no sharding-

<lst name="facet_numTerms">
  <int name="price">14</int>
</lst>
{code}

Several people use this to get the group.field count (the # of groups).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
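The request and response shapes in the issue description can be exercised with a small client-side sketch. This is an illustration only: `build_num_terms_url` and `parse_num_terms` are hypothetical helper names, and `facet.numTerms` is the parameter proposed by the SOLR-2242 patch, so the URL is only meaningful against a Solr build that has that patch applied.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_num_terms_url(base, field, shards=None):
    """Assemble a select URL from the parameters named in the issue description."""
    params = []
    if shards:
        params.append(("shards", ",".join(shards)))
    params += [
        ("indent", "true"),
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.mincount", "1"),
        ("facet.numTerms", "true"),  # assumption: only understood by a patched Solr
        ("facet.limit", "-1"),
        ("facet.field", field),
    ]
    # Keep ':', ',', '*' and '/' readable, as in the URLs quoted in the issue.
    return base + "?" + urlencode(params, safe=":,*/")


def parse_num_terms(xml_text):
    """Read the facet_numTerms section; handles the sharded and unsharded layouts."""
    root = ET.fromstring(xml_text)
    if root.get("name") == "facet_numTerms":
        section = root
    else:
        section = root.find(".//lst[@name='facet_numTerms']")
    if section is None:
        return {}
    per_shard = section.findall("lst")
    if per_shard:
        # Sharded layout: one nested <lst> per shard, each holding per-field counts.
        return {
            shard.get("name"): {i.get("name"): int(i.text) for i in shard.findall("int")}
            for shard in per_shard
        }
    # Unsharded layout: per-field counts sit directly under facet_numTerms.
    return {i.get("name"): int(i.text) for i in section.findall("int")}
```

Note that in the sharded layout the patch reports one count per shard rather than a single merged figure, so the caller still has to decide how to combine them.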
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355860#comment-14355860 ]

Gregory Chanan commented on SOLR-2242:

should we update the version to something unreleased? I'm not sure what the most appropriate version is.

Key: SOLR-2242
Fix For: 4.9, Trunk
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911606#comment-13911606 ]

Brett Hoerner commented on SOLR-2242:

[~shalinmangar], this ticket took a turn towards approximate counts using probabilistic data structures (specifically HyperLogLog). That's to support fast approximate unique counts in systems like SolrCloud where each shard could have hundreds of millions of unique values.

It sounds like {{stats.calcDistinct=true}} does the correct, but slow thing?

Key: SOLR-2242
Fix For: 4.7
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911611#comment-13911611 ]

Shalin Shekhar Mangar commented on SOLR-2242:

bq. It sounds like stats.calcDistinct=true does the correct, but slow thing?

Yes, that is why I did not close the ticket.

bq. this ticket took a turn towards approximate counts using probabilistic data structures (specifically HyperLogLog). That's to support fast approximate unique counts in systems like SolrCloud where each shard could have hundreds of millions of unique values.

Do you know what the state of this patch is? Are people using the HyperLogLog implementation in production? Apart from a committer's attention, what does this issue need?

Key: SOLR-2242
Fix For: 4.7
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911796#comment-13911796 ]

Brett Hoerner commented on SOLR-2242:

There is no public patch that I know of that does the HyperLogLog stuff. [~terrance.snyder] mentioned it in his comment above, but that's it. I haven't started any work here yet but I hoped to in the future.

Key: SOLR-2242
Fix For: 4.7
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911260#comment-13911260 ]

Shalin Shekhar Mangar commented on SOLR-2242:

I think this is possible now with SOLR-5428 - StatsComponent can count distinct values of a field with the stats.calcDistinct=true parameter.

Key: SOLR-2242
Fix For: 4.7
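For comparison with the facet-based URLs in the issue description, a StatsComponent request of the kind mentioned here might be assembled as below. This is a sketch under assumptions: `build_calc_distinct_url` is a hypothetical helper name, the host and the `price` field are borrowed from the issue's own examples, and `rows=0` is added only to suppress the document list.

```python
from urllib.parse import urlencode


def build_calc_distinct_url(base, field, query="*:*"):
    """Build a select URL that asks StatsComponent for distinct values of one field."""
    params = [
        ("q", query),
        ("rows", "0"),                   # only the stats section is of interest
        ("stats", "true"),
        ("stats.field", field),
        ("stats.calcDistinct", "true"),  # parameter named in the comment above
    ]
    return base + "?" + urlencode(params, safe=":*")


url = build_calc_distinct_url("http://localhost:8983/solr/select", "price")
```

Because this computes an exact answer, the cost grows with the number of unique values, which is the trade-off the thread goes on to discuss.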
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883528#comment-13883528 ]

Vassil Velichkov commented on SOLR-2242:

I really hope that this issue will be resolved in SOLR 4.7...Fingers crossed :-)

Key: SOLR-2242
Fix For: 4.7
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701761#comment-13701761 ]

Bill Bell commented on SOLR-2242:

The one use case (2 parts) that I want to make sure we are satisfying is:

1. Ability to get the total number of distinct terms in the facet.field. For example, if facet.field=gender, I would expect the distinct count to be 1 or 2 (Male/Female) depending on filters.
2. For sharding, Terrance's might be the right approach, but is it accurate or an approximation? For small sets sharding will work fine (< 100 results). For example, if you were asking for distinct counts from 2 shards, and the shards were set up with 20 states in one shard and 30 in the other, I would expect distinct states = 50. Will your solution do that?

Thanks - so happy this is moving forward. Not sure I understand the syntax from Terrance yet... :)

Key: SOLR-2242
Fix For: 4.4
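The two-shard question above comes down to how per-shard results are merged. The sketch below is not taken from any attached patch; the function names and the `stateNN` labels are made up for illustration. It contrasts an exact merge (union the term sets, then count) with simply adding per-shard counts, which is only right when no value appears on more than one shard.

```python
def merged_distinct_count(shard_terms):
    """Exact distinct count across shards: union the term sets, then count."""
    seen = set()
    for terms in shard_terms:
        seen.update(terms)
    return len(seen)


def summed_distinct_count(shard_terms):
    """Naive merge: add per-shard distinct counts (overcounts shared values)."""
    return sum(len(set(terms)) for terms in shard_terms)


# Disjoint shards, as in the 20 + 30 states example: both merges agree.
shard_a = [f"state{i:02d}" for i in range(20)]
shard_b = [f"state{i:02d}" for i in range(20, 50)]
disjoint_exact = merged_distinct_count([shard_a, shard_b])
disjoint_summed = summed_distinct_count([shard_a, shard_b])

# Overlapping shards: five states live on both, so the naive sum is too high.
shard_c = [f"state{i:02d}" for i in range(15, 50)]
overlap_exact = merged_distinct_count([shard_a, shard_c])
overlap_summed = summed_distinct_count([shard_a, shard_c])
```

The exact merge needs every shard to ship its full term set to the coordinator, which is fine for 50 states and is the cost that motivates the approximate approach for very high-cardinality fields.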
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697431#comment-13697431 ]

Otis Gospodnetic commented on SOLR-2242:

This issue looks very interesting and it looks like it's 2 years old with [~billnbell] having moved on, most likely. Based on my reading of the last 2 years' worth of comments above, [~terrance.snyder]'s comment (see https://issues.apache.org/jira/browse/SOLR-2242?focusedCommentId=13275101&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13275101) seemed the most thorough and his idea the most advanced. He posted a patch to S3 which is no longer there. :(

I'll email Terrance now in hopes of getting his patch attached here, but it would be great if somebody with more knowledge of the faceting/pivot area of Solr could push this. I saw [~yo...@apache.org] did look at this issue a while back...

Key: SOLR-2242
Fix For: 4.4
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697454#comment-13697454 ]

Terrance A. Snyder commented on SOLR-2242:

[~otis] I got the email - I'll give some background as we've enhanced and combined, but I should be able to put together a patch in the following week. There is an old version on GitHub that I need to update to trunk; I'll spend time doing this. Most of this work was enhancing two existing JIRA items, which are wonderful.

Core Work:
https://issues.apache.org/jira/browse/SOLR-2894
https://issues.apache.org/jira/browse/SOLR-3583

Newer features:
+ Some of the issues that have been discussed around distributed counting have already been dealt with in larger installations (counting billions of items). I work in the advertising space, and counting and slicing/dicing things while sending 90+ billion documents between shards on highly unique facet counts such as session ID or cookie ID is hugely wasteful and doesn't scale.
+ The Ad industry is great at counting stuff at scale - sessions, web events, etc. We take the stance that counting stuff can be roughly right when we get to billions: a + or - 0-1.5% error rate is OK when the response goes from minutes to milliseconds. As such, optional parameters for estimated count are added which will leverage a HyperLogLog implementation to give a 98.5% correct response. By default this is turned on for us - on a large installation (multiple billions of POS transactions)

*Questions as I'd like to actually do this right*
+ Rather than re-invent the wheel I use stream-lib (https://github.com/clearspring/stream-lib). It is Apache licensed and includes HyperLogLog, HyperLogLogPlus, BloomFilters, TopK, QDigest, etc. Is this an issue?
+ Test cases - I've got 82% code coverage - is this good enough?
+ Documentation - I've got markdown documents that cover the commands and syntax - is this the right format?
+ SOLR-2894, SOLR-3583 - It makes logical sense that these start to be joined together. When using all these I sometimes start smelling Solr as an analytic engine (and it's a very nice one when combining probabilistic data structures).

If someone can answer the above questions while I sync to /trunk please let me know.

Old version for posterity until I get around to updating to latest trunk and including the HyperLogLog implementation - doesn't include HyperLogLog sketching - minor updates.
https://github.com/terrancesnyder/solr-analytics/blob/master/solr/core/src/java/org/apache/solr/handler/component/PivotFacetHelper.java

Key: SOLR-2242
Fix For: 4.4
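To make the HyperLogLog idea discussed above concrete, here is a minimal, self-contained sketch of the data structure. It is not stream-lib's code and not what any Solr patch ships; the class name, the SHA-1 based hash and the default precision are all assumptions chosen for illustration. The property that matters for sharding is `merge`: each shard can build its own fixed-size sketch, and combining them register by register gives the same sketch as if one node had seen every value.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (illustrative, not production code)."""

    def __init__(self, p=12):
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16 for this sketch")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    @staticmethod
    def _hash64(value):
        # Assumption: first 8 bytes of SHA-1 serve as a 64-bit hash.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value):
        h = self._hash64(value)
        idx = h >> (64 - self.p)                # top p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Position of the leftmost 1-bit in `rest`, counting from 1.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        """Fold another sketch of the same precision into this one."""
        if other.p != self.p:
            raise ValueError("cannot merge sketches with different precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)        # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)      # small-range (linear counting) correction
        return raw


# Two "shards" with overlapping values, plus a single sketch over everything.
shard_a, shard_b, combined = HyperLogLog(), HyperLogLog(), HyperLogLog()
for i in range(0, 6000):
    shard_a.add(f"id-{i}")
for i in range(4000, 10000):
    shard_b.add(f"id-{i}")
for i in range(0, 10000):
    combined.add(f"id-{i}")
shard_a.merge(shard_b)
estimate = shard_a.count()
```

With p=12 the sketch holds 4096 small registers regardless of how many values are added, and the theoretical standard error is about 1.04 / sqrt(4096), roughly 1.6%, which is in line with the error range quoted in the comment above.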
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697475#comment-13697475 ]

J Mohamed Zahoor commented on SOLR-2242:

As I mentioned in my earlier comment, we have experimented with the stream-lib implementation of HLL for a similar purpose, and it is good. It makes good sense to have probabilistic data structures for a large number of docs.

BTW, we are using Solr as an analytics engine with great success.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697485#comment-13697485 ]

Otis Gospodnetic commented on SOLR-2242:

[~terrance.snyder] - you're quick. :)

Answers:
* stream-lib and license - perfectly fine. At Sematext we use their stuff as well.
* 82% test case coverage - good!
* documentation - eventually it should be put on the Solr Wiki, but don't let that stop you!
* smell - precisely!
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692002#comment-13692002 ]

Jonathan Rochkind commented on SOLR-2242:

Shawn Heisey: Forgive me if I'm misunderstanding what you don't understand, but here's what this feature does, at a high level:

You can already ask Solr for a facet response. You get, for instance, the first 10 (or first `facet.limit`) facet values, sorted by your chosen sort criteria.

You can also already page through all the facet values, using facet.offset combined with facet.limit.

You can page through them, but you don't know how long you'll be paging for. At some point your request with a given facet.offset will just stop returning results because you've exhausted all the facet values available, but you have no way to know when that will be until you get there. There is no way to get the total number of facet results available.

This feature is meant to add that: a way to get, in the response, the count of the total number of unique facet values, the ones you'd be paging through with facet.offset.

Apparently what makes it tricky to implement is the distributed environment?

Some of the language used in this ticket to refer to the feature is indeed confusing IMO. I hope this helps.
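The paging problem described in this comment can be sketched without a running Solr. In the illustration below, `facet_page` is a hypothetical stand-in for one facet request using facet.offset and facet.limit, and the 14 terms mirror the `price` count in the issue description; without a total count, the client only learns it is done when a page comes back empty.

```python
def facet_page(all_values, offset, limit):
    """Hypothetical stand-in for one Solr request with facet.offset/facet.limit."""
    return all_values[offset:offset + limit]


def count_by_paging(all_values, limit=10):
    """Count facet values by paging until a request returns nothing."""
    offset = 0
    total = 0
    requests = 0
    while True:
        page = facet_page(all_values, offset, limit)
        requests += 1
        if not page:
            # Only now does the client know it has seen everything.
            break
        total += len(page)
        offset += limit
    return total, requests


# 14 made-up facet values, matching the count shown in the description.
values = ["term%d" % i for i in range(14)]
total, requests = count_by_paging(values, limit=10)
print(total, requests)
```

With a distinct count available up front, the same client could compute the number of pages before issuing the first request.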
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692023#comment-13692023 ]

Shawn Heisey commented on SOLR-2242:

That will indeed help as I find time to look things over. Thanks!
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692043#comment-13692043 ]

Robert Muir commented on SOLR-2242:

{quote}
Apparently what makes it tricky to implement is the distributed environment?
{quote}

Because you have to merge *all* the values to get the unique count.
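A small illustration of the point about merging: per-shard distinct counts cannot simply be added, because the same term may live on more than one shard. The shard names follow the example URLs in the description; the term values are made up for the sketch.

```python
# Hypothetical distinct 'price' terms held by each shard.
shard_terms = {
    "localhost:8983/solr": {"9.99", "19.95", "74.99"},
    "localhost:7574/solr": {"19.95", "74.99", "179.99"},
}

# What each shard can report on its own.
per_shard_counts = {shard: len(terms) for shard, terms in shard_terms.items()}

# Adding the per-shard counts double-counts terms present on both shards.
naive_sum = sum(per_shard_counts.values())

# The exact answer needs the union of every shard's full term set.
exact = len(set().union(*shard_terms.values()))

print(per_shard_counts, naive_sum, exact)
```

The exact count requires every shard to ship its complete term list to the coordinating node, which is where the cost comes from on high-cardinality fields.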
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692154#comment-13692154 ]

Shawn Heisey commented on SOLR-2242:

Which patch represents the best work? I got SOLR-2242-solr40-3.patch to apply to trunk with a little love, but tests having to do with facets are failing. It is also quite a bit smaller than the newest patch for 3x.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691078#comment-13691078 ]

J Mohamed Zahoor commented on SOLR-2242:

One way to achieve this in a distributed environment is to use an approximation technique like HyperLogLog.
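As a rough illustration of why HyperLogLog fits the distributed case: each shard can build a small fixed-size sketch of its terms, and two sketches merge by taking the register-wise maximum, so no shard has to send its full term list. The class below is a simplified implementation written for this illustration only. It is not stream-lib's API; the register count, the SHA-1 hash, and all names are assumptions.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog sketch with 2**p registers (illustration only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash
        idx = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        """Return a sketch equivalent to one built over both inputs."""
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


# Two hypothetical shards with overlapping terms (10000 distinct overall).
shard_a, shard_b = HyperLogLog(), HyperLogLog()
for i in range(6000):
    shard_a.add(i)
for i in range(4000, 10000):
    shard_b.add(i)

merged = shard_a.merge(shard_b)
print(round(merged.estimate()))
```

Each sketch here holds 4096 small integers regardless of how many terms the shard has, and the merged estimate is typically within a few percent of the true distinct count.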
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13690500#comment-13690500 ]

Shawn Heisey commented on SOLR-2242:

[~billnbell] [~yriveiro]: I am having a hard time understanding what this feature actually DOES, in concrete terms. That's my failing, the info is probably in the description and comments, it's just not sinking in.

I am willing to pursue this to the best of my ability, but I will admit in advance that my ability may not be quite enough. I'm new to the committer role, which means that I don't work very fast and I'm still learning the ropes. I think I can commit it and backport to 4.x if the following criteria are met:

*) We can get the feature to apply to trunk and consistently pass tests (not counting what's failing due to other problems).
*) There are new tests for all new functionality.
*) We put it up for review by other committers, particularly [~rcmuir] and [~ysee...@gmail.com], and there are no negative votes.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644329#comment-13644329 ]

Yago Riveiro commented on SOLR-2242:

It is unfortunate that this feature is stalled.

With sharding this feature is a killer; I've been waiting for it ever since I learned it was in progress.

Key: SOLR-2242
Fix For: 4.3
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645265#comment-13645265 ]

J Mohamed Zahoor commented on SOLR-2242:

+1 for this feature. With sharding support it's a killer, really...

Key: SOLR-2242
Fix For: 4.3
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644287#comment-13644287 ]

Bill Bell commented on SOLR-2242:

Yeah. This issue has stalled. To get it ready for release we just need to apply the patch and run all unit tests. Issues tend to stall when we don't have a committer leading the work to get it done. If someone will step up, I will commit to doing the work.

The last time I made a push for this there were several approaches:

1. Change the facet formats (Yonik)
2. Change the parameter names and hide the fact that we are looping through all (limit=-1).
3. Try to get the sharding working. Although I would contend that we can release without sharding and add it later.

Sharding - we can send the unique terms and combine them to get exact numbers, or we can keep the counts separate per shard and send them (as it is now). The former is much harder to do and could cause perf issues.

Thoughts? Maybe at the Lucene conference this can be discussed?

Key: SOLR-2242
Fix For: 4.3
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570063#comment-13570063 ]

J Mohamed Zahoor commented on SOLR-2242:

Does the patch provide distinct counts in the case of multiple shards?

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0-ALPHA
Reporter: Bill Bell
Priority: Minor
Fix For: 4.2, 5.0
Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
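In terms of the response format shown in the issue description, a sharded request returns one facet_numTerms entry per shard rather than a single merged number. The following is a minimal sketch of reading such a response, assuming the XML has the shape given in the description; the sample document and the function name are illustrative and not part of the patch.

```python
import xml.etree.ElementTree as ET

# Sample response fragment modelled on the one in the issue description.
SAMPLE = """
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_numTerms">
    <lst name="localhost:8983/solr/">
      <int name="price">14</int>
    </lst>
    <lst name="localhost:8080/solr/">
      <int name="price">14</int>
    </lst>
  </lst>
</lst>
"""


def num_terms_by_shard(xml_text, field):
    """Return {shard name: distinct term count} for one facet field."""
    root = ET.fromstring(xml_text)
    section = root.find("./lst[@name='facet_numTerms']")
    result = {}
    if section is None:
        return result
    for shard in section.findall("lst"):
        node = shard.find("./int[@name='%s']" % field)
        if node is not None:
            result[shard.get("name")] = int(node.text)
    return result


print(num_terms_by_shard(SAMPLE, "price"))
```

The per-shard values are reported side by side; nothing in this format says how many terms the shards have in common.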
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507488#comment-13507488 ]

Amber Duque commented on SOLR-2242:

I have a question on the SOLR-2242-solr40-3.patch. I have applied this patch on top of the Solr 4.0 release (http://svn.apache.org/repos/asf/lucene/dev/tags/ - lucene_solr_4_0_0). The patch builds fine, but several solr unit tests fail:

Tests with failures:
- org.apache.solr.request.TestFaceting.testFacets
- org.apache.solr.request.TestFaceting.testRegularBig
- org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch
- org.apache.solr.TestDistributedSearch.testDistribSearch
- org.apache.solr.TestDistributedGrouping.testDistribSearch
- org.apache.solr.request.SimpleFacetsTest (suite)
- org.apache.solr.TestGroupingSearch.testRandomGrouping
- org.apache.solr.TestGroupingSearch.testGroupingGroupedBasedFaceting
- org.apache.solr.cloud.BasicDistributedZk2Test.testDistribSearch

Do the unit tests pass successfully for anyone (for this patch applied on top of the solr 4.0 release)?

Thanks!

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0-ALPHA
Reporter: Bill Bell
Priority: Minor
Fix For: 4.1
Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495020#comment-13495020 ]

Bill Bell commented on SOLR-2242:

uygar,

You are not using it properly. SOLR-2242-3x_5_tests.patch does indeed work.

http://x.x.x.x:8985/solr/ar1/select?shards=192.168.200.202:8985/solr/ar3/,192.168.200.202:8985/solr/ar4&q=hotels&group=true&group.field=site&facet=true&f.site.facet.numFacetTerms=1&facet.mincount=1&facet.limit=-1

You forgot the facet.field=site

With sample data, do the following. Copy example to example2, and change jetty.xml on example2 to be port 8080. Run this:

http://localhost:8983/solr/select?shards=localhost:8983/solr/,localhost:8080/solr/&q=*:*&rows=0&facet=true&facet.field=price&facet.numTerms=true&facet.mincount=1&facet.limit=-1

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0-ALPHA
Reporter: Bill Bell
Priority: Minor
Fix For: 4.1
Attachments: SOLR-2242-3x_5_tests.patch, SOLR-2242-3x.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR-2242-solr40-3.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}
<lst name="facet_fields">
 <lst name="price">
  <int name="numFacetTerms">14</int>
  <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
 </lst>
</lst>
{code}

Several people use this to get the group.field count (the # of groups).
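Since the failing request above was simply missing facet.field, it may help to assemble the parameters programmatically so that none is dropped. The sketch below builds the sample-data request from Bill's comment; the helper name is hypothetical, and facet.numTerms is only understood by a Solr build with this patch applied.

```python
from urllib.parse import urlencode


def num_terms_url(base, field, shards=None):
    """Build a select URL that asks for the distinct term count of one field."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", field),        # required: the field whose terms are counted
        ("facet.numTerms", "true"),    # parameter added by the SOLR-2242 patch
        ("facet.mincount", "1"),
        ("facet.limit", "-1"),
    ]
    if shards:
        params.insert(0, ("shards", ",".join(shards)))
    # Keep '*', ':', '/' and ',' readable instead of percent-encoding them.
    return base + "?" + urlencode(params, safe="*:/,")


url = num_terms_url(
    "http://localhost:8983/solr/select",
    "price",
    shards=["localhost:8983/solr/", "localhost:8080/solr/"],
)
print(url)
```

Leaving `shards` out yields the non-distributed form of the same request.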
Re: [jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
Bill, which patch is working for you? It is difficult to follow! :)

On Sat, Jun 9, 2012 at 1:02 AM, William Bell billnb...@gmail.com wrote:

I am not sure what the issue is. This is working for me...

On Fri, Jun 8, 2012 at 8:35 AM, Jason Rutherglen (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291803#comment-13291803 ]

Jason Rutherglen commented on SOLR-2242:

Terrance, can you post a patch to the Jira? It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue...
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291803#comment-13291803 ]

Jason Rutherglen commented on SOLR-2242:

Terrance, can you post a patch to the Jira? It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue...

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
Re: [jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
I am not sure what the issue is. This is working for me...

On Fri, Jun 8, 2012 at 8:35 AM, Jason Rutherglen (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291803#comment-13291803 ]

Jason Rutherglen commented on SOLR-2242:

Terrance, can you post a patch to the Jira? It makes sense to start this Jira off non-distributed, and add a distributed version in another Jira issue...
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257323#comment-13257323 ]

uygar bayar commented on SOLR-2242:

Hi,

I tried it on 3.6.0 with SOLR-2242-3x_5_tests.patch, but it didn't work. The results are grouped, but all the facets are empty.

<lst name="facet_counts">
 <lst name="facet_queries"/>
 <lst name="facet_fields"/>
 <lst name="facet_numTerms"/>
 <lst name="facet_dates"/>
 <lst name="facet_ranges"/>

http://x.x.x.x:8985/solr/ar1/select?shards=192.168.200.202:8985/solr/ar3/,192.168.200.202:8985/solr/ar4&q=hotels&group=true&group.field=site&facet=true&f.site.facet.numFacetTerms=1&facet.mincount=1&facet.limit=-1

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244961#comment-13244961 ]

Bill Bell commented on SOLR-2242:

Ready for 3x merge. Test with:

ant test -Dtestcase=NumFacetTermsFacetsTest

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_5_tests.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240225#comment-13240225 ]

Bill Bell commented on SOLR-2242:

I changed the sharding response to check the size and only return the shard name if there is a response.

{code}
<lst name="facet_numTerms">
 <lst name="localhost:8983/solr/">
 </lst>

Changed to

<lst name="facet_numTerms"/>
{code}

Also, the code for field_facets was wrong. It needs to return the name of the field even if the size is 0 or null.

See latest patch for 3x.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
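The response change Bill describes (a shard is listed under facet_numTerms only when it actually returned counts) can be sketched roughly as follows. This is an illustration and not the patch's Java code: plain Python dicts stand in for Solr's response structures, and the function name is hypothetical.

```python
def prune_empty_shards(num_terms_by_shard):
    """Drop shards that reported no counts, so empty shard entries are not emitted."""
    return {
        shard: counts
        for shard, counts in num_terms_by_shard.items()
        if counts  # skips both None and empty mappings
    }


before = {"localhost:8983/solr/": {}}
mixed = {"localhost:8983/solr/": {"price": 14}, "localhost:8080/solr/": None}

print(prune_empty_shards(before))
print(prune_empty_shards(mixed))
```

An empty result corresponds to the bare facet_numTerms element in the "Changed to" example above.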
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240240#comment-13240240 ]

Bill Bell commented on SOLR-2242:

Found a bug and attaching new patch.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_2.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240270#comment-13240270 ]

Bill Bell commented on SOLR-2242:

All tests pass on branch_3x now.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-3x_4.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239441#comment-13239441 ]

Erick Erickson commented on SOLR-2242:

Bill: Tests do not pass on either 3.x or trunk with this patch.

Some 3.x failures:
ant test -Dtestcase=TestDistributedSearch
ant test -Dtestcase=testGroupingGroupedBasedFaceting
ant test -Dtestcase=TestDistributedGrouping

Some 4.x failures:
ant test -Dtestcase=BasicDistributedZkTest
ant test -Dtestcase=TestGroupingSearch

I'm not sure whether these are test problems or something more serious...

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int><int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int><int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int><int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int><int name="649.99">1</int><int name="2199.0">1</int>
  </lst>
</lst>
{code}

Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
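As a rough illustration of what the numFacetTerms value in the issue description represents, the sketch below counts the distinct price values listed in that sample response. The dictionary is transcribed by hand from the response, and the helper name count_distinct_terms is made up for this example; it is not part of Solr or of the patch.

```python
# Facet counts for the "price" field, transcribed from the sample response.
price_facets = {
    "0.0": 3, "11.5": 1, "19.95": 1, "74.99": 1, "92.0": 1,
    "179.99": 1, "185.0": 1, "279.95": 1, "329.95": 1, "350.0": 1,
    "399.0": 1, "479.95": 1, "649.99": 1, "2199.0": 1,
}


def count_distinct_terms(facets, mincount=1):
    """Count terms whose document count is at least mincount (hypothetical helper)."""
    return sum(1 for count in facets.values() if count >= mincount)


print(count_distinct_terms(price_facets))  # 14
```

With facet.limit=-1 and facet.mincount=1 every matching term is returned, which is why the number of entries and the reported count agree in this sample.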
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239544#comment-13239544 ]

Yonik Seeley commented on SOLR-2242:

There are other JIRA issues open for adding more facet-related data as well, and adding a new section for each doesn't seem desirable. I think I'm still in favor of biting the bullet and changing the facet response format for 4.0, while having some sort of flag to enable the older format for back compat.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240158#comment-13240158 ]

Bill Bell commented on SOLR-2242:

Yonik, agreed. However, what is the alternative? We are talking about distinct terms, and unless I limit the number of terms there could be a performance issue when using this with sharding, since I would need to send the terms, combine them, and look for uniques. I am willing to do that work (not that much coding; I am more worried about CPU and network performance).

The one I submitted does change the format by ADDING a new section. It shouldn't break other facets (usually adding sections to the JSON/XML output should not be a hard break). The latest version does not change the facet_field section, so it is compatible.

I am working on getting the tests to work. Most seem to need trivial fixes rather than pointing to anything more serious, since we changed the format...

However, several people would like to use this. If I fix the test cases that are breaking, can we consider a commit?

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242-3x.patch, SOLR-2242-solr40-3.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR-2242.solr35.patch, SOLR.2242.solr3.1.patch
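To make the sharding concern in Bill's comment concrete, here is a small sketch of why per-shard distinct counts cannot simply be added up: a term that occurs on two shards must be counted once, so the coordinator would need the terms themselves and not just the counts. The shard names and term lists are invented for illustration, and distinct_across_shards is a hypothetical helper, not code from any of the attached patches.

```python
def distinct_across_shards(shard_terms):
    """Return the number of unique terms over all shards (hypothetical helper)."""
    merged = set()
    for terms in shard_terms.values():
        merged.update(terms)
    return len(merged)


# Made-up term lists; the two shards share the terms "0.0" and "19.95".
shard_terms = {
    "localhost:8983/solr": ["0.0", "11.5", "19.95"],
    "localhost:7574/solr": ["0.0", "19.95", "74.99", "92.0"],
}

# Adding the per-shard counts double-counts the shared terms.
naive_sum = sum(len(set(terms)) for terms in shard_terms.values())
print(naive_sum)                            # 7
print(distinct_across_shards(shard_terms))  # 5
```

Shipping every term to the coordinator is what drives the CPU and network cost mentioned in the comment, which is presumably why reporting the count per shard was proposed as a cheaper interim step.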
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235365#comment-13235365 ]

Bill Bell commented on SOLR-2242:

I added sharding as discussed by Antoine.

{code}
<lst name="facet_numTerms">
  <lst name="http://localhost:8983/solr">
    <int name="price">14</int>
    <int name="cat">15</int>
  </lst>
  <lst name="http://localhost:8081/solr">
    <int name="price">23</int>
    <int name="cat">3</int>
  </lst>
</lst>
{code}

Example call:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:8081/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numTerms=true&facet.limit=-1&facet.field=price&facet.field=cat

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
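For readers consuming the sharded output from Bill's comment, a minimal client-side sketch follows. It assumes the facet_numTerms section arrives as the XML shown in that comment, with one nested lst per shard; the function name per_shard_num_terms and the embedded sample string are illustrative only.

```python
import xml.etree.ElementTree as ET

# Sample section, reconstructed from the sharded response in the comment.
SAMPLE = """
<lst name="facet_numTerms">
  <lst name="http://localhost:8983/solr">
    <int name="price">14</int>
    <int name="cat">15</int>
  </lst>
  <lst name="http://localhost:8081/solr">
    <int name="price">23</int>
    <int name="cat">3</int>
  </lst>
</lst>
"""


def per_shard_num_terms(xml_text):
    """Map shard name -> {field name: distinct term count} (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for shard in root.findall("lst"):
        result[shard.get("name")] = {
            item.get("name"): int(item.text) for item in shard.findall("int")
        }
    return result


counts = per_shard_num_terms(SAMPLE)
for shard, fields in counts.items():
    print(shard, fields)
```

The per-shard numbers are left unmerged on purpose: a term may exist on both shards, so summing 14 and 23 for price could overstate the true distinct count.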
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235366#comment-13235366 ]

Bill Bell commented on SOLR-2242:

https://issues.apache.org/jira/secure/attachment/12519406/SOLR-2242-solr40-2.patch is the latest patch.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40-2.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233453#comment-13233453 ]

Antoine Le Floc'h commented on SOLR-2242:

Bill,

Just a thought: how are you going to plug in [SOLR-3134|https://issues.apache.org/jira/browse/SOLR-3134] then? Since we are not able to aggregate distinct counts over shards, shouldn't you do something like:

{code}
<lst name="facet_numTerms">
  <lst name="localhost:/solr">
    <int name="cat">15</int>
    <int name="price">14</int>
  </lst>
  <lst name="localhost:/solr">
    <int name="cat">3</int>
    <int name="price">23</int>
  </lst>
</lst>
{code}

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233181#comment-13233181 ]

Bill Bell commented on SOLR-2242:

Cody,

I love your suggestion. I am actually ready to work on it.

{code}
<lst name="facet_numTerms">
  <int name="text">124</int>
</lst>
{code}

After we get it committed we should then fix the shard issues as per SOLR-3134. We can also create a new JIRA ticket for that.

Everyone agreed? I will do it on SOLR 4.0 and back port to 3.5.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Erick Erickson
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233224#comment-13233224 ]

Bill Bell commented on SOLR-2242:

How does it work?

{code}
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=cat&facet.field=price&f.price.facet.numTerms=true&facet.limit=-1&f.cat.facet.numTerms=true&f.price.facet.limit=1
{code}

Parameters:
facet.numTerms or f.<field>.facet.numTerms = true (default is false) - turn on distinct counting of terms
facet.field - the field to count the terms

It creates a new section in the facet section...

For example:

{code}
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="cat">
      <int name="camera">1</int>
      <int name="connector">2</int>
      <int name="copier">1</int>
      <int name="currency">4</int>
      <int name="electronics">14</int>
      <int name="graphics card">2</int>
      <int name="hard drive">2</int>
      <int name="memory">3</int>
      <int name="monitor">2</int>
      <int name="multifunction printer">1</int>
      <int name="music">1</int>
      <int name="printer">1</int>
      <int name="scanner">1</int>
      <int name="search">2</int>
      <int name="software">2</int>
    </lst>
    <lst name="price">
      <int name="0.0">3</int>
    </lst>
  </lst>
  <lst name="facet_numTerms">
    <int name="cat">15</int>
    <int name="price">14</int>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>
{code}

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Erick Erickson
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
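The request described in Bill's comment can be assembled programmatically. The sketch below builds such a URL with Python's standard urllib; note that facet.numTerms and f.<field>.facet.numTerms exist only in the patch discussed here, so this is an illustration of the proposed parameters rather than of a released Solr API. The function name build_num_terms_url and its arguments are made up for the example.

```python
from urllib.parse import urlencode


def build_num_terms_url(base_url, fields, limits=None):
    """Build a select URL that asks for distinct term counts per field.

    fields: field names to facet on; each gets f.<field>.facet.numTerms=true
            (a parameter proposed by the SOLR-2242 patch).
    limits: optional {field: facet.limit} overrides for individual fields.
    """
    params = [("q", "*:*"), ("facet", "true"), ("facet.limit", "-1")]
    for field in fields:
        params.append(("facet.field", field))
        params.append((f"f.{field}.facet.numTerms", "true"))
    for field, limit in (limits or {}).items():
        params.append((f"f.{field}.facet.limit", str(limit)))
    # urlencode percent-escapes "*:*" and joins the pairs with "&".
    return base_url + "?" + urlencode(params)


url = build_num_terms_url(
    "http://localhost:8983/solr/select", ["cat", "price"], {"price": 1}
)
print(url)
```

Using a list of pairs instead of a dict keeps the repeated facet.field parameter, which a plain dictionary would collapse into a single entry.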
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233232#comment-13233232 ]

Bill Bell commented on SOLR-2242:

See https://issues.apache.org/jira/secure/attachment/12519024/SOLR-2242-solr40.patch for the patch.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Erick Erickson
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218232#comment-13218232 ]

Antoine Le Floc'h commented on SOLR-2242:

About the distribution issue, it looks like https://issues.apache.org/jira/browse/SOLR-3134 follows similar thinking to my post from 03/Jan/12: show the info per shard. Even though the counter info cannot be aggregated across shards, knowing what the counter is for each shard would allow each user to use the info as they see fit. It would work with a single shard too.

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Erick Erickson
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191185#comment-13191185 ]

Ethan Gruber commented on SOLR-2242:

+1 for me too. I have been using this feature for almost a year. I plan to upgrade to the newest patch/Solr trunk code, but the patch doesn't apply to the current trunk. Do I have to check out the revision that dates to 12/21/11 to get this to work?

Get distinct count of names for a facet field

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Erick Erickson
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183627#comment-13183627 ]

Cody Young commented on SOLR-2242:
----------------------------------

Had another idea that maintains backwards compatibility. We could add a new facet section:

{code:xml}
<lst name="facet_fields">
  <lst name="text">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
  </lst>
</lst>
<lst name="facet_numTerms">
  <int name="text">124</int>
</lst>
{code}

facet.query, facet.date and facet.range all show up in their own sections, so why not facet.numTerms?

That brings up an interesting question, actually: we'll want to control this on a per-facet-field basis, so what about something like facet.numTerms=FieldName? That brings it more in line with facet.date and facet.range.

Cody

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Erick Erickson
Priority: Minor
Fix For: 4.0
Attachments: NumFacetTermsFacetsTest.java, SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}

Several people use this to get the group.field count (the # of groups).
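For illustration, here is a minimal Python sketch of how a client might read the separate facet_numTerms section proposed in Cody's comment. The response fragment is hypothetical: it is the proposed layout wrapped in a facet_counts element so that it parses as one document, and the helper names (named_child, read_counts) are invented for this example. Nothing here is taken from an actual patch.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment in the proposed layout, wrapped in a single
# root element so that it parses as one XML document.
RESPONSE = """
<lst name="facet_counts">
  <lst name="facet_fields">
    <lst name="text">
      <int name="electronics">14</int>
      <int name="inc">8</int>
      <int name="2.0">5</int>
      <int name="lcd">5</int>
      <int name="memory">5</int>
    </lst>
  </lst>
  <lst name="facet_numTerms">
    <int name="text">124</int>
  </lst>
</lst>
"""


def named_child(parent, name):
    """Return the direct <lst> child with the given name attribute, or None."""
    for child in parent.findall("lst"):
        if child.get("name") == name:
            return child
    return None


def read_counts(lst):
    """Map each <int name="..."> child of lst to its integer value."""
    if lst is None:
        return {}
    return {e.get("name"): int(e.text) for e in lst.findall("int")}


root = ET.fromstring(RESPONSE)
fields = named_child(root, "facet_fields")

# Existing clients read the per-term counts exactly as before.
text_counts = read_counts(named_child(fields, "text"))

# New clients additionally read the distinct-term count per field.
num_terms = read_counts(named_child(root, "facet_numTerms"))
```

Because the count lives outside facet_fields, a client that only walks facet_fields never sees an extra entry, which is the backwards-compatibility property the comment is after.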
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178718#comment-13178718 ]

Antoine Le Floc'h commented on SOLR-2242:
-----------------------------------------

People who need to be back-compat won't be able to use {code}facet.numTerms=true{code}. Isn't that fair?

About the distribution issue, maybe the distinct counter could be displayed per shard, something like:

{code}
<lst name="facet_fields">
  <lst name="shop_id">
    <lst name="numTerms">
      <int ip="192.168.0.100">58</int>
      <int ip="192.168.0.101">158</int>
    </lst>
    <lst name="counts">
      int name=280137567032406/int
      int name=280095893616625/int
      int name=9763497825/int
      int name=6351398780/int
      int name=28021713440118/int
    </lst>
  </lst>
</lst>
{code}

Like this, people who don't use shards are happy, and people who do can display what makes sense for them while waiting for something better in the future. This would allow us to move forward with this JIRA.
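As a rough illustration of what a client could do with per-shard counts like those in Antoine's comment, the sketch below derives bounds on the global distinct count. It assumes facet.mincount=1 on every shard, so that each shard reports every term it holds for the query, and it reads the two shard values as 58 and 158. The function name distinct_bounds is invented for this example.

```python
def distinct_bounds(per_shard_counts):
    """Return (lower, upper) bounds on the number of distinct terms overall.

    lower: the largest single-shard count (all other shards could hold only
           terms that shard already has).
    upper: the sum of the counts (no term is shared between shards).
    Assumes each shard counts every term with at least one matching document.
    """
    counts = list(per_shard_counts.values())
    if not counts:
        return (0, 0)
    return (max(counts), sum(counts))


# Per-shard distinct counts keyed by shard address (values assumed from the
# example response above).
shard_counts = {"192.168.0.100": 58, "192.168.0.101": 158}
lower, upper = distinct_bounds(shard_counts)
```

With these inputs the true number of distinct shop_id values lies somewhere between 158 and 216; the counts alone cannot narrow it further.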
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176065#comment-13176065 ]

Erick Erickson commented on SOLR-2242:
--------------------------------------

Just to be clear: I'm not volunteering to actually *implement* this patch. I'll gladly guide it through the process if someone wants to work on it and address the concerns raised. And I'll keep prodding it along and try to keep it from dying on the vine, and certainly volunteer to test various incarnations. Or I'll try to kill it if it comes to that.

There are two open issues really, of which the most pressing seems to be back-compat. Cody's initial suggestion doesn't work with all the various response formats. Working out a way to change the response format without breaking back-compat seems like a worthy goal in itself, but does that mean we need to create another JIRA for that and make this JIRA dependent on the new one?

Note that this is the inverse of my original point 3: I'm suggesting we fix the back-compat issue before we address this one. I have no real clue yet how to approach that, mind you.

Again, I want a clear goal in mind before we put work into *any* solution.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174739#comment-13174739 ]

Antoine Le Floc'h commented on SOLR-2242:
-----------------------------------------

I am using this patch and possibly want to add extra info to the facet results, and I want to use sharding... Is there an associated patch to fix sharding? Is it an easy fix? Does this work out of the box in 4.0?

Thank you.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174152#comment-13174152 ]

Erick Erickson commented on SOLR-2242:
--------------------------------------

OK, it seems like we have several themes here. I'd like to get a reasonable consensus before going forward... I'll put out a straw-man proposal here and we can go from there. But let's figure out where we're going before revamping stuff yet again.

1> Distributed support. I sure don't see a good way to support this currently. Perhaps some of the future enhancements will make this easier (thinking distributed TF/IDF and such, while being totally ignorant of that code), but returning the entire list of constraints (or names or terms or whatever we call it) is just a bad idea. The first time someone tries this on a field with 1,000,000 terms (yes, I've seen this) it'll just blow things up. I'm also slightly anti the min/max idea. I'm not sure what value there is in telling someone there are between 10,000 and 90,000 distinct values. And if it's a field with just a few pre-defined values, that information is already known anyway. But if someone can show a use case here I'm not completely against it. I'd like to see the use case first, though, not "someone might find it useful" <G>.

2> Back compat. Cody's suggestion seems to be the slickest in terms of not breaking things, but we use attributes in just a few places. Are there reasons NOT to do it that way?

3> Possibly add a new JIRA for changing the facet response format to be tolerant of sub-fields, but don't do that here.

Again, I want a clearly defined end point for the concerns raised before we dive back in here.
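The distributed-support concern in point 1 can be made concrete with a small Python sketch. The shard contents below are made up for illustration. The point is only that per-shard distinct counts are not additive, and that the exact answer needs the union of the term sets, which means every shard has to ship its whole term list to the coordinating node.

```python
# Hypothetical terms present on each shard for one facet field.
shard_terms = {
    "shard1": {"0.0", "11.5", "19.95"},
    "shard2": {"19.95", "74.99"},
}

# Adding the per-shard distinct counts double-counts "19.95".
naive_sum = sum(len(terms) for terms in shard_terms.values())

# The exact distinct count is the size of the union of all term sets.
exact = len(set().union(*shard_terms.values()))

# Cost of the exact approach: every term of every shard crosses the network.
terms_shipped = sum(len(terms) for terms in shard_terms.values())
```

Here naive_sum is 5 while exact is 4. On a field with a million terms per shard, terms_shipped grows accordingly, which is the blow-up described above.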
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174179#comment-13174179 ]

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

I would find this feature valuable even if it simply did not work at all on a distributed index. (Refusing to return a value, rather than returning a known incorrect value, would seem like the right way to go.) Because my index is not distributed, and I would find this feature valuable, heh.

I don't know if Solr currently has any policies against committing features that can't work on distributed, but personally my 'vote' would be for doing that here, with clear documentation that it doesn't work on distributed (and the hope that future enhancements may make it more feasible to do so, as Erick suggests might possibly happen).
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174184#comment-13174184 ]

Yonik Seeley commented on SOLR-2242:
------------------------------------

bq. I'm also slightly anti the min/max idea. I'm not sure what value there is in telling someone there are between 10,000 and 90,000 distinct values.

I think we could come up with a pretty good estimate (but we should tell them it's an estimate somehow). Anyway, that could optionally be handled in a different issue.

bq. 2> back compat. Cody's suggestion seems to be the slickest in terms of not breaking things, but we use attributes in just a few places, are there reasons NOT to do it that way? Or does this mess up JSON, PHP, etc?

Yes, it messes up JSON, binary format, etc. We'd need to figure out how to add attributes into our data model (that gets sent to response writers) in a generic way.

bq. 3> Possibly add a new JIRA for changing the facet response format to be tolerant of sub-fields, but don't do that here.

Not sure how that's possible... it's either more magic field names in with the individual constraints, or the facet response format has got to change.
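To give a feel for what "a pretty good estimate" could look like in the distributed case, here is a small Python sketch of a k-minimum-values (KMV) estimator. This technique is swapped in purely for illustration: it is not what the comment proposes and not what Solr implements, and the names K, make_sketch, merge and estimate are invented for this example. Each shard returns at most K hash values instead of its full term list, and the coordinator merges them.

```python
import hashlib

K = 64  # sketch size: number of hash values each shard sends back


def _hash(term):
    """Map a term to a float in [0, 1) using a stable hash."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def make_sketch(terms, k=K):
    """Keep the k smallest distinct hash values seen on one shard."""
    return sorted({_hash(t) for t in terms})[:k]


def merge(a, b, k=K):
    """Combine two shard sketches into a sketch of the union of their terms."""
    return sorted(set(a) | set(b))[:k]


def estimate(sketch, k=K):
    """Estimate the number of distinct terms behind a sketch."""
    if len(sketch) < k:
        return len(sketch)  # fewer than k distinct terms: the count is exact
    # The k-th smallest of n uniform values sits near k / n.
    return int((k - 1) / sketch[-1])


# Two shards with overlapping terms; the true distinct count is 1500.
shard_a = make_sketch(f"term{i}" for i in range(1000))
shard_b = make_sketch(f"term{i}" for i in range(500, 1500))
combined = merge(shard_a, shard_b)
approx = estimate(combined)
```

The merged sketch is identical to the one a single node would build from all terms, so overlap between shards is handled correctly, and the result is an estimate that should be labelled as such in the response.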
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173479#comment-13173479 ]

Cody Young commented on SOLR-2242:
----------------------------------

Simon, any plans for this patch?

The general consensus seems to be that this is a good patch and desired functionality. The biggest issues seem to be the magic name and distributed support.

I see a proposed solution by Yonik of changing the output format, but that breaks distributed search. In addition, there is a worry about backwards compatibility and possibly supporting that through a parameter.

What if we choose a format that doesn't break backwards compatibility and commit without supporting distributed for the first pass (or supporting only the simple case of just adding it all together)? This would let us make some progress on this issue without having a magic name in the facet list.

If we went with a format like the one below, it wouldn't break backwards compatibility and it shouldn't affect anyone unless they choose to use the feature. This is also consistent with the way numFound works for the main search results. (Admittedly, it's different from ngroups, although we still see numFound used to represent the number of documents in a group.)

{code:xml}
<lst name="facet_fields">
  <lst name="text" numFacetTerms="385">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
  </lst>
</lst>
{code}

Other smaller issues that appear to be outstanding:

- Change the code to cache the numFacetTerms/numTerms and remove the code that caches the huge term list.
- Determine the parameter name: facet.nconstraints=true|false was proposed, allowing facet.count to control the rest of the behavior.
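The backwards-compatibility argument for the attribute format in Cody's comment can be sketched in Python. The fragment below is the proposed XML layout from that comment, and the parsing code is only an illustration of how an existing XML client and an updated one might behave, not code from any patch.

```python
import xml.etree.ElementTree as ET

# Proposed layout: the distinct-term count is an attribute on the field's
# <lst> element rather than an extra <int> entry among the terms.
FRAGMENT = """
<lst name="facet_fields">
  <lst name="text" numFacetTerms="385">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
  </lst>
</lst>
"""

root = ET.fromstring(FRAGMENT)
field = root.find("lst")

# An existing client that only iterates the <int> children is unaffected:
# it sees the same five terms as before and no "magic" entry.
legacy_counts = {e.get("name"): int(e.text) for e in field.findall("int")}

# An updated client reads the attribute when it is present.
raw = field.get("numFacetTerms")
num_facet_terms = int(raw) if raw is not None else None
```

This only covers the XML writer. As discussed elsewhere in the thread, formats such as JSON have no direct equivalent of an attribute, so the same trick does not carry over unchanged.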
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13120635#comment-13120635 ]

bronco commented on SOLR-2242:
------------------------------

Will there also be a solution for 3.5 to get the correct numFound results?
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100095#comment-13100095 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Sharding will not work if you change the format of the facet results... We would need to fix sharding for this to go out... I am in a holding pattern until a committer helps.
Re: [jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
Did this work before the patch? This patch only changes facet.field and not ranges. Send the whole URL you are sending to Solr.

Bill Bell
Sent from mobile

On Aug 10, 2011, at 1:55 AM, Trinh Trung Kien (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082216#comment-13082216 ]

Trinh Trung Kien commented on SOLR-2242:
----------------------------------------

Hi,

I applied the patch to Solr 4.0, revision 1140474. The patch seems to work OK, but I observed several issues:

- I have one field indexed as integer: <field name="cell_id" type="integer" indexed="true" stored="true"/>

When I search for cell_id:[900 TO 1000], there is no result (although I have lots of data with cell_id between 900 and 1000).

Then I search for cell_id:[1000 TO *]. This should return only the data with cell_id>=1000, but it returns all the records; the condition seems to have no effect.

Can you confirm that I'm using the correct version and revision? Here is my svn info for the trunk:

URL: http://svn.apache.org/repos/asf/lucene/dev/trunk
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1140474
Node Kind: directory
Schedule: normal
Last Changed Author: chrism
Last Changed Rev: 1140408
Last Changed Date: 2011-06-27 21:52:53 -0500 (Mon, 27 Jun 2011)
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081906#comment-13081906 ]

Guna C commented on SOLR-2242:

Hi Bill,

I wanted to add that this is a great patch. It provides a way to analyze which search terms are effective without having to retrieve all the docs themselves. I was looking for a patch for 3.3.0. Does the latest one work?

Thanks
-guna
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073334#comment-13073334 ]

Bill Bell commented on SOLR-2242:

OK, I like the warning message idea. Also, it does depend on the sharding approach, since some people shard by date... In many such cases maxTerms would do what I need.

List:
1. Change the facet.field format.
2. Get it working with sharding.
3. Change the code to cache numFacetTerms/numTerms and remove the code that caches the huge term list.

I can do all of this, but would like some help with #3.

Bill
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072732#comment-13072732 ]

Bill Bell commented on SOLR-2242:

To make this work right with distribution, it seems that it might be more complicated... Wouldn't you have to send the full list of facet terms, consolidate them, and then loop to get the distinct number? That is why I originally sent the WHOLE list of facets, and just added the magic number to the end.

One machine:
male: 1
numFacetTerms: 1

Another machine:
female: 7000
male: 500
numFacetTerms: 2

The numFacetTerms that we want is 2, since if you combined them and looped you would get 2:
male: 10500
female: 7000
numFacetTerms: 2

If we add up the numFacetTerms values you get 1+2 = 3.

The other 2 are easier:
distribMaxTerms: 2
distribSumTerms: 3

This is not ideal but may be acceptable. The perfect solution is to send the whole list, dedupe the terms, and then count.

Thoughts?
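The three candidate numbers in the comment above (exact distinct count after deduping, the per-shard maximum, and the per-shard sum) can be compared with a small sketch. This is not code from any of the patches: `merge_shard_facets` is a hypothetical helper, each shard is modelled as a plain term-to-count map, and the first shard's male count of 10000 is an assumption chosen so that the merged total comes to 10500.

```python
from collections import Counter


def merge_shard_facets(shard_counts):
    """Merge per-shard {term: count} maps and derive the three totals.

    Hypothetical helper for illustration only; it is not part of Solr.
    """
    merged = Counter()
    for counts in shard_counts:
        merged.update(counts)  # adds counts for terms seen on several shards
    terms_per_shard = [len(counts) for counts in shard_counts]
    return {
        "counts": dict(merged),
        # Exact answer: needs every shard to send its full term list.
        "numFacetTerms": len(merged),
        # Cheap alternatives: each shard only sends one integer.
        "distribMaxTerms": max(terms_per_shard, default=0),
        "distribSumTerms": sum(terms_per_shard),
    }


# Illustrative shard responses (assumed values, see the note above).
shards = [
    {"male": 10000},
    {"female": 7000, "male": 500},
]
result = merge_shard_facets(shards)
print(result)
```

The exact count requires shipping every term to the coordinating node, which is the cost the other commenters object to; the max and sum need only one integer per shard.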
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072805#comment-13072805 ]

Chris Male commented on SOLR-2242:

I don't think it's realistic to send back the whole list; it could be huge! Besides, in the situation where we are only doing counts, we aren't going to store the list anywhere.

The distributed environment is never going to be perfect in this situation. Ryan's and my suggestion is to send the minimum and maximum number of constraints there could be.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072819#comment-13072819 ]

Ryan McKinley commented on SOLR-2242:

Yeah, always sending the whole list seems like asking for problems.

You can control how many terms it should pass around with facet.limit, and we could potentially add a warning message to the response if that is less than the total number of terms.

Maybe we could also have facet.distrib.limit or something similar, which would bump up the number that it internally asks for, but still respect facet.limit for the final result?
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072827#comment-13072827 ]

Chris Male commented on SOLR-2242:

I really want to avoid having to load the list just to calculate the counts; it seems unnecessary and a waste of memory. I think we should start simple and implement what you originally suggested.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071147#comment-13071147 ]

Ryan McKinley commented on SOLR-2242:

bq. The simplest option seems to be to return the max constraint count taken from all the shards

That seems reasonable -- though I think we would also want to be able to have the sum when you know that all shards have unique values.

I don't think Bill is referring to the accuracy/meaning of the distinct count in distributed search. His problem is that if we change the output format, we also need to update the code that collects the various values and passes them along. This patch just adds a magic value (numFacetTerms) to the count list so that the value is handled by the existing distributed response parsing. This is a fine one-off solution, but I am -1 on adding any more magic field names to Solr. To add this feature, I think we need to bite the bullet and update the facet response format.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071156#comment-13071156 ]

Chris Male commented on SOLR-2242:

{code}
That seems reasonable -- though I think we would also want to be able to have the sum when you know that all shards have unique values.
{code}

Perhaps we should return the maximum and sum of all shard counts? That way, assuming the client knew how many shards exist, they could handle most scenarios.

{code}
I don't think bill is referring to the accuracy/meaning of distinct count in distributed search. His problem is that if we change the output format, we also need to update the code that collects the various values and passes them along. This patch just add a magic value (numFacetTerms) to the count list so that the value is handled with existing distributed response parsing. This is a fine one-off solution, but I am -1 for adding any more magic field names to solr. To add this feature, i think we need to bite the bullet and update the facet response format.
{code}

Absolutely. I hadn't even considered the prospect of not changing the distributed response parsing.
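How a client might use a maximum and a sum of per-shard term counts can be sketched as follows. This is only an illustration under stated assumptions: `distinct_term_bounds` and its `shards_disjoint` flag are hypothetical names, and the input is just the list of per-shard distinct-term counts, with no term lists exchanged.

```python
def distinct_term_bounds(shard_term_counts, shards_disjoint=False):
    """Return (lower, upper) bounds on the global number of distinct terms.

    shard_term_counts: number of distinct terms reported by each shard.
    shards_disjoint: set to True only if the caller knows that no term
    appears on more than one shard (an assumption about the data layout).
    Hypothetical helper for illustration; not part of Solr.
    """
    if not shard_term_counts:
        return (0, 0)
    lower = max(shard_term_counts)  # every term of the largest shard exists
    upper = sum(shard_term_counts)  # reached only if shards share no terms
    if shards_disjoint:
        lower = upper               # the sum is then exact
    return (lower, upper)


# Two shards reporting 1 and 2 distinct terms respectively.
overlapping = distinct_term_bounds([1, 2])
disjoint = distinct_term_bounds([1, 2], shards_disjoint=True)
print(overlapping, disjoint)
```

With overlapping shards the true count lies somewhere between the two values; only extra knowledge about how the index is sharded lets the client pick one of them as exact.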
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071162#comment-13071162 ]

Ryan McKinley commented on SOLR-2242:

bq. Perhaps we should return the maximum and sum of all shard counts? That way, assuming the client knew how many shards exist, they could handle most scenarios.

Once we change the output format, we should be able to add a few things to the output. Perhaps something like:

{code:xml}
<lst name="text">
  <int name="numTerms">385</int>
  <int name="distribMaxTerms">385</int>
  <int name="distribSumTerms">845</int>
  <lst name="counts">
    ...
{code}
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070935#comment-13070935 ]

Chris Male commented on SOLR-2242:

Having walked through the SimpleFacets codebase, I see PerSegmentSingleValuedFaceting has already introduced a FacetCollector. I think we should take this and use it throughout all the different faceting 'Strategies'. That way we can push the counting of constraints into the Collector.

I've also thought about the distribution issue. The simplest option seems to be to return the max constraint count taken from all the shards. With this, whether shards have distinct or overlapping constraint sets, clients can always treat it as the minimum number of constraints that exist.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068892#comment-13068892 ]

Chris Male commented on SOLR-2242:

I'm just jumping into this issue and considering the problem of loading all constraints just to get their size (or rather, not wanting to do this). Is there scope in SimpleFacets to add some sort of 'Collector' idea? That way it would be easy to choose whether we want to collect the constraints, their counts and the total number of constraints, or whether we just want the total number.

Does anybody have any thoughts on the distribution issue?
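The 'Collector' idea raised above could look roughly like the sketch below. It is written in Python for brevity rather than as Solr code, and every name in it (`FacetCollector`, `CountOnlyCollector`, `ListCollector`, `run_faceting`) is a hypothetical stand-in, not an existing Solr class or signature. The point is only that the faceting loop stays the same while the collector decides whether term lists are kept in memory.

```python
from abc import ABC, abstractmethod


class FacetCollector(ABC):
    """Hypothetical callback interface: receives one (term, count) at a time."""

    @abstractmethod
    def collect(self, term, count):
        ...


class CountOnlyCollector(FacetCollector):
    """Keeps only the number of terms that reach mincount; stores no terms."""

    def __init__(self, mincount=1):
        self.mincount = mincount
        self.num_terms = 0

    def collect(self, term, count):
        if count >= self.mincount:
            self.num_terms += 1


class ListCollector(CountOnlyCollector):
    """Also keeps the (term, count) pairs, for callers that need the list."""

    def __init__(self, mincount=1):
        super().__init__(mincount)
        self.terms = []

    def collect(self, term, count):
        if count >= self.mincount:
            self.num_terms += 1
            self.terms.append((term, count))


def run_faceting(term_counts, collector):
    """Stand-in for a faceting strategy: feeds every term to the collector."""
    for term, count in term_counts:
        collector.collect(term, count)
    return collector


# Assumed sample data: three terms, one of them below mincount=1.
sample = [("0.0", 3), ("11.5", 1), ("19.95", 0)]
count_only = run_faceting(sample, CountOnlyCollector(mincount=1))
with_list = run_faceting(sample, ListCollector(mincount=1))
print(count_only.num_terms, with_list.terms)
```

A request that only needs the number of constraints would pass the count-only collector and never materialise the term list.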
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067500#comment-13067500 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Simon - thoughts?
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064414#comment-13064414 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Yonik,

Yes, I know about group.ngroups. But the use case still stands. We need a way to add up facet terms without actually counting them.

I had the restructured facet_fields XML like you recommended (twice), and the issue is that it breaks ALL sharding. The reason it breaks distribution is that the code is looking for <int> and not <lst>...

Several people have wanted me to change the name to count, to term, to distinct... I really don't care what the name is, since it makes sense when you try it.

I think changing the distribution is a MUCH larger project. If you want to jump in on the sharding/distribution to make it work with lists, then please help. The format change is a HUGE issue. The magic names could also be an issue, but ONLY if you use this new feature. It is not an issue for all APIs and usage, which is why I added it as a magic variable.

Do we have any examples with Boolean? I have not seen any... Do we use True/False or on/off? Do you mean like facet=true?

The reason I have a 1 and a 2 is to get the count of terms but only return a smaller set (internal limit=-1, but the user types limit=5). I believe it is very useful.

Having numFacetTerms behave like every other term pretty much works with sharding/distribution. It just gets added together like any other facet count: one server returns 5, the other returns numFacetTerms=10, and the combined result returns 15. It may break some new feature with distribution or something I am not aware of and not using...

Concerning building in memory: having it cached is what I was trying to achieve. If there is another way to cache the result, let me know the other options. Not having it cached at all is a huge performance problem. If you are using mode 2, it does not matter that much, since you need to return the list and in most cases you have it in memory... Mode 1 hides it a bit and builds the entire list in memory when we only need to cache the one value... Again, without breaking something else, I am not sure how to achieve that.

As long as there are no more gotchas in distribution, most of the other things you are listing (XML, name change, boolean) are almost preferences, and the XML format change will be a huge issue, so we should be able to commit? Also, I would like to not cache the entire list in memory when using this; I need some assistance.

1. Are there any other distribution/sharding issues with adding a magic variable in facet_field for a new feature?
2. Where and how do we store a cached value without using the array that is present, so we don't cache the whole facet term list when we only need to cache the resulting number?

Thanks.
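
The "5 plus 10 gives 15" merge described in the comment above is exact only when the shards share no terms. The sketch below is an illustration under that assumption being violated; ShardTermCounts and its methods are hypothetical names, not Solr classes, and the term sets are made-up sample data. It compares adding per-shard distinct counts with counting the union of the shards' term sets.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ShardTermCounts {
    /** Adds the per-shard distinct counts, the way a merged numFacetTerms entry would be summed. */
    static int sumOfShardCounts(List<Set<String>> shardTerms) {
        int total = 0;
        for (Set<String> terms : shardTerms) {
            total += terms.size();
        }
        return total;
    }

    /** Counts distinct terms across all shards by merging the term sets first. */
    static int distinctAcrossShards(List<Set<String>> shardTerms) {
        Set<String> union = new HashSet<>();
        for (Set<String> terms : shardTerms) {
            union.addAll(terms);
        }
        return union.size();
    }

    public static void main(String[] args) {
        // Sample data: the term "19.95" occurs on both shards.
        Set<String> shardA = new HashSet<>(Arrays.asList("0.0", "11.5", "19.95"));
        Set<String> shardB = new HashSet<>(Arrays.asList("19.95", "74.99"));
        List<Set<String>> shards = Arrays.asList(shardA, shardB);

        System.out.println(sumOfShardCounts(shards));     // prints 5
        System.out.println(distinctAcrossShards(shards)); // prints 4
    }
}
```

The two numbers agree only for disjoint shards, which is why an exact distributed count needs the terms themselves (or an approximate sketch of them) rather than just the per-shard totals.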
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064144#comment-13064144 ]

Yonik Seeley commented on SOLR-2242:
------------------------------------

This issue was a bit tricky to review, given that the output doesn't seem to quite match the examples. I also wasn't exactly sure what the latest patch was, so I just looked at the patch uploaded on 28/Jun/11.

Here's my summary of what the patch currently does.

If you add facet.facetTermCounts=2 to a faceting request, you get the following:

{code}
<lst name="facet_fields">
  <lst name="text">
    <int name="electronics">14</int>
    <int name="inc">8</int>
    <int name="2.0">5</int>
    <int name="lcd">5</int>
    <int name="memory">5</int>
    <int name="numFacetTerms">385</int>
  </lst>
</lst>
{code}

If you add facet.facetTermCounts=1 to a faceting request, you get the following:

{code}
<lst name="facet_fields">
  <lst name="text">
    <int name="numFacetTerms">385</int>
  </lst>
</lst>
{code}

w.r.t. the interface, I agree with a number of Lance's observations.

- facet.numFacetTerms name: the second "Facet" is a bit redundant. And we probably should be talking in terms of constraints instead of terms. Perhaps facet.numConstraints (or facet.nconstraints, to be consistent with group.ngroups).
- facet.nconstraints should just be a boolean... no need for 1 or 2. If the user doesn't want to see any constraints, then they can set facet.limit=0. This is also consistent with grouping.
- We're mixing units in the same list, and that's probably not a great idea? Constraints have units of documents (number of documents that matched that constraint), while numFacetTerms has units of number of constraints.
- I think this also breaks distributed faceting due to the mixing of units? The distributed faceting code thinks that numFacetTerms is a constraint.
- We need to figure out what we are going to do in distributed mode... it doesn't seem easy to actually figure out the number of constraints without streaming them *all* back and merging (i.e. you can't just add up the numbers).
- I also agree that we should not build the entire list in memory just to get the size of that list.

It seems like rather than adding more magic names to the list (and risking a real collision with the actual name of a constraint), we should add more structure to the response, as previously discussed. So if we added facet.nconstraints=true, we would get:

{code}
<lst name="facet_fields">
  <lst name="text">
    <int name="numFacetTerms">385</int>
    <lst name="counts">
      <int name="electronics">14</int>
      <int name="inc">8</int>
      <int name="2.0">5</int>
      <int name="lcd">5</int>
      <int name="memory">5</int>
    </lst>
  </lst>
</lst>
{code}

And when we use this new format, we should consider using a separate "missing" name for facet.missing=true instead of using the null name in with the counts.

This format change is where we need to be careful about back compat: this interface is one of the most widely used, and with all the 3rd party clients and libraries out there, we should still support the old format via a facet.format parameter or something.

Bill: You originally opened this issue for use with grouping to get the total number of groups. Are you aware of the group.ngroups parameter that was added that does this?
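
The collision risk raised in the review above can be illustrated with a small sketch. It assumes a client that reads one field's facet list into a map keyed by name; FacetResponseShapes, flat() and Structured are hypothetical names for illustration only and are not part of Solr or SolrJ, and the sample counts are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FacetResponseShapes {
    /** Flat shape: the distinct count shares one name space with the constraints. */
    static Map<String, Integer> flat(Map<String, Integer> counts) {
        Map<String, Integer> out = new LinkedHashMap<>(counts);
        // In a map-based client this replaces any real term that is literally named "numFacetTerms".
        out.put("numFacetTerms", counts.size());
        return out;
    }

    /** Structured shape: the number of constraints and the constraints live in separate slots. */
    static final class Structured {
        final int numConstraints;
        final Map<String, Integer> counts;

        Structured(Map<String, Integer> counts) {
            this.numConstraints = counts.size();
            this.counts = new LinkedHashMap<>(counts);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("electronics", 14);
        counts.put("numFacetTerms", 7); // a real indexed term that happens to use the magic name

        Map<String, Integer> flat = flat(counts);
        System.out.println(flat.get("numFacetTerms")); // prints 2: the term's own count (7) is gone

        Structured structured = new Structured(counts);
        System.out.println(structured.numConstraints);              // prints 2
        System.out.println(structured.counts.get("numFacetTerms")); // prints 7
    }
}
```

The structured shape also keeps the units apart: numConstraints counts constraints, while every value in counts is a number of documents.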
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061785#comment-13061785 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Are we ready to commit?
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061788#comment-13061788 ]

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

I am out of the office on vacation, I will return Monday July 11. I will not be checking email. For urgent Systems Department business, please contact Mercy Anaba, man...@jhu.edu, (410) 516-5306.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061839#comment-13061839 ]

Simon Willnauer commented on SOLR-2242:
---------------------------------------

bq. Are we ready to commit?

Bill, isn't there still a test failure on this issue related to FC? Yonik mentioned BW compat issues here and promised to comment. I will ping him again.

Thanks for the patience.

simon
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058205#comment-13058205 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Thanks... If you look at the tests that I commented out, you will notice you get the "Insane FieldCache usage(s)" problem. It does it every time on my PC... This patch does not appear to have any issues until you pull in the group issue.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055151#comment-13055151 ]

Bill Bell commented on SOLR-2242:
---------------------------------

OK. Here are some test cases.

I am getting a weird error on running it:

ant -Dtestcase=NumFacetTermsFacetsTest test

{code}
junit-sequential:
    [junit] Testsuite: org.apache.solr.request.NumFacetTermsFacetsTest
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 4.072 sec
    [junit]
    [junit] ------------- Standard Error -----------------
    [junit] NOTE: reproduce with: ant test -Dtestcase=NumFacetTermsFacetsTest -Dtestmethod=testNumFacetTermsFacetCounts -Dtests.seed=3921835369594659663:-3219730304883530389
    [junit] *** BEGIN org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) ***
    [junit] SUBREADER: Found caches for descendants of DirectoryReader(segments_3 _0(4.0):C6)+hgid_i1
    [junit]     'DirectoryReader(segments_3 _0(4.0):C6)'=>'hgid_i1',class org.apache.lucene.search.FieldCache$DocTermsIndex,org.apache.lucene.search.cache.DocTermsIndexCreator@603bb3eb=>org.apache.lucene.search.cache.DocTermsIndexCreator$DocTermsIndexImpl#1026179434 (size =~ 372 bytes)
    [junit]     'org.apache.lucene.index.SegmentCoreReaders@7e8905bd'=>'hgid_i1',int,org.apache.lucene.search.cache.IntValuesCreator@30781822=>org.apache.lucene.search.cache.CachedArray$IntValues#291172425 (size =~ 92 bytes)
    [junit]
    [junit] *** END org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) ***
    [junit] ------------- ---------------- ---------------
    [junit] Testcase: testNumFacetTermsFacetCounts(org.apache.solr.request.NumFacetTermsFacetsTest): FAILED
    [junit] org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) found expected:<0> but was:<1>
    [junit] junit.framework.AssertionFailedError: org.apache.solr.request.NumFacetTermsFacetsTest.testNumFacetTermsFacetCounts: Insane FieldCache usage(s) found expected:<0> but was:<1>
    [junit]     at org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:725)
    [junit]     at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:620)
    [junit]     at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:96)
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1430)
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1348)
    [junit]
    [junit]
    [junit] Test org.apache.solr.request.NumFacetTermsFacetsTest FAILED
{code}
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055155#comment-13055155 ]

Bill Bell commented on SOLR-2242:
---------------------------------

I think it has to do with an NPE in group on 4.0: it fails on other code. Nothing to do with this patch.

{code}
assertQ("check group and facet counts with numFacetTerms=1",
        req("q", "id:[1 TO 6]"
            ,"indent", "on"
            ,"facet", "true"
            ,"group", "true"
            ,"group.field", "hgid_i1"
            ,"f.hgid_i1.facet.limit", "-1"
            ,"f.hgid_i1.facet.mincount", "1"
            ,"f.hgid_i1.facet.numFacetTerms", "1"
            ,"facet.field", "hgid_i1"
            )
        ,"*[count(//arr[@name='groups'])=1]"
        ,"*[count(//lst[@name='facet_fields']/lst[@name='hgid_i1']/int)=1]" // there are 1 unique items
        ,"//lst[@name='hgid_i1']/int[@name='numFacetTerms'][.='4']"
        );
{code}
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055158#comment-13055158 ]

Bill Bell commented on SOLR-2242:
---------------------------------

{code}
junit-sequential:
    [junit] Testsuite: org.apache.solr.request.NumFacetTermsFacetsTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.48 sec
    [junit]
{code}

I fixed the NamedList() generic too.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054931#comment-13054931 ]

Bill Bell commented on SOLR-2242:
---------------------------------

re: whitespace

What are the settings supposed to be for tabs? Because in my editor it looks perfect. 4 spaces, tabs, 2 spaces per tab?

I will add some tests.

I think switching from if to switch and the move to termList != null is mostly just style and does not really improve anything. I actually think it confuses things and makes the overall patch larger and riskier, since we might miss something or mess it up.

I will also look at the Integer generic...

Thanks.
Bill

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price

This currently only works on facet.field.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}

Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
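The example requests in the description are ordinary query strings whose parameters are joined with `&`. A minimal sketch of assembling the first one in Java follows; the class and method names (`FacetRequestUrl`, `build`, `exampleParams`) are hypothetical helpers for illustration, not part of Solr or of the patch, and `facet.numFacetTerms` is the parameter name proposed in this issue rather than a released Solr parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class FacetRequestUrl {

    /** Percent-encodes one query-string component as UTF-8. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Joins the parameters with '&' and appends them to the base URL. */
    static String build(String base, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(encode(p.getKey()) + "=" + encode(p.getValue()));
        }
        return base + "?" + query;
    }

    /** Parameters of the first example request in the issue description. */
    static Map<String, String> exampleParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("shards", "localhost:8983/solr,localhost:7574/solr");
        p.put("indent", "true");
        p.put("q", "*:*");
        p.put("facet", "true");
        p.put("facet.mincount", "1");
        p.put("facet.numFacetTerms", "2"); // proposed by this patch
        p.put("facet.limit", "-1");
        p.put("facet.field", "price");
        return p;
    }
}
```

Calling `FacetRequestUrl.build("http://localhost:8983/solr/select", FacetRequestUrl.exampleParams())` yields the same request with reserved characters percent-encoded (for example `q=*%3A*`), which the server decodes back to the original values.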
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054932#comment-13054932 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Just so you know I have been using the original patch in production for over 5 months. I would say that the original one is tested. But now that we are changing it, I agree that we need more coverage. That will be my #1 priority.

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053080#comment-13053080 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Simon,

I made all those changes except for the termsList one. I think it is useful to have the count based on terms.

See attachment.

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053113#comment-13053113 ]

Simon Willnauer commented on SOLR-2242:
---------------------------------------

bq. New patch ready for commit?

Bill, I still see lots of whitespace / indentation problems in that latest patch. Anyway, I looked at it and I wonder if we could restructure this a little: we could first check if termList != null and do all the cases there, and if termList == null we get the TermCountsLimit. That would remove all the redundant getTermCountsLimit / getListedTermCounts calls. The termList==null case seems very easy and straightforward:

{code}
if (termList != null) {
  NamedList<Integer> counts = getListedTermCounts(facetValue, termList);
  switch (numFacetTerms) {
    case COUNTS:
      final NamedList<Integer> resCount = new NamedList<Integer>();
      counts = resCount;
    case COUNTS_AND_VALUES:
      counts.add("numFacetTerms", counts.size());
      break;
  }
  res.add(key, counts);
} else {
...
{code}

Yet it's hard to refactor this without a single test (note, there might be a bug). I would be really happy to see a test case for this that tests all the variations.

Regarding the constants, I think the default case should be a constant too. If you use NamedList, can you make sure you put the right generic on it if possible? Otherwise my IDE goes wild and adds warnings all over the place. In your case NamedList<Integer> works fine.

simon

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
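The restructuring Simon proposes depends on taking the term count before the value list is swapped out; in the snippet as posted, the COUNTS case appears to replace the list first and then fall through, which may be the bug he hints at. Below is a minimal, self-contained sketch of that ordering. It uses a plain `java.util.Map` in place of Solr's `NamedList`, and the class name, the `Mode` enum and the `apply` method are hypothetical names for illustration, not the code in the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NumFacetTermsSketch {

    /** Hypothetical modes mirroring the constants discussed in the review. */
    enum Mode { OFF, COUNTS, COUNTS_AND_VALUES }

    /**
     * Builds the facet entry for one field: the distinct-term count alone,
     * the count followed by the per-term values, or the values unchanged.
     */
    static Map<String, Integer> apply(Mode mode, Map<String, Integer> termCounts) {
        // Capture the number of distinct terms before any list is replaced.
        int distinct = termCounts.size();
        Map<String, Integer> result = new LinkedHashMap<>();
        switch (mode) {
            case COUNTS:
                result.put("numFacetTerms", distinct);
                break;
            case COUNTS_AND_VALUES:
                result.put("numFacetTerms", distinct);
                result.putAll(termCounts);
                break;
            default:
                result.putAll(termCounts);
                break;
        }
        return result;
    }
}
```

Each case ends in an explicit `break`, so no branch relies on fall-through. Note that a flattened result like this would collide with a field value that is literally `numFacetTerms`.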
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049014#comment-13049014 ]

Simon Willnauer commented on SOLR-2242:
---------------------------------------

Bill, this seems like an important issue. Many votes etc. I am on travel right now so give me some days to come back and I will work with you to get this done.

Thanks for your patience

simon

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049077#comment-13049077 ]

Simon Willnauer commented on SOLR-2242:
---------------------------------------

Hey Bill, I looked at your patch and I have some comments:

* you should fix the whitespace within the try {} catch block in SimpleFacets
* I think you should also make the constant name consistent with the facet parameter: s/FACET_NAMEDISTINCT/FACET_NUM_FACET_TERMS/
* as Lance noted (in a not necessarily appropriate tone, but this is a different issue), switch to a constant / enum rather than a number, something like [ COUNTS, COUNTS_AND_VALUES ]
* if the termList is not null the results are all implicit, meaning it's always the number of terms you specify in the term list, right? I think we should not support this, i.e. only compute the count if no term list is specified
* If you are asking for COUNTS_AND_FACETS (the 2 case) it seems we should check if the limit is already -1 so we don't compute that twice?
* I think you should use a switch / case or an if / else construct instead of having 3 plain if statements

I only considered the last patch you uploaded; let me know if I should look at something else.

Simon

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Assignee: Simon Willnauer
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
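The suggestion to replace the numeric values with named constants could look roughly like the sketch below. The numeric codes 0, 1 and 2 come from the example requests in the issue; the mapping of 1 to "counts only" and 2 to "counts and values" is an assumption based on the review comment about "the 2 case", and the class, enum and method names are hypothetical.

```java
final class NumFacetTermsParam {

    /** Hypothetical named modes for the numeric facet.numFacetTerms values. */
    enum Mode {
        OFF(0), COUNTS(1), COUNTS_AND_VALUES(2);

        final int code;

        Mode(int code) {
            this.code = code;
        }
    }

    /** The default is a named constant too, as the review asks. */
    static final Mode DEFAULT_MODE = Mode.OFF;

    /** Maps the raw request parameter to a mode; absent means the default. */
    static Mode parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_MODE;
        }
        final int code;
        try {
            code = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a number: " + raw, e);
        }
        for (Mode m : Mode.values()) {
            if (m.code == code) {
                return m;
            }
        }
        throw new IllegalArgumentException("unsupported value: " + raw);
    }
}
```

With the parameter parsed once into a `Mode`, the calling code can use a single `switch` over the enum instead of three independent `if` statements on a magic number.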
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048480#comment-13048480 ]

Lance Norskog commented on SOLR-2242:
-------------------------------------

There is a lot of complexity here, and having a bunch of orthogonal parameters is not quite enough. Looking at everything around facets, and group collapse, and the join trick, the Solr query syntax looks like the database world right before SQL.

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch

When returning facet.field=<name of field> you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1.

The feature is called namedistinct. Here is an example:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1

Here is an example on field hgid (without namedistinct):

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="HGPY045FD36D4000A">1</int>
    <int name="HGPY0FBC6690453A9">1</int>
    <int name="HGPY1E44ED6C4FB3B">1</int>
    <int name="HGPY1FA631034A1B8">1</int>
    <int name="HGPY3317ABAC43B48">1</int>
    <int name="HGPY3A17B2294CB5A">5</int>
    <int name="HGPY3ADD2B3D48C39">1</int>
  </lst>
</lst>
{code}

With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns the number of rows (7), not the number of values (11).

{code}
<lst name="facet_fields">
  <lst name="hgid">
    <int name="_count_">7</int>
  </lst>
</lst>
{code}

This actually works really well for getting the total number of fields for a group.field=hgid.

Enjoy!
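The difference between "number of rows (7)" and "number of values (11)" in the hgid example can be checked with a few lines of Java. This is only a worked illustration over the seven terms listed in the description; the class and method names are hypothetical, and the per-term counts are read from the example response (six terms with one document each and one term with five).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DistinctVersusTotal {

    /** Number of distinct facet terms, i.e. rows in the facet list. */
    static int distinctTerms(Map<String, Integer> facetCounts) {
        return facetCounts.size();
    }

    /** Sum of the per-term document counts. */
    static int totalCount(Map<String, Integer> facetCounts) {
        int total = 0;
        for (int count : facetCounts.values()) {
            total += count;
        }
        return total;
    }

    /** The hgid facet values from the example in the issue description. */
    static Map<String, Integer> exampleCounts() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("HGPY045FD36D4000A", 1);
        m.put("HGPY0FBC6690453A9", 1);
        m.put("HGPY1E44ED6C4FB3B", 1);
        m.put("HGPY1FA631034A1B8", 1);
        m.put("HGPY3317ABAC43B48", 1);
        m.put("HGPY3A17B2294CB5A", 5);
        m.put("HGPY3ADD2B3D48C39", 1);
        return m;
    }
}
```

Here `distinctTerms(exampleCounts())` is 7 and `totalCount(exampleCounts())` is 11, matching the two numbers quoted in the description; the patch reports the first.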
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048964#comment-13048964 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Lance,

There are literally 15 lines of code changes. Not sure how you cannot follow it. I could use no memory and just loop through the results, but that would not be cached, so the speed would still be slow since I need to pull in the array in order to count it.

The field is not called namedistinct anymore. It is called facet.numFacetTerms=2,1,0. All other parameters are good. Also, you do not need anything else to get it to work, since I set the defaults to work for you now.

I'll see if I can write some more tests. Here is the rub: I would be happy to write hundreds of test cases if I knew someone was going to actually help me get this done. I am used to having a committer actually work with me - Mike McCandless is awesome and we worked on several issues together. But I have seen tons of features die when no one is willing to help. So here I am wanting, willing and able to get this done. And I have no one willing to assist from a committer perspective...

The patch works fine in sharded and normal mode. So people can use it today. It is just not committed. I have 4 clients using it in production and one has 100M page views a year, and so far no problems.

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048461#comment-13048461 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Can we PLEASE commit this? What else do we need to add?

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048466#comment-13048466 ]

Mark Miller commented on SOLR-2242:
-----------------------------------

Hmm...yeah, fair amount of work went on here and a fair amount of interest... unfortunately, not my field (and I'm sick, on vacation, out of the country, and blah blah blah :) ). But, if no one takes this, I can get up to speed eventually - I doubt that soon though. Sorry Bill - not a lot of committers fluent in this area that are not very busy with other things.

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048471#comment-13048471 ]

Lance Norskog commented on SOLR-2242:
-------------------------------------

Yeah, my itch started just now also :)

Constraint means any facet value: terms, numerical ranges, query results. Range queries have the same situation: when I give range endpoints and a gap, I want to know how many intervals it made from the gap. That would be the analog of this count. I'm not saying this patch has to do range counts also, but pointing out the eventual scope of this feature.

Therefore, 'numTerms' is not the word we're looking for. 'count' or 'total' seem right.

Below, both *features:{* and *popularity:{* need counts.

{code}
facet_counts:{
  facet_queries:{
    *:*:27},
  facet_fields:{
    features:[
      facet_terms,[
        2,7,]]}
  facet_ranges:{
    popularity:{
      counts:[
        0,3,
        2,0,
        4,1,
        6,9],
      gap:2,
      start:0,
      end:8
{code}

p.s. I got the above from the example electronic shop database with this query: [click to see|http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=0&indent=on&facet.field=popularity&facet=true&facet.numTerms=true&facet.query=*:*&wt=json&facet.range.start=0&facet.range.end=7&facet.range.gap=2&facet.range=popularity]

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
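The range analog Lance describes, "how many intervals it made from the gap", is a ceiling division over the range. The sketch below assumes integer endpoints and that a partial final interval still counts as one bucket, which is what the response in his comment suggests (the request asks for an end of 7, the response reports an end of 8 with four buckets). The class and method names are hypothetical.

```java
final class RangeIntervals {

    /**
     * Number of gap-sized intervals needed to cover [start, end).
     * A final interval that overshoots the end still counts as one.
     */
    static int intervalCount(int start, int end, int gap) {
        if (gap <= 0) {
            throw new IllegalArgumentException("gap must be positive");
        }
        if (end <= start) {
            return 0;
        }
        // Ceiling division without floating point.
        return (end - start + gap - 1) / gap;
    }
}
```

For the request in the comment, `intervalCount(0, 7, 2)` gives 4, matching the four buckets starting at 0, 2, 4 and 6 in the response.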
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048474#comment-13048474 ]

Lance Norskog commented on SOLR-2242:
-------------------------------------

If I were a committer, which I'm not, I would demand:

* params would be as simple as possible. 'namedistinct' would be a symbol like 'facet.method=enum'. Facets have exploded in complexity, and I can't follow how everything interlocks. The API may have to change later.
* no white-space glitches
* consistencyConsistencyConsistency.
* there has to be a way to use less memory when we're only pulling a count.
* unit tests. It's somewhat unfair to expect you to write all the unit tests required to make sure this does not break anything else, given that so many of the facet features do not have tests.

Anyway, food calls. Hope this helps.

Get distinct count of names for a facet field
---------------------------------------------

Key: SOLR-2242
URL: https://issues.apache.org/jira/browse/SOLR-2242
Project: Solr
Issue Type: New Feature
Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
Fix For: 4.0
Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044727#comment-13044727 ] Bill Bell commented on SOLR-2242: - Since we changed the output of the facet_fields, the FacetComponent.java needs to change. This also impacts the DistribFieldFacet type. This code is not going to work, since price does not just have a list of numbers. It now has multiple lists (if we set the param). We might want to always return counts list in all cases. Then sharding can easily pick up on this... The DistribFieldFacet needs to be refactored. {code} lst name=facet_fields lst name=price int name=numFacetTerms14/int lst name=countsint name=0.03/intint name=11.51/intint name=19.951/intint name=74.991/intint name=92.01/intint name=179.991/intint name=185.01/intint name=279.951/intint name=329.951/intint name=350.01/intint name=399.01/intint name=479.951/intint name=649.991/intint name=2199.01/int /lst /lst /lst {code} Get distinct count of names for a facet field - Key: SOLR-2242 URL: https://issues.apache.org/jira/browse/SOLR-2242 Project: Solr Issue Type: New Feature Components: Response Writers Affects Versions: 4.0 Reporter: Bill Bell Priority: Minor Fix For: 4.0 Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch When returning facet.field=name of field you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called namedistinct. 
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044730#comment-13044730 ]

Bill Bell commented on SOLR-2242:
---------------------------------

It would be easier for sharding not to have multiple lists. I could use some help if we want to change it, since I have not worked with FacetComponent.java. Otherwise, the simpler fix would be to just add it and flatten the lists.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}

Not ideal, but easier for v1? I could also just remove numFacetTerms=2 for now. It will only require an if statement to ignore the type check for numFacetTerms.

Here is a patch that works with sharding.

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

Enjoy.

Bill
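The "if statement" idea from the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: ShardFacetMerge and merge() are hypothetical names, each shard response is modelled as a plain map, and none of this is the actual FacetComponent or DistribFieldFacet code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merge flattened per-shard facet lists while skipping
// the reserved numFacetTerms entry. Not Solr's real merging code.
class ShardFacetMerge {
    // Reserved key taken from the flattened response shown in the comment.
    static final String NUM_TERMS_KEY = "numFacetTerms";

    /** Sums per-term counts across shards, ignoring the reserved entry. */
    static Map<String, Long> merge(List<Map<String, Long>> shardResponses) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Map<String, Long> shard : shardResponses) {
            for (Map.Entry<String, Long> e : shard.entrySet()) {
                if (NUM_TERMS_KEY.equals(e.getKey())) {
                    continue; // not a field value, so it must not be treated as one
                }
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> shardA = new LinkedHashMap<>();
        shardA.put(NUM_TERMS_KEY, 2L);
        shardA.put("0.0", 3L);
        shardA.put("11.5", 1L);

        Map<String, Long> shardB = new LinkedHashMap<>();
        shardB.put(NUM_TERMS_KEY, 2L);
        shardB.put("11.5", 1L);
        shardB.put("19.95", 1L);

        Map<String, Long> merged = merge(List.of(shardA, shardB));
        // The distinct count across shards is the size of the merged map,
        // which can be smaller than the sum of the per-shard numFacetTerms.
        System.out.println("distinct terms: " + merged.size());
        System.out.println(merged);
    }
}
```

In this made-up data each shard reports two terms, but the term 11.5 occurs on both, so simply adding the per-shard numFacetTerms values would overcount.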
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040096#comment-13040096 ]

Bill Bell commented on SOLR-2242:
---------------------------------

From Rajani:

The SOLR-2242 patch for getting the count of distinct facet terms doesn't work for distributedProcess (https://issues.apache.org/jira/browse/SOLR-2242).

The error log says:

HTTP ERROR 500
Problem accessing /solr/select. Reason:
For input string: "numFacetTerms"

java.lang.NumberFormatException: For input string: "numFacetTerms"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:403)
    at java.lang.Long.parseLong(Long.java:461)
    at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:331)
    at org.apache.solr.schema.TrieField.toInternal(TrieField.java:344)
    at org.apache.solr.handler.component.FacetComponent$DistribFieldFacet.add(FacetComponent.java:619)
    at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:265)
    at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:235)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

The query I passed:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=2&facet.field=648&facet.mincount=1&facet.limit=-1&f.2.facet.numFacetTerms=1&rows=0&shards=localhost:8983/solr,localhost:8985/solrtwo

Can anyone suggest the changes I need to make to enable the same functionality for shards? When I run it against a single core, I get the correct results. I have applied the SOLR-2242 patch to Solr 1.4.1.

Awaiting your reply.

Regards,
Rajani
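The top frames of the stack trace above show Long.parseLong being handed the string numFacetTerms while shard responses are merged for a numeric field. A minimal, standalone reproduction of just that parsing step might look like this; NumFacetTermsParseDemo and parsesAsLong are hypothetical names for illustration, and the sketch does not touch any Solr classes.

```java
// Hypothetical sketch: shows why a response key such as "numFacetTerms"
// cannot be handled like a numeric facet value.
class NumFacetTermsParseDemo {
    /** Returns true when the key parses as a long, false otherwise. */
    static boolean parsesAsLong(String key) {
        try {
            Long.parseLong(key);
            return true;
        } catch (NumberFormatException e) {
            // Print the JDK's message for the rejected input.
            System.out.println(e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesAsLong("648"));           // a genuine numeric term
        System.out.println(parsesAsLong("numFacetTerms")); // the extra entry added by the patch
    }
}
```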
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031539#comment-13031539 ]

Bill Bell commented on SOLR-2242:
---------------------------------

OK. Can you point me in the right direction? Are you a committer? Can we get this committed?
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026297#comment-13026297 ]

Jonathan Rochkind commented on SOLR-2242:
-----------------------------------------

Wonderful, much better. Thanks Lance, this is a much clearer and more flexible API, consistent with other parts of Solr. (And it is a feature I could definitely use; thanks Bill.)

But I wonder: should it be facet.numTerms, to group it with the other faceting-related params? Or wait, is it already?
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026823#comment-13026823 ]

Lance Norskog commented on SOLR-2242:
-------------------------------------

I changed it to 'facet.numTerms'.

There is still a big performance problem: numTerms builds the entire list of facets and then reports the length of the list. This could be done more efficiently.
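The difference between building the whole list and keeping a running counter can be sketched as below. This is a hedged illustration only: NumTermsCount, viaList and viaCounter are hypothetical names, the term counts are modelled as an in-memory map, and the real faceting code paths in Solr are considerably more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch comparing two ways to obtain the number of facet terms.
class NumTermsCount {
    /** Materializes every qualifying term, then reports the list length. */
    static int viaList(Map<String, Integer> termCounts, int minCount) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, Integer> e : termCounts.entrySet()) {
            if (e.getValue() >= minCount) {
                terms.add(e.getKey());
            }
        }
        return terms.size();
    }

    /** Keeps only a running counter, so nothing is allocated per term. */
    static int viaCounter(Map<String, Integer> termCounts, int minCount) {
        int n = 0;
        for (int count : termCounts.values()) {
            if (count >= minCount) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, Integer> termCounts = new LinkedHashMap<>();
        termCounts.put("a", 3);
        termCounts.put("b", 1);
        termCounts.put("c", 0); // filtered out when minCount is 1
        System.out.println(viaList(termCounts, 1));
        System.out.println(viaCounter(termCounts, 1));
    }
}
```

Both methods return the same number; the counter variant only pays off when the facet list itself is not also being returned to the client.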
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026826#comment-13026826 ]

Bill Bell commented on SOLR-2242:
---------------------------------

I am not seeing the performance problem. If you are outputting the facets anyway, the loop runs and the list gets built regardless, so in that case it is probably as efficient as it can be. That is why I had the 0/1/2 settings. I was reusing the code and just looking at the list size:

countFacetTerms.size()
counts.size()

There is a lot of logic in getListedTermCounts() and getTermCountsLimit(). If we optimize and just add a counter, we need to make sure the new methods are not forgotten about (test cases?). I have seen that happen numerous times.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026827#comment-13026827 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Also, I thought you wanted to change the name to numNames? I am okay with numTerms too.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026828#comment-13026828 ]

Bill Bell commented on SOLR-2242:
---------------------------------

It would be good to be able to cache just the value, instead of building a list that also gets cached.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026863#comment-13026863 ]

Lance Norskog commented on SOLR-2242:
-------------------------------------

bq. There is a lot of logic in getListedTermCounts() and getTermCountsLimit(). If we optimize, and just add a counter, we need to make sure the new methods are not forgotten about (test cases?). I have seen that happen numerous times.

Ayup. In fact this breaks SimpleFacetsTest. Everything in facets needs tests.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026103#comment-13026103 ]

Bill Bell commented on SOLR-2242:
---------------------------------

Lance Norskog,

What do you want it to be called? I could use a committer to take this issue on. It has several votes and lots of downloads. People are using it successfully already.

Do you want me to switch numFacetTerms to numFacetNames? Anything else? I feel like we are going in circles on this issue.

{code}
This will output the numFacetTerms AND hgid:

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=hgid&facet.mincount=1&f.hgid.facet.numFacetTerms=2

<lst name="facet_fields">
  <lst name="hgid">
    <int name="numFacetTerms">7</int> <!-- this is not 11 -->
    <lst name="counts">
      <int name="HGPY045FD36D4000A">1</int>
      <int name="HGPY0FBC6690453A9">1</int>
      <int name="HGPY1E44ED6C4FB3B">1</int>
      <int name="HGPY1FA631034A1B8">1</int>
      <int name="HGPY3317ABAC43B48">1</int>
      <int name="HGPY3A17B2294CB5A">5</int>
      <int name="HGPY3ADD2B3D48C39">1</int>
    </lst>
  </lst>
</lst>
{code}
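The "7, not 11" distinction in the hgid example comes down to counting entries versus summing their counts. As a small worked illustration using the counts from that example (DistinctVsTotal and its two methods are hypothetical names, not part of the patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: number of distinct facet terms versus the sum of
// their counts, using the hgid figures quoted in the thread.
class DistinctVsTotal {
    /** Number of distinct terms, i.e. rows in the facet list. */
    static int distinctTerms(Map<String, Integer> counts) {
        return counts.size();
    }

    /** Sum of all per-term counts. */
    static int totalValues(Map<String, Integer> counts) {
        int sum = 0;
        for (int c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    static Map<String, Integer> hgidCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("HGPY045FD36D4000A", 1);
        counts.put("HGPY0FBC6690453A9", 1);
        counts.put("HGPY1E44ED6C4FB3B", 1);
        counts.put("HGPY1FA631034A1B8", 1);
        counts.put("HGPY3317ABAC43B48", 1);
        counts.put("HGPY3A17B2294CB5A", 5);
        counts.put("HGPY3ADD2B3D48C39", 1);
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = hgidCounts();
        System.out.println("distinct terms: " + distinctTerms(counts)); // 7
        System.out.println("total values:   " + totalValues(counts));   // 11
    }
}
```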
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024966#comment-13024966 ]

Lance Norskog commented on SOLR-2242:
-------------------------------------

From the patch:

bq. {{public static final String FACET_NAMEDISTINCT = FACET + ".numFacetTerms";}}

So, in this issue, a _name_ is what everything else calls a _term_. Please change this in the patch.