[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064414#comment-13064414 ]
Bill Bell commented on SOLR-2242:
---------------------------------

Yonik,

Yes, I know about group.ngroups. But the use case still stands. We need a way to add up facet terms without actually counting them.

I restructured the facet_fields XML as you recommended (twice), and the issue is that it breaks ALL sharding. Distribution breaks because it is looking for <int> and not <lst>...

Several people have wanted me to change the name to count, to term, to distinct... I really don't care what the name is, since it makes sense when you try it.

I think changing the distribution is a MUCH larger project. If you want to jump in on the sharding/distribution to make it work with lists, then please help. The format change is a HUGE issue.

The magic names could also be an issue, but ONLY if you use this new feature. It is not an issue for all APIs and usage, which is why I added it as a magic variable.

Do we have any examples with Boolean? I have not seen any... Do we use True/False or on/off? Do you mean like facet=true?

The reason I have a 1 and a 2 is to get the count of terms but return only a smaller set (internal limit=-1, but the user types limit=5). I believe it is very useful.

Having numFacetTerms treated like every other term pretty much works with sharding/distribution. It just gets added together like any other facet count: one server returns numFacetTerms=5, the other returns numFacetTerms=10, and the combined result is 15. It may break some new feature with distribution or something I am not aware of and not using...

Concerning building in memory: having it cached is what I was trying to achieve. If there is another way to cache the result, then let me know the other options. Not having it cached at all is a huge performance problem. If you are using mode 2, it does not matter that much, since you need to return the list and in most cases you have it in memory...
Mode 1 hides it a bit and builds the entire list in memory when we only need to cache the one value... Again, without breaking something else, I am not sure how to achieve that.

As long as there are no more gotchas in distribution, most of the other things you are listing (XML, name change, boolean) are almost preferences, and the XML format change would be a huge issue, so we should be able to commit? Also, I would like to not cache the entire list in memory when using this, and need some assistance.

1. Are there any other distribution/sharding issues with adding a magic variable in facet_fields for a new feature?
2. Where and how do we store a cache value without using the array that is present, so we don't cache the whole facet term list when we only need to cache the resulting number?

Thanks.

> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, 
> SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, 
> SOLR-2242.shard.patch, SOLR-2242.shard.patch, 
> SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, 
> SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for 
> distinct values. This is normal behavior. This patch tells you how many 
> distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct".
> Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int>
>     <int name="11.5">1</int>
>     <int name="19.95">1</int>
>     <int name="74.99">1</int>
>     <int name="92.0">1</int>
>     <int name="179.99">1</int>
>     <int name="185.0">1</int>
>     <int name="279.95">1</int>
>     <int name="329.95">1</int>
>     <int name="350.0">1</int>
>     <int name="399.0">1</int>
>     <int name="479.95">1</int>
>     <int name="649.99">1</int>
>     <int name="2199.0">1</int>
>   </lst>
> </lst>
> {code}
> Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
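
For illustration only, the shard merge described in the comment (one server returns numFacetTerms=5, the other returns 10, and the combined result is 15) could be sketched as below. This is not code from the patch and not Solr's actual merge logic. The function name merge_shards and both shard responses are made up for the example; only the <lst>/<int> layout and the numFacetTerms entry name come from the sample response above.

```python
import xml.etree.ElementTree as ET


def merge_shards(shard_xmls):
    """Add up every <int> entry of one facet field across shard responses.

    Hypothetical helper: the magic numFacetTerms entry is summed exactly
    like an ordinary term count, which is the behaviour the comment relies on.
    """
    merged = {}
    for xml_text in shard_xmls:
        field = ET.fromstring(xml_text)  # one <lst name="price"> element
        for entry in field.findall("int"):
            name = entry.get("name")
            merged[name] = merged.get(name, 0) + int(entry.text)
    return merged


# Made-up per-shard responses for the "price" field (term lists shortened).
shard_a = (
    '<lst name="price">'
    '<int name="numFacetTerms">5</int>'
    '<int name="0.0">2</int><int name="11.5">1</int>'
    "</lst>"
)
shard_b = (
    '<lst name="price">'
    '<int name="numFacetTerms">10</int>'
    '<int name="0.0">1</int><int name="19.95">1</int>'
    "</lst>"
)

merged = merge_shards([shard_a, shard_b])
print(merged["numFacetTerms"])  # 5 + 10
print(merged["0.0"])            # 2 + 1
```

Because the merge looks only at <int> children, the magic entry passes through without any format change, which is the argument for keeping numFacetTerms inside the field's list. Note that a plain sum equals the exact number of distinct terms only when no term appears on more than one shard.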