[ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064414#comment-13064414
 ] 

Bill Bell commented on SOLR-2242:
---------------------------------

Yonik,

Yes, I know about group.ngroups, but the use case still stands. We need a way 
to get the total number of facet terms without having to count them ourselves.

I had restructured the facet_fields XML as you recommended (twice), and the 
issue is that it breaks ALL sharding. It breaks distribution because the 
distributed code is looking for <int> and not <lst>. Several people have wanted 
me to change the name to count, to term, to distinct... I really don't care 
what the name is, since it makes sense once you try it. I think changing the 
distribution code is a MUCH larger project. If you want to jump in on the 
sharding/distribution code to make it work with lists, then please help. The 
format change is a HUGE issue. The magic names could also be an issue, but ONLY 
if you use this new feature. It does not affect any other API or usage, which 
is why I added it as a magic variable.

Do we have any examples of Boolean parameters? I have not seen any. Do we use 
true/false or on/off? Do you mean like facet=true? The reason I have both a 1 
and a 2 is to get the count of all terms but return only a smaller set 
(internal limit=-1, while the user specifies limit=5). I believe that is very 
useful.
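
A rough sketch of the mode idea described above, assuming that 0 means the 
feature is off, 1 means only the count is returned, and 2 means the count plus 
the terms, cut to the user's limit. The class and method names are 
hypothetical and this is not the code in the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NumFacetTermsModeSketch {

    // Hypothetical helper, NOT the patch's code.
    // allTerms:  every term that passed mincount, as if computed with an
    //            internal limit of -1.
    // userLimit: the limit the user asked for (negative means no limit).
    // mode:      ASSUMED meaning: 0 = feature off, 1 = count only,
    //            2 = count plus terms.
    static Map<String, Integer> buildFacetField(Map<String, Integer> allTerms,
                                                int userLimit, int mode) {
        Map<String, Integer> out = new LinkedHashMap<>();
        if (mode == 1 || mode == 2) {
            // The count covers all terms, not just the ones returned.
            out.put("numFacetTerms", allTerms.size());
        }
        if (mode == 0 || mode == 2) {
            int remaining = userLimit < 0 ? allTerms.size() : userLimit;
            for (Map.Entry<String, Integer> e : allTerms.entrySet()) {
                if (remaining-- <= 0) {
                    break;
                }
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> terms = new LinkedHashMap<>();
        terms.put("0.0", 3);
        terms.put("11.5", 1);
        terms.put("19.95", 1);
        // Count of all three terms, but only the first two are returned.
        System.out.println(buildFacetField(terms, 2, 2));
    }
}
```

Because the count sits in the same list as the terms, a real term named 
numFacetTerms would collide with it. That is the magic name concern, and it 
only applies when the feature is switched on.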

Having numFacetTerms returned like every other term pretty much works with 
sharding/distribution. It just gets added together like any other facet count: 
one server returns numFacetTerms=5, the other returns numFacetTerms=10, and the 
combined result is 15. It may break some new distribution feature, or something 
else I am not aware of and not using...
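
To illustrate the point above, here is a minimal sketch of a merge that sums 
counts by key across shard responses. It is a simplified stand-in for the 
distributed facet merge, not Solr's actual code, and the class and method names 
are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FacetMergeSketch {

    // Hypothetical helper: adds up the counts reported under the same key by
    // each shard. A numFacetTerms entry is treated like any other <int>.
    static Map<String, Long> mergeShardCounts(List<Map<String, Long>> shardResponses) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Map<String, Long> shard : shardResponses) {
            for (Map.Entry<String, Long> e : shard.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> shardA = new LinkedHashMap<>();
        shardA.put("numFacetTerms", 5L);
        Map<String, Long> shardB = new LinkedHashMap<>();
        shardB.put("numFacetTerms", 10L);
        // Prints {numFacetTerms=15}
        System.out.println(mergeShardCounts(Arrays.asList(shardA, shardB)));
    }
}
```

Note that a plain sum like this matches the true number of distinct terms only 
when the shards have no terms in common. A term present on both shards is 
counted twice.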

Concerning building the list in memory: having it cached is what I was trying 
to achieve. If there is another way to cache the result, let me know the 
options. Not having it cached at all is a huge performance problem. If you are 
using mode 2, it does not matter that much, since you need to return the list 
and in most cases you have it in memory anyway. Mode 1 hides it a bit: it 
builds the entire list in memory when we only need to cache the one value. 
Again, I am not sure how to achieve that without breaking something else.

As long as there are no more gotchas in distribution, most of the other things 
you are listing (XML, name change, boolean) are mostly preferences, and the XML 
format change would be a huge issue, so we should be able to commit? Also, I 
would like to avoid caching the entire list in memory when using this, and I 
need some assistance with that.

1. Any other distribution/sharding issues with adding a magic variable in 
facet_field for a new feature? 
2. Where and how do we store a cache value without using the array that is 
present so we don't cache the whole facet term list when we only need to cache 
the resulting number?

Thanks.




> Get distinct count of names for a facet field
> ---------------------------------------------
>
>                 Key: SOLR-2242
>                 URL: https://issues.apache.org/jira/browse/SOLR-2242
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>    Affects Versions: 4.0
>            Reporter: Bill Bell
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: NumFacetTermsFacetsTest.java, 
> SOLR-2242-notworkingtest.patch, SOLR-2242.patch, SOLR-2242.patch, 
> SOLR-2242.shard.patch, SOLR-2242.shard.patch, 
> SOLR-2242.shard.withtests.patch, SOLR-2242.solr3.1.patch, 
> SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch
>
>
> When returning facet.field=<name of field> you will get a list of matches for 
> distinct values. This is normal behavior. This patch tells you how many 
> distinct values you have (# of rows). Use with limit=-1 and mincount=1.
> The feature is called "namedistinct". Here is an example:
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
> http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
> This currently only works on facet.field.
> {code}
> <lst name="facet_fields">
>   <lst name="price">
>     <int name="numFacetTerms">14</int>
>     <int name="0.0">3</int>
>     <int name="11.5">1</int>
>     <int name="19.95">1</int>
>     <int name="74.99">1</int>
>     <int name="92.0">1</int>
>     <int name="179.99">1</int>
>     <int name="185.0">1</int>
>     <int name="279.95">1</int>
>     <int name="329.95">1</int>
>     <int name="350.0">1</int>
>     <int name="399.0">1</int>
>     <int name="479.95">1</int>
>     <int name="649.99">1</int>
>     <int name="2199.0">1</int>
>   </lst>
> </lst>
> {code} 
> Several people use this to get the group.field count (the # of groups).


        
