How to avoid double counting for facet query
Hi guys,

I fixed the Solr search UI (solr/browse) to display the price range facet values, following http://thetechietutorials.blogspot.com/2011/06/fix-price-facet-display-in-solr-search.html:

- Under 50 http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B0.0+TO+50%5D (1331)
- [50.0 TO 100] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B50.0+TO+100%5D (133)
- [100.0 TO 150] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B100.0+TO+150%5D (31)
- [150.0 TO 200] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B150.0+TO+200%5D (7)
- [200.0 TO 250] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B200.0+TO+250%5D (2)
- [250.0 TO 300] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B250.0+TO+300%5D (5)
- [300.0 TO 350] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B300.0+TO+350%5D (3)
- [350.0 TO 400] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B350.0+TO+400%5D (6)
- [400.0 TO 450] http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B400.0+TO+450%5D (1)
- 600.0+ http://localhost:9090/solr/browse?q=Shakespeare&fq=price:%5B600.0+TO+*%5D (1)

However, I am having a double counting issue. Here is the URL that returns only docs whose prices are between 110.0 and 160.0, together with the price facets:

http://localhost:8983/solr/select/?q=Shakespeare&version=2.2&rows=0&fq=price:[110.0+TO+160]&facet.query=price:[110%20TO%20160]&facet.query=price:[160%20TO%20200]&facet.field=price

The response is as below:

    <result name="response" numFound="23" start="0" maxScore="0.37042576"/>
    <lst name="facet_counts">
      <lst name="facet_queries">
        <int name="price:[110 TO 160]">23</int>
        <int name="price:[160 TO 200]">1</int>
      </lst>
      ...
    </result>

As you can see, the number of results is 23, yet an extra doc was found in the 160-200 range. Is there any way I can avoid this double counting? Or has anyone had similar issues?

Thanks,
YH
Re: How to avoid double counting for facet query
> <int name="price:[110 TO 160]">23</int>
> <int name="price:[160 TO 200]">1</int>
> </lst>
> ...
> </result>
>
> As you notice, the number of the results is 23, however an extra doc was
> found in the 160-200 range. Any way I can avoid double counting issue?

You can use exclusive range queries, which are denoted by curly brackets:

price:[110 TO 160}
price:[160 TO 200}
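To make the effect of the curly bracket concrete, here is a minimal Python sketch. It builds the half-open facet queries suggested above and, purely as a local illustration, counts a hypothetical list of prices both ways. The function names and the sample prices are assumptions for illustration, not part of Solr; per the rest of this thread, the mixed `[... }` form was only confirmed on trunk.

```python
def half_open_queries(field, bounds):
    """Build one `field:[lo TO hi}` facet query per adjacent pair of bounds."""
    return [f"{field}:[{lo} TO {hi}}}" for lo, hi in zip(bounds, bounds[1:])]


def count_buckets(prices, bounds, upper_inclusive):
    """Count prices per bucket locally, with or without the upper bound included."""
    counts = []
    for lo, hi in zip(bounds, bounds[1:]):
        if upper_inclusive:
            counts.append(sum(lo <= p <= hi for p in prices))
        else:
            counts.append(sum(lo <= p < hi for p in prices))
    return counts


# Hypothetical prices; one document sits exactly on the shared bound 160.
prices = [110.0, 125.5, 160.0, 199.0]
bounds = [110, 160, 200]

queries = half_open_queries("price", bounds)
inclusive = count_buckets(prices, bounds, upper_inclusive=True)
half_open = count_buckets(prices, bounds, upper_inclusive=False)

print(queries)    # the two facet.query values
print(inclusive)  # the doc priced at 160 lands in both buckets
print(half_open)  # each doc lands in exactly one bucket
```

With inclusive upper bounds the bucket counts add up to more than the number of documents, which is the symptom described in the original question; with half-open buckets they add up exactly.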
Re: How to avoid double counting for facet query
Thanks! That's what I was trying to find.

On Tue, Jun 14, 2011 at 1:48 PM, Ahmet Arslan iori...@yahoo.com wrote:
> You can use exclusive range queries which are denoted by curly brackets.
>
> price:[110 TO 160}
> price:[160 TO 200}
Re: How to avoid double counting for facet query
You sure Solr supports that? I am getting exceptions by doing that. Ahmet, do you remember where you saw that documented?

Thanks.

On Tue, Jun 14, 2011 at 1:58 PM, Way Cool way1.wayc...@gmail.com wrote:
> Thanks! That's what I was trying to find.
Re: How to avoid double counting for facet query
> You sure Solr supports that? I am getting exceptions by doing that. Ahmet,
> do you remember where you saw that documented? Thanks.

I tested it with trunk.

https://issues.apache.org/jira/browse/SOLR-355
https://issues.apache.org/jira/browse/LUCENE-996
Re: How to avoid double counting for facet query
That's good to know. From the ticket, it looks like the fix will be in 4.0 then? Currently I can see that {} and [] each work in Solr 3.1, but not combined. I will try 3.2 soon.

Thanks.

On Tue, Jun 14, 2011 at 2:07 PM, Ahmet Arslan iori...@yahoo.com wrote:
> I tested it with trunk.
>
> https://issues.apache.org/jira/browse/SOLR-355
> https://issues.apache.org/jira/browse/LUCENE-996
Re: How to avoid double counting for facet query
> That's good to know. From the ticket, looks like the fix will be in 4.0 then?

It is already committed. You can use trunk:

svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk

> Currently I can see {} and [] worked, but not combined for Solr 3.1. I will
> try 3.2 soon.

After rethinking this: you can simulate the same thing by using a negative clause too:

facet.query=price:[110 TO 160] -price:160

I also saw a facet-by-range example in solrconfig.xml. Maybe this will work for you?

http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range

    <int name="f.price.facet.range.start">0</int>
    <int name="f.price.facet.range.end">600</int>
    <int name="f.price.facet.range.gap">50</int>
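For releases where the mixed `[... }` syntax is not available, the negative-clause trick above can be generated for a whole set of boundaries. The following is a minimal Python sketch; the helper names and the boundary list are hypothetical, and it assumes a term query such as `price:160` matches documents priced exactly at that bound on your field type.

```python
from urllib.parse import urlencode


def inclusive_minus_upper(field, lo, hi):
    """Inclusive range with the upper bound excluded via a negative clause."""
    return f"{field}:[{lo} TO {hi}] -{field}:{hi}"


def bucket_queries(field, bounds):
    """One facet query per adjacent pair of bounds; buckets do not overlap."""
    return [inclusive_minus_upper(field, lo, hi)
            for lo, hi in zip(bounds, bounds[1:])]


queries = bucket_queries("price", [110, 160, 200])

# URL-encode as repeated facet.query parameters (spaces become '+').
query_string = urlencode([("facet.query", q) for q in queries])

print(queries)
print(query_string)
```

Note that the last bucket also excludes its own upper bound (200 here), so a document priced exactly at the top boundary needs a further bucket or an open-ended final range if it should be counted.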
Re: How to avoid double counting for facet query
: You can use exclusive range queries which are denoted by curly brackets.

That will solve the problem of making the fq exclude a bound, but for the range facet counts you'll want to pay attention to facet.range.include:

http://wiki.apache.org/solr/SimpleFacetParameters#facet.range.include

-Hoss
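As a rough illustration of how the range facet parameters fit together in a request, here is a small Python sketch that assembles them for one field. The function name is hypothetical; the start, end, and gap values are the ones from the solrconfig.xml example earlier in the thread, and `lower` is just one possible setting for facet.range.include.

```python
from urllib.parse import urlencode


def range_facet_params(field, start, end, gap, include="lower"):
    """Assemble per-field range facet parameters as (name, value) pairs."""
    prefix = f"f.{field}.facet.range"
    return [
        ("facet", "true"),
        ("facet.range", field),
        (f"{prefix}.start", start),
        (f"{prefix}.end", end),
        (f"{prefix}.gap", gap),
        # "lower" makes each bucket include its lower bound only, so a
        # value sitting on a boundary is counted in a single bucket.
        (f"{prefix}.include", include),
    ]


params = range_facet_params("price", 0, 600, 50)
query_string = urlencode([("q", "Shakespeare"), ("rows", 0)] + params)
print(query_string)
```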
Re: How to avoid double counting for facet query
I already checked out the facet range query. By the way, I did set facet.range.include as below:

    <str name="f.price.facet.range.include">lower</str>

A couple of things I don't like, though:

1. It returns the following without end values (I have to recalculate the end values):

    <lst name="counts">
      <int name="100.0">20</int>
      <int name="150.0">3</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">600.0</float>
    <int name="before">0</int>

2. I can't specify custom ranges of values, for example 1, 2, 3, 4, 5, ... 10, 15, 20, 30, 40, 50, 60, 80, 90, 100, 200, ..., 600, 800, 900, 1000, 2000, ... etc.

Thanks.

On Tue, Jun 14, 2011 at 3:50 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
> : You can use exclusive range queries which are denoted by curly brackets.
>
> that will solve the problem of making the fq exclude a bound, but for the
> range facet counts you'll want to pay attention to look at
> facet.range.include...
>
> http://wiki.apache.org/solr/SimpleFacetParameters#facet.range.include
>
> -Hoss
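On the first complaint, the end value of each bucket can be derived on the client from the bucket's lower bound and the gap. A minimal Python sketch, assuming the counts have already been parsed into a dict that maps each lower bound (as the string Solr returns) to its count; the function names and the label format are hypothetical:

```python
def bucket_ranges(counts, gap):
    """Turn {lower_bound: count} plus a fixed gap into (lower, upper, count) tuples."""
    return [(float(lower), float(lower) + gap, n) for lower, n in counts.items()]


def label(lower, upper):
    """Display label for a bucket that includes its lower bound only."""
    return f"[{lower} TO {upper})"


# Values taken from the response quoted above.
counts = {"100.0": 20, "150.0": 3}
ranges = bucket_ranges(counts, gap=50.0)

for lower, upper, n in ranges:
    print(label(lower, upper), n)
```

This only works for a fixed gap; uneven custom boundaries like the ones in the second complaint would still need one query per bucket.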
Re: How to avoid double counting for facet query
I just checked SolrQueryParser.java from the 3.2.0 source. It looks like Yonik Seeley's changes for LUCENE-996 (https://issues.apache.org/jira/browse/LUCENE-996) are not in. I will check trunk later.

Thanks!

On Tue, Jun 14, 2011 at 5:34 PM, Way Cool way1.wayc...@gmail.com wrote:
> I already checked out facet range query. By the way, I did put the
> facet.range.include as below:
>
>     <str name="f.price.facet.range.include">lower</str>
>
> Couple things I don't like though are:
>
> 1. It returns the following without end values (I have to re-calculate the
>    end values):
>
>     <lst name="counts">
>       <int name="100.0">20</int>
>       <int name="150.0">3</int>
>     </lst>
>     <float name="gap">50.0</float>
>     <float name="start">0.0</float>
>     <float name="end">600.0</float>
>     <int name="before">0</int>
>
> 2. I can't specify custom ranges of values, for example, 1,2,3,4,5,...10,
>    15, 20, 30,40,50,60,80,90,100,200, ..., 600, 800, 900, 1000, 2000, ... etc.
>
> Thanks.