I'm pretty sure Solr/Lucene have no such "optimization" already, but
it's also not clear to me that one would result in much of a
performance benefit. Just because of the way Lucene evaluates boolean
queries, it's not obvious to me that the second version of your query
would be noticeably faster than the first.
Maybe it would help in cases with many, many clauses, rather than the
few clauses in your example. Before embarking on writing the
'optimization', you'd definitely want to performance-test it to verify
there are any gains -- you can do that just by sending the different
versions of your real-world queries to Solr and comparing the response
times, working out the hypothetically 'optimized' version yourself by
hand if need be, right?
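A minimal sketch of that hand-comparison approach, in Python: build a select URL for each filter variant and time the request. The host, port, and parameter choices here are assumptions for illustration -- adjust them for your own Solr setup.

```python
# Sketch: time the original vs. hand-"optimized" filter against a live
# Solr instance. Endpoint and params are assumptions for illustration.
import time
import urllib.parse
import urllib.request

SOLR_SELECT = "http://localhost:8983/solr/select"  # assumed endpoint

original = '("cat" AND "dog") OR ("cat" AND "horse")'
optimized = '"cat" AND ("dog" OR "horse")'

def select_url(fq):
    """Build a select URL passing the filter as an fq parameter."""
    params = {"q": "*:*", "fq": fq, "rows": 0, "wt": "json"}
    return SOLR_SELECT + "?" + urllib.parse.urlencode(params)

def time_query(fq):
    """Issue the query and return wall-clock seconds (needs a running Solr)."""
    start = time.time()
    urllib.request.urlopen(select_url(fq)).read()
    return time.time() - start

# With Solr running, compare time_query(original) vs. time_query(optimized);
# Solr's own QTime in the JSON response is an even better number to compare.
```

Averaging several runs of each variant (and checking Solr's reported QTime rather than raw wall-clock time) gives a fairer comparison, since caches warm up after the first request.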
On 7/27/2011 5:05 PM, Scott Smith wrote:
We have a Solr application which ends up creating queries with very complicated
filters (literally hundreds and sometimes thousands of terms -- typically a large
number of terms OR'ed together, where each of these terms might have half a
dozen keywords ANDed/ORed together). In looking at the filters, I realized
that there are often a lot of common sub-filters.
A simple example of what I mean is:
("cat" AND "dog") OR ("cat" AND "horse")
This could clearly be simplified by saying:
"cat" AND ("dog" OR "horse")
It turns out that finding and combining common sub-filters isn't trivial for our
application. So, before I start a project to attempt this kind of
"optimization", my question is whether the decrease in query times is likely to be
significant enough to justify the development effort it would take to optimize the
filters. Certainly, if I thought I might get a 20%+ decrease in query time, I'd say it's
probably a good project. If it's just a few percentage points of improvement, then I'm
less excited about doing it.
Does Solr already go through some kind of optimization which effectively
combines common sub-filters and possibly duplicated terms? Does anyone have
any thoughts on this subject?
Thanks
Scott