Thanks for your reply, Yonik:

On Thu, May 21, 2009 at 2:43 AM, Yonik Seeley
<yo...@lucidimagination.com> wrote:
>
> Some thoughts:
>
> #1) This is sort of already implemented in some form... see this
> section of solrconfig.xml and try uncommenting it:
> ...

> > On Wed, May 20, 2009 at 12:43 PM, Yonik Seeley
> > > <yo...@lucidimagination.com> wrote:
> > >    <useFilterForSortedQuery>true</useFilterForSortedQuery>

> > Of course the examples you gave used the default sort (by score) so
> > this wouldn't help if you do actually need to sort by score.

Right - we need to sort by relevance.

> #2) Your problem might be able to be solved with field collapsing on
> the "category" field in the future (but it's not in Solr yet).

Sorry - I didn't understand this.

> #3) Current work I'm doing right now will push Filters down a level
> and check them in tandem with the query instead of after.  This should
> speed things up by at least a factor of 2 in your case.
> https://issues.apache.org/jira/browse/SOLR-1165
>
> I'm trying to get SOLR-1165 finished this week, and I'd love to see
> how it affects your performance.
> In the meantime, try useFilterForSortedQuery and let us know if it
> still works (it's been turned off for a loooong time) ;-)

OK - so SOLR-1165 looks like a way to make filtered queries much faster
by only bothering to score documents that match the filter?  If so,
that's really great, but I'm not sure it particularly helps our use-case
(other than making each filtered query faster) because:

- we've got one query that we want filtered 5 ways, to find the
top-scoring results matching the query under each filter

- the filtering basically divides that query's result set into 5
non-overlapping sets

- the query part is often complicated and expensive - we want to avoid
running it 5 times because our sloppy phrase requirement and often
millions of hits make finding and scoring expensive

- all documents matching the query part will be scored eventually, even
with SOLR-1165, because each of them falls into one of the 5 filters

It is tempting to pass lots of results back to a custom query
component - enough that the 'n' top-scoring documents satisfying each
filter appear - but we may need to pass millions of hits up to the
query component to find, say, the top 5 ranked results for "maps".
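
To make that worry concrete, here is a small Python sketch using
made-up data (the category names, counts and the function name are
assumptions for illustration, not figures or code from our index).  It
measures how far down a single score-ordered hit list one has to read
before every category has contributed its top n:

```python
def depth_needed(ranked_categories, categories, n):
    """Return how many hits of a score-ordered list must be read before
    every category has contributed n hits, or None if that never happens.

    ranked_categories: the category of each hit, best score first.
    """
    remaining = {c: n for c in categories}
    unfilled = len(remaining)
    for depth, cat in enumerate(ranked_categories, start=1):
        if remaining.get(cat, 0) > 0:
            remaining[cat] -= 1
            if remaining[cat] == 0:
                unfilled -= 1
                if unfilled == 0:
                    return depth
    return None


# Hypothetical skew: one "maps" hit ranks 4th, the next four only
# appear after another 996 "books" hits.
ranked = ["books"] * 3 + ["maps"] + ["books"] * 996 + ["maps"] * 4
print(depth_needed(ranked, ["books", "maps"], 5))  # prints 1004
```

With a rare category, the depth is driven almost entirely by where that
category's n-th hit happens to fall, which is why a fixed "fetch the
top K" cutoff seems risky for us.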

It is also tempting to apply the filters one by one in our own query
component, on a scored document list retrieved by SolrIndexSearcher.
I'm not sure - maybe I haven't understood SOLR-1165?
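
Roughly what I have in mind, as a Python sketch rather than actual Solr
code: run the expensive query once, then route each scored hit to a
small per-category heap.  The function name, the (doc_id, score) pairs
and the category_of mapping are all hypothetical stand-ins; a real
version would be a Java search component, and I am assuming each
document belongs to exactly one category.

```python
import heapq


def top_n_per_category(scored_hits, category_of, n):
    """Keep the n best-scoring hits for each category in a single pass.

    scored_hits: iterable of (doc_id, score) from running the query once.
    category_of: mapping doc_id -> category (one category per document).
    Returns {category: [(score, doc_id), ...]} with best scores first.
    """
    if n <= 0:
        return {}
    heaps = {}
    for doc_id, score in scored_hits:
        heap = heaps.setdefault(category_of[doc_id], [])
        item = (score, doc_id)
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # heap[0] is the weakest hit kept so far for this category
            heapq.heapreplace(heap, item)
    return {cat: sorted(heap, reverse=True) for cat, heap in heaps.items()}


hits = [(1, 0.9), (2, 0.5), (3, 0.7), (4, 0.2), (5, 0.8)]
cats = {1: "books", 2: "maps", 3: "books", 4: "maps", 5: "books"}
print(top_n_per_category(hits, cats, 2))
```

Every hit is still scored once, but the query itself only runs once and
memory stays at n entries per category instead of millions of hits.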

Thanks also Walter for your suggestions.  Our users have a requirement
for the index to be continuously updated (well, every 10 minutes or
so), and our queries are extremely diverse/"long tail"ish, so an HTTP
cache will probably not help us.

Kent Fitch
