>Are you using a TrieDateField for the dates?
Yes

>Consider creating and re-using a filter for the keywords and let the
>query consist of the date range only.
In this case, do I have to configure any cache, or are Solr's default
settings enough?
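
A minimal sketch of how the suggested split could look as request
parameters: the rarely changing keyword expression goes into a filter query
(fq), which Solr can cache in its filterCache, and only the date range stays
in the main query (q). The helper name build_params, the example date, and
the shortened keyword expression are assumptions made for illustration, not
something taken from the actual setup.

```python
from urllib.parse import urlencode


def build_params(keyword_filter, start_date, rows=100):
    """Build Solr request parameters with the keywords moved into a filter query.

    keyword_filter: the rarely changing boolean keyword expression (hypothetical).
    start_date: lower bound of the date range, in Solr's UTC date format.
    """
    return {
        "q": f"date:[{start_date} TO *]",  # only the part that changes per request
        "fq": keyword_filter,              # reusable filter, eligible for caching
        "sort": "date asc",
        "rows": rows,
    }


# Hypothetical, shortened keyword expression; the real one has 100 - 1000 keywords.
keyword_filter = "(keyword1 AND keyword2) OR (keyword3 AND keyword4)"
params = build_params(keyword_filter, "2012-09-01T00:00:00Z")
query_string = urlencode(params)  # append to the select handler URL
```

The filter is only reused if the fq string is identical from one request to
the next, so the keyword expression would have to be generated in a stable
order.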

>Guessing here: You request all the results from the search, which is
>potentially 100M documents? Solr is not geared towards such massive
>responses. You might have better luck by paging, but even that does not
>behave very well when requesting pages very far into the result set.
We have implemented paging, but the problem is the sort. Solr tries to sort
the dates of all the documents matching the query, so if numFound is very
large, Solr loads all the date values into memory in order to sort them and
goes OOM. Correct me if I am wrong.
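
For reference, a rough sketch of the kind of paging meant here, using the
standard start and rows parameters together with the sort. The helper name
page_params and the page size of 100 are assumptions, not the actual code.

```python
def page_params(page, rows=100):
    """Return Solr paging and sort parameters for a zero-based page number."""
    return {
        "start": page * rows,  # offset of the first document on this page
        "rows": rows,          # number of documents per page
        "sort": "date asc",
    }


first_page = page_params(0)
deep_page = page_params(10000)  # a page very far into the result set
```

With a page this deep the start offset is already 1,000,000, which is the
"pages very far into the result set" case mentioned above.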


On Tue, Sep 11, 2012 at 12:30 PM, Toke Eskildsen
<t...@statsbiblioteket.dk> wrote:

> On Tue, 2012-09-11 at 08:00 +0200, Amey Patil wrote:
> > Our solr index (Solr 3.4) has over 100 million documents.
> [...]
> > *((keyword1 AND keyword2...) OR (keyword3 AND keyword4...) OR ...) AND
> > date:[date1 TO *]*
> > No. of keywords can be in the range of 100 - 1000.
> > We are adding sort parameter *'date asc'*.
>
> Are you using a TrieDateField for the dates?
>
> > The keyword part of the query changes very rarely but date part always
> > changes.
>
> Consider creating and re-using a filter for the keywords and let the
> query consist of the date range only.
>
> [...]
>
> > 2) Sometimes when 'numFound' is very large for a query, It gives OOM error
> > (I guess this is because of sort).
>
> Guessing here: You request all the results from the search, which is
> potentially 100M documents? Solr is not geared towards such massive
> responses. You might have better luck by paging, but even that does not
> behave very well when requesting pages very far into the result set.
>
>
