Re: Passing IDs in query takes more time

Erick Erickson Fri, 06 May 2016 10:01:00 -0700

Well, you're parsing 80K IDs and forming them into a query. Consider
what has to happen. Even in the very best case of the <other criteria>
being evaluated first, for every doc that satisfies that clause the inverted
index must be examined 80,000 times to see if that doc matches
one of the IDs in your huge clause for scoring purposes.


You might be better off moving the 80K list to an fq clause like
fq={!cache=false}doc_id:(111 222 333).

Additionally, you probably want to use the TermsQueryParser, something like:
fq={!terms f=id cache=false}111,222,333
see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-TermsQueryParser
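
For illustration, here is a rough Python sketch of how a client might assemble
such a request. It only builds the parameters and does not contact Solr. The
helper name build_terms_filter_params, the "category:books" clause standing in
for <other criteria>, and the choice to send the parameters as a form-encoded
POST body are all assumptions for the example, not something from the original
setup:

```python
from urllib.parse import urlencode


def build_terms_filter_params(ids, other_criteria, id_field="id", rows=250):
    """Build Solr request parameters that keep a large ID list out of q.

    Hypothetical helper: the ID list goes into an uncached fq handled by the
    terms query parser, and only the remaining criteria stay in q.
    """
    # The terms query parser reads a comma-separated list of values.
    id_list = ",".join(str(doc_id) for doc_id in ids)
    return {
        "q": other_criteria,
        "fq": "{!terms f=%s cache=false}%s" % (id_field, id_list),
        "rows": str(rows),
    }


# "category:books" is a made-up stand-in for <other criteria>.
params = build_terms_filter_params([111, 222, 333], "category:books")

# Form-encode the parameters so they can be sent as a POST body; a GET URL
# carrying 80K IDs would likely run into URL length limits.
body = urlencode(params).encode("utf-8")
```

With the real 80K list, ids would simply be the list fetched from collection1,
and id_field would be whatever field holds those IDs in collection2.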

In any case, though, an 80K clause will slow things down considerably.

Best,
Erick

On Thu, May 5, 2016 at 2:42 AM, Bhaumik Joshi <bhaumik.jo...@outlook.com> wrote:
> Hi,
>
>
> I am retrieving ids from collection1 based on some query and passing those
> ids as a query to collection2. The query to collection2 that contains the
> ids takes much more time compared to a normal query.
>
>
> Que. 1 - Why does a query take more time when ids are passed, compared to a
> normal query, even though passing ids narrows the criteria?
>
> e.g.  query-1: doc_id:(111 222 333 444 ...) AND <other criteria> is slower
> (takes 7-9 sec) than
>
> only <other criteria> (700-800 ms). Please note that in this case I am
> passing 80k ids and retrieving 250 rows.
>
>
> Que. 2 - Any idea how I can achieve the above (get ids from one collection and
> pass those ids to the other one) in an efficient manner, or any other way to
> get data from one collection based on the response of another collection?
>
>
> Thanks & Regards,
>
> Bhaumik Joshi
