In step 4, once node 1 has all the responses, it merges and sorts
them. Let's say you requested 15 docs from each shard (because the rows
parameter is 15): at this point node 1 merges the results from all the
responses and gets the "top 15" across all of them. The second request
only fetches the requested "fl" fields for those top 15 docs, and it is
only sent to the nodes that hold at least one of the top 15. You'll see
that the second request has the parameter "ids", with the list of ids of
the documents that have to be retrieved.
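
To make the merge concrete, here is a rough Python sketch of the idea. It
is not Solr's actual code: the function names, the shard names and the
(id, score) tuples are all made up for illustration, and it assumes a plain
sort by descending score.

```python
from collections import defaultdict


def merge_top_docs(shard_responses, rows):
    """Merge per-shard (doc_id, score) lists and keep the global top `rows`.

    shard_responses: dict of shard name -> list of (doc_id, score).
    This data shape is hypothetical, not Solr's wire format.
    """
    candidates = [
        (score, doc_id, shard)
        for shard, docs in shard_responses.items()
        for doc_id, score in docs
    ]
    # Highest score first; ties broken by doc id so the result is deterministic.
    top = sorted(candidates, key=lambda c: (-c[0], c[1]))[:rows]
    return [(doc_id, score, shard) for score, doc_id, shard in top]


def ids_per_shard(top_docs):
    """Group the winning ids by shard: one "ids" value per second request."""
    by_shard = defaultdict(list)
    for doc_id, _score, shard in top_docs:
        by_shard[shard].append(doc_id)
    # Shards with no doc in the top list get no entry, so no second request.
    return {shard: ",".join(ids) for shard, ids in by_shard.items()}


if __name__ == "__main__":
    responses = {
        "shard1": [("a", 9.0), ("b", 4.0), ("c", 1.0)],
        "shard2": [("d", 8.0), ("e", 7.0), ("f", 2.0)],
        "shard3": [("g", 3.0), ("h", 2.5), ("i", 0.5)],
    }
    top = merge_top_docs(responses, rows=3)
    print(top)
    print(ids_per_shard(top))
```

With rows=3 in this toy data, shard3 contributes nothing to the top 3, so
it would not receive the second request at all.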



> In terms of querying + scoring, clearly that has to happen in the shards
> (since only the shard knows the IDF), and the shards only return the
> requested number of documents (15 each in our case).  So it seems like the
> final step 5 just has to sort the 15 x 4 = 60 documents it has been given
> and return the top 15 of those.  However, we are seeing a dis-proportionate
> amount of time in that step (admittedly we are only looking at query times
> in the logs, don't have debug on this system yet).
>
This is correct, but it happens in step 4, not 5.


>
> So I'm thinking what about filtering?  We have some FilterQueries (Post
> Filter implementations) and it seems to be combinations of those which
> cause the massive query times, is it possible the consolidation is then
> trying to run (or re-run ) the filters?
>

All filters are applied in each node before it responds to node 1.

>
> I can include logs and specifics if necessary, but in essence for a
> particular set of queries, step 3 takes about 400ms (on all 4 shards), step
> 4 is 5ms, yet the user response isn't sent out for about 13s(!)
>

The second request is usually much faster than the first one. In this case
the problem may be due to network latency. You can compare the times you
see in the Solr logs with the request log of your servlet container.
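
If it is easier than digging through the container's request log, a
similar comparison can be made from the client side. The sketch below is
only an assumption about how you might do it: it times a request end to end
and compares that with the QTime in the JSON responseHeader. The
compare_times function and the fetch callable are hypothetical names, and
you would plug in your own HTTP call.

```python
import json
import time


def compare_times(fetch):
    """Compare wall-clock time for one request with the QTime Solr reports.

    fetch: zero-argument callable returning the JSON response body as a
    string (hypothetical; wrap whatever HTTP client you use).
    """
    start = time.perf_counter()
    body = fetch()
    wall_ms = (time.perf_counter() - start) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    # A large gap means time is being spent outside what QTime measures,
    # e.g. in the network or in writing the response.
    return {"wall_ms": wall_ms, "qtime_ms": qtime_ms, "gap_ms": wall_ms - qtime_ms}


if __name__ == "__main__":
    # Stand-in for a real request so the sketch runs without a Solr server.
    def fake_fetch():
        time.sleep(0.05)
        return '{"responseHeader": {"status": 0, "QTime": 5}}'

    print(compare_times(fake_fetch))
```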

>
> In terms of the log entries for the distributed queries, I'm assuming the
> logs are written as the queries complete, and the QTime is the time taken
> to run that query?
>

Yes, everything is logged after the fact. You should see a log entry for
the "search" request in the nodes, a log entry for the "fetch" request in
the logs (it may not be in all nodes, if some of them didn't match or didn't
have any doc of the top 15), and finally the main search entry, including
all of the above.


Tomás
