Re: distributed search components

Mike Anderson Fri, 21 Aug 2009 15:36:14 -0700

I've been trying to dissect the MLT component and understand how it works. Every time I think I have the process figured out, I somehow just end up more confused. Here is my best guess so far at how the process and flow work:

1. request comes in and is routed to the distributed section of SearchHandler

2. request is sent to each shard

3. after the shard returns a list of Doc IDs, new MLT requests are created, one for each Doc ID. (this happens in responseHandler())

4. each MLT request is processed on the same shard (this happens in process())

5. shard returns MLT results, which are collated (this happens in finishedStage())

although I don't think this is quite right because it doesn't match my print statements. I also noticed that the Purpose isn't 400 but 401. What's up with this? Is 401 a code for something else?
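One possible reading of the 401, offered as an assumption rather than something confirmed in this thread: if the purpose value is a bit mask and the log prints it in hex, then 401 would simply be two flags OR-ed together. The sketch below shows that arithmetic. The flag names and values are hypothetical and chosen only for illustration; the real constants should be checked in the Solr source.

```java
import java.util.ArrayList;
import java.util.List;

public class PurposeFlags {
    // Hypothetical flag values for illustration only; not taken from Solr.
    static final int PRIVATE = 0x001; // assumed: low bit marks an internal request
    static final int GET_MLT = 0x400; // assumed: bit used by the MLT request

    /** Returns the names of the (hypothetical) flags set in a purpose value. */
    static List<String> decode(int purpose) {
        List<String> names = new ArrayList<>();
        if ((purpose & PRIVATE) != 0) names.add("PRIVATE");
        if ((purpose & GET_MLT) != 0) names.add("GET_MLT");
        return names;
    }

    public static void main(String[] args) {
        int purpose = GET_MLT | PRIVATE;
        // Prints: 401 [PRIVATE, GET_MLT]
        System.out.println(Integer.toHexString(purpose) + " " + decode(purpose));
    }
}
```

If that reading holds, a check such as `(purpose & GET_MLT) != 0` would match both 400 and 401, whereas an equality test against 400 would miss the second.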

(as an aside, is it unsafe to assume that the logs will appear in actual chronological order?)

Any advice or pointers at this point would be greatly appreciated. I think I'm going in circles.



-mike



On Aug 21, 2009, at 12:54 PM, Jason Rutherglen wrote:

Mike,

I'm also finding the Solr distributed process to be confusing. Let's
try to add things to the wiki as we learn them?

-J

On Fri, Aug 21, 2009 at 9:52 AM, Mike Anderson<mik...@mit.edu> wrote:
I'm trying to make my way through learning how to modify and write
distributed search components.

A few questions
1. in SearchHandler, when the query is broken down and sent to each
shard, will this request make its way to the process() method of the
component (because it will look like a non-distributed request to the
SearchHandler of the shard)?
2. the comment above the response handling loop (in SearchHandler)
says that if any requests are added while in the loop, the loop will
break and make the request immediately. I see that the loop will exit
if there is an exception or if there are no more responses, but I
don't see how the new requests will be called unless it goes through
the entire loop again.
3. if one adds a request to rb in the handleResponses method, this
wouldn't necessarily be called, namely in the event that none of the
components override the distributedProcess method, and the loop only
goes through once.
4. where can I learn more about the shard.purpose variable? Where in
the component should this be set, if anywhere?
I've taken a look at the wiki page, but if there is more documentation
elsewhere please point me towards it.

Thanks in advance,
Mike
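
For question 2, here is a toy model of one way a "break out and send immediately" loop can work. It assumes the response-handling loop is nested inside an outer loop that keeps running while requests are queued. This is not the SearchHandler source: the class, the queue names, and the "ids" and "mlt" request labels are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class StageLoopSketch {
    /** Runs one stage of a toy request loop and returns the order of events. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Deque<String> outgoing = new ArrayDeque<>(); // requests waiting to be sent
        Deque<String> inFlight = new ArrayDeque<>(); // sent, response not yet handled
        outgoing.add("ids");                         // hypothetical first request

        while (!outgoing.isEmpty()) {
            // Send everything that is currently queued.
            while (!outgoing.isEmpty()) {
                String req = outgoing.poll();
                log.add("send " + req);
                inFlight.add(req);
            }
            // Handle responses only while no new request has been queued.
            while (outgoing.isEmpty()) {
                String done = inFlight.poll();
                if (done == null) break;             // no more responses
                log.add("handle " + done);
                if (done.equals("ids")) {
                    outgoing.add("mlt");             // hypothetical follow-up request
                }
            }
        }
        log.add("finish stage");
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this model the follow-up request added while handling a response makes the inner loop's condition false, so control returns to the outer loop, which sends it without a second pass through the whole stage. Whether the real loop is structured this way needs to be checked against the source.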
