I've been trying to dissect the MLT component and understand how it works. Every time I think I have the process figured out, I somehow end up more confused. Here is my best guess so far at how the process and flow work:

1. request comes in, and is routed to distributed section of SearchHandler
2. request is sent to each shard
3. after the shard returns a list of Doc IDs, new MLT requests are created, one for each Doc ID. (this happens in handleResponses())
4. each MLT request is processed on the same shard (this happens in process())
5. shard returns MLT results, which are collated (this happens in finishStage())
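
To make the guess in step 3 concrete, here is a toy sketch of the fan-out: one follow-up request per returned Doc ID, addressed back to the shard that returned it. MltRequest, fanOut, and the shard and document names are hypothetical stand-ins for illustration only, not Solr classes, and this may not match what the component actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy sketch only: none of these names come from Solr.
class MltFanOutSketch {
    // Hypothetical stand-in for a per-document follow-up request.
    static final class MltRequest {
        final String shard;
        final String docId;

        MltRequest(String shard, String docId) {
            this.shard = shard;
            this.docId = docId;
        }

        @Override
        public String toString() {
            return "mlt(" + docId + ")@" + shard;
        }
    }

    // One follow-up request per returned Doc ID (step 3), each addressed
    // to the shard that returned the ID (step 4).
    static List<MltRequest> fanOut(String shard, List<String> docIds) {
        List<MltRequest> requests = new ArrayList<>();
        for (String id : docIds) {
            requests.add(new MltRequest(shard, id));
        }
        return requests;
    }

    public static void main(String[] args) {
        System.out.println(fanOut("shard1", Arrays.asList("doc7", "doc9")));
        // prints [mlt(doc7)@shard1, mlt(doc9)@shard1]
    }
}
```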

I don't think this is quite right, though, because it doesn't match my print statements. I also noticed that the Purpose isn't 400 but 401. What's up with this? Is 401 a code for something else?
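
One possible reading of the 401, offered as a guess rather than a confirmed answer: if the purpose is a set of bit flags and the print statement shows it in hex, then 401 would simply be 0x400 with a second flag, 0x01, OR'ed in. The constant names and values in this sketch are assumptions for illustration and are not copied from the Solr source.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of bit-flag "purpose" values. Names and values are assumed.
class PurposeFlags {
    static final int PURPOSE_PRIVATE = 0x01;          // assumed value
    static final int PURPOSE_GET_MLT_RESULTS = 0x400; // assumed value

    // Lists which of the two assumed flags are set in a purpose value.
    static List<String> decode(int purpose) {
        List<String> names = new ArrayList<>();
        if ((purpose & PURPOSE_PRIVATE) != 0) {
            names.add("PRIVATE");
        }
        if ((purpose & PURPOSE_GET_MLT_RESULTS) != 0) {
            names.add("GET_MLT_RESULTS");
        }
        return names;
    }

    public static void main(String[] args) {
        int purpose = PURPOSE_GET_MLT_RESULTS | PURPOSE_PRIVATE;
        System.out.println(Integer.toHexString(purpose)); // prints 401
        System.out.println(decode(purpose)); // prints [PRIVATE, GET_MLT_RESULTS]
    }
}
```

If the value was printed in decimal instead, this explanation would not apply.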

(As an aside, is it unsafe to assume that the logs will appear in actual chronological order?)


Any advice or pointers at this point would be greatly appreciated. I think I'm going in circles.


-mike



On Aug 21, 2009, at 12:54 PM, Jason Rutherglen wrote:

Mike,

I'm also finding the Solr distributed process to be confusing.  Let's
try to add things to the wiki as we learn them?

-J

On Fri, Aug 21, 2009 at 9:52 AM, Mike Anderson<mik...@mit.edu> wrote:
I'm trying to make my way through learning how to modify and write
distributed search components.

A few questions

1. in SearchHandler, when the query is broken down and sent to each shard,
will this request make its way to the process() method of the component
(because it will look like a non-distributed request to the SearchHandler of
the shard)?

2. the comment above the response handling loop (in SearchHandler) says that if any requests are added while in the loop, the loop will break and make
the request immediately. I see that the loop will exit if there is an
exception or if there are no more responses, but I don't see how the new
requests will be called unless it goes through the entire loop again.

3. if one adds a request to rb in the handleResponses method, the request
wouldn't necessarily be made, namely in the event that none of the components
override the distributedProcess method and the loop only goes through once.

4. where can I learn more about the shard.purpose variable? Where in the
component should this be set, if anywhere?
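
For questions 2 and 3 above, here is a toy model of one way such a loop could work: the response-handling loop runs only while the outgoing queue is empty, so a request queued by a handler ends that loop and is sent on the next pass of the enclosing loop. Every name here (ShardLoopSketch, outgoing, inFlight, the request strings) is hypothetical. This is a simplified sketch under that assumption, not the actual SearchHandler code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a send/receive loop. All names are hypothetical.
class ShardLoopSketch {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Deque<String> outgoing = new ArrayDeque<>(); // requests waiting to be sent
        Deque<String> inFlight = new ArrayDeque<>(); // sent, response not yet handled
        outgoing.add("get-ids");

        while (!outgoing.isEmpty()) {
            // Send everything that is currently queued.
            while (!outgoing.isEmpty()) {
                String req = outgoing.poll();
                log.add("send " + req);
                inFlight.add(req);
            }
            // Handle responses, but stop as soon as a handler queues new work.
            while (outgoing.isEmpty()) {
                String rsp = inFlight.poll();
                if (rsp == null) {
                    break; // no more responses to handle
                }
                log.add("handle " + rsp);
                if (rsp.equals("get-ids")) {
                    outgoing.add("mlt-for-doc-1"); // follow-up queued by the handler
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

In this model the follow-up request is sent without any further stage being run, because the enclosing loop re-checks the outgoing queue after the handling loop exits.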


I've taken a look at the wiki page, but if there is more documentation
elsewhere please point me towards it.

Thanks in advance,
Mike


