I've been trying to dissect the MLT component and understand how it
works. Every time I think I have the process figured out, I somehow
just end up more confused. Here is my best guess so far at how the
process and flow work:
1. request comes in, and is routed to distributed section of
SearchHandler
2. request is sent to each shard
3. after the shard returns a list of Doc IDs, new MLT requests are
created, one for each Doc ID (this happens in handleResponses())
4. each MLT request is processed on the same shard (this happens in
process())
5. shard returns MLT results, which are collated (this happens in
finishStage())
I don't think this is quite right, though, because it doesn't match my
print statements. I also noticed that the Purpose isn't 400 but 401.
What's up with this? Is 401 a code for something else?
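
One possible reading of the 401, offered as a guess rather than a
confirmed answer: if the purpose is a set of bit flags that is being
printed in hex, then 401 could simply be two flags OR'ed together. The
sketch below is a standalone toy, not Solr code. The constant names and
values are assumptions made for the example; check ShardRequest in your
own checkout for the real ones.

```java
// Toy illustration of bit-flag "purpose" values.
// The names and values below are assumptions for this example only.
final class PurposeFlags {
    static final int PURPOSE_PRIVATE = 0x01;  // assumed value
    static final int PURPOSE_GET_MLT = 0x400; // hypothetical name and value

    /** Returns true when every bit of flag is set in purpose. */
    static boolean has(int purpose, int flag) {
        return (purpose & flag) == flag;
    }

    public static void main(String[] args) {
        // Combine two flags into a single purpose value.
        int purpose = PURPOSE_GET_MLT | PURPOSE_PRIVATE;
        // Print the combined value in hex, then test one flag.
        System.out.println(Integer.toHexString(purpose));
        System.out.println(has(purpose, PURPOSE_GET_MLT));
    }
}
```

If that is what is happening, testing the purpose with a bit mask, as
has() does here, would be more robust than comparing it to 400 with ==.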
(As an aside, is it unsafe to assume that the log lines will appear in
actual chronological order?)
Any advice or pointers at this point would be greatly appreciated. I
think I'm going in circles.
-mike
On Aug 21, 2009, at 12:54 PM, Jason Rutherglen wrote:
Mike,
I'm also finding the Solr distributed process to be confusing. Let's
try to add things to the wiki as we learn them?
-J
On Fri, Aug 21, 2009 at 9:52 AM, Mike Anderson<mik...@mit.edu> wrote:
I'm trying to make my way through learning how to modify and write
distributed search components.
A few questions
1. in SearchHandler, when the query is broken down and sent to each
shard, will this request make its way to the process() method of the
component (because it will look like a non-distributed request to the
SearchHandler of the shard)?
2. the comment above the response handling loop (in SearchHandler)
says that if any requests are added while in the loop, the loop will
break and make the request immediately. I see that the loop will exit
if there is an exception or if there are no more responses, but I
don't see how the new requests will be sent unless it goes through the
entire loop again.
3. if one adds a request to rb in the handleResponses method, it
wouldn't necessarily be sent, namely in the event that none of the
components override the distributedProcess method and the loop only
goes through once.
4. where can I learn more about the shard.purpose variable? Where in
the component should this be set, if anywhere?
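
Questions 2 and 3 can be made concrete with a toy model. The sketch
below is not Solr's code, and every name in it is hypothetical. It
assumes the coordinator keeps a queue of outgoing requests, that the
response-handling loop only keeps running while that queue is empty,
and that an enclosing loop sends whatever is queued. Under those
assumptions, a request added by a response handler makes the inner
loop exit, and the enclosing loop sends it without a full extra pass.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of a coordinator loop; all names are hypothetical, not Solr's API.
final class StageLoopSketch {
    /** Runs the loop and returns a trace of what happened, in order. */
    static List<String> run(List<String> docIds) {
        List<String> trace = new ArrayList<>();
        Deque<String> outgoing = new ArrayDeque<>(); // requests waiting to be sent
        Deque<String> pending = new ArrayDeque<>();  // sent, response not yet handled
        outgoing.add("query");

        while (!outgoing.isEmpty()) {
            // Send everything that is currently queued.
            while (!outgoing.isEmpty()) {
                String req = outgoing.poll();
                trace.add("send " + req);
                pending.add(req);
            }
            // Handle responses, but leave as soon as a handler queues new work.
            while (outgoing.isEmpty()) {
                String done = pending.poll();
                if (done == null) break; // nothing left to wait for
                trace.add("handle " + done);
                if (done.equals("query")) {
                    // Stand-in for a response hook adding one follow-up per doc ID.
                    for (String id : docIds) outgoing.add("mlt:" + id);
                }
            }
        }
        trace.add("finish");
        return trace;
    }

    public static void main(String[] args) {
        for (String line : run(Arrays.asList("doc1", "doc2"))) {
            System.out.println(line);
        }
    }
}
```

In this model the follow-up requests go out even though nothing asks
for another stage, which is the case question 3 worries about. Whether
SearchHandler behaves the same way depends on how its loops are
actually nested, so treat this as a way to frame the question rather
than as an answer.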
I've taken a look at the wiki page, but if there is more documentation
elsewhere please point me towards it.
Thanks in advance,
Mike