mapSearcher was Re: Index update and Google Dance

Stefan Groschupf Fri, 11 Nov 2005 09:32:07 -0800

Hi Doug,

In the future I would like to implement a more automated distributed search system than Nutch currently has. One way to do this might be to use MapReduce. Each map task's input could be an index and some segment data. The map method would serve queries, i.e., run a Nutch DistributedSearch.Server. It would first copy the index out of NDFS to the local disk, for better performance.


I have 2 questions regarding this mechanism.

First, how do you plan to make the running search servers known to the master (search client)? I can imagine a mechanism similar to the one the tasktracker and jobtracker use, a kind of heartbeat message.

Second, wouldn't there also be a possibility to solve NUTCH-92 (DistributedSearch incorrectly scores results) by first running a MapReduce task over the indexes that counts terms, and then holding the result somehow in the memory of the master (search server client)? But I'm not sure whether that may be too much data.
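
To make the second idea concrete, here is a minimal sketch in plain Java. It does not use the Nutch or MapReduce APIs; the class GlobalDocFreq and its methods are hypothetical names made up for illustration, and each index is modelled simply as a list of tokenized documents. The "map" step counts document frequencies per index, the "reduce" step sums them into one table the search client could hold in memory, and the idf method shows one common smoothed IDF form computed from the global counts (an assumption, not necessarily the exact formula Nutch uses).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of Nutch.
public class GlobalDocFreq {

    /** "Map" side: for one index, count how many documents contain each term. */
    public static Map<String, Integer> countShard(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // A term counts once per document, however often it occurs in it.
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    /** "Reduce" side: sum the per-index counts into one global table. */
    public static Map<String, Integer> merge(List<Map<String, Integer>> perShard) {
        Map<String, Integer> global = new HashMap<>();
        for (Map<String, Integer> shard : perShard) {
            shard.forEach((term, n) -> global.merge(term, n, Integer::sum));
        }
        return global;
    }

    /** Smoothed IDF from the global counts: ln(totalDocs / (docFreq + 1)) + 1. */
    public static double idf(Map<String, Integer> global, int totalDocs, String term) {
        int docFreq = global.getOrDefault(term, 0);
        return Math.log((double) totalDocs / (docFreq + 1)) + 1.0;
    }
}
```

With such a table every search server would score against the same document frequencies instead of its own local ones. The table holds one integer per distinct term, so whether it fits in the client's memory depends on the size of the vocabulary across all indexes.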


Stefan
