mapSearcher (was Re: Index update and Google Dance)

2005-11-11 Thread Stefan Groschupf
Hi Doug, In the future I would like to implement a more automated distributed search system than Nutch currently has. One way to do this might be to use MapReduce. Each map task's input could be an index and some segment data. The map method would serve queries, i.e., run a Nutch

Re: Index update and Google Dance

2005-11-09 Thread Andrzej Bialecki
Jack Tang wrote: Hi Andrzej In the document, Michael said: I'd strongly recommend using the system with a replication rate of 3 copies, 2 minimum. Desired replication can be set in the Nutch config file using the ndfs.replication property, and the MIN_REPLICATION constant is located in ndfs/FSNamesystem.java
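
The two thresholds quoted above (3 desired copies, 2 minimum) can be pictured with a small sketch. This is not NDFS code: the function name, the status labels, and the way the two thresholds are compared are assumptions made for illustration only; the snippet above does not say how FSNamesystem.java actually uses MIN_REPLICATION.

```python
# Illustrative values taken from the recommendation quoted above.
DESIRED_REPLICATION = 3  # what one might set through the ndfs.replication property
MIN_REPLICATION = 2      # stand-in for the constant said to live in ndfs/FSNamesystem.java


def replication_status(live_copies):
    """Classify a chunk by how many live copies it has (hypothetical helper)."""
    if live_copies < MIN_REPLICATION:
        return "below-minimum"      # fewer copies than the stated minimum
    if live_copies < DESIRED_REPLICATION:
        return "under-replicated"   # usable, but short of the desired count
    return "ok"


for copies in (1, 2, 3):
    print(copies, replication_status(copies))
```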

Re: Index update and Google Dance

2005-11-09 Thread Stefan Groschupf
and three copies of chunks are distributed on the slaves. If slave 1 is 90% busy, and 2 is 80% busy, 3 is idle. How does NFS do in this case? Actually you have to do that manually, but there will be an automatic solution later. Or could you tell me where should I start learning? The

Re: Index update and Google Dance

2005-11-09 Thread Jack Tang
Thanks for your explanation, Andrzej. I am going to read some NFS source code and ask smarter questions later. Thanks again. Regards /Jack On 11/9/05, Andrzej Bialecki [EMAIL PROTECTED] wrote: Jack Tang wrote: Hi Andrzej In the document, Michael said: I'd strongly recommend using the system

Re: Index update and Google Dance

2005-11-09 Thread Doug Cutting
Jack Tang wrote: Below is the Google architecture as I picture it: DataNode A Master DataNode B GoogleCrawler DataNode C .. GoogleCrawler is kept running all the time. One day, it gets a fetchlist from DataNode A, crawls all pages and

Re: Index update and Google Dance

2005-11-09 Thread Jack Tang
Hi Doug On 11/10/05, Doug Cutting [EMAIL PROTECTED] wrote: Jack Tang wrote: Below is the Google architecture as I picture it: DataNode A Master DataNode B GoogleCrawler DataNode C .. GoogleCrawler is kept running all

Re: Index update and Google Dance

2005-11-08 Thread Stefan Groschupf
Google Dance - The Index Update of the Google Search Engine: http://dance.efactory.de/

Re: Index update and Google Dance

2005-11-08 Thread Andrzej Bialecki
Jack Tang wrote: Hi Stefan Deleting is totally OK if there are NO references to the chunks (segments). Also, will the master balance the search requests? Say, there are 3 slaves: Slave 1, 2, 3 and three copies of chunks are distributed on the slaves. If slave 1 is 90% busy, and 2 is 80% busy, 3 is
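
The three-slave scenario in the question above can be made concrete with a minimal sketch of "send the request to the least busy replica holder". This is not how Nutch or NDFS behaves (elsewhere in the thread the reply is that this currently has to be done manually); the function name, the slave names, and the idea of reporting load as a fraction are all assumptions for illustration.

```python
def pick_least_busy(loads):
    """Return the name of the slave with the lowest reported load.

    `loads` maps a slave name to its load as a fraction between 0 and 1.
    Hypothetical helper; a real master would also need fresh load reports.
    """
    if not loads:
        raise ValueError("no slaves hold a copy of this chunk")
    # min() iterates over the dict keys and compares them by their load value.
    return min(loads, key=loads.get)


# The scenario from the question: slave 1 is 90% busy, slave 2 is 80% busy,
# slave 3 is idle, and each of them holds a copy of the chunk.
loads = {"slave1": 0.90, "slave2": 0.80, "slave3": 0.0}
print(pick_least_busy(loads))
```

A policy like this only helps if every candidate really holds a copy of the requested chunk, which is why the sketch takes the set of replica holders as input rather than the whole cluster.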