I think that is exactly what Hadoop does! Start here: http://wiki.apache.org/nutch/NutchHadoopTutorial
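For what it's worth, here is a rough sketch of what a distributed crawl cycle looks like once Nutch is running on a Hadoop cluster (per the tutorial above). This is only an illustration: it assumes a working deployment, the two-fetcher split is just an example value, and the segment timestamp is a placeholder.

```shell
# Inject seed URLs into the crawldb (on a cluster these are HDFS paths)
bin/nutch inject crawldb urls

# Generate a fetch list partitioned into 2 fetch lists, so 2 map tasks
# (potentially on different machines, with their own connections) fetch in parallel
bin/nutch generate crawldb segments -numFetchers 2

# Fetch the generated segment; under Hadoop this runs as a MapReduce job,
# so the fetching is spread across the cluster's task nodes
# (the segment name below is a placeholder timestamp)
bin/nutch fetch segments/20120327120000

# Parse the fetched content and fold the results back into the crawldb
bin/nutch parse segments/20120327120000
bin/nutch updatedb crawldb segments/20120327120000
```

So the segment split and merge you describe is essentially what the generate/fetch/updatedb cycle does for you when it runs as MapReduce jobs; Hadoop is not just storage sharing.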
On Tue, Mar 27, 2012 at 6:19 AM, pepe3059 <pepe3...@gmail.com> wrote:
> Hello, I have some questions, sorry if I'm such a noob.
>
> Is there a way to divide the fetch process between two or
> more computers using distinct internet connections? Maybe
> divide the load from the crawldb into segments and then
> do a merge process with them afterwards? Is Hadoop only
> for storage sharing?
>
> I hope you can help me. I'm running a crawl, but it's too slow
> for one machine. Any suggestion or tip is welcome. Thank you.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/divide-fetch-process-tp3859625p3859625.html
> Sent from the Nutch - User mailing list archive at Nabble.com.