Hello all, I am looking forward to creating a distributed crawler. I know MapReduce is very good for indexing, but what about fetching? What is the best way to distribute the downloads across a number of machines?
Does Nutch (+ Hadoop) have any facilities for distributed fetching? Please share some ideas or point me to some documentation. Thank you, Sergiu.

