[Nutch-dev] mapred question

Jay Pound Sat, 06 Aug 2005 10:42:08 -0700

How would I set up mapred for SMP machines? As I understand it, mapred splits
big jobs like indexing or updating the db into chunks to be processed by
separate machines. I have multiprocessor machines that I want to test this
with internally, and it makes sense to use the full potential of SMP
machines. This is a great idea and I'm very glad it's being implemented.
Currently on these machines I index 4 segments at the same time, but the
update db can only be done one segment at a time, so it would be great to
speed that process up. Will mapreduce work with updatedb and also
generatedb? Generate db is not bad to wait for now, but it will be when the
db contains billions of links!
-J
PS: I'm planning to run another benchmark for NDFS. I'll make a site and
post data/screenshots with the results.
