How would I set up mapred for SMP machines? As I understand it, it will split
big jobs like indexing or updating the db into a bunch of chunks to be
processed by separate machines. I have multiprocessor machines that I want to
test this with internally, and it makes sense to utilize the full potential of
SMP machines. This is a great idea and I'm very glad it's being implemented.

Currently on these machines I index 4 segments at the same time, but the
update db can only be done one segment at a time, so it would be great to
speed that process up. Will mapreduce work with the updatedb and also the
generatedb? Not that generate db is bad to wait for, but it will be when it
contains billions of links!
-J
PS: I'm planning on running another benchmark for NDFS. I'll make a site and
post data/screenshots with the results.