Does anyone have any good configuration ideas for indexing/merging with
Nutch 0.9 using Hadoop on a local filesystem?  Our segment merging is
taking an extremely long time compared with Nutch 0.7.  Currently, I am
trying to merge 300 segments, which amounts to about 1 GB of data.  It
has taken hours to merge, and it's still not done.  This box has dual
Xeon 2.8 GHz processors with 4 GB of RAM.

So, I figure there must be a better setup in mapred-default.xml for a
single machine.  Should I increase the size of the I/O buffers, sort
buffers, etc.?  Should I reduce the number of tasks or increase them?
I'm at a loss.

Any advice would be greatly appreciated.


--
"Conscious decisions by conscious minds are what make reality real"
