Hi,

I'm trying to merge about 1.2 million pages, contained in 10 segments, on one machine with nutch-1.0 mergesegs, using:

bin/nutch mergesegs crawl/merge_seg -dir crawl/segments

but there is not enough space on a 500G disk to complete the merge (I get "java.io.IOException: No space left on device" in hadoop.log).

Shouldn't 500G be enough disk space for this merge? Is this a bug? If it is not a bug, how much disk space does this merge require?

Tomislav