Which command did you use? Merging segments is very resource-intensive, so I try to avoid it.
-----Original Message-----
From: Marseld Dedgjonaj <[email protected]>
To: user <[email protected]>
Sent: Tue, Jan 4, 2011 7:12 am
Subject: FW: Exception on segment merging

I looked in the Hadoop log and found some more details about the exception. Please help me work out what to check for this error.

Here are the details:

2011-01-04 07:40:23,999 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,685 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,686 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:47,316 WARN mapred.LocalJobRunner - job_local_0001
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1044)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.io.Text.writeString(Text.java:412)
    at org.apache.nutch.metadata.Metadata.write(Metadata.java:220)
    at org.apache.nutch.protocol.Content.write(Content.java:170)
    at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:135)
    at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
    at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:361)
    at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:113)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000032_0/output/spill0.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)

-----Original Message-----
From: Marseld Dedgjonaj [mailto:[email protected]]
Sent: Tuesday, January 04, 2011 1:28 PM
To: [email protected]
Subject: Exception on segment merging

Hello everybody,

I have configured nutch-1.2 to crawl all URLs of a specific website. It ran fine for a while, but now that the number of indexed URLs has grown beyond 30,000, I get an exception on segment merging. Has anybody seen this kind of error? The exception is shown below.

Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
    at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)
Merge Segments- End at: 04-01-2011 07:40:48

Thanks in advance & Best Regards,
Marseldi

