I see some more details about the exception in the Hadoop log.
Please help me figure out what to check for this error.
Here are the details:
2011-01-04 07:40:23,999 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,685 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,686 INFO segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:47,316 WARN mapred.LocalJobRunner - job_local_0001
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1044)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.io.Text.writeString(Text.java:412)
    at org.apache.nutch.metadata.Metadata.write(Metadata.java:220)
    at org.apache.nutch.protocol.Content.write(Content.java:170)
    at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:135)
    at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
    at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:361)
    at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:113)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000032_0/output/spill0.out
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
-----Original Message-----
From: Marseld Dedgjonaj [mailto:[email protected]]
Sent: Tuesday, January 04, 2011 1:28 PM
To: [email protected]
Subject: Exception on segment merging
Hello everybody,
I have configured nutch-1.2 to crawl all URLs of a specific website.
It ran fine for a while, but now that the number of indexed URLs has grown past 30,000, I get an exception during segment merging.
Has anybody seen this kind of error?
The exception is shown below.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
    at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)
Merge Segments- End at: 04-01-2011 07:40:48
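For reference, the merge is run through SegmentMerger with a 50000-URL slice size (matching the "Slice size: 50000 URLs." lines above). A sketch of the kind of command line involved, with illustrative paths rather than my actual layout:

# Merge the per-fetch segments into one, sliced into 50000-URL chunks
# (the -slice value matches the "Slice size: 50000 URLs." log lines).
bin/nutch mergesegs crawl/MERGEDsegments -dir crawl/segments -slice 50000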
Thanks in advance & Best Regards,
Marseldi
<p class="MsoNormal"><span style="color: rgb(31, 73, 125);">Gjeni <b>Punë
të Mirë</b> dhe <b>të Mirë për Punë</b>...
Vizitoni: <a target="_blank"
href="http://www.punaime.al/">www.punaime.al</a></span></p>
<p><a target="_blank" href="http://www.punaime.al/"><span
style="text-decoration: none;"><img width="165" height="31" border="0"
alt="punaime" src="http://www.ikub.al/images/punaime.al_small.png"
/></span></a></p>