I found some more details about the exception in the Hadoop log.
Please help me figure out what to check for this error.

Here are the details:

2011-01-04 07:40:23,999 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,685 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,686 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:47,316 WARN  mapred.LocalJobRunner - job_local_0001
java.io.IOException: Spill failed
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1044)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.Text.writeString(Text.java:412)
        at org.apache.nutch.metadata.Metadata.write(Metadata.java:220)
        at org.apache.nutch.protocol.Content.write(Content.java:170)
        at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:135)
        at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
        at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:361)
        at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:113)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000032_0/output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
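For what it's worth, the root cause above (DiskChecker$DiskErrorException: "Could not find any valid local directory ... spill0.out") usually means the local directory Hadoop writes map spill files to is full, missing, or not writable. In local mode that directory comes from hadoop.tmp.dir (or mapred.local.dir, if set) in conf/*-site.xml, defaulting to /tmp/hadoop-${user.name}. A rough shell check, assuming the default location (SPILL_DIR here is my guess; substitute whatever your config actually points at):

```shell
# Sketch: check the local directory Hadoop spills map output to.
# SPILL_DIR is an assumption -- replace it with the value of
# hadoop.tmp.dir / mapred.local.dir from your conf/*-site.xml.
SPILL_DIR="${HADOOP_TMP_DIR:-/tmp/hadoop-$USER}"

# Create it if it does not exist yet.
mkdir -p "$SPILL_DIR"

# Is it writable by the user running the merge?
if [ -w "$SPILL_DIR" ]; then
    echo "$SPILL_DIR is writable"
else
    echo "$SPILL_DIR is NOT writable"
fi

# Is there free space on the filesystem holding it?
df -h "$SPILL_DIR"
```

If the disk really is full, either free space there or point hadoop.tmp.dir / mapred.local.dir at a larger disk and re-run the merge; merging fewer segments per run also reduces how much intermediate spill data a single job needs.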


-----Original Message-----
From: Marseld Dedgjonaj [mailto:[email protected]] 
Sent: Tuesday, January 04, 2011 1:28 PM
To: [email protected]
Subject: Exception on segment merging

Hello everybody,

I have configured Nutch 1.2 to crawl all URLs of a specific website.

It ran fine for a while, but now that the number of indexed URLs has grown past 30,000, I get an exception on segment merging.

Has anybody seen this kind of error?

The exception is shown below.

 

Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)

Merge Segments-  End at:   04-01-2011 07:40:48

 

Thanks in advance & Best Regards,

Marseldi




