Which command did you use? Merging segments is very resource-intensive, so
I try to avoid it.
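
The trace quoted below ends in "DiskChecker$DiskErrorException: Could not find any valid local directory for ... spill0.out". That error is commonly reported when the local directory Hadoop uses for map-side spill files is full or not writable, so free disk space is probably the first thing to check. Here is a minimal sketch of such a check, assuming Python 3 is available. The function name and the 1 GB threshold are illustrative only, and the system temp directory is just a stand-in for whatever hadoop.tmp.dir / mapred.local.dir point to in your setup:

```python
import os
import shutil
import tempfile


def writable_with_space(path, min_free_bytes):
    """Return True if `path` is an existing, writable directory
    with at least `min_free_bytes` of free space on its filesystem."""
    if not os.path.isdir(path) or not os.access(path, os.W_OK):
        return False
    return shutil.disk_usage(path).free >= min_free_bytes


# Assumption: replace this stand-in with your actual Hadoop local directory.
local_dir = tempfile.gettempdir()
# Assumption: 1 GB is an arbitrary threshold, not a Hadoop requirement.
threshold = 1024 ** 3

if writable_with_space(local_dir, threshold):
    print(local_dir, "looks usable for spill files")
else:
    print(local_dir, "is missing, read-only, or low on space")
```

If the directory turns out to be nearly full, freeing space or pointing Hadoop at a larger disk would be the next thing to try before re-running the merge.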
 


 

 

-----Original Message-----
From: Marseld Dedgjonaj <[email protected]>
To: user <[email protected]>
Sent: Tue, Jan 4, 2011 7:12 am
Subject: FW: Exception on segment merging


I looked in the Hadoop log, and it has some more details about the exception.

Please help me work out what to check for this error.



Here are the details:



2011-01-04 07:40:23,999 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:36,563 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,685 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:43,686 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2011-01-04 07:40:47,316 WARN  mapred.LocalJobRunner - job_local_0001
java.io.IOException: Spill failed
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1044)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.Text.writeString(Text.java:412)
        at org.apache.nutch.metadata.Metadata.write(Metadata.java:220)
        at org.apache.nutch.protocol.Content.write(Content.java:170)
        at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:135)
        at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
        at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:361)
        at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:113)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000032_0/output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)





-----Original Message-----

From: Marseld Dedgjonaj [mailto:[email protected]] 

Sent: Tuesday, January 04, 2011 1:28 PM

To: [email protected]

Subject: Exception on segment merging



Hello everybody,



I have configured nutch-1.2 to crawl all URLs of a specific website.

It ran fine for a while, but now that the number of indexed URLs has grown
past 30,000, I get an exception on segment merging.

Has anybody seen this kind of error?



 



The exception is shown below.



 



Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Slice size: 50000 URLs.
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)
Merge Segments-  End at:   04-01-2011 07:40:48



 



Thanks in advance & Best Regards,



Marseldi



















 
