Other users have previously reported similar problems which were due to a lack of space on disk, as suggested by this:

*Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000032_0/output/spill0.out*

Make sure that the temporary directory used by Hadoop is on a partition with enough space.

HTH

Julien

On 4 January 2011 18:19, <[email protected]> wrote:
> Which command did you use? Merging segments is very expensive in resources,
> so I try to avoid merging them.
>
> -----Original Message-----
> From: Marseld Dedgjonaj <[email protected]>
> To: user <[email protected]>
> Sent: Tue, Jan 4, 2011 7:12 am
> Subject: FW: Exception on segment merging
>
> > I see in the Hadoop log that some more details about the exception are there.
> >
> > Please help me understand what to check for this error.
> >
> > Here are the details:
> >
> > 2011-01-04 07:40:23,999 INFO segment.SegmentMerger - Slice size: 50000 URLs.
> > 2011-01-04 07:40:36,563 INFO segment.SegmentMerger - Slice size: 50000 URLs.
> > 2011-01-04 07:40:36,563 INFO segment.SegmentMerger - Slice size: 50000 URLs.
> > 2011-01-04 07:40:43,685 INFO segment.SegmentMerger - Slice size: 50000 URLs.
> > 2011-01-04 07:40:43,686 INFO segment.SegmentMerger - Slice size: 50000 URLs.
> > 2011-01-04 07:40:47,316 WARN mapred.LocalJobRunner - job_local_0001
> > java.io.IOException: Spill failed
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1044)
> >     at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >     at org.apache.hadoop.io.Text.writeString(Text.java:412)
> >     at org.apache.nutch.metadata.Metadata.write(Metadata.java:220)
> >     at org.apache.nutch.protocol.Content.write(Content.java:170)
> >     at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:135)
> >     at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
> >     at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
> >     at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:900)
> >     at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
> >     at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:361)
> >     at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:113)
> >     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> > Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000032_0/output/spill0.out
> >     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
> >     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> >     at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)
> >
> > -----Original Message-----
> > From: Marseld Dedgjonaj [mailto:[email protected]]
> > Sent: Tuesday, January 04, 2011 1:28 PM
> > To: [email protected]
> > Subject: Exception on segment merging
> >
> > Hello everybody,
> >
> > I have configured nutch-1.2 to crawl all URLs of a specific website.
> >
> > It ran fine for a while, but now that the number of indexed URLs has grown
> > to more than 30,000, I get an exception on segment merging.
> >
> > Has anybody seen this kind of error?
> >
> > The exception is shown below:
> >
> > Slice size: 50000 URLs.
> > Slice size: 50000 URLs.
> > Slice size: 50000 URLs.
> > Slice size: 50000 URLs.
> > Slice size: 50000 URLs.
> >
> > Exception in thread "main" java.io.IOException: Job failed!
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> >     at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:638)
> >     at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:683)
> >
> > Merge Segments - End at: 04-01-2011 07:40:48
> >
> > Thanks in advance & Best Regards,
> >
> > Marseldi
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
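Julien's advice about the temporary directory can be sketched as a quick check. This is a minimal illustration only: the paths and the property value below are assumptions (based on the Hadoop 0.20.x defaults that Nutch 1.2 ships with), not settings taken from the thread.

```shell
# In local mode, map-side spill files land under hadoop.tmp.dir
# (default: /tmp/hadoop-${USER}); the DiskErrorException above means
# no configured local directory had room for spill0.out.
# Check how full the partition holding it is:
df -k /tmp

# If the partition is nearly full, point hadoop.tmp.dir at a larger
# one in conf/nutch-site.xml (the value here is hypothetical):
#   <property>
#     <name>hadoop.tmp.dir</name>
#     <value>/data/hadoop-tmp</value>
#   </property>
```

Note that `mapred.local.dir` (where the spill directories actually live) defaults to a path under `hadoop.tmp.dir`, so moving the latter is usually enough unless the former has been set explicitly.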

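On the question "Which command did you use?": in Nutch 1.2, segment merging is driven by the `mergesegs` command, and the repeated "Slice size: 50000 URLs" lines in the log correspond to its `-slice` option. A hedged example invocation (the directory names are assumptions, not taken from the thread):

```shell
# Merge all segments under crawl/segments into crawl/MERGEDsegments,
# slicing the merged output into segments of at most 50000 URLs each.
# Directory names are hypothetical; -slice 50000 matches the log above.
bin/nutch mergesegs crawl/MERGEDsegments -dir crawl/segments -slice 50000
```

As Julien notes, the merge re-reads and re-writes every record in the input segments, so both the temporary partition and the output partition need enough headroom for the job to complete.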
