hi there,

There was a little improvement; at least it's not running out of RAM
anymore. But you're right, there does seem to be a side effect.
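
For reference, here is roughly how I ended up applying your suggestion in
conf/hadoop-site.xml (the property name and the flags are from your mail;
the heap size is just what I already had - treat this as a sketch of my
setup, not a recommendation):

  <property>
    <name>mapred.child.java.opts</name>
    <!-- -XX:-UseGCOverheadLimit turns off the "GC overhead limit exceeded" check -->
    <value>-Xmx512m -XX:-UseGCOverheadLimit</value>
  </property>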

I am now having what seem to be disk issues! I am running on a VPS, so I
suspect that might have something to do with it.

But what is the cause now? My current guess is below, after the trace.


==>>

00:36:28,912 INFO [TaskRunner] Task 'attempt_local_0001_m_000064_0' done.
00:36:29,104 INFO  [MapTask] numReduceTasks: 1
00:36:29,104 INFO  [MapTask] io.sort.mb = 100
00:36:29,240 INFO  [MapTask] data buffer = 79691776/99614720
00:36:29,240 INFO  [MapTask] record buffer = 262144/327680
00:36:29,260 INFO  [CodecPool] Got brand-new decompressor
00:36:29,264 INFO  [MapTask] Starting flush of map output
00:36:29,276 INFO  [MapTask] Finished spill 0
00:36:29,280 INFO  [TaskRunner] Task:attempt_local_0001_m_000065_0 is done. And is in the process of commiting
00:36:29,280 INFO  [LocalJobRunner] 
file:/home/meda/workspace/web/crawl/segments/20091101171338/parse_text/part-00000/data:0+12655
00:36:29,280 INFO  [TaskRunner] Task 'attempt_local_0001_m_000065_0' done.
00:36:38,533 WARN  [LocalJobRunner] job_local_0001
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output/file.out in any of the configured local directories
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:381)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
        at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:50)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:150)
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:620)
        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:665)
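
From the trace, the file it cannot find lives under Hadoop's map-reduce
local working directory, so my current guess is mapred.local.dir (which
defaults to ${hadoop.tmp.dir}/mapred/local, i.e. somewhere under /tmp
unless overridden) - and /tmp on a VPS can be small or get cleaned up
behind your back. As an experiment I am going to point hadoop.tmp.dir at a
partition with plenty of free space in conf/hadoop-site.xml; the path
below is just a placeholder for my machine:

  <property>
    <name>hadoop.tmp.dir</name>
    <!-- placeholder; any local directory with enough free space -->
    <value>/home/meda/hadoop-tmp</value>
  </property>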



On Tue, 2009-11-03 at 10:28 -0500, Kalaimathan Mahenthiran wrote:
> if you set mapred.child.java.opts with the additional value
> "-XX:-UseGCOverheadLimit", you can bypass this exception. I don't know
> if it has any side effects as a result of this..
> e.g.
> -Xmx512m -XX:-UseGCOverheadLimit
> 
> 
> On Tue, Nov 3, 2009 at 7:50 AM, Fadzi Ushewokunze
> <fa...@butterflycluster.net> wrote:
> > hi,
> >
> > I am running on a single machine with 2G of RAM and the Java heap
> > space set at 1024m. The segments are quite tiny - less than 100 URLs -
> > and during mergeSegments I get the exception below.
> >
> > I have set mapred.child.java.opts=-Xmx512m but there is no change.
> >
> > Any suggestions?
> >
> >
> > ====>
> >
> > 2009-11-03 17:58:28,971 INFO  [org.apache.hadoop.mapred.LocalJobRunner]
> > reduce > reduce
> > 2009-11-03 17:58:38,448 INFO  [org.apache.hadoop.mapred.LocalJobRunner]
> > reduce > reduce
> > 2009-11-03 17:58:57,085 INFO  [org.apache.hadoop.mapred.LocalJobRunner]
> > reduce > reduce
> > 2009-11-03 17:59:34,723 INFO  [org.apache.hadoop.mapred.LocalJobRunner]
> > reduce > reduce
> > 2009-11-03 18:02:09,660 INFO  [org.apache.hadoop.mapred.TaskRunner]
> > Communication exception: java.lang.OutOfMemoryError: Java heap space
> >        at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:327)
> >        at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:494)
> >        at org.apache.hadoop.mapred.Counters.sum(Counters.java:506)
> >        at org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:222)
> >        at org.apache.hadoop.mapred.Task$1.run(Task.java:418)
> >        at java.lang.Thread.run(Thread.java:619)
> >
> > 2009-11-03 18:02:10,376 WARN  [org.apache.hadoop.mapred.LocalJobRunner]
> > job_local_0001
> > java.lang.ThreadDeath
> >        at java.lang.Thread.stop(Thread.java:715)
> >        at org.apache.hadoop.mapred.LocalJobRunner.killJob(LocalJobRunner.java:310)
> >        at org.apache.hadoop.mapred.JobClient$NetworkedJob.killJob(JobClient.java:315)
> >        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1239)
> >        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:620)
> >        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:665)
> >
> >
