Hi Guys,

I tried to merge two crawls of about 200,000 fetched pages each and I got the
following error:

2007-08-15 09:47:43,472 WARN  mapred.TaskTracker - Error running child
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2786)
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
        at org.apache.nutch.protocol.Content.write(Content.java:163)
        at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:100)
        at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:365)
        at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:338)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1707)

I used the trunk version on Linux 2.6.22 and Java 1.6.

Does this mean anything to you?
Any help would be appreciated.
Thanks
E
