Hi,
During the CrawlDb map-reduce job, the reduce workers fail one by one with:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.ConcurrentHashMap$HashEntry.newArray(ConcurrentHashMap.java:205)
    at ...
Hi,
The reducing step of the updatedb requires quite a lot of memory indeed. See
https://issues.apache.org/jira/browse/NUTCH-702 for a discussion on this
subject.
BTW, you'll have to specify the parameter mapred.child.java.opts in your
conf/hadoop-site.xml so that the value is propagated to the Hadoop task nodes.
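
For reference, a minimal conf/hadoop-site.xml entry along those lines could look like the sketch below; the -Xmx value is only an illustration, tune it to what your task nodes can actually afford:

  <!-- conf/hadoop-site.xml: heap given to each map/reduce child JVM -->
  <property>
    <name>mapred.child.java.opts</name>
    <!-- example value only; raise or lower to fit your nodes' RAM -->
    <value>-Xmx2048m</value>
  </property>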
Julien,
I did try with 2048M per task child,
no luck, I still have two reducers that don't go through.
Could it be related to the number of reduce tasks?
On this cluster I have 4 servers:
- dual Xeon dual core (8 cores)
- 8 GB RAM
- 4 disks
I did set mapred.reduce.tasks and mapred.map.tasks to 16.
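
For what it's worth, this is how those settings look in my conf/hadoop-site.xml; treat the values as an example of what I tried rather than a recommendation:

  <!-- conf/hadoop-site.xml: number of map and reduce tasks for the job -->
  <property>
    <name>mapred.map.tasks</name>
    <value>16</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>16</value>
  </property>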
Fixed, thanks.
On Sun, Aug 16, 2009 at 8:38 PM, Andrzej Bialecki <a...@getopt.org> wrote: