Hello everybody,

We are using the CapacityScheduler with Hadoop 0.20.2 (CDH3u3) on a cluster
of 20 nodes, each with:

ONE-NODE = 16 cores, 24 GB memory (AMD 4274), HotSpot JVM 1.7.0_05

We run scientific jobs as Map/Reduce tasks (using Cascading 1.2). We use the
CapacityScheduler to prevent memory-hungry jobs from bringing down nodes
through excessive swap usage...

So we set up memory slots in the JobConf (one slot = 1.5 GB...), assigning
slots per job according to the program's needs (job1 = 1 slot, job2 = 3 slots,
etc.).
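
For reference, this is roughly what ends up in the configuration on our side
(property names from the 0.20.x memory-based scheduling; the values below are
illustrative, with 1 slot = 1536 MB):

  <!-- cluster side (mapred-site.xml): the slot size -->
  <property>
    <name>mapred.cluster.map.memory.mb</name>
    <value>1536</value>
  </property>

  <!-- job side: a 3-slot job declares 3 x 1536 MB per map task -->
  <property>
    <name>mapred.job.map.memory.mb</name>
    <value>4608</value>
  </property>

(and similarly with mapred.cluster.reduce.memory.mb /
mapred.job.reduce.memory.mb for the reduce side).
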
In JobConf.xml we set JVM options such as -Xmx in java.opts to control (or at
least try to control...) the memory effectively used by the whole JVM.
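
A minimal sketch of that part (in 0.20.2 the single property
mapred.child.java.opts applies to both the map and reduce JVMs; the heap value
below is illustrative):

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
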
By default, on the kind of machine we use (see ONE-NODE above), the JVM
creates 13 GarbageCollector threads and sizes them at about 80 MB of memory
per thread, thus reserving 13 x 80 MB = 1040 MB per JVM. We know there are
options to control both the GC thread count and the per-thread GC memory, and
we used them...
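
Concretely, our child JVM options end up looking like this (the values shown
are what we experiment with, not a recommendation):

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m -XX:ParallelGCThreads=4 -XX:HeapSizePerGCThread=64m</value>
  </property>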

Despite using the -XX:ParallelGCThreads=X and -XX:HeapSizePerGCThread=YY JVM
options to try to control the full JVM process size, some jobs still fail
because their map and/or reduce tasks are killed by the CapacityScheduler with
messages like:

Task Tree [pid=xx, tipId=xx] is running beyond memory limit. Current
Usage=xxx. Limit=xxxx.
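
Our working assumption (please correct us if this is wrong) is that the task
memory monitoring compares the whole process tree against slots x slot size,
while -Xmx only bounds the Java heap. A back-of-the-envelope example for a
1-slot task with the defaults described above (heap value illustrative):

  limit = 1 slot x 1536 MB = 1536 MB
  usage ~= heap (-Xmx1024m) + GC reservation (13 x 80 MB = 1040 MB)
           + permgen + thread stacks + other native memory
        >= 1024 + 1040 = 2064 MB > 1536 MB

which would explain why a task gets killed even when its heap stays well under
the slot size.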

Has anyone already dealt with such a problem using the CapacityScheduler, and
does anyone know which JVM options must be used to control the JVM process
size? (We tried -XX:MaxPermSize and -Xss with no tangible results...)

Thank you in advance for your responses

Regards

Alain

