Did you try playing with mapred.child.ulimit along with mapred.child.java.opts?
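
For example, a rough sketch in mapred-site.xml (mapred.child.ulimit is specified
in KB and needs to be larger than the -Xmx heap, so the numbers below are only
illustrative):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx3g</value>
    </property>
    <property>
      <name>mapred.child.ulimit</name>
      <value>4194304</value>   <!-- 4GB virtual memory limit, in KB -->
    </property>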

Sent from my iPhone

On Jun 18, 2011, at 9:55 AM, Ken Williams <[email protected]> wrote:

> 
> Hi All,
> 
> I'm having a problem running a job on Hadoop. Using Mahout, I've been able to 
> run several Bayesian classifiers and train and test them successfully on 
> increasingly large datasets. Now I'm working on a dataset of 100,000 
> documents (size 100MB). I've trained the classifier on 80,000 docs and am 
> using the remaining 20,000 as the test set. I've been able to train the 
> classifier, but when I try to run 'testclassifier' all the map tasks fail 
> with a 'Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded' 
> exception before the job itself is 'Killed'. I have a small cluster of 3 
> machines with plenty of memory and CPU power (3 x 16GB, 2.5GHz quad-core 
> machines).
> I've tried setting the 'mapred.child.java.opts' flags up to 3GB in size 
> (-Xms3G -Xmx3G) but still get the same error. I've also tried setting 
> HADOOP_HEAPSIZE to values like 2000, 2500 and 3000, but this made no 
> difference. When the program is running I can use 'top' to see that although 
> the CPUs are busy, memory usage rarely goes above 12GB and absolutely no 
> swapping is taking place (see program console output: 
> http://pastebin.com/0m2Uduxa and job config file: http://pastebin.com/4GEFSnUM).
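> 
> For reference, this is roughly what I have in mapred-site.xml (HADOOP_HEAPSIZE 
> is set separately in conf/hadoop-env.sh):
> 
>     <property>
>       <name>mapred.child.java.opts</name>
>       <value>-Xms3G -Xmx3G</value>
>     </property>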
> I found a description of a similar 'GC overhead limit exceeded' problem, where 
> the program was spending so much time garbage-collecting (more than 90% of its 
> time!) that it was unable to make progress and so threw the 'GC overhead limit 
> exceeded' exception. If I set -XX:-UseGCOverheadLimit in the 
> 'mapred.child.java.opts' property to avoid this exception, I see the same 
> behaviour as before, only with a slightly different exception:
> 
>     Caused by: java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:39)
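> 
> (With that flag added, the property looks roughly like this:)
> 
>     <property>
>       <name>mapred.child.java.opts</name>
>       <value>-Xms3G -Xmx3G -XX:-UseGCOverheadLimit</value>
>     </property>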
> 
> So I'm guessing that maybe my program is spending too much time 
> garbage-collecting to make any progress? But how do I fix this? There's no 
> further information in the log files other than the exceptions being thrown. 
> I tried reducing the 'dfs.block.size' parameter to cut down the amount of data 
> going into each 'map' process (and therefore reduce its memory requirements), 
> but this made no difference. I also tried various settings for JVM reuse 
> ('mapred.job.reuse.jvm.num.tasks'), using values for no re-use (0), limited 
> re-use (10), and unlimited re-use (-1), but again no difference. I think the 
> problem is in the job configuration parameters, but how do I find it? I'm 
> using Hadoop 0.20.2 and the latest Mahout snapshot version. All machines are 
> running 64-bit Ubuntu and Java 6.
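> 
> For reference, a rough sketch of how I'm setting those two properties (the 
> block size shown is just an example of something smaller than the 64MB 
> default, not necessarily the value I should be using):
> 
>     <property>
>       <name>dfs.block.size</name>
>       <value>16777216</value>   <!-- 16MB, down from the 64MB default -->
>     </property>
>     <property>
>       <name>mapred.job.reuse.jvm.num.tasks</name>
>       <value>-1</value>          <!-- also tried 0 and 10 -->
>     </property>
> 
> Any help would be very much appreciated,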
> 
>           Ken Williams
> 
