Fwd: Regarding WholeInputFileFormat Java Heap Size error

Shubh hadoopExp Wed, 11 May 2016 22:16:16 -0700

> Hi All,
> 
> While recursively reading input from a directory of 30 MB files, using 
> WholeFileInputFormat and WholeFileRecordReader, I am running into a Java 
> heap size error even for a very small file of 30 MB. By default, 
> mapred.child.java.opts is set to -Xmx200m, which should be enough to 
> process at least the 30 MB files present in the directory. 
> 
> The input is ordinary text files of random words. Each map task is given a 
> single 30 MB file, and I read the content of the whole file as the value, 
> then run a normal word count.
> 
> If I increase the heap size in mapred.child.java.opts, the application runs 
> successfully. But it would be great if anyone could explain why 
> mapred.child.java.opts, which currently defaults to 200 MB per task, is not 
> sufficient for a 30 MB file. Does this mean Hadoop MapReduce itself consumes 
> so much of the heap that, out of 200 MB, not even 30 MB is left to process 
> the task? Also, is there any other way to read a large whole file as the 
> input to a single map, so that every map gets a whole file to process?
> 
> -Shubh
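
On the last question above, one possible direction is to stream each file
instead of holding its full content in memory as a single value. The sketch
below is plain Java, not Hadoop code: the class name StreamingWordCount and
the method countWords are hypothetical, and it assumes the map task can obtain
a stream over its file (for example, if the record reader handed over the file
path rather than the file bytes, which is an assumption and not something the
post describes). It only shows that a word count needs one line in memory at a
time, plus the table of counts.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical helper, not part of Hadoop: counts words from a character
// stream without loading the whole input into one String or byte array.
class StreamingWordCount {

    // Reads the source line by line and tallies whitespace-separated tokens.
    static Map<String, Long> countWords(Reader source) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    // Add 1 to the running count for this word.
                    counts.merge(tokens.nextToken(), 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Small in-memory stand-in for a file; a real task would wrap the
        // file's input stream in an InputStreamReader instead.
        Map<String, Long> counts = countWords(new StringReader("a b a\nc  a"));
        System.out.println(counts);
    }
}
```

Each map would still process exactly one whole file this way; only the
buffering changes, so peak memory depends on the longest line and the number
of distinct words rather than on the file size.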
