>
>
> Hi All,
>
> I am reading input recursively from a directory of files, each about 30 MB
> in size, using WholeFileInputFormat and WholeFileRecordReader, and I am
> running into a Java heap size error even for a file as small as 30 MB. By
> default, mapred.child.java.opts is set to -Xmx200m, which should be enough
> to process the 30 MB files in the directory.
>
> The input is ordinary text: random words in a file. Each map is given a
> single 30 MB file, and I read the content of the whole file as the value
> and then run a standard word count.
>
> If I increase the heap size in mapred.child.java.opts, the application
> runs successfully. But it would be great if anyone could explain why the
> default of 200 MB per task is not sufficient for a 30 MB file. Does this
> mean that Hadoop MapReduce itself consumes so much heap that, out of
> 200 MB, not even 30 MB is left to process the task? Also, is there any
> other way to read a large whole file as the input to a single map, so that
> every map gets a whole file to process?
>
> -Shubh
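
One possible way to reason about the heap question is a back-of-envelope
budget. The sketch below is only an estimate under assumptions that are not
stated in the message above: that the record reader buffers the whole file in
a byte[], that the value object holds a second copy, that the mapper calls
toString() on the value (two bytes per character for ASCII text on JVMs
without compact strings), and that the map-side sort buffer (io.sort.mb) is
left at 100 MB and is allocated inside the task heap. The class and method
names are hypothetical, and the actual job may behave differently.

```java
// Rough heap budget for one map task that loads a whole file as its value.
// All figures are assumptions for illustration, not measurements.
class HeapEstimate {

    /** Estimated peak heap in MB for a file of fileMb and a sort buffer of sortBufferMb. */
    static long estimatePeakHeapMb(long fileMb, long sortBufferMb) {
        long rawBytes = fileMb;          // byte[] filled by the record reader (assumed)
        long writableCopy = fileMb;      // copy held by the value object (assumed)
        long decodedString = 2 * fileMb; // String from toString(): 2 bytes per ASCII char (assumed)
        return sortBufferMb + rawBytes + writableCopy + decodedString;
    }

    public static void main(String[] args) {
        long peak = estimatePeakHeapMb(30, 100);
        System.out.println("Estimated peak heap: " + peak + " MB (limit 200 MB)");
    }
}
```

Under these assumptions the estimate comes to 220 MB, which already exceeds
-Xmx200m before counting the tokens produced by the word count or the
framework's own objects. If the assumptions hold, lowering io.sort.mb or
tokenizing the bytes without first building one large String would be two
things to try.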