Hello Folks,

I have had a few jobs fail with OutOfMemory and 'GC overhead limit exceeded'
errors. To counter these, I tried setting `SET
mapred.child.java.opts="-Xmx3G -XX:+UseConcMarkSweepGC";` at the start of
the Hive script**.

Basically, any time I add this option to the script, the MR jobs that get
scheduled (for the first of several queries in the script) are 'killed'
right away.
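
To give a clearer picture, the script is structured roughly as below (table and
column names here are placeholders, not the real ones):

```
-- Placed at the very top of the script:
SET mapred.child.java.opts="-Xmx3G -XX:+UseConcMarkSweepGC";

-- First of several queries; the MR job launched for this one is the one
-- that gets killed right away:
INSERT OVERWRITE TABLE daily_counts
SELECT dt, COUNT(*)
FROM raw_events
GROUP BY dt;
```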

Any thoughts on how to rectify this? Are there any other params that need
to be tinkered with in conjunction with max heap space (e.g. `io.sort.mb`)?
Any help would be **most appreciated**.
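
For instance, is something along these lines what is needed? (The values below
are guesses on my part; my understanding is that `io.sort.mb` is allocated out
of the map task's heap and so has to stay well under `-Xmx`.)

```
-- Guesswork, not a tested combination: raise the heap and the map-side
-- sort buffer together, keeping io.sort.mb well below the -Xmx value.
SET mapred.child.java.opts="-Xmx3G -XX:+UseConcMarkSweepGC";
SET io.sort.mb=512;
```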

FWIW, I am using `hive-0.7.0` with `hadoop-0.20.2`. The default setting for
max heap size in our cluster is 1200M.

TIA.

** - Some other alternatives that were tried (with comical effect but no
discernible change in outcome):

- `SET mapred.child.java.opts="-Xmx3G";`

- `SET mapred.child.java.opts="-server -Xmx3072M";`

- `SET mapred.map.child.java.opts ="-server -Xmx3072M";`

  `SET mapred.reduce.child.java.opts ="-server -Xmx3072M";`

- `SET mapred.child.java.opts="-Xmx2G";`

Update: I am beginning to think that this doesn't even have anything to do
with the heap size setting. Tinkering with `mapred.child.java.opts` in any
way causes the same outcome. For example, setting it as `SET
mapred.child.java.opts="-XX:+UseConcMarkSweepGC";` has the same result of
the MR jobs getting killed right away. Even explicitly setting the heap size
in the script to the 'cluster default' causes this.
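
Concretely, even something like the following (assuming the 1200M cluster
default corresponds to a plain `-Xmx1200M`) kills the jobs just the same:

```
-- Same value the cluster already uses by default, just set explicitly
-- in the script; assuming the default amounts to -Xmx1200M.
SET mapred.child.java.opts="-Xmx1200M";
```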

Note: I have a Stack Overflow question open at http://goo.gl/j9II0V if you'd
prefer to answer it there. Thanks.
