Hi Alex,

Can you try increasing the heap size of the ApplicationMaster? The OutOfMemoryError is thrown inside the MRAppMaster itself, before any map or reduce task starts, so the mapreduce.map.* / mapreduce.reduce.* settings you tried don't apply to that container. The AM has its own pair of settings:

yarn.app.mapreduce.am.resource.mb=3584
yarn.app.mapreduce.am.command-opts=-Xmx3096m
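
The two properties above can also be set from inside the Pig script itself, the same way you set the task-level properties. This is an untested sketch; Pig's SET statement forwards arbitrary Hadoop properties into the job configuration, which should include the AM settings since they are read at job-submission time:

```pig
-- AM container size and AM JVM heap (values from above; adjust as needed)
SET yarn.app.mapreduce.am.resource.mb 3584;
SET yarn.app.mapreduce.am.command-opts '-Xmx3096m';
```

Keep the -Xmx value comfortably below the container size so there is headroom for non-heap JVM memory.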

Koji



On Fri, Apr 27, 2018 at 1:49 PM, Alex Soto <alex.s...@envieta.com> wrote:

> Hello,
>
> I am using Pig version 0.17.0.  When I attempt to run my pig script from
> the command line on a Yarn cluster I get out of memory errors.  From the
> Yarn application logs, I see this stack trace:
>
> 2018-04-27 13:22:10,543 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOfRange(Arrays.java:3664)
>         at java.lang.String.<init>(String.java:207)
>         at java.lang.StringBuilder.toString(StringBuilder.java:407)
>         at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2992)
>         at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2817)
>         at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2689)
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1326)
>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1298)
>         at org.apache.pig.backend.hadoop.datastorage.ConfigurationUtil.mergeConf(ConfigurationUtil.java:70)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:185)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:115)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:89)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:297)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$3.call(MRAppMaster.java:550)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$3.call(MRAppMaster.java:532)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1779)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:532)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:309)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1737)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1734)
>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1668)
>
>
> Now, in trying to increase the heap size, I added this to the beginning of
> the script:
>
>
> SET mapreduce.map.java.opts '-Xmx2048m';
> SET mapreduce.reduce.java.opts '-Xmx2048m';
> SET mapreduce.map.memory.mb 2536;
> SET mapreduce.reduce.memory.mb 2536;
>
> But this has no effect; the settings appear to be ignored.  From the Yarn
> logs, I see the container being launched with a 1024m heap size:
>
> echo "Launching container"
> exec /bin/bash -c "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/opt/hadoop/logs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001/stdout 2>/opt/hadoop/logs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001/stderr"
>
> I also tried setting the memory requirements with the PIG_OPTS environment
> variable:
>
> export PIG_OPTS="-Dmapreduce.reduce.memory.mb=5000 -Dmapreduce.map.memory.mb=5000 -Dmapreduce.map.java.opts=-Xmx5000m"
>
> No matter what I do, the container is always launched with -Xmx1024m and
> the same OOM error occurs.
> The question is, what is the proper way to specify the heap sizes for my
> Pig mappers and reducers?
>
> Best regards,
> Alex Soto
>
>
>
