Hi Koji,

That did help, thank you. Now, can I specify this in the PIG_OPTS environment variable instead of in the Pig script?
Best regards,
Alex Soto

> On Apr 27, 2018, at 2:21 PM, Koji Noguchi <knogu...@oath.com.INVALID> wrote:
>
> Hi Alex,
>
> Can you try increasing the heap size of the ApplicationMaster?
>
> yarn.app.mapreduce.am.resource.mb=3584
> yarn.app.mapreduce.am.command-opts=-Xmx3096m
>
> Koji
>
> On Fri, Apr 27, 2018 at 1:49 PM, Alex Soto <alex.s...@envieta.com> wrote:
>
>> Hello,
>>
>> I am using Pig version 0.17.0. When I attempt to run my Pig script from
>> the command line on a YARN cluster, I get out-of-memory errors. From the
>> YARN application logs, I see this stack trace:
>>
>> 2018-04-27 13:22:10,543 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
>> java.lang.OutOfMemoryError: Java heap space
>>         at java.util.Arrays.copyOfRange(Arrays.java:3664)
>>         at java.lang.String.<init>(String.java:207)
>>         at java.lang.StringBuilder.toString(StringBuilder.java:407)
>>         at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2992)
>>         at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2817)
>>         at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2689)
>>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1326)
>>         at org.apache.hadoop.conf.Configuration.set(Configuration.java:1298)
>>         at org.apache.pig.backend.hadoop.datastorage.ConfigurationUtil.mergeConf(ConfigurationUtil.java:70)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setLocation(PigOutputFormat.java:185)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.setUpContext(PigOutputCommitter.java:115)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.getCommitters(PigOutputCommitter.java:89)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.<init>(PigOutputCommitter.java:70)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getOutputCommitter(PigOutputFormat.java:297)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$3.call(MRAppMaster.java:550)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$3.call(MRAppMaster.java:532)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1779)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:532)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:309)
>>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1737)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1734)
>>         at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1668)
>>
>> Trying to increase the heap size, I added this to the beginning of the
>> script:
>>
>> SET mapreduce.map.java.opts '-Xmx2048m';
>> SET mapreduce.reduce.java.opts '-Xmx2048m';
>> SET mapreduce.map.memory.mb 2536;
>> SET mapreduce.reduce.memory.mb 2536;
>>
>> But this has no effect; the settings are ignored. From the YARN logs, I
>> see the container being launched with a 1024m heap:
>>
>> echo "Launching container"
>> exec /bin/bash -c "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/logs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/opt/hadoop/logs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001/stdout 2>/opt/hadoop/logs/userlogs/application_1523452171521_0223/container_1523452171521_0223_01_000001/stderr"
>>
>> I also tried setting the memory requirements with the PIG_OPTS
>> environment variable:
>>
>> export PIG_OPTS="-Dmapreduce.reduce.memory.mb=5000 -Dmapreduce.map.memory.mb=5000 -Dmapreduce.map.java.opts=-Xmx5000m"
>>
>> No matter what I do, the container is always launched with -Xmx1024m and
>> the same OOM error occurs.
>> The question is: what is the proper way to specify the heap sizes for my
>> Pig mappers and reducers?
>>
>> Best regards,
>> Alex Soto
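[For readers of this archived thread: a minimal sketch of how the ApplicationMaster settings from Koji's reply could be supplied outside the Pig script, by exporting them as -D system properties in PIG_OPTS. The property names come directly from the thread; the script name is hypothetical, and whether a given Pig version forwards these properties into the job configuration should be verified against the Pig documentation for that release.]

```shell
# Hedged sketch: export the ApplicationMaster memory settings (from Koji's
# reply) as Java system properties, so they apply without editing the script.
export PIG_OPTS="-Dyarn.app.mapreduce.am.resource.mb=3584 -Dyarn.app.mapreduce.am.command-opts=-Xmx3096m"

# The equivalent in-script form would be the SET statements at the top of
# the Pig script:
#   SET yarn.app.mapreduce.am.resource.mb 3584;
#   SET yarn.app.mapreduce.am.command-opts '-Xmx3096m';

# Invocation would then be, e.g. (script name is hypothetical):
#   pig myscript.pig
echo "$PIG_OPTS"
```

Note the distinction the thread turns on: the OOM happens in the MRAppMaster itself, so the `mapreduce.map.*` / `mapreduce.reduce.*` settings govern the wrong containers; it is the `yarn.app.mapreduce.am.*` properties that size the ApplicationMaster container.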