Gentlemen,

Apologies for the late reply. The issue was just the container size configured in my cluster (yarn.scheduler.maximum-allocation-mb). It was set at 1 GB.
I didn't specify any Spark-specific memory parameters during my run (the memory defaults that the SparkSetupProvider was looking at), and to top it off the code was setting the Xmx at 1 GB, causing the overallocation and failure.

I have one minor proposal. If this is agreeable, I can raise a quick PR. Can we pull out the executor Java options as a property in amaterasu.properties?

    amaterasu.executor.extra.java.opts = "-Xmx1G -Dscala.usejavacp=true -Dhdp.version=2.6.5.0-292"

As a side effect, we must allow quotes around the parameter value, but passing the quotes through to the java command would fail. I have stripped off the extra quotes in a dirty way at the moment. Should we consider proper command parsing (and possibly convert them to bash-compatible strings)?

    s"java -cp spark/jars/*:executor.jar:spark/conf/:${config.YARN.hadoopHomeDir}/conf/ " +
    s" ${config.amaterasuExecutorJavaOpts.replaceAll("\"","")} " +

Meanwhile, I'll also update my PR for Amaterasu-24 after pulling the latest from the branch.

Cheers,
Arun

On Wed, May 30, 2018 at 1:25 PM Arun Manivannan <a...@arunma.com> wrote:

> Thanks a lot, Nadav. Will get home and spend some more time on this. I was
> in a rush and did this poor workaround. My VM is just 8 GB.
>
> Cheers
> Arun
>
>
> On Wed, May 30, 2018, 12:27 Nadav Har Tzvi <nadavhart...@gmail.com> wrote:
>
>> Yaniv and I just tested it. It worked flawlessly on my end (HDP docker on
>> AWS). Both Spark-Scala and PySpark.
>> It worked on Yaniv's HDP cluster as well.
>> Worth noting:
>> 1. HDP 2.6.4
>> 2. Cluster has total of 32GB memory available
>> 3. Each container is allocated 1G memory.
>> 4.
>>    Amaterasu.properties:
>>
>> zk=sandbox-hdp.hortonworks.com
>> version=0.2.0-incubating-rc3
>> master=192.168.33.11
>> user=root
>> mode=yarn
>> webserver.port=8000
>> webserver.root=dist
>> spark.version=2.6.4.0-91
>> yarn.queue=default
>> yarn.jarspath=hdfs:///apps/amaterasu
>> spark.home=/usr/hdp/current/spark2-client
>>
>> #spark.home=/opt/cloudera/parcels/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658/lib/spark2
>> yarn.hadoop.home.dir=/etc/hadoop
>> spark.opts.spark.yarn.am.extraJavaOptions="-Dhdp.version=2.6.4.0-91"
>> spark.opts.spark.driver.extraJavaOptions="-Dhdp.version=2.6.4.0-91"
>>
>>
>> Arun, please share:
>> 1. YARN memory configurations
>> 2. amaterasu.properties content
>> 3. HDP version.
>>
>> Cheers,
>> Nadav
>>
>>
>> On 30 May 2018 at 07:11, Arun Manivannan <a...@arunma.com> wrote:
>>
>> > The pmem disabling is just temporary. I'll do a detailed analysis and get
>> > back with a proper solution.
>> >
>> > Any hints on this front is highly appreciated.
>> >
>> > Cheers
>> > Arun
>> >
>> > On Wed, May 30, 2018, 01:10 Nadav Har Tzvi <nadavhart...@gmail.com> wrote:
>> >
>> > > Yaniv, Eyal, this might be related to the same issue you faced with HDP.
>> > > Can you confirm?
>> > >
>> > > On Tue, May 29, 2018, 17:58 Arun Manivannan <a...@arunma.com> wrote:
>> > >
>> > > > +1 from me
>> > > >
>> > > > Unit Tests and Build ran fine.
>> > > >
>> > > > Tested on HDP (VM) but had trouble allocating containers (didn't have that
>> > > > before). Apparently Centos VMs are known to have this problem. Disabled
>> > > > physical memory check (yarn.nodemanager.pmem-check-enabled) and ran jobs
>> > > > successfully.
>> > > >
>> > > > On Tue, May 29, 2018 at 10:42 PM Kirupa Devarajan <kirupagara...@gmail.com> wrote:
>> > > >
>> > > > > Unit tests passing and build was successful on the branch
>> > > > > "version-0.2.0-incubating-rc3"
>> > > > >
>> > > > > +1 from me
>> > > > >
>> > > > > Cheers,
>> > > > > Kirupa
>> > > > >
>> > > > >
>> > > > > On Tue, May 29, 2018 at 3:06 PM, guy peleg <whisr...@gmail.com> wrote:
>> > > > >
>> > > > > > +1 looks good to me
>> > > > > >
>> > > > > > On Tue, May 29, 2018, 14:39 Nadav Har Tzvi <nadavhart...@gmail.com> wrote:
>> > > > > >
>> > > > > > > +1 approve. Tested multiple times and after a long round of fixing and
>> > > > > > > testing over and over.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Nadav
>> > > > > > >
>> > > > > > >
>> > > > > > > On 29 May 2018 at 07:38, Yaniv Rodenski <ya...@shinto.io> wrote:
>> > > > > > >
>> > > > > > > > Hi everyone,
>> > > > > > > >
>> > > > > > > > We have fixed the legal issues, as well as a bug found by @Nadav please
>> > > > > > > > review and vote on the release candidate #3 for the version
>> > > > > > > > 0.2.0-incubating, as follows
>> > > > > > > >
>> > > > > > > > [ ] +1, Approve the release
>> > > > > > > > [ ] -1, Do not approve the release (please provide specific comments)
>> > > > > > > >
>> > > > > > > > The complete staging area is available for your review, which includes:
>> > > > > > > >
>> > > > > > > > * JIRA release notes [1],
>> > > > > > > > * the official Apache source release to be deployed to dist.apache.org
>> > > > > > > > [2],
>> > > > > > > > which is signed with the key with fingerprint [3],
>> > > > > > > > * source code tag "version-0.2.0-incubating-rc3" [4],
>> > > > > > > > * Java artifacts were built with Gradle 3.1 and OpenJDK/Oracle JDK
>> > > > > > > > 1.8.0_151
>> > > > > > > >
>> > > > > > > > The vote will be open for at least 72 hours. It is adopted by majority
>> > > > > > > > approval, with at least 3 PMC affirmative votes.
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > Yaniv
>> > > > > > > >
>> > > > > > > > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12321521&version=12342793
>> > > > > > > > [2] https://dist.apache.org/repos/dist/dev/incubator/amaterasu/0.2.0rc3/
>> > > > > > > > [3] https://dist.apache.org/repos/dist/dev/incubator/amaterasu/KEYS
>> > > > > > > > [4] https://github.com/apache/incubator-amaterasu/tags
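
On the command-parsing question raised at the top of this message, here is a minimal Scala sketch of one possible approach, not the Amaterasu implementation. It strips a single pair of quotes wrapping the whole property value, then splits the options on whitespace while keeping quoted runs together. The names JavaOptsParser, stripOuterQuotes, tokenize and parse are hypothetical and exist only for illustration. Backslash escapes are not handled.

```scala
// Hypothetical helper for illustration only; not part of Amaterasu.
object JavaOptsParser {

  /** Removes one pair of matching quotes that wraps the whole value, if present. */
  def stripOuterQuotes(value: String): String = {
    val v = value.trim
    val wrapped = v.length >= 2 && (v.head == '"' || v.head == '\'') && v.last == v.head
    if (wrapped) {
      val inner = v.substring(1, v.length - 1)
      // Only strip when the quote character does not reappear inside,
      // so that a value like "a" "b" is left for the tokenizer.
      if (inner.indexOf(v.head) == -1) inner else v
    } else v
  }

  /** Splits on whitespace; quoted runs stay in one token and lose their quotes. */
  def tokenize(opts: String): List[String] = {
    val tokens = scala.collection.mutable.ListBuffer.empty[String]
    val current = new StringBuilder
    var quote: Option[Char] = None // the quote character currently open, if any
    var inToken = false

    for (c <- opts) quote match {
      case Some(q) =>
        if (c == q) quote = None else current.append(c)
      case None =>
        if (c == '"' || c == '\'') {
          quote = Some(c)
          inToken = true
        } else if (c.isWhitespace) {
          if (inToken) {
            tokens += current.toString
            current.clear()
            inToken = false
          }
        } else {
          current.append(c)
          inToken = true
        }
    }
    require(quote.isEmpty, s"Unbalanced quote in: $opts")
    if (inToken) tokens += current.toString
    tokens.toList
  }

  /** Property value as written in the properties file => individual JVM options. */
  def parse(raw: String): List[String] = tokenize(stripOuterQuotes(raw))
}
```

With the value from the proposal, JavaOptsParser.parse("\"-Xmx1G -Dscala.usejavacp=true -Dhdp.version=2.6.5.0-292\"") yields the three options as separate tokens. Unlike a blanket replaceAll, a value such as -Dfoo="a b" stays one token, although it would still need re-quoting if it is later concatenated into a shell command string.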