Yes, my mistake. I am using Spark 1.5.2, not 2.x. I looked at the running Spark driver JVM process on Linux, and it looks like my settings are not being applied to the driver. We use the Oozie Spark action to launch Spark, so I will have to investigate that further.
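Since Oozie launches the driver itself, one quick way to confirm which settings actually reached the driver is to dump the effective SparkConf (and the real JVM heap) from inside the job. This is a minimal sketch only, assuming a Scala job against the Spark 1.5.x API; the ConfCheck object and the list of keys are illustrative, not part of the original job:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper; it only reports what the driver actually received.
object ConfCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("conf-check"))

    // The settings discussed in this thread, as the driver sees them.
    // If Oozie did not forward them, they show up as <not set> (or defaults).
    Seq(
      "spark.driver.memory",
      "spark.driver.maxResultSize",
      "spark.yarn.am.memory",
      "spark.serializer",
      "spark.kryo.registrator"
    ).foreach { key =>
      println(s"$key = ${sc.getConf.getOption(key).getOrElse("<not set>")}")
    }

    // The heap the driver JVM was really launched with -- the definitive check
    // when it is unclear whether -Xmx / --driver-memory was applied.
    println(s"driver max heap = ${Runtime.getRuntime.maxMemory / (1024 * 1024)} MB")

    sc.stop()
  }
}

If spark.driver.memory shows up as <not set>, or the max heap is far below the configured 10G, that would point at the Oozie action not passing the options through to the driver, rather than at Spark ignoring them.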
Hopefully Spark has replaced, or will soon replace, the memory-hungry Java serializer with a better streaming serializer. Thanks

On Sun, May 8, 2016 at 9:33 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> See the following:
> [SPARK-7997][CORE] Remove Akka from Spark Core and Streaming
>
> I guess you meant you are using Spark 1.5.1
>
> For the time being, consider increasing spark.driver.memory
>
> Cheers
>
> On Sun, May 8, 2016 at 9:14 AM, Nirav Patel <npa...@xactlycorp.com> wrote:
>
>> Yes, I am using yarn client mode, hence I specified the am settings too.
>> What do you mean akka is moved out of the picture? I am using spark 2.5.1
>>
>> Sent from my iPhone
>>
>> On May 8, 2016, at 6:39 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> Are you using YARN client mode?
>>
>> See
>> https://spark.apache.org/docs/latest/running-on-yarn.html
>>
>> In cluster mode, spark.yarn.am.memory is not effective.
>>
>> For Spark 2.0, akka is moved out of the picture.
>> FYI
>>
>> On Sat, May 7, 2016 at 8:24 PM, Nirav Patel <npa...@xactlycorp.com> wrote:
>>
>>> I have 20 executors, 6 cores each. Total 5 stages. It fails on the 5th one.
>>> All of them have 6474 tasks. The 5th one is a count operation and it also
>>> performs aggregateByKey as part of its lazy evaluation.
>>> I am setting:
>>> spark.driver.memory=10G, spark.yarn.am.memory=2G and spark.driver.maxResultSize=9G
>>>
>>> On a side note, could it be something to do with the java serialization library,
>>> ByteArrayOutputStream using a byte array? Can it be replaced by some better
>>> serializing library?
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8055949
>>> https://bugs.openjdk.java.net/browse/JDK-8136527
>>>
>>> Thanks
>>>
>>> On Sat, May 7, 2016 at 4:51 PM, Ashish Dubey <ashish....@gmail.com> wrote:
>>>
>>>> The driver maintains the complete metadata of the application (scheduling of
>>>> executors and maintaining the messaging to control the execution).
>>>> This code seems to be failing in that code path only. With that said, there is
>>>> JVM overhead that depends on the number of executors, stages and tasks in your
>>>> app. Do you know your driver heap size and application structure (number of
>>>> stages and tasks)?
>>>>
>>>> Ashish
>>>>
>>>> On Saturday, May 7, 2016, Nirav Patel <npa...@xactlycorp.com> wrote:
>>>>
>>>>> Right, but these logs are from the spark driver, and the spark driver seems
>>>>> to use Akka.
>>>>>
>>>>> ERROR [sparkDriver-akka.actor.default-dispatcher-17]
>>>>> akka.actor.ActorSystemImpl: Uncaught fatal error from thread
>>>>> [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down
>>>>> ActorSystem [sparkDriver]
>>>>>
>>>>> I saw the following logs before the above happened.
>>>>>
>>>>> 2016-05-06 09:49:17,813 INFO [sparkDriver-akka.actor.default-dispatcher-17]
>>>>> org.apache.spark.MapOutputTrackerMasterEndpoint: Asked to send map output
>>>>> locations for shuffle 1 to hdn6.xactlycorporation.local:44503
>>>>>
>>>>> As far as I know, the driver is just driving the shuffle operation, not
>>>>> actually doing anything within its own system that would cause a memory
>>>>> issue. Can you explain in what circumstances I could see this error in the
>>>>> driver logs? I don't do any collect or any other driver operation that would
>>>>> cause this. It fails when doing the aggregateByKey operation, but that should
>>>>> happen in the executor JVM, NOT in the driver JVM.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Sat, May 7, 2016 at 11:58 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>
>>>>>> bq. at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)
>>>>>>
>>>>>> It was Akka which uses JavaSerializer
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Sat, May 7, 2016 at 11:13 AM, Nirav Patel <npa...@xactlycorp.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I thought I was using the kryo serializer for shuffle. I could verify it
>>>>>>> from the spark UI - Environment tab:
>>>>>>> spark.serializer org.apache.spark.serializer.KryoSerializer
>>>>>>> spark.kryo.registrator com.myapp.spark.jobs.conf.SparkSerializerRegistrator
>>>>>>>
>>>>>>> But when I see the following error in the driver logs, it looks like spark
>>>>>>> is using JavaSerializer:
>>>>>>>
>>>>>>> 2016-05-06 09:49:26,490 ERROR [sparkDriver-akka.actor.default-dispatcher-17]
>>>>>>> akka.actor.ActorSystemImpl: Uncaught fatal error from thread
>>>>>>> [sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down
>>>>>>> ActorSystem [sparkDriver]
>>>>>>>
>>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>>>     at java.util.Arrays.copyOf(Arrays.java:2271)
>>>>>>>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>>>>>>>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>>>>>>>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>>>>>>>     at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
>>>>>>>     at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
>>>>>>>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
>>>>>>>     at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>>>>>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply$mcV$sp(Serializer.scala:129)
>>>>>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:129)
>>>>>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:129)
>>>>>>>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>>>>>     at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)
>>>>>>>     at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:36)
>>>>>>>     at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:843)
>>>>>>>     at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:843)
>>>>>>>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>>>>>     at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:842)
>>>>>>>     at akka.remote.EndpointWriter.writeSend(Endpoint.scala:743)
>>>>>>>     at akka.remote.EndpointWriter$$anonfun$2.applyOrElse(Endpoint.scala:718)
>>>>>>>     at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>>>>>>>     at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:411)
>>>>>>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>>>>>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>>>>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>>>>>>>     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>>>>>>>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>>>>>>>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>>>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>>
>>>>>>> What am I missing here?
>>>>>>>
>>>>>>> Thanks
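For reference, the Kryo setup discussed above is wired roughly as in the sketch below (hypothetical names only; MyKryoRegistrator stands in for the com.myapp.spark.jobs.conf.SparkSerializerRegistrator mentioned in the thread). The key point from the thread is that spark.serializer only governs Spark's own data path (shuffle data, cached RDDs and the like); in Spark 1.x the driver's Akka control messages, which is where the JavaSerializer and ByteArrayOutputStream frames in the stack trace come from, are serialized by Akka itself and are not affected by this setting.

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical registrator, standing in for the
// com.myapp.spark.jobs.conf.SparkSerializerRegistrator referenced above.
class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Register the application's record classes so Kryo writes compact ids
    // instead of full class names.
    kryo.register(classOf[Array[Byte]])
    // kryo.register(classOf[com.myapp.MyRecord])  // application-specific classes
  }
}

object KryoSettings {
  // Applies Kryo to Spark's data serialization (shuffle, caching, task results).
  // It does not change how the Spark 1.x driver's Akka messages are serialized.
  def withKryo(conf: SparkConf): SparkConf =
    conf
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", classOf[MyKryoRegistrator].getName)
}

Given the MapOutputTrackerMasterEndpoint log line right before the crash, the OOM is most likely the driver serializing map output statuses for the 6474-task shuffle over Akka, which is why the advice in the thread is to raise spark.driver.memory rather than to change spark.serializer.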