Re: How to verify if spark is using kryo serializer for shuffle

2016-05-08 Thread Nirav Patel
Yes, my mistake. I am using Spark 1.5.2, not 2.x. I looked at the running Spark driver JVM process on Linux. It looks like my settings are not being applied to the driver. We use the Oozie Spark action to launch Spark. I will have to investigate that further. hopefully spark is or have replaced memory killer
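One way to check what the driver JVM actually received is to read its command line and pull out the max-heap flag and any `--conf` pairs. The following is a minimal sketch only: the function name `jvm_settings` and the sample command line are made up for illustration, and how the settings show up will depend on how the job was launched.

```python
import shlex


def jvm_settings(cmdline: str) -> dict:
    """Extract the -Xmx value and any `--conf key=value` pairs from a JVM command line."""
    args = shlex.split(cmdline)
    found = {"max_heap": None, "conf": {}}
    for i, arg in enumerate(args):
        if arg.startswith("-Xmx"):
            found["max_heap"] = arg[len("-Xmx"):]
        elif arg == "--conf" and i + 1 < len(args) and "=" in args[i + 1]:
            key, value = args[i + 1].split("=", 1)
            found["conf"][key] = value
    return found


# Hypothetical command line, e.g. as printed by `ps -o args= -p <driver pid>`.
sample = (
    "java -Xmx1g -cp /opt/spark/conf org.apache.spark.deploy.SparkSubmit "
    "--conf spark.driver.maxResultSize=9G --class com.example.Job job.jar"
)
print(jvm_settings(sample))
```

If the reported heap is smaller than the value requested through spark.driver.memory, the setting most likely never reached the driver process.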

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-08 Thread Ted Yu
See the following: [SPARK-7997][CORE] Remove Akka from Spark Core and Streaming. I guess you meant you are using Spark 1.5.1. For the time being, consider increasing spark.driver.memory. Cheers. On Sun, May 8, 2016 at 9:14 AM, Nirav Patel wrote: > Yes, I am using yarn

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-08 Thread Nirav Patel
Yes, I am using YARN client mode, hence I specified the AM settings too. What do you mean Akka is moved out of the picture? I am using spark 2.5.1 > On May 8, 2016, at 6:39 AM, Ted Yu wrote: > > Are you using YARN client mode ? > > See >

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-08 Thread Ted Yu
Are you using YARN client mode? See https://spark.apache.org/docs/latest/running-on-yarn.html In cluster mode, spark.yarn.am.memory is not effective. For Spark 2.0, Akka is moved out of the picture. FYI. On Sat, May 7, 2016 at 8:24 PM, Nirav Patel wrote: > I have 20

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-07 Thread Nirav Patel
I have 20 executors, 6 cores each. Total 5 stages. It fails on the 5th one. All of them have 6474 tasks. The 5th is a count operation, and it also performs aggregateByKey as part of its lazy evaluation. I am setting: spark.driver.memory=10G, spark.yarn.am.memory=2G and spark.driver.maxResultSize=9G
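For a rough sense of the scale the driver has to track, the numbers above can be worked through as follows. This is a back-of-the-envelope sketch that assumes each task uses one core (spark.task.cpus left at its default of 1) and that every stage really has 6474 tasks.

```python
import math

executors = 20           # from the message
cores_per_executor = 6   # from the message
stages = 5               # from the message
tasks_per_stage = 6474   # from the message

# Assumption: one core per task, so slots = total cores.
slots = executors * cores_per_executor
total_tasks = stages * tasks_per_stage
waves_per_stage = math.ceil(tasks_per_stage / slots)

print(slots, total_tasks, waves_per_stage)
```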

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-07 Thread Ashish Dubey
The driver maintains the complete metadata of the application (scheduling of executors and maintaining the messaging that controls execution). This code seems to be failing in that code path only. That said, there is JVM overhead based on the number of executors, stages, and tasks in your app. Do you know

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-07 Thread Nirav Patel
Right, but these logs are from the Spark driver, and the Spark driver seems to use Akka. ERROR [sparkDriver-akka.actor.default-dispatcher-17] akka.actor.ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down ActorSystem [sparkDriver] I saw

Re: How to verify if spark is using kryo serializer for shuffle

2016-05-07 Thread Ted Yu
bq. at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129) It was Akka which uses JavaSerializer. Cheers. On Sat, May 7, 2016 at 11:13 AM, Nirav Patel wrote: > Hi, > > I thought I was using kryo serializer for shuffle. I could verify it from > spark UI -
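The point here is that the package of the serializer frame in the stack trace tells you which component did the serializing. A small sketch of that check is below: the helper `serializer_frames` is hypothetical, and the only input is the single frame quoted above.

```python
import re

# Matches a Java/Scala stack frame: "at <class>.<method>(<source>)".
FRAME = re.compile(r"at\s+([\w.$]+)\.([\w$<>]+)\(([^)]*)\)")


def serializer_frames(stack_trace: str):
    """Return (class, method) for each frame whose class name mentions a serializer."""
    hits = []
    for line in stack_trace.splitlines():
        match = FRAME.search(line)
        if match and "serializ" in match.group(1).lower():
            hits.append((match.group(1), match.group(2)))
    return hits


trace = "at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)"
for cls, method in serializer_frames(trace):
    origin = "Akka" if cls.startswith("akka.") else "other"
    print(cls, method, origin)
```

A frame under `akka.` points at Akka's own message serialization rather than the serializer configured through spark.serializer.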

How to verify if spark is using kryo serializer for shuffle

2016-05-07 Thread Nirav Patel
Hi, I thought I was using the Kryo serializer for shuffle. I could verify it from the Spark UI Environment tab, which shows: spark.serializer = org.apache.spark.serializer.KryoSerializer, spark.kryo.registrator = com.myapp.spark.jobs.conf.SparkSerializerRegistrator. But when I see following error in Driver logs it