Running Spark application compiled with 1.6 on a Spark 2.1 cluster

2017-07-26 Thread satishl
My Spark application is compiled against Spark 1.6 core and its dependencies. When I try to run this app on a Spark 2.1 cluster, I run into *ERROR ApplicationMaster: User class threw exception: java.lang.NoClassDefFoundError: org/apache/spark/Logging*. I was hoping that Spark 2.x would be backward compatible.
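
For context, org.apache.spark.Logging was removed from Spark's public API in 2.0, so an app built against 1.6 generally needs to be recompiled against the cluster's Spark version. A minimal sbt sketch; the project name and versions below are illustrative, not from the original thread:

```scala
// build.sbt -- a minimal sketch of recompiling against the cluster's Spark
// version; project name and versions are illustrative.
name := "my-spark-app"
scalaVersion := "2.11.8" // Spark 2.1 is built against Scala 2.11 by default

libraryDependencies ++= Seq(
  // "provided" so the cluster's own Spark 2.1 jars are used at runtime
  "org.apache.spark" %% "spark-core" % "2.1.0" % "provided"
)
```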

Question about Parallel Stages in Spark

2017-06-26 Thread satishl
For the code below, since rdd1 and rdd2 don't depend on each other, I was expecting the "first" and "second" printlns to be interleaved. However, the Spark job runs all "first" statements first and then all "second" statements, in serial fashion. I have set spark.scheduler.mode = FAIR.
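
A minimal sketch of one way to get the interleaving described (the RDDs here are stand-ins): spark.scheduler.mode = FAIR only governs how concurrent jobs share resources; jobs still run one after another unless their actions are submitted from separate threads, e.g. via Futures.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("parallel-jobs"))
val rdd1 = sc.parallelize(1 to 100)
val rdd2 = sc.parallelize(1 to 100)

// Each action is submitted from its own thread, so the scheduler can run the
// two independent jobs concurrently instead of back to back.
val f1 = Future { rdd1.foreach(x => println(s"first $x")) }
val f2 = Future { rdd2.foreach(x => println(s"second $x")) }
Await.result(Future.sequence(Seq(f1, f2)), Duration.Inf)
```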

Reading snappy eventlog files from HDFS using Spark

2017-04-07 Thread satishl
Hi, I am planning to process Spark app eventlogs with another Spark app. These eventlogs are saved with snappy compression (extension: .snappy). When I read the file in a new Spark app, I get a "snappy library not found" error. I am confused as to how Spark can write eventlogs in snappy format yet fail to find the library needed to read them back.
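
One possible workaround, sketched under the assumption that the eventlogs were written by Spark's own snappy codec (which uses snappy-java's block stream format): read the raw bytes with binaryFiles and decompress with snappy-java, side-stepping the Hadoop-native snappy library that textFile would look for. The HDFS path is a placeholder.

```scala
import java.io.ByteArrayInputStream
import org.apache.spark.{SparkConf, SparkContext}
import org.xerial.snappy.SnappyInputStream
import scala.io.Source

val sc = new SparkContext(new SparkConf().setAppName("read-eventlogs"))

// Read each .snappy eventlog as raw bytes, then decompress with snappy-java,
// yielding (path, decompressed JSON text) pairs.
val logs = sc.binaryFiles("hdfs:///spark-history/*.snappy").map {
  case (path, stream) =>
    val in = new SnappyInputStream(new ByteArrayInputStream(stream.toArray))
    try (path, Source.fromInputStream(in, "UTF-8").mkString)
    finally in.close()
}
```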

Spark stages UI page has 'GC Time' column empty

2017-04-04 Thread satishl
Hi, I am using Spark 1.6 in YARN cluster mode. When my application runs, I am unable to see GC time metrics in the Spark UI (Application UI -> Stages -> Tasks). I am attaching the screenshot here. Is this a bug in the Spark UI or is this expected?

spark.speculation setting support in standalone mode?

2017-02-27 Thread satishl
Are spark.speculation and related settings supported in standalone mode?
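
For reference, a sketch of the settings in question as they would be set on a SparkConf (the app name is illustrative; the three tuning values shown are the documented defaults):

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("speculative-app") // illustrative app name
  .set("spark.speculation", "true")
  .set("spark.speculation.interval", "100ms") // how often to check for stragglers
  .set("spark.speculation.quantile", "0.75")  // fraction of tasks that must finish first
  .set("spark.speculation.multiplier", "1.5") // how much slower than the median counts as slow
```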

Spark streaming app always uses 2 executors

2017-02-21 Thread satishl
I am reading from a Kafka topic which has 8 partitions. My Spark app is given 40 executors (1 core per executor). After reading the data, I repartition the DStream by 500, map it, and save it to Cassandra. However, I see that only 2 executors are being used per batch, even though I see 500 tasks.
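
A minimal sketch of the pipeline described above, using the Spark 1.6-era direct Kafka stream; the broker, topic, batch interval, and the println stand-in for the Cassandra save are all placeholders. The direct stream has one Spark partition per Kafka partition (8 here), so repartition(500) inserts a shuffle meant to spread the mapped work across executors.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val ssc = new StreamingContext(new SparkConf().setAppName("kafka-app"), Seconds(10))

// One Spark partition per Kafka partition before the repartition below.
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, Map("metadata.broker.list" -> "broker:9092"), Set("my-topic"))

stream.repartition(500)
  .map { case (_, value) => value }           // placeholder transformation
  .foreachRDD(rdd => rdd.foreach(println))    // stand-in for the Cassandra save
ssc.start()
ssc.awaitTermination()
```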

Executor tab values in Spark Application UI

2017-02-17 Thread satishl
I would like to understand the Spark Application UI's Executors tab values better. Are the values for Input, Shuffle Read, and Shuffle Write sums across all tasks in all stages? If yes, then it appears the value isn't much help while debugging. Or am I missing the point of these metrics?

Extracting eventlogs saved in snappy format

2017-02-15 Thread satishl
What is the right way to unpack a Spark app eventlog saved in snappy format (.snappy)? Are there any libraries which we can use to do this programmatically?

What is the practical use of "Peak Execution Memory" in Spark App Resource tuning

2017-02-15 Thread satishl
The question is in the title. Can the metric "Peak Execution Memory" be used for Spark app resource tuning? If yes, how? If no, what purpose does it serve when debugging apps?

Spark executor memory and JVM heap memory usage metric

2017-02-15 Thread satishl
We have been measuring JVM heap memory usage in our Spark app by taking periodic samples of JVM heap usage and saving them in our metrics db. We do this by spawning a thread in the Spark app that measures the JVM heap memory usage every 1 min. Is it a fair assumption to conclude that if the…
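
A sketch of the sampling approach described above: a daemon thread that records JVM heap usage once a minute. The println is a placeholder for the metrics-db write.

```scala
import java.lang.management.ManagementFactory

val sampler = new Thread(new Runnable {
  override def run(): Unit = {
    val heap = ManagementFactory.getMemoryMXBean
    while (!Thread.currentThread().isInterrupted) {
      // Current heap usage in MB; replace the println with a metrics-db write.
      val usedMb = heap.getHeapMemoryUsage.getUsed / (1024 * 1024)
      println(s"heap used: $usedMb MB")
      Thread.sleep(60 * 1000) // sample every 1 min
    }
  }
})
sampler.setDaemon(true) // don't keep the JVM alive after the app finishes
sampler.start()
```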

Question about "Output Op Duration" in SparkStreaming Batch details UX

2017-01-31 Thread satishl
For Spark Streaming apps, what does "Output Op Duration" in the batch details UX signify? We have been observing that, for a given batch's last output op ID, Output Op Duration > Job Duration by some factor; sometimes the gap is huge (1 min). I have provided the screenshot below where you can see…

Decompressing Spark Eventlogs with snappy (*.snappy)

2017-01-25 Thread satishl
Our Spark job eventlogs are stored in compressed .snappy format. What do I need to do to decompress these files programmatically?
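
One possibility, sketched under the assumption that the eventlog was written with Spark's snappy codec, which uses snappy-java's block stream format: snappy-java (already on the Spark classpath) can decompress it directly. The file names are hypothetical; the decompressed eventlog is JSON lines.

```scala
import java.io.{FileInputStream, FileOutputStream}
import org.xerial.snappy.SnappyInputStream

// Stream-copy the decompressed bytes to a plain file (paths are hypothetical).
val in = new SnappyInputStream(new FileInputStream("application_1234_0001.snappy"))
val out = new FileOutputStream("application_1234_0001")
val buf = new Array[Byte](8192)
Iterator.continually(in.read(buf)).takeWhile(_ != -1).foreach(n => out.write(buf, 0, n))
in.close(); out.close()
```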

Re: App works, but executor state is "killed"

2016-09-16 Thread satishl
Any solutions for this? Spark version: 1.4.1, running in standalone mode. All my applications complete successfully, but the Spark master UI shows the executors in KILLED status. Is it just a UI bug, or are my executors actually killed?