My Spark application is compiled against Spark 1.6 core and its dependencies.
When I try to run this app on a Spark 2.1 cluster, I run into:

ERROR ApplicationMaster: User class threw exception:
java.lang.NoClassDefFoundError: org/apache/spark/Logging

I was hoping that Spark 2.x is backward compatible.
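The org.apache.spark.Logging trait was made private in Spark 2.0, so jars compiled against 1.6 that reference it fail at runtime with exactly this error; binary compatibility is not kept across the 1.x to 2.x major-version boundary, and the app has to be recompiled against the cluster's version. A minimal build.sbt sketch (the exact version strings and the spark-sql line are illustrative assumptions):

```scala
// build.sbt: depend on the cluster's Spark version instead of 1.6.
// "provided" keeps the cluster's own Spark jars authoritative at runtime.
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.1.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.1.0" % "provided"
)
```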
For the code below, since rdd1 and rdd2 don't depend on each other, I was
expecting the "first" and "second" printlns to be interwoven. However, the
Spark job runs all "first" statements first and then all "second" statements
next, in serial fashion. I have set spark.scheduler.mode = FAIR.
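A likely cause, independent of the scheduler mode: each RDD action blocks the thread that calls it, so two actions issued one after the other from the driver's main thread always run as two serial jobs, and FAIR scheduling never gets a chance to interleave anything. The jobs have to be submitted from separate threads. A sketch, where rdd1/rdd2 are stand-ins for the RDDs in the code referred to above:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object ConcurrentJobs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("concurrent-jobs"))
    val rdd1 = sc.parallelize(1 to 100)
    val rdd2 = sc.parallelize(1 to 100)
    // Each action blocks its calling thread, so to interleave the two
    // jobs they must be launched from separate threads (here, Futures).
    val f1 = Future { rdd1.foreach(x => println("first " + x)) }
    val f2 = Future { rdd2.foreach(x => println("second " + x)) }
    Await.result(Future.sequence(Seq(f1, f2)), Duration.Inf)
    sc.stop()
  }
}
```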
Hi, I am planning to process Spark app eventlogs with another Spark app.
These event logs are saved with snappy compression (extension: .snappy).
When I read the file in a new Spark app, I get a "snappy library not found"
error. I am confused as to how Spark can write an eventlog in snappy format
Hi, I am using Spark 1.6 in YARN cluster mode. When my application runs, I am
unable to see GC time metrics in the Spark UI (Application
UI -> Stages -> Tasks). I am attaching the screenshot here.
Is this a bug in Spark UI or is this expected?
Are spark.speculation and related settings supported on standalone mode?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spark-speculation-setting-support-on-standalone-mode-tp28433.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
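On the speculation question: speculation is a task-scheduler feature rather than a cluster-manager feature, so the spark.speculation settings are honored on standalone mode as well. A sketch of the related knobs (the values shown are the documented defaults, except spark.speculation itself, which is off by default):

```scala
val conf = new org.apache.spark.SparkConf()
  .set("spark.speculation", "true")            // enable speculative execution (default: false)
  .set("spark.speculation.interval", "100ms")  // how often to check for slow tasks
  .set("spark.speculation.quantile", "0.75")   // fraction of tasks that must finish first
  .set("spark.speculation.multiplier", "1.5")  // how much slower than the median counts as slow
```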
I am reading from a Kafka topic which has 8 partitions. My Spark app is given
40 executors (1 core per executor). After reading the data, I repartition
the DStream by 500, map it, and save it to Cassandra.
However, I see that only 2 executors are being used per batch, even though I
see 500 tasks.
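One thing worth ruling out (an assumption, not a confirmed diagnosis): with the direct Kafka API the read stage has one task per Kafka partition (8 here), and the 500 post-shuffle tasks are then placed by delay scheduling, which can pile tasks onto the few executors that look data-local. Lowering the locality wait forces the scheduler to spread tasks across idle executors:

```scala
// Illustrative setting; "0s" disables waiting for data-local executor slots.
val conf = new org.apache.spark.SparkConf()
  .set("spark.locality.wait", "0s")
```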
I would like to understand the Spark Application UI's Executors tab values better.
Are the values for Input, Shuffle Read, and Shuffle Write sums over all tasks
across all stages?
If yes, then it appears the value isn't much help while debugging.
Or am I missing the point of these metrics?
What is the right way to unzip a Spark app eventlog saved in snappy format
(.snappy)?
Are there any libraries we can use to do this programmatically?
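One hedged approach: Spark's SnappyCompressionCodec wraps snappy-java's SnappyOutputStream, a block-stream format rather than the Hadoop .snappy container, so the same library's SnappyInputStream should be able to decompress an eventlog outside of Spark. A sketch, assuming org.xerial:snappy-java is on the classpath:

```scala
import java.io.{BufferedOutputStream, FileInputStream, FileOutputStream}
import org.xerial.snappy.SnappyInputStream

// Decompress a .snappy eventlog: args(0) = input path, args(1) = output path.
object DecompressEventlog {
  def main(args: Array[String]): Unit = {
    val in  = new SnappyInputStream(new FileInputStream(args(0)))
    val out = new BufferedOutputStream(new FileOutputStream(args(1)))
    val buf = new Array[Byte](8192)
    Iterator.continually(in.read(buf)).takeWhile(_ >= 0)
      .foreach(n => out.write(buf, 0, n))
    in.close()
    out.close()
  }
}
```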
The question is in the title. Can the metric "Peak Execution Memory" be used
for Spark app resource tuning? If yes, how? If no, what purpose does it serve
when debugging apps?
We have been measuring JVM heap memory usage in our Spark app by taking
periodic samples of JVM heap usage and saving them in our metrics DB.
We do this by spawning a thread in the Spark app that measures the JVM heap
usage every 1 min.
Is it a fair assumption to conclude that if the
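The sampling approach described above can be sketched as a daemon thread that reads JVM heap usage via MemoryMXBean on a fixed period; recordMetric here is a hypothetical hook standing in for whatever metrics DB is in use:

```scala
import java.lang.management.ManagementFactory
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}

object HeapSampler {
  // Samples used heap bytes every `periodSeconds` and hands each sample
  // to the caller-supplied recordMetric callback.
  def start(recordMetric: Long => Unit, periodSeconds: Long = 60): Unit = {
    val mem = ManagementFactory.getMemoryMXBean
    val exec = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "heap-sampler")
        t.setDaemon(true) // don't keep the app alive just for sampling
        t
      }
    })
    exec.scheduleAtFixedRate(new Runnable {
      def run(): Unit = recordMetric(mem.getHeapMemoryUsage.getUsed)
    }, 0, periodSeconds, TimeUnit.SECONDS)
  }
}
```

Note that a thread spawned in the driver samples only the driver's JVM; executor heap would need the same hook running inside each executor.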
For Spark Streaming apps, what does "Output Op Duration" in the batch details
UI signify?
We have been observing that, for a given batch's last output op id,
Output Op Duration > Job Duration by some factor. Sometimes it is huge (1 min).
I have provided the screenshot below where you can see
Our spark job eventlogs are stored in compressed .snappy format.
What do I need to do to decompress these files programmatically?
Any solutions for this?
Spark version: 1.4.1, running in standalone mode.
All my applications complete successfully, but the Spark master UI shows the
executors in KILLED status.
Is it just a UI bug, or are my executors actually killed?