[Apache Spark][Streaming Job][Checkpoint] Spark job failed on checkpoint recovery with "batch not found" error

2020-05-28 Thread taylorwu
Hi,

We have a Spark 2.4 job that fails on checkpoint recovery every few hours with
the following errors (from the driver log):

driver spark-kubernetes-driver ERROR 20:38:51 ERROR MicroBatchExecution:
Query impressionUpdate [id = 54614900-4145-4d60-8156-9746ffc13d1f, runId =
3637c2f3-49b6-40c2-b6d0-7edb28361c5d] terminated with error
java.lang.IllegalStateException: batch 946 doesn't exist
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply$mcV$sp(MicroBatchExecution.scala:406)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381)
at
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply$mcZ$sp(MicroBatchExecution.scala:381)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:557)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch(MicroBatchExecution.scala:337)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:183)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:166)
at
org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at
org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:160)
at
org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:281)
at
org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193)

And the executor logs show this error:

 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM

How should I fix this?
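
For reference, this is roughly how I inspect the checkpoint state to compare
the newest entries in the offsets/ and commits/ directories (a minimal sketch
run in spark-shell, where `spark` is predefined; the checkpoint path below is a
placeholder, not our real one):

import org.apache.hadoop.fs.Path

// "batch N doesn't exist" suggests the offsets/ and commits/ logs in the
// checkpoint directory are out of step, so list the newest files in each.
val hadoopConf = spark.sparkContext.hadoopConfiguration
val ckpt = new Path("/tmp/checkpoints/impressionUpdate")   // placeholder path
val fs = ckpt.getFileSystem(hadoopConf)

Seq("offsets", "commits").foreach { sub =>
  val names = fs.listStatus(new Path(ckpt, sub))
    .map(_.getPath.getName)
    .filterNot(_.startsWith("."))                          // skip temp/hidden files
    .sortBy(n => scala.util.Try(n.toLong).getOrElse(-1L))
  println(s"$sub -> newest entries: ${names.takeRight(3).mkString(", ")}")
}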






Spark Job Failed with FileNotFoundException

2016-11-01 Thread fanooos
I have a Spark cluster consisting of 5 nodes, and a Spark job that should
process some files from a directory and send their contents to Kafka.

I am submitting the job using the following command:

bin$ ./spark-submit --total-executor-cores 20 --executor-memory 5G --class
org.css.java.FileMigration.FileSparkMigrator --master
spark://spark-master:7077
/home/me/FileMigrator-0.1.1-jar-with-dependencies.jar /home/me/shared
kafka01,kafka02,kafka03,kafka04,kafka05 kafka_topic


The directory /home/me/shared is mounted on all 5 nodes, but when I submit the
job I get the following exception:

java.io.FileNotFoundException: File file:/home/me/shared/input_1.txt does
not exist
at
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
at
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
at
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
at
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
at
org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
at
org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
at
org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:239)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)



After some more tries, I noticed another odd behavior. When I submit the job
while the directory contains only 1 or 2 files, the same exception is thrown on
the driver machine, but the job continues and the files are processed
successfully. Once I add another file, the exception is thrown and the job
fails.
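
In case it is relevant, this is a minimal sketch of how the read works as I
understand it (paths simplified from the submit command above; the hdfs://
staging path is hypothetical). My understanding is that a file:// path has to
resolve on every executor that runs a task, while an hdfs:// path does not:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("FileSparkMigrator"))

// With file://, every worker must see the same local mount at this path:
val localLines = sc.textFile("file:///home/me/shared")

// Hypothetical alternative: stage the files in HDFS first
// (hdfs dfs -put /home/me/shared /user/me/shared) and read them from there:
val hdfsLines = sc.textFile("hdfs:///user/me/shared")

println(s"partitions read locally: ${localLines.partitions.length}")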








Re: Spark job failed

2015-09-14 Thread Ted Yu
Have you considered posting on the vendor's forum?

FYI

On Mon, Sep 14, 2015 at 6:09 AM, Renu Yadav  wrote:

>
> -- Forwarded message --
> From: Renu Yadav 
> Date: Mon, Sep 14, 2015 at 4:51 PM
> Subject: Spark job failed
> To: d...@spark.apache.org
>
>
> I am getting the below error while running a Spark job:
>
> storage.DiskBlockObjectWriter: Uncaught exception while reverting partial
> writes to file
> /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
> java.io.FileNotFoundException:
> /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
> (No such file or directory
>
>
>
> I am processing 1.3 TB of data. The transformations are as follows:
>
> read from Hadoop -> map(key/value).coalesce(2000).groupByKey,
> then sort each group of records by server_ts and keep the most recent,
>
> and save the result as Parquet.
>
>
> Following is the command
> spark-submit --class com.test.Myapp --master yarn-cluster  --driver-memory
> 16g  --executor-memory 20g --executor-cores 5   --num-executors 150
> --files /home/renu_yadav/fmyapp/hive-site.xml --conf
> spark.yarn.preserve.staging.files=true --conf
> spark.shuffle.memoryFraction=0.6  --conf spark.storage.memoryFraction=0.1
> --conf SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m"  --conf
> SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m"   --conf
> spark.akka.timeout=40  --conf spark.locality.wait=10 --conf
> spark.yarn.executor.memoryOverhead=8000   --conf
> SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
> --conf spark.reducer.maxMbInFlight=96  --conf
> spark.shuffle.file.buffer.kb=64 --conf
> spark.core.connection.ack.wait.timeout=120  --jars
> /usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-rdbms-3.2.9.jar
> myapp_2.10-1.0.jar
>
>
>
>
>
>
>
> Cluster configuration
>
> 20 Nodes
> 32 cores per node
> 125 GB ram per node
>
>
> Please Help.
>
> Thanks & Regards,
> Renu Yadav
>
>


Fwd: Spark job failed

2015-09-14 Thread Renu Yadav
-- Forwarded message --
From: Renu Yadav 
Date: Mon, Sep 14, 2015 at 4:51 PM
Subject: Spark job failed
To: d...@spark.apache.org


I am getting the below error while running a Spark job:

storage.DiskBlockObjectWriter: Uncaught exception while reverting partial
writes to file
/data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
java.io.FileNotFoundException:
/data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
(No such file or directory



I am processing 1.3 TB of data. The transformations are as follows:

read from Hadoop -> map(key/value).coalesce(2000).groupByKey,
then sort each group of records by server_ts and keep the most recent,

and save the result as Parquet.
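
In code, the pipeline looks roughly like this (a sketch only; the input path
and field positions are illustrative, not the real schema, and the Parquet
write is left out to keep it short):

// (key, full record) pairs keyed on the first field; server_ts assumed at index 3
val raw = sc.textFile("hdfs:///data/input")             // placeholder path
val keyed = raw.map(_.split(",")).map(f => (f(0), f))

val latest = keyed
  .coalesce(2000)
  .groupByKey()
  .mapValues(recs => recs.maxBy(_(3).toLong))           // most recent per key

// A reduceByKey that keeps the max server_ts per key would avoid materialising
// whole groups, but the above mirrors the steps described in this mail.
latest.values.map(_.mkString(",")).saveAsTextFile("hdfs:///data/output")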


Following is the command
spark-submit --class com.test.Myapp --master yarn-cluster  --driver-memory
16g  --executor-memory 20g --executor-cores 5   --num-executors 150
--files /home/renu_yadav/fmyapp/hive-site.xml --conf
spark.yarn.preserve.staging.files=true --conf
spark.shuffle.memoryFraction=0.6  --conf spark.storage.memoryFraction=0.1
--conf SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m"  --conf
SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m"   --conf
spark.akka.timeout=40  --conf spark.locality.wait=10 --conf
spark.yarn.executor.memoryOverhead=8000   --conf
SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
--conf spark.reducer.maxMbInFlight=96  --conf
spark.shuffle.file.buffer.kb=64 --conf
spark.core.connection.ack.wait.timeout=120  --jars
/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-rdbms-3.2.9.jar
myapp_2.10-1.0.jar







Cluster configuration

20 Nodes
32 cores per node
125 GB ram per node


Please Help.

Thanks & Regards,
Renu Yadav


Re: Spark Job Failed (Executor Lost & then FS closed)

2015-08-09 Thread Akhil Das
Can you look more closely at the worker logs and see what's going on? It looks
like a memory issue (GC overhead or similar); the worker logs should show it.
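
For example, something like this (a rough sketch; the values are placeholders
to tune) makes GC activity visible in the executor logs on YARN and gives the
containers some extra headroom:

import org.apache.spark.SparkConf

// Print GC details in the executor (worker) logs and raise the YARN
// memory overhead; both values are only illustrative starting points.
val conf = new SparkConf()
  .set("spark.executor.extraJavaOptions",
       "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
  .set("spark.yarn.executor.memoryOverhead", "1024")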

Thanks
Best Regards

On Fri, Aug 7, 2015 at 3:21 AM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:

> Re attaching the images.
>
> On Thu, Aug 6, 2015 at 2:50 PM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:
>
>> Code:
>> import java.text.SimpleDateFormat
>> import java.util.Calendar
>> import java.sql.Date
>> import org.apache.spark.storage.StorageLevel
>>
>> def extract(array: Array[String], index: Integer) = {
>>   if (index < array.length) {
>> array(index).replaceAll("\"", "")
>>   } else {
>> ""
>>   }
>> }
>>
>>
>> case class GuidSess(
>>   guid: String,
>>   sessionKey: String,
>>   sessionStartDate: String,
>>   siteId: String,
>>   eventCount: String,
>>   browser: String,
>>   browserVersion: String,
>>   operatingSystem: String,
>>   experimentChannel: String,
>>   deviceName: String)
>>
>> val rowStructText =
>> sc.textFile("/user/zeppelin/guidsess/2015/08/05/part-m-1.gz")
>> val guidSessRDD = rowStructText.filter(s => s.length != 1).map(s =>
>> s.split(",")).map(
>>   {
>> s =>
>>   GuidSess(extract(s, 0),
>> extract(s, 1),
>> extract(s, 2),
>> extract(s, 3),
>> extract(s, 4),
>> extract(s, 5),
>> extract(s, 6),
>> extract(s, 7),
>> extract(s, 8),
>> extract(s, 9))
>>   })
>>
>> val guidSessDF = guidSessRDD.toDF()
>> guidSessDF.registerTempTable("guidsess")
>>
>> Once the temp table is created, i wrote this query
>>
>> select siteid, count(distinct guid) total_visitor,
>> count(sessionKey) as total_visits
>> from guidsess
>> group by siteid
>>
>> *Metrics:*
>>
>> Data Size: 170 MB
>> Spark Version: 1.3.1
>> YARN: 2.7.x
>>
>>
>>
>> Timeline:
>> There is 1 Job, 2 stages with 1 task each.
>>
>> *1st Stage : mapPartitions*
>> [image: Inline image 1]
>>
>> 1st Stage: Task 1 started to fail. A second attempt started for 1st task
>> of first Stage. The first attempt failed "Executor LOST"
>> when i go to YARN resource manager and go to that particular host, i see
>> that its running fine.
>>
>> *Attempt #1*
>> [image: Inline image 2]
>>
>> *Attempt #2* Executor LOST AGAIN
>> [image: Inline image 3]
>> *Attempt 3&4*
>>
>> *[image: Inline image 4]*
>>
>>
>>
>> *2nd Stage runJob : SKIPPED*
>>
>> *[image: Inline image 5]*
>>
>> Any suggestions ?
>>
>>
>> --
>> Deepak
>>
>>
>
>
> --
> Deepak
>
>
>


Spark Job Failed (Executor Lost & then FS closed)

2015-08-06 Thread ๏̯͡๏
Code:
import java.text.SimpleDateFormat
import java.util.Calendar
import java.sql.Date
import org.apache.spark.storage.StorageLevel

def extract(array: Array[String], index: Integer) = {
  if (index < array.length) {
array(index).replaceAll("\"", "")
  } else {
""
  }
}


case class GuidSess(
  guid: String,
  sessionKey: String,
  sessionStartDate: String,
  siteId: String,
  eventCount: String,
  browser: String,
  browserVersion: String,
  operatingSystem: String,
  experimentChannel: String,
  deviceName: String)

val rowStructText =
sc.textFile("/user/zeppelin/guidsess/2015/08/05/part-m-1.gz")
val guidSessRDD = rowStructText.filter(s => s.length != 1).map(s =>
s.split(",")).map(
  {
s =>
  GuidSess(extract(s, 0),
extract(s, 1),
extract(s, 2),
extract(s, 3),
extract(s, 4),
extract(s, 5),
extract(s, 6),
extract(s, 7),
extract(s, 8),
extract(s, 9))
  })

val guidSessDF = guidSessRDD.toDF()
guidSessDF.registerTempTable("guidsess")

Once the temp table is created, I run this query:

select siteid, count(distinct guid) total_visitor,
count(sessionKey) as total_visits
from guidsess
group by siteid

*Metrics:*

Data Size: 170 MB
Spark Version: 1.3.1
YARN: 2.7.x



Timeline:
There is 1 job and 2 stages, each with 1 task.

*1st Stage : mapPartitions*
[image: Inline image 1]

1st Stage: Task 1 started to fail, and a second attempt was started for the
1st task of the first stage. The first attempt failed with "Executor LOST".
When I go to the YARN resource manager and check that particular host, I see
that it is running fine.

*Attempt #1*
[image: Inline image 2]

*Attempt #2* Executor LOST AGAIN
[image: Inline image 3]
*Attempt 3&4*

*[image: Inline image 4]*



*2nd Stage runJob : SKIPPED*

*[image: Inline image 5]*

Any suggestions ?
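
One more detail in case it matters: the input is a single .gz file, and as far
as I know gzip is not splittable, so the read yields a single partition, which
matches the single task per stage above. A rough sketch of spreading the work
out after the read (the partition count is a placeholder):

// Repartition right after reading the non-splittable gzip file so the
// downstream stages are not limited to one task.
val repartitioned = sc
  .textFile("/user/zeppelin/guidsess/2015/08/05/part-m-1.gz")
  .repartition(32)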


-- 
Deepak


Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Frank Austin Nothaft
You’ll definitely want to use a Kryo-based serializer for Avro. We have a Kryo-based
serializer that wraps the Avro efficient serializer here.
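
As a rough illustration only (this is plain Kryo registration, not the wrapper
mentioned above; the class name is taken from the stack trace in this thread
and assumes the application jar is on the classpath):

import org.apache.spark.SparkConf
import com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Registering the generated class is optional but lets Kryo encode it
  // more compactly than writing the full class name with every record.
  .registerKryoClasses(Array(classOf[SpsLevelMetricSum]))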

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Apr 3, 2015, at 5:41 AM, Akhil Das  wrote:

> Because, its throwing up serializable exceptions and kryo is a serializer to 
> serialize your objects.
> 
> Thanks
> Best Regards
> 
> On Fri, Apr 3, 2015 at 5:37 PM, Deepak Jain  wrote:
> I meant that I did not have to use kyro. Why will kyro help fix this issue 
> now ?
> 
> Sent from my iPhone
> 
> On 03-Apr-2015, at 5:36 pm, Deepak Jain  wrote:
> 
>> I was able to write record that extends specificrecord (avro) this class was 
>> not auto generated. Do we need to do something extra for auto generated 
>> classes 
>> 
>> Sent from my iPhone
>> 
>> On 03-Apr-2015, at 5:06 pm, Akhil Das  wrote:
>> 
>>> This thread might give you some insights 
>>> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
>>> 
>>> Thanks
>>> Best Regards
>>> 
>>> On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:
>>> My Spark Job failed with
>>> 
>>> 
>>> 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: 
>>> saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
>>> 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: 
>>> Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not 
>>> serializable result: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>> - object not serializable (class: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
>>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": 
>>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, 
>>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": 
>>> null, "currPsLvlId": null}))
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 
>>> in stage 2.0 (TID 0) had a not serializable result: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>> - object not serializable (class: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
>>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": 
>>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, 
>>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": 
>>> null, "currPsLvlId": null}))
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>>> at 
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>> at scala.Option.foreach(Option.scala:236)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>>> at 
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>>> at 
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onRecei

Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Akhil Das
Because it's throwing serialization exceptions, and Kryo is a serializer for
your objects.

Thanks
Best Regards

On Fri, Apr 3, 2015 at 5:37 PM, Deepak Jain  wrote:

> I meant that I did not have to use kyro. Why will kyro help fix this issue
> now ?
>
> Sent from my iPhone
>
> On 03-Apr-2015, at 5:36 pm, Deepak Jain  wrote:
>
> I was able to write record that extends specificrecord (avro) this class
> was not auto generated. Do we need to do something extra for auto generated
> classes
>
> Sent from my iPhone
>
> On 03-Apr-2015, at 5:06 pm, Akhil Das  wrote:
>
> This thread might give you some insights
> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
>
> Thanks
> Best Regards
>
> On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:
>
>> My Spark Job failed with
>>
>>
>> 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed:
>> saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
>> 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw
>> exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0)
>> had a not serializable result:
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>> Serialization stack:
>> - object not serializable (class:
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt":
>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0,
>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt":
>> null, "currPsLvlId": null}))
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> 0.0 in stage 2.0 (TID 0) had a not serializable result:
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>> Serialization stack:
>> - object not serializable (class:
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt":
>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0,
>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt":
>> null, "currPsLvlId": null}))
>> at org.apache.spark.scheduler.DAGScheduler.org
>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>> at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED,
>> exitCode: 15, (reason: User class threw exception: Job aborted due to stage
>> failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result:
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>> Serialization stack:
>>
>> 
>>
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
>> auto generated through avro schema using avro-generate-sources maven pulgin.
>>
>>
>> package com.ebay.ep.poc.spark.reporting.process.model.dw;
>>
>> @SuppressWarnings("all")
>>
>> @org.apache.avro.specific.AvroGenerated
>>
>> public class SpsLevelMetricSum extends
>> org.apache.avro.specific.SpecificRecordBase implements
>> org.apache.avro.specific.SpecificRecord {
>> ...
>> ...
>> }
>>
>> Can anyone suggest how to fix this ?
>>
>>
>>
>> --
>> Deepak
>>
>>
>


Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Deepak Jain
I meant that I did not have to use Kryo. Why will Kryo help fix this issue now?

Sent from my iPhone

> On 03-Apr-2015, at 5:36 pm, Deepak Jain  wrote:
> 
> I was able to write record that extends specificrecord (avro) this class was 
> not auto generated. Do we need to do something extra for auto generated 
> classes 
> 
> Sent from my iPhone
> 
>> On 03-Apr-2015, at 5:06 pm, Akhil Das  wrote:
>> 
>> This thread might give you some insights 
>> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
>> 
>> Thanks
>> Best Regards
>> 
>>> On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:
>>> My Spark Job failed with
>>> 
>>> 
>>> 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: 
>>> saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
>>> 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: 
>>> Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not 
>>> serializable result: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>> - object not serializable (class: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
>>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": 
>>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, 
>>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": 
>>> null, "currPsLvlId": null}))
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 
>>> in stage 2.0 (TID 0) had a not serializable result: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>> - object not serializable (class: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
>>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": 
>>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, 
>>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": 
>>> null, "currPsLvlId": null}))
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>>> at 
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>> at scala.Option.foreach(Option.scala:236)
>>> at 
>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>>> at 
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>>> at 
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>> 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, 
>>> exitCode: 15, (reason: User class threw exception: Job aborted due to stage 
>>> failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>> 
>>> 
>>> 
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is auto 
>>> generated through avro schema using avro-generate-sources maven pulgin.
>>> 
>>> 
>>> package com.ebay.ep.poc.spark.reporting.process.model.dw;  
>>> 
>>> @SuppressWarnings("all")
>>> 
>>> @org.apache.avro.specific.AvroGenerated
>>> 
>>> public class SpsLevelMetricSum extends 
>>> org.apache.avro.specific.SpecificRecordBase implements 
>>> org.apache.avro.specific.SpecificRecord {
>>> 
>>> ...
>>> ...
>>> }
>>> 
>>> Can anyone suggest how to fix this ?
>>> 
>>> 
>>> 
>>> -- 
>>> Deepak
>> 


Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Deepak Jain
I was able to write a record that extends SpecificRecord (Avro) when the class was
not auto-generated. Do we need to do something extra for auto-generated classes?

Sent from my iPhone

> On 03-Apr-2015, at 5:06 pm, Akhil Das  wrote:
> 
> This thread might give you some insights 
> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
> 
> Thanks
> Best Regards
> 
>> On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:
>> My Spark Job failed with
>> 
>> 
>> 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: 
>> saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
>> 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: 
>> Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not 
>> serializable result: 
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>> Serialization stack:
>>  - object not serializable (class: 
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": 
>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>  - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>  - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, 
>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": 
>> null, "currPsLvlId": null}))
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 
>> in stage 2.0 (TID 0) had a not serializable result: 
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>> Serialization stack:
>>  - object not serializable (class: 
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
>> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": 
>> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>  - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>  - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, 
>> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": 
>> null, "currPsLvlId": null}))
>>  at 
>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>>  at 
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>>  at 
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>>  at 
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>  at 
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>>  at 
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>  at 
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>  at scala.Option.foreach(Option.scala:236)
>>  at 
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>>  at 
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>>  at 
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>>  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, 
>> exitCode: 15, (reason: User class threw exception: Job aborted due to stage 
>> failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: 
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>> Serialization stack:
>> 
>> 
>> 
>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is auto 
>> generated through avro schema using avro-generate-sources maven pulgin.
>> 
>> 
>> package com.ebay.ep.poc.spark.reporting.process.model.dw;  
>> 
>> @SuppressWarnings("all")
>> 
>> @org.apache.avro.specific.AvroGenerated
>> 
>> public class SpsLevelMetricSum extends 
>> org.apache.avro.specific.SpecificRecordBase implements 
>> org.apache.avro.specific.SpecificRecord {
>> 
>> ...
>> ...
>> }
>> 
>> Can anyone suggest how to fix this ?
>> 
>> 
>> 
>> -- 
>> Deepak
> 


Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Akhil Das
This thread might give you some insights
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E

Thanks
Best Regards

On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏)  wrote:

> My Spark Job failed with
>
>
> 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed:
> saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
> 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw
> exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0)
> had a not serializable result:
> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
> Serialization stack:
> - object not serializable (class:
> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt":
> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0,
> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt":
> null, "currPsLvlId": null}))
> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> 0.0 in stage 2.0 (TID 0) had a not serializable result:
> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
> Serialization stack:
> - object not serializable (class:
> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
> {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt":
> null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
> - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
> - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0,
> "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt":
> null, "currPsLvlId": null}))
> at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
> at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
> at
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception: Job aborted due to stage
> failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result:
> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
> Serialization stack:
>
> 
>
> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is auto
> generated through avro schema using avro-generate-sources maven pulgin.
>
>
> package com.ebay.ep.poc.spark.reporting.process.model.dw;
>
> @SuppressWarnings("all")
>
> @org.apache.avro.specific.AvroGenerated
>
> public class SpsLevelMetricSum extends
> org.apache.avro.specific.SpecificRecordBase implements
> org.apache.avro.specific.SpecificRecord {
> ...
> ...
> }
>
> Can anyone suggest how to fix this ?
>
>
>
> --
> Deepak
>
>


Spark Job Failed - Class not serializable

2015-04-03 Thread ๏̯͡๏
My Spark Job failed with


15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed:
saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception:
Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not
serializable result:
com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
Serialization stack:
- object not serializable (class:
com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
{"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt":
null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
- object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0,
"spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt":
null, "currPsLvlId": null}))
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0
in stage 2.0 (TID 0) had a not serializable result:
com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
Serialization stack:
- object not serializable (class:
com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
{"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt":
null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
- object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0,
"spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt":
null, "currPsLvlId": null}))
at org.apache.spark.scheduler.DAGScheduler.org
$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
at
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 15, (reason: User class threw exception: Job aborted due to stage
failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result:
com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
Serialization stack:



com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
auto-generated from an Avro schema using the avro-generate-sources Maven plugin.


package com.ebay.ep.poc.spark.reporting.process.model.dw;

@SuppressWarnings("all")

@org.apache.avro.specific.AvroGenerated

public class SpsLevelMetricSum extends
org.apache.avro.specific.SpecificRecordBase implements
org.apache.avro.specific.SpecificRecord {
...
...
}

Can anyone suggest how to fix this ?



-- 
Deepak