[ https://issues.apache.org/jira/browse/SPARK-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277217#comment-14277217 ]
Sean Owen commented on SPARK-5235:
----------------------------------

[~alexbaretta] It certainly may not be your code, of course; by "people" I also mean the Spark code itself. But surely the problem is solved exactly by not trying to serialize {{SQLContext}}? Despite its declaration, as you've demonstrated, it does not actually serialize, and given the {{@transient}} field it was not designed to be used after serialization. You've suggested a reasonable band-aid on a band-aid, but I would rather either fix the root cause or understand why it is actually supposed to behave this way. Other contexts in Spark are not supposed to be serialized. Where I've seen this pattern before, in the unit tests, it was a hack for convenience that didn't matter much because it was only a test.

Can you run with {{-Dsun.io.serialization.extendeddebuginfo=true}}? This will show exactly what held the reference to {{SQLContext}}.

> java.io.NotSerializableException: org.apache.spark.sql.SQLConf
> --------------------------------------------------------------
>
>          Key: SPARK-5235
>          URL: https://issues.apache.org/jira/browse/SPARK-5235
>      Project: Spark
>   Issue Type: Bug
>     Reporter: Alex Baretta
>
> The SQLConf field in SQLContext is neither Serializable nor transient. Here's
> the stack trace I get when running SQL queries against a Parquet file.
> Exception in thread "Thread-43" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task not serializable:
> java.io.NotSerializableException: org.apache.spark.sql.SQLConf
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1195)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1184)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1183)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> 	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1183)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:843)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:779)
> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:763)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1364)
> 	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1356)
> 	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
> 	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
> 	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
> 	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
> 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
> 	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> 	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> 	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> 	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
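The point about the {{@transient}} field can be seen in a minimal, self-contained sketch (hypothetical classes, not Spark's actual {{SQLContext}}): a transient field is silently dropped during Java serialization and comes back as null, which is why an object built around such a field is not meant to be used after a serialization round trip.

```java
import java.io.*;

// Hedged sketch: Context stands in for a class like SQLContext whose
// key field (here a String, in Spark a SparkContext) is @transient.
public class TransientDemo {
    static class Context implements Serializable {
        transient String sparkContext = "live-context"; // dropped on serialization
    }

    // Serialize an object to bytes and read it back.
    static Object roundTrip(Object o) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Context copy = (Context) roundTrip(new Context());
        // The transient field was not written, so it deserializes as null.
        System.out.println("after deserialization: " + copy.sparkContext);
    }
}
```

So even when serialization succeeds, the deserialized copy is crippled: any method touching the transient field would hit a null, which is the sense in which the class "was not designed to be used after serialization".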