[
https://issues.apache.org/jira/browse/SPARK-8142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580670#comment-14580670
]
Marcelo Vanzin commented on SPARK-8142:
---------------------------------------
So, using {{userClassPathFirst}} is tricky exactly because of these issues. You
have to be super careful when you have classes in your app that cross the class
loader boundaries.
Two general comments:
- if you want to use the glassfish jersey version, you shouldn't need to do
this, right? Spark depends on the old one that is under com.sun.*, IIRC.
- marking all dependencies (including hbase) as provided and using
{{spark.{driver,executor}.extraClassPath}} might be the easiest way out if you
really need to use {{userClassPathFirst}}.
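For illustration, the provided-plus-extraClassPath route could look roughly like this (the class name, jar names, and paths are placeholders -- adjust to what actually exists on your cluster):

```shell
# Dependencies are marked "provided" in the build, so the app jar is thin.
# The real jars must already be present at these paths on every node.
spark-submit \
  --class com.example.MyJob \
  --conf spark.driver.extraClassPath=/opt/deps/hbase-client.jar:/opt/deps/jersey-client.jar \
  --conf spark.executor.extraClassPath=/opt/deps/hbase-client.jar:/opt/deps/jersey-client.jar \
  my-job-assembly.jar
```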
Basically the class cast exceptions you're getting are because you have the
same class in both Spark's class loader and your app's class loader, and those
classes need to cross that boundary. So if you make sure in your app's build
that these conflicts do not occur, then using {{userClassPathFirst}} should
work.
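To make the mechanism concrete, here is a small self-contained Java sketch (all names hypothetical, no Spark involved) showing that the same bytecode defined by two different class loaders yields two distinct {{Class}} objects, so an instance from one loader is not an instance of the other -- exactly the situation behind the {{ClassCastException}} above:

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

public class ClassLoaderClash {
    // Stands in for any class that exists in both Spark's and the app's loader.
    public static class Payload {}

    // A "child-first" loader: it defines Payload itself instead of
    // delegating to the parent, mimicking userClassPathFirst behavior.
    public static class ChildFirstLoader extends ClassLoader {
        private final byte[] bytes;
        public ChildFirstLoader(byte[] bytes) {
            super(ClassLoaderClash.class.getClassLoader());
            this.bytes = bytes;
        }
        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (name.equals(Payload.class.getName())) {
                Class<?> c = findLoadedClass(name);
                if (c == null) c = defineClass(name, bytes, 0, bytes.length);
                if (resolve) resolveClass(c);
                return c;
            }
            return super.loadClass(name, resolve);
        }
    }

    // Reads a class's bytecode from the classpath so we can redefine it.
    public static byte[] classBytes(Class<?> c) throws Exception {
        String path = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(path)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = classBytes(Payload.class);
        Class<?> parentCopy = Payload.class;
        Class<?> childCopy = new ChildFirstLoader(bytes).loadClass(Payload.class.getName());
        // Same name, same bytecode -- but two distinct Class objects:
        System.out.println(parentCopy == childCopy);   // false
        Object o = childCopy.getDeclaredConstructor().newInstance();
        // A cast of o to the parent's Payload would throw ClassCastException:
        System.out.println(parentCopy.isInstance(o));  // false
    }
}
```

In Spark's case the "parent" copy lives in Spark's loader and the "child" copy in the app's loader, and a deserialized task object crosses that boundary.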
Sorry I don't have a better suggestion; it's just not that trivial of a
problem. :-/ We could make the child-first class loader configurable, so that
you can set a sort of "blacklist" of packages where it should look at the
parent first, but that would still require people to fiddle with configurations
to make things work.
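A rough sketch of what such a configurable child-first loader could look like (illustrative only, not Spark code; the class and parameter names are made up):

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Child-first class loading, except for a configurable list of package
// prefixes that are always delegated to the parent first.
public class ConfigurableChildFirstLoader extends URLClassLoader {
    private final List<String> parentFirstPrefixes;

    public ConfigurableChildFirstLoader(URL[] urls, ClassLoader parent,
                                        List<String> parentFirstPrefixes) {
        super(urls, parent);
        this.parentFirstPrefixes = parentFirstPrefixes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                boolean parentFirst = parentFirstPrefixes.stream().anyMatch(name::startsWith);
                if (parentFirst) {
                    // Standard parent-delegating lookup for listed packages.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        c = findClass(name);          // look in the user jars first
                    } catch (ClassNotFoundException e) {
                        c = super.loadClass(name, false); // fall back to the parent
                    }
                }
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }
}
```

With something like {{org.apache.spark.}} on the parent-first list, framework classes such as {{ResultTask}} would always resolve in Spark's loader and never be duplicated on the app side.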
> Spark Job Fails with ResultTask ClassCastException
> --------------------------------------------------
>
> Key: SPARK-8142
> URL: https://issues.apache.org/jira/browse/SPARK-8142
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.3.1
> Reporter: Dev Lakhani
>
> When running a Spark job, I get no failures in the application code
> whatsoever, but a weird ResultTask ClassCastException. In my job, I create an
> RDD from HBase and, for each partition, make a REST call on an API using a
> REST client. This works in IntelliJ, but when I deploy to a cluster using
> spark-submit.sh I get:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
> stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0
> (TID 3, host): java.lang.ClassCastException:
> org.apache.spark.scheduler.ResultTask cannot be cast to
> org.apache.spark.scheduler.Task
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> These are the configs I set to override the Spark classpath, because I want to
> use my own Glassfish Jersey version:
>
> sparkConf.set("spark.driver.userClassPathFirst","true");
> sparkConf.set("spark.executor.userClassPathFirst","true");
> I see no other warnings or errors in any of the logs.
> Unfortunately I cannot post my code, but please ask me questions that will
> help debug the issue. Using Spark 1.3.1 with Hadoop 2.6.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)