Github user mikhaildubkov commented on the pull request:

    https://github.com/apache/spark/pull/12678#issuecomment-215239721
  
    @tgravescs,
    
    I have tried to use `--jars`, `--files`, and so on, but the main reason we 
have to use "spark.executor.extraClassPath" is that we need to add the jar to 
the **CoarseGrainedExecutorBackend** classpath.
    That is because our project uses a **custom Spark serializer**, and 
Spark instantiates the serializer before loading anything passed via `--jars`/`--files`.
    Here is the stack trace we get without 
"spark.executor.extraClassPath":
    
    ```
    Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1672)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:151)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:253)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
    Caused by: java.lang.ClassNotFoundException: com.exeample.CustomSparkSerializer
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:174)
        at org.apache.spark.SparkEnv$.instantiateClass$1(SparkEnv.scala:286)
        at org.apache.spark.SparkEnv$.instantiateClassFromConf$1(SparkEnv.scala:307)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:310)
        at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:217)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:186)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        ... 4 more
    ```
    
    As you can see, the serializer is instantiated while the Spark executor env 
is created. The only option I found to put the jar in the right place is 
"spark.executor.extraClassPath".
    I'll try the options you mentioned one more time, but I believe I have 
already tried all of them.
    The root cause of why they don't work for me is timing: the serializer 
class is instantiated before `--jars`/`--files` are on the classpath.
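    For clarity, here is roughly how we submit it today (jar and class names 
are placeholders, and the YARN `--jars` trick is just one possible way to get 
the file onto the executor nodes):
    
    ```shell
    # Sketch only: "custom-serializer.jar" and "com.example.*" are placeholders.
    # spark.executor.extraClassPath does NOT ship the jar; the path must already
    # be valid on every executor node. On YARN, distributing the jar with --jars
    # puts it in the container's working directory, so it can then be referenced
    # by bare file name on the extra classpath:
    spark-submit \
      --class com.example.Main \
      --master yarn \
      --jars custom-serializer.jar \
      --conf spark.serializer=com.example.CustomSparkSerializer \
      --conf spark.executor.extraClassPath=custom-serializer.jar \
      app.jar
    ```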
    
    What do you think about it?
    
    Thank you!


