GitHub user caneGuy opened a pull request:

    https://github.com/apache/spark/pull/18901

    [SPARK-21689][YARN] Download user jar from remote in case get hadoop token failed
    
    ## What changes were proposed in this pull request?
    
    When using yarn cluster mode with a job that needs to scan HBase, there is a case which cannot work:
    if we put the user jar on HDFS, the local classpath will have no HBase classes, so obtaining the HBase token fails. Later, when the job is submitted to YARN, it fails since it has no token to access the HBase table. I mocked three cases:
    1: user jar is on the local classpath and contains HBase
    `17/08/10 13:48:03 INFO security.HadoopFSDelegationTokenProvider: Renewal 
interval is 86400050 for token HDFS_DELEGATION_TOKEN
    17/08/10 13:48:03 INFO security.HadoopDelegationTokenManager: Service hive
    17/08/10 13:48:03 INFO security.HadoopDelegationTokenManager: Service hbase
    17/08/10 13:48:05 INFO security.HBaseDelegationTokenProvider: Attempting to 
fetch HBase security token.`
    
    The logs show that the token is obtained normally.
    
    
    2: user jar is on HDFS
    `17/08/10 13:43:58 WARN security.HBaseDelegationTokenProvider: Class 
org.apache.hadoop.hbase.HBaseConfiguration not found.
    17/08/10 13:43:58 INFO security.HBaseDelegationTokenProvider: Failed to get 
token from service hbase
    java.lang.ClassNotFoundException: 
org.apache.hadoop.hbase.security.token.TokenUtil
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at 
org.apache.spark.deploy.security.HBaseDelegationTokenProvider.obtainDelegationTokens(HBaseDelegationTokenProvider.scala:41)
        at 
org.apache.spark.deploy.security.HadoopDelegationTokenManager$$anonfun$obtainDelegationTokens$2.apply(HadoopDelegationTokenManager.scala:112)
        at 
org.apache.spark.deploy.security.HadoopDelegationTokenManager$$anonfun$obtainDelegationTokens$2.apply(HadoopDelegationTokenManager.scala:109)
        at 
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at 
scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)`
    
    
    The logs show that token retrieval failed with a ClassNotFoundException.
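    The failure above happens because the HBase classes are looked up reflectively at token-fetch time; when the user jar (which bundles HBase) lives only on HDFS, the reflective lookup on the local submit-side classpath throws. A minimal sketch of that pattern, assuming nothing beyond the class name visible in the stack trace (the boolean helper is purely illustrative, not Spark's API):

    ```java
    // Sketch of the reflective lookup that fails in case 2. The class name is
    // the one from the stack trace; the rest is a hypothetical illustration.
    public class TokenUtilLookup {
        static boolean hbaseOnClasspath() {
            try {
                // Same class the stack trace shows being loaded.
                Class.forName("org.apache.hadoop.hbase.security.token.TokenUtil");
                return true;
            } catch (ClassNotFoundException e) {
                // Mirrors the "Failed to get token from service hbase" path:
                // without the user jar locally, this branch is taken.
                return false;
            }
        }

        public static void main(String[] args) {
            System.out.println(hbaseOnClasspath());
        }
    }
    ```

    On a machine without HBase jars on the classpath, the lookup returns false, matching the WARN/INFO lines in the log above.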
    
    If we download the user jar from remote storage first, things work correctly. So this patch downloads the user jar from remote storage when running in yarn cluster mode.
    
    
    ## How was this patch tested?
    
    Manually tested by executing spark-submit scripts with different user jars.
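    For reference, the two submission variants exercised by the mocked cases look roughly like this; the application class and jar paths are hypothetical placeholders, not taken from the patch:

    ```shell
    # Case 1: user jar on the local filesystem. The HBase classes end up on the
    # submit-side classpath, so the HBase token can be fetched.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.HBaseScanJob \
      /path/to/app-with-hbase.jar

    # Case 2: user jar on HDFS. Before this patch, the submit-side classpath has
    # no HBase classes, so token acquisition fails with ClassNotFoundException.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class com.example.HBaseScanJob \
      hdfs:///user/apps/app-with-hbase.jar
    ```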


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/caneGuy/spark zhoukang/download-userjar

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18901.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18901
    
----
commit 31fb394f983313c2ee767bf68220041fa6c84b2e
Author: zhoukang <zhoukang199...@gmail.com>
Date:   2017-08-09T10:42:43Z

    [SPARK][YARN] Download user jar from remote in case get hadoop token failed

----


