[ https://issues.apache.org/jira/browse/SPARK-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216243#comment-15216243 ]
Thomas Graves commented on SPARK-12800:
---------------------------------------
You are talking about launching a job using org.apache.spark.deploy.yarn.Client
directly, correct? If so, we don't officially support that. I realize it isn't
currently private and some people are using it, but in 2.0 it will be made
private.
> Subtle bug on Spark Yarn Client under Kerberos Security Mode
> ------------------------------------------------------------
>
> Key: SPARK-12800
> URL: https://issues.apache.org/jira/browse/SPARK-12800
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.5.1, 1.5.2
> Reporter: Chester
>
> Version used: Spark 1.5.1 (1.5.2-SNAPSHOT)
> Deployment Mode: Yarn-Cluster
> Problem observed:
> When running a Spark job directly via the YARN Client (without using
> spark-submit; I did not verify whether spark-submit has the same issue)
> with Kerberos security enabled, the first run of the Spark job always
> fails. It fails because Hadoop considers the job to be in SIMPLE mode
> rather than Kerberos mode. But if the same job is run again without
> shutting down the JVM, it passes. If one restarts the JVM, the Spark job
> fails again.
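> For concreteness, here is a minimal sketch of what launching the job
> directly through the YARN client looks like; the ClientArguments/Client
> constructors are assumed from the Spark 1.5 code base, and the
> application jar and main class are hypothetical:
> {code}
> import org.apache.hadoop.conf.Configuration
> import org.apache.spark.SparkConf
> import org.apache.spark.deploy.yarn.{Client, ClientArguments}
>
> // Required so SparkHadoopUtil resolves its YARN variant via reflection
> // (see the SparkHadoopUtil code quoted below).
> System.setProperty("SPARK_YARN_MODE", "true")
>
> // Hypothetical application jar and main class, for illustration only.
> val args = Array(
>   "--class", "com.example.MyApp",
>   "--jar", "/path/to/my-app.jar")
>
> val sparkConf = new SparkConf()
> // Loads core-site.xml etc. from the classpath, including the
> // hadoop.security.authentication=kerberos setting on a secure cluster.
> val hadoopConf = new Configuration()
>
> val client = new Client(new ClientArguments(args, sparkConf), hadoopConf, sparkConf)
> client.run()  // first run fails under Kerberos, as described above
> {code}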
> The cause:
> Tracking down the source of the issue, I found that the problem seems to
> lie in Spark's YARN Client.scala. In the prepareLocalResources() method
> (around line 266 of Client.scala), the following line is called:
> {code}
> YarnSparkHadoopUtil.get.obtainTokensForNamenodes(nns, hadoopConf, credentials)
> {code}
> YarnSparkHadoopUtil.get is in turn initialized via reflection:
> {code}
> object SparkHadoopUtil {
>   private val hadoop = {
>     val yarnMode = java.lang.Boolean.valueOf(
>       System.getProperty("SPARK_YARN_MODE", System.getenv("SPARK_YARN_MODE")))
>     if (yarnMode) {
>       try {
>         Utils.classForName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
>           .newInstance()
>           .asInstanceOf[SparkHadoopUtil]
>       } catch {
>         case e: Exception => throw new SparkException("Unable to load YARN support", e)
>       }
>     } else {
>       new SparkHadoopUtil
>     }
>   }
>
>   def get: SparkHadoopUtil = {
>     hadoop
>   }
> }
>
> class SparkHadoopUtil extends Logging {
>   private val sparkConf = new SparkConf()
>   val conf: Configuration = newConfiguration(sparkConf)
>   UserGroupInformation.setConfiguration(conf)
>   // ... rest of the class
> }
> {code}
> Here SparkHadoopUtil creates an empty SparkConf, builds a Hadoop
> Configuration from it, and sets that configuration on
> UserGroupInformation:
> {code}
> UserGroupInformation.setConfiguration(conf)
> {code}
> Since UserGroupInformation's authentication method is held in static
> state, the call above wipes out the security settings:
> UserGroupInformation.isSecurityEnabled() changes from true to false, and
> the subsequent token-related calls fail. And because
> SparkHadoopUtil.hadoop is a static, immutable value, it is not created
> again on the next run, so UserGroupInformation.setConfiguration(conf) is
> not called again, and subsequent Spark jobs work.
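> The global nature of this state can be seen with a small standalone
> sketch (stock Hadoop APIs only; the configuration values are
> illustrative):
> {code}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.security.UserGroupInformation
>
> // A configuration with Kerberos enabled, as after loading a secure
> // cluster's core-site.xml.
> val secureConf = new Configuration()
> secureConf.set("hadoop.security.authentication", "kerberos")
> UserGroupInformation.setConfiguration(secureConf)
> println(UserGroupInformation.isSecurityEnabled())  // true
>
> // What SparkHadoopUtil's constructor effectively does: a Configuration
> // carrying no kerberos setting defaults to SIMPLE authentication and
> // silently resets the JVM-wide state.
> UserGroupInformation.setConfiguration(new Configuration(false))
> println(UserGroupInformation.isSecurityEnabled())  // false
> {code}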
> The workaround:
> {code}
> // First initialize SparkHadoopUtil, which creates the static instance
> // and thereby points UserGroupInformation at an empty Hadoop
> // Configuration; we need to reset UserGroupInformation afterwards.
> val util = SparkHadoopUtil.get
> UserGroupInformation.setConfiguration(hadoopConf)
> {code}
> Then call client.run().
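> Combined with the hypothetical launcher sketch above, the full submission
> path with the workaround applied would look roughly like this:
> {code}
> import org.apache.hadoop.security.UserGroupInformation
> import org.apache.spark.deploy.SparkHadoopUtil
>
> // Trigger the static initialization up front...
> val util = SparkHadoopUtil.get
> // ...then restore the real, Kerberos-enabled configuration before any
> // token acquisition happens inside Client.prepareLocalResources().
> UserGroupInformation.setConfiguration(hadoopConf)
>
> // args, sparkConf and hadoopConf as in the launcher sketch above.
> val client = new Client(new ClientArguments(args, sparkConf), hadoopConf, sparkConf)
> client.run()  // now sees isSecurityEnabled() == true on the first run
> {code}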