[jira] [Created] (SPARK-20608) Standby namenodes should be allowed to included in yarn.spark.access.namenodes to support HDFS HA

Yuechen Chen (JIRA) Thu, 04 May 2017 23:12:47 -0700

Yuechen Chen created SPARK-20608:
------------------------------------

             Summary: Standby namenodes should be allowed to included in 
yarn.spark.access.namenodes to support HDFS HA
                 Key: SPARK-20608
                 URL: https://issues.apache.org/jira/browse/SPARK-20608
             Project: Spark
          Issue Type: Bug
          Components: Spark Submit, YARN
    Affects Versions: 2.1.0, 2.0.1
            Reporter: Yuechen Chen



If a Spark application needs to access remote namenodes, 
${yarn.spark.access.namenodes} only has to be configured in the spark-submit 
script, and the Spark client (on YARN) fetches HDFS credentials periodically.
If a Hadoop cluster is configured for HA, there is one active namenode 
and at least one standby namenode. 
However, if ${yarn.spark.access.namenodes} includes both active and standby 
namenodes, the Spark application fails, because Spark cannot access the 
standby namenode and gets org.apache.hadoop.ipc.StandbyException.
I think configuring standby namenodes in ${yarn.spark.access.namenodes} would 
cause no harm, and it would let my Spark application survive a failover of 
the Hadoop namenode.
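
One way to read the request above is that credential fetching could skip a 
namenode that answers with StandbyException instead of failing the whole 
application. The following is only a minimal sketch of that idea, not Spark's 
actual code: StandbyError, obtain_tokens and fetch_token are hypothetical 
names introduced here for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class StandbyError(Exception):
    """Hypothetical stand-in for org.apache.hadoop.ipc.StandbyException."""


def obtain_tokens(
    namenodes: Iterable[str],
    fetch_token: Callable[[str], str],
) -> Tuple[Dict[str, str], List[str]]:
    """Fetch a token from every configured namenode, skipping standbys.

    fetch_token is an assumed callback that returns a token for one
    namenode URI or raises StandbyError when that namenode is in standby.
    """
    tokens: Dict[str, str] = {}
    skipped: List[str] = []
    for uri in namenodes:
        try:
            tokens[uri] = fetch_token(uri)
        except StandbyError:
            # A standby namenode is expected under HA: note it and move on
            # rather than aborting the whole submission.
            skipped.append(uri)
    return tokens, skipped
```

Any other exception still propagates in this sketch, so a genuinely broken 
namenode would not be silently ignored.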

HA Examples:
spark-submit script: 
yarn.spark.access.namenodes=hdfs://namenode01,hdfs://namenode02
Spark Application:
dataframe.write.parquet(getActiveNameNode(...) + hdfsPath)
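
The getActiveNameNode helper in the example is not defined in this report. A 
minimal sketch of what such a helper might look like is given below, assuming 
the caller supplies a probe that reports whether a namenode is currently 
active. The names get_active_namenode and is_active are hypothetical, and how 
the probe talks to HDFS is deliberately left out.

```python
from typing import Callable, Iterable, Optional


def get_active_namenode(
    candidates: Iterable[str],
    is_active: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate URI whose probe reports it as active.

    is_active is an assumed callback; it may raise OSError (for example
    ConnectionError) when a namenode cannot be reached at all.
    """
    for uri in candidates:
        try:
            if is_active(uri):
                return uri
        except OSError:
            # An unreachable namenode is treated like a standby one.
            continue
    # No candidate is active right now, e.g. in the middle of a failover.
    return None
```

With the two namenodes from the example, the application would call 
get_active_namenode(["hdfs://namenode01", "hdfs://namenode02"], probe) before 
each write and retry later if it returns None.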




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

