Terence Yim created SPARK-13441:
-----------------------------------

             Summary: NullPointerException when either HADOOP_CONF_DIR or 
YARN_CONF_DIR is not readable
                 Key: SPARK-13441
                 URL: https://issues.apache.org/jira/browse/SPARK-13441
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.6.0, 1.5.1, 1.4.1
            Reporter: Terence Yim


An NPE is thrown from the YARN Client.scala because {{File.listFiles()}} can 
return null for a directory that the current user does not have permission to 
list. This is the code fragment in question:

{noformat}
// In org/apache/spark/deploy/yarn/Client.scala
    Seq("HADOOP_CONF_DIR", "YARN_CONF_DIR").foreach { envKey =>
      sys.env.get(envKey).foreach { path =>
        val dir = new File(path)
        if (dir.isDirectory()) {
          // dir.listFiles() can return null
          dir.listFiles().foreach { file =>
            if (file.isFile && !hadoopConfFiles.contains(file.getName())) {
              hadoopConfFiles(file.getName()) = file
            }
          }
        }
      }
    }
{noformat}
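
One possible way to guard against the null return, shown only as a minimal sketch and not as the fix that was or will be applied in Spark, is to wrap the result of {{File.listFiles()}} in an {{Option}}. The names {{ConfDirScan}}, {{safeListFiles}} and {{collectConfFiles}} are hypothetical and exist only for this illustration; the environment is passed in as a parameter (instead of reading {{sys.env}}) so the sketch can be run on its own.

```scala
import java.io.File
import scala.collection.mutable

object ConfDirScan {
  // Hypothetical helper (not part of Spark): File.listFiles() returns null
  // when the path is not a directory or cannot be listed, so treat a null
  // result as "no entries" instead of dereferencing it.
  def safeListFiles(dir: File): Seq[File] =
    Option(dir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)

  // Same loop as the fragment from Client.scala, with the null-safe listing.
  // The environment is a parameter here purely for illustration.
  def collectConfFiles(env: Map[String, String]): mutable.HashMap[String, File] = {
    val hadoopConfFiles = new mutable.HashMap[String, File]()
    Seq("HADOOP_CONF_DIR", "YARN_CONF_DIR").foreach { envKey =>
      env.get(envKey).foreach { path =>
        val dir = new File(path)
        if (dir.isDirectory()) {
          safeListFiles(dir).foreach { file =>
            if (file.isFile && !hadoopConfFiles.contains(file.getName())) {
              hadoopConfFiles(file.getName()) = file
            }
          }
        }
      }
    }
    hadoopConfFiles
  }
}
```

With this shape, an unreadable directory is skipped silently; whether it would be better to log a warning or fail with a clearer error message is a separate design decision.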

To reproduce, simply do:

{noformat}
sudo mkdir /tmp/conf
sudo chmod 700 /tmp/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf
export YARN_CONF_DIR=/tmp/conf
spark-submit --master yarn-client SimpleApp.py
{noformat}

It fails on any Spark app. Though not important, the SimpleApp.py I used looks 
like this:

{noformat}
from pyspark import SparkContext

sc = SparkContext(None, "Simple App")

data = [1, 2, 3, 4, 5]
distData = sc.parallelize(data)

total = distData.reduce(lambda a, b: a + b)

print("Total: %i" % total)
{noformat}


