Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/969#discussion_r13767184
--- Diff:
yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala
---
@@ -45,6 +44,12 @@ class ClientArguments(val args: Array[String], val sparkConf: SparkConf) {
parseArgs(args.toList)
+ files = Option(files).getOrElse(sparkConf.getOption("spark.yarn.dist.files").orNull)
+ files = Option(files).map(p => Utils.resolveURIs(p)).orNull
--- End diff --
Thanks @witgo, you are right; I actually had it backwards. Because of this
line, paths always default to file://. When we use the env variable for
yarn-client, or when files are specified via the --archives/--files options
with spark-class, paths should default to HDFS. That is the previous behavior
from 0.9.

So we should change this to apply map(p => Utils.resolveURIs(p)) only when
the value is actually read from spark.yarn.dist.files. If it is specified via
--archives/--files, we shouldn't resolve the URIs.
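
A rough sketch of the change suggested above, not the actual patch. To stay self-contained it does not call Spark: `resolveLocal` is a hypothetical stand-in for `Utils.resolveURIs`, and `effectiveFiles`, `cliFiles`, and `confFiles` are made-up names for illustration. The idea is that only the value read from spark.yarn.dist.files gets resolved, while a value passed via --files is left exactly as given.

```scala
import java.io.File
import java.net.URI

object DistFilesSketch {
  // Hypothetical stand-in for Utils.resolveURIs: any comma-separated path
  // without a scheme is turned into an absolute file: URI; paths that
  // already carry a scheme (hdfs:, file:, ...) are left unchanged.
  def resolveLocal(paths: String): String =
    paths.split(",").map(_.trim).filter(_.nonEmpty).map { p =>
      if (new URI(p).getScheme != null) p
      else new File(p).getAbsoluteFile.toURI.toString
    }.mkString(",")

  // cliFiles:  value given via --files (null when the option was not passed)
  // confFiles: value of spark.yarn.dist.files, if set
  // The command-line value wins and is NOT resolved; only the config
  // value goes through resolveLocal.
  def effectiveFiles(cliFiles: String, confFiles: Option[String]): Option[String] =
    Option(cliFiles).orElse(confFiles.map(resolveLocal))
}
```

With this shape, `effectiveFiles("a.txt", Some("/tmp/b.txt"))` keeps `a.txt` untouched so that it can later default to HDFS, whereas `effectiveFiles(null, Some("/tmp/b.txt"))` yields a file: URI. The same split would apply to --archives and its corresponding config key.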