Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21734#discussion_r201298533
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
    @@ -193,8 +193,7 @@ object YarnSparkHadoopUtil {
           sparkConf: SparkConf,
           hadoopConf: Configuration): Set[FileSystem] = {
         val filesystemsToAccess = sparkConf.get(FILESYSTEMS_TO_ACCESS)
    -      .map(new Path(_).getFileSystem(hadoopConf))
    -      .toSet
    +    val isRequestAllDelegationTokens = filesystemsToAccess.isEmpty
    --- End diff --
    
    `spark.yarn.access.hadoopFileSystems` is not invalid, it is just needed to access external clusters, which is what it was created for. Moreover, if you use viewfs, the same operations are performed under the hood by Hadoop code. So this seems to be a more general performance/scalability issue with the number of namespaces we support.
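    
    For reference, a minimal sketch of what resolving this setting amounts to, plus a viewfs mount table that fans out to multiple namespaces. This is illustrative only: the raw config key string and the nn1/nn2 mount entries are assumptions for the example, not part of the patch.
    
        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.{FileSystem, Path}
        import org.apache.spark.SparkConf
    
        // Resolve the comma-separated filesystem list to FileSystem instances,
        // mirroring the .map(new Path(_).getFileSystem(hadoopConf)).toSet above.
        def fsToAccess(sparkConf: SparkConf, hadoopConf: Configuration): Set[FileSystem] =
          sparkConf.get("spark.yarn.access.hadoopFileSystems", "")
            .split(",").map(_.trim).filter(_.nonEmpty)
            .map(new Path(_).getFileSystem(hadoopConf))
            .toSet
    
        // With viewfs, each mount-table link may target a different namespace,
        // so Hadoop contacts multiple NameNodes under the hood anyway.
        val hadoopConf = new Configuration()
        hadoopConf.set("fs.viewfs.mounttable.cluster.link./data", "hdfs://nn1:8020/data")
        hadoopConf.set("fs.viewfs.mounttable.cluster.link./logs", "hdfs://nn2:8020/logs")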


---
