Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21734#discussion_r201298533
  
    --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
    @@ -193,8 +193,7 @@ object YarnSparkHadoopUtil {
           sparkConf: SparkConf,
           hadoopConf: Configuration): Set[FileSystem] = {
         val filesystemsToAccess = sparkConf.get(FILESYSTEMS_TO_ACCESS)
    -      .map(new Path(_).getFileSystem(hadoopConf))
    -      .toSet
    +    val isRequestAllDelegationTokens = filesystemsToAccess.isEmpty
    --- End diff --
    
    `spark.yarn.access.hadoopFileSystems` is not invalid, it is just needed to access external clusters, which is what it was created for. Moreover, if you use viewfs, the same operations are performed under the hood by Hadoop code. So this seems to be a more general performance/scalability issue with the number of namespaces we support.
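    
    For reference, a minimal sketch of what resolving this setting amounts to, plus a viewfs mount table that fans out to multiple namespaces. This is illustrative only: the raw config key string and the nn1/nn2 mount entries are assumptions for the example, not part of the patch.
    
        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.{FileSystem, Path}
        import org.apache.spark.SparkConf
    
        // Resolve the comma-separated filesystem list to FileSystem instances,
        // mirroring the .map(new Path(_).getFileSystem(hadoopConf)).toSet above.
        def fsToAccess(sparkConf: SparkConf, hadoopConf: Configuration): Set[FileSystem] =
          sparkConf.get("spark.yarn.access.hadoopFileSystems", "")
            .split(",").map(_.trim).filter(_.nonEmpty)
            .map(new Path(_).getFileSystem(hadoopConf))
            .toSet
    
        // With viewfs, each mount-table link may target a different namespace,
        // so Hadoop contacts multiple NameNodes under the hood anyway.
        val hadoopConf = new Configuration()
        hadoopConf.set("fs.viewfs.mounttable.cluster.link./data", "hdfs://nn1:8020/data")
        hadoopConf.set("fs.viewfs.mounttable.cluster.link./logs", "hdfs://nn2:8020/logs")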


---
