Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21734#discussion_r201298533
--- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala ---
@@ -193,8 +193,7 @@ object YarnSparkHadoopUtil {
       sparkConf: SparkConf,
       hadoopConf: Configuration): Set[FileSystem] = {
     val filesystemsToAccess = sparkConf.get(FILESYSTEMS_TO_ACCESS)
-      .map(new Path(_).getFileSystem(hadoopConf))
-      .toSet
+    val isRequestAllDelegationTokens = filesystemsToAccess.isEmpty
--- End diff ---
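
For context, a minimal sketch of how the method around this hunk might read after the change. The enclosing method name and everything past the `isRequestAllDelegationTokens` line are assumptions pieced together from the visible lines, not part of the quoted diff:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.SparkConf
    import org.apache.spark.deploy.yarn.config._

    // Assumed reconstruction around the quoted hunk; FILESYSTEMS_TO_ACCESS is
    // the config entry behind "spark.yarn.access.hadoopFileSystems".
    def hadoopFSsToAccess(
        sparkConf: SparkConf,
        hadoopConf: Configuration): Set[FileSystem] = {
      val filesystemsToAccess = sparkConf.get(FILESYSTEMS_TO_ACCESS)
      val isRequestAllDelegationTokens = filesystemsToAccess.isEmpty

      val hadoopFilesystems = if (isRequestAllDelegationTokens) {
        // Presumably the PR enumerates every reachable namespace here (e.g.
        // HDFS federation nameservices); that logic is outside the quoted hunk.
        Set.empty[FileSystem]
      } else {
        // The mapping removed by the hunk now applies only to the explicit list.
        filesystemsToAccess.map(new Path(_).getFileSystem(hadoopConf)).toSet
      }

      // The default filesystem is always needed, e.g. for the staging directory.
      hadoopFilesystems + FileSystem.get(hadoopConf)
    }
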
`spark.yarn.access.hadoopFileSystems` is not invalid; it is simply needed to
access external clusters, which is what it was created for. Moreover, if you
use viewfs, the same operations are performed under the hood by the Hadoop
code. So this seems to be a more general performance/scalability issue with
the number of namespaces we support.
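
To illustrate the intended use mentioned above, a minimal sketch of pointing the setting at an external cluster (the URI is made up):

    import org.apache.spark.SparkConf

    // A job running on one cluster that also reads a second, secure cluster
    // asks YARN to fetch delegation tokens for that cluster's filesystem up
    // front via this setting.
    val conf = new SparkConf()
      .set("spark.yarn.access.hadoopFileSystems",
        "hdfs://external-nn.example.com:8020")
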