sunchao commented on a change in pull request #31761:
URL: https://github.com/apache/spark/pull/31761#discussion_r589881790
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -691,6 +691,15 @@ package object config {
.toSequence
.createWithDefault(Nil)
+ private[spark] val KERBEROS_FILESYSTEM_RENEWAL_EXCLUDE =
+ ConfigBuilder("spark.kerberos.renewal.exclude.hadoopFileSystems")
.doc("The list of Hadoop filesystem URLs whose hosts will be excluded from " +
Review comment:
It's hard to imagine this applying to defaultFS/stagingDir, since that would
mean the YARN cluster (or other type of cluster) is not configured to be in
the same Kerberos realm as those filesystems, which could cause other, more
serious issues. But, just as with `spark.kerberos.access.hadoopFileSystems`,
I guess nothing stops users from putting the host for defaultFS/stagingDir in
this config as well, and Spark will just do what it's told to do.
##########
File path: core/src/main/scala/org/apache/spark/deploy/security/HadoopFSDelegationTokenProvider.scala
##########
@@ -99,11 +100,24 @@ private[deploy] class HadoopFSDelegationTokenProvider
private def fetchDelegationTokens(
renewer: String,
filesystems: Set[FileSystem],
- creds: Credentials): Credentials = {
+ creds: Credentials,
+ hadoopConf: Configuration,
+ sparkConf: SparkConf): Credentials = {
+
+ // The hosts on which the file systems to be excluded from token renewal
+ val fsToExclude = sparkConf.get(KERBEROS_FILESYSTEM_RENEWAL_EXCLUDE)
+ .map(new Path(_).getFileSystem(hadoopConf).getUri.getHost)
+ .toSet
filesystems.foreach { fs =>
- logInfo(s"getting token for: $fs with renewer $renewer")
- fs.addDelegationTokens(renewer, creds)
+ if (fsToExclude.contains(fs.getUri.getHost)) {
+ // RM skips renewing token with empty renewer
Review comment:
Hmm, does this only apply to YARN, though? It seems Spark has its own
`HadoopDelegationTokenManager`, which is separate from YARN and has its own
renewal logic as well, so I'm not sure we even need to use "" as the renewer.
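
For reference, the host-matching logic in this hunk boils down to roughly the
sketch below. It is only an illustration under stated assumptions: it uses
plain `java.net.URI` instead of Hadoop's `FileSystem`, and `renewerFor`, the
host names, and the `yarn` renewer value are hypothetical. Whether an empty
renewer is needed at all outside YARN is the open question above.

```scala
import java.net.URI

object RenewalExcludeSketch {
  // Hypothetical helper (not part of the PR): pick the renewer for one
  // filesystem URI. Filesystems on an excluded host get an empty renewer.
  def renewerFor(fsUri: URI, excludedHosts: Set[String], renewer: String): String =
    if (Option(fsUri.getHost).exists(excludedHosts.contains)) "" else renewer

  def main(args: Array[String]): Unit = {
    // Stand-in for the value of the new exclude config; URLs are made up.
    val excludeUrls = Seq("hdfs://nn-remote.example.com:8020")

    // Reduce each URL to its host, dropping URLs that carry no host
    // (for example "file:///"), where getHost returns null.
    val excludedHosts = excludeUrls.flatMap(u => Option(new URI(u).getHost)).toSet

    val filesystems = Seq(
      new URI("hdfs://nn-local.example.com:8020"),
      new URI("hdfs://nn-remote.example.com:8020"))

    filesystems.foreach { uri =>
      val r = renewerFor(uri, excludedHosts, "yarn")
      println(s"$uri -> renewer '$r'")
    }
  }
}
```

Note that the match is on host only, so two filesystems on the same host but
different ports or schemes would be treated the same way.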