sunchao commented on a change in pull request #31761:
URL: https://github.com/apache/spark/pull/31761#discussion_r588972518



##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -691,6 +691,15 @@ package object config {
     .toSequence
     .createWithDefault(Nil)
 
+  private[spark] val KERBEROS_FILESYSTEM_RENEWAL_EXCLUDE =
+    ConfigBuilder("spark.kerberos.renewal.exclude.hadoopFileSystems")
+      .doc("The list of Hadoop filesystem URLs whose hosts will be excluded from " +

Review comment:
       It makes sense for `spark.kerberos.access.hadoopFileSystems` to be URLs since Spark needs to instantiate `FileSystem`s from them. But for this case I'm not sure that's necessary: we can just parse the config into a set of host names and filter the file systems above against it:
   ```scala
   val hostsToExclude = sparkConf.get(KERBEROS_FILESYSTEM_RENEWAL_EXCLUDE).toSet
   filesystems.filter(fs => !hostsToExclude.contains(fs.getUri.getHost)).foreach { fs =>
     ...
   }
   ```
   
   I'm fine either way though, since it also makes sense to keep it consistent with `spark.kerberos.access.hadoopFileSystems`. BTW I think we'll need to update security.md for the new config.
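   To make the host-based filtering idea concrete, here is a self-contained sketch using plain `java.net.URI` as a stand-in for Hadoop `FileSystem` URIs (the host names and config value below are hypothetical, not from the PR):

   ```scala
   import java.net.URI

   object ExcludeByHostSketch {
     def main(args: Array[String]): Unit = {
       // Hypothetical value of spark.kerberos.renewal.exclude.hadoopFileSystems,
       // parsed as bare host names rather than full URLs.
       val hostsToExclude = Set("nn1.example.com")

       // Stand-ins for the FileSystem URIs Spark would renew tokens for.
       val filesystems = Seq(
         new URI("hdfs://nn1.example.com:8020/"),
         new URI("hdfs://nn2.example.com:8020/"))

       // Keep only file systems whose host is not excluded.
       val remaining = filesystems.filterNot(fs => hostsToExclude.contains(fs.getHost))
       remaining.foreach(fs => println(fs.getHost))
     }
   }
   ```

   The upside of this shape is that the config stays a simple host list and no `FileSystem` needs to be instantiated just to decide on exclusion; the downside, as noted, is the inconsistency with the URL-based `spark.kerberos.access.hadoopFileSystems`.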




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
