gaborgsomogyi commented on PR #36658: URL: https://github.com/apache/spark/pull/36658#issuecomment-1136868593
Let's take a look at how the configs evolved:
* `spark.yarn.access.namenodes` was introduced
* `spark.yarn.access.hadoopFileSystems` was introduced, so `spark.yarn.access.namenodes` was deprecated
* `spark.kerberos.access.hadoopFileSystems` was introduced, so `spark.yarn.access.hadoopFileSystems` was deprecated

So from my perspective the correct order is:
* `spark.yarn.access.namenodes` must be overridden by `spark.yarn.access.hadoopFileSystems`
* `spark.yarn.access.namenodes` and `spark.yarn.access.hadoopFileSystems` must be overridden by `spark.kerberos.access.hadoopFileSystems`

I understand that it was different previously, but I wouldn't change it for the following reasons:
* If we look at the deprecation history, the current behavior makes sense (newer configs must take precedence). Personally I would consider https://github.com/apache/spark/pull/23698 a fix.
* We're fixing things on the master branch, but there these 2 configs are just deprecated and `spark.kerberos.access.hadoopFileSystems` is the preferred way, so I suggest migrating to that.
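
To make the ordering above concrete, here is a minimal sketch of "newest config wins". It is only an illustration: a plain Python dictionary stands in for the Spark configuration, and `PRECEDENCE` and `resolve_filesystems` are hypothetical names, not Spark APIs. Spark's actual resolution logic may differ in detail.

```python
# Hypothetical illustration only; this is not Spark's implementation.
# Keys are listed from oldest to newest, so later entries take precedence.
PRECEDENCE = [
    "spark.yarn.access.namenodes",
    "spark.yarn.access.hadoopFileSystems",
    "spark.kerberos.access.hadoopFileSystems",
]


def resolve_filesystems(conf):
    """Return the value of the newest key that is set and non-empty, else None."""
    for key in reversed(PRECEDENCE):
        value = conf.get(key)
        if value:
            return value
    return None


# Example values are made up for the sketch.
conf = {
    "spark.yarn.access.namenodes": "hdfs://nn1:8020",
    "spark.kerberos.access.hadoopFileSystems": "hdfs://nn2:8020",
}
print(resolve_filesystems(conf))  # hdfs://nn2:8020
```

With both an old and a new key set, the value under `spark.kerberos.access.hadoopFileSystems` is the one returned, which matches the order argued for above.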
