advancedxy commented on code in PR #41168:
URL: https://github.com/apache/spark/pull/41168#discussion_r1194522841
##########
core/src/main/scala/org/apache/spark/SparkConf.scala:
##########
@@ -403,7 +403,7 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable
with Logging with Seria
*/
def getAllWithPrefix(prefix: String): Array[(String, String)] = {
Review Comment:
> We may want to be more explicit about when to substitute and when not to.
Sure and agreed. In my opinion, `getOption` should be substituted which is
consistent with `get(xx_config_entry)`.
`getAll` shall not be substituted, as it's also used to create
spark.properties file when submitted by spark-submit to resource managers, if
substituted, the replaced value may reference the submit client's environment,
which might not work as expected. We may add `private[spark] def
getAllWithSubstitution` method though.
Variants deriving from `getAll`, such as `getAllWithPrefix`, as long as
that's a query only operation, should be substituted, so they are consistent
with get one by one.
> shouldn't that mean the current inconsistent behavior would only be
partially resolved by this PR?
The `spark.hadoop.*` and `spark.hive.*` cases didn't occur to me. I'm not
sure whether they should be substituted or not as they are also passed to
hadoop/hive configuration, which may already have substitution enabled.
For other cases, we may solve the inconsistent behavior case by case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]