advancedxy commented on code in PR #41168:
URL: https://github.com/apache/spark/pull/41168#discussion_r1194522841


##########
core/src/main/scala/org/apache/spark/SparkConf.scala:
##########
@@ -403,7 +403,7 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable 
with Logging with Seria
    */
   def getAllWithPrefix(prefix: String): Array[(String, String)] = {

Review Comment:
   > We may want to be more explicit about when to substitute and when not to. 
   
   Sure and agreed. In my opinion, `getOption` should be substituted which is 
consistent with `get(xx_config_entry)`. 
   
   `getAll` shall not be substituted, as it's also used to create 
spark.properties file when submitted by spark-submit to resource managers, if 
substituted, the replaced value may reference the submit client's environment, 
which might not work as expected. We may add  `private[spark] def 
getAllWithSubstitution` method though.
   
   Variants deriving from `getAll`, such as `getAllWithPrefix`, as long as 
that's a query only operation, should be substituted, so they are consistent 
with get one by one.
   
   > shouldn't that mean the current inconsistent behavior would only be 
partially resolved by this PR?
   
   The `spark.hadoop.*` and `spark.hive.*` cases didn't occur to me. I'm not 
sure whether they should be substituted or not as they are also passed to 
hadoop/hive configuration, which may already have substitution enabled.
   
   For other cases, we may solve the inconsistent behavior case by case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to