Re: [PR] chore: Improve framework for specifying that configs can be set with env vars [datafusion-comet]

via GitHub Fri, 07 Nov 2025 01:24:23 -0800


martin-g commented on code in PR #2722:
URL: https://github.com/apache/datafusion-comet/pull/2722#discussion_r2502287903



##########
common/src/main/scala/org/apache/comet/CometConf.scala:
##########
@@ -865,6 +861,37 @@ private class TypedConfigBuilder[T](
     CometConf.register(conf)
     conf
   }
+
+  /**
+   * Creates a [[ConfigEntry]] that has a default value, with support for 
environment variable
+   * override.
+   *
+   * The value is resolved in the following priority order:
+   *   1. Spark config value (if set) 2. Environment variable value (if set) 
3. Default value
+   *
+   * @param envVar
+   *   The environment variable name to check for override value
+   * @param default
+   *   The default value to use if neither config nor env var is set
+   * @return
+   *   A ConfigEntry with environment variable support
+   */
+  def createWithEnvVarOrDefault(envVar: String, default: T): ConfigEntry[T] = {
+    val defaultValue = sys.env.get(envVar).map(converter).getOrElse(default)

Review Comment:
   nit: `defaultValue` should be just `value`.
   The double usage of `converter` is confusing. Maybe it could be something 
like:
   ```scala
   val value = sys.env.getOrElse(envVar, stringConverter(default))
   val transformedDefault = converter(value)
   ```
   ?



##########
docs/source/user-guide/latest/configs.md:
##########
@@ -127,10 +127,10 @@ These settings can be used to determine which parts of 
the plan are accelerated
 | Config | Description | Default Value |
 |--------|-------------|---------------|
 | `spark.comet.columnar.shuffle.memory.factor` | Fraction of Comet memory to 
be allocated per executor process for columnar shuffle when running in on-heap 
mode. For more information, refer to the [Comet Tuning 
Guide](https://datafusion.apache.org/comet/user-guide/tuning.html). | 1.0 |
-| `spark.comet.exec.onHeap.enabled` | Whether to allow Comet to run in on-heap 
mode. Required for running Spark SQL tests. | false |
+| `spark.comet.exec.onHeap.enabled` | Whether to allow Comet to run in on-heap 
mode. Required for running Spark SQL tests. Can be overridden by environment 
variable `ENABLE_COMET_ONHEAP`. | false |
 | `spark.comet.exec.onHeap.memoryPool` | The type of memory pool to be used 
for Comet native execution when running Spark in on-heap mode. Available pool 
types are `greedy`, `fair_spill`, `greedy_task_shared`, 
`fair_spill_task_shared`, `greedy_global`, `fair_spill_global`, and 
`unbounded`. | greedy_task_shared |
 | `spark.comet.memoryOverhead` | The amount of additional memory to be 
allocated per executor process for Comet, in MiB, when running Spark in on-heap 
mode. | 1024 MiB |
-| `spark.comet.testing.strict` | Experimental option to enable strict testing, 
which will fail tests that could be more comprehensive, such as checking for a 
specific fallback reason | false |
+| `spark.comet.testing.strict` | Experimental option to enable strict testing, 
which will fail tests that could be more comprehensive, such as checking for a 
specific fallback reason Can be overridden by environment variable 
`ENABLE_COMET_STRICT_TESTING`. | false |

Review Comment:
   ```suggestion
   | `spark.comet.testing.strict` | Experimental option to enable strict 
testing, which will fail tests that could be more comprehensive, such as 
checking for a specific fallback reason. Can be overridden by environment 
variable `ENABLE_COMET_STRICT_TESTING`. | false |
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] chore: Improve framework for specifying that configs can be set with env vars [datafusion-comet]

Reply via email to