martin-g commented on code in PR #2722:
URL: https://github.com/apache/datafusion-comet/pull/2722#discussion_r2502287903
##########
common/src/main/scala/org/apache/comet/CometConf.scala:
##########
@@ -865,6 +861,37 @@ private class TypedConfigBuilder[T](
CometConf.register(conf)
conf
}
+
+ /**
+ * Creates a [[ConfigEntry]] that has a default value, with support for
environment variable
+ * override.
+ *
+ * The value is resolved in the following priority order:
+ * 1. Spark config value (if set) 2. Environment variable value (if set)
3. Default value
+ *
+ * @param envVar
+ * The environment variable name to check for override value
+ * @param default
+ * The default value to use if neither config nor env var is set
+ * @return
+ * A ConfigEntry with environment variable support
+ */
+ def createWithEnvVarOrDefault(envVar: String, default: T): ConfigEntry[T] = {
+ val defaultValue = sys.env.get(envVar).map(converter).getOrElse(default)
Review Comment:
nit: `defaultValue` should be just `value`.
The double usage of `converter` is confusing. Maybe it could be something
like:
```scala
val value = sys.env.getOrElse(envVar, stringConverter(default))
val transformedDefault = converter(value)
```
?
##########
docs/source/user-guide/latest/configs.md:
##########
@@ -127,10 +127,10 @@ These settings can be used to determine which parts of
the plan are accelerated
| Config | Description | Default Value |
|--------|-------------|---------------|
| `spark.comet.columnar.shuffle.memory.factor` | Fraction of Comet memory to
be allocated per executor process for columnar shuffle when running in on-heap
mode. For more information, refer to the [Comet Tuning
Guide](https://datafusion.apache.org/comet/user-guide/tuning.html). | 1.0 |
-| `spark.comet.exec.onHeap.enabled` | Whether to allow Comet to run in on-heap
mode. Required for running Spark SQL tests. | false |
+| `spark.comet.exec.onHeap.enabled` | Whether to allow Comet to run in on-heap
mode. Required for running Spark SQL tests. Can be overridden by environment
variable `ENABLE_COMET_ONHEAP`. | false |
| `spark.comet.exec.onHeap.memoryPool` | The type of memory pool to be used
for Comet native execution when running Spark in on-heap mode. Available pool
types are `greedy`, `fair_spill`, `greedy_task_shared`,
`fair_spill_task_shared`, `greedy_global`, `fair_spill_global`, and
`unbounded`. | greedy_task_shared |
| `spark.comet.memoryOverhead` | The amount of additional memory to be
allocated per executor process for Comet, in MiB, when running Spark in on-heap
mode. | 1024 MiB |
-| `spark.comet.testing.strict` | Experimental option to enable strict testing,
which will fail tests that could be more comprehensive, such as checking for a
specific fallback reason | false |
+| `spark.comet.testing.strict` | Experimental option to enable strict testing,
which will fail tests that could be more comprehensive, such as checking for a
specific fallback reason Can be overridden by environment variable
`ENABLE_COMET_STRICT_TESTING`. | false |
Review Comment:
```suggestion
| `spark.comet.testing.strict` | Experimental option to enable strict
testing, which will fail tests that could be more comprehensive, such as
checking for a specific fallback reason. Can be overridden by environment
variable `ENABLE_COMET_STRICT_TESTING`. | false |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]