andygrove opened a new pull request, #3226:
URL: https://github.com/apache/datafusion-comet/pull/3226

   ## Summary
   
   - Add a `get_spark_configs()` method to the base `Benchmark` class for benchmark-specific Spark configurations
   - Define common Comet configs (enabled, logging) in Python for jvm/native modes
   - Add shuffle benchmark variants with and without native parquet writes:
     - `shuffle-hash-native-write`: hash shuffle with Comet native parquet writes enabled
     - `shuffle-hash-spark-write`: hash shuffle with native writes disabled (uses the Spark writer)
     - `shuffle-roundrobin-native-write`: round-robin shuffle with native writes enabled
     - `shuffle-roundrobin-spark-write`: round-robin shuffle with native writes disabled (uses the Spark writer)
   - Add a `--print-configs` CLI option to output benchmark-specific configs
   - Refactor `run_all_benchmarks.sh` to use a helper function and remove duplicated configs
   - Exclude `benchmarks/pyspark/**` from CI test workflows so that benchmark-only changes do not trigger tests
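
For readers skimming the summary, here is a rough sketch of how the `get_spark_configs()` hook and the `--print-configs` output might fit together. It is not the code in this PR: the class names, the `format_configs` helper, the `--conf key=value` output format, and the config keys (in particular `NATIVE_WRITE_KEY`) are assumptions made for illustration only.

```python
from typing import Dict, List

# Assumed key name for the native parquet write switch; illustrative only.
NATIVE_WRITE_KEY = "spark.comet.parquet.write.enabled"


class Benchmark:
    """Hypothetical stand-in for the base benchmark class."""

    name = "base"

    def get_spark_configs(self, mode: str) -> Dict[str, str]:
        # Common Comet settings shared by every benchmark. The keys and
        # the mapping from mode to shuffle mode are assumptions.
        if mode not in ("jvm", "native"):
            raise ValueError(f"unknown mode: {mode}")
        return {
            "spark.comet.enabled": "true",
            "spark.comet.exec.shuffle.mode": mode,
        }


class ShuffleHashNativeWrite(Benchmark):
    name = "shuffle-hash-native-write"

    def get_spark_configs(self, mode: str) -> Dict[str, str]:
        # Start from the common configs, then add the benchmark-specific one.
        configs = super().get_spark_configs(mode)
        configs[NATIVE_WRITE_KEY] = "true"
        return configs


class ShuffleHashSparkWrite(Benchmark):
    name = "shuffle-hash-spark-write"

    def get_spark_configs(self, mode: str) -> Dict[str, str]:
        # Same shuffle, but writes fall back to the Spark writer.
        configs = super().get_spark_configs(mode)
        configs[NATIVE_WRITE_KEY] = "false"
        return configs


def format_configs(configs: Dict[str, str]) -> List[str]:
    # Hypothetical rendering for a --print-configs style option:
    # one "--conf key=value" entry per setting, sorted by key.
    return [f"--conf {key}={value}" for key, value in sorted(configs.items())]


if __name__ == "__main__":
    for line in format_configs(ShuffleHashNativeWrite().get_spark_configs("native")):
        print(line)
```

Keeping the per-benchmark overrides in a method like this would let a shell wrapper ask Python for the configs instead of duplicating them, which is the kind of deduplication the `run_all_benchmarks.sh` refactor describes.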
   
   ## Test plan
   
   - [ ] Run `python run_benchmark.py --list-benchmarks` to verify new benchmarks are registered
   - [ ] Run `python run_benchmark.py --print-configs --benchmark shuffle-hash-native-write --mode native` to verify config output
   - [ ] Run `./run_all_benchmarks.sh` to verify benchmarks execute correctly
   
   🤖 Generated with [Claude Code](https://claude.ai/code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
