andygrove opened a new pull request, #4405:
URL: https://github.com/apache/datafusion-comet/pull/4405

   ## Which issue does this PR close?
   
   N/A. This adds local developer tooling and has no associated issue.
   
   ## Rationale for this change
   
   The `spark_sql_test.yml` workflow runs Apache Spark's own SQL test suites 
with Comet enabled, but there is no convenient way to reproduce that run on a 
developer machine. Debugging a Spark SQL test failure currently means 
reconstructing the steps by hand: clone Spark at a version tag, apply the Comet 
diff, build Comet, and run the right `build/sbt` shard with the right 
environment.
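
As a rough illustration of the first two manual steps (clone Spark at a tag, apply the Comet diff), a sketch might look like the following. The `run` and `prepare_spark` helpers, the `DRY_RUN` switch, and the `v4.1.1` tag name are assumptions made for this example and are not taken from the PR. The Comet build and the `build/sbt` invocation are deliberately left out because their exact commands live in `spark_sql_test.yml`.

```shell
# Illustrative sketch only; helper names and DRY_RUN are hypothetical.

# Print the command instead of executing it when DRY_RUN is non-empty.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# prepare_spark TAG DIFF DEST
#   TAG  - Spark git tag to check out (assumed to look like v4.1.1)
#   DIFF - absolute path to the Comet diff, e.g. .../dev/diffs/4.1.1.diff
#   DEST - directory to clone Spark into
prepare_spark() {
  local tag="$1" diff="$2" dest="$3"
  run git clone --depth 1 --branch "$tag" https://github.com/apache/spark.git "$dest"
  # DIFF must be absolute because `git -C` changes directory first.
  run git -C "$dest" apply "$diff"
}
```

Calling `DRY_RUN=1 prepare_spark v4.1.1 /abs/path/4.1.1.diff /tmp/spark` prints the two git commands without touching the network.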
   
   ## What changes are included in this PR?
   
   New bash scripts under `dev/ci/spark-sql-tests/` that reproduce the 
`spark_sql_test.yml` workflow locally for Apache Spark 4.1:
   
   - `config.sh`: shared configuration and the seven CI module-shard 
definitions, copied from `spark_sql_test.yml`.
   - `setup-spark.sh`: maintains a persistent `apache/spark` checkout and 
applies `dev/diffs/4.1.1.diff`, preserving Spark's build artifacts across runs.
   - `run.sh`: builds Comet, runs the selected module shard(s) with `build/sbt` 
using the same environment as CI, and prints a PASS/FAIL summary. Supports 
`SKIP_BUILD` and `SKIP_SPARK_SETUP` for fast iteration.
   - `README.md`: usage, prerequisites, and environment variables.
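
The PASS/FAIL summary and the `SKIP_BUILD` shortcut described for `run.sh` could be sketched roughly as below. This is not the script from the PR: `run_shard`, `run_all`, and the `SHARD_CMD` hook are hypothetical names, the build step is a placeholder, and treating any non-empty `SKIP_BUILD` value as "skip" is an assumption.

```shell
# Sketch under stated assumptions; the real run.sh may differ.

# Hypothetical stand-in for one `build/sbt` shard invocation.
# SHARD_CMD names the command to run; it defaults to `false`
# so that an unconfigured shard is reported as a failure.
run_shard() {
  local name="$1"
  "${SHARD_CMD:-false}" "$name"
}

# run_all SHARD...
# Runs each shard, prints one PASS/FAIL line per shard,
# and returns non-zero if any shard failed.
run_all() {
  local failed=0 name
  local -a results=()
  if [ -z "${SKIP_BUILD:-}" ]; then
    echo "building Comet (placeholder for the real build step)"
  fi
  for name in "$@"; do
    if run_shard "$name"; then
      results+=("PASS $name")
    else
      results+=("FAIL $name")
      failed=1
    fi
  done
  printf '%s\n' "${results[@]}"
  return "$failed"
}
```

Collecting results and printing them at the end keeps the summary readable even when individual shards produce a lot of output.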
   
   Only Spark 4.1 is supported for now; the scripts are structured so a later 
change can parameterize the version.
   
   ## How are these changes tested?
   
   These scripts orchestrate a multi-hour external test run, so they are not 
exercised end-to-end in CI. They were verified with `bash -n` and 
`shellcheck -x` (both clean), and with smoke tests of `run.sh` argument 
handling (`--help`, unknown-module rejection). The module definitions and 
`build/sbt` arguments match `spark_sql_test.yml` exactly.
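
The static checks mentioned above can be repeated with a small loop like this one. The `check_scripts` function is a hypothetical helper for illustration, not something added by the PR, and it skips `shellcheck` when that tool is not installed.

```shell
# Hypothetical helper: syntax-check every *.sh file in a directory.
# Returns non-zero if any script fails `bash -n` or `shellcheck -x`.
check_scripts() {
  local dir="$1" status=0 f
  for f in "$dir"/*.sh; do
    [ -e "$f" ] || continue          # directory has no *.sh files
    bash -n "$f" || status=1         # parse only, do not execute
    if command -v shellcheck >/dev/null 2>&1; then
      shellcheck -x "$f" || status=1 # -x follows sourced files
    fi
  done
  return "$status"
}
```

For this PR the call would presumably be `check_scripts dev/ci/spark-sql-tests`, run from the repository root.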


