andygrove opened a new pull request, #1913:
URL: https://github.com/apache/datafusion-ballista/pull/1913
# Which issue does this PR close?
N/A — CI coverage improvement.
# Rationale for this change
The `TPC-H SF10` CI job runs the suite only with AQE off
(`prefer_hash_join=false`). The adaptive (AQE-on) planner is a different planner
with materially different join behavior (it is the path that does dynamic
broadcast join selection), and it has **no CI coverage today**. That means a
change can pass CI while breaking the adaptive path: for example, a
broadcast-under-AQE change can fail a query (a downstream stage requesting a
partition the broadcast side doesn't expose) that the AQE-off run never
exercises.
# What changes are included in this PR?
- Run the TPC-H SF10 suite a second time with
  `-c ballista.planner.adaptive.enabled=true` (AQE on), in addition to the
  existing AQE-off run.
- Both runs use the **same generated data and the same cluster**. AQE on/off is
  a per-query session setting, so the scheduler selects the planner per job;
  there's no need to regenerate data or restart the cluster. The query loop is
  refactored into a `run_suite` shell function called once per mode.
- Job renamed to `TPC-H SF10 (all queries, AQE off + on)`.
No change to data generation, query set (q16 still omitted), partitions (16),
iterations, or cluster shape.
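
For readers who have not opened the diff, the `run_suite` refactor might look roughly like the sketch below. This is an illustration under assumptions, not the workflow's actual script: `BENCH_CMD`, `--query`, and `--partitions` are hypothetical placeholders, and the default `BENCH_CMD` only echoes the command line so the loop can be dry-run without a cluster. Only the `-c ballista.planner.adaptive.enabled=true` setting, the omission of q16, and the partition count of 16 come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the run_suite refactor; not the real CI script.
set -euo pipefail

# Placeholder benchmark command (assumption). The default just echoes the
# arguments, so running this file performs a dry run.
BENCH_CMD="${BENCH_CMD:-echo tpch-benchmark}"

# run_suite LABEL [EXTRA_ARGS...]
# Runs q1..q22 (skipping q16) once, passing EXTRA_ARGS through to every query.
run_suite() {
  local label=$1
  shift
  local q
  for q in $(seq 1 22); do
    # q16 is still omitted, as in the existing job.
    if [ "$q" -eq 16 ]; then
      continue
    fi
    echo "[$label] q$q"
    # --query and --partitions are made-up flag names for illustration.
    $BENCH_CMD --query "$q" --partitions 16 "$@"
  done
}

# Same data, same cluster: only the per-session setting differs between runs.
run_suite "AQE off"
run_suite "AQE on" -c ballista.planner.adaptive.enabled=true
```

Passing the mode-specific settings as trailing arguments keeps the query loop identical for both modes, so a further planner configuration would only need one more `run_suite` call.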
# Are there any user-facing changes?
No. CI-only change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]