crm26 opened a new issue, #22543:
URL: https://github.com/apache/datafusion/issues/22543

   Several insta snapshot tests in `datafusion/core/tests/physical_optimizer/` 
capture `RepartitionExec: partitioning=RoundRobinBatch(N), input_partitions=M` 
where `N` is the host's CPU core count. On hosts whose core count differs from 
the snapshot environment, the assertions fail despite the optimizer behavior 
being correct.
   
   ### Reproduction
   
   On a 24-core / 24-thread machine, against current `main` (HEAD `2453bec66`):
   
   ```bash
   cargo test -p datafusion --test core_integration \
     physical_optimizer::ensure_requirements::test_filter_over_multi_partition_sort_limit
   ```
   
   Fails with:
   
   ```
   -        MockMultiPartitionExec
   +        RepartitionExec: partitioning=RoundRobinBatch(24), input_partitions=16
   ```
   
   ### Other examples of the same class
   
   - `test_filter_over_multi_partition_sort_limit`, added in #21976 (2026-05-24)
   - `explain_analyze.slt:103` (`output_rows_skew`, added in #21211): same root cause, manifests in SLT
   
   ### Workaround
   
   Setting `DATAFUSION_EXECUTION_TARGET_PARTITIONS=4` (or whatever count the 
snapshot was captured against) makes the affected tests pass.
   
   ### Suggested fix
   
   Either:
   
   1. Pin a fixed `target_partitions` in the test's `SessionConfig` so the 
captured plan doesn't inherit the host's CPU count, or
   2. Add a CPU-count guard / `#[ignore]` for environments where 
`target_partitions` exceeds the value the snapshot was captured against.
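
A minimal sketch of why approach (1) makes the snapshot host-independent. It uses only the Rust standard library and is a model, not DataFusion code: `host_default_partitions`, `effective_partitions`, and `repartition_line` are hypothetical helpers, and the sketch assumes the unpinned default follows the host's available parallelism, as described above. In the real test the pin would presumably go through `SessionConfig::new().with_target_partitions(4)`.

```rust
use std::thread;

/// Partition count when nothing is pinned: follows the host's available
/// parallelism (assumption: this mirrors the default described in the issue).
fn host_default_partitions() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Hypothetical helper: an explicitly pinned count wins over the host default.
fn effective_partitions(pinned: Option<usize>) -> usize {
    pinned.unwrap_or_else(host_default_partitions)
}

/// Renders the plan line that the snapshots capture (format copied from the
/// failing assertion in this issue).
fn repartition_line(target: usize, input: usize) -> String {
    format!("RepartitionExec: partitioning=RoundRobinBatch({target}), input_partitions={input}")
}
```

With `effective_partitions(Some(4))` the rendered line is the same on every machine, whereas `effective_partitions(None)` yields 24 on the 24-core host from the reproduction and some other value elsewhere, which is the mismatch the snapshot diff reports.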
   
   Happy to send a PR for approach (1) if there's consensus.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

