sunchao opened a new pull request, #55839: URL: https://github.com/apache/spark/pull/55839
### What changes were proposed in this pull request?

This PR makes Adaptive Query Execution propagate "empty stage" information through common physical wrappers such as shuffle reads, sorts, projections, and columnar-to-row conversion.

It also preserves aggregate row-count semantics while doing so:

- grouped aggregates keep propagating the child stage row count
- global aggregates over an empty child are treated as producing one output row

Finally, when AQE replaces obsolete exchange stages after choosing a new physical plan, cancellation failures from those intentionally discarded stages are ignored instead of being surfaced as query failures.

### Why are the changes needed?

AQE can already collapse plans when a materialized stage proves that one side of a branch is empty, but that signal is currently lost once the stage is wrapped by operators such as `SortExec` or `ProjectExec`. That prevents otherwise valid short-circuiting, leaves obsolete work running longer than needed, and can report cancellation errors from stages that AQE has already decided to replace.

The change keeps the empty-stage optimization effective across those wrappers and avoids treating intentional cancellation of obsolete stages as a real execution failure.

### Does this PR introduce _any_ user-facing change?

Yes. Queries that materialize empty adaptive stages can terminate earlier and avoid spurious failures caused by cancellation of obsolete stages that are no longer part of the chosen adaptive plan.

### How was this patch tested?

The patch adds targeted AQE unit coverage for:

- short-circuiting an empty materialized stage through sort wrappers
- preserving correctness for an empty filtered global aggregate stage

The OSS carryover branch was prepared from the corresponding internal change and reconciled with current `apache/master`.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: OpenAI Codex

--
This is an automated message from the Apache Git Service.
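The row-count rules described in this PR can be illustrated with a small toy model. This is a sketch in Python, not Spark code: the `Stage`, `Wrapper`, and `Aggregate` classes and the `row_count` and `is_empty` helpers are hypothetical names invented for the example, and returning "unknown" for a global aggregate over a non-empty or unknown child is an assumption, since the description only covers the empty case.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Stage:
    """A materialized query stage; row_count is None until it is known."""
    row_count: Optional[int]


@dataclass
class Wrapper:
    """A wrapper the PR lists: sort, projection, shuffle read, columnar-to-row."""
    name: str
    child: "Plan"


@dataclass
class Aggregate:
    """grouped=True models GROUP BY; grouped=False models a global aggregate."""
    grouped: bool
    child: "Plan"


Plan = Union[Stage, Wrapper, Aggregate]


def row_count(plan: Plan) -> Optional[int]:
    """Return the row count known for this plan, or None if unknown."""
    if isinstance(plan, Stage):
        return plan.row_count
    if isinstance(plan, Wrapper):
        # Wrappers pass the child's row count through unchanged.
        return row_count(plan.child)
    child_rows = row_count(plan.child)
    if plan.grouped:
        # Grouped aggregate: keep propagating the child stage row count.
        return child_rows
    # Global aggregate: an empty child still yields one output row.
    # Assumption: any other case is left unknown in this sketch.
    return 1 if child_rows == 0 else None


def is_empty(plan: Plan) -> bool:
    """A plan is provably empty only when its known row count is zero."""
    return row_count(plan) == 0
```

In this model, `is_empty(Wrapper("sort", Stage(0)))` is true, so an empty stage stays visible through the sort wrapper, while `is_empty(Aggregate(False, Stage(0)))` is false, which matches a query like `SELECT count(*)` returning one row over an empty input.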
