This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new dfc8bb7dd6 fix(sql): handle GROUP BY ALL with aliased aggregates
(#20943)
dfc8bb7dd6 is described below
commit dfc8bb7dd65ba6a96b2a259dec5776945b361767
Author: Kumar Ujjawal <[email protected]>
AuthorDate: Mon Mar 23 23:31:48 2026 +0530
fix(sql): handle GROUP BY ALL with aliased aggregates (#20943)
## Which issue does this PR close?
<!--
We generally require a GitHub issue to be filed for all bug fixes and
enhancements and this helps us generate change logs for our releases.
You can link an issue to this PR using the GitHub syntax. For example
`Closes #123` indicates that this PR will close issue #123.
-->
- Closes #20909
## Rationale for this change
Issue #20909 reported that:
- SELECT COUNT(*) FROM t GROUP BY ALL works
- SELECT COUNT(*) AS c FROM t GROUP BY ALL fails at execution
The failure came from an incorrect plan that grouped by an aggregate
alias expression.
<!--
Why are you proposing this change? If this is already explained clearly
in the issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.
-->
## What changes are included in this PR?
This PR fixes GROUP BY ALL when the select list has an aliased aggregate
like COUNT(*) AS c.
Changes:
- Update GROUP BY ALL expansion logic to skip any select expression that
contains an aggregate, even if it is wrapped in aliases.
- Add SQL planner tests for:
- SELECT COUNT(*) FROM ... GROUP BY ALL
- SELECT COUNT(*) AS c FROM ... GROUP BY ALL
- Add core execution test for:
- SELECT COUNT(*) AS c FROM ... GROUP BY ALL
<!--
There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.
-->
## Are these changes tested?
Yes
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code
If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?
-->
## Are there any user-facing changes?
<!--
If there are user-facing changes then we may require documentation to be
updated before approving the PR.
-->
<!--
If there are any breaking changes to public APIs, please add the `api
change` label.
-->
---
datafusion/sql/src/select.rs | 12 ++++--------
datafusion/sqllogictest/test_files/group_by.slt | 5 +++++
2 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/datafusion/sql/src/select.rs b/datafusion/sql/src/select.rs
index 7e291afa04..bb600a2a50 100644
--- a/datafusion/sql/src/select.rs
+++ b/datafusion/sql/src/select.rs
@@ -31,7 +31,7 @@ use datafusion_common::error::DataFusionErrorBuilder;
use datafusion_common::tree_node::{TreeNode, TreeNodeRecursion};
use datafusion_common::{Column, DFSchema, Result, not_impl_err, plan_err};
use datafusion_common::{RecursionUnnestOption, UnnestOptions};
-use datafusion_expr::expr::{Alias, PlannedReplaceSelectItem, WildcardOptions};
+use datafusion_expr::expr::{PlannedReplaceSelectItem, WildcardOptions};
use datafusion_expr::expr_rewriter::{
normalize_col, normalize_col_with_schemas_and_ambiguity_check,
normalize_sorts,
};
@@ -193,15 +193,11 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
.collect::<Result<Vec<Expr>>>()?
} else {
// 'group by all' groups wrt. all select expressions except
'AggregateFunction's.
- // Filter and collect non-aggregate select expressions
+ // Filter and collect non-aggregate select expressions.
select_exprs
.iter()
- .filter(|select_expr| match select_expr {
- Expr::AggregateFunction(_) => false,
- Expr::Alias(Alias { expr, name: _, .. }) => {
- !matches!(**expr, Expr::AggregateFunction(_))
- }
- _ => true,
+ .filter(|select_expr| {
+
find_aggregate_exprs(std::iter::once(*select_expr)).is_empty()
})
.cloned()
.collect()
diff --git a/datafusion/sqllogictest/test_files/group_by.slt
b/datafusion/sqllogictest/test_files/group_by.slt
index 294841552a..9653988a68 100644
--- a/datafusion/sqllogictest/test_files/group_by.slt
+++ b/datafusion/sqllogictest/test_files/group_by.slt
@@ -1979,6 +1979,11 @@ SELECT col0, col1, COUNT(col2), SUM(col3) FROM tab3
GROUP BY ALL
0 2 2 -3
1 NULL 1 -2
+query I
+SELECT COUNT(*) AS c FROM tab3 GROUP BY ALL
+----
+5
+
# query below should work in multi partition, successfully.
query II
SELECT l.col0, LAST_VALUE(r.col1 ORDER BY r.col0) as last_col1
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]