mustafasrepo commented on code in PR #7040:
URL: https://github.com/apache/arrow-datafusion/pull/7040#discussion_r1276223297
##########
datafusion/sql/src/select.rs:
##########
@@ -431,6 +432,37 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
group_by_exprs: Vec<Expr>,
aggr_exprs: Vec<Expr>,
) -> Result<(LogicalPlan, Vec<Expr>, Option<Expr>)> {
+ let schema = input.schema();
Review Comment:
> Yes I agree that sounds weird. I still don't understand why the functional
dependency can't be computed from the plan itself
Currently, if the query is
```sql
SELECT sn, amount
FROM sales_global
GROUP BY sn
```
and we know that `sn` is primary key. We rewrite query above as
```sql
SELECT sn, amount
FROM sales_global
GROUP BY sn, amount
```
so that select either contains group by expressions or aggregate
expressions. This update is done here.
However, if the query were
```sql
SELECT sn
FROM sales_global
GROUP BY sn
```
we wouldn't rewrite query as
```sql
SELECT sn
FROM sales_global
GROUP BY sn, amount
```
hence, amount is not used and. We do not need to update group by here.
For this reason, we need the information of `select exprs` to decide how to
extend group by expressions to support a query.
However, if we were to extend group by expressions with their functional
dependencies in all cases. We won't need the `select exprs` for this decision.
Such as, we would treat
```sql
SELECT sn
FROM sales_global
GROUP BY sn
```
as
```sql
SELECT sn
FROM sales_global
GROUP BY sn, amount
```
under the hood.
What do you think about extending group by expression in all cases, without
checking for `select exprs`. I think this is a bit confusing.
In short, functional dependency can be calculated without external
information. However, query rewrite cannot be done without resorting to the
information in the `select exprs` in current design.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]