ygf11 commented on code in PR #4465:
URL: https://github.com/apache/arrow-datafusion/pull/4465#discussion_r1111194134
##########
benchmarks/expected-plans/q10.txt:
##########
@@ -1,12 +1,17 @@
Sort: revenue DESC NULLS FIRST
Projection: customer.c_custkey, customer.c_name,
SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS revenue,
customer.c_acctbal, nation.n_name, customer.c_address, customer.c_phone,
customer.c_comment
Aggregate: groupBy=[[customer.c_custkey, customer.c_name,
customer.c_acctbal, customer.c_phone, nation.n_name, customer.c_address,
customer.c_comment]], aggr=[[SUM(CAST(lineitem.l_extendedprice AS
Decimal128(38, 4)) * CAST(Decimal128(Some(100),23,2) - CAST(lineitem.l_discount
AS Decimal128(23, 2)) AS Decimal128(38, 4))) AS SUM(lineitem.l_extendedprice *
Int64(1) - lineitem.l_discount)]]
- Inner Join: customer.c_nationkey = nation.n_nationkey
- Inner Join: orders.o_orderkey = lineitem.l_orderkey
- Inner Join: customer.c_custkey = orders.o_custkey
- TableScan: customer projection=[c_custkey, c_name, c_address,
c_nationkey, c_phone, c_acctbal, c_comment]
- Filter: orders.o_orderdate >= Date32("8674") AND
orders.o_orderdate < Date32("8766")
- TableScan: orders projection=[o_orderkey, o_custkey, o_orderdate]
- Filter: lineitem.l_returnflag = Utf8("R")
- TableScan: lineitem projection=[l_orderkey, l_extendedprice,
l_discount, l_returnflag]
- TableScan: nation projection=[n_nationkey, n_name]
\ No newline at end of file
+ Projection: customer.c_custkey, customer.c_name, customer.c_address,
customer.c_phone, customer.c_acctbal, customer.c_comment,
lineitem.l_extendedprice, lineitem.l_discount, nation.n_name
Review Comment:
This may be related to
[`build_join_schema`](https://github.com/apache/arrow-datafusion/blob/main/datafusion/expr/src/logical_plan/builder.rs#L935),
which some optimizer rules call.
I think calling it before projection pushdown is fine, but I suspect its
result is no longer correct after projection pushdown.
For the query:
```sql
select a.id from a join b on a.id = b.id
```
`schema(a)`: a.id
`schema(b)`: b.id
`build_join_schema` merges the left and right schemas, so the result is
`a.id` + `b.id`, but the expected result is only `a.id`.
Maybe we can fix that first; then we would no longer need the extra projection.
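
To make the concern concrete, here is a minimal sketch of the behavior described above. It does not use DataFusion's real types or the actual `build_join_schema` code: `Schema`, `merge_join_schema`, and `prune_to_required` are hypothetical names, and a schema is modeled as a plain list of qualified column names.

```rust
/// Hypothetical stand-in for a plan schema: an ordered list of qualified
/// column names. This is NOT DataFusion's `DFSchema`.
pub type Schema = Vec<String>;

/// Models the merge described in the comment: the join schema is simply
/// the left fields followed by the right fields.
pub fn merge_join_schema(left: &[String], right: &[String]) -> Schema {
    left.iter().chain(right.iter()).cloned().collect()
}

/// Keeps only the columns that the parent plan actually requires,
/// preserving their original order. (Assumed helper, for illustration.)
pub fn prune_to_required(schema: &[String], required: &[&str]) -> Schema {
    schema
        .iter()
        .filter(|col| required.contains(&col.as_str()))
        .cloned()
        .collect()
}
```

With `schema(a) = [a.id]` and `schema(b) = [b.id]`, `merge_join_schema` yields `[a.id, b.id]`, while pruning that result to the columns required by `select a.id` leaves only `[a.id]`. If a rule rebuilds the join schema by plain merging after pushdown, the second step is effectively lost.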