xudong963 commented on pull request #1566: URL: https://github.com/apache/arrow-datafusion/pull/1566#issuecomment-1013877492
> Could you possibly provide some tests @xudong963 ?

Sure.

> I was expecting to see code that basically applied an algebraic transformation on predicates like:

This PR doesn't do that transformation. It does the following. First, consider this example:

```
❯ create table part as select 1 as p_partkey;
0 rows in set. Query took 0.003 seconds.

❯ create table lineitem as select 1 as l_partkey, 2 as l_suppkey;
0 rows in set. Query took 0.005 seconds.

❯ create table supplier as select 1 as s_suppkey;
0 rows in set. Query took 0.002 seconds.

❯ explain select * from part, supplier, lineitem where p_partkey = l_partkey and s_suppkey = l_suppkey;
+---------------+--------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                       |
+---------------+--------------------------------------------------------------------------------------------+
| logical_plan  | Projection: #part.p_partkey, #supplier.s_suppkey, #lineitem.l_partkey, #lineitem.l_suppkey |
|               |   Join: #part.p_partkey = #lineitem.l_partkey, #supplier.s_suppkey = #lineitem.l_suppkey   |
|               |     CrossJoin:                                                                             |
|               |       TableScan: part projection=Some([0])                                                 |
|               |       TableScan: supplier projection=Some([0])                                             |
|               |     TableScan: lineitem projection=Some([0, 1])                                            |
```

https://github.com/apache/arrow-datafusion/blob/6f7b2d25fb75c843efed67fbd72d09b2c2d6c2eb/datafusion/src/sql/planner.rs#L718

In the `for` loop, `left` is initially `part` and `right` is `supplier`. There is no join key between `part` and `supplier`, so the planner produces a cross join between them, which is expensive. With this PR, `supplier` is instead pushed onto `mut_plans` and deferred. Once `part` and `lineitem` have been inner joined, `supplier` can be inner joined with the result, giving this plan:
```
+---------------+--------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                       |
+---------------+--------------------------------------------------------------------------------------------+
| logical_plan  | Projection: #part.p_partkey, #lineitem.l_partkey, #lineitem.l_suppkey, #supplier.s_suppkey |
|               |   Join: #lineitem.l_suppkey = #supplier.s_suppkey                                          |
|               |     Join: #part.p_partkey = #lineitem.l_partkey                                            |
|               |       TableScan: part projection=Some([0])                                                 |
|               |       TableScan: lineitem projection=Some([0, 1])                                          |
|               |     TableScan: supplier projection=Some([0])                                               |
```
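To make the deferral idea concrete, here is a minimal standalone sketch. It is not the planner code from this PR: `Plan`, `has_join_key`, and `order_joins` are hypothetical names invented for illustration, tables are reduced to their names, and each join key is reduced to the pair of tables it connects. The loop joins the first pending table that shares a key with what has been joined so far, and falls back to a cross join only when no pending table connects.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified plan tree; this is NOT DataFusion's `LogicalPlan`.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan(String),
    InnerJoin(Box<Plan>, Box<Plan>),
    CrossJoin(Box<Plan>, Box<Plan>),
}

/// True if some equality predicate links `table` to an already joined table.
/// Each key is just the pair of table names it connects (an assumption of this sketch).
fn has_join_key(table: &str, joined: &HashSet<String>, keys: &[(&str, &str)]) -> bool {
    keys.iter().any(|(a, b)| {
        (*a == table && joined.contains(*b)) || (*b == table && joined.contains(*a))
    })
}

/// Builds a left-deep plan, deferring tables that have no join key yet.
/// Returns `None` when `tables` is empty.
fn order_joins(tables: &[&str], keys: &[(&str, &str)]) -> Option<Plan> {
    let (first, rest) = tables.split_first()?;
    let mut plan = Plan::Scan(first.to_string());
    let mut joined: HashSet<String> = HashSet::from([first.to_string()]);
    // Tables still waiting to be joined, in FROM-clause order.
    let mut pending: Vec<&str> = rest.to_vec();

    while !pending.is_empty() {
        // Prefer the first pending table that shares a join key with the current plan.
        let pos = pending.iter().position(|t| has_join_key(t, &joined, keys));
        let (table, keyed) = match pos {
            Some(i) => (pending.remove(i), true),
            // No pending table connects, so a cross join is unavoidable here.
            None => (pending.remove(0), false),
        };
        let right = Box::new(Plan::Scan(table.to_string()));
        plan = if keyed {
            Plan::InnerJoin(Box::new(plan), right)
        } else {
            Plan::CrossJoin(Box::new(plan), right)
        };
        joined.insert(table.to_string());
    }
    Some(plan)
}
```

Calling `order_joins(&["part", "supplier", "lineitem"], &[("part", "lineitem"), ("supplier", "lineitem")])` skips `supplier` on the first pass, joins `part` with `lineitem`, and then joins `supplier` to that result, which mirrors the shape of the second plan above.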