> On April 27, 2020, 6:02 p.m., Vineet Garg wrote:
> > ql/src/test/results/clientpositive/llap/keep_uniform.q.out
> > Lines 946 (patched)
> > <https://reviews.apache.org/r/72431/diff/2/?file=2227827#file2227827line956>
> >
> > Why is there an extra join in the plan now?
>
> Krisztian Kasa wrote:
> I ran `explain cbo` for this query on master and on the branch where this
> patch is applied. Both plans have 8 HiveJoin operators. Comparing the CBO
> plans, I see that 2 of the joins were reordered:
> On master: (web_returns + (web_sales + web_sales))
> On the branch: (web_sales + web_returns) + web_sales
>
> It also turned out that on master the CBO plan contains a HiveProject
> with all the columns from table `web_returns`. The reason is the same as
> for the example query mentioned in HIVE-23206. In the plan created after
> applying this patch, this project has only the necessary column
> (`wr_order_number`).
>
> In the physical plan there are 7 joins on master and 8 when this patch is
> applied. SharedWorkOptimizer merges two of them on master; I need to
> investigate further...
On master, two joins are merged because their parent ReduceSinks are merged in
`sharedWorkExtendedOptimization`.
See the plans after `SharedWorkOptimizer` and before
`SharedWorkExtendedOptimizer`.
Plan on master:
```
TS[0]-FIL[102]-SEL[2]-RS[47]-MERGEJOIN[231]-RS[50]-MERGEJOIN[232]-RS[53]-MERGEJOIN[236]-RS[56]-MERGEJOIN[237]-RS[59]-MERGEJOIN[238]-GBY[111]-RS[112]-GBY[113]-GBY[114]-RS[115]-GBY[116]-FS[67]
TS[3]-FIL[103]-SEL[5]-RS[48]-MERGEJOIN[231]
TS[6]-FIL[104]-SEL[8]-RS[51]-MERGEJOIN[232]
TS[9]-FIL[105]-SEL[11]-RS[15]-MERGEJOIN[233]-SEL[18]-GBY[19]-RS[20]-GBY[21]-RS[54]-MERGEJOIN[236]
-RS[32]-MERGEJOIN[234]-SEL[35]-RS[37]-MERGEJOIN[235]-GBY[40]-RS[41]-GBY[42]-RS[57]-MERGEJOIN[237]
TS[12]-FIL[106]-SEL[14]-RS[16]-MERGEJOIN[233]
-RS[33]-MERGEJOIN[234]
TS[23]-FIL[107]-SEL[25]-RS[36]-MERGEJOIN[235]
TS[44]-FIL[110]-SEL[46]-RS[60]-MERGEJOIN[238]
```
RS[16] and RS[33] were merged in the subsequent
`sharedWorkExtendedOptimization` step.
Plan after applying the patch:
```
TS[0]-FIL[101]-SEL[2]-RS[46]-MERGEJOIN[213]-RS[49]-MERGEJOIN[214]-RS[52]-MERGEJOIN[218]-RS[55]-MERGEJOIN[219]-RS[58]-MERGEJOIN[220]-GBY[110]-RS[111]-GBY[112]-GBY[113]-RS[114]-GBY[115]-FS[66]
TS[3]-FIL[102]-SEL[5]-RS[47]-MERGEJOIN[213]
TS[6]-FIL[103]-SEL[8]-RS[50]-MERGEJOIN[214]
TS[9]-FIL[104]-SEL[11]-RS[15]-MERGEJOIN[215]-SEL[18]-GBY[19]-RS[20]-GBY[21]-RS[53]-MERGEJOIN[218]
-RS[32]-MERGEJOIN[216]-RS[35]-MERGEJOIN[217]-SEL[38]-GBY[39]-RS[40]-GBY[41]-RS[56]-MERGEJOIN[219]
-RS[36]-MERGEJOIN[217]
TS[12]-FIL[105]-SEL[14]-RS[16]-MERGEJOIN[215]
TS[26]-FIL[107]-SEL[28]-RS[33]-MERGEJOIN[216]
TS[43]-FIL[109]-SEL[45]-RS[59]-MERGEJOIN[220]
```
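One way to double-check the join counts in chain listings like the two above is to collect the distinct `MERGEJOIN[n]` ids, since the same join appears once per input branch. This is only a rough sketch: `PlanJoinCounter` and its method are hypothetical helper names, not part of Hive, and it assumes the `OP[id]` text format shown in the listings.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Hive class): counts joins in an operator chain dump. */
final class PlanJoinCounter {
    // Matches entries such as MERGEJOIN[231] and captures the operator id.
    private static final Pattern MERGEJOIN = Pattern.compile("MERGEJOIN\\[(\\d+)\\]");

    /** Returns the distinct MERGEJOIN ids found in the given plan text, sorted. */
    static Set<Integer> distinctMergeJoinIds(String plan) {
        Set<Integer> ids = new TreeSet<>();
        Matcher m = MERGEJOIN.matcher(plan);
        while (m.find()) {
            // A join is listed once per input branch, so the set removes repeats.
            ids.add(Integer.parseInt(m.group(1)));
        }
        return ids;
    }
}
```

Applied to the two listings, both contain eight distinct ids (231 to 238 on master, 213 to 220 with the patch), so the 7 versus 8 difference only appears in the later extended step.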
RS[15] and RS[32] were not merged because MERGEJOIN[215] and MERGEJOIN[216]
have different keys.
RS[15] and RS[36] were not merged because their `tag` is different.
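
To illustrate the two conditions named above, here is a minimal sketch of such a mergeability check. It is not Hive's implementation: `ReduceSinkSketch`, its fields, and the key names `k1`/`k2` are hypothetical stand-ins, and the real optimizer may compare more properties than keys and tag.

```java
import java.util.List;

/** Hypothetical, simplified model of a ReduceSink; not Hive's ReduceSinkOperator. */
final class ReduceSinkSketch {
    final String name;             // label only, e.g. "RS[15]"
    final List<String> keyColumns; // join keys (placeholder names in the usage below)
    final int tag;                 // which join input this ReduceSink feeds

    ReduceSinkSketch(String name, List<String> keyColumns, int tag) {
        this.name = name;
        this.keyColumns = keyColumns;
        this.tag = tag;
    }

    /** Returns null if the two sinks could be merged, otherwise the blocking reason. */
    static String whyNotMergeable(ReduceSinkSketch a, ReduceSinkSketch b) {
        if (!a.keyColumns.equals(b.keyColumns)) {
            return "different keys";
        }
        if (a.tag != b.tag) {
            return "different tag";
        }
        return null;
    }
}
```

For example, with assumed values, `whyNotMergeable(new ReduceSinkSketch("RS[15]", List.of("k1"), 0), new ReduceSinkSketch("RS[36]", List.of("k1"), 1))` returns `"different tag"`.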
- Krisztian
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72431/#review220503
-----------------------------------------------------------
On May 4, 2020, 5:02 a.m., Krisztian Kasa wrote:
>
> (Updated May 4, 2020, 5:02 a.m.)
>
>
> Review request for hive, Jesús Camacho Rodríguez, Steve Carlin, and Vineet
> Garg.
>
>
> Bugs: HIVE-23206
> https://issues.apache.org/jira/browse/HIVE-23206
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Project not defined correctly after reordering a join
>
>
> Diffs
> -----
>
> itests/src/test/resources/testconfiguration.properties 5468728f83
>
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinProjectTransposeRule.java
> 492c55e050
> ql/src/test/queries/clientpositive/join_reorder5.q PRE-CREATION
> ql/src/test/results/clientpositive/join22.q.out ad34bc4310
> ql/src/test/results/clientpositive/llap/correlationoptimizer3.q.out
> f063766a1f
> ql/src/test/results/clientpositive/llap/join_reorder5.q.out PRE-CREATION
> ql/src/test/results/clientpositive/llap/keep_uniform.q.out 54d0b5fab6
> ql/src/test/results/clientpositive/llap/sharedwork.q.out f8d3b4b2f5
> ql/src/test/results/clientpositive/llap/subquery_select.q.out 311cee743d
> ql/src/test/results/clientpositive/perf/tez/cbo_query2.q.out 26a98ffcec
> ql/src/test/results/clientpositive/perf/tez/cbo_query59.q.out abc5d999b5
> ql/src/test/results/clientpositive/perf/tez/cbo_query95.q.out 218ca7d8b6
> ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query14.q.out
> eaa1defa81
> ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query2.q.out
> 4c90da4476
> ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query59.q.out
> 8d17cc79d1
> ql/src/test/results/clientpositive/perf/tez/constraints/cbo_query95.q.out
> ace074316b
> ql/src/test/results/clientpositive/perf/tez/constraints/query14.q.out
> 8204245245
> ql/src/test/results/clientpositive/perf/tez/constraints/query2.q.out
> 66777769e6
> ql/src/test/results/clientpositive/perf/tez/constraints/query59.q.out
> f7c7260077
> ql/src/test/results/clientpositive/perf/tez/constraints/query95.q.out
> 39d35ec330
> ql/src/test/results/clientpositive/perf/tez/query2.q.out 0e67e97c02
> ql/src/test/results/clientpositive/perf/tez/query59.q.out 1a2ba964f4
> ql/src/test/results/clientpositive/perf/tez/query95.q.out f15afbed4b
> ql/src/test/results/clientpositive/runtime_skewjoin_mapjoin_spark.q.out
> 9547e4fa7c
> ql/src/test/results/clientpositive/smb_mapjoin_25.q.out 8fb82e1659
>
>
> Diff: https://reviews.apache.org/r/72431/diff/4/
>
>
> Testing
> -------
>
> mvn test -Dtest.output.overwrite -DskipSparkTests
> -Dtest=TestMiniLlapLocalCliDriver -Dqfile=join_reorder5.q -pl itests/qtest
> -Pitests
>
>
> Thanks,
>
> Krisztian Kasa
>
>