[ 
https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated HIVE-17474:
------------------------------------
    Description: 
In TPC-DS query 70 
([DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql]), 
on Hive version (d3b88f6), I found that the logical plan differs between runs 
with the same settings.

Sometimes the logical plan is:
{code}
TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62]
TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45]
TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48]
TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41]
TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20]
TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23]
{code}
Here TS\[6\] joins with TS\[9\] at JOIN\[41\] and with TS\[0\] at 
JOIN\[48\].

At other times it is:
{code}
TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60]
TS[1]-FIL[64]-RS[5]-JOIN[6]
TS[2]-FIL[65]-RS[10]-JOIN[11]
TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67]-SEL[25]-GBY[26]-RS[27]-GBY[28]-RS[29]-GBY[30]-RS[31]-SEL[32]-PTF[33]-FIL[66]-SEL[34]-GBY[39]-RS[43]-JOIN[44]
TS[13]-FIL[69]-RS[18]-JOIN[19]
TS[14]-FIL[70]-RS[22]-JOIN[23]
{code}
Here TS\[2\] joins with TS\[0\] at JOIN\[11\].

Although TS\[2\] and TS\[6\] have different operator IDs, both are scans of the 
table store in the query.

This difference leads to different Spark execution plans and different 
execution times. I'm confused about why the same settings produce different 
logical plans. Does anyone know where to investigate the root cause?
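
To compare the two plans mechanically, the operator chains can be parsed into a map from each JOIN to the table scans whose printed chains pass through it. This is only a minimal sketch and not part of Hive: the names joins_by_scan, PLAN_A and PLAN_B are hypothetical, and it assumes the plan text uses exactly the NAME\[id\] format shown in this description.

```python
import re
from collections import defaultdict

# Matches one operator token such as "TS[0]" or "JOIN[41]".
OP = re.compile(r"([A-Z]+)\[(\d+)\]")

PLAN_A = """
TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62]
TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45]
TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48]
TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41]
TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20]
TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23]
"""

PLAN_B = """
TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60]
TS[1]-FIL[64]-RS[5]-JOIN[6]
TS[2]-FIL[65]-RS[10]-JOIN[11]
TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67]-SEL[25]-GBY[26]-RS[27]-GBY[28]-RS[29]-GBY[30]-RS[31]-SEL[32]-PTF[33]-FIL[66]-SEL[34]-GBY[39]-RS[43]-JOIN[44]
TS[13]-FIL[69]-RS[18]-JOIN[19]
TS[14]-FIL[70]-RS[22]-JOIN[23]
"""


def joins_by_scan(plan_text):
    """Map each JOIN id to the set of TS ids whose printed chain contains it."""
    joins = defaultdict(set)
    for line in plan_text.strip().splitlines():
        ops = OP.findall(line)
        # Only consider chains that start at a table scan.
        if not ops or ops[0][0] != "TS":
            continue
        scan_id = int(ops[0][1])
        for name, op_id in ops:
            if name == "JOIN":
                joins[int(op_id)].add(scan_id)
    return dict(joins)


if __name__ == "__main__":
    for label, plan in (("first plan", PLAN_A), ("second plan", PLAN_B)):
        print(label)
        for join_id, scans in sorted(joins_by_scan(plan).items()):
            print(f"  JOIN[{join_id}] <- TS{sorted(scans)}")
```

For the first plan this groups TS\[6\] and TS\[9\] under JOIN\[41\] and TS\[0\] and TS\[6\] under JOIN\[48\]; for the second it groups TS\[0\] and TS\[2\] under JOIN\[11\]. Operator IDs are not stable across runs, so the scans still have to be matched to tables by hand.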

  was:
In TPC-DS query 70 
([DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql]), 
on Hive version (d3b88f6), I found that the physical plan differs between runs 
with the same settings.

Sometimes the physical plan is:
{code}
TS[0]-FIL[63]-SEL[2]-RS[43]-JOIN[45]-RS[46]-JOIN[48]-SEL[49]-GBY[50]-RS[51]-GBY[52]-SEL[53]-RS[54]-SEL[55]-PTF[56]-SEL[57]-RS[59]-SEL[60]-LIM[61]-FS[62]
TS[3]-FIL[64]-SEL[5]-RS[44]-JOIN[45]
TS[6]-FIL[65]-SEL[8]-RS[39]-JOIN[41]-RS[47]-JOIN[48]
TS[9]-FIL[67]-SEL[11]-RS[18]-JOIN[20]-RS[21]-JOIN[23]-SEL[24]-GBY[25]-RS[26]-GBY[27]-RS[29]-SEL[30]-PTF[31]-FIL[66]-SEL[32]-GBY[38]-RS[40]-JOIN[41]
TS[12]-FIL[68]-SEL[14]-RS[19]-JOIN[20]
TS[15]-FIL[69]-SEL[17]-RS[22]-JOIN[23]
{code}
Here TS\[6\] joins with TS\[9\] at JOIN\[41\] and with TS\[0\] at 
JOIN\[48\].

At other times it is:
{code}
TS[0]-FIL[63]-RS[3]-JOIN[6]-RS[8]-JOIN[11]-RS[41]-JOIN[44]-SEL[46]-GBY[47]-RS[48]-GBY[49]-RS[50]-GBY[51]-RS[52]-SEL[53]-PTF[54]-SEL[55]-RS[57]-SEL[58]-LIM[59]-FS[60]
TS[1]-FIL[64]-RS[5]-JOIN[6]
TS[2]-FIL[65]-RS[10]-JOIN[11]
TS[12]-FIL[68]-RS[16]-JOIN[19]-RS[20]-JOIN[23]-FIL[67]-SEL[25]-GBY[26]-RS[27]-GBY[28]-RS[29]-GBY[30]-RS[31]-SEL[32]-PTF[33]-FIL[66]-SEL[34]-GBY[39]-RS[43]-JOIN[44]
TS[13]-FIL[69]-RS[18]-JOIN[19]
TS[14]-FIL[70]-RS[22]-JOIN[23]
{code}
Here TS\[2\] joins with TS\[0\] at JOIN\[11\].

Although TS\[2\] and TS\[6\] have different operator IDs, both are scans of the 
table store in the query.

This difference leads to different Spark execution plans and different 
execution times. I'm confused about why the same settings produce different 
physical plans. Does anyone know where to investigate the root cause?


> Different logical plan of same query(TPC-DS/70) with same settings
> ------------------------------------------------------------------
>
>                 Key: HIVE-17474
>                 URL: https://issues.apache.org/jira/browse/HIVE-17474
>             Project: Hive
>          Issue Type: Bug
>            Reporter: liyunzhang_intel



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
