[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-28 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398540#comment-15398540
 ] 

Nemon Lou commented on HIVE-14353:
--

The motivation for this JIRA ticket is that I found query46 was slower with 
CBO on than off, even though the join order is the same. (I changed the join 
order in the SQL manually when CBO is off.)
After comparing the two query plans, the major difference is the select 
operator introduced by CBO's projection pruning.


> Performance degradation  after Projection Pruning in CBO
> 
>
> Key: HIVE-14353
> URL: https://issues.apache.org/jira/browse/HIVE-14353
> Project: Hive
> Issue Type: Bug
> Components: CBO, Logical Optimizer
> Affects Versions: 1.2.1
> Reporter: Nemon Lou
> Attachments: q46_cbo_no_projection_prune_explain.txt, 
> q46_cbo_projection_prune_explain.rar
>
>
> TPC-DS with scale factor 1024.
> Hive on Spark.
> With and without projection pruning, the time spent (in seconds) is quite 
> different.
> To disable projection pruning, disable HiveRelFieldTrimmer in the code and 
> compile a new jar.
> ||queries||CBO_no_projection_prune||CBO||
> |q27|160|251|
> |q7|200|312|
> |q88|701|1092|
> |q68|234|345|
> |q39|53|78|
> |q73|160|228|
> |q31|463|659|
> |q79|242|343|
> |q46|256|363|
> |q60|271|382|
> |q66|198|278|
> |q34|155|217|
> |q19|184|256|
> |q26|154|214|
> |q56|262|364|
> |q75|942|1303|
> |q71|288|388|
> |q25|329|442|
> |q52|142|190|
> |q42|142|189|
> |q3|139|185|
> |q98|153|203|
> |q89|187|248|
> |q58|264|340|
> |q43|127|162|
> |q32|174|221|
> |q96|156|197|
> |q70|320|404|
> |q29|499|629|
> |q18|266|329|
> |q21|76|92|
> |q90|139|165|



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-28 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397041#comment-15397041
 ] 

Nemon Lou commented on HIVE-14353:
--

A preliminary analysis:
Hive has a built-in column pruner, and column pruning has already been pushed 
down to the InputFormat layer.
CBO adds a projection above the table scan, which is very costly, especially 
when the projection is done before a join.
In most TPC-DS queries, the join filters out a lot of rows.
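
As a rough illustration of the point above, the sketch below counts how many 
rows pass through a select operator when the projection sits directly above 
the table scan versus after the join. The function name, the row count, and 
the join selectivity are hypothetical values chosen for illustration, not 
measurements from this ticket.

```python
def select_rows_processed(scan_rows, join_selectivity, project_before_join):
    """Return the number of rows that pass through the select operator.

    scan_rows: rows produced by the table scan (hypothetical).
    join_selectivity: fraction of scanned rows that survive the join.
    project_before_join: True if the projection sits right above the scan.
    """
    if not 0.0 <= join_selectivity <= 1.0:
        raise ValueError("join_selectivity must be between 0 and 1")
    if project_before_join:
        # Every scanned row is projected, including rows the join later drops.
        return scan_rows
    # Only rows that survive the join reach the projection.
    return round(scan_rows * join_selectivity)


# Hypothetical inputs: 1,000,000 scanned rows, 1% of them survive the join.
rows_before = select_rows_processed(1_000_000, 0.01, project_before_join=True)
rows_after = select_rows_processed(1_000_000, 0.01, project_before_join=False)
print(rows_before, rows_after)
```

Under these assumed numbers the select operator handles far more rows when it 
runs before the join, which is consistent with the extra run time reported 
for the CBO plans.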



[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-27 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396922#comment-15396922
 ] 

Nemon Lou commented on HIVE-14353:
--

[~pxiong] Sorry for the confusion. The performance degradation is at run time 
(an application running on YARN), not at compile time.
HiveRelFieldTrimmer adds a projection rel node above the table scan. The 
projection node is then compiled into a select operator in Hive.
That's why I recorded the time spent in the select operator at run time.



[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-27 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396905#comment-15396905
 ] 

Pengcheng Xiong commented on HIVE-14353:


[~nemon], thanks for your data. I am a bit confused by the 
CBO_time_in_SelectOP that you mentioned... What I would like to see is how 
much time we spent on compiling the query (I mean the time spent applying 
HiveRelFieldTrimmer). For example, for q27, the time difference is 
251-160=91. If the time spent applying HiveRelFieldTrimmer is around 90s, 
then we should improve the compilation. Otherwise, if it is around 1-5s, then 
the difference comes from somewhere other than compilation. Thanks.
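
The check described above can be sketched as simple arithmetic. The q27 
totals (251 and 160) come from the table in the issue description; the 
trimmer time of 3.0 seconds is a hypothetical placeholder, since no such 
measurement appears in this thread, and the function name is made up for 
illustration.

```python
def compile_share_of_regression(cbo_seconds, no_prune_seconds, trimmer_seconds):
    """Return (regression, share of the regression explained by compilation).

    cbo_seconds: total time with projection pruning enabled.
    no_prune_seconds: total time with projection pruning disabled.
    trimmer_seconds: time spent applying HiveRelFieldTrimmer (hypothetical).
    """
    regression = cbo_seconds - no_prune_seconds
    if regression <= 0:
        raise ValueError("no regression to explain")
    return regression, trimmer_seconds / regression


# q27 totals from the issue table; trimmer_seconds is an assumed value.
regression, share = compile_share_of_regression(251, 160, trimmer_seconds=3.0)
print(regression, share)
```

A share close to 1 would point at compilation; a share close to 0 would mean 
the extra time is spent somewhere else, for example at run time.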



[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-27 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396893#comment-15396893
 ] 

Nemon Lou commented on HIVE-14353:
--

||queries||CBO_total_time||CBO_time_in_SelectOP||
|q27|266.494|251|
|q7|328.259|98.8|
|q68|369.159|105|
|q46|392.777|91.75|

I only ran a few of them because of limited time. The time spent in SelectOP 
is calculated by adding up the total time spent in SelectOP in one executor 
and then dividing by the number of cores (4 in my case).
I have also run q46 without projection pruning. The total time is 266.226, 
and the time spent in SelectOP is 0.125 seconds.
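
The calculation described above can be sketched as follows. The per-task 
timings are hypothetical placeholders rather than measured values; only the 
core count (4) and the q46 figures (392.777 total, 91.75 in SelectOP) come 
from this comment. The function names are illustrative.

```python
def select_op_time_per_core(select_seconds_per_task, cores):
    """Sum the SelectOP time of all tasks on one executor, divided by cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return sum(select_seconds_per_task) / cores


def select_op_share(select_seconds, total_seconds):
    """Fraction of the total query time attributed to SelectOP."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return select_seconds / total_seconds


# Hypothetical SelectOP timings (seconds) for tasks on one 4-core executor.
task_times = [100.0, 120.0, 80.0, 100.0]
estimate = select_op_time_per_core(task_times, cores=4)

# q46 figures reported in this comment: 91.75 of 392.777.
q46_share = select_op_share(91.75, 392.777)
print(estimate, q46_share)
```

Dividing by the core count assumes the tasks on the executor ran fully in 
parallel, so the result is only an approximation of wall-clock time.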



[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO

2016-07-27 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395947#comment-15395947
 ] 

Pengcheng Xiong commented on HIVE-14353:


[~nemon], that is an interesting finding. Could you also record how much time 
we spend on HiveRelFieldTrimmer in the CBO-on case and put it in a third 
column? Thanks.
