[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO
[ https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15398540#comment-15398540 ]

Nemon Lou commented on HIVE-14353:
----------------------------------

The motivation for this JIRA ticket is that I found query46 was slower with CBO on than with CBO off, even though the join order was the same. (I changed the join order in the SQL manually when CBO was off.)
After comparing the two query plans, the major difference is the select operator introduced by CBO's projection pruning.

> Performance degradation after Projection Pruning in CBO
> -------------------------------------------------------
>
>                 Key: HIVE-14353
>                 URL: https://issues.apache.org/jira/browse/HIVE-14353
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO, Logical Optimizer
>    Affects Versions: 1.2.1
>            Reporter: Nemon Lou
>         Attachments: q46_cbo_no_projection_prune_explain.txt, q46_cbo_projection_prune_explain.rar
>
>
> TPC-DS with factor 1024.
> Hive on Spark.
> With and without projection pruning, the time spent is quite different.
> The way to disable projection pruning: disable HiveRelFieldTrimmer in code and compile a new jar.
> ||queries||CBO_no_projection_prune||CBO||
> |q27| 160|251 |
> |q7 | 200|312 |
> |q88| 701|1092|
> |q68| 234|345 |
> |q39| 53 |78  |
> |q73| 160|228 |
> |q31| 463|659 |
> |q79| 242|343 |
> |q46| 256|363 |
> |q60| 271|382 |
> |q66| 198|278 |
> |q34| 155|217 |
> |q19| 184|256 |
> |q26| 154|214 |
> |q56| 262|364 |
> |q75| 942|1303|
> |q71| 288|388 |
> |q25| 329|442 |
> |q52| 142|190 |
> |q42| 142|189 |
> |q3 | 139|185 |
> |q98| 153|203 |
> |q89| 187|248 |
> |q58| 264|340 |
> |q43| 127|162 |
> |q32| 174|221 |
> |q96| 156|197 |
> |q70| 320|404 |
> |q29| 499|629 |
> |q18| 266|329 |
> |q21| 76 |92  |
> |q90| 139|165 |

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
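To put the table in the issue description in relative terms, here is a minimal sketch that computes the slowdown for a handful of the rows. The numbers are copied from the table; the names `run_times` and `relative_slowdown` are made up for this illustration, and the ticket does not state the unit of the timings.

```python
# Run times copied from the table in the issue description, as
# (CBO_no_projection_prune, CBO). The ticket does not state the unit.
run_times = {
    "q27": (160, 251),
    "q7": (200, 312),
    "q88": (701, 1092),
    "q46": (256, 363),
    "q90": (139, 165),
}


def relative_slowdown(without_prune, with_prune):
    """Extra run time with projection pruning, as a fraction of the run without it."""
    return (with_prune - without_prune) / without_prune


for query, (without_prune, with_prune) in run_times.items():
    extra = with_prune - without_prune
    pct = relative_slowdown(without_prune, with_prune)
    print(f"{query}: +{extra} ({pct:.0%} slower with projection pruning)")
```

The remaining rows of the table can be added to `run_times` in the same way.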
[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO
[ https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397041#comment-15397041 ]

Nemon Lou commented on HIVE-14353:
----------------------------------

A preliminary analysis:
Hive has a built-in column pruner, and column pruning has been pushed down to the InputFormat layer.
CBO adds a projection above the table scan, which is very costly, especially when the projection is done before a join.
In most TPC-DS cases the join filters out a lot of rows, so a projection placed before the join processes many rows that are later discarded.
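The argument in the analysis above can be illustrated with a toy per-row cost model. This is only a sketch: the function name and the row counts are hypothetical, and it assumes the select operator's cost is proportional to the number of rows it handles, which the ticket does not measure directly.

```python
def rows_through_select(scan_rows, rows_surviving_join, project_before_join):
    """Rows the select operator handles under a simple per-row cost model."""
    return scan_rows if project_before_join else rows_surviving_join


# Hypothetical row counts, chosen only for illustration.
scan_rows = 1_000_000_000
rows_surviving_join = 10_000_000

before = rows_through_select(scan_rows, rows_surviving_join, project_before_join=True)
after = rows_through_select(scan_rows, rows_surviving_join, project_before_join=False)
print(f"select before join: {before:,} rows")
print(f"select after join:  {after:,} rows ({before // after}x fewer)")
```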
[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO
[ https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396922#comment-15396922 ]

Nemon Lou commented on HIVE-14353:
----------------------------------

[~pxiong] Sorry for the confusion. The performance degradation is at run time (an application running on YARN), not at compile time.
HiveRelFieldTrimmer adds a projection rel node above the table scan. The projection node is then compiled to a select operator in Hive.
That's why I recorded the time spent in the select operator at run time.
[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO
[ https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396905#comment-15396905 ]

Pengcheng Xiong commented on HIVE-14353:
----------------------------------------

[~nemon], thanks for your data. I am a bit confused by the CBO_time_in_SelectOP that you mentioned. What I would like to see is how much time we spent compiling the query (I mean the time spent applying HiveRelFieldTrimmer). For example, for q27, the time difference is 251-160=91. If the time spent applying HiveRelFieldTrimmer is around 90s, then we should improve the compilation. Otherwise, if it is around 1-5s, the difference comes from somewhere other than compilation. Thanks.
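The check proposed in this comment amounts to asking what fraction of the end-to-end gap is explained by compile time. A minimal sketch, assuming the totals are in seconds as the comment implies; the function name is hypothetical, and the two compile-time values are the illustrative cases from the comment, not measurements.

```python
def compile_share_of_gap(time_with_trimmer, time_without_trimmer, trimmer_compile_seconds):
    """Fraction of the end-to-end gap explained by time spent applying the trimmer."""
    gap = time_with_trimmer - time_without_trimmer
    if gap <= 0:
        raise ValueError("expected the run with the trimmer to be slower")
    return trimmer_compile_seconds / gap


# q27 totals from the issue description: 251 with the trimmer, 160 without.
# 90 and 5 are the illustrative compile times from the comment, not measurements.
for compile_seconds in (90, 5):
    share = compile_share_of_gap(251, 160, compile_seconds)
    print(f"trimmer compile time of {compile_seconds}s explains {share:.0%} of the gap")
```

A share close to 100% would point at compilation; a share of a few percent would point elsewhere.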
[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO
[ https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15396893#comment-15396893 ]

Nemon Lou commented on HIVE-14353:
----------------------------------

||queries||CBO_total_time||CBO_time_in_SelectOP||
|q27| 266.494|251  |
|q7 | 328.259|98.8 |
|q68| 369.159|105  |
|q46| 392.777|91.75|

I ran only a few of them because of time limits.
The time spent in selectOP is calculated by adding up the total time spent in selectOP within one executor and then dividing by the number of cores (4 in my case).
I also ran q46 without projection pruning: the total time is 266.226, and the time spent in selectOP is 0.125 seconds.
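The calculation described in this comment can be sketched as follows. The function name and the per-task timings are hypothetical; only the q46 figures in the second half come from the comment. Dividing by the core count assumes the tasks keep all cores of the executor equally busy.

```python
def select_op_time(per_task_seconds, cores):
    """Sum the select-operator time of all tasks on one executor, then divide by its cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return sum(per_task_seconds) / cores


# Hypothetical per-task select-operator timings in seconds (not from the ticket).
print(select_op_time([10.0, 12.0, 8.0, 10.0], cores=4))

# q46 figures reported in the comment, with and without projection pruning.
total_with, total_without = 392.777, 266.226
select_with, select_without = 91.75, 0.125

total_gap = total_with - total_without
select_gap = select_with - select_without
print(f"total gap: {total_gap:.3f}, select operator gap: {select_gap:.3f}")
print(f"share of the gap spent in the select operator: {select_gap / total_gap:.0%}")
```

For q46, the extra time in the select operator accounts for about 72% of the difference in total time.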
[jira] [Commented] (HIVE-14353) Performance degradation after Projection Pruning in CBO
[ https://issues.apache.org/jira/browse/HIVE-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395947#comment-15395947 ]

Pengcheng Xiong commented on HIVE-14353:
----------------------------------------

[~nemon], that is an interesting finding. Could you also record how much time we spent on HiveRelFieldTrimmer in the CBO-on case and put it in the 3rd column? Thanks.