[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=489736=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-489736 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 23/Sep/20 18:16
Start Date: 23/Sep/20 18:16
Worklog Time Spent: 10m
Work Description: vineetgarg02 merged pull request #1472:
URL: https://github.com/apache/hive/pull/1472

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 489736)
Time Spent: 2h (was: 1h 50m)

> Support partition pruning and other physical transformations for EXECUTE statement
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
> Issue Type: Sub-task
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Current partition pruning (compile time) isn't kicked in for EXECUTE statements.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=489735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-489735 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 23/Sep/20 18:14
Start Date: 23/Sep/20 18:14
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r493793444

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
I have updated HIVE-24005 to investigate this.

Issue Time Tracking
---
Worklog Id: (was: 489735)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=488953=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488953 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 23/Sep/20 04:22
Start Date: 23/Sep/20 04:22
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492539012

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
Let's create a follow-up to explore whether some of them may be made transient again and discuss over there.

Issue Time Tracking
---
Worklog Id: (was: 488953)
Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=488035=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-488035 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 22/Sep/20 07:54
Start Date: 22/Sep/20 07:54
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492539012

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
Let's create a follow-up to explore whether some of them may be made transient again and discuss over there.

Issue Time Tracking
---
Worklog Id: (was: 488035)
Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487618 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:21
Start Date: 22/Sep/20 03:21
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492224561

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
I actually tried not keeping these fields, but I ran into all sorts of issues, such as failures to serialize/de-serialize or plans being generated without metadata. I am not sure whether we need to keep all of these fields or can choose selectively; I kept almost all of them in the interest of time. If Gopal or Rajesh think this may cause a performance issue, I can open a follow-up to investigate and choose fields selectively.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;

+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment:
The original operator tree's shape is changed when it goes through physical transformations and task generation (I don't know why, though); as a result, this operator tree cannot be used later to regenerate tasks or re-run physical transformations. Therefore we make a copy and cache it after the operator tree is generated. I will leave a comment.

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
 Output:["_col0"]
 TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]

Review comment:
Yeah, I think this is likely a side effect of some changes w.r.t. serialization/de-serialization. It is a positive side effect, though, since we now have more information in the explain plan.

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
 Output:["_col0"]
 TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]
 <-Map 6 [CONTAINS] vectorized, llap
 Reduce Output Operator [RS_45]
 Limit [LIM_44] (rows=1 width=2)
 Number of rows:1
 Select Operator [SEL_43] (rows=1 width=0)
 Output:["_col0"]
 TableScan [TS_29] (rows=1 width=0)
+default@tb2,tb2,Tbl:PARTIAL,Col:COMPLETE

Review comment:
I confirmed that this is expected. I compared this plan against master (with explain.user set to false) and there is no difference in the plan.

Issue Time Tracking
---
Worklog Id: (was: 487618)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487584 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 22/Sep/20 03:18
Start Date: 22/Sep/20 03:18
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r491751273

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ##
@@ -253,14 +191,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String queryName = getQueryName(root);
     if (ss.getPreparePlans().containsKey(queryName)) {
       // retrieve cached plan from session state
-      BaseSemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);
+      SemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);

       // make copy of the plan
-      createTaskCopy(cachedPlan);
+      //createTaskCopy(cachedPlan);

Review comment:
Can remove the commented-out line.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementAnalyzer.java ##
@@ -54,6 +58,21 @@ private void savePlan(String queryName) throws SemanticException{
     ss.getPreparePlans().put(queryName, this);
   }

+  private T makeCopy(final Object task, Class objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();

Review comment:
Can we leave a comment on this method explaining what it is trying to do?

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
Do we need to keep all these fields for the plan cache in the operator, table, etc.? I am wondering about the implications of keeping them when the operator plan is serialized (i.e., whether that could have a performance impact).
@t3rmin4t0r , @rbalamohan , could you comment on this?

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ##
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);
+      if (!ctx.getExplainLogical()) {
+        TaskCompiler compiler = TaskCompilerFactory.getCompiler(conf, pctxt);
+        compiler.init(queryState, console, db);
+        compiler.compile(pctxt, rootTasks, inputs, outputs);
+        fetchTask = pctxt.getFetchTask();
+        //fetchTask = makeCopy(cachedPlan.getFetchTask(), cachedPlan.getFetchTask().getClass());

Review comment:
This comment too.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ##
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);

Review comment:
Same, can be removed?

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;

+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment:
Why do you need to keep a copy instead of using the original operators? Could you leave a comment on that?

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
 Output:["_col0"]
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487125=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487125 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:48
Start Date: 21/Sep/20 17:48
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492240115

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
 Output:["_col0"]
 TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]
 <-Map 6 [CONTAINS] vectorized, llap
 Reduce Output Operator [RS_45]
 Limit [LIM_44] (rows=1 width=2)
 Number of rows:1
 Select Operator [SEL_43] (rows=1 width=0)
 Output:["_col0"]
 TableScan [TS_29] (rows=1 width=0)
+default@tb2,tb2,Tbl:PARTIAL,Col:COMPLETE

Review comment:
I confirmed that this is expected. I compared this plan against master (with explain.user set to false) and there is no difference in the plan.

Issue Time Tracking
---
Worklog Id: (was: 487125)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487109=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487109 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:30
Start Date: 21/Sep/20 17:30
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492229758

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
 Output:["_col0"]
 TableScan [TS_24] (rows=1 width=4)
-Output:["id"]
+default@tb2,tb2,Tbl:COMPLETE,Col:NONE,Output:["id"]

Review comment:
Yeah, I think this is likely a side effect of some changes w.r.t. serialization/de-serialization. It is a positive side effect, though, since we now have more information in the explain plan.

Issue Time Tracking
---
Worklog Id: (was: 487109)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487107 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:29
Start Date: 21/Sep/20 17:29
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492229067

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;

+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment:
The original operator tree's shape is changed when it goes through physical transformations and task generation (I don't know why, though); as a result, this operator tree cannot be used later to regenerate tasks or re-run physical transformations. Therefore we make a copy and cache it after the operator tree is generated. I will leave a comment.

Issue Time Tracking
---
Worklog Id: (was: 487107)
Time Spent: 40m (was: 0.5h)
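
The copy-and-cache idea described in the review comment above can be illustrated with a serialization round-trip. What follows is only a minimal sketch, assuming plain `java.io` serialization as a stand-in; the pull request's `makeCopy` and Hive's own plan serialization may work differently. The names `DeepCopySketch`, `OperatorNode` and `deepCopy` are hypothetical and do not come from the Hive code base.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class DeepCopySketch {

  /** Hypothetical stand-in for an operator in a plan tree (not a Hive class). */
  static class OperatorNode implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final List<OperatorNode> children = new ArrayList<>();

    OperatorNode(String name) {
      this.name = name;
    }
  }

  /** Copies an object graph by serializing it to bytes and reading it back. */
  static <T extends Serializable> T deepCopy(T original, Class<T> type)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
      out.writeObject(original);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
      return type.cast(in.readObject());
    }
  }

  public static void main(String[] args) throws Exception {
    OperatorNode scan = new OperatorNode("TS");
    scan.children.add(new OperatorNode("SEL"));

    // Cache a pristine copy before anything mutates the tree.
    OperatorNode cached = deepCopy(scan, OperatorNode.class);

    // Simulate a later transformation that changes the shape of the original.
    scan.children.clear();

    System.out.println(scan.children.size());   // the original was mutated
    System.out.println(cached.children.size()); // the cached copy kept its child
  }
}
```

Because the copy shares no objects with the original, later passes can rewrite the original tree freely while the cached copy stays usable for regenerating tasks.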
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=487101=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-487101 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 21/Sep/20 17:22
Start Date: 21/Sep/20 17:22
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r492224561

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
I actually tried not keeping these fields, but I ran into all sorts of issues, such as failures to serialize/de-serialize or plans being generated without metadata. I am not sure whether we need to keep all of these fields or can choose selectively; I kept almost all of them in the interest of time. If Gopal or Rajesh think this may cause a performance issue, I can open a follow-up to investigate and choose fields selectively.

Issue Time Tracking
---
Worklog Id: (was: 487101)
Time Spent: 0.5h (was: 20m)
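
The trade-off in this thread follows from how `transient` behaves: a field marked `transient` is skipped when an object is serialized, so it comes back with its default value after a round-trip. The sketch below shows that with plain `java.io` serialization, which is an assumption made for illustration; Hive's plan serialization may be configured differently. `TransientFieldSketch`, `ScanState` and `roundTrip` are hypothetical names, not Hive classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldSketch {

  /** Hypothetical stand-in for an operator holding both plan and runtime state. */
  static class ScanState implements Serializable {
    private static final long serialVersionUID = 1L;
    String tableName;           // regular field: written to the stream
    transient String jobConfig; // transient field: skipped by serialization

    ScanState(String tableName, String jobConfig) {
      this.tableName = tableName;
      this.jobConfig = jobConfig;
    }
  }

  /** Serializes the object to bytes and deserializes it again. */
  static ScanState roundTrip(ScanState state)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
      out.writeObject(state);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
      return (ScanState) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    ScanState restored = roundTrip(new ScanState("tb2", "conf"));
    System.out.println(restored.tableName); // survives the round-trip
    System.out.println(restored.jobConfig); // null: the transient field was dropped
  }
}
```

Dropping `transient` keeps such state in a cached copy, at the cost of a larger serialized plan, which is the performance concern raised in the review.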
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=486726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486726 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 21/Sep/20 00:30
Start Date: 21/Sep/20 00:30
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r491751273

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ##
@@ -253,14 +191,17 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
     String queryName = getQueryName(root);
     if (ss.getPreparePlans().containsKey(queryName)) {
       // retrieve cached plan from session state
-      BaseSemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);
+      SemanticAnalyzer cachedPlan = ss.getPreparePlans().get(queryName);

       // make copy of the plan
-      createTaskCopy(cachedPlan);
+      //createTaskCopy(cachedPlan);

Review comment:
Can remove the commented-out line.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementAnalyzer.java ##
@@ -54,6 +58,21 @@ private void savePlan(String queryName) throws SemanticException{
     ss.getPreparePlans().put(queryName, this);
   }

+  private T makeCopy(final Object task, Class objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();

Review comment:
Can we leave a comment on this method explaining what it is trying to do?

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java ##
@@ -63,19 +63,19 @@
   private VectorizationContext taskVectorizationContext;

-  protected transient JobConf jc;
-  private transient boolean inputFileChanged = false;
+  protected JobConf jc;

Review comment:
Do we need to keep all these fields for the plan cache in the operator, table, etc.? I am wondering about the implications of keeping them when the operator plan is serialized (i.e., whether that could have a performance impact).
@t3rmin4t0r , @rbalamohan , could you comment on this?

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ##
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);
+      if (!ctx.getExplainLogical()) {
+        TaskCompiler compiler = TaskCompilerFactory.getCompiler(conf, pctxt);
+        compiler.init(queryState, console, db);
+        compiler.compile(pctxt, rootTasks, inputs, outputs);
+        fetchTask = pctxt.getFetchTask();
+        //fetchTask = makeCopy(cachedPlan.getFetchTask(), cachedPlan.getFetchTask().getClass());

Review comment:
This comment too.

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ##
@@ -286,6 +227,24 @@ public void analyzeInternal(ASTNode root) throws SemanticException {
       this.acidFileSinks.addAll(cachedPlan.getAcidFileSinks());
       this.initCtx(cachedPlan.getCtx());
       this.ctx.setCboInfo(cachedPlan.getCboInfo());
+      this.setLoadFileWork(cachedPlan.getLoadFileWork());
+      this.setLoadTableWork(cachedPlan.getLoadTableWork());
+
+      this.setQB(cachedPlan.getQB());
+
+      ParseContext pctxt = this.getParseContext();
+      // partition pruner
+      Transform ppr = new PartitionPruner();
+      ppr.transform(pctxt);
+
+      //pctxt.setQueryProperties(this.queryProperties);

Review comment:
Same, can be removed?

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ##
@@ -387,6 +387,12 @@
   protected volatile boolean disableJoinMerge = false;
   protected final boolean defaultJoinMerge;

+  /*
+   * This is used by prepare/execute statement
+   * Prepare/Execute requires operators to be copied and cached
+   */
+  protected Map topOpsCopy = null;

Review comment:
Why do you need to keep a copy instead of using the original operators? Could you leave a comment on that?

## File path: ql/src/test/results/clientpositive/llap/constprog_dpp.q.out ##
@@ -84,12 +84,13 @@ Stage-0
 Select Operator [SEL_40] (rows=1 width=4)
 Output:["_col0"]
[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement
[ https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=484202=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484202 ]

ASF GitHub Bot logged work on HIVE-24009:

Author: ASF GitHub Bot
Created on: 14/Sep/20 20:41
Start Date: 14/Sep/20 20:41
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r488207028

## File path: ql/src/test/queries/clientnegative/prepare_execute_1.q ##
@@ -1,3 +0,0 @@
---! qt:dataset:src
-prepare query1 from select count(*) from src where key > ? and value < ? group by ?;
-execute query1 using 1,100,1;

Review comment:
This query no longer fails. I have opened a follow-up to fix this: https://issues.apache.org/jira/browse/HIVE-24164

Issue Time Tracking
---
Worklog Id: (was: 484202)
Remaining Estimate: 0h
Time Spent: 10m