[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467561=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467561 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 20:43 Start Date: 06/Aug/20 20:43 Worklog Time Spent: 10m Work Description: vineetgarg02 merged pull request #1315: URL: https://github.com/apache/hive/pull/1315 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 467561) Time Spent: 5h 10m (was: 5h) > Support parameterized queries in WHERE/HAVING clause > > > Key: HIVE-23951 > URL: https://issues.apache.org/jira/browse/HIVE-23951 > Project: Hive > Issue Type: Sub-task > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 5h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467437=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467437 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 17:25 Start Date: 06/Aug/20 17:25 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466570882 ## File path: ql/src/test/results/clientpositive/llap/prepare_plan.q.out ## @@ -0,0 +1,1575 @@ +PREHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +PREHOOK: type: QUERY +PREHOOK: Input: default@src + A masked pattern was here +POSTHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +POSTHOOK: type: QUERY +POSTHOOK: Input: default@src + A masked pattern was here +OPTIMIZED SQL: SELECT COUNT(*) AS `$f0` +FROM `default`.`src` +WHERE `key` > CAST(? AS STRING) +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 +Tez + A masked pattern was here + Edges: +Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) + A masked pattern was here + Vertices: +Map 1 +Map Operator Tree: +TableScan + alias: src + filterExpr: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean) + Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE + GatherStats: false + Filter Operator +isSamplingPred: false +predicate: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean) +Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE +Select Operator + Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE + Group By Operator +aggregations: count() +minReductionHashAggr: 0.99 +mode: hash +outputColumnNames: _col0 +Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE +Reduce Output 
Operator + bucketingVersion: 2 + null sort order: + numBuckets: -1 + sort order: + Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE + tag: -1 + value expressions: _col0 (type: bigint) + auto parallelism: false +Execution mode: llap +LLAP IO: no inputs +Path -> Alias: + A masked pattern was here +Path -> Partition: + A masked pattern was here +Partition + base file name: src + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: +bucket_count -1 +bucketing_version 2 +column.name.delimiter , +columns key,value +columns.types string:string + A masked pattern was here +name default.src +serialization.format 1 +serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + +input format: org.apache.hadoop.mapred.TextInputFormat +output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat +properties: + bucketing_version 2 + column.name.delimiter , + columns key,value + columns.comments 'default','default' + columns.types string:string + A masked pattern was here + name default.src + serialization.format 1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +name: default.src + name: default.src +Truncated Path -> Alias: + /src [src] +Reducer 2 +Execution mode: llap +Needs Tagging: false +Reduce
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467435=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467435 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 17:23 Start Date: 06/Aug/20 17:23 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466568964 ## File path: ql/src/test/results/clientpositive/llap/prepare_plan.q.out ## @@ -0,0 +1,1575 @@ +PREHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +PREHOOK: type: QUERY +PREHOOK: Input: default@src + A masked pattern was here +POSTHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +POSTHOOK: type: QUERY +POSTHOOK: Input: default@src + A masked pattern was here +OPTIMIZED SQL: SELECT COUNT(*) AS `$f0` +FROM `default`.`src` +WHERE `key` > CAST(? AS STRING) +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 +Tez + A masked pattern was here + Edges: +Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) + A masked pattern was here + Vertices: +Map 1 +Map Operator Tree: +TableScan + alias: src + filterExpr: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean) + Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE + GatherStats: false + Filter Operator +isSamplingPred: false +predicate: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean) +Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE +Select Operator + Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE + Group By Operator +aggregations: count() +minReductionHashAggr: 0.99 +mode: hash +outputColumnNames: _col0 +Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE +Reduce Output 
Operator + bucketingVersion: 2 + null sort order: + numBuckets: -1 + sort order: + Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE + tag: -1 + value expressions: _col0 (type: bigint) + auto parallelism: false +Execution mode: llap +LLAP IO: no inputs +Path -> Alias: + A masked pattern was here +Path -> Partition: + A masked pattern was here +Partition + base file name: src + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: +bucket_count -1 +bucketing_version 2 +column.name.delimiter , +columns key,value +columns.types string:string + A masked pattern was here +name default.src +serialization.format 1 +serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + +input format: org.apache.hadoop.mapred.TextInputFormat +output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat +properties: + bucketing_version 2 + column.name.delimiter , + columns key,value + columns.comments 'default','default' + columns.types string:string + A masked pattern was here + name default.src + serialization.format 1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +name: default.src + name: default.src +Truncated Path -> Alias: + /src [src] +Reducer 2 +Execution mode: llap +Needs Tagging: false +Reduce
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467433=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467433 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 17:22 Start Date: 06/Aug/20 17:22 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466567882 ## File path: ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java ## @@ -205,7 +205,9 @@ DROP_MAPPING("DROP MAPPING", HiveParser.TOK_DROP_MAPPING, null, null, false, false), CREATE_SCHEDULED_QUERY("CREATE SCHEDULED QUERY", HiveParser.TOK_CREATE_SCHEDULED_QUERY, null, null), ALTER_SCHEDULED_QUERY("ALTER SCHEDULED QUERY", HiveParser.TOK_ALTER_SCHEDULED_QUERY, null, null), - DROP_SCHEDULED_QUERY("DROP SCHEDULED QUERY", HiveParser.TOK_DROP_SCHEDULED_QUERY, null, null) + DROP_SCHEDULED_QUERY("DROP SCHEDULED QUERY", HiveParser.TOK_DROP_SCHEDULED_QUERY, null, null), + PREPARE("PREPARE QUERY", HiveParser.TOK_PREPARE, null, null), + EXECUTE("EXECUTE QUERY", HiveParser.TOK_EXECUTE, null, null) Review comment: Follow-up: https://issues.apache.org/jira/browse/HIVE-24007 Issue Time Tracking --- Worklog Id: (was: 467433) Time Spent: 4h 40m (was: 4.5h)
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467421=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467421 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 17:16 Start Date: 06/Aug/20 17:16 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466563430 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/ExecuteStatementAnalyzer.java ## @@ -0,0 +1,377 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.ddl.table.drop; + +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.exec.ExplainTask; +import org.apache.hadoop.hive.ql.exec.FetchTask; +import org.apache.hadoop.hive.ql.exec.FilterOperator; +import org.apache.hadoop.hive.ql.exec.Operator; +import org.apache.hadoop.hive.ql.exec.OperatorUtils; +import org.apache.hadoop.hive.ql.exec.SelectOperator; +import org.apache.hadoop.hive.ql.exec.SerializationUtilities; +import org.apache.hadoop.hive.ql.exec.Task; +import org.apache.hadoop.hive.ql.exec.Utilities; +import org.apache.hadoop.hive.ql.exec.tez.TezTask; +import org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.parse.type.ExprNodeDescExprFactory; +import org.apache.hadoop.hive.ql.plan.BaseWork; +import org.apache.hadoop.hive.ql.plan.ExprDynamicParamDesc; +import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc; +import org.apache.hadoop.hive.ql.plan.ExprNodeDesc; +import org.apache.hadoop.hive.ql.session.SessionState; +import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; +import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +/** + * Analyzer for Execute statement. 
+ * This analyzer + * retreives cached {@link BaseSemanticAnalyzer}, + * makes copy of all tasks by serializing/deserializing it, + * bind dynamic parameters inside cached {@link BaseSemanticAnalyzer} using values provided + */ +@DDLType(types = HiveParser.TOK_EXECUTE) +public class ExecuteStatementAnalyzer extends BaseSemanticAnalyzer { + + public ExecuteStatementAnalyzer(QueryState queryState) throws SemanticException { +super(queryState); + } + + /** + * This class encapsulate all {@link Task} required to be copied. + * This is required because {@link FetchTask} list of {@link Task} may hold reference to same + * objects (e.g. list of result files) and are required to be serialized/de-serialized together. + */ + private class PlanCopy { +FetchTask fetchTask; +List<Task<?>> tasks; + +PlanCopy(FetchTask fetchTask, List<Task<?>> tasks) { + this.fetchTask = fetchTask; + this.tasks = tasks; +} + +FetchTask getFetchTask() { + return fetchTask; +} + +List<Task<?>> getTasks() { + return tasks; +} + } + + private String getQueryName(ASTNode root) { +ASTNode queryNameAST = (ASTNode)(root.getChild(1)); +return queryNameAST.getText(); + } + + /** + * Utility method to create copy of provided object using kyro serialization/de-serialization. + */ + private <T> T makeCopy(final Object task, Class<T> objClass) { +ByteArrayOutputStream baos = new ByteArrayOutputStream(); +SerializationUtilities.serializePlan(task, baos); + +return SerializationUtilities.deserializePlan( +new ByteArrayInputStream(baos.toByteArray()), objClass); +
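The copy-by-serialization idea behind `makeCopy` and `PlanCopy` can be illustrated outside Hive. The sketch below is not the patch's code: it swaps Hive's Kryo-based `SerializationUtilities` for plain `java.io` serialization, and `PlanCopySketch`, `Plan`, `resultFiles` and `fetchView` are hypothetical names introduced only for illustration. It shows why objects that share a reference are best copied in one round trip: the sharing survives inside the copy, while the copy is detached from the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class PlanCopySketch {

  /** Hypothetical stand-in for a plan whose two parts point at the same list. */
  static final class Plan implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> resultFiles;
    final List<String> fetchView;

    Plan(List<String> shared) {
      this.resultFiles = shared;
      this.fetchView = shared; // same object on purpose
    }
  }

  /** Deep-copies an object by writing it to a byte array and reading it back. */
  static <T> T makeCopy(Object source, Class<T> type) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
      out.writeObject(source);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
      return type.cast(in.readObject());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    List<String> files = new ArrayList<>();
    files.add("part-0");
    Plan original = new Plan(files);
    Plan copy = makeCopy(original, Plan.class);

    copy.resultFiles.add("part-1");
    System.out.println(copy.fetchView.size());        // 2: sharing kept inside the copy
    System.out.println(original.resultFiles.size());  // 1: original untouched
  }
}
```

Copying the two fields in separate round trips would instead produce two independent lists, which is the situation the `PlanCopy` Javadoc appears to guard against.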
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467419=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467419 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 17:14 Start Date: 06/Aug/20 17:14 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466561873 ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementParser.g ## @@ -0,0 +1,66 @@ +/** + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+*/ +parser grammar PrepareStatementParser; + +options +{ +output=AST; +ASTLabelType=ASTNode; +backtrack=false; +k=3; +} + +@members { + @Override + public Object recoverFromMismatchedSet(IntStream input, + RecognitionException re, BitSet follow) throws RecognitionException { +throw re; + } + @Override + public void displayRecognitionError(String[] tokenNames, + RecognitionException e) { +gParent.errors.add(new ParseError(gParent, e, tokenNames)); + } +} + +@rulecatch { +catch (RecognitionException e) { + throw e; +} +} + +//--- Rules for parsing Prepare statement- +prepareStatement +@init { gParent.pushMsg("prepare statement ", state); } +@after { gParent.popMsg(state); } +: KW_PREPARE identifier KW_FROM queryStatementExpression +-> ^(TOK_PREPARE queryStatementExpression identifier) +; + +executeStatement +@init { gParent.pushMsg("execute statement ", state); } +@after { gParent.popMsg(state); } +: KW_EXECUTE identifier KW_USING executeParamList +-> ^(TOK_EXECUTE executeParamList identifier) +; + +executeParamList +@init { gParent.pushMsg("execute param list", state); } +@after { gParent.popMsg(state); } +: constant (COMMA constant)* Review comment: https://issues.apache.org/jira/browse/HIVE-24002 Issue Time Tracking --- Worklog Id: (was: 467419) Time Spent: 4h 20m (was: 4h 10m)
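The grammar above pairs `EXECUTE name USING c1, c2, ...` with the `?` placeholders of a prepared query by position. As a rough illustration of that binding step, here is a minimal sketch. It is not the analyzer from the pull request: the `Expr` hierarchy, `bind`, and the class name `ParamBindingSketch` are hypothetical, and the only detail taken from the thread is that parameters are numbered from 1 (the plans print `Dynamic Parameter index: 1` and `$1`).

```java
import java.util.Arrays;
import java.util.List;

final class ParamBindingSketch {

  interface Expr {}

  static final class Column implements Expr {
    final String name;
    Column(String name) { this.name = name; }
    @Override public String toString() { return name; }
  }

  static final class Constant implements Expr {
    final Object value;
    Constant(Object value) { this.value = value; }
    @Override public String toString() { return String.valueOf(value); }
  }

  /** Placeholder for a '?' in the prepared query; index is 1-based. */
  static final class DynamicParam implements Expr {
    final int index;
    DynamicParam(int index) { this.index = index; }
    @Override public String toString() { return "$" + index; }
  }

  static final class GreaterThan implements Expr {
    final Expr left;
    final Expr right;
    GreaterThan(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override public String toString() { return "(" + left + " > " + right + ")"; }
  }

  /** Returns a new tree in which every placeholder is replaced by its supplied value. */
  static Expr bind(Expr expr, List<?> params) {
    if (expr instanceof DynamicParam) {
      int index = ((DynamicParam) expr).index;
      if (index < 1 || index > params.size()) {
        throw new IllegalArgumentException("no value supplied for parameter " + index);
      }
      return new Constant(params.get(index - 1));
    }
    if (expr instanceof GreaterThan) {
      GreaterThan gt = (GreaterThan) expr;
      return new GreaterThan(bind(gt.left, params), bind(gt.right, params));
    }
    return expr; // columns and constants are left as they are
  }

  public static void main(String[] args) {
    Expr prepared = new GreaterThan(new Column("key"), new DynamicParam(1));
    System.out.println(prepared);                            // (key > $1)
    System.out.println(bind(prepared, Arrays.asList(200)));  // (key > 200)
  }
}
```

Returning a new tree rather than mutating the prepared one keeps the cached form reusable for the next `EXECUTE` with different values.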
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467384=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467384 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 16:26 Start Date: 06/Aug/20 16:26 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466534477 ## File path: ql/src/test/queries/clientpositive/prepare_plan.q ## @@ -0,0 +1,113 @@ +--! qt:dataset:src +--! qt:dataset:alltypesorc + +set hive.explain.user=false; +set hive.vectorized.execution.enabled=false; + +explain extended prepare pcount from select count(*) from src where key > ?; +prepare pcount from select count(*) from src where key > ?; +execute pcount using 200; + +-- single param +explain extended prepare p1 from select * from src where key > ? order by key limit 10; +prepare p1 from select * from src where key > ? order by key limit 10; + +execute p1 using 200; + +-- same query, different param +execute p1 using 0; + +-- same query, negative param +--TODO: fails (constant in grammar do not support negatives) +-- execute p1 using -1; Review comment: https://issues.apache.org/jira/browse/HIVE-24002
Issue Time Tracking --- Worklog Id: (was: 467384) Time Spent: 4h 10m (was: 4h)
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467381=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467381 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 16:25 Start Date: 06/Aug/20 16:25 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466533583 ## File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ## @@ -1619,6 +1620,12 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis colName = enfd.getFieldName(); colType = enfd.getTypeString(); countDistincts = numRows; +} else if (end instanceof ExprDynamicParamDesc) { + //skip collecting stats for parameters Review comment: https://issues.apache.org/jira/browse/HIVE-24003 Issue Time Tracking --- Worklog Id: (was: 467381) Time Spent: 4h (was: 3h 50m)
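The diff above skips statistics collection when the expression is an `ExprDynamicParamDesc`, since the parameter's value is unknown at PREPARE time. One way to picture the consequence for a filter estimate is the sketch below. Everything in it is hypothetical (`ParamStatsSketch`, `estimateFilteredRows`, the `statsSelectivity` argument), and the one-third fallback is an assumption chosen only because the explain output quoted in this thread shows `key > ?` reducing 500 rows to 166; it is not taken from the `StatsUtils` change itself.

```java
final class ParamStatsSketch {

  interface Expr {}

  static final class Constant implements Expr {
    final Object value;
    Constant(Object value) { this.value = value; }
  }

  /** Placeholder whose value is only known at EXECUTE time. */
  static final class DynamicParam implements Expr {
    final int index;
    DynamicParam(int index) { this.index = index; }
  }

  /**
   * Estimates the rows surviving a comparison filter.
   * statsSelectivity is the estimate a stats-based rule would give for a known constant.
   */
  static long estimateFilteredRows(long numRows, Expr operand, double statsSelectivity) {
    if (operand instanceof DynamicParam) {
      // Value unknown: no column statistics can be consulted, so fall back
      // to a fixed guess (assumed here to be one third of the input rows).
      return numRows / 3;
    }
    return (long) (numRows * statsSelectivity);
  }

  public static void main(String[] args) {
    System.out.println(estimateFilteredRows(500, new DynamicParam(1), 0.5));  // 166
    System.out.println(estimateFilteredRows(500, new Constant("200"), 0.5));  // 250
  }
}
```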
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467380 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 16:24 Start Date: 06/Aug/20 16:24 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466533366 ## File path: ql/src/test/results/clientpositive/llap/udf_greatest.q.out ## @@ -63,7 +63,7 @@ STAGE PLANS: alias: src Row Limit Per Split: 1 Select Operator -expressions: 'c' (type: string), 'a' (type: string), 'AaA' (type: string), 'AAA' (type: string), '13' (type: string), '2' (type: string), '03' (type: string), '1' (type: string), null (type: double), null (type: double), null (type: double), null (type: double), null (type: double), null (type: double) Review comment: The patch includes a small change that updates the type inference rule for void/null. Before the change, the null expressions were inferred as `Double` in this case. With the change they are inferred as `String`, since the rest of the expressions within this UDF (e.g. `GREATEST('a', 'b', null)`) are interpreted as strings. Issue Time Tracking --- Worklog Id: (was: 467380) Time Spent: 3h 50m (was: 3h 40m)
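The review comment above describes a NULL argument taking its type from the other arguments of the UDF instead of defaulting to double. A toy version of such a rule is sketched below. It is not Hive's type inference: `NullTypeInferenceSketch` and `commonType` are hypothetical, types are plain strings, and the "mixed types fall back to double" branch is a simplification chosen only to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

final class NullTypeInferenceSketch {

  /** Picks a common type for the arguments, letting "void" (NULL) adopt its siblings' type. */
  static String commonType(List<String> argTypes) {
    String result = "void";
    for (String type : argTypes) {
      if (type.equals("void")) {
        continue; // a NULL literal carries no type information of its own
      }
      if (result.equals("void")) {
        result = type;
      } else if (!result.equals(type)) {
        result = "double"; // simplification for this sketch only
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Mirrors GREATEST('a', 'b', null): the NULL is typed as string.
    System.out.println(commonType(Arrays.asList("string", "string", "void")));  // string
    System.out.println(commonType(Arrays.asList("void", "void")));              // void
  }
}
```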
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467379 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 16:22 Start Date: 06/Aug/20 16:22 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466531974 ## File path: ql/src/test/results/clientpositive/llap/prepare_plan.q.out ## @@ -0,0 +1,2512 @@ +PREHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +PREHOOK: type: QUERY +PREHOOK: Input: default@src + A masked pattern was here +POSTHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +POSTHOOK: type: QUERY +POSTHOOK: Input: default@src + A masked pattern was here +OPTIMIZED SQL: SELECT COUNT(*) AS `$f0` +FROM `default`.`src` +WHERE `key` > CAST(? AS STRING) +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 +Tez + A masked pattern was here + Edges: +Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) + A masked pattern was here + Vertices: +Map 1 +Map Operator Tree: +TableScan + alias: src + filterExpr: (key > CAST( $1 AS STRING)) (type: boolean) + Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE + GatherStats: false + Filter Operator +isSamplingPred: false +predicate: (key > CAST( $1 AS STRING)) (type: boolean) +Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE +Select Operator + Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE + Group By Operator +aggregations: count() +minReductionHashAggr: 0.99 +mode: hash +outputColumnNames: _col0 +Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE +Reduce Output Operator + bucketingVersion: 2 + null sort order: + 
numBuckets: -1 + sort order: + Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE + tag: -1 + value expressions: _col0 (type: bigint) + auto parallelism: false +Execution mode: llap +LLAP IO: all inputs +Path -> Alias: + A masked pattern was here +Path -> Partition: + A masked pattern was here +Partition + base file name: src + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: +bucket_count -1 +bucketing_version 2 +column.name.delimiter , +columns key,value +columns.types string:string + A masked pattern was here +name default.src +serialization.format 1 +serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + +input format: org.apache.hadoop.mapred.TextInputFormat +output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat +properties: + bucketing_version 2 + column.name.delimiter , + columns key,value + columns.comments 'default','default' + columns.types string:string + A masked pattern was here + name default.src + serialization.format 1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +name: default.src + name: default.src +Truncated Path -> Alias: + /src [src] +Reducer 2 +Execution mode: llap +Needs Tagging: false +Reduce Operator Tree: + Group By Operator +
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=467033=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-467033 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 06/Aug/20 01:14 Start Date: 06/Aug/20 01:14 Worklog Time Spent: 10m Work Description: jcamachor commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r466085537 ## File path: ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java ## @@ -121,6 +121,8 @@ private final DDLDescWithWriteId acidDdlDesc; private Boolean autoCommitValue; + private Boolean prepareQuery; Review comment: It seems this should be a boolean given the return type of the methods getter/setter. ## File path: ql/src/java/org/apache/hadoop/hive/ql/Compiler.java ## @@ -338,12 +339,22 @@ private QueryPlan createPlan(BaseSemanticAnalyzer sem) { plan.setOptimizedCBOPlan(context.getCalcitePlan()); plan.setOptimizedQueryString(context.getOptimizedSql()); +// this is required so that later driver can skip executing prepare queries +if (sem.getIsPrepareQuery()) { Review comment: `getIsPrepareQuery` -> `isPrepareQuery` ## File path: parser/src/java/org/apache/hadoop/hive/ql/parse/PrepareStatementParser.g ## @@ -0,0 +1,66 @@ +/** + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. +*/ +parser grammar PrepareStatementParser; + +options +{ +output=AST; +ASTLabelType=ASTNode; +backtrack=false; +k=3; +} + +@members { + @Override + public Object recoverFromMismatchedSet(IntStream input, + RecognitionException re, BitSet follow) throws RecognitionException { +throw re; + } + @Override + public void displayRecognitionError(String[] tokenNames, + RecognitionException e) { +gParent.errors.add(new ParseError(gParent, e, tokenNames)); + } +} + +@rulecatch { +catch (RecognitionException e) { + throw e; +} +} + +//--- Rules for parsing Prepare statement- +prepareStatement +@init { gParent.pushMsg("prepare statement ", state); } +@after { gParent.popMsg(state); } +: KW_PREPARE identifier KW_FROM queryStatementExpression +-> ^(TOK_PREPARE queryStatementExpression identifier) +; + +executeStatement +@init { gParent.pushMsg("execute statement ", state); } +@after { gParent.popMsg(state); } +: KW_EXECUTE identifier KW_USING executeParamList +-> ^(TOK_EXECUTE executeParamList identifier) +; + +executeParamList +@init { gParent.pushMsg("execute param list", state); } +@after { gParent.popMsg(state); } +: constant (COMMA constant)* Review comment: Is there a JIRA? ## File path: ql/src/test/results/clientpositive/llap/udf_greatest.q.out ## @@ -63,7 +63,7 @@ STAGE PLANS: alias: src Row Limit Per Split: 1 Select Operator -expressions: 'c' (type: string), 'a' (type: string), 'AaA' (type: string), 'AAA' (type: string), '13' (type: string), '2' (type: string), '03' (type: string), '1' (type: string), null (type: double), null (type: double), null (type: double), null (type: double), null (type: double), null (type: double) Review comment: Why did these types change? Seems unrelated to this patch. Was it an existing bug? 
## File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ## @@ -1619,6 +1620,12 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis colName = enfd.getFieldName(); colType = enfd.getTypeString(); countDistincts = numRows; +} else if (end instanceof ExprDynamicParamDesc) { + //skip collecting stats for parameters Review comment: JIRA? ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/ExecuteStatementAnalyzer.java ## @@ -0,0 +1,320 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=466567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-466567 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 05/Aug/20 02:59 Start Date: 05/Aug/20 02:59 Worklog Time Spent: 10m Work Description: jcamachor commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r465443070 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/ExecuteStatementAnalyzer.java ## @@ -0,0 +1,377 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.ddl.table.drop; + +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.exec.ExplainTask; +import org.apache.hadoop.hive.ql.exec.FetchTask; +import org.apache.hadoop.hive.ql.exec.FilterOperator; +import org.apache.hadoop.hive.ql.exec.Operator; +import org.apache.hadoop.hive.ql.exec.OperatorUtils; +import org.apache.hadoop.hive.ql.exec.SelectOperator; +import org.apache.hadoop.hive.ql.exec.SerializationUtilities; +import org.apache.hadoop.hive.ql.exec.Task; +import org.apache.hadoop.hive.ql.exec.Utilities; +import org.apache.hadoop.hive.ql.exec.tez.TezTask; +import org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.parse.type.ExprNodeDescExprFactory; +import org.apache.hadoop.hive.ql.plan.BaseWork; +import org.apache.hadoop.hive.ql.plan.ExprDynamicParamDesc; +import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc; +import org.apache.hadoop.hive.ql.plan.ExprNodeDesc; +import org.apache.hadoop.hive.ql.session.SessionState; +import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; +import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +/** + * Analyzer for Execute statement. 
+ * This analyzer
+ * retrieves cached {@link BaseSemanticAnalyzer},
+ * makes copy of all tasks by serializing/deserializing it,
+ * bind dynamic parameters inside cached {@link BaseSemanticAnalyzer} using values provided
+ */
+@DDLType(types = HiveParser.TOK_EXECUTE)
+public class ExecuteStatementAnalyzer extends BaseSemanticAnalyzer {
+
+  public ExecuteStatementAnalyzer(QueryState queryState) throws SemanticException {
+    super(queryState);
+  }
+
+  /**
+   * This class encapsulates all {@link Task} required to be copied.
+   * This is required because {@link FetchTask} and the list of {@link Task} may hold reference to same
+   * objects (e.g. list of result files) and are required to be serialized/de-serialized together.
+   */
+  private class PlanCopy {
+    FetchTask fetchTask;
+    List<Task<?>> tasks;
+
+    PlanCopy(FetchTask fetchTask, List<Task<?>> tasks) {
+      this.fetchTask = fetchTask;
+      this.tasks = tasks;
+    }
+
+    FetchTask getFetchTask() {
+      return fetchTask;
+    }
+
+    List<Task<?>> getTasks() {
+      return tasks;
+    }
+  }
+
+  private String getQueryName(ASTNode root) {
+    ASTNode queryNameAST = (ASTNode)(root.getChild(1));
+    return queryNameAST.getText();
+  }
+
+  /**
+   * Utility method to create copy of provided object using Kryo serialization/de-serialization.
+   */
+  private <T> T makeCopy(final Object task, Class<T> objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    SerializationUtilities.serializePlan(task, baos);
+
+    return SerializationUtilities.deserializePlan(
+        new ByteArrayInputStream(baos.toByteArray()), objClass);
+  }
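The `PlanCopy` Javadoc quoted above gives the reason the fetch task and the task list are copied as one unit: they may point at the same objects, and that sharing only survives if both go through a single serialize/deserialize round trip. A minimal sketch of that idea follows. It uses plain `java.io` object serialization as a stand-in for the Kryo-based `SerializationUtilities` calls in the patch, and `PlanBundle` and `SerializationCopy` are hypothetical names invented for this illustration, not Hive classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.List;

/** Hypothetical stand-in for PlanCopy: two fields that may share objects. */
final class PlanBundle implements Serializable {
  private static final long serialVersionUID = 1L;

  final List<String> fetchFiles;        // plays the role of the fetch task state
  final List<List<String>> taskOutputs; // plays the role of the task list

  PlanBundle(List<String> fetchFiles, List<List<String>> taskOutputs) {
    this.fetchFiles = fetchFiles;
    this.taskOutputs = taskOutputs;
  }
}

final class SerializationCopy {
  private SerializationCopy() {
  }

  /** Deep-copies an object graph by writing it to a byte buffer and reading it back. */
  static <T extends Serializable> T makeCopy(T original, Class<T> type) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    // Closing the stream flushes everything into baos before it is read back.
    try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
      out.writeObject(original);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
      return type.cast(in.readObject());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Because both fields travel in one object graph, a list referenced from `fetchFiles` and from `taskOutputs` comes back as a single shared list in the copy. Copying the two fields in separate round trips would produce two unrelated lists instead.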
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=466495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-466495 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 04/Aug/20 22:19 Start Date: 04/Aug/20 22:19 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r465362372 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/ExecuteStatementAnalyzer.java ## @@ -0,0 +1,377 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.ddl.table.drop; + +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.exec.ExplainTask; +import org.apache.hadoop.hive.ql.exec.FetchTask; +import org.apache.hadoop.hive.ql.exec.FilterOperator; +import org.apache.hadoop.hive.ql.exec.Operator; +import org.apache.hadoop.hive.ql.exec.OperatorUtils; +import org.apache.hadoop.hive.ql.exec.SelectOperator; +import org.apache.hadoop.hive.ql.exec.SerializationUtilities; +import org.apache.hadoop.hive.ql.exec.Task; +import org.apache.hadoop.hive.ql.exec.Utilities; +import org.apache.hadoop.hive.ql.exec.tez.TezTask; +import org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.parse.type.ExprNodeDescExprFactory; +import org.apache.hadoop.hive.ql.plan.BaseWork; +import org.apache.hadoop.hive.ql.plan.ExprDynamicParamDesc; +import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc; +import org.apache.hadoop.hive.ql.plan.ExprNodeDesc; +import org.apache.hadoop.hive.ql.session.SessionState; +import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; +import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +/** + * Analyzer for Execute statement. 
+ * This analyzer
+ * retrieves cached {@link BaseSemanticAnalyzer},
+ * makes copy of all tasks by serializing/deserializing it,
+ * bind dynamic parameters inside cached {@link BaseSemanticAnalyzer} using values provided
+ */
+@DDLType(types = HiveParser.TOK_EXECUTE)
+public class ExecuteStatementAnalyzer extends BaseSemanticAnalyzer {
+
+  public ExecuteStatementAnalyzer(QueryState queryState) throws SemanticException {
+    super(queryState);
+  }
+
+  /**
+   * This class encapsulates all {@link Task} required to be copied.
+   * This is required because {@link FetchTask} and the list of {@link Task} may hold reference to same
+   * objects (e.g. list of result files) and are required to be serialized/de-serialized together.
+   */
+  private class PlanCopy {
+    FetchTask fetchTask;
+    List<Task<?>> tasks;
+
+    PlanCopy(FetchTask fetchTask, List<Task<?>> tasks) {
+      this.fetchTask = fetchTask;
+      this.tasks = tasks;
+    }
+
+    FetchTask getFetchTask() {
+      return fetchTask;
+    }
+
+    List<Task<?>> getTasks() {
+      return tasks;
+    }
+  }
+
+  private String getQueryName(ASTNode root) {
+    ASTNode queryNameAST = (ASTNode)(root.getChild(1));
+    return queryNameAST.getText();
+  }
+
+  /**
+   * Utility method to create copy of provided object using Kryo serialization/de-serialization.
+   */
+  private <T> T makeCopy(final Object task, Class<T> objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    SerializationUtilities.serializePlan(task, baos);
+
+    return SerializationUtilities.deserializePlan(
+        new ByteArrayInputStream(baos.toByteArray()), objClass);
+
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=466486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-466486 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 04/Aug/20 22:06
Start Date: 04/Aug/20 22:06
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r465356752

## File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ##
@@ -1619,6 +1620,9 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis
       colName = enfd.getFieldName();
       colType = enfd.getTypeString();
       countDistincts = numRows;
+    } else if (end instanceof ExprDynamicParamDesc) {
+      //skip colecting stats for parameters

Review comment:
Never mind. Col stats require a column name and a type name. Since both are missing for a dynamic param expression, it is not possible to create a col stats object. I will update the comment as you suggested.

Issue Time Tracking
Worklog Id: (was: 466486)
Time Spent: 3h (was: 2h 50m)
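The reasoning in the comment above can be sketched in isolation: a stats object is keyed by a column name and a type name, so an expression that supplies neither has nothing to build it from. The classes below (`Expr`, `ColumnRef`, `DynamicParam`, `ColStat`, `ColStatSketch`) are hypothetical simplifications written for this illustration and are not Hive's `ExprNodeDesc` or `ColStatistics` types. Only the `countDistincts = numRows` assignment mirrors the quoted hunk.

```java
/** Hypothetical expression kinds; these are not Hive's ExprNodeDesc classes. */
interface Expr {
}

final class ColumnRef implements Expr {
  final String name;
  final String type;

  ColumnRef(String name, String type) {
    this.name = name;
    this.type = type;
  }
}

final class DynamicParam implements Expr {
  final int index;

  DynamicParam(int index) {
    this.index = index;
  }
}

/** Hypothetical stats record keyed by column name and type name. */
final class ColStat {
  final String colName;
  final String colType;
  final long countDistinct;

  ColStat(String colName, String colType, long countDistinct) {
    this.colName = colName;
    this.colType = colType;
    this.countDistinct = countDistinct;
  }
}

final class ColStatSketch {
  private ColStatSketch() {
  }

  /** Returns stats for the expression, or null when no stats object can be built. */
  static ColStat statsFor(Expr expr, long numRows) {
    if (expr instanceof ColumnRef) {
      ColumnRef col = (ColumnRef) expr;
      // Mirrors the quoted hunk: the distinct count is set to the row count.
      return new ColStat(col.name, col.type, numRows);
    }
    if (expr instanceof DynamicParam) {
      // A parameter carries no column name or type name here, so it is skipped.
      return null;
    }
    throw new IllegalArgumentException("unsupported expression: " + expr);
  }
}
```

Callers of such a method have to treat a null result as "no estimate available", which is the behavior the thread settles on for dynamic parameters.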
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=466478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-466478 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 04/Aug/20 21:49 Start Date: 04/Aug/20 21:49 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r465349679 ## File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ExprDynamicParamDesc.java ## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.plan; + +import java.io.Serializable; +import java.util.List; + +import org.apache.commons.lang3.builder.HashCodeBuilder; +import org.apache.hadoop.hive.common.StringInternUtils; +import org.apache.hadoop.hive.serde.serdeConstants; +import org.apache.hadoop.hive.serde2.objectinspector.ConstantObjectInspector; +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category; +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils; +import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; +import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils; + +/** + * A constant expression. + */ +public class ExprDynamicParamDesc extends ExprNodeDesc implements Serializable { + private static final long serialVersionUID = 1L; + final protected transient static char[] hexArray = "0123456789ABCDEF".toCharArray(); + + private int index; + private Object value; + + public ExprDynamicParamDesc() { + } + + public ExprDynamicParamDesc(TypeInfo typeInfo, int index, Object value) { +super(typeInfo); +this.index = index; +this.value = value; + } + + public Object getValue() { +return value; + } + + public int getIndex() { +return index; + } + + + @Override + public String toString() { +return "Dynamic Parameter " + " index: " + index; Review comment: Ok let me update it to `$ Support parameterized queries in WHERE/HAVING clause > > > Key: HIVE-23951 > URL: https://issues.apache.org/jira/browse/HIVE-23951 > Project: Hive > Issue Type: Sub-task > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465969 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 04/Aug/20 00:24
Start Date: 04/Aug/20 00:24
Worklog Time Spent: 10m
Work Description: jcamachor commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464729855

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java ##
@@ -49,6 +50,8 @@
 import com.google.common.collect.Lists;
 import com.google.common.collect.Multimap;

+import static org.apache.hadoop.hive.ql.optimizer.physical.AnnotateRunTimeStatsOptimizer.getAllOperatorsForSimpleFetch;

Review comment:
Yes, I meant `getAllOperatorsForSimpleFetch`, since it seems to be used beyond the scope of `AnnotateRunTimeStatsOptimizer`.

Issue Time Tracking
Worklog Id: (was: 465969)
Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465963 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 23:57
Start Date: 03/Aug/20 23:57
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464722308

## File path: ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java ##
@@ -205,7 +205,9 @@
   DROP_MAPPING("DROP MAPPING", HiveParser.TOK_DROP_MAPPING, null, null, false, false),
   CREATE_SCHEDULED_QUERY("CREATE SCHEDULED QUERY", HiveParser.TOK_CREATE_SCHEDULED_QUERY, null, null),
   ALTER_SCHEDULED_QUERY("ALTER SCHEDULED QUERY", HiveParser.TOK_ALTER_SCHEDULED_QUERY, null, null),
-  DROP_SCHEDULED_QUERY("DROP SCHEDULED QUERY", HiveParser.TOK_DROP_SCHEDULED_QUERY, null, null)
+  DROP_SCHEDULED_QUERY("DROP SCHEDULED QUERY", HiveParser.TOK_DROP_SCHEDULED_QUERY, null, null),
+  PREPARE("PREPARE QUERY", HiveParser.TOK_PREPARE, null, null),
+  EXECUTE("EXECUTE QUERY", HiveParser.TOK_EXECUTE, null, null)

Review comment:
IIRC, I had to make this change for explain plan to work. Let me re-investigate and get back to you.

Issue Time Tracking
Worklog Id: (was: 465963)
Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465962&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465962 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 23:56
Start Date: 03/Aug/20 23:56
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464722096

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java ##
@@ -283,6 +283,33 @@ public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
   }

+  /**
+   * Processor for processing Dynamic expression.
+   */
+  public class DynamicParameterProcessor implements SemanticNodeProcessor {
+
+    @Override
+    public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
+        Object... nodeOutputs) throws SemanticException {
+      TypeCheckCtx ctx = (TypeCheckCtx) procCtx;
+      if (ctx.getError() != null) {
+        return null;
+      }
+
+      T desc = processGByExpr(nd, procCtx);

Review comment:
No, I believe this is not required. I will remove it.

Issue Time Tracking
Worklog Id: (was: 465962)
Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465959=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465959 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 03/Aug/20 23:55 Start Date: 03/Aug/20 23:55 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r464721831 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/PrepareStatementAnalyzer.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.ddl.table.drop; + +import org.apache.hadoop.hive.ql.QueryState; +import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType; +import org.apache.hadoop.hive.ql.parse.ASTNode; +import org.apache.hadoop.hive.ql.parse.CalcitePlanner; +import org.apache.hadoop.hive.ql.parse.HiveParser; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.session.SessionState; + +/** + * Analyzer for Prepare queries. 
This analyzer generates a plan for the parameterized query
+ * and saves it in the cache
+ */
+@DDLType(types = HiveParser.TOK_PREPARE)
+public class PrepareStatementAnalyzer extends CalcitePlanner {
+
+  public PrepareStatementAnalyzer(QueryState queryState) throws SemanticException {
+    super(queryState);
+  }
+
+  private String getQueryName(ASTNode root) {
+    ASTNode queryNameAST = (ASTNode)(root.getChild(1));
+    return queryNameAST.getText();
+  }
+
+  /**
+   * This method saves the current {@link PrepareStatementAnalyzer} object as well as
+   * the config used to compile the plan.
+   * @param root
+   * @throws SemanticException
+   */
+  private void savePlan(String queryName) throws SemanticException{
+    SessionState ss = SessionState.get();
+    assert(ss != null);
+
+    if (ss.getPreparePlans().containsKey(queryName)) {
+      throw new SemanticException("Prepare query: " + queryName + " already exists.");
+    }
+    ss.getPreparePlans().put(queryName, this);

Review comment:
Yes, will update the code.

Issue Time Tracking
Worklog Id: (was: 465959)
Time Spent: 2h (was: 1h 50m)
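The `savePlan` hunk quoted above keeps prepared plans in a per-session map keyed by query name and rejects a name that is already taken. A minimal standalone sketch of that contract follows. `PreparedPlanRegistry` is a hypothetical class invented for this illustration and is not part of `SessionState`. Using `putIfAbsent` to fold the check and the insert into one map operation is a choice made for the sketch, not something the thread confirms the patch adopted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical registry of prepared plans keyed by query name. */
final class PreparedPlanRegistry<P> {
  private final Map<String, P> plans = new HashMap<>();

  /** Stores a plan under the given name and fails if the name is already in use. */
  void save(String queryName, P plan) {
    Objects.requireNonNull(queryName, "queryName");
    Objects.requireNonNull(plan, "plan");
    // putIfAbsent returns the existing value when the key is already mapped.
    if (plans.putIfAbsent(queryName, plan) != null) {
      throw new IllegalStateException("Prepare query: " + queryName + " already exists.");
    }
  }

  /** Returns the plan saved under the given name and fails if there is none. */
  P lookup(String queryName) {
    P plan = plans.get(queryName);
    if (plan == null) {
      throw new IllegalStateException("Prepare query: " + queryName + " does not exist.");
    }
    return plan;
  }
}
```

A rejected `save` leaves the earlier plan in place, so a later lookup of the same name still resolves to the plan that was prepared first.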
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465961 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 23:55
Start Date: 03/Aug/20 23:55
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464722011

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorUtils.java ##
@@ -49,6 +50,8 @@
 import com.google.common.collect.Lists;
 import com.google.common.collect.Multimap;

+import static org.apache.hadoop.hive.ql.optimizer.physical.AnnotateRunTimeStatsOptimizer.getAllOperatorsForSimpleFetch;

Review comment:
You mean move `getAllOperatorsForSimpleFetch` from `AnnotateRunTimeStatsOptimizer`?

Issue Time Tracking
Worklog Id: (was: 465961)
Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465951 ] ASF GitHub Bot logged work on HIVE-23951: - Author: ASF GitHub Bot Created on: 03/Aug/20 23:27 Start Date: 03/Aug/20 23:27 Worklog Time Spent: 10m Work Description: vineetgarg02 commented on a change in pull request #1315: URL: https://github.com/apache/hive/pull/1315#discussion_r464713600 ## File path: ql/src/test/results/clientpositive/llap/prepare_plan.q.out ## @@ -0,0 +1,1575 @@ +PREHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +PREHOOK: type: QUERY +PREHOOK: Input: default@src + A masked pattern was here +POSTHOOK: query: explain extended prepare pcount from select count(*) from src where key > ? +POSTHOOK: type: QUERY +POSTHOOK: Input: default@src + A masked pattern was here +OPTIMIZED SQL: SELECT COUNT(*) AS `$f0` +FROM `default`.`src` +WHERE `key` > CAST(? AS STRING) +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 +Tez + A masked pattern was here + Edges: +Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) + A masked pattern was here + Vertices: +Map 1 +Map Operator Tree: +TableScan + alias: src + filterExpr: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean) + Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE + GatherStats: false + Filter Operator +isSamplingPred: false +predicate: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean) +Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE +Select Operator + Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE + Group By Operator +aggregations: count() +minReductionHashAggr: 0.99 +mode: hash +outputColumnNames: _col0 +Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE +Reduce Output 
Operator + bucketingVersion: 2 + null sort order: + numBuckets: -1 + sort order: + Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE + tag: -1 + value expressions: _col0 (type: bigint) + auto parallelism: false +Execution mode: llap +LLAP IO: no inputs +Path -> Alias: + A masked pattern was here +Path -> Partition: + A masked pattern was here +Partition + base file name: src + input format: org.apache.hadoop.mapred.TextInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat + properties: +bucket_count -1 +bucketing_version 2 +column.name.delimiter , +columns key,value +columns.types string:string + A masked pattern was here +name default.src +serialization.format 1 +serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + +input format: org.apache.hadoop.mapred.TextInputFormat +output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat +properties: + bucketing_version 2 + column.name.delimiter , + columns key,value + columns.comments 'default','default' + columns.types string:string + A masked pattern was here + name default.src + serialization.format 1 + serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +name: default.src + name: default.src +Truncated Path -> Alias: + /src [src] +Reducer 2 +Execution mode: llap +Needs Tagging: false +Reduce
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465952 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 23:30
Start Date: 03/Aug/20 23:30
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464714632

## File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ##
@@ -1619,6 +1620,9 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis
       colName = enfd.getFieldName();
       colType = enfd.getTypeString();
       countDistincts = numRows;
+    } else if (end instanceof ExprDynamicParamDesc) {
+      //skip colecting stats for parameters

Review comment:
This method tries to derive the column statistics for the given expression. I guess parent callers use the stats for various estimates, such as map join and aggregate min/max. For a dynamic parameter expression, the stats are returned as null. I think it makes more sense to do what `buildColStatForConstant` is doing and return an estimate instead of null. I will update the code.

Issue Time Tracking
Worklog Id: (was: 465952)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465948 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 23:16
Start Date: 03/Aug/20 23:16
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464710291

## File path: ql/src/test/results/clientpositive/llap/prepare_plan.q.out ##
@@ -0,0 +1,1575 @@
+PREHOOK: query: explain extended prepare pcount from select count(*) from src where key > ?
+PREHOOK: type: QUERY
+PREHOOK: Input: default@src
+ A masked pattern was here
+POSTHOOK: query: explain extended prepare pcount from select count(*) from src where key > ?
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@src
+ A masked pattern was here
+OPTIMIZED SQL: SELECT COUNT(*) AS `$f0`
+FROM `default`.`src`
+WHERE `key` > CAST(? AS STRING)
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+ A masked pattern was here
+      Edges:
+        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
+ A masked pattern was here
+      Vertices:
+        Map 1
+            Map Operator Tree:
+                TableScan
+                  alias: src
+                  filterExpr: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean)
+                  Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
+                  GatherStats: false
+                  Filter Operator
+                    isSamplingPred: false
+                    predicate: (key > CAST( Dynamic Parameter index: 1 AS STRING)) (type: boolean)
+                    Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE
+                    Select Operator
+                      Statistics: Num rows: 166 Data size: 14442 Basic stats: COMPLETE Column stats: COMPLETE
+                      Group By Operator
+                        aggregations: count()
+                        minReductionHashAggr: 0.99
+                        mode: hash
+                        outputColumnNames: _col0
+                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                        Reduce Output Operator
+                          bucketingVersion: 2
+                          null sort order:
+                          numBuckets: -1
+                          sort order:
+                          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+                          tag: -1
+                          value expressions: _col0 (type: bigint)
+                          auto parallelism: false
+            Execution mode: llap
+            LLAP IO: no inputs
+            Path -> Alias:
+ A masked pattern was here
+            Path -> Partition:
+ A masked pattern was here
+                Partition
+                  base file name: src
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  properties:
+                    bucket_count -1
+                    bucketing_version 2
+                    column.name.delimiter ,
+                    columns key,value
+                    columns.types string:string
+ A masked pattern was here
+                    name default.src
+                    serialization.format 1
+                    serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+                    input format: org.apache.hadoop.mapred.TextInputFormat
+                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                    properties:
+                      bucketing_version 2
+                      column.name.delimiter ,
+                      columns key,value
+                      columns.comments 'default','default'
+                      columns.types string:string
+ A masked pattern was here
+                      name default.src
+                      serialization.format 1
+                      serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+                    name: default.src
+                  name: default.src
+            Truncated Path -> Alias:
+              /src [src]
+        Reducer 2
+            Execution mode: llap
+            Needs Tagging: false
+            Reduce
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465946 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 23:13
Start Date: 03/Aug/20 23:13
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464709177

## File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ExprDynamicParamDesc.java ##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.plan;
+
+import java.io.Serializable;
+import java.util.List;
+
+import org.apache.commons.lang3.builder.HashCodeBuilder;
+import org.apache.hadoop.hive.common.StringInternUtils;
+import org.apache.hadoop.hive.serde.serdeConstants;
+import org.apache.hadoop.hive.serde2.objectinspector.ConstantObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
+import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+
+/**
+ * A constant expression.
+ */
+public class ExprDynamicParamDesc extends ExprNodeDesc implements Serializable {
+  private static final long serialVersionUID = 1L;
+  final protected transient static char[] hexArray = "0123456789ABCDEF".toCharArray();
+
+  private int index;
+  private Object value;
+
+  public ExprDynamicParamDesc() {
+  }
+
+  public ExprDynamicParamDesc(TypeInfo typeInfo, int index, Object value) {
+    super(typeInfo);
+    this.index = index;
+    this.value = value;
+  }
+
+  public Object getValue() {
+    return value;
+  }
+
+  public int getIndex() {
+    return index;
+  }
+
+
+  @Override
+  public String toString() {
+    return "Dynamic Parameter " + " index: " + index;

Review comment:
"Dynamic Parameter" makes it clear that the expression in an explain plan is a dynamic expression. Showing just the index would make the plan hard to read. What is the benefit of making it more compact?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 465946)
Time Spent: 1h 20m (was: 1h 10m)

> Support parameterized queries in WHERE/HAVING clause
>
> Key: HIVE-23951
> URL: https://issues.apache.org/jira/browse/HIVE-23951
> Project: Hive
> Issue Type: Sub-task
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira (v8.3.4#803005)
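The readability trade-off raised in the review comment above can be illustrated with a small sketch. This is not code from the pull request: the class `DynamicParamSketch` and its methods `verboseLabel`, `compactLabel` and `predicate` are hypothetical names, and the compact `$N` form is only an assumption about what a "more compact" rendering might look like. Only the "Dynamic Parameter index: N" wording is taken from the explain output quoted in this thread.

```java
// Hypothetical stand-in for an expression node that represents a `?` placeholder.
// It is not Hive's ExprDynamicParamDesc; it only mimics how such a node could print.
final class DynamicParamSketch {
    // Position of the `?` in the query text (the quoted plan shows 1 for the first one).
    private final int index;

    DynamicParamSketch(int index) {
        this.index = index;
    }

    // Verbose label, in the style seen in the explain plan quoted in this thread.
    String verboseLabel() {
        return "Dynamic Parameter index: " + index;
    }

    // Compact alternative (an assumption, not something proposed verbatim in the review).
    String compactLabel() {
        return "$" + index;
    }

    // Renders a filter predicate the way an explain plan might print it.
    static String predicate(String column, String label) {
        return "(" + column + " > CAST( " + label + " AS STRING))";
    }
}
```

With the verbose label the predicate reads `(key > CAST( Dynamic Parameter index: 1 AS STRING))`, which names the placeholder explicitly; with the compact label it reads `(key > CAST( $1 AS STRING))`, which is shorter but relies on the reader knowing the convention.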
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465938&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465938 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 22:47
Start Date: 03/Aug/20 22:47
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464701187

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/ExecuteStatementAnalyzer.java ##
@@ -0,0 +1,377 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.drop;
+
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
+import org.apache.hadoop.hive.ql.exec.ExplainTask;
+import org.apache.hadoop.hive.ql.exec.FetchTask;
+import org.apache.hadoop.hive.ql.exec.FilterOperator;
+import org.apache.hadoop.hive.ql.exec.Operator;
+import org.apache.hadoop.hive.ql.exec.OperatorUtils;
+import org.apache.hadoop.hive.ql.exec.SelectOperator;
+import org.apache.hadoop.hive.ql.exec.SerializationUtilities;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.exec.tez.TezTask;
+import org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.type.ExprNodeDescExprFactory;
+import org.apache.hadoop.hive.ql.plan.BaseWork;
+import org.apache.hadoop.hive.ql.plan.ExprDynamicParamDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * Analyzer for Execute statement.
+ * This analyzer
+ *  retrieves the cached {@link BaseSemanticAnalyzer},
+ *  makes a copy of all tasks by serializing/deserializing them,
+ *  binds dynamic parameters inside the cached {@link BaseSemanticAnalyzer} using the values provided
+ */
+@DDLType(types = HiveParser.TOK_EXECUTE)
+public class ExecuteStatementAnalyzer extends BaseSemanticAnalyzer {
+
+  public ExecuteStatementAnalyzer(QueryState queryState) throws SemanticException {
+    super(queryState);
+  }
+
+  /**
+   * This class encapsulates all {@link Task} required to be copied.
+   * This is required because {@link FetchTask} and the list of {@link Task} may hold references to the same
+   * objects (e.g. list of result files) and are required to be serialized/de-serialized together.
+   */
+  private class PlanCopy {
+    FetchTask fetchTask;
+    List<Task<?>> tasks;
+
+    PlanCopy(FetchTask fetchTask, List<Task<?>> tasks) {
+      this.fetchTask = fetchTask;
+      this.tasks = tasks;
+    }
+
+    FetchTask getFetchTask() {
+      return fetchTask;
+    }
+
+    List<Task<?>> getTasks() {
+      return tasks;
+    }
+  }
+
+  private String getQueryName(ASTNode root) {
+    ASTNode queryNameAST = (ASTNode)(root.getChild(1));
+    return queryNameAST.getText();
+  }
+
+  /**
+   * Utility method to create a copy of the provided object using Kryo serialization/de-serialization.
+   */
+  private <T> T makeCopy(final Object task, Class<T> objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    SerializationUtilities.serializePlan(task, baos);
+
+    return SerializationUtilities.deserializePlan(
+        new ByteArrayInputStream(baos.toByteArray()), objClass);
+
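The copy-by-serialization step described in the Javadoc of the quoted diff can be sketched in isolation. The sketch below is only an illustration under stated assumptions: it uses plain `java.io` object serialization instead of the Kryo-based `SerializationUtilities` that the patch calls, and `PlanCopySketch` and `Bundle` are hypothetical names standing in for the patch's `PlanCopy` holder. It shows why the two parts of a plan are bundled before copying: objects serialized together in one stream keep their shared references after the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;

final class PlanCopySketch {

    // Hypothetical holder: both fields may point at the same list object,
    // much like a fetch task and a task list sharing a list of result files.
    static final class Bundle implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> fetchFiles;
        final List<String> taskFiles;

        Bundle(List<String> fetchFiles, List<String> taskFiles) {
            this.fetchFiles = fetchFiles;
            this.taskFiles = taskFiles;
        }
    }

    // Deep-copies an object graph by writing it to a byte array and reading it back.
    // java.io serialization is used here as a stand-in for Kryo.
    static <T> T makeCopy(Object source, Class<T> type) {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
                out.writeObject(source);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
                return type.cast(in.readObject());
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }
}
```

Copying a `Bundle` whose two fields reference one list yields a new `Bundle` whose two fields again reference a single (new) list. Copying the two fields in separate calls would instead produce two unrelated lists, which is the situation the `PlanCopy` Javadoc says it wants to avoid.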
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465931 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 03/Aug/20 22:38
Start Date: 03/Aug/20 22:38
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on a change in pull request #1315:
URL: https://github.com/apache/hive/pull/1315#discussion_r464698359

## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/drop/ExecuteStatementAnalyzer.java ##
@@ -0,0 +1,377 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.ddl.table.drop;
+
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.ddl.DDLSemanticAnalyzerFactory.DDLType;
+import org.apache.hadoop.hive.ql.exec.ExplainTask;
+import org.apache.hadoop.hive.ql.exec.FetchTask;
+import org.apache.hadoop.hive.ql.exec.FilterOperator;
+import org.apache.hadoop.hive.ql.exec.Operator;
+import org.apache.hadoop.hive.ql.exec.OperatorUtils;
+import org.apache.hadoop.hive.ql.exec.SelectOperator;
+import org.apache.hadoop.hive.ql.exec.SerializationUtilities;
+import org.apache.hadoop.hive.ql.exec.Task;
+import org.apache.hadoop.hive.ql.exec.Utilities;
+import org.apache.hadoop.hive.ql.exec.tez.TezTask;
+import org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator;
+import org.apache.hadoop.hive.ql.parse.ASTNode;
+import org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer;
+import org.apache.hadoop.hive.ql.parse.HiveParser;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.type.ExprNodeDescExprFactory;
+import org.apache.hadoop.hive.ql.plan.BaseWork;
+import org.apache.hadoop.hive.ql.plan.ExprDynamicParamDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc;
+import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.hadoop.hive.serde2.typeinfo.CharTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * Analyzer for Execute statement.
+ * This analyzer
+ *  retrieves the cached {@link BaseSemanticAnalyzer},
+ *  makes a copy of all tasks by serializing/deserializing them,
+ *  binds dynamic parameters inside the cached {@link BaseSemanticAnalyzer} using the values provided
+ */
+@DDLType(types = HiveParser.TOK_EXECUTE)
+public class ExecuteStatementAnalyzer extends BaseSemanticAnalyzer {
+
+  public ExecuteStatementAnalyzer(QueryState queryState) throws SemanticException {
+    super(queryState);
+  }
+
+  /**
+   * This class encapsulates all {@link Task} required to be copied.
+   * This is required because {@link FetchTask} and the list of {@link Task} may hold references to the same
+   * objects (e.g. list of result files) and are required to be serialized/de-serialized together.
+   */
+  private class PlanCopy {
+    FetchTask fetchTask;
+    List<Task<?>> tasks;
+
+    PlanCopy(FetchTask fetchTask, List<Task<?>> tasks) {
+      this.fetchTask = fetchTask;
+      this.tasks = tasks;
+    }
+
+    FetchTask getFetchTask() {
+      return fetchTask;
+    }
+
+    List<Task<?>> getTasks() {
+      return tasks;
+    }
+  }
+
+  private String getQueryName(ASTNode root) {
+    ASTNode queryNameAST = (ASTNode)(root.getChild(1));
+    return queryNameAST.getText();
+  }
+
+  /**
+   * Utility method to create a copy of the provided object using Kryo serialization/de-serialization.
+   */
+  private <T> T makeCopy(final Object task, Class<T> objClass) {
+    ByteArrayOutputStream baos = new ByteArrayOutputStream();
+    SerializationUtilities.serializePlan(task, baos);
+
+    return SerializationUtilities.deserializePlan(
+        new ByteArrayInputStream(baos.toByteArray()), objClass);
+
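The last step named in the analyzer's Javadoc, binding dynamic parameters using the values provided, is not visible in the truncated diff above. The following is a rough sketch of what such a binding pass could look like on a toy expression tree. Everything in it is hypothetical: `BindSketch`, `Column`, `Param`, `Const`, `GreaterThan` and `bind` are illustration-only names, not Hive classes, and the 1-based indexing is an assumption based on the "index: 1" shown for the first placeholder in the explain output quoted in this thread.

```java
import java.util.List;

final class BindSketch {

    interface Expr { }

    // A column reference such as `key`.
    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // A `?` placeholder; index is assumed to be 1-based.
    static final class Param implements Expr {
        final int index;
        Param(int index) { this.index = index; }
        @Override public String toString() { return "Dynamic Parameter index: " + index; }
    }

    // A bound constant value.
    static final class Const implements Expr {
        final Object value;
        Const(Object value) { this.value = value; }
        @Override public String toString() { return "'" + value + "'"; }
    }

    // A binary comparison, enough to model `key > ?`.
    static final class GreaterThan implements Expr {
        final Expr left;
        final Expr right;
        GreaterThan(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + " > " + right + ")"; }
    }

    // Returns a new tree in which every Param is replaced by the matching constant.
    static Expr bind(Expr expr, List<?> values) {
        if (expr instanceof Param) {
            int index = ((Param) expr).index;
            if (index < 1 || index > values.size()) {
                throw new IllegalArgumentException("no value for parameter " + index);
            }
            return new Const(values.get(index - 1));
        }
        if (expr instanceof GreaterThan) {
            GreaterThan gt = (GreaterThan) expr;
            return new GreaterThan(bind(gt.left, values), bind(gt.right, values));
        }
        return expr; // columns and constants are left unchanged
    }
}
```

Binding the tree for `key > ?` with the single value `100` produces a tree that prints as `(key > '100')`, while a missing value raises an `IllegalArgumentException` rather than silently leaving the placeholder in the plan.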
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465481 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 02/Aug/20 23:52
Start Date: 02/Aug/20 23:52
Worklog Time Spent: 10m
Work Description: jcamachor commented on pull request #1315:
URL: https://github.com/apache/hive/pull/1315#issuecomment-667741338

{quote}
@jcamachor @zabetak I was thinking of changing all the existing tpcds queries to replace literals with ? and running explain on them (using TestTezPerfCliDriver). This should provide us with some good test coverage. What I am undecided on is whether it is worth pushing this to the hive repo and having a separate clidriver testing execute/prepare for tpcds. What do you suggest?
{quote}

Why do we need a new driver though? We should probably add it to the existing one.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 465481)
Time Spent: 0.5h (was: 20m)

> Support parameterized queries in WHERE/HAVING clause
>
> Key: HIVE-23951
> URL: https://issues.apache.org/jira/browse/HIVE-23951
> Project: Hive
> Issue Type: Sub-task
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=465482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465482 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 02/Aug/20 23:52
Start Date: 02/Aug/20 23:52
Worklog Time Spent: 10m
Work Description: jcamachor edited a comment on pull request #1315:
URL: https://github.com/apache/hive/pull/1315#issuecomment-667741338

> @jcamachor @zabetak I was thinking of changing all the existing tpcds queries to replace literals with ? and running explain on them (using TestTezPerfCliDriver). This should provide us with some good test coverage. What I am undecided on is whether it is worth pushing this to the hive repo and having a separate clidriver testing execute/prepare for tpcds. What do you suggest?

Why do we need a new driver though? We should probably add it to the existing one.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 465482)
Time Spent: 40m (was: 0.5h)

> Support parameterized queries in WHERE/HAVING clause
>
> Key: HIVE-23951
> URL: https://issues.apache.org/jira/browse/HIVE-23951
> Project: Hive
> Issue Type: Sub-task
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=464521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-464521 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 30/Jul/20 12:28
Start Date: 30/Jul/20 12:28
Worklog Time Spent: 10m
Work Description: zabetak commented on pull request #1315:
URL: https://github.com/apache/hive/pull/1315#issuecomment-666334140

> @jcamachor @zabetak I was thinking of changing all the existing tpcds queries to replace literals with `?` and running explain on them (using TestTezPerfCliDriver). This should provide us with some good test coverage. What I am undecided on is whether it is worth pushing this to the hive repo and having a separate clidriver testing execute/prepare for tpcds. What do you suggest?

If we want to avoid regressions, I guess having the queries in the repo makes sense. Apart from that, I guess that not all of the tpcds queries are needed to ensure code coverage. Possibly after testing you will end up with a subset that is sufficient.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 464521)
Time Spent: 20m (was: 10m)

> Support parameterized queries in WHERE/HAVING clause
>
> Key: HIVE-23951
> URL: https://issues.apache.org/jira/browse/HIVE-23951
> Project: Hive
> Issue Type: Sub-task
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23951) Support parameterized queries in WHERE/HAVING clause
[ https://issues.apache.org/jira/browse/HIVE-23951?focusedWorklogId=464393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-464393 ]

ASF GitHub Bot logged work on HIVE-23951:

Author: ASF GitHub Bot
Created on: 30/Jul/20 07:36
Start Date: 30/Jul/20 07:36
Worklog Time Spent: 10m
Work Description: vineetgarg02 commented on pull request #1315:
URL: https://github.com/apache/hive/pull/1315#issuecomment-665947086

@jcamachor @zabetak I was thinking of changing all the existing tpcds queries to replace literals with `?` and running explain on them (using TestTezPerfCliDriver). This should provide us with some good test coverage. What I am undecided on is whether it is worth pushing this to the hive repo and having a separate clidriver testing execute/prepare for tpcds. What do you suggest?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 464393)
Remaining Estimate: 0h
Time Spent: 10m

> Support parameterized queries in WHERE/HAVING clause
>
> Key: HIVE-23951
> URL: https://issues.apache.org/jira/browse/HIVE-23951
> Project: Hive
> Issue Type: Sub-task
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h

--
This message was sent by Atlassian Jira (v8.3.4#803005)