[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207312#comment-15207312 ] Vikram Dixit K commented on HIVE-13286: --- Committed to both master and branch-2.0. Thanks [~aihuaxu]! > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch, > HIVE-13286.3.patch, HIVE-13286.4.patch > > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207015#comment-15207015 ] Aihua Xu commented on HIVE-13286: - [~vikram.dixit] Those tests are not related. Sorry. Forgot to mention that.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206870#comment-15206870 ] Vikram Dixit K commented on HIVE-13286: --- [~aihuaxu] Are the test failures related? Otherwise let me know and I can commit the patch to master and branch-2. I will raise a follow-on jira for disallowing the user to set this configuration.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205308#comment-15205308 ] Vikram Dixit K commented on HIVE-13286: --- I tested the latest patch. It works as expected. +1
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205076#comment-15205076 ] Aihua Xu commented on HIVE-13286: - [~vikram.dixit] How is the new patch? Can you take a look?
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200688#comment-15200688 ] Aihua Xu commented on HIVE-13286: - Let me take a look at CLI cases.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199635#comment-15199635 ] Hive QA commented on HIVE-13286:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793636/HIVE-13286.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9832 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7292/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7292/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7292/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793636 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200729#comment-15200729 ] Vikram Dixit K commented on HIVE-13286: --- Yeah. In one of the cli tests, I just added this: set hive.query.id = abc; And added a log in TezTask class. I saw that we were retaining the id in this case.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197821#comment-15197821 ] Vikram Dixit K commented on HIVE-13286: --- It looks good to me. I ran a local test for the same. +1 pending tests.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200337#comment-15200337 ] Vikram Dixit K commented on HIVE-13286: --- Actually this is what you need:
{code}
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
index 7327a42..fc10242 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
@@ -402,14 +402,9 @@ public int compile(String command, boolean resetTaskIds) {
         TaskFactory.resetId();
       }
       saveSession(queryState);
+      String queryId = QueryPlan.makeQueryId();
 
-      // Generate new query id if it's not set for CLI case. If it's session based,
-      // query id is passed in from the client or initialized when the session starts.
-      String queryId = conf.getVar(HiveConf.ConfVars.HIVEQUERYID);
-      if (queryId == null || queryId.isEmpty()) {
-        queryId = QueryPlan.makeQueryId();
-        conf.setVar(HiveConf.ConfVars.HIVEQUERYID, queryId);
-      }
+      conf.setVar(HiveConf.ConfVars.HIVEQUERYID, queryId);
 
       //save some info for webUI for use after plan is freed
       this.queryDisplay.setQueryStr(queryStr);
{code}
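The effect of the diff above can be illustrated outside Hive with a minimal sketch. It assumes a plain `Map` standing in for `HiveConf`; `makeQueryId()` here is a hypothetical counter-based stand-in for `QueryPlan.makeQueryId()`, and `FreshQueryIdDemo` is an illustrative name, not a Hive class. The point is only that the id is generated unconditionally and overwrites whatever was in the configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative model only: a Map stands in for HiveConf.
final class FreshQueryIdDemo {
    static final String QUERY_ID_KEY = "hive.query.id";
    private static final AtomicLong COUNTER = new AtomicLong();

    // Hypothetical stand-in for QueryPlan.makeQueryId().
    static String makeQueryId() {
        return "query_" + COUNTER.incrementAndGet();
    }

    // Same shape as the proposed change: always generate, always overwrite.
    static String compile(Map<String, String> conf) {
        String queryId = makeQueryId();
        conf.put(QUERY_ID_KEY, queryId);
        return queryId;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(QUERY_ID_KEY, "abc"); // like "set hive.query.id = abc;"
        String first = compile(conf);
        String second = compile(conf);
        System.out.println("ids differ: " + !first.equals(second));
        System.out.println("user value kept: " + "abc".equals(conf.get(QUERY_ID_KEY)));
    }
}
```

With this shape a value set by the user before the query is simply replaced, so two consecutive queries cannot share an id.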
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200051#comment-15200051 ] Aihua Xu commented on HIVE-13286: - Attached patch-2: fix the unit test.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15200317#comment-15200317 ] Vikram Dixit K commented on HIVE-13286: --- [~aihuaxu] I think the bug still exists here. I see that once I set a query id, it never changes. I think you need the following change in the Driver class as well:
{code}
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
index 7327a42..1fac526 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
@@ -403,13 +403,7 @@ public int compile(String command, boolean resetTaskIds) {
       }
       saveSession(queryState);
 
-      // Generate new query id if it's not set for CLI case. If it's session based,
-      // query id is passed in from the client or initialized when the session starts.
-      String queryId = conf.getVar(HiveConf.ConfVars.HIVEQUERYID);
-      if (queryId == null || queryId.isEmpty()) {
-        queryId = QueryPlan.makeQueryId();
-        conf.setVar(HiveConf.ConfVars.HIVEQUERYID, queryId);
-      }
+      conf.setVar(HiveConf.ConfVars.HIVEQUERYID, QueryPlan.makeQueryId());
 
       //save some info for webUI for use after plan is freed
       this.queryDisplay.setQueryStr(queryStr);
{code}
I ran a test with a lot more queries than earlier and it turned out that the query id did not change.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196271#comment-15196271 ] Aihua Xu commented on HIVE-13286: - Attached the patch-1: disallow the input of the queryId. queryId will be regenerated and put in confOverlay for each query. [~vikram.dixit] Can you take a look?
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196105#comment-15196105 ] Vikram Dixit K commented on HIVE-13286: --- Great! Thanks!
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196100#comment-15196100 ] Vikram Dixit K commented on HIVE-13286: --- Yeah. The same queryId causes issues. We should disallow a change from the client. The HIVE_LOG_TRACE_ID is already present in the hive configuration. You could use that.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196097#comment-15196097 ] Aihua Xu commented on HIVE-13286: - I see. I will disallow the input of queryId and generate a new one every time then.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196054#comment-15196054 ] Aihua Xu commented on HIVE-13286: - I moved the queryId initialization earlier so that the workflow has a unique queryId from the very beginning of the execution. Actually I think your statement makes sense. What I need is really a traceId. Does the same queryId cause the issues? If it does, I can change to disallow the change from the client.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196044#comment-15196044 ] Vikram Dixit K commented on HIVE-13286: --- [~aihuaxu] Consider the following scenario: in Tez/Spark, suppose we end up caching the small table based on the hive query id. If the user sets the hive query id for one query and does not reset it for the subsequent query, we will end up picking the previously cached hash table for the join, resulting in incorrect results, right? Creating a new conf object would only work if we reset the query id after the query completes. If we allow it to exist in the configuration object after a query has completed running, it will result in incorrect results or some weird behavior. Consider an hs2 or cli session: if a user in a session assigns a query id and doesn't reset it, it can result in incorrect results. Are you expecting a user to set a query id each time after setting it once? I don't think that is great behavior.
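The stale-cache scenario described above can be sketched in isolation. This is a toy model under stated assumptions, not the Tez/Spark code: `StaleCacheDemo` and `hashTableFor` are hypothetical names, and the cache is assumed to be keyed by nothing but the query id, as the comment supposes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Toy model: a cache of join hash tables keyed only by query id (assumption).
final class StaleCacheDemo {
    private final Map<String, Map<String, String>> cacheByQueryId = new HashMap<>();

    // Returns the cached hash table for this query id; builds it only on a cache miss.
    Map<String, String> hashTableFor(String queryId, Map<String, String> smallTable) {
        return cacheByQueryId.computeIfAbsent(queryId, id -> new HashMap<>(smallTable));
    }

    public static void main(String[] args) {
        StaleCacheDemo cache = new StaleCacheDemo();
        Map<String, String> tableA = Collections.singletonMap("k", "from-table-a");
        Map<String, String> tableB = Collections.singletonMap("k", "from-table-b");

        // Query 1 runs with the user-set id "abc" and joins against table A.
        String first = cache.hashTableFor("abc", tableA).get("k");
        // Query 2 joins against table B, but the id was never reset.
        String second = cache.hashTableFor("abc", tableB).get("k");

        System.out.println("query 1 sees: " + first);
        System.out.println("query 2 sees: " + second);
    }
}
```

In this model the second query reads the hash table built for the first one, because the lookup key did not change even though the small table did.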
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195987#comment-15195987 ] Aihua Xu commented on HIVE-13286: - Actually what I need is a unique queryId. Think of the scenario where Hive is just one of the components in a pipeline. The client could have a queryId (e.g., to trace the generation of the query) and then call Hive. That queryId can then link them together, which is better for diagnosis. If we create other ids, that seems to defeat the purpose. If the user doesn't provide a queryId, then Hive will take care of that. Is the following the actual issue you see?
{noformat}
I think there is an issue there. confOverlay is passed from the client. Seems we need to make a copy of that otherwise if the client reuses the same confOverlay, then queryId is reused. Is that the issue? I will correct that.
{noformat}
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195836#comment-15195836 ] Vikram Dixit K commented on HIVE-13286: --- The issue here is that if we make a change in the incoming configuration, it remains set for the duration of the session. We need to make sure that the user does not set the query id configuration because there is a chance of them breaking what a query id is - a unique id for each query. I think what you really want is something like the HIVE_LOG_TRACE_ID which could be renamed to something like a HIVE_USER_TRACE_ID - an id that a user can set and trace which can stay constant until the user decides to change it. You could create a separate configuration too for the use case you have. I think messing around with the query id looks like a recipe for bugs. I think we should move the query id to a config that the user cannot change and just put it in the utilities class for e.g. like INPUT_NAME that mapreduce used.
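The separation suggested above, a user-settable trace id next to a system-owned query id, might look roughly like the following sketch. Everything here is hypothetical: `TraceIdDemo`, `userSet`, `startQuery`, and the key `example.user.trace.id` are illustrative names, not Hive APIs or real configuration keys; only `hive.query.id` appears elsewhere in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: session settings the user may change vs. an id the system owns.
final class TraceIdDemo {
    static final String QUERY_ID_KEY = "hive.query.id";
    static final String TRACE_ID_KEY = "example.user.trace.id"; // hypothetical key

    private final Map<String, String> sessionConf = new HashMap<>();
    private final AtomicLong counter = new AtomicLong();

    // User-facing setter: the query id is rejected, everything else is stored.
    void userSet(String key, String value) {
        if (QUERY_ID_KEY.equals(key)) {
            throw new IllegalArgumentException(QUERY_ID_KEY + " cannot be set by the user");
        }
        sessionConf.put(key, value);
    }

    // Per-query conf: inherits session settings, then gets a fresh query id.
    Map<String, String> startQuery() {
        Map<String, String> queryConf = new HashMap<>(sessionConf);
        queryConf.put(QUERY_ID_KEY, "query_" + counter.incrementAndGet());
        return queryConf;
    }

    public static void main(String[] args) {
        TraceIdDemo session = new TraceIdDemo();
        session.userSet(TRACE_ID_KEY, "pipeline-42");
        Map<String, String> q1 = session.startQuery();
        Map<String, String> q2 = session.startQuery();
        System.out.println(q1.get(TRACE_ID_KEY) + " " + q1.get(QUERY_ID_KEY));
        System.out.println(q2.get(TRACE_ID_KEY) + " " + q2.get(QUERY_ID_KEY));
    }
}
```

The trace id stays constant until the user changes it, which serves the pipeline-correlation use case, while the query id changes on every query and never lives in the session-level settings.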
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195815#comment-15195815 ] Aihua Xu commented on HIVE-13286: - OK. I think there is an issue there. confOverlay is passed from the client. Seems we need to make a copy of that otherwise if the client reuses the same confOverlay, then queryId is reused. Is that the issue? I will correct that.
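The defensive copy mentioned above can be sketched as follows. This is a minimal model, assuming the overlay is a `Map<String, String>` handed in by the client; `ConfOverlayCopyDemo` and `queryConfFor` are hypothetical names and the counter-based id is a stand-in for Hive's own generator.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: write the generated id into a copy, never into the client's map.
final class ConfOverlayCopyDemo {
    static final String QUERY_ID_KEY = "hive.query.id";
    private static final AtomicLong COUNTER = new AtomicLong();

    static Map<String, String> queryConfFor(Map<String, String> clientOverlay) {
        Map<String, String> queryConf = new HashMap<>();
        if (clientOverlay != null) {
            queryConf.putAll(clientOverlay);
        }
        String queryId = queryConf.get(QUERY_ID_KEY);
        if (queryId == null || queryId.isEmpty()) {
            // The generated id lands in the copy, so the client's overlay stays clean.
            queryConf.put(QUERY_ID_KEY, "query_" + COUNTER.incrementAndGet());
        }
        return queryConf;
    }

    public static void main(String[] args) {
        Map<String, String> overlay = new HashMap<>(); // reused by the client
        Map<String, String> q1 = queryConfFor(overlay);
        Map<String, String> q2 = queryConfFor(overlay);
        System.out.println("overlay untouched: " + !overlay.containsKey(QUERY_ID_KEY));
        System.out.println("ids differ: " + !q1.get(QUERY_ID_KEY).equals(q2.get(QUERY_ID_KEY)));
    }
}
```

The copy only helps when the id was generated by the server. If the client itself puts an id into the overlay and reuses it, or the id is set at session level, the same value is still carried into every query.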
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195794#comment-15195794 ] Aihua Xu commented on HIVE-13286: - QueryId should be unique. The user-overridden queryId lets the user provide a meaningful queryId if they want to, and the user then needs to make sure it's unique. If the user doesn't override the queryId, then hive will generate one as before.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195761#comment-15195761 ] Siddharth Seth commented on HIVE-13286: --- [~aihuaxu] - I'm curious as to why we allow the queryId to be overwritten by users. Isn't that meant to be unique within HiveServer? If some historic query information were to be retained by HiveServer - that would break. The query name can already be overwritten.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195298#comment-15195298 ] Aihua Xu commented on HIVE-13286: - OK. We had a followup to fix HIVE-12456 to avoid storing queryId in SessionState since multiple queries can run in the same session at the same time. Later we will combine session conf and confOverlay conf to the query conf so the query should have the new queryId. [~vikram.dixit] Did you have the patch-12456 applied?
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195278#comment-15195278 ] Aihua Xu commented on HIVE-13286: - [~vikram.dixit] This is to check if we provide queryId from the client. If a client provides a queryId, then we will use that queryId internally, otherwise, we will make a new one since the client could need a meaningful queryId. Seems there is a bug in here. We should make a new queryId inside if statement and set it outside the if statement.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194593#comment-15194593 ] Aihua Xu commented on HIVE-13286: - I will take a look. That seems to be an issue and not my intention.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194509#comment-15194509 ] Vikram Dixit K commented on HIVE-13286: --- I think it primarily comes down to this: the hive conf object once modified with a generated query id, never resets the query id for a subsequent query.
{code}
+String queryId = confOverlay.get(HiveConf.ConfVars.HIVEQUERYID.varname);
+if (queryId == null || queryId.isEmpty()) {
+  queryId = QueryPlan.makeQueryId();
+  confOverlay.put(HiveConf.ConfVars.HIVEQUERYID.varname, queryId);
+  sessionState.getConf().setVar(HiveConf.ConfVars.HIVEQUERYID, queryId);
+}
{code}
Once the query id has been set by a previous query, it never changes. This is incorrect behavior. I am not sure about what the change was trying to do but this needs to get fixed. Thanks!
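The behavior described above can be reproduced without Hive. The sketch below assumes a plain `Map` in place of the session-level `HiveConf`; `StickyQueryIdDemo`, `startQuery`, and the UUID-based `makeQueryId()` are hypothetical stand-ins, not the actual Hive code. It mirrors only the "generate when unset, then store in long-lived state" pattern from the quoted snippet.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative model only: a Map stands in for the session-level HiveConf.
final class StickyQueryIdDemo {
    static final String QUERY_ID_KEY = "hive.query.id";

    // Hypothetical stand-in for QueryPlan.makeQueryId().
    static String makeQueryId() {
        return "query_" + UUID.randomUUID();
    }

    // Generates an id only when none is present and writes it back
    // into the long-lived session conf, as in the quoted snippet.
    static String startQuery(Map<String, String> sessionConf) {
        String queryId = sessionConf.get(QUERY_ID_KEY);
        if (queryId == null || queryId.isEmpty()) {
            queryId = makeQueryId();
            sessionConf.put(QUERY_ID_KEY, queryId);
        }
        return queryId;
    }

    public static void main(String[] args) {
        Map<String, String> sessionConf = new HashMap<>();
        String first = startQuery(sessionConf);
        String second = startQuery(sessionConf);
        System.out.println("same id reused: " + first.equals(second));
    }
}
```

Because the first call stores its id in state that outlives the query, the emptiness check never succeeds again and every later query in the session inherits the first id.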