[jira] [Resolved] (HIVE-24091) Replace multiple constraints call with getAllTableConstraints api call in query planner
[ https://issues.apache.org/jira/browse/HIVE-24091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan resolved HIVE-24091. - Resolution: Duplicate > Replace multiple constraints call with getAllTableConstraints api call in > query planner > --- > > Key: HIVE-24091 > URL: https://issues.apache.org/jira/browse/HIVE-24091 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In order to get all the constraints of a table (PrimaryKey, ForeignKey, > UniqueConstraint, NotNullConstraint, DefaultConstraint, CheckConstraint), we > have to make 6 different metastore calls. Replace these calls with a single > getAllTableConstraints API call, which provides all the constraints at once. -- This message was sent by Atlassian Jira (v8.3.4#803005)
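The consolidation HIVE-24091 describes can be sketched as below. This is a hypothetical, self-contained model, not the real HiveMetaStoreClient API; the class, interface, and method shapes are illustrative stand-ins only:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: one aggregate RPC replaces six per-constraint-type metastore calls.
public class ConstraintFetchSketch {

    /** Stand-in for the aggregate result of the single call. */
    static class AllTableConstraints {
        final Map<String, List<String>> byType = new HashMap<>();
    }

    /** Hypothetical client interface: one call instead of six. */
    interface MetastoreClient {
        AllTableConstraints getAllTableConstraints(String db, String table);
    }

    // One round trip replaces the six per-type calls (getPrimaryKeys,
    // getForeignKeys, getUniqueConstraints, getNotNullConstraints,
    // getDefaultConstraints, getCheckConstraints); the planner then reads
    // each constraint type out of the single result locally.
    static List<String> primaryKeyColumns(MetastoreClient client, String db, String table) {
        AllTableConstraints all = client.getAllTableConstraints(db, table);
        return all.byType.getOrDefault("PrimaryKey", List.of());
    }

    /** Tiny in-memory demo wiring for the sketch. */
    static List<String> demo() {
        MetastoreClient stub = (db, table) -> {
            AllTableConstraints c = new AllTableConstraints();
            c.byType.put("PrimaryKey", List.of("id"));
            return c;
        };
        return primaryKeyColumns(stub, "default", "t");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [id]
    }
}
```

The point is that the network cost drops from six round trips per table to one, while the planner's per-type lookups become local map reads.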
[jira] [Work logged] (HIVE-24075) Optimise KeyValuesInputMerger
[ https://issues.apache.org/jira/browse/HIVE-24075?focusedWorklogId=478262&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478262 ] ASF GitHub Bot logged work on HIVE-24075: - Author: ASF GitHub Bot Created on: 03/Sep/20 03:35 Start Date: 03/Sep/20 03:35 Worklog Time Spent: 10m Work Description: rbalamohan opened a new pull request #1463: URL: https://github.com/apache/hive/pull/1463 https://issues.apache.org/jira/browse/HIVE-24075 When the reader comparisons in the queue are the same, we could reuse "nextKVReaders" in the next iteration instead of doing the comparison all over again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478262) Remaining Estimate: 0h Time Spent: 10m > Optimise KeyValuesInputMerger > - > > Key: HIVE-24075 > URL: https://issues.apache.org/jira/browse/HIVE-24075 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Comparisons in KeyValuesInputMerger can be reduced. > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L165|https://github.infra.cloudera.com/CDH/hive/blob/cdpd-master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L165] > [https://github.infra.cloudera.com/CDH/hive/blob/cdpd-master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L150|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L150] > If the reader comparisons in the queue are the same, we could reuse > "{{nextKVReaders}}" in the next iteration instead of doing the > comparison all over again.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L178] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
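The reuse idea can be sketched with a small self-contained merger. This is an illustration of the technique only, not Hive's actual KeyValuesInputMerger: once the readers tied for the smallest key have been grouped, the group is drained directly while their next keys still match, instead of pushing every reader back into the priority queue and re-comparing on each call:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class MergerSketch {

    static class Reader {
        final Iterator<Integer> it;
        Integer head;
        Reader(List<Integer> keys) { it = keys.iterator(); advance(); }
        void advance() { head = it.hasNext() ? it.next() : null; }
    }

    static List<Integer> merge(List<List<Integer>> inputs) {
        PriorityQueue<Reader> pq =
            new PriorityQueue<>(Comparator.comparingInt((Reader r) -> r.head));
        for (List<Integer> in : inputs) {
            Reader r = new Reader(in);
            if (r.head != null) pq.add(r);
        }
        List<Integer> out = new ArrayList<>();
        List<Reader> group = new ArrayList<>(); // plays the role of nextKVReaders
        while (!pq.isEmpty()) {
            // One comparison pass: collect every reader tied for the minimum.
            group.clear();
            group.add(pq.poll());
            while (!pq.isEmpty() && pq.peek().head.equals(group.get(0).head)) {
                group.add(pq.poll());
            }
            boolean reuse = true;
            while (reuse) {
                for (Reader r : group) out.add(r.head);
                for (Reader r : group) r.advance();
                group.removeIf(r -> r.head == null);
                // Reuse the cached group only while its readers still agree on
                // the next key and no queued reader holds a smaller one.
                reuse = !group.isEmpty();
                if (reuse) {
                    Integer k = group.get(0).head;
                    for (Reader r : group) if (!r.head.equals(k)) reuse = false;
                    if (reuse && !pq.isEmpty() && pq.peek().head < k) reuse = false;
                }
            }
            pq.addAll(group); // group diverged: fall back to the full comparison
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(List.of(1, 1, 2), List.of(1, 3))));
        // prints [1, 1, 1, 2, 3]
    }
}
```

In the example, the single surviving reader advances from key 1 to key 2 without a round trip through the queue, because 2 is still no larger than the queued reader's key 3.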
[jira] [Updated] (HIVE-24075) Optimise KeyValuesInputMerger
[ https://issues.apache.org/jira/browse/HIVE-24075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24075: -- Labels: pull-request-available (was: ) > Optimise KeyValuesInputMerger > - > > Key: HIVE-24075 > URL: https://issues.apache.org/jira/browse/HIVE-24075 > Project: Hive > Issue Type: Improvement >Reporter: Rajesh Balamohan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Comparisons in KeyValuesInputMerger can be reduced. > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L165|https://github.infra.cloudera.com/CDH/hive/blob/cdpd-master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L165] > [https://github.infra.cloudera.com/CDH/hive/blob/cdpd-master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L150|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L150] > If the reader comparisons in the queue are the same, we could reuse > "{{nextKVReaders}}" in the next iteration instead of doing the > comparison all over again. > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValuesInputMerger.java#L178] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24044) Implement listPartitionNames with filter or order on temporary tables
[ https://issues.apache.org/jira/browse/HIVE-24044?focusedWorklogId=478257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478257 ] ASF GitHub Bot logged work on HIVE-24044: - Author: ASF GitHub Bot Created on: 03/Sep/20 03:22 Start Date: 03/Sep/20 03:22 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on pull request #1408: URL: https://github.com/apache/hive/pull/1408#issuecomment-686226872 @laszlopinter86 @pvary could you please take a look? thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478257) Time Spent: 20m (was: 10m) > Implement listPartitionNames with filter or order on temporary tables > -- > > Key: HIVE-24044 > URL: https://issues.apache.org/jira/browse/HIVE-24044 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 4.0.0 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Temporary tables can have their own partitions, and IMetaStoreClient uses > {code:java} > List<String> listPartitionNames(PartitionsByExprRequest request){code} > to filter or sort the results. This method can be implemented on temporary > tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
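For temporary tables the partition metadata already lives in the client's session memory, so the filter and ordering carried by the request can be applied directly to the cached names. The sketch below is hypothetical: the predicate/flag parameters stand in for the real PartitionsByExprRequest fields, and are not the actual metastore types:

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class TempTablePartitionNames {

    static List<String> listPartitionNames(List<String> cachedNames,
                                           Predicate<String> filter,  // stands in for the partition expression
                                           boolean ascending,         // stands in for the requested order
                                           int maxParts) {            // -1 means "no limit", as in HMS APIs
        Comparator<String> order =
            ascending ? Comparator.naturalOrder() : Comparator.reverseOrder();
        return cachedNames.stream()
            .filter(filter)   // apply the compiled partition predicate in memory
            .sorted(order)    // apply the requested ordering
            .limit(maxParts < 0 ? Long.MAX_VALUE : maxParts)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cached = List.of("ds=2020-09-01", "ds=2020-09-03", "ds=2020-09-02");
        System.out.println(
            listPartitionNames(cached, n -> n.compareTo("ds=2020-09-01") > 0, true, -1));
        // prints [ds=2020-09-02, ds=2020-09-03]
    }
}
```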
[jira] [Comment Edited] (HIVE-24039) Update jquery version to mitigate CVE-2020-11023
[ https://issues.apache.org/jira/browse/HIVE-24039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189789#comment-17189789 ] Rajkumar Singh edited comment on HIVE-24039 at 9/3/20, 3:21 AM: Hi Kishen, The pull request is already available for this, https://github.com/apache/hive/pull/1403, can you please review it? was (Author: rajkumar singh): Hi Kishen, The pul request is already available for this, https://github.com/apache/hive/pull/1403, can you please review it? > Update jquery version to mitigate CVE-2020-11023 > > > Key: HIVE-24039 > URL: https://issues.apache.org/jira/browse/HIVE-24039 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Rajkumar Singh >Assignee: Kishen Das >Priority: Major > > There is a known vulnerability in the jQuery version used by Hive; with this jira > the plan is to upgrade to jQuery version 3.5.0, where it has been fixed. More > details about the vulnerability can be found here: > https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24039) Update jquery version to mitigate CVE-2020-11023
[ https://issues.apache.org/jira/browse/HIVE-24039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189789#comment-17189789 ] Rajkumar Singh commented on HIVE-24039: --- Hi Kishen, The pull request is already available for this, https://github.com/apache/hive/pull/1403, can you please review it? > Update jquery version to mitigate CVE-2020-11023 > > > Key: HIVE-24039 > URL: https://issues.apache.org/jira/browse/HIVE-24039 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Rajkumar Singh >Assignee: Kishen Das >Priority: Major > > There is a known vulnerability in the jQuery version used by Hive; with this jira > the plan is to upgrade to jQuery version 3.5.0, where it has been fixed. More > details about the vulnerability can be found here: > https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-23981) Use task counter enum to get the approximate counter value
[ https://issues.apache.org/jira/browse/HIVE-23981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23981 started by mahesh kumar behera. -- > Use task counter enum to get the approximate counter value > -- > > Key: HIVE-23981 > URL: https://issues.apache.org/jira/browse/HIVE-23981 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > > The value for APPROXIMATE_INPUT_RECORDS should be obtained using the enum > name instead of a static string. Once a Tez release with the specific > information is done, we should change it to > org.apache.tez.common.counters.TaskCounter.APPROXIMATE_INPUT_RECORDS. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HIVE-23981) Use task counter enum to get the approximate counter value
[ https://issues.apache.org/jira/browse/HIVE-23981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera reopened HIVE-23981: > Use task counter enum to get the approximate counter value > -- > > Key: HIVE-23981 > URL: https://issues.apache.org/jira/browse/HIVE-23981 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > > The value for APPROXIMATE_INPUT_RECORDS should be obtained using the enum > name instead of a static string. Once a Tez release with the specific > information is done, we should change it to > org.apache.tez.common.counters.TaskCounter.APPROXIMATE_INPUT_RECORDS. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23953) Use task counter information to compute keycount during hashtable loading
[ https://issues.apache.org/jira/browse/HIVE-23953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera resolved HIVE-23953. Resolution: Fixed > Use task counter information to compute keycount during hashtable loading > - > > Key: HIVE-23953 > URL: https://issues.apache.org/jira/browse/HIVE-23953 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > There are cases when the compiler misestimates the key count, and this results in a > number of hashtable resizes at runtime. > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTableLoader.java#L128] > In such cases, it would be good to get the "approximate_input_records" (TEZ-4207) > counter from upstream to compute the key count more accurately at runtime. -- This message was sent by Atlassian Jira (v8.3.4#803005)
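The sizing arithmetic this issue enables can be sketched as follows: take an upstream row-count counter (e.g. approximate_input_records from TEZ-4207) and pick an initial power-of-two capacity large enough that the table never resizes while loading. The method name and bounds here are illustrative, not Hive's real VectorMapJoinFastHashTableLoader code:

```java
public class HashSizing {

    static int initialCapacity(long estimatedKeyCount, float loadFactor) {
        // Slots needed so that (keys / capacity) stays under the load factor,
        // with a small floor like most hash-table implementations use.
        long needed = Math.max((long) Math.ceil(estimatedKeyCount / loadFactor), 16);
        long cap = Long.highestOneBit(needed);
        if (cap < needed) cap <<= 1;           // round up to the next power of two
        return (int) Math.min(cap, 1L << 30);  // clamp to a sane array-size bound
    }

    public static void main(String[] args) {
        // 1000 keys at load factor 0.75 need 1334 slots -> next power of two is 2048.
        System.out.println(initialCapacity(1000L, 0.75f)); // prints 2048
    }
}
```

With a badly misestimated key count, the same table would instead start small and pay for several rehash-and-copy passes during load, which is the cost the counter avoids.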
[jira] [Updated] (HIVE-24115) Kryo's instantiation strategy should use the DefaultInstantiatorStrategy instead of the dangerous StdInstantiatorStrategy
[ https://issues.apache.org/jira/browse/HIVE-24115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hao updated HIVE-24115: --- Description: DefaultInstantiatorStrategy is the recommended way of creating objects with Kryo. It runs constructors just like would be done with Java code. Alternative, extralinguistic mechanisms can also be used to create objects. The [Objenesis|http://objenesis.org/] StdInstantiatorStrategy uses JVM specific APIs to create an instance of a class without calling any constructor at all. Using this is dangerous because most classes expect their constructors to be called. Creating the object by bypassing its constructors may leave the object in an uninitialized or invalid state. Classes must be designed to be created in this way. Kryo can be configured to try DefaultInstantiatorStrategy first, then fallback to StdInstantiatorStrategy if necessary like : kryo.setInstantiatorStrategy(new DefaultInstantiatorStrategy(new StdInstantiatorStrategy())); was:DefaultInstantiatorStrategy is the recommended way of creating objects with Kryo. It runs constructors just like would be done with Java code. Alternative, extralinguistic mechanisms can also be used to create objects. The [Objenesis|http://objenesis.org/] StdInstantiatorStrategy uses JVM specific APIs to create an instance of a class without calling any constructor at all. Using this is dangerous because most classes expect their constructors to be called. Creating the object by bypassing its constructors may leave the object in an uninitialized or invalid state. Classes must be designed to be created in this way. > Kryo's instantiation strategy should use the DefaultInstantiatorStrategy > instead of the dangerous StdInstantiatorStrategy > -- > > Key: HIVE-24115 > URL: https://issues.apache.org/jira/browse/HIVE-24115 > Project: Hive > Issue Type: Wish >Reporter: hao >Priority: Minor > > DefaultInstantiatorStrategy is the recommended way of creating objects with > Kryo. 
It runs constructors just as would be done with Java code. > Alternative, extralinguistic mechanisms can also be used to create objects. > The [Objenesis|http://objenesis.org/] StdInstantiatorStrategy uses JVM > specific APIs to create an instance of a class without calling any > constructor at all. Using this is dangerous because most classes expect their > constructors to be called. Creating the object by bypassing its constructors > may leave the object in an uninitialized or invalid state. Classes must be > designed to be created in this way. > Kryo can be configured to try DefaultInstantiatorStrategy first, then > fall back to StdInstantiatorStrategy if necessary, > like: > kryo.setInstantiatorStrategy(new DefaultInstantiatorStrategy(new > StdInstantiatorStrategy())); -- This message was sent by Atlassian Jira (v8.3.4#803005)
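The danger described above can be illustrated with plain reflection, without a Kryo dependency. A constructor-based strategy such as DefaultInstantiatorStrategy behaves like the reflective call below, so the invariant set up in the constructor holds; Objenesis' StdInstantiatorStrategy would allocate the object without running any constructor, leaving a field like `items` null and the object invalid:

```java
import java.util.ArrayList;
import java.util.List;

public class ConstructorMattersDemo {

    public static class Holder {
        final List<String> items;
        public Holder() { items = new ArrayList<>(); } // invariant: items is never null
    }

    static Holder viaConstructor() {
        try {
            // What a constructor-based instantiator strategy effectively does:
            return Holder.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder h = viaConstructor();
        h.items.add("ok"); // safe only because the constructor actually ran
        System.out.println(h.items); // prints [ok]
    }
}
```

A constructor-bypassing path would make `h.items.add("ok")` throw a NullPointerException, which is exactly the class of bug the fallback ordering avoids for constructor-having classes.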
[jira] [Commented] (HIVE-24039) Update jquery version to mitigate CVE-2020-11023
[ https://issues.apache.org/jira/browse/HIVE-24039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189782#comment-17189782 ] Kishen Das commented on HIVE-24039: --- Created a pull request -> [https://github.com/apache/hive/pull/1462] for review. > Update jquery version to mitigate CVE-2020-11023 > > > Key: HIVE-24039 > URL: https://issues.apache.org/jira/browse/HIVE-24039 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Rajkumar Singh >Assignee: Kishen Das >Priority: Major > > There is a known vulnerability in the jQuery version used by Hive; with this jira > the plan is to upgrade to jQuery version 3.5.0, where it has been fixed. More > details about the vulnerability can be found here: > https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24039) Update jquery version to mitigate CVE-2020-11023
[ https://issues.apache.org/jira/browse/HIVE-24039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24039 started by Kishen Das. - > Update jquery version to mitigate CVE-2020-11023 > > > Key: HIVE-24039 > URL: https://issues.apache.org/jira/browse/HIVE-24039 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Rajkumar Singh >Assignee: Kishen Das >Priority: Major > > There is a known vulnerability in the jQuery version used by Hive; with this jira > the plan is to upgrade to jQuery version 3.5.0, where it has been fixed. More > details about the vulnerability can be found here: > https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24039) Update jquery version to mitigate CVE-2020-11023
[ https://issues.apache.org/jira/browse/HIVE-24039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kishen Das reassigned HIVE-24039: - Assignee: Kishen Das (was: Rajkumar Singh) > Update jquery version to mitigate CVE-2020-11023 > > > Key: HIVE-24039 > URL: https://issues.apache.org/jira/browse/HIVE-24039 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Rajkumar Singh >Assignee: Kishen Das >Priority: Major > > There is a known vulnerability in the jQuery version used by Hive; with this jira > the plan is to upgrade to jQuery version 3.5.0, where it has been fixed. More > details about the vulnerability can be found here: > https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3
[ https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=478245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478245 ] ASF GitHub Bot logged work on HIVE-24035: - Author: ASF GitHub Bot Created on: 03/Sep/20 01:05 Start Date: 03/Sep/20 01:05 Worklog Time Spent: 10m Work Description: viirya commented on pull request #1398: URL: https://github.com/apache/hive/pull/1398#issuecomment-686170323 Ok, cool! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478245) Time Spent: 1h 40m (was: 1.5h) > Add Jenkinsfile for branch-2.3 > -- > > Key: HIVE-24035 > URL: https://issues.apache.org/jira/browse/HIVE-24035 > Project: Hive > Issue Type: Test >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > To enable precommit tests for github PR, we need to have a Jenkinsfile in the > repo. This is already done for master and branch-2. This adds the same for > branch-2.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3
[ https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=478242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478242 ] ASF GitHub Bot logged work on HIVE-24035: - Author: ASF GitHub Bot Created on: 03/Sep/20 00:55 Start Date: 03/Sep/20 00:55 Worklog Time Spent: 10m Work Description: sunchao commented on pull request #1398: URL: https://github.com/apache/hive/pull/1398#issuecomment-686166326 OK will do that. We can then add the file to branch-2.3 so that we can have a baseline of test results. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478242) Time Spent: 1.5h (was: 1h 20m) > Add Jenkinsfile for branch-2.3 > -- > > Key: HIVE-24035 > URL: https://issues.apache.org/jira/browse/HIVE-24035 > Project: Hive > Issue Type: Test >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > To enable precommit tests for github PR, we need to have a Jenkinsfile in the > repo. This is already done for master and branch-2. This adds the same for > branch-2.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3
[ https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=478240&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478240 ] ASF GitHub Bot logged work on HIVE-24035: - Author: ASF GitHub Bot Created on: 03/Sep/20 00:53 Start Date: 03/Sep/20 00:53 Worklog Time Spent: 10m Work Description: viirya commented on pull request #1398: URL: https://github.com/apache/hive/pull/1398#issuecomment-686165574 @sunchao Ok. I see. We can revert it first. I can continue looking into it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478240) Time Spent: 1h 20m (was: 1h 10m) > Add Jenkinsfile for branch-2.3 > -- > > Key: HIVE-24035 > URL: https://issues.apache.org/jira/browse/HIVE-24035 > Project: Hive > Issue Type: Test >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > To enable precommit tests for github PR, we need to have a Jenkinsfile in the > repo. This is already done for master and branch-2. This adds the same for > branch-2.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Attachment: HIVE-24114.01.patch > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Attachment: (was: HIVE-24114.01.patch) > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23668) Clean up Task for Hive Metrics
[ https://issues.apache.org/jira/browse/HIVE-23668?focusedWorklogId=478238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478238 ] ASF GitHub Bot logged work on HIVE-23668: - Author: ASF GitHub Bot Created on: 03/Sep/20 00:44 Start Date: 03/Sep/20 00:44 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1129: URL: https://github.com/apache/hive/pull/1129 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478238) Time Spent: 2h 40m (was: 2.5h) > Clean up Task for Hive Metrics > -- > > Key: HIVE-23668 > URL: https://issues.apache.org/jira/browse/HIVE-23668 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23668.01.patch, HIVE-23668.02.patch, > HIVE-23668.03.patch, HIVE-23668.04.patch, HIVE-23668.05.patch, > HIVE-23668.06.patch > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24035) Add Jenkinsfile for branch-2.3
[ https://issues.apache.org/jira/browse/HIVE-24035?focusedWorklogId=478237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478237 ] ASF GitHub Bot logged work on HIVE-24035: - Author: ASF GitHub Bot Created on: 03/Sep/20 00:35 Start Date: 03/Sep/20 00:35 Worklog Time Spent: 10m Work Description: sunchao commented on pull request #1398: URL: https://github.com/apache/hive/pull/1398#issuecomment-686159274 We made some progress in enabling this for branch-2.3; _only_ 781 tests are failing right now. @viirya it seems #1356 has caused some issues related to Guava, and that I merged it too early :(. There are lots of errors like the following: ``` [2020-09-01T21:07:31.008Z] Caused by: java.lang.NoSuchMethodError: location = jar:file:/home/jenkins/agent/workspace/hive-precommit_PR-1398/.git/m2/org/apache/calcite/calcite-core/1.10.0/calcite-core-1.10.0.jar!/org/apache/calcite/rel/RelCollationImpl.class. error msg = org.apache.calcite.rel.RelCollationImpl.<init>(Lorg/apache/hive/com/google/common/collect/ImmutableList;)V [2020-09-01T21:07:31.008Z] at org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getCollationList(RelOptHiveTable.java:186) [2020-09-01T21:07:31.008Z] at org.apache.calcite.rel.metadata.RelMdCollation.table(RelMdCollation.java:175) [2020-09-01T21:07:31.008Z] at org.apache.calcite.rel.metadata.RelMdCollation.collations(RelMdCollation.java:122) [2020-09-01T21:07:31.008Z] at GeneratedMetadataHandler_Collation.collations_$(Unknown Source) [2020-09-01T21:07:31.008Z] at GeneratedMetadataHandler_Collation.collations(Unknown Source) [2020-09-01T21:07:31.008Z] at org.apache.calcite.rel.metadata.RelMetadataQuery.collations(RelMetadataQuery.java:482) [2020-09-01T21:07:31.008Z] at org.apache.calcite.sql2rel.RelFieldTrimmer.trimChild(RelFieldTrimmer.java:189) [2020-09-01T21:07:31.008Z] at org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(RelFieldTrimmer.java:374) [2020-09-01T21:07:31.008Z] at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelFieldTrimmer.trimFields(HiveRelFieldTrimmer.java:273) [2020-09-01T21:07:31.008Z] ... 47 more ``` Please ignore the debugging message I inserted into the `NoSuchMethodError`. Note that the `RelCollationImpl` constructor is referring to the shaded Guava class instead of the original one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478237) Time Spent: 1h 10m (was: 1h) > Add Jenkinsfile for branch-2.3 > -- > > Key: HIVE-24035 > URL: https://issues.apache.org/jira/browse/HIVE-24035 > Project: Hive > Issue Type: Test >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > To enable precommit tests for github PR, we need to have a Jenkinsfile in the > repo. This is already done for master and branch-2. This adds the same for > branch-2.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
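The `NoSuchMethodError` above is characteristic of partial Guava shading: relocation rewrites Hive's own bytecode to reference `org.apache.hive.com.google.common...`, while the prebuilt calcite-core jar still exposes a constructor compiled against plain `com.google.common...`, so the shaded caller's signature never matches. A maven-shade-plugin stanza like the following (illustrative only, not the exact branch-2.3 build file) is the kind of configuration that produces the shaded references:

```xml
<!-- Relocating Guava rewrites this module's own class files, but NOT the
     bytecode of prebuilt dependencies such as calcite-core-1.10.0.jar;
     any API that passes Guava types across that boundary then fails with
     NoSuchMethodError at runtime. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>org.apache.hive.com.google.common</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

The usual fixes are to shade the offending dependency into the same artifact (so its references are rewritten too) or to keep the Guava types out of the cross-jar API surface.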
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Attachment: (was: HIVE-24114.01.patch) > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Attachment: HIVE-24114.01.patch > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24114: -- Labels: pull-request-available (was: ) > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Attachment: HIVE-24114.01.patch > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?focusedWorklogId=478235&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478235 ] ASF GitHub Bot logged work on HIVE-24114: - Author: ASF GitHub Bot Created on: 03/Sep/20 00:20 Start Date: 03/Sep/20 00:20 Worklog Time Spent: 10m Work Description: pkumarsinha opened a new pull request #1461: URL: https://github.com/apache/hive/pull/1461 …on target ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478235) Remaining Estimate: 0h Time Spent: 10m > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Status: Patch Available (was: Open) > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24114.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24114) Repl Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha updated HIVE-24114: Summary: Repl Load is not working with both staging and data copy on target (was: Load is not working with both staging and data copy on target ) > Repl Load is not working with both staging and data copy on target > --- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24114) Load is not working with both staging and data copy on target
[ https://issues.apache.org/jira/browse/HIVE-24114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pravin Sinha reassigned HIVE-24114: --- > Load is not working with both staging and data copy on target > -- > > Key: HIVE-24114 > URL: https://issues.apache.org/jira/browse/HIVE-24114 > Project: Hive > Issue Type: Task >Reporter: Pravin Sinha >Assignee: Pravin Sinha >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-21574) return wrong result when execute left join sql
[ https://issues.apache.org/jira/browse/HIVE-21574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189700#comment-17189700 ] Yevgeniya commented on HIVE-21574: -- facing exactly the same problem in Hive 3.1.2 using MR execution engine, while TEZ engine returns correct results > return wrong result when execute left join sql > -- > > Key: HIVE-21574 > URL: https://issues.apache.org/jira/browse/HIVE-21574 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.0 > Environment: hive 3.1.0 hdfs 3.1.1 >Reporter: Panda Song >Priority: Blocker > > Can somebody delete this issue please? > when I use a table instead of the sub select,I get the right result,much more > rows are joined together(metrics old_uv is bigger!!!) > Is there some bugs here? > Please help me ,thanks a lot!! > {code:java} > select > a.event_date, > count(distinct a.device_id) as uv, > count(distinct case when b.device_id is not null then b.device_id end) as > old_uv, > count(distinct a.device_id) - count(distinct case when b.device_id is not > null then b.device_id end) as new_uv > from > ( > select > event_date, > device_id, > qingting_id > from datacenter.bl_page_chain_day > where event_date = '2019-03-31' > and (current_content like '/membership5%' > or current_content like '/vips/members%' > or current_content like '/members/v2/%') > )a > left join > (select > b.device_id > from > lzq_test.first_buy_vip a > inner join datacenter.device_qingting b on a.qingting_id = b.qingting_id > where a.first_buy < '2019-03-31' > group by b.device_id > )b > on a.device_id = b.device_id > group by a.event_date; > {code} > plan: > {code:java} > Plan optimized by CBO. 
> > Vertex dependency in root stage > Map 1 <- Map 3 (BROADCAST_EDGE) > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 5 <- Map 4 (CUSTOM_SIMPLE_EDGE), Reducer 2 (ONE_TO_ONE_EDGE) > Reducer 6 <- Reducer 5 (SIMPLE_EDGE) > > Stage-0 >Fetch Operator > limit:-1 > Stage-1 >Reducer 6 >File Output Operator [FS_26] > Select Operator [SEL_25] (rows=35527639 width=349) >Output:["_col0","_col1","_col2","_col3"] >Group By Operator [GBY_24] (rows=35527639 width=349) > Output:["_col0","_col1","_col2"],aggregations:["count(DISTINCT > KEY._col1:0._col0)","count(DISTINCT KEY._col1:1._col0)"],keys:KEY._col0 ><-Reducer 5 [SIMPLE_EDGE] > SHUFFLE [RS_23] >PartitionCols:_col0 >Group By Operator [GBY_22] (rows=71055278 width=349) > > Output:["_col0","_col1","_col2","_col3","_col4"],aggregations:["count(DISTINCT > _col1)","count(DISTINCT _col2)"],keys:true, _col1, _col2 > Select Operator [SEL_20] (rows=71055278 width=349) >Output:["_col1","_col2"] >Map Join Operator [MAPJOIN_45] (rows=71055278 width=349) > > Conds:RS_17.KEY.reducesinkkey0=RS_18.KEY.reducesinkkey0(Right > Outer),Output:["_col0","_col1"] ><-Reducer 2 [ONE_TO_ONE_EDGE] > FORWARD [RS_17] >PartitionCols:_col0 >Group By Operator [GBY_12] (rows=21738609 width=235) > Output:["_col0"],keys:KEY._col0 ><-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_11] >PartitionCols:_col0 >Group By Operator [GBY_10] (rows=43477219 > width=235) > Output:["_col0"],keys:_col0 > Map Join Operator [MAPJOIN_44] (rows=43477219 > width=235) > > Conds:SEL_2._col1=RS_7._col0(Inner),Output:["_col0"] > <-Map 3 [BROADCAST_EDGE] >BROADCAST [RS_7] > PartitionCols:_col0 > Select Operator [SEL_5] (rows=301013 > width=228) >Output:["_col0"] >Filter Operator [FIL_32] (rows=301013 > width=228) > predicate:((first_buy < > DATE'2019-03-31') and qingting_id is not null)
[jira] [Commented] (HIVE-24113) NPE in GenericUDFToUnixTimeStamp
[ https://issues.apache.org/jira/browse/HIVE-24113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189688#comment-17189688 ] Rajkumar Singh commented on HIVE-24113: --- Created pull request: https://github.com/apache/hive/pull/1460/files > NPE in GenericUDFToUnixTimeStamp > > > Key: HIVE-24113 > URL: https://issues.apache.org/jira/browse/HIVE-24113 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.2 >Reporter: Rajkumar Singh >Assignee: Rajkumar Singh >Priority: Major > > Following query will trigger the getPartitionsByExpr call at HMS, HMS will > try to evaluate the filter based on the PartitionExpressionForMetastore > proxy, this proxy uses the QL packages to evaluate the filter and call > GenericUDFToUnixTimeStamp. > select * from table_name where hour between > from_unixtime(unix_timestamp('2020090120', 'MMddHH') - 1*60*60, > 'MMddHH') and from_unixtime(unix_timestamp('2020090122', 'MMddHH') + > 2*60*60, 'MMddHH'); > I think SessionState in the code path will always be NULL thats why it hit > the NPE. 
> {code:java} > java.lang.NullPointerException: null > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initializeInput(GenericUDFToUnixTimeStamp.java:126) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:75) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:146) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.prepareExpr(PartExprEvalUtils.java:119) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prunePartitionNames(PartitionPruner.java:551) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.filterPartitionsByExpr(PartitionExpressionForMetastore.java:82) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:3527) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.access$1400(ObjectStore.java:252) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3493) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3464) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3764) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3499) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3452) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_112] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_112] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_112] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at
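The failure mode reported above (SessionState is null when the UDF is initialized inside the metastore) can be illustrated with a small sketch. This is a hypothetical illustration, not the actual HIVE-24113 patch; `sessionZone` is a simplified stand-in for the state normally obtained via `SessionState.get()`:

```java
import java.time.ZoneId;

public class SessionTimeZoneSketch {

    /** Stand-in for per-session state; null when running inside HMS. */
    public static ZoneId sessionZone = null;

    /** Resolve the time zone, guarding against a missing session. */
    public static ZoneId resolveTimeZone() {
        if (sessionZone == null) {
            // No client session exists (e.g. partition pruning inside the
            // metastore): fall back to the JVM default instead of hitting
            // an NPE on the null session state.
            return ZoneId.systemDefault();
        }
        return sessionZone;
    }
}
```

The null guard is the essential point: any UDF code path reachable from `PartitionExpressionForMetastore` cannot assume a HiveServer2 session is present.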
[jira] [Assigned] (HIVE-24113) NPE in GenericUDFToUnixTimeStamp
[ https://issues.apache.org/jira/browse/HIVE-24113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajkumar Singh reassigned HIVE-24113: - > NPE in GenericUDFToUnixTimeStamp > > > Key: HIVE-24113 > URL: https://issues.apache.org/jira/browse/HIVE-24113 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.1.2 >Reporter: Rajkumar Singh >Assignee: Rajkumar Singh >Priority: Major > > Following query will trigger the getPartitionsByExpr call at HMS, HMS will > try to evaluate the filter based on the PartitionExpressionForMetastore > proxy, this proxy uses the QL packages to evaluate the filter and call > GenericUDFToUnixTimeStamp. > select * from table_name where hour between > from_unixtime(unix_timestamp('2020090120', 'MMddHH') - 1*60*60, > 'MMddHH') and from_unixtime(unix_timestamp('2020090122', 'MMddHH') + > 2*60*60, 'MMddHH'); > I think SessionState in the code path will always be NULL thats why it hit > the NPE. > {code:java} > java.lang.NullPointerException: null > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initializeInput(GenericUDFToUnixTimeStamp.java:126) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:75) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:146) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > 
~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.prepareExpr(PartExprEvalUtils.java:119) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prunePartitionNames(PartitionPruner.java:551) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.filterPartitionsByExpr(PartitionExpressionForMetastore.java:82) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:3527) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.access$1400(ObjectStore.java:252) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3493) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3464) > ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3764) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3499) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3452) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) > ~[?:1.8.0_112] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_112] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_112] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] > at com.sun.proxy.$Proxy28.getPartitionsByExpr(Unknown Source) [?:?] > at >
[jira] [Work logged] (HIVE-24084) Enhance cost model to push down more Aggregates
[ https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=478065=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478065 ] ASF GitHub Bot logged work on HIVE-24084: - Author: ASF GitHub Bot Created on: 02/Sep/20 18:49 Start Date: 02/Sep/20 18:49 Worklog Time Spent: 10m Work Description: jcamachor commented on a change in pull request #1439: URL: https://github.com/apache/hive/pull/1439#discussion_r482301409 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveOnTezCostModel.java ## @@ -89,22 +89,23 @@ public RelOptCost getAggregateCost(HiveAggregate aggregate) { } else { final RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery(); // 1. Sum of input cardinalities - final Double rCount = mq.getRowCount(aggregate.getInput()); - if (rCount == null) { + final Double inputRowCount = mq.getRowCount(aggregate.getInput()); + final Double rowCount = mq.getRowCount(aggregate); + if (inputRowCount == null || rowCount == null) { return null; } // 2. CPU cost = sorting cost - final double cpuCost = algoUtils.computeSortCPUCost(rCount); + final double cpuCost = algoUtils.computeSortCPUCost(rowCount) + inputRowCount * algoUtils.getCpuUnitCost(); Review comment: I think the problem is that we are trying to encapsulate here the algorithm selection too: The fact that we are grouping in each node before sorting the data (I think this is also somehow reflected in the `isLe` discussion above). However, that is not represented with precision by current model, since output rows is supposed to be the output of the final step in the aggregation. Wrt read, there is also the IO part of the cost, I am trying to understand whether some of the cost representation that you are talking about is IO. 
There is some more info about the original formulas that were used to compute this here: https://cwiki.apache.org/confluence/display/Hive/Cost-based+optimization+in+Hive Can we split this into two patches and have the changes to the cost model on their own? This should also help to discuss this in more detail. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 478065) Time Spent: 1.5h (was: 1h 20m) > Enhance cost model to push down more Aggregates > --- > > Key: HIVE-24084 > URL: https://issues.apache.org/jira/browse/HIVE-24084 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
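The diff under review above changes the aggregate CPU cost from sorting the input row count to sorting the output row count plus a per-input-row scan charge. A minimal sketch of that formula, with simplified stand-in names (the real code is in `HiveOnTezCostModel` and `algoUtils`):

```java
public class AggregateCostSketch {

    // Assumed unit cost, standing in for algoUtils.getCpuUnitCost().
    static final double CPU_UNIT_COST = 1.0;

    /** n * log2(n) comparison cost, a stand-in for computeSortCPUCost. */
    static double sortCpuCost(double rows) {
        return rows <= 1.0 ? 0.0
                           : rows * (Math.log(rows) / Math.log(2.0)) * CPU_UNIT_COST;
    }

    /** Returns null when row-count stats are unavailable, like the original. */
    public static Double aggregateCpuCost(Double inputRowCount, Double rowCount) {
        if (inputRowCount == null || rowCount == null) {
            return null;
        }
        // Revised model: sort only the aggregate's output rows, but pay one
        // CPU unit for every input row read.
        return sortCpuCost(rowCount) + inputRowCount * CPU_UNIT_COST;
    }
}
```

When the aggregate collapses many input rows into few groups, this prices the operator far lower than sorting the full input, which is what makes pushing the aggregate below the join attractive.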
[jira] [Work logged] (HIVE-23928) Support conversion of not-exists to Anti join directly
[ https://issues.apache.org/jira/browse/HIVE-23928?focusedWorklogId=477998&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477998 ] ASF GitHub Bot logged work on HIVE-23928: - Author: ASF GitHub Bot Created on: 02/Sep/20 17:43 Start Date: 02/Sep/20 17:43 Worklog Time Spent: 10m Work Description: maheshk114 opened a new pull request #1459: URL: https://github.com/apache/hive/pull/1459 The current anti-join conversion does not support converting NOT EXISTS to an anti join directly: the NOT EXISTS subquery is first converted to a left outer join, which is then converted to an anti join. This may cause some optimization rules to be skipped. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477998) Remaining Estimate: 0h Time Spent: 10m > Support conversion of not-exists to Anti join directly > -- > > Key: HIVE-23928 > URL: https://issues.apache.org/jira/browse/HIVE-23928 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The current anti join conversion does not support direct conversion of > not-exists to anti join. The not exists sub query is converted first to left > out join and then its converted to anti join. This may cause some of the > optimization rule to be skipped. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23928) Support conversion of not-exists to Anti join directly
[ https://issues.apache.org/jira/browse/HIVE-23928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23928: -- Labels: pull-request-available (was: ) > Support conversion of not-exists to Anti join directly > -- > > Key: HIVE-23928 > URL: https://issues.apache.org/jira/browse/HIVE-23928 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > The current anti join conversion does not support direct conversion of > not-exists to anti join. The not exists sub query is converted first to left > out join and then its converted to anti join. This may cause some of the > optimization rule to be skipped. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24091) Replace multiple constraints call with getAllTableConstraints api call in query planner
[ https://issues.apache.org/jira/browse/HIVE-24091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Sharma updated HIVE-24091: - Description: Inorder to get all the constraints of table i.e. PrimaryKey, ForeignKey, UniqueConstraint ,NotNullConstraint ,DefaultConstraint ,CheckConstraint. We have to do 6 different metastore call. Replace these call with one getAllTableConstraints api which provide all the constraints at once (was: Inorder get all the constraints of table i.e. PrimaryKey, ForeignKey, UniqueConstraint ,NotNullConstraint ,DefaultConstraint ,CheckConstraint. We have to do 6 different metastore call. Replace these call with one getAllTableConstraints api which provide all the constraints at once) > Replace multiple constraints call with getAllTableConstraints api call in > query planner > --- > > Key: HIVE-24091 > URL: https://issues.apache.org/jira/browse/HIVE-24091 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Inorder to get all the constraints of table i.e. PrimaryKey, ForeignKey, > UniqueConstraint ,NotNullConstraint ,DefaultConstraint ,CheckConstraint. We > have to do 6 different metastore call. Replace these call with one > getAllTableConstraints api which provide all the constraints at once -- This message was sent by Atlassian Jira (v8.3.4#803005)
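The consolidation described in HIVE-24091 can be sketched as follows. This is a hedged illustration: the method names mirror the JIRA text but are simplified stand-ins, not the real `IMetaStoreClient` API. Fetching each of the six constraint types separately costs six metastore round trips, while a single `getAllTableConstraints`-style call costs one:

```java
public class ConstraintFetchSketch {

    /** Counts simulated metastore round trips. */
    public static int rpcCalls = 0;

    private static void fetchOne(String constraintKind) {
        rpcCalls++;  // one RPC per constraint type
    }

    /** Old behaviour: six separate metastore calls from the query planner. */
    public static void fetchAllSeparately(String db, String table) {
        String[] kinds = {"PrimaryKey", "ForeignKey", "UniqueConstraint",
                          "NotNullConstraint", "DefaultConstraint", "CheckConstraint"};
        for (String kind : kinds) {
            fetchOne(kind);
        }
    }

    /** New behaviour: one call returns all six constraint lists at once. */
    public static void fetchAllAtOnce(String db, String table) {
        rpcCalls++;  // a single RPC
    }
}
```

For a plan touching many tables, the saving is multiplicative: five fewer round trips per table during planning.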
[jira] [Work logged] (HIVE-24084) Enhance cost model to push down more Aggregates
[ https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=477992=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477992 ] ASF GitHub Bot logged work on HIVE-24084: - Author: ASF GitHub Bot Created on: 02/Sep/20 17:27 Start Date: 02/Sep/20 17:27 Worklog Time Spent: 10m Work Description: jcamachor commented on a change in pull request #1439: URL: https://github.com/apache/hive/pull/1439#discussion_r482243376 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java ## @@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) { } } + /** + * Determines weather the give grouping is unique. + * + * Consider a join which might produce non-unique rows; but later the results are aggregated again. + * This method determines if there are sufficient columns in the grouping which have been present previously as unique column(s). + */ + private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) { +if (groups.isEmpty()) { + return false; +} +RelMetadataQuery mq = input.getCluster().getMetadataQuery(); +Set uKeys = mq.getUniqueKeys(input); +for (ImmutableBitSet u : uKeys) { + if (groups.contains(u)) { +return true; + } +} +if (input instanceof Join) { + Join join = (Join) input; + RexBuilder rexBuilder = input.getCluster().getRexBuilder(); + SimpleConditionInfo cond = new SimpleConditionInfo(join.getCondition(), rexBuilder); + + if (cond.valid) { +ImmutableBitSet newGroup = groups.intersect(ImmutableBitSet.fromBitSet(cond.fields)); +RelNode l = join.getLeft(); +RelNode r = join.getRight(); + +int joinFieldCount = join.getRowType().getFieldCount(); +int lFieldCount = l.getRowType().getFieldCount(); + +ImmutableBitSet groupL = newGroup.get(0, lFieldCount); +ImmutableBitSet groupR = newGroup.get(lFieldCount, joinFieldCount).shift(-lFieldCount); + +if (isGroupingUnique(l, groupL)) { Review comment: Could you execute `areColumnsUnique` on the join input then? 
Wouldn't that simplify this logic? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477992) Time Spent: 1h 20m (was: 1h 10m) > Enhance cost model to push down more Aggregates > --- > > Key: HIVE-24084 > URL: https://issues.apache.org/jira/browse/HIVE-24084 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23454) Querying hive table which has Materialized view fails with HiveAccessControlException
[ https://issues.apache.org/jira/browse/HIVE-23454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189573#comment-17189573 ] Vineet Garg commented on HIVE-23454: [~nisgoel] Are you working on this? If you aren't I would like to take it over. > Querying hive table which has Materialized view fails with > HiveAccessControlException > - > > Key: HIVE-23454 > URL: https://issues.apache.org/jira/browse/HIVE-23454 > Project: Hive > Issue Type: Bug > Components: Authorization, HiveServer2 >Affects Versions: 3.0.0, 3.2.0 >Reporter: Chiran Ravani >Assignee: Nishant Goel >Priority: Critical > > Query fails with HiveAccessControlException against table when there is > Materialized view pointing to that table which end user does not have access > to, but the actual table user has all the privileges. > From the HiveServer2 logs - it looks as part of optimization Hive uses > materialized view to query the data instead of table and since end user does > not have access on MV we receive HiveAccessControlException. > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveVolcanoPlanner.java#L99 > The Simplest reproducer for this issue is as below. > 1. Create a table using hive user and insert some data > {code:java} > create table db1.testmvtable(id int, name string) partitioned by(year int); > insert into db1.testmvtable partition(year=2020) values(1,'Name1'); > insert into db1.testmvtable partition(year=2020) values(1,'Name2'); > insert into db1.testmvtable partition(year=2016) values(1,'Name1'); > insert into db1.testmvtable partition(year=2016) values(1,'Name2'); > {code} > 2. Create Materialized view on top of above table with partitioned and where > clause as hive user. > {code:java} > CREATE MATERIALIZED VIEW db2.testmv PARTITIONED ON(year) as select * from > db1.testmvtable tmv where year >= 2018; > {code} > 3. Grant all (Select to be minimum) access to user 'chiran' via Ranger on > database db1. 
> 4. Run select on base table db1.testmvtable as 'chiran' with where clause > having partition value >=2018, it runs into HiveAccessControlException on > db2.testmv > {code:java} > eg:- (select * from db1.testmvtable where year=2020;) > 0: jdbc:hive2://node2> select * from db1.testmvtable where year=2020; > Error: Error while compiling statement: FAILED: HiveAccessControlException > Permission denied: user [chiran] does not have [SELECT] privilege on > [db2/testmv/*] (state=42000,code=4) > {code} > 5. This works when partition column is not in MV > {code:java} > 0: jdbc:hive2://node2> select * from db1.testmvtable where year=2016; > DEBUG : Acquired the compile lock. > INFO : Compiling > command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a): > select * from db1.testmvtable where year=2016 > DEBUG : Encoding valid txns info 897:9223372036854775807::893,895,896 > txnid:897 > INFO : Semantic Analysis Completed (retrial = false) > INFO : Returning Hive schema: > Schema(fieldSchemas:[FieldSchema(name:testmvtable.id, type:int, > comment:null), FieldSchema(name:testmvtable.name, type:string, comment:null), > FieldSchema(name:testmvtable.year, type:int, comment:null)], properties:null) > INFO : Completed compiling > command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a); > Time taken: 0.222 seconds > DEBUG : Encoding valid txn write ids info > 897$db1.testmvtable:4:9223372036854775807:: txnid:897 > INFO : Executing > command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a): > select * from db1.testmvtable where year=2016 > INFO : Completed executing > command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a); > Time taken: 0.008 seconds > INFO : OK > DEBUG : Shutting down query select * from db1.testmvtable where year=2016 > +-+---+---+ > | testmvtable.id | testmvtable.name | testmvtable.year | > +-+---+---+ > | 1 | Name1 | 2016 | > | 1 | Name2 | 2016 | > +-+---+---+ > 2 rows selected (0.302 seconds) > 
0: jdbc:hive2://node2> > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries
[ https://issues.apache.org/jira/browse/HIVE-24081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez resolved HIVE-24081. Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master, thanks [~kkasa]! > Enable pre-materializing CTEs referenced in scalar subqueries > - > > Key: HIVE-24081 > URL: https://issues.apache.org/jira/browse/HIVE-24081 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > HIVE-11752 introduces materializing CTE based on config > {code} > hive.optimize.cte.materialize.threshold > {code} > Goal of this jira is > * extending the implementation to support materializing CTE's referenced in > scalar subqueries > * add a config to materialize CTEs with aggregate output only -- This message was sent by Atlassian Jira (v8.3.4#803005)
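The `hive.optimize.cte.materialize.threshold` config quoted above controls when a CTE is pre-materialized. A hedged usage sketch (threshold semantics per HIVE-11752; `src` is an assumed example table): a CTE referenced at least the threshold number of times is materialized into a temporary table before the main query runs.

```sql
-- Materialize any CTE referenced 2 or more times.
SET hive.optimize.cte.materialize.threshold=2;

WITH q AS (SELECT key, count(*) AS cnt FROM src GROUP BY key)
SELECT a.key, a.cnt, b.cnt
FROM q a JOIN q b ON a.key = b.key;   -- q is referenced twice
```

HIVE-24081 extends this so that references from scalar subqueries also count, and adds a config to restrict materialization to CTEs with aggregate output.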
[jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries
[ https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=477976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477976 ] ASF GitHub Bot logged work on HIVE-24081: - Author: ASF GitHub Bot Created on: 02/Sep/20 17:01 Start Date: 02/Sep/20 17:01 Worklog Time Spent: 10m Work Description: jcamachor merged pull request #1437: URL: https://github.com/apache/hive/pull/1437 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477976) Time Spent: 1h 50m (was: 1h 40m) > Enable pre-materializing CTEs referenced in scalar subqueries > - > > Key: HIVE-24081 > URL: https://issues.apache.org/jira/browse/HIVE-24081 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > HIVE-11752 introduces materializing CTE based on config > {code} > hive.optimize.cte.materialize.threshold > {code} > Goal of this jira is > * extending the implementation to support materializing CTE's referenced in > scalar subqueries > * add a config to materialize CTEs with aggregate output only -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23406) SharedWorkOptimizer should check nullSortOrders when comparing ReduceSink operators
[ https://issues.apache.org/jira/browse/HIVE-23406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa reassigned HIVE-23406: - Assignee: Krisztian Kasa (was: Jesus Camacho Rodriguez) > SharedWorkOptimizer should check nullSortOrders when comparing ReduceSink > operators > --- > > Key: HIVE-23406 > URL: https://issues.apache.org/jira/browse/HIVE-23406 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > > SharedWorkOptimizer does not check the null sort order in ReduceSinkDesc when > comparing ReduceSink operators: > > [https://github.com/apache/hive/blob/ca9aba606c4d09b91ee28bf9ee1ae918db8cdfb9/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java#L1444] > {code:java} > ReduceSinkDesc op1Conf = ((ReduceSinkOperator) op1).getConf(); > ReduceSinkDesc op2Conf = ((ReduceSinkOperator) op2).getConf(); > if (StringUtils.equals(op1Conf.getKeyColString(), > op2Conf.getKeyColString()) && > StringUtils.equals(op1Conf.getValueColsString(), > op2Conf.getValueColsString()) && > StringUtils.equals(op1Conf.getParitionColsString(), > op2Conf.getParitionColsString()) && > op1Conf.getTag() == op2Conf.getTag() && > StringUtils.equals(op1Conf.getOrder(), op2Conf.getOrder()) && > op1Conf.getTopN() == op2Conf.getTopN() && > canDeduplicateReduceTraits(op1Conf, op2Conf)) { > return true; > } else { > return false; > } > {code} > An expression like > {code:java} > StringUtils.equals(op1Conf.getNullOrder(), op2Conf.getNullOrder()) && > {code} > should be added. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
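The missing check described in HIVE-23406 can be sketched with a stub config class (the real one is `ReduceSinkDesc`; this is a simplified illustration, not the actual patch). Two ReduceSink operators with the same sort order but different null ordering must not be treated as equivalent:

```java
import java.util.Objects;

public class RsCompareSketch {

    /** Stub for the relevant ReduceSinkDesc fields. */
    public static class RsConf {
        public final String order;      // e.g. "++" (both keys ascending)
        public final String nullOrder;  // e.g. "az" (nulls first / nulls last)
        public RsConf(String order, String nullOrder) {
            this.order = order;
            this.nullOrder = nullOrder;
        }
    }

    /** Equivalence check including the previously missing nullOrder term. */
    public static boolean compatible(RsConf a, RsConf b) {
        return Objects.equals(a.order, b.order)
            && Objects.equals(a.nullOrder, b.nullOrder);  // the added check
    }
}
```

Without the `nullOrder` comparison, SharedWorkOptimizer could merge two sinks that emit rows in genuinely different orders, producing wrong results downstream.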
[jira] [Resolved] (HIVE-24059) Llap external client - Initial changes for running in cloud environment
[ https://issues.apache.org/jira/browse/HIVE-24059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anishek Agarwal resolved HIVE-24059. Resolution: Fixed committed to master, thanks for patch [~ShubhamChaurasia] and review [~prasanth_j] > Llap external client - Initial changes for running in cloud environment > --- > > Key: HIVE-24059 > URL: https://issues.apache.org/jira/browse/HIVE-24059 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Shubham Chaurasia >Assignee: Shubham Chaurasia >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24059.01.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Please see problem description in > https://issues.apache.org/jira/browse/HIVE-24058 > Initial changes include - > 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) > side. > 2. Opening additional RPC port in LLAP Daemon. > 3. JWT Based authentication on this port. > cc [~prasanth_j] [~jdere] [~anishek] [~thejas] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24059) Llap external client - Initial changes for running in cloud environment
[ https://issues.apache.org/jira/browse/HIVE-24059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shubham Chaurasia updated HIVE-24059: - Attachment: HIVE-24059.01.patch > Llap external client - Initial changes for running in cloud environment > --- > > Key: HIVE-24059 > URL: https://issues.apache.org/jira/browse/HIVE-24059 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Shubham Chaurasia >Assignee: Shubham Chaurasia >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24059.01.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Please see problem description in > https://issues.apache.org/jira/browse/HIVE-24058 > Initial changes include - > 1. Moving LLAP discovery logic from client side to server (HS2 / get_splits) > side. > 2. Opening additional RPC port in LLAP Daemon. > 3. JWT Based authentication on this port. > cc [~prasanth_j] [~jdere] [~anishek] [~thejas] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries
[ https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477932=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477932 ] ASF GitHub Bot logged work on HIVE-24031: - Author: ASF GitHub Bot Created on: 02/Sep/20 15:02 Start Date: 02/Sep/20 15:02 Worklog Time Spent: 10m Work Description: zabetak opened a new pull request #1424: URL: https://github.com/apache/hive/pull/1424 ### What changes were proposed in this pull request? 1. Drop the defensive copy of children inside ASTNode#getChildren. 2. Protect clients from accidentally modifying the list by returning an unmodifiable collection. ### Why are the changes needed? Profiling shows that the vast majority of time is spent creating defensive copies of the node expression list inside ASTNode#getChildren. The method is called extensively from various places in the code, especially those walking over the expression tree, so it needs to be efficient. Most of the time creating defensive copies is not necessary. For those cases (if any) where the list needs to be modified, clients should perform a copy themselves. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? The test was added in a separate branch since it is not meant to be committed upstream for the following reasons: - the query for reproducing the problem takes up a few MBs - it requires some changes to the default configurations. If you want to run the test, run the following commands: ``` git checkout -b HIVE-24031-TEST master git pull g...@github.com:zabetak/hive.git HIVE-24031-PLUS-TEST mvn clean install -DskipTests cd itests mvn clean install -DskipTests cd qtest mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=big_query_with_array_constructor.q -Dtest.output.overwrite ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477932) Time Spent: 40m (was: 0.5h) > Infinite planning time on syntactically big queries > --- > > Key: HIVE-24031 > URL: https://issues.apache.org/jira/browse/HIVE-24031 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: ASTNode_getChildren_cost.png, > query_big_array_constructor.nps > > Time Spent: 40m > Remaining Estimate: 0h > > Syntactically big queries (~1 million tokens), such as the query shown below, > lead to very big (seemingly infinite) planning times. > {code:sql} > select posexplode(array('item1', 'item2', ..., 'item1M')); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
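The two changes described in the pull request can be sketched on a toy node class (an assumption-laden sketch, not the real `ASTNode`, which stores its children in an ANTLR tree):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NodeSketch {
    private final List<NodeSketch> children = new ArrayList<>();

    public void addChild(NodeSketch c) {
        children.add(c);
    }

    // Before the fix: return new ArrayList<>(children) - an O(n) defensive
    // copy on every call, which dominates planning time when tree walkers
    // call it on every step of a ~1M-token expression tree.
    // After the fix: return a cheap unmodifiable view instead.
    public List<NodeSketch> getChildren() {
        return Collections.unmodifiableList(children);
    }

    public static void main(String[] args) {
        NodeSketch root = new NodeSketch();
        root.addChild(new NodeSketch());
        List<NodeSketch> view = root.getChildren();
        System.out.println(view.size()); // prints "1"
        try {
            view.add(new NodeSketch());
        } catch (UnsupportedOperationException e) {
            // Clients that used to mutate the returned copy now fail fast
            // and must make their own copy explicitly.
            System.out.println("unmodifiable");
        }
    }
}
```

The unmodifiable wrapper is a constant-time view over the backing list, so tree walks become O(1) per `getChildren` call while accidental mutation still fails loudly.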
[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries
[ https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477930=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477930 ] ASF GitHub Bot logged work on HIVE-24031: - Author: ASF GitHub Bot Created on: 02/Sep/20 15:02 Start Date: 02/Sep/20 15:02 Worklog Time Spent: 10m Work Description: zabetak commented on pull request #1424: URL: https://github.com/apache/hive/pull/1424#issuecomment-685795081 Closing pull request to trigger pre-commits This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477930) Time Spent: 20m (was: 10m) > Infinite planning time on syntactically big queries > --- > > Key: HIVE-24031 > URL: https://issues.apache.org/jira/browse/HIVE-24031 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: ASTNode_getChildren_cost.png, > query_big_array_constructor.nps > > Time Spent: 20m > Remaining Estimate: 0h > > Syntactically big queries (~1 million tokens), such as the query shown below, > lead to very big (seemingly infinite) planning times. > {code:sql} > select posexplode(array('item1', 'item2', ..., 'item1M')); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries
[ https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477931=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477931 ] ASF GitHub Bot logged work on HIVE-24031: - Author: ASF GitHub Bot Created on: 02/Sep/20 15:02 Start Date: 02/Sep/20 15:02 Worklog Time Spent: 10m Work Description: zabetak closed pull request #1424: URL: https://github.com/apache/hive/pull/1424 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477931) Time Spent: 0.5h (was: 20m) > Infinite planning time on syntactically big queries > --- > > Key: HIVE-24031 > URL: https://issues.apache.org/jira/browse/HIVE-24031 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: ASTNode_getChildren_cost.png, > query_big_array_constructor.nps > > Time Spent: 0.5h > Remaining Estimate: 0h > > Syntactically big queries (~1 million tokens), such as the query shown below, > lead to very big (seemingly infinite) planning times. > {code:sql} > select posexplode(array('item1', 'item2', ..., 'item1M')); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24084) Enhance cost model to push down more Aggregates
[ https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=477907=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477907 ] ASF GitHub Bot logged work on HIVE-24084: - Author: ASF GitHub Bot Created on: 02/Sep/20 14:41 Start Date: 02/Sep/20 14:41 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1439: URL: https://github.com/apache/hive/pull/1439#discussion_r482123540 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveOnTezCostModel.java ## @@ -89,22 +89,23 @@ public RelOptCost getAggregateCost(HiveAggregate aggregate) { } else { final RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery(); // 1. Sum of input cardinalities - final Double rCount = mq.getRowCount(aggregate.getInput()); - if (rCount == null) { + final Double inputRowCount = mq.getRowCount(aggregate.getInput()); + final Double rowCount = mq.getRowCount(aggregate); + if (inputRowCount == null || rowCount == null) { return null; } // 2. CPU cost = sorting cost - final double cpuCost = algoUtils.computeSortCPUCost(rCount); + final double cpuCost = algoUtils.computeSortCPUCost(rowCount) + inputRowCount * algoUtils.getCpuUnitCost(); // 3. 
IO cost = cost of writing intermediary results to local FS + // cost of reading from local FS for transferring to GBy + // cost of transferring map outputs to GBy operator final Double rAverageSize = mq.getAverageRowSize(aggregate.getInput()); if (rAverageSize == null) { return null; } - final double ioCost = algoUtils.computeSortIOCost(new Pair(rCount,rAverageSize)); + final double ioCost = algoUtils.computeSortIOCost(new Pair(rowCount, rAverageSize)); Review comment: if we will be doing a 2 phase groupby: every mapper will do some grouping before it starts emitting; in case `iRC >> oRC` the mappers could eliminate a lot of rows; and they will most likely utilize `O(oRC)` io. This is an underestimation; I wanted to multiply it with the number of mappers - but I don't think that's known at this point. I can add a config key for a fixed multiplier. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477907) Time Spent: 1h 10m (was: 1h) > Enhance cost model to push down more Aggregates > --- > > Key: HIVE-24084 > URL: https://issues.apache.org/jira/browse/HIVE-24084 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24084) Enhance cost model to push down more Aggregates
[ https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=477900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477900 ] ASF GitHub Bot logged work on HIVE-24084: - Author: ASF GitHub Bot Created on: 02/Sep/20 14:35 Start Date: 02/Sep/20 14:35 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1439: URL: https://github.com/apache/hive/pull/1439#discussion_r482119043 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveOnTezCostModel.java ## @@ -89,22 +89,23 @@ public RelOptCost getAggregateCost(HiveAggregate aggregate) { } else { final RelMetadataQuery mq = aggregate.getCluster().getMetadataQuery(); // 1. Sum of input cardinalities - final Double rCount = mq.getRowCount(aggregate.getInput()); - if (rCount == null) { + final Double inputRowCount = mq.getRowCount(aggregate.getInput()); + final Double rowCount = mq.getRowCount(aggregate); + if (inputRowCount == null || rowCount == null) { return null; } // 2. 
CPU cost = sorting cost - final double cpuCost = algoUtils.computeSortCPUCost(rCount); + final double cpuCost = algoUtils.computeSortCPUCost(rowCount) + inputRowCount * algoUtils.getCpuUnitCost(); Review comment: maybe...I'm trying to catch the case when `inputRowCount >> outputRowCount`; we are also grouping - so it will not be a full sort at all; I was using the above to achieve: ``` log(outputRowCount)*outputRowCount + inputRowCount*COST ``` the rationale behind this is that it needs to really sort `oRC` and read `iRC` rows - this could be an underestimation...but `log(iRC)*iRC` was highly overestimating the cost. One alternative for the above could be to use: ``` log(outputRowCount) * inputRowCount ``` the rationale behind this: we will need to find the place for every input row; but we also know that the output will be at most `outputRowCount` - so it shouldn't take more time to find the place for the actual row than `log(outputRowCount)` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477900) Time Spent: 1h (was: 50m) > Enhance cost model to push down more Aggregates > --- > > Key: HIVE-24084 > URL: https://issues.apache.org/jira/browse/HIVE-24084 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
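The estimates being debated in the review thread can be compared numerically. This is a sketch only: the unit-cost constant below is a made-up stand-in for what Hive's algorithm utils would supply, not the real value:

```java
public class AggCostSketch {
    // Hypothetical CPU unit cost; Hive takes this from its cost model config.
    public static final double CPU_UNIT = 1.0;

    // Old estimate: a full sort of the input cardinality.
    public static double oldCpuCost(double iRC) {
        return iRC * Math.log(iRC) * CPU_UNIT;
    }

    // Patched estimate from the diff: sort only the output cardinality,
    // plus one linear pass over the input (log(oRC)*oRC + iRC*COST).
    public static double newCpuCost(double iRC, double oRC) {
        return oRC * Math.log(oRC) * CPU_UNIT + iRC * CPU_UNIT;
    }

    // Alternative floated in the thread: log(oRC) comparisons per input row,
    // since no row needs more work than finding its slot among oRC groups.
    public static double altCpuCost(double iRC, double oRC) {
        return Math.log(oRC) * iRC * CPU_UNIT;
    }

    public static void main(String[] args) {
        double iRC = 1_000_000, oRC = 100; // inputRowCount >> outputRowCount
        System.out.printf("old=%.0f new=%.0f alt=%.0f%n",
            oldCpuCost(iRC), newCpuCost(iRC, oRC), altCpuCost(iRC, oRC));
        // The old estimate heavily penalizes aggregates that collapse many
        // rows, which is why pushing them below joins rarely looked cheaper.
    }
}
```

With these numbers the old estimate is roughly `1e6 * ln(1e6) ≈ 1.4e7` units while the patched one is about `1e6`, an order of magnitude lower, so an aggregate that collapses the input now looks much cheaper to push down.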
[jira] [Work logged] (HIVE-24084) Enhance cost model to push down more Aggregates
[ https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=477884=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477884 ] ASF GitHub Bot logged work on HIVE-24084: - Author: ASF GitHub Bot Created on: 02/Sep/20 14:00 Start Date: 02/Sep/20 14:00 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1439: URL: https://github.com/apache/hive/pull/1439#discussion_r482092255 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java ## @@ -290,7 +291,8 @@ public void onMatch(RelOptRuleCall call) { RelNode r = relBuilder.build(); RelOptCost afterCost = mq.getCumulativeCost(r); RelOptCost beforeCost = mq.getCumulativeCost(aggregate); - if (afterCost.isLt(beforeCost)) { Review comment: yes; if we use `isLe` the current cost model, which only takes rowcount into account, will prefer pushing the aggregates further. There is another alternative to the `force` based approach: the rule can be configured to use the more advanced cost system - so that it could take cpu/io cost into account This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477884) Time Spent: 50m (was: 40m) > Enhance cost model to push down more Aggregates > --- > > Key: HIVE-24084 > URL: https://issues.apache.org/jira/browse/HIVE-24084 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24084) Enhance cost model to push down more Aggregates
[ https://issues.apache.org/jira/browse/HIVE-24084?focusedWorklogId=477882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477882 ] ASF GitHub Bot logged work on HIVE-24084: - Author: ASF GitHub Bot Created on: 02/Sep/20 13:58 Start Date: 02/Sep/20 13:58 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1439: URL: https://github.com/apache/hive/pull/1439#discussion_r482090289 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveAggregateJoinTransposeRule.java ## @@ -303,6 +305,90 @@ public void onMatch(RelOptRuleCall call) { } } + /** + * Determines whether the given grouping is unique. + * + * Consider a join which might produce non-unique rows; but later the results are aggregated again. + * This method determines if there are sufficient columns in the grouping which have been present previously as unique column(s). + */ + private boolean isGroupingUnique(RelNode input, ImmutableBitSet groups) { +if (groups.isEmpty()) { + return false; +} +RelMetadataQuery mq = input.getCluster().getMetadataQuery(); +Set uKeys = mq.getUniqueKeys(input); +for (ImmutableBitSet u : uKeys) { + if (groups.contains(u)) { +return true; + } +} +if (input instanceof Join) { + Join join = (Join) input; + RexBuilder rexBuilder = input.getCluster().getRexBuilder(); + SimpleConditionInfo cond = new SimpleConditionInfo(join.getCondition(), rexBuilder); + + if (cond.valid) { +ImmutableBitSet newGroup = groups.intersect(ImmutableBitSet.fromBitSet(cond.fields)); +RelNode l = join.getLeft(); +RelNode r = join.getRight(); + +int joinFieldCount = join.getRowType().getFieldCount(); +int lFieldCount = l.getRowType().getFieldCount(); + +ImmutableBitSet groupL = newGroup.get(0, lFieldCount); +ImmutableBitSet groupR = newGroup.get(lFieldCount, joinFieldCount).shift(-lFieldCount); + +if (isGroupingUnique(l, groupL)) { Review comment: this method does a bit different thing - honestly I felt like I'm in trouble 
when I've given this name to it :) this method checks if the given columns contain a unique column somewhere in the covered joins (this still sounds fuzzy), so let's take an example; consider: ``` select c_id, sum(i_prize) from customer c join item i on(i.c_id=c.c_id) ``` * do an aggregate grouping by the column C_ID; and sum up something * below is a join which joins by C_ID * asking whether C_ID is a unique column on top of the join is false; but there is a subtree in which C_ID is unique => so if we push the aggregate on that branch the aggregation will be a no-op I think this case is not handled by `areColumnsUnique` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477882) Time Spent: 40m (was: 0.5h) > Enhance cost model to push down more Aggregates > --- > > Key: HIVE-24084 > URL: https://issues.apache.org/jira/browse/HIVE-24084 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
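The customer/item example can be condensed into a toy predicate. This is an assumption-heavy sketch: columns are modeled as string sets rather than Calcite's `ImmutableBitSet`, and only a single join branch is examined instead of the full recursion:

```java
import java.util.HashSet;
import java.util.Set;

public class GroupingUniqueSketch {
    // Toy version of the idea discussed above: the grouping is a no-op on a
    // join branch if the grouping columns that take part in the join
    // condition cover a unique key of that branch.
    public static boolean groupingUnique(Set<String> groupCols,
                                         Set<String> branchUniqueKey,
                                         Set<String> joinCondCols) {
        Set<String> g = new HashSet<>(groupCols);
        g.retainAll(joinCondCols); // keep only columns tied by the join
        return g.containsAll(branchUniqueKey);
    }

    public static void main(String[] args) {
        // select c_id, sum(i_prize) from customer c join item i on i.c_id = c.c_id
        // C_ID is not unique on top of the join, but it is unique on the
        // customer branch, so an aggregate pushed there would be a no-op.
        System.out.println(groupingUnique(
            Set.of("c_id"),    // GROUP BY columns
            Set.of("c_id"),    // unique key of the customer branch
            Set.of("c_id")));  // join condition columns
    }
}
```

This captures why `areColumnsUnique` on the join's output is not enough: uniqueness has to be checked per branch, against only the columns the join condition carries across.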
[jira] [Updated] (HIVE-23976) Enable vectorization for multi-col semi join reducers
[ https://issues.apache.org/jira/browse/HIVE-23976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23976: -- Labels: pull-request-available (was: ) > Enable vectorization for multi-col semi join reducers > - > > Key: HIVE-23976 > URL: https://issues.apache.org/jira/browse/HIVE-23976 > Project: Hive > Issue Type: Improvement >Reporter: Stamatis Zampetakis >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-21196 introduces multi-column semi-join reducers in the query engine. > However, the implementation relies on GenericUDFMurmurHash which is not > vectorized, thus the respective operators cannot be executed in vectorized > mode. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23976) Enable vectorization for multi-col semi join reducers
[ https://issues.apache.org/jira/browse/HIVE-23976?focusedWorklogId=477862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477862 ] ASF GitHub Bot logged work on HIVE-23976: - Author: ASF GitHub Bot Created on: 02/Sep/20 13:32 Start Date: 02/Sep/20 13:32 Worklog Time Spent: 10m Work Description: abstractdog opened a new pull request #1458: URL: https://github.com/apache/hive/pull/1458 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477862) Remaining Estimate: 0h Time Spent: 10m > Enable vectorization for multi-col semi join reducers > - > > Key: HIVE-23976 > URL: https://issues.apache.org/jira/browse/HIVE-23976 > Project: Hive > Issue Type: Improvement >Reporter: Stamatis Zampetakis >Assignee: László Bodor >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-21196 introduces multi-column semi-join reducers in the query engine. > However, the implementation relies on GenericUDFMurmurHash which is not > vectorized, thus the respective operators cannot be executed in vectorized > mode. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24091) Replace multiple constraints call with getAllTableConstraints api call in query planner
[ https://issues.apache.org/jira/browse/HIVE-24091?focusedWorklogId=477859=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477859 ] ASF GitHub Bot logged work on HIVE-24091: - Author: ASF GitHub Bot Created on: 02/Sep/20 13:30 Start Date: 02/Sep/20 13:30 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma closed pull request #1444: URL: https://github.com/apache/hive/pull/1444 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477859) Time Spent: 50m (was: 40m) > Replace multiple constraints call with getAllTableConstraints api call in > query planner > --- > > Key: HIVE-24091 > URL: https://issues.apache.org/jira/browse/HIVE-24091 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In order to get all the constraints of a table, i.e. PrimaryKey, ForeignKey, > UniqueConstraint, NotNullConstraint, DefaultConstraint, CheckConstraint, we > have to do 6 different metastore calls. Replace these calls with one > getAllTableConstraints api which provides all the constraints at once -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24091) Replace multiple constraints call with getAllTableConstraints api call in query planner
[ https://issues.apache.org/jira/browse/HIVE-24091?focusedWorklogId=477858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477858 ] ASF GitHub Bot logged work on HIVE-24091: - Author: ASF GitHub Bot Created on: 02/Sep/20 13:30 Start Date: 02/Sep/20 13:30 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on pull request #1444: URL: https://github.com/apache/hive/pull/1444#issuecomment-685738116 Merged in PR-1419 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477858) Time Spent: 40m (was: 0.5h) > Replace multiple constraints call with getAllTableConstraints api call in > query planner > --- > > Key: HIVE-24091 > URL: https://issues.apache.org/jira/browse/HIVE-24091 > Project: Hive > Issue Type: Improvement >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In order to get all the constraints of a table, i.e. PrimaryKey, ForeignKey, > UniqueConstraint, NotNullConstraint, DefaultConstraint, CheckConstraint, we > have to do 6 different metastore calls. Replace these calls with one > getAllTableConstraints api which provides all the constraints at once -- This message was sent by Atlassian Jira (v8.3.4#803005)
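The consolidation the issue describes can be sketched with a stub. The `AllConstraints` holder and every field name below are illustrative only; the real call goes through the metastore thrift interface, not this shape:

```java
import java.util.ArrayList;
import java.util.List;

public class ConstraintsFetchSketch {

    // Illustrative holder for everything the six per-constraint metastore
    // calls used to return one by one.
    public static class AllConstraints {
        public final List<String> primaryKeys = new ArrayList<>();
        public final List<String> foreignKeys = new ArrayList<>();
        public final List<String> uniqueConstraints = new ArrayList<>();
        public final List<String> notNullConstraints = new ArrayList<>();
        public final List<String> defaultConstraints = new ArrayList<>();
        public final List<String> checkConstraints = new ArrayList<>();
    }

    // One metastore round trip instead of six; the payload here is a stub.
    public static AllConstraints getAllTableConstraints(String db, String table) {
        AllConstraints c = new AllConstraints();
        c.primaryKeys.add(db + "." + table + ".id");
        c.notNullConstraints.add(db + "." + table + ".name");
        return c;
    }

    public static void main(String[] args) {
        AllConstraints c = getAllTableConstraints("default", "customer");
        System.out.println(c.primaryKeys.size() + c.notNullConstraints.size()); // prints "2"
    }
}
```

The planner-side win is purely in round trips: each remote metastore call pays network and serialization latency, so fetching all six constraint kinds in one response shrinks that fixed cost during query compilation.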
[jira] [Resolved] (HIVE-22156) SHOW CREATE TABLE not showing HAVING Clause
[ https://issues.apache.org/jira/browse/HIVE-22156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sujeet-A resolved HIVE-22156. - Resolution: Cannot Reproduce Hi [~Bone An], I am closing this JIRA. Thanks. Sujeet > SHOW CREATE TABLE not showing HAVING Clause > --- > > Key: HIVE-22156 > URL: https://issues.apache.org/jira/browse/HIVE-22156 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.0 >Reporter: Sujeet-A >Priority: Major > > Hi Team, > I am trying to get the show create table output for one of my hive views. > I am unable to get the HAVING clause after creating the view. > Please can you check whether it is a bug or is there other -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23839) Use LongAdder instead of AtomicLong
[ https://issues.apache.org/jira/browse/HIVE-23839?focusedWorklogId=477841=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477841 ] ASF GitHub Bot logged work on HIVE-23839: - Author: ASF GitHub Bot Created on: 02/Sep/20 12:46 Start Date: 02/Sep/20 12:46 Worklog Time Spent: 10m Work Description: dai closed pull request #1246: URL: https://github.com/apache/hive/pull/1246 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477841) Time Spent: 1h 10m (was: 1h) > Use LongAdder instead of AtomicLong > --- > > Key: HIVE-23839 > URL: https://issues.apache.org/jira/browse/HIVE-23839 > Project: Hive > Issue Type: Improvement >Reporter: Dai Wenqing >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > LongAdder performs better than AtomicLong in highly concurrent environments. -- This message was sent by Atlassian Jira (v8.3.4#803005)
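The trade-off behind the LongAdder suggestion, as a runnable sketch: both counters end at the same total, but `LongAdder` stripes its writes across internal cells, so heavily contended increments do not all CAS on one memory location.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class CounterSketch {

    // Increment both counters from nThreads threads and return their totals.
    public static long[] count(int nThreads, int perThread) {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                atomic.incrementAndGet(); // every thread contends on one cell
                adder.increment();        // writes spread over striped cells
            }
        };
        Thread[] ts = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            ts[i] = new Thread(work);
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // LongAdder trades a cheaper write for a costlier read: sum() walks
        // the cells and is not an atomic snapshot while writers are active.
        return new long[] { atomic.get(), adder.sum() };
    }

    public static void main(String[] args) {
        long[] totals = count(4, 100_000);
        System.out.println(totals[0] + " " + totals[1]); // prints "400000 400000"
    }
}
```

This is why the swap fits write-heavy statistics counters but not counters whose exact value is read frequently while being updated, such as anything compared with `compareAndSet`.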
[jira] [Updated] (HIVE-24112) TestMiniLlapLocalCliDriver[dynamic_semijoin_reduction_on_aggcol] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis updated HIVE-24112: --- Labels: flaky-test (was: ) > TestMiniLlapLocalCliDriver[dynamic_semijoin_reduction_on_aggcol] is flaky > - > > Key: HIVE-24112 > URL: https://issues.apache.org/jira/browse/HIVE-24112 > Project: Hive > Issue Type: Bug >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Labels: flaky-test > Fix For: 4.0.0 > > > http://ci.hive.apache.org/job/hive-flaky-check/96/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24112) TestMiniLlapLocalCliDriver[dynamic_semijoin_reduction_on_aggcol] is flaky
[ https://issues.apache.org/jira/browse/HIVE-24112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24112: -- > TestMiniLlapLocalCliDriver[dynamic_semijoin_reduction_on_aggcol] is flaky > - > > Key: HIVE-24112 > URL: https://issues.apache.org/jira/browse/HIVE-24112 > Project: Hive > Issue Type: Bug >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > Fix For: 4.0.0 > > > http://ci.hive.apache.org/job/hive-flaky-check/96/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez hangs when running against Tez 0.10.0 staging artifact
[ https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24111: Description: Reproduced the issue in a ptest run which I made to run against the tez staging artifacts (https://repository.apache.org/content/repositories/orgapachetez-1068/) http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417 I'm about to investigate this. I think Tez 0.10.0 cannot be released until we confirm whether it's a hive or tez bug. > TestMmCompactorOnTez hangs when running against Tez 0.10.0 staging artifact > --- > > Key: HIVE-24111 > URL: https://issues.apache.org/jira/browse/HIVE-24111 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > > Reproduced the issue in a ptest run which I made to run against the tez staging > artifacts > (https://repository.apache.org/content/repositories/orgapachetez-1068/) > http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417 > I'm about to investigate this. I think Tez 0.10.0 cannot be released until we > confirm whether it's a hive or tez bug. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24111) TestMmCompactorOnTez hangs when running against Tez 0.10.0 staging artifact
[ https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-24111: --- > TestMmCompactorOnTez hangs when running against Tez 0.10.0 staging artifact > --- > > Key: HIVE-24111 > URL: https://issues.apache.org/jira/browse/HIVE-24111 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-17909) JDK9: Tez may not use URLClassloader
[ https://issues.apache.org/jira/browse/HIVE-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-17909. - Resolution: Duplicate > JDK9: Tez may not use URLClassloader > > > Key: HIVE-17909 > URL: https://issues.apache.org/jira/browse/HIVE-17909 > Project: Hive > Issue Type: Sub-task > Components: Build Infrastructure >Reporter: Zoltan Haindrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-17909) JDK9: Tez may not use URLClassloader
[ https://issues.apache.org/jira/browse/HIVE-17909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189145#comment-17189145 ] László Bodor commented on HIVE-17909: - this is solved by TEZ-3860 / TEZ-4223 / TEZ-4228 and is being released as part of tez 0.10.0 closing this > JDK9: Tez may not use URLClassloader > > > Key: HIVE-17909 > URL: https://issues.apache.org/jira/browse/HIVE-17909 > Project: Hive > Issue Type: Sub-task > Components: Build Infrastructure >Reporter: Zoltan Haindrich >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23887) Reset table level basic/column stats during import.
[ https://issues.apache.org/jira/browse/HIVE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan resolved HIVE-23887. - Resolution: Fixed > Reset table level basic/column stats during import. > --- > > Key: HIVE-23887 > URL: https://issues.apache.org/jira/browse/HIVE-23887 > Project: Hive > Issue Type: Bug > Components: Import/Export, Statistics >Affects Versions: 4.0.0 >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > While doing "export table db.table to '/import/table' ", column stats are not > dumped, but import doesn't reset the flag, which leads to incorrect stats. > Reset column stats during import to force the importer to recalculate the column > stats -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23887) Reset table level basic/column stats during import.
[ https://issues.apache.org/jira/browse/HIVE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan resolved HIVE-23887. - Resolution: Duplicate > Reset table level basic/column stats during import. > --- > > Key: HIVE-23887 > URL: https://issues.apache.org/jira/browse/HIVE-23887 > Project: Hive > Issue Type: Bug > Components: Import/Export, Statistics >Affects Versions: 4.0.0 >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > While doing "export table db.table to '/import/table' " column stats are not > dumped but import doesn't reset the flag which leads to incorrect stats. > Reset columns stats while import to force Imported to recalculate the Columns > stats -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HIVE-23887) Reset table level basic/column stats during import.
[ https://issues.apache.org/jira/browse/HIVE-23887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sankar Hariappan reopened HIVE-23887: - > Reset table level basic/column stats during import. > --- > > Key: HIVE-23887 > URL: https://issues.apache.org/jira/browse/HIVE-23887 > Project: Hive > Issue Type: Bug > Components: Import/Export, Statistics >Affects Versions: 4.0.0 >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > While doing "export table db.table to '/import/table' " column stats are not > dumped but import doesn't reset the flag which leads to incorrect stats. > Reset columns stats while import to force Imported to recalculate the Columns > stats -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24110) NullPointerException occurs in some UDFs
[ https://issues.apache.org/jira/browse/HIVE-24110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi updated HIVE-24110: - Attachment: HIVE-24110.01.patch > NullPointerException occurs in some UDFs > > > Key: HIVE-24110 > URL: https://issues.apache.org/jira/browse/HIVE-24110 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryu Kobayashi >Assignee: Ryu Kobayashi >Priority: Major > Attachments: HIVE-24110.01.patch > > > Because the code refers to a variable that has not yet been initialized, a > NullPointerException occurs and the intended error message is not displayed. > > {code:java} > if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE) { > throw new UDFArgumentException( > "OCTET_LENGTH only takes primitive types, got " + > argumentOI.getTypeName()); > } > argumentOI = (PrimitiveObjectInspector) arguments[0]; > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24110) NullPointerException occurs in some UDFs
[ https://issues.apache.org/jira/browse/HIVE-24110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi updated HIVE-24110: - Status: Patch Available (was: Open) > NullPointerException occurs in some UDFs > > > Key: HIVE-24110 > URL: https://issues.apache.org/jira/browse/HIVE-24110 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryu Kobayashi >Assignee: Ryu Kobayashi >Priority: Major > Attachments: HIVE-24110.01.patch > > > Since it refers to a variable that has not been initialized, > NullPointerException occurs and the correct error message is not displayed. > > {code:java} > if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE) { > throw new UDFArgumentException( > "OCTET_LENGTH only takes primitive types, got " + > argumentOI.getTypeName()); > } > argumentOI = (PrimitiveObjectInspector) arguments[0]; > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24110) NullPointerException occurs in some UDFs
[ https://issues.apache.org/jira/browse/HIVE-24110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi reassigned HIVE-24110: > NullPointerException occurs in some UDFs > > > Key: HIVE-24110 > URL: https://issues.apache.org/jira/browse/HIVE-24110 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Ryu Kobayashi >Assignee: Ryu Kobayashi >Priority: Major > > Since it refers to a variable that has not been initialized, > NullPointerException occurs and the correct error message is not displayed. > > {code:java} > if (arguments[0].getCategory() != ObjectInspector.Category.PRIMITIVE) { > throw new UDFArgumentException( > "OCTET_LENGTH only takes primitive types, got " + > argumentOI.getTypeName()); > } > argumentOI = (PrimitiveObjectInspector) arguments[0]; > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
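The initialization-order bug quoted in the HIVE-24110 report can be reproduced outside Hive. A minimal, self-contained sketch (hypothetical names, not the actual GenericUDF code): the buggy variant builds its error message from a field that is only assigned after the type check, so a failing check surfaces as a NullPointerException instead of the intended message, while the fixed variant derives the message from the argument itself.

```java
public class InitOrderDemo {
    // Stand-in for the not-yet-initialized argumentOI field from the report.
    static String typeName;

    // Buggy variant: the error path dereferences typeName before it is assigned.
    static String checkBuggy(Object arg) {
        try {
            if (!(arg instanceof Integer)) {
                throw new IllegalArgumentException(
                        "only takes ints, got length " + typeName.length()); // NPE: typeName is null
            }
            typeName = arg.getClass().getSimpleName();
            return "ok";
        } catch (NullPointerException e) {
            return "NPE"; // the misleading failure described in the issue
        }
    }

    // Fixed variant: build the message from the argument itself.
    static String checkFixed(Object arg) {
        if (!(arg instanceof Integer)) {
            return "only takes ints, got " + arg.getClass().getSimpleName();
        }
        typeName = arg.getClass().getSimpleName();
        return "ok";
    }

    public static void main(String[] args) {
        System.out.println(checkBuggy("hello")); // NPE
        System.out.println(checkFixed("hello")); // only takes ints, got String
    }
}
```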
[jira] [Updated] (HIVE-24093) Remove unused hive.debug.localtask
[ https://issues.apache.org/jira/browse/HIVE-24093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24093: Fix Version/s: 4.0.0 > Remove unused hive.debug.localtask > -- > > Key: HIVE-24093 > URL: https://issues.apache.org/jira/browse/HIVE-24093 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > hive.debug.local.task was added in HIVE-1642. Even then, it was never used. > It was possibly a leftover from development/debugging. There are no > references to either HIVEDEBUGLOCALTASK or hive.debug.localtask in the > codebase. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24093) Remove unused hive.debug.localtask
[ https://issues.apache.org/jira/browse/HIVE-24093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24093: Resolution: Fixed Status: Resolved (was: Patch Available) > Remove unused hive.debug.localtask > -- > > Key: HIVE-24093 > URL: https://issues.apache.org/jira/browse/HIVE-24093 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > hive.debug.local.task was added in HIVE-1642. Even then, it was never used. > It was possibly a leftover from development/debugging. There are no > references to either HIVEDEBUGLOCALTASK or hive.debug.localtask in the > codebase. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24093) Remove unused hive.debug.localtask
[ https://issues.apache.org/jira/browse/HIVE-24093?focusedWorklogId=477729=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477729 ] ASF GitHub Bot logged work on HIVE-24093: - Author: ASF GitHub Bot Created on: 02/Sep/20 09:34 Start Date: 02/Sep/20 09:34 Worklog Time Spent: 10m Work Description: abstractdog merged pull request #1445: URL: https://github.com/apache/hive/pull/1445 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477729) Time Spent: 0.5h (was: 20m) > Remove unused hive.debug.localtask > -- > > Key: HIVE-24093 > URL: https://issues.apache.org/jira/browse/HIVE-24093 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > hive.debug.local.task was added in HIVE-1642. Even then, it was never used. > It was possibly a leftover from development/debugging. There are no > references to either HIVEDEBUGLOCALTASK or hive.debug.localtask in the > codebase. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24093) Remove unused hive.debug.localtask
[ https://issues.apache.org/jira/browse/HIVE-24093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189110#comment-17189110 ] László Bodor commented on HIVE-24093: - PR merged, thanks [~mustafaiman] for the patch and [~pgaref] for the review! > Remove unused hive.debug.localtask > -- > > Key: HIVE-24093 > URL: https://issues.apache.org/jira/browse/HIVE-24093 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > hive.debug.local.task was added in HIVE-1642. Even then, it was never used. > It was possibly a leftover from development/debugging. There are no > references to either HIVEDEBUGLOCALTASK or hive.debug.localtask in the > codebase. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24095) Load partitions in parallel for external tables in the bootstrap phase
[ https://issues.apache.org/jira/browse/HIVE-24095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24095: --- Description: This is part 1 of the change. This will load partitions in parallel for external tables. Managed table is tracked as part of https://issues.apache.org/jira/browse/HIVE-24109 (was: This is part 1 of the change. This will load partitions in parallel for external tables. ) > Load partitions in parallel for external tables in the bootstrap phase > -- > > Key: HIVE-24095 > URL: https://issues.apache.org/jira/browse/HIVE-24095 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > > This is part 1 of the change. This will load partitions in parallel for > external tables. Managed table is tracked as part of > https://issues.apache.org/jira/browse/HIVE-24109 -- This message was sent by Atlassian Jira (v8.3.4#803005)
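The parallel bootstrap load described in HIVE-24095 can be sketched with a plain executor (invented names; not the actual Hive replication code): one task per partition submitted to a fixed pool, with the futures collected afterwards so that any failed partition fails the whole load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLoadDemo {
    // Submit one load task per partition and wait for all of them.
    static List<String> loadAll(List<String> partitions, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String p : partitions) {
                futures.add(pool.submit(() -> "loaded " + p)); // stand-in for the copy work
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // get() rethrows, so one failed partition fails the load
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadAll(List.of("p=2020-01", "p=2020-02"), 2));
    }
}
```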
[jira] [Assigned] (HIVE-24109) Load partitions in parallel for managed tables in the bootstrap phase
[ https://issues.apache.org/jira/browse/HIVE-24109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi reassigned HIVE-24109: -- > Load partitions in parallel for managed tables in the bootstrap phase > - > > Key: HIVE-24109 > URL: https://issues.apache.org/jira/browse/HIVE-24109 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24095) Load partitions in parallel for external tables in the bootstrap phase
[ https://issues.apache.org/jira/browse/HIVE-24095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24095: --- Description: This is part 1 of the change. This will load partitions in parallel for external tables. > Load partitions in parallel for external tables in the bootstrap phase > -- > > Key: HIVE-24095 > URL: https://issues.apache.org/jira/browse/HIVE-24095 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > > This is part 1 of the change. This will load partitions in parallel for > external tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24095) Load partitions in parallel for external tables in the bootstrap phase
[ https://issues.apache.org/jira/browse/HIVE-24095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24095: --- Summary: Load partitions in parallel for external tables in the bootstrap phase (was: Load partitions in parallel in the bootstrap phase) > Load partitions in parallel for external tables in the bootstrap phase > -- > > Key: HIVE-24095 > URL: https://issues.apache.org/jira/browse/HIVE-24095 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24105) Refactor partition pruning
[ https://issues.apache.org/jira/browse/HIVE-24105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24105: - Assignee: Steve Carlin > Refactor partition pruning > -- > > Key: HIVE-24105 > URL: https://issues.apache.org/jira/browse/HIVE-24105 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Steve Carlin >Assignee: Steve Carlin >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > A small refactor of partition pruning. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24106: - Assignee: Zhihua Deng > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Assignee: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If a HiveStatement is run asynchronously as a task, e.g. in a thread or > future, and the task is interrupted, the HiveStatement continues to poll on > the operation state until it finishes. It would be better to provide a way to > abort the execution in such a case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
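The abort behaviour HIVE-24106 asks for can be sketched as an interrupt-aware polling loop (hypothetical shape, not the actual HiveStatement internals): check the interrupt flag on every iteration and stop waiting instead of polling until the operation finishes.

```java
import java.util.function.Supplier;

public class PollDemo {
    enum State { RUNNING, FINISHED }

    // Poll a state source, but abort as soon as the calling thread is interrupted.
    static String poll(Supplier<State> fetchState) {
        while (fetchState.get() != State.FINISHED) {
            if (Thread.currentThread().isInterrupted()) {
                return "aborted"; // stop instead of polling until finish
            }
            try {
                Thread.sleep(10); // back off between status calls
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag for callers
                return "aborted";
            }
        }
        return "finished";
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();             // simulate an interrupted task
        System.out.println(poll(() -> State.RUNNING));  // aborted
        Thread.interrupted();                           // clear the flag again
        System.out.println(poll(() -> State.FINISHED)); // finished
    }
}
```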
[jira] [Commented] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189073#comment-17189073 ] László Bodor commented on HIVE-24108: - this patch has been included into the patchset of HIVE-23930 for testing on tez 0.10.0: http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline if it's green, this patch is safe to be pushed IMO > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, hive_log_llap.log > > > TEZ-4228 fixes an issue on the Tez side, making Tez use TezClassLoader > instead of the system classloader. However, some codepaths, e.g. in > [^hive_log_llap.log], show that the system classloader is still used. As > thread context classloaders are inherited, the easier solution is to > initialize TezClassLoader early in LlapDaemon and let all threads use it as > their context classloader, so this solution is more like TEZ-4223 for LLAP > daemons. 
> {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
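The fix described in HIVE-24108 relies on context classloaders being inherited: install the classloader on the daemon's main thread early, and every thread spawned afterwards picks it up as its context classloader. This can be demonstrated with a plain URLClassLoader stand-in (TezClassLoader itself is Tez-internal):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ContextLoaderDemo {
    // Set a custom context classloader, spawn a thread, and report whether
    // the child thread inherited it.
    static boolean childInheritsContextLoader() throws InterruptedException {
        ClassLoader custom = new URLClassLoader(
                new URL[0], ContextLoaderDemo.class.getClassLoader());
        ClassLoader previous = Thread.currentThread().getContextClassLoader();
        Thread.currentThread().setContextClassLoader(custom);
        try {
            final ClassLoader[] seen = new ClassLoader[1];
            Thread worker = new Thread(
                    () -> seen[0] = Thread.currentThread().getContextClassLoader());
            worker.start();
            worker.join();
            return seen[0] == custom; // context classloaders are inherited at thread creation
        } finally {
            Thread.currentThread().setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(childInheritsContextLoader()); // true
    }
}
```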
[jira] [Assigned] (HIVE-24060) When the CBO is false, NPE is thrown by an EXCEPT or INTERSECT execution
[ https://issues.apache.org/jira/browse/HIVE-24060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng reassigned HIVE-24060: -- Assignee: Zhihua Deng > When the CBO is false, NPE is thrown by an EXCEPT or INTERSECT execution > > > Key: HIVE-24060 > URL: https://issues.apache.org/jira/browse/HIVE-24060 > Project: Hive > Issue Type: Bug > Components: CBO, Hive >Affects Versions: 3.1.0, 3.1.2 >Reporter: GuangMing Lu >Assignee: Zhihua Deng >Priority: Major > > {code:java} > set hive.cbo.enable=false; > create table testtable(idx string, namex string) stored as orc; > insert into testtable values('123', 'aaa'), ('234', 'bbb'); > explain select a.idx from (select idx,namex from testtable intersect select > idx,namex from testtable) a > {code} > The execution throws a NullPointerException: > {code:java} > 2020-08-24 15:12:24,261 | WARN | HiveServer2-Handler-Pool: Thread-345 | > Error executing statement: | > org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1155) > org.apache.hive.service.cli.HiveSQLException: Error while compiling > statement: FAILED: NullPointerException null > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:341) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:215) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:316) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:253) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:684) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:670) > ~[hive-service-3.1.0.jar:3.1.0] > at > 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:342) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.executeNewStatement(ThriftCLIService.java:1144) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:1280) > ~[hive-service-3.1.0.jar:3.1.0] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) > ~[hive-service-rpc-3.1.0.jar:3.1.0] > at > org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) > ~[hive-service-rpc-3.1.0.jar:3.1.0] > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > ~[libthrift-0.9.3.jar:0.9.3] > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:648) > ~[hive-standalone-metastore-3.1.0.jar:3.1.0] > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > ~[libthrift-0.9.3.jar:0.9.3] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[?:1.8.0_201] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ~[?:1.8.0_201] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201] > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4367) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:4346) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:10576) > ~[hive-exec-3.1.0.jar:3.1.0] > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10515) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11434) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11291) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11318) > ~[hive-exec-3.1.0.jar:3.1.0] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11304) > ~[hive-exec-3.1.0.jar:3.1.0] > at >
[jira] [Assigned] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor reassigned HIVE-24108: --- Assignee: László Bodor > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: hive_log_llap.log > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24108: Attachment: hive_log_llap.log > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: hive_log_llap.log > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24089) Run QB compaction as table directory user with impersonation
[ https://issues.apache.org/jira/browse/HIVE-24089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage resolved HIVE-24089. -- Fix Version/s: 4.0.0 Resolution: Fixed Committed to master. Thanks for the review Peter Vary and Laszlo Pinter! > Run QB compaction as table directory user with impersonation > > > Key: HIVE-24089 > URL: https://issues.apache.org/jira/browse/HIVE-24089 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently QB compaction runs as the session user, unlike MR compaction which > runs as the table/partition directory owner (see > CompactorThread#findUserToRunAs). > We should make QB compaction run as the table/partition directory owner and > enable user impersonation during compaction to avoid any issues with temp > directories. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189041#comment-17189041 ] László Bodor commented on HIVE-24108: - this patch can be committed when HIVE-23930 is committed > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. > {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 
18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24108: Attachment: HIVE-24108.01.patch > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. > {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 
18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
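The proposed fix relies on the fact that a new thread inherits its creator's context classloader. Below is a minimal, self-contained sketch of that mechanism, using a throwaway URLClassLoader as a stand-in for TezClassLoader; this is illustration only, not the actual LlapDaemon code:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderInheritanceDemo {
    public static void main(String[] args) throws Exception {
        // Install a custom loader early, before any worker threads exist
        // (the role TezClassLoader would play in LlapDaemon).
        ClassLoader custom = new URLClassLoader(new URL[0],
                ClassLoaderInheritanceDemo.class.getClassLoader());
        Thread.currentThread().setContextClassLoader(custom);

        final ClassLoader[] seen = new ClassLoader[1];
        Thread worker = new Thread(() -> {
            // A newly created thread inherits its parent's context classloader.
            seen[0] = Thread.currentThread().getContextClassLoader();
        });
        worker.start();
        worker.join();

        if (seen[0] != custom) {
            throw new AssertionError("context classloader was not inherited");
        }
        System.out.println("inherited by worker thread: true");
    }
}
```

Because inheritance happens at thread creation time, installing the loader before the daemon's thread pools spin up is what makes "let all threads use that as context class loader" hold.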
[jira] [Updated] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24108: Description: TEZ-4228 fixes an issue on the Tez side, making it use TezClassLoader instead of the system classloader. However, some codepaths, e.g. in [^hive_log_llap.log], show that the system class loader is still used. As thread context classloaders are inherited, the easier solution is to early-initialize TezClassLoader in LlapDaemon and let all threads use it as their context class loader, so this solution is essentially TEZ-4223 for LLAP daemons. was:TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader instead of the system classloader. However, there are some codepaths, e.g. in [^hive_log_llap.log] which shows that the system class loader is used. As thread context classloaders are inherited, the easier solution is to early-initialize TezClassLoader in LlapDaemon, and let all threads use that as context class loader, so this solution is more like TEZ-4223 for llap daemons. > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: hive_log_llap.log > > > TEZ-4228 fixes an issue on the Tez side, making it use TezClassLoader > instead of the system classloader. However, some codepaths, e.g. in > [^hive_log_llap.log], show that the system class loader is still used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon and let all threads use it as > their context class loader, so this solution is essentially TEZ-4223 for LLAP > daemons. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24108: Description: TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader instead of the system classloader. However, there are some codepaths, e.g. in [^hive_log_llap.log] which shows that the system class loader is used. As thread context classloaders are inherited, the easier solution is to early-initialize TezClassLoader in LlapDaemon, and let all threads use that as context class loader, so this solution is more like TEZ-4223 for llap daemons. {code} 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.TestSerDe at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) ... 16 more Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.TestSerDe at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) ... 18 more Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.serde2.TestSerDe at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:348) at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) ... 21 more {code} was: TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader instead of the system classloader. However, there are some codepaths, e.g. in [^hive_log_llap.log] which shows that the system class loader is used. 
As thread context classloaders are inherited, the easier solution is to early-initialize TezClassLoader in LlapDaemon, and let all threads use that as context class loader, so this solution is more like TEZ-4223 for llap daemons. > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24108) LlapDaemon should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24108: Description: TEZ-4228 fixes an issue on the Tez side, making it use TezClassLoader instead of the system classloader. However, some codepaths, e.g. in [^hive_log_llap.log], show that the system class loader is still used. As thread context classloaders are inherited, the easier solution is to early-initialize TezClassLoader in LlapDaemon and let all threads use it as their context class loader, so this solution is essentially TEZ-4223 for LLAP daemons. > LlapDaemon should use TezClassLoader > > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: hive_log_llap.log > > > TEZ-4228 fixes an issue on the Tez side, making it use TezClassLoader > instead of the system classloader. However, some codepaths, e.g. in > [^hive_log_llap.log], show that the system class loader is still used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon and let all threads use it as > their context class loader, so this solution is essentially TEZ-4223 for LLAP > daemons. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-22030) Bumping jackson version to 2.9.9 and 2.9.9.3 (jackson-databind)
[ https://issues.apache.org/jira/browse/HIVE-22030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akos Dombi resolved HIVE-22030. --- Resolution: Abandoned Closing this issue, as the jackson version was already bumped in HIVE-23338. > Bumping jackson version to 2.9.9 and 2.9.9.3 (jackson-databind) > --- > > Key: HIVE-22030 > URL: https://issues.apache.org/jira/browse/HIVE-22030 > Project: Hive > Issue Type: Task >Reporter: Akos Dombi >Assignee: Akos Dombi >Priority: Major > Fix For: 4.0.0 > > > Bump the following jackson versions: > - jackson version to 2.9.9 > - jackson-databind version to 2.9.9.3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24107) Fix typo in ReloadFunctionsOperation
[ https://issues.apache.org/jira/browse/HIVE-24107?focusedWorklogId=477661&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477661 ] ASF GitHub Bot logged work on HIVE-24107: - Author: ASF GitHub Bot Created on: 02/Sep/20 06:57 Start Date: 02/Sep/20 06:57 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1457: URL: https://github.com/apache/hive/pull/1457 ### What changes were proposed in this pull request? Fix typo in ReloadFunctionsOperation ### Why are the changes needed? Hive.get() will register all functions because doRegisterAllFns is true, so Hive.get().reloadFunctions() may load all functions from the metastore twice; using Hive.get(false) instead may be better. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477661) Remaining Estimate: 0h Time Spent: 10m > Fix typo in ReloadFunctionsOperation > > > Key: HIVE-24107 > URL: https://issues.apache.org/jira/browse/HIVE-24107 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Hive.get() will register all functions because doRegisterAllFns is true, so > Hive.get().reloadFunctions() may load all functions from the metastore twice; > using Hive.get(false) instead may be better. -- This message was sent by Atlassian Jira (v8.3.4#803005)
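The double load described above is easy to see in a reduced model of the get()/reloadFunctions() interaction. The class below is a hypothetical stand-in for illustration: the counter and method bodies are invented, and only the doRegisterAllFns flag mirrors Hive's actual API:

```java
public class FunctionRegistryDemo {
    // Counts simulated full scans of function metadata from the metastore.
    static int metastoreLoads = 0;

    static FunctionRegistryDemo get() {
        return get(true); // doRegisterAllFns defaults to true
    }

    static FunctionRegistryDemo get(boolean doRegisterAllFns) {
        FunctionRegistryDemo h = new FunctionRegistryDemo();
        if (doRegisterAllFns) {
            h.reloadFunctions(); // eager load during construction
        }
        return h;
    }

    void reloadFunctions() {
        metastoreLoads++;
    }

    public static void main(String[] args) {
        get().reloadFunctions();      // the reported pattern: two full loads
        System.out.println("with flag: " + metastoreLoads);

        metastoreLoads = 0;
        get(false).reloadFunctions(); // the proposed fix: one load
        System.out.println("without flag: " + metastoreLoads);
    }
}
```

Running main prints "with flag: 2" then "without flag: 1", which is the redundancy the patch removes.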
[jira] [Updated] (HIVE-24102) Add ENGINE=InnoDB for replication mysql schema changes and not exists clause for the table creation
[ https://issues.apache.org/jira/browse/HIVE-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anishek Agarwal updated HIVE-24102: --- Resolution: Fixed Status: Resolved (was: Patch Available) committed to master, thanks for the patch [~aasha] and review [~pkumarsinha] > Add ENGINE=InnoDB for replication mysql schema changes and not exists clause > for the table creation > --- > > Key: HIVE-24102 > URL: https://issues.apache.org/jira/browse/HIVE-24102 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24102.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24107) Fix typo in ReloadFunctionsOperation
[ https://issues.apache.org/jira/browse/HIVE-24107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24107: -- Labels: pull-request-available (was: ) > Fix typo in ReloadFunctionsOperation > > > Key: HIVE-24107 > URL: https://issues.apache.org/jira/browse/HIVE-24107 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Hive.get() will register all functions because doRegisterAllFns is true, so > Hive.get().reloadFunctions() may load all functions from the metastore twice; > using Hive.get(false) instead may be better. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24106: -- Labels: pull-request-available (was: ) > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If a HiveStatement runs asynchronously as a task (e.g. in a thread or future) > and the task is interrupted, the HiveStatement continues to poll the > operation state until it finishes. It may be better to provide a way to abort > the execution in such a case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24106) Abort polling on the operation state when the current thread is interrupted
[ https://issues.apache.org/jira/browse/HIVE-24106?focusedWorklogId=477642&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477642 ] ASF GitHub Bot logged work on HIVE-24106: - Author: ASF GitHub Bot Created on: 02/Sep/20 06:33 Start Date: 02/Sep/20 06:33 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1456: URL: https://github.com/apache/hive/pull/1456 …ead is interrupted ### What changes were proposed in this pull request? Abort polling on the operation state when the current thread is interrupted ### Why are the changes needed? If a HiveStatement runs asynchronously as a task (e.g. in a thread or future) and the task is interrupted, the HiveStatement continues to poll the operation state until it finishes. It may be better to provide a way to abort the execution in such a case. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Local machine This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 477642) Remaining Estimate: 0h Time Spent: 10m > Abort polling on the operation state when the current thread is interrupted > --- > > Key: HIVE-24106 > URL: https://issues.apache.org/jira/browse/HIVE-24106 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > If a HiveStatement runs asynchronously as a task (e.g. in a thread or future) > and the task is interrupted, the HiveStatement continues to poll the > operation state until it finishes. It may be better to provide a way to abort > the execution in such a case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
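A sketch of the interrupt-aware polling loop this change argues for. OperationState and fetchState() below are hypothetical stand-ins for the JDBC driver's status check, not the real HiveStatement internals:

```java
public class PollDemo {
    enum OperationState { RUNNING, FINISHED }

    static int polls = 0;

    // Pretend server-side status: the query finishes on the third check.
    static OperationState fetchState() {
        polls++;
        return polls >= 3 ? OperationState.FINISHED : OperationState.RUNNING;
    }

    static OperationState waitForCompletion() throws InterruptedException {
        OperationState state = fetchState();
        while (state != OperationState.FINISHED) {
            // Abort instead of polling to completion once we are interrupted.
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedException("statement execution aborted");
            }
            Thread.sleep(10); // sleep also throws if interrupted mid-wait
            state = fetchState();
        }
        return state;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(waitForCompletion()); // prints FINISHED after 3 polls
    }
}
```

Checking Thread.currentThread().isInterrupted() in the loop (and letting sleep's InterruptedException propagate) is what allows a caller to cancel the task and have the statement stop polling promptly instead of waiting for the operation to finish.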