[jira] [Work logged] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?focusedWorklogId=447650=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447650 ] ASF GitHub Bot logged work on HIVE-23720: - Author: ASF GitHub Bot Created on: 18/Jun/20 05:53 Start Date: 18/Jun/20 05:53 Worklog Time Spent: 10m Work Description: dengzhhu653 closed pull request #1144: URL: https://github.com/apache/hive/pull/1144 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447650) Time Spent: 20m (was: 10m) > Background task may not be interrupted when operation being canceled or > timeout > --- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
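The effect of the guard quoted in the description can be reproduced in isolation. The following is a minimal sketch, not Hive's SQLOperation: the class CancelSketch, its State enum and both cleanup methods are hypothetical stand-ins, and only the shape of the condition is taken from the issue text. It does not address the follow-up question about releaseDriverContext() raised in the comments.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for an operation that owns a background task.
final class CancelSketch {
    enum State { RUNNING, CANCELED, TIMEDOUT, CLOSED }

    // Same shape as the guard quoted in the issue: the background task is
    // left alone when the operation is already CANCELED or TIMEDOUT.
    static boolean cleanupGuarded(State state, boolean runAsync, Future<?> handle) {
        if (runAsync && state != State.CANCELED && state != State.TIMEDOUT) {
            return handle != null && handle.cancel(true);
        }
        return false;
    }

    // Alternative: interrupt the background task whenever one exists.
    static boolean cleanupAlways(boolean runAsync, Future<?> handle) {
        return runAsync && handle != null && handle.cancel(true);
    }

    // Submits a task that blocks until its thread is interrupted.
    static Future<?> startBlockingTask(ExecutorService pool) {
        CountDownLatch never = new CountDownLatch(1);
        return pool.submit(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> task = startBlockingTask(pool);
        System.out.println("guarded cleanup cancelled task: "
            + cleanupGuarded(State.CANCELED, true, task));
        System.out.println("task still running: " + !task.isDone());
        System.out.println("unconditional cleanup cancelled task: "
            + cleanupAlways(true, task));
        pool.shutdownNow();
    }
}
```

With the guarded variant, an operation in the CANCELED state never calls Future.cancel(true), so the worker thread keeps blocking until something else stops it.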
[jira] [Comment Edited] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139107#comment-17139107 ] Zhihua Deng edited comment on HIVE-23720 at 6/18/20, 5:48 AM: -- Should do a further research as I see there no releaseDriverContext() is called. https://issues.apache.org/jira/browse/HIVE-16426 was (Author: dengzh): Should do a further research as I see there no releaseDriverContext() was called. https://issues.apache.org/jira/browse/HIVE-16426 > Background task may not be interrupted when operation being canceled or > timeout > --- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139107#comment-17139107 ] Zhihua Deng commented on HIVE-23720: Should do a further research as I see there no releaseDriverContext() was called. https://issues.apache.org/jira/browse/HIVE-16426 > Background task may not be interrupted when operation being canceled or > timeout > --- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23720: --- Description: Now SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources. was: Now SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources, no need to dependent on the driver check his own status to get the background task stop. > Background task may not be interrupted when operation being canceled or > timeout > --- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23722) Emit operation's drilldown link to client
[ https://issues.apache.org/jira/browse/HIVE-23722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23722: -- Labels: pull-request-available (was: ) > Emit operation's drilldown link to client > - > > Key: HIVE-23722 > URL: https://issues.apache.org/jira/browse/HIVE-23722 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now the HiveServer2 webui provides a drilldown link for many collected > metrics or messages about a operation, but it's not easy for a end user to > find the target url of his submitted query. Less knowledge on the deployment, > ha based environment(such as using LVS for balancing or routing), and the > multiple running queries can make things more difficult. The jira provides a > way to emit the link to the interested end user when enabled. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23722) Emit operation's drilldown link to client
[ https://issues.apache.org/jira/browse/HIVE-23722?focusedWorklogId=447646=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447646 ] ASF GitHub Bot logged work on HIVE-23722: - Author: ASF GitHub Bot Created on: 18/Jun/20 05:35 Start Date: 18/Jun/20 05:35 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1145: URL: https://github.com/apache/hive/pull/1145 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447646) Remaining Estimate: 0h Time Spent: 10m > Emit operation's drilldown link to client > - > > Key: HIVE-23722 > URL: https://issues.apache.org/jira/browse/HIVE-23722 > Project: Hive > Issue Type: Improvement >Reporter: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Now the HiveServer2 webui provides a drilldown link for many collected > metrics or messages about a operation, but it's not easy for a end user to > find the target url of his submitted query. Less knowledge on the deployment, > ha based environment(such as using LVS for balancing or routing), and the > multiple running queries can make things more difficult. The jira provides a > way to emit the link to the interested end user when enabled. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23720: --- Description: Now SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources, no need to dependent on the driver check his own status to get the background task stop. was: Now SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources. > Background task may not be interrupted when operation being canceled or > timeout > --- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources, no need to dependent on the driver > check his own status to get the background task stop. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23720: --- Summary: Background task may not be interrupted when operation being canceled or timeout (was: Background task should be interrupted when operation being canceled or timeout) > Background task may not be interrupted when operation being canceled or > timeout > --- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23720: --- Description: Now SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources. was: Currently SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources. > Background task should be interrupted when operation being canceled or timeout > -- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Now SQLOperation cancels the background task only when the condition is met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-21218) KafkaSerDe doesn't support topics created via Confluent Avro serializer
[ https://issues.apache.org/jira/browse/HIVE-21218?focusedWorklogId=447643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447643 ]

ASF GitHub Bot logged work on HIVE-21218:
---
Author: ASF GitHub Bot
Created on: 18/Jun/20 04:48
Start Date: 18/Jun/20 04:48
Worklog Time Spent: 10m
Work Description: OneCricketeer commented on pull request #526:
URL: https://github.com/apache/hive/pull/526#issuecomment-645769927

@b-slim What was the status on this?

Issue Time Tracking
---
Worklog Id: (was: 447643)
Time Spent: 15h 50m (was: 15h 40m)

> KafkaSerDe doesn't support topics created via Confluent Avro serializer
> ---
>
> Key: HIVE-21218
> URL: https://issues.apache.org/jira/browse/HIVE-21218
> Project: Hive
> Issue Type: Bug
> Components: kafka integration, Serializers/Deserializers
> Affects Versions: 3.1.1
> Reporter: Milan Baran
> Assignee: David McGinnis
> Priority: Major
> Labels: pull-request-available
> Attachments: 0001-HIVE-21818-Adding-ability-for-Kafka-Handler-to-proce.patch,
> HIVE-21218.10.patch, HIVE-21218.11.patch, HIVE-21218.12.patch,
> HIVE-21218.13.patch, HIVE-21218.2.patch, HIVE-21218.3.patch,
> HIVE-21218.4.patch, HIVE-21218.5.patch, HIVE-21218.6.patch,
> HIVE-21218.7.patch, HIVE-21218.8.patch, HIVE-21218.9.patch, HIVE-21218.patch
>
> Time Spent: 15h 50m
> Remaining Estimate: 0h
>
> According to [Google
> groups|https://groups.google.com/forum/#!topic/confluent-platform/JYhlXN0u9_A],
> the Confluent Avro serializer uses a proprietary format for the Kafka value:
> <magic byte 0x00><4 bytes of schema ID><regular Avro bytes for an object that
> conforms to the schema>.
> This format does not cause any problem for the Confluent Kafka deserializer,
> which respects the format. For the Hive Kafka handler, however, it is a
> problem to deserialize the Kafka value correctly, because Hive uses a custom
> deserializer from bytes to objects and ignores the Kafka consumer ser/deser
> classes provided via table property.
> It would be nice to support the Confluent format with the magic byte.
> It would also be great to support Schema Registry.
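The framing the issue refers to can be peeled off before the remaining bytes are handed to an ordinary Avro decoder. The sketch below is not the Hive Kafka handler's code; ConfluentFrameSketch and its methods are hypothetical names, and the layout (a leading magic byte of 0 followed by a 4-byte big-endian schema ID) is assumed from Confluent's published wire format rather than from this thread.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper that splits a Confluent-framed Kafka value into
// its schema ID and the plain Avro payload.
final class ConfluentFrameSketch {
    static final byte MAGIC_BYTE = 0x0;
    static final int HEADER_LENGTH = 5; // 1 magic byte + 4-byte schema ID

    // Reads the schema ID stored in bytes 1..4 (ByteBuffer is big-endian by default).
    static int schemaId(byte[] value) {
        checkHeader(value);
        return ByteBuffer.wrap(value, 1, 4).getInt();
    }

    // Returns the bytes after the 5-byte header, i.e. the regular Avro encoding.
    static byte[] avroPayload(byte[] value) {
        checkHeader(value);
        return Arrays.copyOfRange(value, HEADER_LENGTH, value.length);
    }

    private static void checkHeader(byte[] value) {
        if (value == null || value.length < HEADER_LENGTH || value[0] != MAGIC_BYTE) {
            throw new IllegalArgumentException("not a Confluent-framed value");
        }
    }

    public static void main(String[] args) {
        byte[] value = {0, 0, 0, 0, 42, 10, 20};
        System.out.println("schema id: " + schemaId(value));
        System.out.println("payload length: " + avroPayload(value).length);
    }
}
```

A deserializer that skips this step treats the five header bytes as the start of the Avro record, which is why values written by the Confluent serializer fail to decode.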
[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
[ https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YulongZ updated HIVE-23721: --- Environment: Hadoop 3.1(1700+ nodes) YARN 3.1 (with timelineserver enabled,https enabled) Hive 3.1 (15 HS2 instance) 6+ YARN Applications every day > MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL > --- > > Key: HIVE-23721 > URL: https://issues.apache.org/jira/browse/HIVE-23721 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2 > Environment: Hadoop 3.1(1700+ nodes) > YARN 3.1 (with timelineserver enabled,https enabled) > Hive 3.1 (15 HS2 instance) > 6+ YARN Applications every day >Reporter: YulongZ >Priority: Critical > > From Hive3.0,catalog added to hivemeta,many schema of metastore added column > “catName”,and index for table added column “catName”。 > In MetaStoreDirectSql.ensureDbInit() ,two queries below > “ > initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == > ''")); > initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName > == ''")); > ” > should use "catName == ''" instead of "dbName == ''",because “catName” is the > first index column。 > When data of metastore become large,for example, table of > MPartitionColumnStatistics have millions of lines。The > “newQuery(MPartitionColumnStatistics.class, "dbName == ''")” for metastore > executed very slowly,and the query “show tables“ for hiveserver2 executed > very slowly too。 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
[ https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YulongZ updated HIVE-23721: --- Priority: Critical (was: Minor) > MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL > --- > > Key: HIVE-23721 > URL: https://issues.apache.org/jira/browse/HIVE-23721 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2 >Reporter: YulongZ >Priority: Critical > > From Hive3.0,catalog added to hivemeta,many schema of metastore added column > “catName”,and index for table added column “catName”。 > In MetaStoreDirectSql.ensureDbInit() ,two queries below > “ > initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == > ''")); > initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName > == ''")); > ” > should use "catName == ''" instead of "dbName == ''",because “catName” is the > first index column。 > When data of metastore become large,for example, table of > MPartitionColumnStatistics have millions of lines。The > “newQuery(MPartitionColumnStatistics.class, "dbName == ''")” for metastore > executed very slowly,and the query “show tables“ for hiveserver2 executed > very slowly too。 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
[ https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YulongZ updated HIVE-23721:
---
Description:
Since Hive 3.0, catalogs have been added to the Hive metastore: many metastore schema tables gained a "catName" column, and the table indexes gained "catName" as well.
In MetaStoreDirectSql.ensureDbInit(), the two queries below

    initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
    initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));

should use "catName == ''" instead of "dbName == ''", because "catName" is the first index column.
When the metastore data becomes large, for example when the MPartitionColumnStatistics table has millions of rows, the newQuery(MPartitionColumnStatistics.class, "dbName == ''") query runs very slowly against the metastore, and "show tables" in HiveServer2 runs very slowly too.

was: From Hive3.0, catalog added to hivemeta, many schema of metastore added column "catName", and index for table added column "catName". For Hive2.0,

> MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
> ---
>
> Key: HIVE-23721
> URL: https://issues.apache.org/jira/browse/HIVE-23721
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.2
> Reporter: YulongZ
> Priority: Minor
>
> Since Hive 3.0, catalogs have been added to the Hive metastore: many
> metastore schema tables gained a "catName" column, and the table indexes
> gained "catName" as well.
> In MetaStoreDirectSql.ensureDbInit(), the two queries below
>
>     initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
>     initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));
>
> should use "catName == ''" instead of "dbName == ''", because "catName" is
> the first index column.
> When the metastore data becomes large, for example when the
> MPartitionColumnStatistics table has millions of rows, the
> newQuery(MPartitionColumnStatistics.class, "dbName == ''") query runs very
> slowly against the metastore, and "show tables" in HiveServer2 runs very
> slowly too.
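The reporter's argument about the first index column can be illustrated with a sorted in-memory structure. This is a toy model under stated assumptions, not the metastore schema or any database's planner: IndexPrefixSketch, Key and the method names are hypothetical, and a TreeSet ordered by (catName, dbName) stands in for a composite index whose leading column is catName.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model of a composite index ordered by catName first, then dbName.
final class IndexPrefixSketch {
    static final class Key implements Comparable<Key> {
        final String catName;
        final String dbName;

        Key(String catName, String dbName) {
            this.catName = catName;
            this.dbName = dbName;
        }

        @Override
        public int compareTo(Key other) {
            int c = catName.compareTo(other.catName);
            return c != 0 ? c : dbName.compareTo(other.dbName);
        }
    }

    // Builds an index with one catalog ("hive") and the given number of databases.
    static NavigableSet<Key> buildIndex(int databases) {
        NavigableSet<Key> index = new TreeSet<>();
        for (int i = 0; i < databases; i++) {
            index.add(new Key("hive", "db" + i));
        }
        return index;
    }

    // Filter on the leading column: the sort order yields a contiguous range,
    // so only entries with that catName are returned. Result: size of that range.
    static int seekByCatName(NavigableSet<Key> index, String catName) {
        Key low = new Key(catName, "");
        Key high = new Key(catName + "\0", ""); // smallest key after this catName
        return index.subSet(low, true, high, false).size();
    }

    // Filter on the second column alone: the sort order does not help, so every
    // entry is compared. Result: {entries examined, entries matched}.
    static int[] scanByDbName(NavigableSet<Key> index, String dbName) {
        int examined = 0;
        int matched = 0;
        for (Key k : index) {
            examined++;
            if (k.dbName.equals(dbName)) {
                matched++;
            }
        }
        return new int[] {examined, matched};
    }

    public static void main(String[] args) {
        NavigableSet<Key> index = buildIndex(1000);
        System.out.println("range size for catName == '': " + seekByCatName(index, ""));
        int[] scan = scanByDbName(index, "");
        System.out.println("entries examined for dbName == '': " + scan[0]);
    }
}
```

Both filters match nothing here, but only the one on the leading column can conclude that without walking the whole structure. Whether a given database behaves this way depends on its optimizer and on the actual index definition.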
[jira] [Work logged] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?focusedWorklogId=447636=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447636 ] ASF GitHub Bot logged work on HIVE-23720: - Author: ASF GitHub Bot Created on: 18/Jun/20 04:08 Start Date: 18/Jun/20 04:08 Worklog Time Spent: 10m Work Description: dengzhhu653 opened a new pull request #1144: URL: https://github.com/apache/hive/pull/1144 …g canceled or timeout ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447636) Remaining Estimate: 0h Time Spent: 10m > Background task should be interrupted when operation being canceled or timeout > -- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently SQLOperation cancels the background task only when the condition is > met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23720: -- Labels: pull-request-available (was: ) > Background task should be interrupted when operation being canceled or timeout > -- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently SQLOperation cancels the background task only when the condition is > met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
[ https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YulongZ updated HIVE-23721: --- Description: From Hive3.0,catalog added to hivemeta,many schema of metastore added column “catName”,and index for table added column “catName”。For Hive2.0, > MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL > --- > > Key: HIVE-23721 > URL: https://issues.apache.org/jira/browse/HIVE-23721 > Project: Hive > Issue Type: Bug >Affects Versions: 3.1.2 >Reporter: YulongZ >Priority: Minor > > From Hive3.0,catalog added to hivemeta,many schema of metastore added column > “catName”,and index for table added column “catName”。For Hive2.0, -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout
[ https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihua Deng updated HIVE-23720: --- Description: Currently SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The condition is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources. was: Currently SQLOperation cancels the background task only when the condition is met: if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT) The conditions is evaluated to false when state is OperationState.CANCELED or OperationState.TIMEDOUT, but operations in such states should stop the background tasks to release resources. > Background task should be interrupted when operation being canceled or timeout > -- > > Key: HIVE-23720 > URL: https://issues.apache.org/jira/browse/HIVE-23720 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Zhihua Deng >Priority: Major > > Currently SQLOperation cancels the background task only when the condition is > met: > if (shouldRunAsync() && state != OperationState.CANCELED && state != > OperationState.TIMEDOUT) > The condition is evaluated to false when state is OperationState.CANCELED or > OperationState.TIMEDOUT, but operations in such states should stop the > background tasks to release resources. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details
[ https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anishek Agarwal updated HIVE-23585: --- Resolution: Fixed Status: Resolved (was: Patch Available) +1 merged to master > Retrieve replication instance metrics details > - > > Key: HIVE-23585 > URL: https://issues.apache.org/jira/browse/HIVE-23585 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, > HIVE-23585.03.patch, Replication Metrics.pdf > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23708) MergeFileTask.execute() need to close jobclient
[ https://issues.apache.org/jira/browse/HIVE-23708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YulongZ updated HIVE-23708:
---
Attachment: HIVE-23708.patch

> MergeFileTask.execute() need to close jobclient
> ---
>
> Key: HIVE-23708
> URL: https://issues.apache.org/jira/browse/HIVE-23708
> Project: Hive
> Issue Type: Bug
> Affects Versions: All Versions
> Environment: Hadoop 3.1 (1700+ nodes)
> YARN 3.1 (with timelineserver enabled, https enabled)
> Hive 3.1 (15 HS2 instance)
> 6+ YARN Applications every day
> Reporter: YulongZ
> Priority: Critical
> Attachments: HIVE-23708.patch
>
> When YARN uses HTTPS, MergeFileTask causes a growing number of threads named
> "ReloadingX509TrustManager" in HiveServer2. These threads are never
> interrupted, and HS2 becomes abnormal.
> In MergeFileTask.execute(DriverContext driverContext), a JobClient is created
> but not closed, which causes the issue above.
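The leak pattern described above, a client created per task execution and never closed, can be shown with a stand-in resource. The sketch is not MergeFileTask and does not use Hadoop's JobClient; FakeJobClient and both execute methods are hypothetical, and the open-client counter stands in for the background threads the reporter observed.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CloseClientSketch {
    // Number of clients that were created and not yet closed.
    static final AtomicInteger OPEN_CLIENTS = new AtomicInteger();

    // Hypothetical stand-in for a client that holds background resources.
    static final class FakeJobClient implements AutoCloseable {
        FakeJobClient() {
            OPEN_CLIENTS.incrementAndGet();
        }

        void submit(String jobName) {
            // No real work; a real client would talk to the cluster here.
        }

        @Override
        public void close() {
            OPEN_CLIENTS.decrementAndGet();
        }
    }

    // Leaky variant: the client goes out of scope without being closed.
    static int executeLeaky() {
        FakeJobClient client = new FakeJobClient();
        client.submit("merge");
        return 0;
    }

    // Closing variant: try-with-resources closes the client on every exit path.
    static int executeClosing() {
        try (FakeJobClient client = new FakeJobClient()) {
            client.submit("merge");
            return 0;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            executeLeaky();
        }
        for (int i = 0; i < 3; i++) {
            executeClosing();
        }
        System.out.println("clients left open: " + OPEN_CLIENTS.get());
    }
}
```

Only the leaky calls contribute to the counter, which mirrors how each unclosed client in a long-running server process leaves its resources behind.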
[jira] [Commented] (HIVE-23452) Exception occur when a SQL query across data stored in two relational DB by JDBCStorageHandler with Tez
[ https://issues.apache.org/jira/browse/HIVE-23452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138970#comment-17138970 ]

De Li commented on HIVE-23452:
---

It seems to be resolved by HIVE-20652, but further testing is still needed.

> Exception occur when a SQL query across data stored in two relational DB by
> JDBCStorageHandler with Tez
> ---
>
> Key: HIVE-23452
> URL: https://issues.apache.org/jira/browse/HIVE-23452
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 3.1.0
> Reporter: De Li
> Priority: Major
>
> An exception occurs when a SQL query spans data stored in two relational
> databases accessed through JDBCStorageHandler with Tez. It seems that an
> incorrect JDBC driver is used under Tez; the same query works with MR.
[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly
[ https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447602=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447602 ]

ASF GitHub Bot logged work on HIVE-23704:
---
Author: ASF GitHub Bot
Created on: 18/Jun/20 02:06
Start Date: 18/Jun/20 02:06
Worklog Time Spent: 10m
Work Description: belugabehr opened a new pull request #1127:
URL: https://github.com/apache/hive/pull/1127

Issue Time Tracking
---
Worklog Id: (was: 447602)
Time Spent: 1h 10m (was: 1h)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> ---
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
> Issue Type: Bug
> Components: Security
> Affects Versions: 3.1.2, 2.3.7
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes authHeaderBase64 (which is a Base64 string), converts it
> into bytes, and then tries to decode those bytes. That is incorrect. It
> should convert the Base64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because
> the string that is being decoded is not actually Base64 (see attached image).
> It has a stray space and a colon. Again, the existing code doesn't care
> because it's not parsing Base64 text, it is parsing the bytes generated by
> converting Base64 text to bytes.
> I'm not sure what effect this has, or what security issues this may present,
> but it's definitely not correct.
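Decoding the header string directly, as the description proposes, might look like the sketch below. It is not the patch from pull request #1127: AuthHeaderSketch and decodeCredentials are hypothetical names, and it uses the JDK's java.util.Base64 instead of the commons-codec calls in the quoted snippet. The JDK's basic decoder rejects characters outside the Base64 alphabet, so a value with a stray space or colon fails instead of being silently accepted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class AuthHeaderSketch {
    // Decodes a Base64 "user:password" string and splits it into two tokens.
    // Throws IllegalArgumentException if the input is not valid Base64.
    static String[] decodeCredentials(String authHeaderBase64) {
        byte[] decoded = Base64.getDecoder().decode(authHeaderBase64);
        String credentials = new String(decoded, StandardCharsets.UTF_8);
        // Limit 2 keeps any further colons as part of the password.
        return credentials.split(":", 2);
    }

    public static void main(String[] args) {
        String header = Base64.getEncoder()
            .encodeToString("alice:s3:cret".getBytes(StandardCharsets.UTF_8));
        String[] creds = decodeCredentials(header);
        System.out.println("user=" + creds[0] + ", password length=" + creds[1].length());
    }
}
```

The split limit is a design choice of this sketch rather than something stated in the issue; the quoted method splits on every colon.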
[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly
[ https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447601 ] ASF GitHub Bot logged work on HIVE-23704: - Author: ASF GitHub Bot Created on: 18/Jun/20 02:05 Start Date: 18/Jun/20 02:05 Worklog Time Spent: 10m Work Description: belugabehr closed pull request #1127: URL: https://github.com/apache/hive/pull/1127 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447601) Time Spent: 1h (was: 50m) > Thrift HTTP Server Does Not Handle Auth Handle Correctly > > > Key: HIVE-23704 > URL: https://issues.apache.org/jira/browse/HIVE-23704 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 3.1.2, 2.3.7 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Base64NegotiationError.png > > Time Spent: 1h > Remaining Estimate: 0h > > {code:java|title=ThriftHttpServlet.java} > private String[] getAuthHeaderTokens(HttpServletRequest request, > String authType) throws HttpAuthenticationException { > String authHeaderBase64 = getAuthHeader(request, authType); > String authHeaderString = StringUtils.newStringUtf8( > Base64.decodeBase64(authHeaderBase64.getBytes())); > String[] creds = authHeaderString.split(":"); > return creds; > } > {code} > So here, it takes the authHeaderBase64 (which is a base-64 string), and > converts it into bytes, and then it tries to decode those bytes. That is > incorrect. It should convert the base-64 string directly into bytes. > I tried to do this as part of [HIVE-22676] and the tests were failing because > the string that is being decoded is not actually Base-64 (see attached image). > It has a stray space and a colon.
Again, the existing code doesn't care > because it's not parsing Base-64 text; it is parsing the bytes generated by > converting base-64 text to bytes. > I'm not sure what effect this has or what security issues this may present, but > it's definitely not correct. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20172) StatsUpdater failed with GSS Exception while trying to connect to remote metastore
[ https://issues.apache.org/jira/browse/HIVE-20172?focusedWorklogId=447569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447569 ] ASF GitHub Bot logged work on HIVE-20172: - Author: ASF GitHub Bot Created on: 18/Jun/20 00:24 Start Date: 18/Jun/20 00:24 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #400: URL: https://github.com/apache/hive/pull/400 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447569) Time Spent: 20m (was: 10m) > StatsUpdater failed with GSS Exception while trying to connect to remote > metastore > -- > > Key: HIVE-20172 > URL: https://issues.apache.org/jira/browse/HIVE-20172 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.1.1 > Environment: Hive-1.2.1,Hive2.1,java8 >Reporter: Rajkumar Singh >Assignee: Rajkumar Singh >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20172.patch > > Time Spent: 20m > Remaining Estimate: 0h > > StatsUpdater task failed with GSS Exception while trying to connect to remote > Metastore. 
> {code} > org.apache.thrift.transport.TTransportException: GSS initiate failed > at > org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) > > at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316) > at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487) > > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282) > > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76) > > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564) > > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92) > > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138) > > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110) > > at >
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3526) > at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3558) > at > org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533) > at > org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:300) > > at > org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265) > at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:177) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) > > at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174) > ) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534) > > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282) > > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76) > > {code} > Since the metastore client is running in HMS, there is no need to connect to > a remote URI. -- This message
[jira] [Work logged] (HIVE-23238) FIX PreemptionQueueComparator edge cases
[ https://issues.apache.org/jira/browse/HIVE-23238?focusedWorklogId=447573=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447573 ] ASF GitHub Bot logged work on HIVE-23238: - Author: ASF GitHub Bot Created on: 18/Jun/20 00:24 Start Date: 18/Jun/20 00:24 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #985: URL: https://github.com/apache/hive/pull/985#issuecomment-645695696 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447573) Time Spent: 20m (was: 10m) > FIX PreemptionQueueComparator edge cases > > > Key: HIVE-23238 > URL: https://issues.apache.org/jira/browse/HIVE-23238 > Project: Hive > Issue Type: Improvement >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Fix For: llap > > Attachments: HIVE-23238.01.patch, HIVE-23238.02.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Properly handle preemption comparator edge cases where tasks are of the same type > and have the same number of upstream tasks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20131) SQL Script changes for creating txn write notification in 3.2.0 files
[ https://issues.apache.org/jira/browse/HIVE-20131?focusedWorklogId=447572=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447572 ] ASF GitHub Bot logged work on HIVE-20131: - Author: ASF GitHub Bot Created on: 18/Jun/20 00:24 Start Date: 18/Jun/20 00:24 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #398: URL: https://github.com/apache/hive/pull/398 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447572) Time Spent: 20m (was: 10m) > SQL Script changes for creating txn write notification in 3.2.0 files > --- > > Key: HIVE-20131 > URL: https://issues.apache.org/jira/browse/HIVE-20131 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 3.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 4.0.0 > > Attachments: HIVE-20131.01.patch, HIVE-20131.02.patch > > Time Spent: 20m > Remaining Estimate: 0h > > 1. Change partition name size from 1024 to 767 (MySQL 5.6 and earlier > support a maximum key length of 767). > 2. Remove the txn_write_notification_log table creation from the 3.1.0 > scripts and add new scripts for 3.2.0. > 3. Remove the file 3.1.0-to-4.0.0 and instead add files for 3.2.0-to-4.0.0 and > 3.1.0-to-3.2.0. > 4. Change the metastore init schema xml file to take 4.0.0 instead of 3.1.0 > as the current version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20044) Arrow Serde should pad char values and handle empty strings correctly
[ https://issues.apache.org/jira/browse/HIVE-20044?focusedWorklogId=447570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447570 ] ASF GitHub Bot logged work on HIVE-20044: - Author: ASF GitHub Bot Created on: 18/Jun/20 00:24 Start Date: 18/Jun/20 00:24 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #397: URL: https://github.com/apache/hive/pull/397 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447570) Time Spent: 0.5h (was: 20m) > Arrow Serde should pad char values and handle empty strings correctly > - > > Key: HIVE-20044 > URL: https://issues.apache.org/jira/browse/HIVE-20044 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20044.1.branch-3.patch, HIVE-20044.1.patch, > HIVE-20044.1.patch, HIVE-20044.2.patch, HIVE-20044.3.patch, > HIVE-20044.3.patch, HIVE-20044.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > When Arrow Serde serializes char values, it loses padding. Also when it > counts empty strings, sometimes it makes a smaller number. It should pad char > values and handle empty strings correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20044) Arrow Serde should pad char values and handle empty strings correctly
[ https://issues.apache.org/jira/browse/HIVE-20044?focusedWorklogId=447571=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447571 ] ASF GitHub Bot logged work on HIVE-20044: - Author: ASF GitHub Bot Created on: 18/Jun/20 00:24 Start Date: 18/Jun/20 00:24 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #396: URL: https://github.com/apache/hive/pull/396 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447571) Time Spent: 40m (was: 0.5h) > Arrow Serde should pad char values and handle empty strings correctly > - > > Key: HIVE-20044 > URL: https://issues.apache.org/jira/browse/HIVE-20044 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Teddy Choi >Assignee: Teddy Choi >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20044.1.branch-3.patch, HIVE-20044.1.patch, > HIVE-20044.1.patch, HIVE-20044.2.patch, HIVE-20044.3.patch, > HIVE-20044.3.patch, HIVE-20044.patch > > Time Spent: 40m > Remaining Estimate: 0h > > When Arrow Serde serializes char values, it loses padding. Also when it > counts empty strings, sometimes it makes a smaller number. It should pad char > values and handle empty strings correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20070) ptest optimization - Replicate ACID/MM tables write operations.
[ https://issues.apache.org/jira/browse/HIVE-20070?focusedWorklogId=447574=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447574 ] ASF GitHub Bot logged work on HIVE-20070: - Author: ASF GitHub Bot Created on: 18/Jun/20 00:24 Start Date: 18/Jun/20 00:24 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #395: URL: https://github.com/apache/hive/pull/395 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447574) Time Spent: 20m (was: 10m) > ptest optimization - Replicate ACID/MM tables write operations. > > > Key: HIVE-20070 > URL: https://issues.apache.org/jira/browse/HIVE-20070 > Project: Hive > Issue Type: Sub-task > Components: repl, Transactions >Affects Versions: 3.0.0 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 4.0.0 > > Attachments: HIVE-20070.01.patch > > Time Spent: 20m > Remaining Estimate: 0h > > change the test to do incremental replication for each operation , instead of > only one incremental replication at the end -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23717) In jdbcUrl add config to create External + purge table by default
[ https://issues.apache.org/jira/browse/HIVE-23717?focusedWorklogId=447559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447559 ] ASF GitHub Bot logged work on HIVE-23717: - Author: ASF GitHub Bot Created on: 17/Jun/20 23:00 Start Date: 17/Jun/20 23:00 Worklog Time Spent: 10m Work Description: xiaomengzhang opened a new pull request #1143: URL: https://github.com/apache/hive/pull/1143 …efault External + purge tables are more backward compatible with the old managed tables in CDH and HDP 2. So add a jdbc config "defaultExternalTable". When the value is true, set "hive.create.as.acid" and "hive.create.as.insert.only" to false in session level. In such session, table created by default is external purge table. Test: Add unit test testDefaultExternal in TestJdbcDriver2.java Change-Id: I3b1adc3eb63596ebc1955d116498745fa9356547 ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447559) Remaining Estimate: 0h Time Spent: 10m > In jdbcUrl add config to create External + purge table by default > -- > > Key: HIVE-23717 > URL: https://issues.apache.org/jira/browse/HIVE-23717 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Metastore >Affects Versions: 3.1.0 >Reporter: Xiaomeng Zhang >Assignee: Xiaomeng Zhang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > External + purge tables are more backward compatible with the old managed > tables. 
> Applications can use an HS2 URL that sets the session-level property that > makes external-purge tables the default table type. > As part of this, we need a notion of "session level only" config > parameter(s). -- This message was sent by Atlassian Jira (v8.3.4#803005)
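The behaviour described in the pull request text above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: the class DefaultExternalTableSketch and the method sessionOverrides are hypothetical, and the URL parameters are assumed to have been parsed into a map already. Only the names "defaultExternalTable", "hive.create.as.acid" and "hive.create.as.insert.only" come from the message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DefaultExternalTableSketch {

    // Hypothetical helper: derive session-level overrides from parsed JDBC
    // URL parameters, following the rule stated in the pull request text.
    static Map<String, String> sessionOverrides(Map<String, String> jdbcUrlParams) {
        Map<String, String> overrides = new LinkedHashMap<>();
        // Boolean.parseBoolean(null) is false, so a missing parameter
        // leaves the session untouched.
        if (Boolean.parseBoolean(jdbcUrlParams.get("defaultExternalTable"))) {
            overrides.put("hive.create.as.acid", "false");
            overrides.put("hive.create.as.insert.only", "false");
        }
        return overrides;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defaultExternalTable", "true");
        System.out.println(sessionOverrides(params));
    }
}
```

Keeping the overrides in a separate map reflects the "session level only" idea in the issue: they apply to one session and are not written back to any global configuration.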
[jira] [Updated] (HIVE-23717) In jdbcUrl add config to create External + purge table by default
[ https://issues.apache.org/jira/browse/HIVE-23717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23717: -- Labels: pull-request-available (was: ) > In jdbcUrl add config to create External + purge table by default > -- > > Key: HIVE-23717 > URL: https://issues.apache.org/jira/browse/HIVE-23717 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Metastore >Affects Versions: 3.1.0 >Reporter: Xiaomeng Zhang >Assignee: Xiaomeng Zhang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > External + purge tables are more backward compatible with the old managed > tables. > Applications can use an HS2 URL that sets the session-level property that > makes external-purge tables the default table type. > As part of this, we need a notion of "session level only" config > parameter(s). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23718) Extract transaction handling from Driver
[ https://issues.apache.org/jira/browse/HIVE-23718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23718: -- Labels: pull-request-available (was: ) > Extract transaction handling from Driver > > > Key: HIVE-23718 > URL: https://issues.apache.org/jira/browse/HIVE-23718 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23718) Extract transaction handling from Driver
[ https://issues.apache.org/jira/browse/HIVE-23718?focusedWorklogId=447547=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447547 ] ASF GitHub Bot logged work on HIVE-23718: - Author: ASF GitHub Bot Created on: 17/Jun/20 21:58 Start Date: 17/Jun/20 21:58 Worklog Time Spent: 10m Work Description: miklosgergely opened a new pull request #1142: URL: https://github.com/apache/hive/pull/1142 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447547) Remaining Estimate: 0h Time Spent: 10m > Extract transaction handling from Driver > > > Key: HIVE-23718 > URL: https://issues.apache.org/jira/browse/HIVE-23718 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23718) Extract transaction handling from Driver
[ https://issues.apache.org/jira/browse/HIVE-23718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely reassigned HIVE-23718: - > Extract transaction handling from Driver > > > Key: HIVE-23718 > URL: https://issues.apache.org/jira/browse/HIVE-23718 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data
[ https://issues.apache.org/jira/browse/HIVE-23467?focusedWorklogId=447523=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447523 ] ASF GitHub Bot logged work on HIVE-23467: - Author: ASF GitHub Bot Created on: 17/Jun/20 20:52 Start Date: 17/Jun/20 20:52 Worklog Time Spent: 10m Work Description: sam-an-cloudera commented on a change in pull request #1133: URL: https://github.com/apache/hive/pull/1133#discussion_r441826009 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java ## @@ -105,6 +105,7 @@ protected DateFormat initialValue() { public static final String DB_EMPTY_MARKER = "!"; public static final String EXTERNAL_TABLE_PURGE = "external.table.purge"; + public static final String EXTERNAL_TABLE_AUTODELETE = "external.table.autodelete"; Review comment: After discussion with @nrg4878 and @thejasmn , we thought it's best that we don't add this new option because "external.table.purge" is widely used. Introducing the new autodelete might cause confusion. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447523) Time Spent: 0.5h (was: 20m) > Add a skip.trash config for HMS to skip trash when deleting external table > data > --- > > Key: HIVE-23467 > URL: https://issues.apache.org/jira/browse/HIVE-23467 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Sam An >Assignee: Yu-Wen Lai >Priority: Trivial > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > We have an auto.purge flag, which means skip trash. It can be confusing as we > have 'external.table.purge'='true' to indicate delete table data when this > tblproperties is set. 
> We should make the meaning clearer by introducing a skip trash alias/option. > Additionally, we shall add an alias for external.table.purge, and name it > external.table.autodelete, and document it more prominently, so as to > maintain backward compatibility, and make the meaning of auto deletion of > data more obvious. > The net effect of these 2 changes will be as follows. If the user sets > 'external.table.autodelete'='true' > the table data will be removed when the table is dropped, and if > 'skip.trash'='true' > is set, HMS will not move the table data to the trash folder when removing the > files. This will result in faster removal, especially when the underlying FS is > S3. -- This message was sent by Atlassian Jira (v8.3.4#803005)
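The net effect proposed in the issue description above can be written down as a small decision function. This is a sketch of the proposed semantics only, assuming table properties are available as a string map; DropTableSketch, DataAction and onDropExternalTable are hypothetical names and not HMS code. Note that the review comment in the same message argues against adding 'external.table.autodelete', so the alias is shown here purely to illustrate the description.

```java
import java.util.Map;

public class DropTableSketch {

    enum DataAction { KEEP_DATA, MOVE_TO_TRASH, DELETE_IMMEDIATELY }

    // True when the table property is present and set to "true".
    static boolean isTrue(Map<String, String> params, String key) {
        return "true".equalsIgnoreCase(params.get(key));
    }

    // Hypothetical decision logic for dropping an external table.
    static DataAction onDropExternalTable(Map<String, String> params) {
        // 'external.table.autodelete' is the alias proposed for
        // 'external.table.purge' in the issue description.
        boolean deleteData = isTrue(params, "external.table.purge")
                || isTrue(params, "external.table.autodelete");
        if (!deleteData) {
            return DataAction.KEEP_DATA;
        }
        // 'skip.trash' is the proposed clearer name for 'auto.purge'.
        boolean skipTrash = isTrue(params, "skip.trash") || isTrue(params, "auto.purge");
        return skipTrash ? DataAction.DELETE_IMMEDIATELY : DataAction.MOVE_TO_TRASH;
    }

    public static void main(String[] args) {
        System.out.println(onDropExternalTable(
                Map.of("external.table.autodelete", "true", "skip.trash", "true")));
    }
}
```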
[jira] [Commented] (HIVE-23612) Option for HiveStrictManagedMigration to impersonate a user for FS operations
[ https://issues.apache.org/jira/browse/HIVE-23612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138822#comment-17138822 ] Jason Dere commented on HIVE-23612: --- +1 > Option for HiveStrictManagedMigration to impersonate a user for FS operations > - > > Key: HIVE-23612 > URL: https://issues.apache.org/jira/browse/HIVE-23612 > Project: Hive > Issue Type: Improvement >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23612.0.patch, HIVE-23612.1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > HiveStrictManagedMigration tool can be used to move HDFS paths and to change > ownership on said paths. It may be beneficial to do such file system > operations as a different user than the one the tool itself is run. > Moreover, while creating the external DB directory, the tool will chown the > new directory to the user set as DB owner in HMS. If this is unset, no chown > command is used. In this case we should make the 'hive' user the directory > owner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23717) In jdbcUrl add config to create External + purge table by default
[ https://issues.apache.org/jira/browse/HIVE-23717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaomeng Zhang reassigned HIVE-23717: - > In jdbcUrl add config to create External + purge table by default > -- > > Key: HIVE-23717 > URL: https://issues.apache.org/jira/browse/HIVE-23717 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Metastore >Affects Versions: 3.1.0 >Reporter: Xiaomeng Zhang >Assignee: Xiaomeng Zhang >Priority: Major > > External + purge tables are more backward compatible with the old managed > tables. > Applications can use an HS2 URL that sets the session-level property that > makes external-purge tables the default table type. > As part of this, we need a notion of "session level only" config > parameter(s). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data
[ https://issues.apache.org/jira/browse/HIVE-23467?focusedWorklogId=447466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447466 ] ASF GitHub Bot logged work on HIVE-23467: - Author: ASF GitHub Bot Created on: 17/Jun/20 19:11 Start Date: 17/Jun/20 19:11 Worklog Time Spent: 10m Work Description: nrg4878 commented on pull request #1133: URL: https://github.com/apache/hive/pull/1133#issuecomment-645567880 so if the goal is to deprecate "external.table.purge" going forward in favor of "external.table.autodelete", can we remove all the current references in the code as well and convert them to the new property. so all the new tables should be creating using the new property. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447466) Time Spent: 20m (was: 10m) > Add a skip.trash config for HMS to skip trash when deleting external table > data > --- > > Key: HIVE-23467 > URL: https://issues.apache.org/jira/browse/HIVE-23467 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Sam An >Assignee: Yu-Wen Lai >Priority: Trivial > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > We have an auto.purge flag, which means skip trash. It can be confusing as we > have 'external.table.purge'='true' to indicate delete table data when this > tblproperties is set. > We should make the meaning clearer by introducing a skip trash alias/option. > Additionally, we shall add an alias for external.table.purge, and name it > external.table.autodelete, and document it more prominently, so as to > maintain backward compatibility, and make the meaning of auto deletion of > data more obvious. > The net effect of these 2 changes will be. 
If the user sets > 'external.table.autodelete'='true' > the table data will be removed when the table is dropped, and if > 'skip.trash'='true' > is set, HMS will not move the table data to the trash folder when removing the > files. This will result in faster removal, especially when the underlying FS is > S3. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23715) Fix zookeeper ssl keystore password handling issues
[ https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Varga updated HIVE-23715: --- Status: Patch Available (was: In Progress) > Fix zookeeper ssl keystore password handling issues > --- > > Key: HIVE-23715 > URL: https://issues.apache.org/jira/browse/HIVE-23715 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In HIVE-23045 Zookeeper SSL communication support was introduced, but the > password config for the keystore and truststore is not handled correctly if > they are stored in jceks. > Also, the ZooKeeperTokenStore does not handle the fallback to the global > zookeeper configurations well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23715) Fix zookeeper ssl keystore password handling issues
[ https://issues.apache.org/jira/browse/HIVE-23715?focusedWorklogId=447445=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447445 ] ASF GitHub Bot logged work on HIVE-23715: - Author: ASF GitHub Bot Created on: 17/Jun/20 18:12 Start Date: 17/Jun/20 18:12 Worklog Time Spent: 10m Work Description: pvargacl opened a new pull request #1141: URL: https://github.com/apache/hive/pull/1141 HIVE-23715: Fix zookeeper ssl keystore password handling issues This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447445) Remaining Estimate: 0h Time Spent: 10m > Fix zookeeper ssl keystore password handling issues > --- > > Key: HIVE-23715 > URL: https://issues.apache.org/jira/browse/HIVE-23715 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > In HIVE-23045 Zookeeper SSL communication support was introduced, but the > password config for the keystore and truststore is not handled correctly if > they are stored in jceks. > Also, the ZooKeeperTokenStore does not handle the fallback to the global > zookeeper configurations well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23715) Fix zookeeper ssl keystore password handling issues
[ https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23715: -- Labels: pull-request-available (was: ) > Fix zookeeper ssl keystore password handling issues > --- > > Key: HIVE-23715 > URL: https://issues.apache.org/jira/browse/HIVE-23715 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In HIVE-23045 Zookeeper SSL communication support was introduced, but the > password config for the keystore and truststore is not handled correctly if > they are stored in jceks. > Also, the ZooKeeperTokenStore does not handle the fallback to the global > zookeeper configurations well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23716) Support Anti Join in Hive
[ https://issues.apache.org/jira/browse/HIVE-23716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mahesh kumar behera reassigned HIVE-23716: -- > Support Anti Join in Hive > -- > > Key: HIVE-23716 > URL: https://issues.apache.org/jira/browse/HIVE-23716 > Project: Hive > Issue Type: Bug >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > > Currently Hive does not support anti join. A query for an anti join is > converted to a left outer join, and a null filter on the right-side join key is added > to get the desired result. This causes: > # Extra computation — The left outer join projects the redundant columns > from the right side. Along with that, filtering is done to remove the redundant > rows. This can be avoided with an anti join, as an anti join will project > only the required columns and rows from the left-side table. > # Extra shuffle — With an anti join, moving the duplicate records to the join > node can be avoided at the child node. This can reduce a significant amount > of data movement if the number of distinct rows (join keys) is significant. > # Extra Memory Usage - With a map-based anti join, a hash set is > sufficient, as just the key is required to check whether a record matches the > join condition. With a left join, we also need the key and the non-key columns, > and thus a hash table is required. > For a query like > {code:java} > select wr_order_number FROM web_returns LEFT JOIN web_sales ON > wr_order_number = ws_order_number WHERE ws_order_number IS NULL;{code} > The number of distinct ws_order_number in web_sales table in a typical 10TB > TPCDS setup is just 10% of total records. So when we convert this query to > anti join, instead of 7 billion rows, only 600 million rows are moved to the join > node. > In the current patch, just one conversion is done. The pattern of > project->filter->left-join is converted to project->anti-join. This will take > care of sub queries with “not exists” clause.
The queries with “not exists” > are converted first to filter + left-join and then converted to anti > join. The queries with “not in” are not handled in the current patch. > On the execution side, both merge join and map join with vectorized execution > are supported for anti join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
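To illustrate the memory argument in the description above, here is a minimal sketch (not Hive's operator code) of a map-side anti join that keeps only the distinct right-side keys in a hash set, next to the left-join-plus-null-filter shape it replaces. The class and method names are hypothetical, and the sketch assumes non-null join keys.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; these names do not exist in Hive.
final class AntiJoinSketch {

    // Shape of the current rewrite: LEFT OUTER JOIN followed by "right key IS NULL".
    // Every right-side row is visited, duplicates included.
    static List<Long> leftJoinThenNullFilter(List<Long> left, List<Long> right) {
        List<Long> out = new ArrayList<>();
        for (Long l : left) {
            boolean matched = false;
            for (Long r : right) {
                if (l.equals(r)) {
                    matched = true; // a real left join would emit one joined row per match
                }
            }
            if (!matched) {
                out.add(l); // the null filter keeps only unmatched left rows
            }
        }
        return out;
    }

    // Anti join: only the distinct right-side keys are needed, so a hash set suffices.
    static List<Long> antiJoin(List<Long> left, List<Long> right) {
        Set<Long> rightKeys = new HashSet<>(right); // duplicates collapse here
        List<Long> out = new ArrayList<>();
        for (Long l : left) {
            if (!rightKeys.contains(l)) {
                out.add(l);
            }
        }
        return out;
    }
}
```

Both methods return the same left-side rows; the second never stores non-key columns or duplicate keys from the right side.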
[jira] [Updated] (HIVE-23698) Compiler support for row-level filtering on filterPredicates
[ https://issues.apache.org/jira/browse/HIVE-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23698: -- Description: Similar to what we currently do for StorageHandlers, we should push down the static expression for row-level filtering when the file-format supports the feature (ORC). I propose to split the filterExpr into a residual and a pushed predicate. If the predicate is completely pushed then we remove the operator. was: Similar to what we currently do for StorageHandlers, we should push down the static expression for row-level filtering when the file-format supports the feature (ORC). I propose to split the filterExpr into a residual and a pushed predicate. If the predicate is completely pushed then we remove the operator. If it is only partially pushed, we do not update the filter, as that could trigger constantFolding (and thus clear existing TsFilters) > Compiler support for row-level filtering on filterPredicates > > > Key: HIVE-23698 > URL: https://issues.apache.org/jira/browse/HIVE-23698 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Similar to what we currently do for StorageHandlers, we should push down the > static expression for row-level filtering when the file-format supports the > feature (ORC). > I propose to split the filterExpr into a residual and a pushed predicate. If the > predicate is completely pushed then we remove the operator. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23698) Compiler support for row-level filtering on filterPredicates
[ https://issues.apache.org/jira/browse/HIVE-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-23698: -- Description: Similar to what we currently do for StorageHandlers, we should push down the static expression for row-level filtering when the file-format supports the feature (ORC). I propose to split the filterExpr into a residual and a pushed predicate. If the predicate is completely pushed then we remove the operator. If it is only partially pushed, we do not update the filter, as that could trigger constantFolding (and thus clear existing TsFilters) > Compiler support for row-level filtering on filterPredicates > > > Key: HIVE-23698 > URL: https://issues.apache.org/jira/browse/HIVE-23698 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Similar to what we currently do for StorageHandlers, we should push down the > static expression for row-level filtering when the file-format supports the > feature (ORC). > I propose to split the filterExpr into a residual and a pushed predicate. If the > predicate is completely pushed then we remove the operator. > If it is only partially pushed, we do not update the filter, as that could trigger > constantFolding (and thus clear existing TsFilters) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly
[ https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447430=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447430 ] ASF GitHub Bot logged work on HIVE-23704: - Author: ASF GitHub Bot Created on: 17/Jun/20 17:50 Start Date: 17/Jun/20 17:50 Worklog Time Spent: 10m Work Description: belugabehr edited a comment on pull request #1127: URL: https://github.com/apache/hive/pull/1127#issuecomment-645524894 Existing code works because Commons Digest Base64 implementation ignores invalid characters:. https://github.com/apache/commons-codec/blob/41c6f486fd4f5c2450c9311c40dbbf7e576d2907/src/main/java/org/apache/commons/codec/binary/Base64.java#L621 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447430) Time Spent: 50m (was: 40m) > Thrift HTTP Server Does Not Handle Auth Handle Correctly > > > Key: HIVE-23704 > URL: https://issues.apache.org/jira/browse/HIVE-23704 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 3.1.2, 2.3.7 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Base64NegotiationError.png > > Time Spent: 50m > Remaining Estimate: 0h > > {code:java|title=ThriftHttpServlet.java} > private String[] getAuthHeaderTokens(HttpServletRequest request, > String authType) throws HttpAuthenticationException { > String authHeaderBase64 = getAuthHeader(request, authType); > String authHeaderString = StringUtils.newStringUtf8( > Base64.decodeBase64(authHeaderBase64.getBytes())); > String[] creds = authHeaderString.split(":"); > return creds; > } > {code} > So here, it takes the authHeaderBase64 (which is a base-64 string), and > converts it into bytes, and then it tries to decode those 
bytes. That is > incorrect. It should convert the base-64 string directly into bytes. > I tried to do this as part of [HIVE-22676] and the tests were failing because > the string that is being decoded is not actually Base-64 (see attached image). > It has a stray space and a colon. Again, the existing code doesn't care > because it's not parsing Base-64 text, it is parsing the bytes generated by > converting base-64 text to bytes. > I'm not sure what effect this has or what security issues this may present, but > it's definitely not correct. -- This message was sent by Atlassian Jira (v8.3.4#803005)
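The difference between a decoder that silently skips invalid characters and one that rejects them can be sketched as follows. This uses the JDK's java.util.Base64 as a stand-in, not the Commons Codec class used by ThriftHttpServlet, so it only illustrates the behaviour described in the comment above. The class name and the sample header value are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration; uses the JDK decoders, not Commons Codec.
final class Base64HeaderSketch {

    // Strict decoding: any character outside the Base64 alphabet
    // (for example a stray space or colon) raises IllegalArgumentException.
    static String decodeStrict(String token) {
        byte[] raw = Base64.getDecoder().decode(token);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Lenient decoding: the MIME decoder skips characters that are not in the
    // Base64 alphabet, so a malformed token still decodes without any error.
    static String decodeLenient(String token) {
        byte[] raw = Base64.getMimeDecoder().decode(token);
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

With a token such as " dXNlcjpwYXNz:" (leading space, trailing colon), the lenient variant returns the credentials while the strict variant fails fast, which is the kind of masking the issue describes.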
[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly
[ https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447429=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447429 ] ASF GitHub Bot logged work on HIVE-23704: - Author: ASF GitHub Bot Created on: 17/Jun/20 17:49 Start Date: 17/Jun/20 17:49 Worklog Time Spent: 10m Work Description: belugabehr commented on pull request #1127: URL: https://github.com/apache/hive/pull/1127#issuecomment-645524894 Existing code works because Commons Digest Base64 implementation ignores invalid characters:. https://github.com/apache/commons-codec/blob/41c6f486fd4f5c2450c9311c40dbbf7e576d2907/src/main/java/org/apache/commons/codec/binary/Base64.java#L640 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447429) Time Spent: 40m (was: 0.5h) > Thrift HTTP Server Does Not Handle Auth Handle Correctly > > > Key: HIVE-23704 > URL: https://issues.apache.org/jira/browse/HIVE-23704 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 3.1.2, 2.3.7 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Base64NegotiationError.png > > Time Spent: 40m > Remaining Estimate: 0h > > {code:java|title=ThriftHttpServlet.java} > private String[] getAuthHeaderTokens(HttpServletRequest request, > String authType) throws HttpAuthenticationException { > String authHeaderBase64 = getAuthHeader(request, authType); > String authHeaderString = StringUtils.newStringUtf8( > Base64.decodeBase64(authHeaderBase64.getBytes())); > String[] creds = authHeaderString.split(":"); > return creds; > } > {code} > So here, it takes the authHeaderBase64 (which is a base-64 string), and > converts it into bytes, and then it tries to decode those bytes. 
That is > incorrect. It should convert the base-64 string directly into bytes. > I tried to do this as part of [HIVE-22676] and the tests were failing because > the string that is being decoded is not actually Base-64 (see attached image). > It has a stray space and a colon. Again, the existing code doesn't care > because it's not parsing Base-64 text, it is parsing the bytes generated by > converting base-64 text to bytes. > I'm not sure what effect this has or what security issues this may present, but > it's definitely not correct. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque
[ https://issues.apache.org/jira/browse/HIVE-23700?focusedWorklogId=447427=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447427 ] ASF GitHub Bot logged work on HIVE-23700: - Author: ASF GitHub Bot Created on: 17/Jun/20 17:45 Start Date: 17/Jun/20 17:45 Worklog Time Spent: 10m Work Description: frankgh opened a new pull request #1140: URL: https://github.com/apache/hive/pull/1140 Handle IllegalArgumentExceptions thrown by the File constructor when the jar URI is not supported. This fixes the static initialization of the HiveConf class when four conditions are met: 1. hive-site.xml is not present on the classpath 2. hive-site.xml is not present on the "HIVE_CONF_DIR" directory 3. hive-site.xml is not present on the "HIVE_HOME" directory 4. jar URI is not absolute, or is opaque, or URI scheme is null or not file, uri authority is not null, uri fragment is not null, uri query is not null and finally uri path is empty. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447427) Remaining Estimate: 119.5h (was: 119h 40m) Time Spent: 0.5h (was: 20m) > HiveConf static initialization fails when JAR URI is opaque > --- > > Key: HIVE-23700 > URL: https://issues.apache.org/jira/browse/HIVE-23700 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.7 >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-23700.1.patch > > Original Estimate: 120h > Time Spent: 0.5h > Remaining Estimate: 119.5h > > HiveConf static initialization fails when the jar URI is opaque, for example > when it's embedded as a fat jar in a spring boot application. 
Then > initialization of the HiveConf static block fails and the HiveConf class does > not get classloaded. The opaque URI in my case looks like this > _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_ > HiveConf#findConfigFile should be able to handle `IllegalArgumentException` > when the jar `URI` provided to `File` throws the exception. > To surface this issue three conditions need to be met. > 1. hive-site.xml should not be on the classpath > 2. hive-site.xml should not be on "HIVE_CONF_DIR" > 3. hive-site.xml should not be on "HIVE_HOME" -- This message was sent by Atlassian Jira (v8.3.4#803005)
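The failure mode described above can be reproduced with the JDK alone: the File(URI) constructor throws IllegalArgumentException for an opaque URI such as the nested-jar URI from the report. The following is a minimal sketch of the kind of guard the pull request describes, not the actual HiveConf#findConfigFile code. The class and method names are hypothetical.

```java
import java.io.File;
import java.net.URI;

// Hypothetical illustration; not the code from the HIVE-23700 patch.
final class OpaqueUriSketch {

    // Returns a File for the jar URI, or null when File(URI) rejects it.
    static File toFileOrNull(URI jarUri) {
        try {
            return new File(jarUri);
        } catch (IllegalArgumentException e) {
            // File(URI) requires an absolute, hierarchical "file" URI with a
            // non-empty path and no authority, query or fragment.
            return null;
        }
    }
}
```

Returning null here stands in for "no config file found next to the jar". A caller in a static initializer can then continue instead of failing class loading.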
[jira] [Work logged] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque
[ https://issues.apache.org/jira/browse/HIVE-23700?focusedWorklogId=447415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447415 ] ASF GitHub Bot logged work on HIVE-23700: - Author: ASF GitHub Bot Created on: 17/Jun/20 17:25 Start Date: 17/Jun/20 17:25 Worklog Time Spent: 10m Work Description: frankgh opened a new pull request #1139: URL: https://github.com/apache/hive/pull/1139 Handle IllegalArgumentExceptions thrown by the File constructor when the jar URI is not supported. This fixes the static initialization of the HiveConf class when four conditions are met: 1. hive-site.xml is not present on the classpath 2. hive-site.xml is not present on the "HIVE_CONF_DIR" directory 3. hive-site.xml is not present on the "HIVE_HOME" directory 4. jar URI is not absolute, or is opaque, or URI scheme is null or not file, uri authority is not null, uri fragment is not null, uri query is not null and finally uri path is empty. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447415) Remaining Estimate: 119h 40m (was: 119h 50m) Time Spent: 20m (was: 10m) > HiveConf static initialization fails when JAR URI is opaque > --- > > Key: HIVE-23700 > URL: https://issues.apache.org/jira/browse/HIVE-23700 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.7 >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-23700.1.patch > > Original Estimate: 120h > Time Spent: 20m > Remaining Estimate: 119h 40m > > HiveConf static initialization fails when the jar URI is opaque, for example > when it's embedded as a fat jar in a spring boot application. 
Then > initialization of the HiveConf static block fails and the HiveConf class does > not get classloaded. The opaque URI in my case looks like this > _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_ > HiveConf#findConfigFile should be able to handle `IllegalArgumentException` > when the jar `URI` provided to `File` throws the exception. > To surface this issue three conditions need to be met. > 1. hive-site.xml should not be on the classpath > 2. hive-site.xml should not be on "HIVE_CONF_DIR" > 3. hive-site.xml should not be on "HIVE_HOME" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name
[ https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-22928: --- Status: Patch Available (was: In Progress) resubmitting .5.patch as .6.patch, hopefully the PreCommit job picks it up this time. > Allow hive.exec.stagingdir to be a fully qualified directory name > - > > Key: HIVE-22928 > URL: https://issues.apache.org/jira/browse/HIVE-22928 > Project: Hive > Issue Type: Improvement > Components: Configuration, Hive >Affects Versions: 3.1.2 >Reporter: Thomas Poepping >Assignee: Thomas Poepping >Priority: Minor > Attachments: HIVE-22928.2.patch, HIVE-22928.3.patch, > HIVE-22928.4.patch, HIVE-22928.5.patch, HIVE-22928.6.patch, HIVE-22928.patch > > > Currently, {{hive.exec.stagingdir}} can only be set as a relative directory > name that, for operations like {{insert}} or {{insert overwrite}}, will be > placed either under the table directory or the partition directory. > For cases where an HDFS cluster is small but the data being inserted is very > large (greater than the capacity of the HDFS cluster, as mentioned in a > comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their > staging directory to be an explicit blobstore path (or any filesystem path), > rather than relying on Hive to intelligently build the blobstore path based > on an interpretation of the job. We may lose locality guarantees, but because > renames are just as expensive on blobstores no matter what the prefix is, > this isn't considered a terribly large loss (assuming only blobstore > customers use this functionality). > Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually > suffice in this case, as the stagingdir is not the same. > This commit enables Hive customers to set an absolute location for all > staging directories. 
For instances where the configured stagingdir scheme is > not the same as the scheme for the table location, the default stagingdir > configuration is used. This avoids a cross-filesystem rename, which is > impossible anyway. -- This message was sent by Atlassian Jira (v8.3.4#803005)
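The fallback rule in the last paragraph of the description can be sketched as a scheme comparison. This is an illustration under stated assumptions, not the code in the attached patches: the class and method names are hypothetical, and ".hive-staging" is assumed to be the default value of hive.exec.stagingdir.

```java
import java.net.URI;

// Hypothetical illustration; not the code from the HIVE-22928 patches.
final class StagingDirSketch {

    // Assumed default value of hive.exec.stagingdir.
    static final String DEFAULT_STAGING_DIR = ".hive-staging";

    // Picks the staging directory to use for a table at the given location.
    static String chooseStagingDir(String configured, URI tableLocation) {
        String scheme = URI.create(configured).getScheme();
        if (scheme == null) {
            return configured; // relative name: existing behaviour, placed under the table
        }
        if (scheme.equalsIgnoreCase(tableLocation.getScheme())) {
            return configured; // same filesystem scheme: the absolute location is usable
        }
        return DEFAULT_STAGING_DIR; // different scheme: avoid a cross-filesystem rename
    }
}
```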
[jira] [Updated] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name
[ https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-22928: --- Status: In Progress (was: Patch Available) > Allow hive.exec.stagingdir to be a fully qualified directory name > - > > Key: HIVE-22928 > URL: https://issues.apache.org/jira/browse/HIVE-22928 > Project: Hive > Issue Type: Improvement > Components: Configuration, Hive >Affects Versions: 3.1.2 >Reporter: Thomas Poepping >Assignee: Thomas Poepping >Priority: Minor > Attachments: HIVE-22928.2.patch, HIVE-22928.3.patch, > HIVE-22928.4.patch, HIVE-22928.5.patch, HIVE-22928.6.patch, HIVE-22928.patch > > > Currently, {{hive.exec.stagingdir}} can only be set as a relative directory > name that, for operations like {{insert}} or {{insert overwrite}}, will be > placed either under the table directory or the partition directory. > For cases where an HDFS cluster is small but the data being inserted is very > large (greater than the capacity of the HDFS cluster, as mentioned in a > comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their > staging directory to be an explicit blobstore path (or any filesystem path), > rather than relying on Hive to intelligently build the blobstore path based > on an interpretation of the job. We may lose locality guarantees, but because > renames are just as expensive on blobstores no matter what the prefix is, > this isn't considered a terribly large loss (assuming only blobstore > customers use this functionality). > Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually > suffice in this case, as the stagingdir is not the same. > This commit enables Hive customers to set an absolute location for all > staging directories. For instances where the configured stagingdir scheme is > not the same as the scheme for the table location, the default stagingdir > configuration is used. This avoids a cross-filesystem rename, which is > impossible anyway. 
[jira] [Updated] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name
[ https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Poepping updated HIVE-22928: --- Attachment: HIVE-22928.6.patch > Allow hive.exec.stagingdir to be a fully qualified directory name > - > > Key: HIVE-22928 > URL: https://issues.apache.org/jira/browse/HIVE-22928 > Project: Hive > Issue Type: Improvement > Components: Configuration, Hive >Affects Versions: 3.1.2 >Reporter: Thomas Poepping >Assignee: Thomas Poepping >Priority: Minor > Attachments: HIVE-22928.2.patch, HIVE-22928.3.patch, > HIVE-22928.4.patch, HIVE-22928.5.patch, HIVE-22928.6.patch, HIVE-22928.patch > > > Currently, {{hive.exec.stagingdir}} can only be set as a relative directory > name that, for operations like {{insert}} or {{insert overwrite}}, will be > placed either under the table directory or the partition directory. > For cases where an HDFS cluster is small but the data being inserted is very > large (greater than the capacity of the HDFS cluster, as mentioned in a > comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their > staging directory to be an explicit blobstore path (or any filesystem path), > rather than relying on Hive to intelligently build the blobstore path based > on an interpretation of the job. We may lose locality guarantees, but because > renames are just as expensive on blobstores no matter what the prefix is, > this isn't considered a terribly large loss (assuming only blobstore > customers use this functionality). > Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually > suffice in this case, as the stagingdir is not the same. > This commit enables Hive customers to set an absolute location for all > staging directories. For instances where the configured stagingdir scheme is > not the same as the scheme for the table location, the default stagingdir > configuration is used. This avoids a cross-filesystem rename, which is > impossible anyway. 
[jira] [Work started] (HIVE-23715) Fix zookeeper ssl keystore password handling issues
[ https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-23715 started by Peter Varga. -- > Fix zookeeper ssl keystore password handling issues > --- > > Key: HIVE-23715 > URL: https://issues.apache.org/jira/browse/HIVE-23715 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > > In HIVE-23045 Zookeeper SSL communication support was introduced, but the > password config for the keystore and truststore is not handled correctly if > the passwords are stored in jceks. > Also, the ZooKeeperTokenStore does not handle the fallback to the global > zookeeper configurations well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23715) Fix zookeeper ssl keystore password handling issues
[ https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Varga reassigned HIVE-23715: -- > Fix zookeeper ssl keystore password handling issues > --- > > Key: HIVE-23715 > URL: https://issues.apache.org/jira/browse/HIVE-23715 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > > In HIVE-23045 Zookeeper SSL communication support was introduced, but the > password config for the keystore and truststore is not handled correctly if > the passwords are stored in jceks. > Also, the ZooKeeperTokenStore does not handle the fallback to the global > zookeeper configurations well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque
[ https://issues.apache.org/jira/browse/HIVE-23700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-23700: -- Labels: pull-request-available (was: ) > HiveConf static initialization fails when JAR URI is opaque > --- > > Key: HIVE-23700 > URL: https://issues.apache.org/jira/browse/HIVE-23700 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.7 >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-23700.1.patch > > Original Estimate: 120h > Time Spent: 10m > Remaining Estimate: 119h 50m > > HiveConf static initialization fails when the jar URI is opaque, for example > when it's embedded as a fat jar in a spring boot application. Then > initialization of the HiveConf static block fails and the HiveConf class does > not get classloaded. The opaque URI in my case looks like this > _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_ > HiveConf#findConfigFile should be able to handle `IllegalArgumentException` > when the jar `URI` provided to `File` throws the exception. > To surface this issue three conditions need to be met. > 1. hive-site.xml should not be on the classpath > 2. hive-site.xml should not be on "HIVE_CONF_DIR" > 3. hive-site.xml should not be on "HIVE_HOME" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque
[ https://issues.apache.org/jira/browse/HIVE-23700?focusedWorklogId=447365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447365 ] ASF GitHub Bot logged work on HIVE-23700: - Author: ASF GitHub Bot Created on: 17/Jun/20 15:42 Start Date: 17/Jun/20 15:42 Worklog Time Spent: 10m Work Description: frankgh opened a new pull request #1138: URL: https://github.com/apache/hive/pull/1138 Handle IllegalArgumentExceptions thrown by the File constructor when the jar URI is not supported. This fixes the static initialization of the HiveConf class when four conditions are met: 1. hive-site.xml is not present in the classpath 2. hive-site.xml is not present on the "HIVE_CONF_DIR" directory 3. hive-site.xml is not present on the "HIVE_HOME" directory 4. jar URI is not absolute, or is opaque, or URI scheme is null or not file, uri authority is not null, uri fragment is not null, uri query is not null and finally uri path is empty. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447365) Remaining Estimate: 119h 50m (was: 120h) Time Spent: 10m > HiveConf static initialization fails when JAR URI is opaque > --- > > Key: HIVE-23700 > URL: https://issues.apache.org/jira/browse/HIVE-23700 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.7 >Reporter: Francisco Guerrero >Assignee: Francisco Guerrero >Priority: Minor > Attachments: HIVE-23700.1.patch > > Original Estimate: 120h > Time Spent: 10m > Remaining Estimate: 119h 50m > > HiveConf static initialization fails when the jar URI is opaque, for example > when it's embedded as a fat jar in a spring boot application. Then > initialization of the HiveConf static block fails and the HiveConf class does > not get classloaded. 
The opaque URI in my case looks like this > _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_ > HiveConf#findConfigFile should be able to handle `IllegalArgumentException` > when the jar `URI` provided to `File` throws the exception. > To surface this issue three conditions need to be met. > 1. hive-site.xml should not be on the classpath > 2. hive-site.xml should not be on "HIVE_CONF_DIR" > 3. hive-site.xml should not be on "HIVE_HOME" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-20890) ACID: Allow whole table ReadLocks to skip all partition locks
[ https://issues.apache.org/jira/browse/HIVE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138524#comment-17138524 ] Peter Vary commented on HIVE-20890: --- Thanks [~dkuzmenko] for the notification on the duplicate! Do you plan to create a pull request for this? 2 quick questions: * You added the check for the {{TABLE}} - we still want to have the table level lock to be there, or am I missing something? {code} case TABLE: t = input.getTable(); if (!fullTableLock.contains(t)) { continue; } {code} * We are supposed to read the conf like this nowadays: {code} HiveConf.getIntVar(conf, ConfVars.HIVE_LOCKS_PARTITION_THRESHOLD); {code} Thanks again, Peter > ACID: Allow whole table ReadLocks to skip all partition locks > - > > Key: HIVE-20890 > URL: https://issues.apache.org/jira/browse/HIVE-20890 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Gopal Vijayaraghavan >Assignee: Denys Kuzmenko >Priority: Major > Attachments: HIVE-20890.1.patch, HIVE-20890.2.patch, > HIVE-20890.3.patch, HIVE-20890.4.patch > > > HIVE-19369 proposes adding an EXCL_WRITE lock which does not wait for any > SHARED_READ locks for read operations - in the presence of that lock, the > insert overwrite no longer takes an exclusive lock. > The only exclusive operation will be a schema change or drop table, which > should take an exclusive lock on the entire table directly. > {code} > explain locks select * from tpcds_bin_partitioned_orc_1000.store_sales where > ss_sold_date_sk=2452626 > ++ > | Explain | > ++ > | LOCK INFORMATION: | > | tpcds_bin_partitioned_orc_1000.store_sales -> SHARED_READ | > | tpcds_bin_partitioned_orc_1000.store_sales.ss_sold_date_sk=2452626 -> > SHARED_READ | > ++ > {code} > So the per-partition SHARED_READ locks are no longer necessary, if the lock > builder already includes the table-wide SHARED_READ locks. 
> The removal of entire partitions is the only part which needs to be taken > care of within these semantics as row-removal instead of directory removal > (i.e. "drop partition" -> "truncate partition" and have the truncation trigger > a whole directory cleaner, so that the partition disappears when there are 0 > rows left). -- This message was sent by Atlassian Jira (v8.3.4#803005)
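The idea that a table-wide SHARED_READ lock makes the per-partition SHARED_READ locks redundant can be sketched as a filter over the requested locks. This is a simplified illustration, not the lock builder code from the attached patches. The LockRequest type and the method name are hypothetical, and lock modes other than SHARED_READ are left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the code from the HIVE-20890 patches.
final class ReadLockSketch {

    // A requested SHARED_READ lock; partition is null for a table-level lock.
    static final class LockRequest {
        final String table;
        final String partition;

        LockRequest(String table, String partition) {
            this.table = table;
            this.partition = partition;
        }
    }

    // Drops partition-level read locks for tables that already have a table-level read lock.
    static List<LockRequest> pruneCoveredPartitionLocks(List<LockRequest> requested) {
        Set<String> fullTableLock = new HashSet<>();
        for (LockRequest r : requested) {
            if (r.partition == null) {
                fullTableLock.add(r.table);
            }
        }
        List<LockRequest> kept = new ArrayList<>();
        for (LockRequest r : requested) {
            boolean covered = r.partition != null && fullTableLock.contains(r.table);
            if (!covered) {
                kept.add(r); // table-level locks and uncovered partition locks remain
            }
        }
        return kept;
    }
}
```

For the store_sales example in the description, only the table-level lock would remain after pruning.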
[jira] [Resolved] (HIVE-23714) Add new configuration for lock escalation
[ https://issues.apache.org/jira/browse/HIVE-23714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary resolved HIVE-23714. --- Resolution: Duplicate Duplicates: HIVE-20890 > Add new configuration for lock escalation > - > > Key: HIVE-23714 > URL: https://issues.apache.org/jira/browse/HIVE-23714 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > It would be good to have an option so that, once a given number of locks is > reached on a partitioned table, we can escalate the lock and request a table > level lock instead of multiple partition level locks. > This is part of the solution proposed on HIVE-21354 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly
[ https://issues.apache.org/jira/browse/HIVE-23704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138490#comment-17138490 ] David Mollitor commented on HIVE-23704: --- Existing code works because Commons Digest Base64 implementation ignores invalid characters:. https://github.com/apache/commons-codec/blob/41c6f486fd4f5c2450c9311c40dbbf7e576d2907/src/main/java/org/apache/commons/codec/binary/Base64.java#L640 > Thrift HTTP Server Does Not Handle Auth Handle Correctly > > > Key: HIVE-23704 > URL: https://issues.apache.org/jira/browse/HIVE-23704 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 3.1.2, 2.3.7 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Base64NegotiationError.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > {code:java|title=ThriftHttpServlet.java} > private String[] getAuthHeaderTokens(HttpServletRequest request, > String authType) throws HttpAuthenticationException { > String authHeaderBase64 = getAuthHeader(request, authType); > String authHeaderString = StringUtils.newStringUtf8( > Base64.decodeBase64(authHeaderBase64.getBytes())); > String[] creds = authHeaderString.split(":"); > return creds; > } > {code} > So here, it takes the authHeaderBase64 (which is a base-64 string), and > converts it into bytes, and then it tries to decode those bytes. That is > incorrect. It should convert the base-64 string directly into bytes. > I tried to do this as part of [HIVE-22676] and the tests were failing because > the string that is being decoded is not actually Base-64 (see attached image). > It has a stray space and a colon. Again, the existing code doesn't care > because it's not parsing Base-64 text, it is parsing the bytes generated by > converting base-64 text to bytes. > I'm not sure what effect this has or what security issues this may present, but > it's definitely not correct. 
[jira] [Updated] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly
[ https://issues.apache.org/jira/browse/HIVE-23704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor updated HIVE-23704: -- Priority: Major (was: Critical) > Thrift HTTP Server Does Not Handle Auth Handle Correctly > > > Key: HIVE-23704 > URL: https://issues.apache.org/jira/browse/HIVE-23704 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 3.1.2, 2.3.7 >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: Base64NegotiationError.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > {code:java|title=ThriftHttpServlet.java} > private String[] getAuthHeaderTokens(HttpServletRequest request, > String authType) throws HttpAuthenticationException { > String authHeaderBase64 = getAuthHeader(request, authType); > String authHeaderString = StringUtils.newStringUtf8( > Base64.decodeBase64(authHeaderBase64.getBytes())); > String[] creds = authHeaderString.split(":"); > return creds; > } > {code} > So here, it takes the authHeaderBase64 (which is a base-64 string), and > converts it into bytes, and then it tries to decode those bytes. That is > incorrect. It should convert the base-64 string directly into bytes. > I tried to do this as part of [HIVE-22676] and the tests were failing because > the string that is being decoded is not actually Base-64 (see attached image). > It has a stray space and a colon. Again, the existing code doesn't care > because it's not parsing Base-64 text, it is parsing the bytes generated by > converting base-64 text to bytes. > I'm not sure what effect this has or what security issues this may present, but > it's definitely not correct. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23585) Retrieve replication instance metrics details
[ https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138465#comment-17138465 ] Pravin Sinha commented on HIVE-23585: - +1 > Retrieve replication instance metrics details > - > > Key: HIVE-23585 > URL: https://issues.apache.org/jira/browse/HIVE-23585 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, > HIVE-23585.03.patch, Replication Metrics.pdf > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23714) Add new configuration for lock escalation
[ https://issues.apache.org/jira/browse/HIVE-23714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138463#comment-17138463 ] Denys Kuzmenko commented on HIVE-23714: --- [~pvary], https://issues.apache.org/jira/browse/HIVE-20890 addresses the same issue, but only for read operations > Add new configuration for lock escalation > - > > Key: HIVE-23714 > URL: https://issues.apache.org/jira/browse/HIVE-23714 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > It would be good if, once a given number of locks is reached on a > partitioned table, we could escalate the lock and request a single table-level > lock instead of multiple partition-level locks. > This is part of the solution proposed in HIVE-21354 -- This message was sent by Atlassian Jira (v8.3.4#803005)
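The escalation idea in HIVE-23714 can be sketched roughly as below. Everything here is hypothetical: the class `LockEscalationSketch`, the method `planLocks`, the string form of the lock targets, and the choice to escalate once the partition count reaches the threshold are assumptions made for illustration, not Hive's lock manager API or an existing configuration key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of threshold-based lock escalation.
final class LockEscalationSketch {

  // Returns the lock targets to request: one table-level target when the
  // number of partitions reaches the threshold, otherwise one target per
  // partition. A threshold of zero or less disables escalation.
  static List<String> planLocks(String table, List<String> partitions,
      int escalationThreshold) {
    List<String> targets = new ArrayList<>();
    if (escalationThreshold > 0 && partitions.size() >= escalationThreshold) {
      targets.add(table);
      return targets;
    }
    for (String partition : partitions) {
      targets.add(table + "/" + partition);
    }
    return targets;
  }

  public static void main(String[] args) {
    List<String> partitions = Arrays.asList("val2=foo", "val2=bar", "val2=baz");
    // Below the threshold: three partition-level targets.
    System.out.println(planLocks("test1", partitions, 5));
    // Threshold reached: a single table-level target.
    System.out.println(planLocks("test1", partitions, 3));
  }
}
```

The trade-off is the usual one for escalation: fewer lock entries to manage, at the cost of blocking writers to partitions the query never touches.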
[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors
[ https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LuGuangMing updated HIVE-22753: --- Description: In case of exception in SQLOperation, operational log does not get cleared up. This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to OOM after some time. !image-2020-01-21-11-14-37-911.png|width=431,height=267! Allocation tree !image-2020-01-21-11-18-37-294.png|width=425,height=178! Prod instance mem !image-2020-01-21-11-17-59-279.png|width=951,height=285! Each HushableRandomAccessFileAppender holds internal ref to RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak. Related ticket: HIVE-18820 was: In case of exception in SQLOperation, operational log does not get cleared up. This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to OOM after some time. !image-2020-01-21-11-14-37-911.png|width=431,height=267! Allocation tree !image-2020-01-21-11-18-37-294.png|width=425,height=178! Prod instance mem !image-2020-01-21-11-17-59-279.png|width=698,height=209! Each HushableRandomAccessFileAppender holds internal ref to RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak. Related ticket: HIVE-18820 > Fix gradual mem leak: Operationlog related appenders should be cleared up on > errors > > > Key: HIVE-22753 > URL: https://issues.apache.org/jira/browse/HIVE-22753 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 4.0.0 > > Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, > HIVE-22753.3.patch, HIVE-22753.4.patch, image-2020-01-21-11-14-37-911.png, > image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png > > > In case of exception in SQLOperation, operational log does not get cleared > up. This causes gradual build up of HushableRandomAccessFileAppender causing > HS2 to OOM after some time. 
> !image-2020-01-21-11-14-37-911.png|width=431,height=267! > > Allocation tree > !image-2020-01-21-11-18-37-294.png|width=425,height=178! > > Prod instance mem > !image-2020-01-21-11-17-59-279.png|width=951,height=285! > > Each HushableRandomAccessFileAppender holds internal ref to > RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem > leak. > Related ticket: HIVE-18820 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors
[ https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LuGuangMing updated HIVE-22753: --- Description: In case of exception in SQLOperation, operational log does not get cleared up. This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to OOM after some time. !image-2020-01-21-11-14-37-911.png|width=431,height=267! Allocation tree !image-2020-01-21-11-18-37-294.png|width=425,height=178! Prod instance mem !image-2020-01-21-11-17-59-279.png|width=671,height=201! Each HushableRandomAccessFileAppender holds internal ref to RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak. Related ticket: HIVE-18820 was: In case of exception in SQLOperation, operational log does not get cleared up. This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to OOM after some time. !image-2020-01-21-11-14-37-911.png|width=431,height=267! Allocation tree !image-2020-01-21-11-18-37-294.png|width=425,height=178! Prod instance mem !image-2020-01-21-11-17-59-279.png|width=951,height=285! Each HushableRandomAccessFileAppender holds internal ref to RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak. Related ticket: HIVE-18820 > Fix gradual mem leak: Operationlog related appenders should be cleared up on > errors > > > Key: HIVE-22753 > URL: https://issues.apache.org/jira/browse/HIVE-22753 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 4.0.0 > > Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, > HIVE-22753.3.patch, HIVE-22753.4.patch, image-2020-01-21-11-14-37-911.png, > image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png > > > In case of exception in SQLOperation, operational log does not get cleared > up. This causes gradual build up of HushableRandomAccessFileAppender causing > HS2 to OOM after some time. 
> !image-2020-01-21-11-14-37-911.png|width=431,height=267! > > Allocation tree > !image-2020-01-21-11-18-37-294.png|width=425,height=178! > > Prod instance mem > !image-2020-01-21-11-17-59-279.png|width=671,height=201! > > Each HushableRandomAccessFileAppender holds internal ref to > RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem > leak. > Related ticket: HIVE-18820 -- This message was sent by Atlassian Jira (v8.3.4#803005)
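The leak pattern described in HIVE-22753 (a per-operation resource that is registered but never released when the operation throws) and the usual remedy can be sketched as follows. The registry, the class name `OperationLogCleanupSketch`, and the 256 KB byte array standing in for an appender's buffer are hypothetical and only mimic the shape of the problem; this is not the patch attached to the issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: release per-operation log resources even on errors.
final class OperationLogCleanupSketch {

  // Stand-in for the registered appenders; each entry holds a buffer.
  static final Map<String, byte[]> APPENDERS = new ConcurrentHashMap<>();

  static void runOperation(String operationId, Runnable work) {
    // Register the per-operation "appender" with its 256 KB buffer.
    APPENDERS.put(operationId, new byte[256 * 1024]);
    try {
      work.run();
    } finally {
      // Runs on success and on exception, so failed operations
      // no longer leave their buffer behind.
      APPENDERS.remove(operationId);
    }
  }

  public static void main(String[] args) {
    try {
      runOperation("op-1", () -> {
        throw new IllegalStateException("simulated query failure");
      });
    } catch (IllegalStateException e) {
      System.out.println("operation failed: " + e.getMessage());
    }
    System.out.println("appenders still registered: " + APPENDERS.size());
  }
}
```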
[jira] [Comment Edited] (HIVE-23712) metadata-only queries return incorrect results with empty partition
[ https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138436#comment-17138436 ] László Bodor edited comment on HIVE-23712 at 6/17/20, 1:35 PM: --- the root cause is that in [MetadataOnlyOptimizer|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java#L124] the TS operator of test1 table is considered to be subject of metadata-only optimization and later [NullScanTaskDispatcher|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java#L106] find a non-empty folder for this partition because of ACID operations, with files: {code} hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delete_delta_003_003_ hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delta_002_002_ {code} not sure about the perfect solution at the moment, but maybe the following scenario should be excluded somehow from metadata-only optimization: 1. there is a partitioned table: create table test1 (id int, val string) partitioned by (val2 string) STORED AS ORC TBLPROPERTIES ('transactional'='true'); 2. 
in a distinct query, only the partitioned column is selected: {code} select distinct val2, current_timestamp, 'metadata true' as query from test1; {code} in this case tsOp.getNeededColumnIDs() is empty (partition column is not present in needed columns) was (Author: abstractdog): the root cause is that in [MetadataOnlyOptimizer|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java#L124] the TS operator of test1 table is considered to be subject of metadata-only optimization and later [NullScanTaskDispatcher|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java#L106] find a non-empty folder for this partition because of ACID operations, with files: {code} hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delete_delta_003_003_ hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delta_002_002_ {code} not sure about the perfect solution at the moment, but maybe the following scenario should be excluded somehow from metadata-only optimization: 1. there is a partitioned table: create table test1 (id int, val string) partitioned by (val2 string) STORED AS ORC TBLPROPERTIES ('transactional'='true'); 2. 
in a distinct query, only the partitioned column is selected: select distinct val2, current_timestamp, 'metadata true' as query from test1; {code} in this case tsOp.getNeededColumnIDs() is empty (partition column is not present in needed columns) {code} > metadata-only queries return incorrect results with empty partition > --- > > Key: HIVE-23712 > URL: https://issues.apache.org/jira/browse/HIVE-23712 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > > Similarly to HIVE-15397, queries can return incorrect results for > metadata-only queries, here is a repro scenario which affects master: > {code} > set hive.support.concurrency=true; > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > set hive.optimize.metadataonly=true; > create table test1 (id int, val string) partitioned by (val2 string) STORED > AS ORC TBLPROPERTIES ('transactional'='true'); > describe formatted test1; > alter table test1 add partition (val2='foo'); > alter table test1 add partition (val2='bar'); > insert into test1 partition (val2='foo') values (1, 'abc'); > select distinct val2, current_timestamp from test1; > insert into test1 partition (val2='bar') values (1, 'def'); > delete from test1 where val2 = 'bar'; > select '--> hive.optimize.metadataonly=true'; > select distinct val2, current_timestamp from test1; > set hive.optimize.metadataonly=false; > select '--> hive.optimize.metadataonly=false'; > select distinct val2, current_timestamp from test1; > select current_timestamp, * from test1; > {code} > in this case 2 rows returned instead of 1 after a delete with metadata only > optimization: > https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23714) Add new configuration for lock escalation
[ https://issues.apache.org/jira/browse/HIVE-23714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary reassigned HIVE-23714: - > Add new configuration for lock escalation > - > > Key: HIVE-23714 > URL: https://issues.apache.org/jira/browse/HIVE-23714 > Project: Hive > Issue Type: Improvement > Components: Transactions >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > It would be good if, once a given number of locks is reached on a > partitioned table, we could escalate the lock and request a single table-level > lock instead of multiple partition-level locks. > This is part of the solution proposed in HIVE-21354 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23712) metadata-only queries return incorrect results with empty partition
[ https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138436#comment-17138436 ] László Bodor commented on HIVE-23712: - the root cause is that in [MetadataOnlyOptimizer|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java#L124] the TS operator of test1 table is considered to be subject of metadata-only optimization and later [NullScanTaskDispatcher|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java#L106] find a non-empty folder for this partition because of ACID operations, with files: {code} hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delete_delta_003_003_ hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delta_002_002_ {code} not sure about the perfect solution at the moment, but maybe the following scenario should be excluded somehow from metadata-only optimization: 1. there is a partitioned table: create table test1 (id int, val string) partitioned by (val2 string) STORED AS ORC TBLPROPERTIES ('transactional'='true'); 2. 
in a distinct query, only the partitioned column is selected: select distinct val2, current_timestamp, 'metadata true' as query from test1; {code} in this case tsOp.getNeededColumnIDs() is empty (partition column is not present in needed columns) {code} > metadata-only queries return incorrect results with empty partition > --- > > Key: HIVE-23712 > URL: https://issues.apache.org/jira/browse/HIVE-23712 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > > Similarly to HIVE-15397, queries can return incorrect results for > metadata-only queries, here is a repro scenario which affects master: > {code} > set hive.support.concurrency=true; > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; > set hive.optimize.metadataonly=true; > create table test1 (id int, val string) partitioned by (val2 string) STORED > AS ORC TBLPROPERTIES ('transactional'='true'); > describe formatted test1; > alter table test1 add partition (val2='foo'); > alter table test1 add partition (val2='bar'); > insert into test1 partition (val2='foo') values (1, 'abc'); > select distinct val2, current_timestamp from test1; > insert into test1 partition (val2='bar') values (1, 'def'); > delete from test1 where val2 = 'bar'; > select '--> hive.optimize.metadataonly=true'; > select distinct val2, current_timestamp from test1; > set hive.optimize.metadataonly=false; > select '--> hive.optimize.metadataonly=false'; > select distinct val2, current_timestamp from test1; > select current_timestamp, * from test1; > {code} > in this case 2 rows returned instead of 1 after a delete with metadata only > optimization: > https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor resolved HIVE-23026. --- Fix Version/s: (was: 3.0.0) (was: 2.0.0) 3.2.0 4.0.0 2.4.0 Resolution: Fixed Thanks so much [~xiejiajun]. PR has been merged across versions. > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Assignee: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.4.0, 4.0.0, 3.2.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?focusedWorklogId=447275=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447275 ] ASF GitHub Bot logged work on HIVE-23026: - Author: ASF GitHub Bot Created on: 17/Jun/20 13:33 Start Date: 17/Jun/20 13:33 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1083: URL: https://github.com/apache/hive/pull/1083 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447275) Time Spent: 5h 10m (was: 5h) > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Assignee: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 3.0.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?focusedWorklogId=447274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447274 ] ASF GitHub Bot logged work on HIVE-23026: - Author: ASF GitHub Bot Created on: 17/Jun/20 13:32 Start Date: 17/Jun/20 13:32 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1080: URL: https://github.com/apache/hive/pull/1080 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447274) Time Spent: 5h (was: 4h 50m) > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Assignee: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 3.0.0 > > Time Spent: 5h > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?focusedWorklogId=447273=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447273 ] ASF GitHub Bot logged work on HIVE-23026: - Author: ASF GitHub Bot Created on: 17/Jun/20 13:30 Start Date: 17/Jun/20 13:30 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1082: URL: https://github.com/apache/hive/pull/1082 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447273) Time Spent: 4h 50m (was: 4h 40m) > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Assignee: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 3.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138430#comment-17138430 ] David Mollitor commented on HIVE-23026: --- Pushed to master (4.x) > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Assignee: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 3.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor updated HIVE-23026: -- Summary: Allow for custom YARN application name for TEZ queries (was: support add a yarn application name for tez on hiveserver2) > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 3.0.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23026) Allow for custom YARN application name for TEZ queries
[ https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor reassigned HIVE-23026: - Assignee: Jake Xie > Allow for custom YARN application name for TEZ queries > -- > > Key: HIVE-23026 > URL: https://issues.apache.org/jira/browse/HIVE-23026 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Jake Xie >Assignee: Jake Xie >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 3.0.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > Currently tez on hiveServer2 cannot specify yarn application name, which is > not very convenient for locating the problem SQL, so i added a configuration > item to support setting tez job name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138426#comment-17138426 ] Jesus Camacho Rodriguez commented on HIVE-23706: +1 > Fix nulls first sorting behavior > > > Key: HIVE-23706 > URL: https://issues.apache.org/jira/browse/HIVE-23706 > Project: Hive > Issue Type: Bug > Components: Parser >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code} > INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); > select a from t order by a desc; > {code} > instead of > {code} > 3, 2, 2, 2, 1, null > {code} > should return > {code} > null, 3, 2, 2, 2, 1 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
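The ordering that HIVE-23706 expects for a descending sort (nulls first, then values from largest to smallest) can be reproduced outside Hive with a small sketch. It only mirrors the expected result from the issue using the JDK's `Comparator` helpers; the class name `NullsFirstDescSketch` and its method are hypothetical and say nothing about how the Hive parser implements the fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of "descending, nulls first" ordering.
final class NullsFirstDescSketch {

  // Returns a sorted copy: nulls come first, non-null values descend.
  static List<Integer> sortDescNullsFirst(List<Integer> values) {
    List<Integer> copy = new ArrayList<>(values);
    copy.sort(Comparator.nullsFirst(Comparator.<Integer>reverseOrder()));
    return copy;
  }

  public static void main(String[] args) {
    // Same values as the INSERT statement in the issue description.
    List<Integer> rows = Arrays.asList(1, null, 3, 2, 2, 2);
    System.out.println(sortDescNullsFirst(rows)); // prints [null, 3, 2, 2, 2, 1]
  }
}
```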
[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details
[ https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-23585: --- Status: In Progress (was: Patch Available) > Retrieve replication instance metrics details > - > > Key: HIVE-23585 > URL: https://issues.apache.org/jira/browse/HIVE-23585 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, > HIVE-23585.03.patch, Replication Metrics.pdf > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details
[ https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-23585: --- Attachment: HIVE-23585.03.patch Status: Patch Available (was: In Progress) > Retrieve replication instance metrics details > - > > Key: HIVE-23585 > URL: https://issues.apache.org/jira/browse/HIVE-23585 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, > HIVE-23585.03.patch, Replication Metrics.pdf > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q
[ https://issues.apache.org/jira/browse/HIVE-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely resolved HIVE-23418. --- Resolution: Fixed Merged to master, thank you [~kgyrtkirk] > Investigate why msck command found different partitions at repair.q, > msck_repair*, partition_discovery.q > > > Key: HIVE-23418 > URL: https://issues.apache.org/jira/browse/HIVE-23418 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Check [https://reviews.apache.org/r/72485/] for details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q
[ https://issues.apache.org/jira/browse/HIVE-23418?focusedWorklogId=447262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447262 ] ASF GitHub Bot logged work on HIVE-23418: - Author: ASF GitHub Bot Created on: 17/Jun/20 12:58 Start Date: 17/Jun/20 12:58 Worklog Time Spent: 10m Work Description: miklosgergely merged pull request #1128: URL: https://github.com/apache/hive/pull/1128 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447262) Time Spent: 20m (was: 10m) > Investigate why msck command found different partitions at repair.q, > msck_repair*, partition_discovery.q > > > Key: HIVE-23418 > URL: https://issues.apache.org/jira/browse/HIVE-23418 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Check [https://reviews.apache.org/r/72485/] for details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23696) DB Metadata and Progress column not taking the defined length
[ https://issues.apache.org/jira/browse/HIVE-23696?focusedWorklogId=447258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447258 ] ASF GitHub Bot logged work on HIVE-23696: - Author: ASF GitHub Bot Created on: 17/Jun/20 12:53 Start Date: 17/Jun/20 12:53 Worklog Time Spent: 10m Work Description: kgyrtkirk closed pull request #1110: URL: https://github.com/apache/hive/pull/1110 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447258) Time Spent: 1h 10m (was: 1h) > DB Metadata and Progress column not taking the defined length > - > > Key: HIVE-23696 > URL: https://issues.apache.org/jira/browse/HIVE-23696 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23696.01.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Caused by: org.datanucleus.exceptions.NucleusUserException: Attempt to store > value > "{"dbName":"testAcidTablesReplLoadBootstrapIncr_1592205875387","replicationType":"BOOTSTRAP","stagingDir":"hdfs://localhost:65158/tmp/org_apache_hadoop_hive_ql_parse_TestReplicationScenarios_245261428230295/hrepl0/dGVzdGFjaWR0YWJsZXNyZXBsbG9hZGJvb3RzdHJhcGluY3JfMTU5MjIwNTg3NTM4Nw==/0/hive","lastReplId":25}" > in column "RM_METADATA" that has maximum length of 255. Please correct your > data! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447257 ]

ASF GitHub Bot logged work on HIVE-23706:
-
Author: ASF GitHub Bot
Created on: 17/Jun/20 12:52
Start Date: 17/Jun/20 12:52
Worklog Time Spent: 10m
Work Description: kasakrisz commented on a change in pull request #1131:
URL: https://github.com/apache/hive/pull/1131#discussion_r441520582

## File path: ql/src/test/results/clientpositive/llap/order_null.q.out ##
@@ -116,12 +116,12 @@ POSTHOOK: query: SELECT x.* FROM src_null_n1 x ORDER BY b desc, a asc
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src_null_n1
 A masked pattern was here
-2 B
-1 A
-2 A
 2 NULL
 3 NULL
 NULL NULL
+2 B
+1 A
+2 A

Review comment:
I agree that the description is confusing; I changed it accordingly.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 447257)
Time Spent: 50m (was: 40m)

> Fix nulls first sorting behavior
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
> Time Spent: 50m
> Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return
> {code}
> null, 3, 2, 2, 2, 1
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q
[ https://issues.apache.org/jira/browse/HIVE-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138411#comment-17138411 ] Zoltan Haindrich commented on HIVE-23418: - +1 > Investigate why msck command found different partitions at repair.q, > msck_repair*, partition_discovery.q > > > Key: HIVE-23418 > URL: https://issues.apache.org/jira/browse/HIVE-23418 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Check [https://reviews.apache.org/r/72485/] for details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q
[ https://issues.apache.org/jira/browse/HIVE-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-23418: --- Assignee: Miklos Gergely > Investigate why msck command found different partitions at repair.q, > msck_repair*, partition_discovery.q > > > Key: HIVE-23418 > URL: https://issues.apache.org/jira/browse/HIVE-23418 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Check [https://reviews.apache.org/r/72485/] for details. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23711) Some IDE generated files should not be checked for license header by rat plugin
[ https://issues.apache.org/jira/browse/HIVE-23711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-23711. - Fix Version/s: 4.0.0 Resolution: Fixed pushed to master. Thank you Laszlo! > Some IDE generated files should not be checked for license header by rat > plugin > --- > > Key: HIVE-23711 > URL: https://issues.apache.org/jira/browse/HIVE-23711 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: rat.txt > > Time Spent: 20m > Remaining Estimate: 0h > > As attached in [^rat.txt], there was an incorrect rat check: > {code} > Files with unapproved licenses: > /Users/lbodor/apache/hive/shims/common/.factorypath > {code} > In this patch, I'm about to take care of .factorypath and some other files > which should be ignored by rat, as they are generated by IDE and ignored > already by git -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23711) Some IDE generated files should not be checked for license header by rat plugin
[ https://issues.apache.org/jira/browse/HIVE-23711?focusedWorklogId=447251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447251 ] ASF GitHub Bot logged work on HIVE-23711: - Author: ASF GitHub Bot Created on: 17/Jun/20 12:46 Start Date: 17/Jun/20 12:46 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1136: URL: https://github.com/apache/hive/pull/1136 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447251) Time Spent: 20m (was: 10m) > Some IDE generated files should not be checked for license header by rat > plugin > --- > > Key: HIVE-23711 > URL: https://issues.apache.org/jira/browse/HIVE-23711 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Attachments: rat.txt > > Time Spent: 20m > Remaining Estimate: 0h > > As attached in [^rat.txt], there was an incorrect rat check: > {code} > Files with unapproved licenses: > /Users/lbodor/apache/hive/shims/common/.factorypath > {code} > In this patch, I'm about to take care of .factorypath and some other files > which should be ignored by rat, as they are generated by IDE and ignored > already by git -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23711) Some IDE generated files should not be checked for license header by rat plugin
[ https://issues.apache.org/jira/browse/HIVE-23711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138406#comment-17138406 ] Zoltan Haindrich commented on HIVE-23711: - +1 > Some IDE generated files should not be checked for license header by rat > plugin > --- > > Key: HIVE-23711 > URL: https://issues.apache.org/jira/browse/HIVE-23711 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Attachments: rat.txt > > Time Spent: 10m > Remaining Estimate: 0h > > As attached in [^rat.txt], there was an incorrect rat check: > {code} > Files with unapproved licenses: > /Users/lbodor/apache/hive/shims/common/.factorypath > {code} > In this patch, I'm about to take care of .factorypath and some other files > which should be ignored by rat, as they are generated by IDE and ignored > already by git -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-20291) Allow HiveStreamingConnection to receive a WriteId
[ https://issues.apache.org/jira/browse/HIVE-20291?focusedWorklogId=447249=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447249 ] ASF GitHub Bot logged work on HIVE-20291: - Author: ASF GitHub Bot Created on: 17/Jun/20 12:45 Start Date: 17/Jun/20 12:45 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #406: URL: https://github.com/apache/hive/pull/406 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447249) Time Spent: 20m (was: 10m) > Allow HiveStreamingConnection to receive a WriteId > -- > > Key: HIVE-20291 > URL: https://issues.apache.org/jira/browse/HIVE-20291 > Project: Hive > Issue Type: Improvement >Reporter: Jaume M >Assignee: Jaume M >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-20291.1.patch, HIVE-20291.10.patch, > HIVE-20291.11.patch, HIVE-20291.2.patch, HIVE-20291.3.patch, > HIVE-20291.4.patch, HIVE-20291.5.patch, HIVE-20291.6.patch, > HIVE-20291.7.patch, HIVE-20291.8.patch, HIVE-20291.9.patch > > Time Spent: 20m > Remaining Estimate: 0h > > If the writeId is received externally it won't need to open connections to > the metastore. It won't be able to the commit in this case as well so it must > be done by the entity passing the writeId. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23689) Bump Tez version to 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-23689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-23689: Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) pushed to master. Thank you [~jagatsingh]! > Bump Tez version to 0.9.2 > - > > Key: HIVE-23689 > URL: https://issues.apache.org/jira/browse/HIVE-23689 > Project: Hive > Issue Type: Improvement >Reporter: Jagat Singh >Assignee: Jagat Singh >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-23689.1.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Bump Tez version to 0.9.2 from 0.9.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23689) Bump Tez version to 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-23689?focusedWorklogId=447247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447247 ] ASF GitHub Bot logged work on HIVE-23689: - Author: ASF GitHub Bot Created on: 17/Jun/20 12:43 Start Date: 17/Jun/20 12:43 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #1108: URL: https://github.com/apache/hive/pull/1108 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 447247) Time Spent: 20m (was: 10m) > Bump Tez version to 0.9.2 > - > > Key: HIVE-23689 > URL: https://issues.apache.org/jira/browse/HIVE-23689 > Project: Hive > Issue Type: Improvement >Reporter: Jagat Singh >Assignee: Jagat Singh >Priority: Minor > Labels: pull-request-available > Attachments: HIVE-23689.1.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Bump Tez version to 0.9.2 from 0.9.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-23138) Run q test with TestMiniLlapLocalCliDriver by default
[ https://issues.apache.org/jira/browse/HIVE-23138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely reassigned HIVE-23138: - Assignee: Ashutosh Chauhan (was: Miklos Gergely) > Run q test with TestMiniLlapLocalCliDriver by default > - > > Key: HIVE-23138 > URL: https://issues.apache.org/jira/browse/HIVE-23138 > Project: Hive > Issue Type: Improvement >Reporter: Miklos Gergely >Assignee: Ashutosh Chauhan >Priority: Major > > TestCliDriver, the current default driver for q tests is running tests on MR, > which is getting less and less used. Instead we should test everything by > default using TestMiniLlapLocalCliDriver. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23329) Investigate why the results have changed for correlationoptimizer14.q
[ https://issues.apache.org/jira/browse/HIVE-23329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely resolved HIVE-23329. --- Release Note: It works fine with LLAP and was erroneous with MR. As we are planning to remove MR in the near future anyway, there is no point in investigating. Assignee: Miklos Gergely Resolution: Won't Do > Investigate why the results have changed for correlationoptimizer14.q > - > > Key: HIVE-23329 > URL: https://issues.apache.org/jira/browse/HIVE-23329 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > > Find out why the result set is different for correlationoptimizer14.q after > moving to TestMiniLlapLocalCliDriver. Check > [https://reviews.apache.org/r/72421/#comment308835] for details.
[jira] [Work logged] (HIVE-23703) Major QB compaction with multiple FileSinkOperators results in data loss and one original file
[ https://issues.apache.org/jira/browse/HIVE-23703?focusedWorklogId=447242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447242 ]

ASF GitHub Bot logged work on HIVE-23703:

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:35
Start Date: 17/Jun/20 12:35
Worklog Time Spent: 10m
Work Description: klcopp commented on a change in pull request #1134:
URL: https://github.com/apache/hive/pull/1134#discussion_r441510102

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java ##
@@ -334,6 +334,18 @@ public void initializeBucketPaths(int filesIdx, String taskId, boolean isNativeT
 if (!isMmTable && !isDirectInsert) {
 if (!bDynParts && !isSkewedStoredAsSubDirectories) {
 finalPaths[filesIdx] = new Path(parent, taskWithExt);
+if (conf.isCompactionTable()) {
+ // tables used in compaction are external and non-acid. We need to keep track of

Review comment: Done

## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ##
@@ -4123,9 +4125,28 @@ private static void copyFiles(final HiveConf conf, final FileSystem destFs,
 }
 throw new HiveException(e);
 }
- } else {
+ else {

Review comment: Typo, done.

## File path: ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveCopyFiles.java ##
@@ -83,7 +83,8 @@ public void testRenameNewFilesOnSameFileSystem() throws IOException {
 FileSystem targetFs = targetPath.getFileSystem(hiveConf);
 try {
- Hive.copyFiles(hiveConf, sourcePath, targetPath, targetFs, isSourceLocal, NO_ACID, false,null, false, false, false);
+ Hive.copyFiles(hiveConf, sourcePath, targetPath, targetFs, isSourceLocal, NO_ACID, false,null,

Review comment: Done

Issue Time Tracking
-------------------
Worklog Id: (was: 447242)
Time Spent: 1h 10m (was: 1h)

> Major QB compaction with multiple FileSinkOperators results in data loss and one original file
>
> Key: HIVE-23703
> URL: https://issues.apache.org/jira/browse/HIVE-23703
> Project: Hive
> Issue Type: Bug
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Critical
> Labels: compaction, pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> h4. Problems
> Example:
> {code:java}
> drop table if exists tbl2;
> create transactional table tbl2 (a int, b int) clustered by (a) into 4 buckets stored as ORC
> TBLPROPERTIES('transactional'='true','transactional_properties'='default');
> insert into tbl2 values(1,2),(1,3),(1,4),(2,2),(2,3),(2,4);
> insert into tbl2 values(3,2),(3,3),(3,4),(4,2),(4,3),(4,4);
> insert into tbl2 values(5,2),(5,3),(5,4),(6,2),(6,3),(6,4);{code}
> E.g. in the example above, bucketId=0 when a=2 and a=6.
>
> 1. Data loss
> In non-acid tables, an operator's temp files are named with their task id. Because of this snippet, temp files in the FileSinkOperator for compaction tables are identified by their bucket_id.
> {code:java}
> if (conf.isCompactionTable()) {
>   fsp.initializeBucketPaths(filesIdx, AcidUtils.BUCKET_PREFIX + String.format(AcidUtils.BUCKET_DIGITS, bucketId),
>       isNativeTable(), isSkewedStoredAsSubDirectories);
> } else {
>   fsp.initializeBucketPaths(filesIdx, taskId, isNativeTable(), isSkewedStoredAsSubDirectories);
> }
> {code}
> So 2 temp files containing data with a=2 and a=6 will be named bucket_0 and not 00_0 and 00_1 as they would normally.
> In FileSinkOperator.commit, when data with a=2, filename: bucket_0 is moved from _task_tmp.-ext-10002 to _tmp.-ext-10002, it overwrites the files already there with a=6 data, because it too is named bucket_0. You can see in the logs:
> {code:java}
> WARN [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Target path file:.../hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1591107230237/warehouse/testmajorcompaction/base_002_v013/.hive-staging_hive_2020-06-02_07-15-21_771_8551447285061957908-1/_tmp.-ext-10002/bucket_0 with a size 610 exists. Trying to delete it.
> {code}
>
> 2. Results in one original file
> OrcFileMergeOperator merges the results of the FSOp into 1 file named 00_0.
>
> h4. Fix
> 1. FSOp will store data as: taskid/bucketId. e.g. 0_0/bucket_0
> 2. OrcMergeFileOp,
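The data loss described in HIVE-23703 above comes down to two tasks writing to the same target name. The following is only a rough Java sketch of that collision and of the proposed taskid/bucketId layout; it is not Hive code. The class `CompactionTempPathSketch`, the method `commit`, the task ids `task0`/`task1` and the in-memory map standing in for the target directory are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CompactionTempPathSketch {
    /**
     * Simulates moving temp files into one target directory.
     * Each write is {taskId, bucketFileName, rows}; the map stands in for the directory.
     * Hypothetical helper, not part of Hive.
     */
    static Map<String, String> commit(String[][] writes, boolean prefixWithTaskId) {
        Map<String, String> targetDir = new LinkedHashMap<>();
        for (String[] write : writes) {
            String taskId = write[0];
            String bucketFile = write[1];
            String rows = write[2];
            // Either name the file by bucket only, or nest it under the task id.
            String path = prefixWithTaskId ? taskId + "/" + bucketFile : bucketFile;
            // put() replaces any existing entry, mirroring the delete-and-overwrite in the log.
            targetDir.put(path, rows);
        }
        return targetDir;
    }

    public static void main(String[] args) {
        // Two tasks (assumed ids) both produce a file for bucket 0.
        String[][] writes = {
            {"task0", "bucket_0", "a=6 rows"},
            {"task1", "bucket_0", "a=2 rows"},
        };
        System.out.println(commit(writes, false)); // {bucket_0=a=2 rows}
        System.out.println(commit(writes, true));  // {task0/bucket_0=a=6 rows, task1/bucket_0=a=2 rows}
    }
}
```

With bucket-only names the a=6 rows disappear; with the task id as a parent directory both files survive, which is the effect the first item of the fix appears to aim for.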
[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447240=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447240 ]

ASF GitHub Bot logged work on HIVE-23706:

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:33
Start Date: 17/Jun/20 12:33
Worklog Time Spent: 10m
Work Description: zabetak commented on a change in pull request #1131:
URL: https://github.com/apache/hive/pull/1131#discussion_r441509021

## File path: ql/src/test/results/clientpositive/llap/order_null.q.out ##
@@ -116,12 +116,12 @@ POSTHOOK: query: SELECT x.* FROM src_null_n1 x ORDER BY b desc, a asc
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src_null_n1
 A masked pattern was here
-2 B
-1 A
-2 A
 2 NULL
 3 NULL
 NULL NULL
+2 B
+1 A
+2 A

Review comment: OK, now I understand better, thanks for the clarification. The `hive.default.nulls.last` setting is a bit confusing. From the name and description I get the impression that NULLS LAST is the default behavior for both ASC and DESC, which apparently is not the case. Yes, the new results are in sync with Postgres, since the latter uses the [semantics that you mentioned](https://www.postgresql.org/docs/12/queries-order.html). Note that this is not a rule followed by every DBMS.

Issue Time Tracking
-------------------
Worklog Id: (was: 447240)
Time Spent: 40m (was: 0.5h)

> Fix nulls first sorting behavior
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return
> {code}
> null, 3, 2, 2, 2, 1
> {code}
[jira] [Updated] (HIVE-23593) Schemainit fails with NoSuchFieldError
[ https://issues.apache.org/jira/browse/HIVE-23593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-23593:

Labels: pull-request-available (was: )

> Schemainit fails with NoSuchFieldError
>
> Key: HIVE-23593
> URL: https://issues.apache.org/jira/browse/HIVE-23593
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> the issue comes from a calcite related class; it's very interesting because ql already has a shaded calcite
> {code}
> Caused by: java.lang.NoSuchFieldError: operands
> at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:192) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:98) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) ~[calcite-core-1.21.0.jar:1.21.0]
> at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRexExecutorImpl.reduce(HiveRexExecutorImpl.java:56) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.foldExpression(HiveFunctionHelper.java:544) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createConstantObjectInspector(HiveFunctionHelper.java:452) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createObjectInspector(HiveFunctionHelper.java:435) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getReturnType(HiveFunctionHelper.java:124) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:647) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}
[jira] [Work logged] (HIVE-23593) Schemainit fails with NoSuchFieldError
[ https://issues.apache.org/jira/browse/HIVE-23593?focusedWorklogId=447236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447236 ]

ASF GitHub Bot logged work on HIVE-23593:

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:27
Start Date: 17/Jun/20 12:27
Worklog Time Spent: 10m
Work Description: kgyrtkirk opened a new pull request #1137:
URL: https://github.com/apache/hive/pull/1137

Issue Time Tracking
-------------------
Worklog Id: (was: 447236)
Remaining Estimate: 0h
Time Spent: 10m

> Schemainit fails with NoSuchFieldError
>
> Key: HIVE-23593
> URL: https://issues.apache.org/jira/browse/HIVE-23593
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> the issue comes from a calcite related class; it's very interesting because ql already has a shaded calcite
> {code}
> Caused by: java.lang.NoSuchFieldError: operands
> at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:192) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:98) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) ~[calcite-core-1.21.0.jar:1.21.0]
> at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRexExecutorImpl.reduce(HiveRexExecutorImpl.java:56) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.foldExpression(HiveFunctionHelper.java:544) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createConstantObjectInspector(HiveFunctionHelper.java:452) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createObjectInspector(HiveFunctionHelper.java:435) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getReturnType(HiveFunctionHelper.java:124) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:647) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}
[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447234=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447234 ]

ASF GitHub Bot logged work on HIVE-23706:

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:23
Start Date: 17/Jun/20 12:23
Worklog Time Spent: 10m
Work Description: kasakrisz commented on a change in pull request #1131:
URL: https://github.com/apache/hive/pull/1131#discussion_r441503698

## File path: ql/src/test/results/clientpositive/llap/order_null.q.out ##
@@ -116,12 +116,12 @@ POSTHOOK: query: SELECT x.* FROM src_null_n1 x ORDER BY b desc, a asc
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src_null_n1
 A masked pattern was here
-2 B
-1 A
-2 A
 2 NULL
 3 NULL
 NULL NULL
+2 B
+1 A
+2 A

Review comment: I'm not sure how up to date that document is, but as far as I know we want NULLS LAST as the default for ASC and NULLS FIRST for DESC. This can be controlled by the setting `hive.default.nulls.last`; its default value is `true`. I also ran these queries with Postgres and got the same results as with this patch.

Issue Time Tracking
-------------------
Worklog Id: (was: 447234)
Time Spent: 0.5h (was: 20m)

> Fix nulls first sorting behavior
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
> Issue Type: Bug
> Components: Parser
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return
> {code}
> null, 3, 2, 2, 2, 1
> {code}
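The default described in the review comment above (NULLS LAST for ASC, NULLS FIRST for DESC) is what you get by treating NULL as larger than every non-null value. The following is a minimal Java sketch of that rule using only `java.util.Comparator`. It illustrates the semantics and is not Hive's implementation; the class `NullOrderingSketch` and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class NullOrderingSketch {
    // Ascending with NULLs last: NULL compares as larger than any non-null value.
    static final Comparator<Integer> ASC_NULLS_LAST =
        Comparator.nullsLast(Comparator.<Integer>naturalOrder());

    // Reversing the same comparator yields descending order with NULLs first.
    static final Comparator<Integer> DESC_NULLS_FIRST = ASC_NULLS_LAST.reversed();

    // Returns a sorted copy so the input list is left untouched.
    static List<Integer> sorted(List<Integer> values, Comparator<Integer> order) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        // Same values as the INSERT in the HIVE-23706 description.
        List<Integer> a = Arrays.asList(1, null, 3, 2, 2, 2);
        System.out.println(sorted(a, ASC_NULLS_LAST));   // [1, 2, 2, 2, 3, null]
        System.out.println(sorted(a, DESC_NULLS_FIRST)); // [null, 3, 2, 2, 2, 1]
    }
}
```

The descending result matches the output the issue description says `order by a desc` should return.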
[jira] [Resolved] (HIVE-23331) Investigate why the results have changed for authorization_9.q
[ https://issues.apache.org/jira/browse/HIVE-23331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely resolved HIVE-23331. --- Release Note: Whatever caused it back then is not causing it anymore. Even in the migration commit it was added without the extra lines. We'll open a new Jira if it occurs again. Assignee: Miklos Gergely Resolution: Cannot Reproduce > Investigate why the results have changed for authorization_9.q > -- > > Key: HIVE-23331 > URL: https://issues.apache.org/jira/browse/HIVE-23331 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > > Find out why the result set is different for authorization_9.q after moving > to TestMiniLlapLocalCliDriver. Check > [https://reviews.apache.org/r/72421/#comment308838|https://reviews.apache.org/r/72421/#comment308835] > for details.
[jira] [Commented] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138385#comment-17138385 ] Krisztian Kasa commented on HIVE-23706: --- [~zabetak] Thanks for pointing out, the goal is changing the behavior when NULLS FIRST/LAST is not specified explicitly. > Fix nulls first sorting behavior > > > Key: HIVE-23706 > URL: https://issues.apache.org/jira/browse/HIVE-23706 > Project: Hive > Issue Type: Bug > Components: Parser >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {code} > INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); > select a from t order by a desc; > {code} > instead of > {code} > 3, 2, 2, 2, 1, null > {code} > should return > {code} > null, 3, 2 ,2 ,2, 1 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23333) Investigate why the results have changed for char_udf1.q
[ https://issues.apache.org/jira/browse/HIVE-23333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Gergely resolved HIVE-23333. --- Release Note: The difference arises because the MR result is bad and the LLAP result is good; as MR is planned to be removed anyway, there is no point in investigating it. Assignee: Miklos Gergely Resolution: Won't Do > Investigate why the results have changed for char_udf1.q > > > Key: HIVE-23333 > URL: https://issues.apache.org/jira/browse/HIVE-23333 > Project: Hive > Issue Type: Sub-task >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > > Find out why the result set is different for char_udf1.q after moving to > TestMiniLlapLocalCliDriver. Check > [https://reviews.apache.org/r/72421/#comment308875|https://reviews.apache.org/r/72421/#comment308835] > for details.
[jira] [Updated] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-23706: -- Description: {code} INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); select a from t order by a desc; {code} instead of {code} 3, 2, 2, 2, 1, null {code} should return {code} null, 3, 2 ,2 ,2, 1 {code} was: {code} INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); select a from t order by a desc; {code} instead of {code} null, 3, 2 ,2 ,2, 1 {code} should return {code} 3, 2, 2, 2, 1, null {code} > Fix nulls first sorting behavior > > > Key: HIVE-23706 > URL: https://issues.apache.org/jira/browse/HIVE-23706 > Project: Hive > Issue Type: Bug > Components: Parser >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {code} > INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); > select a from t order by a desc; > {code} > instead of > {code} > 3, 2, 2, 2, 1, null > {code} > should return > {code} > null, 3, 2 ,2 ,2, 1 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23706) Fix nulls first sorting behavior
[ https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Kasa updated HIVE-23706: -- Description: {code} INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); select a from t order by a desc; {code} instead of {code} null, 3, 2 ,2 ,2, 1 {code} should return {code} 3, 2, 2, 2, 1, null {code} was: {code} INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); SELECT a FROM t ORDER BY a DESC NULLS FIRST {code} should return {code} 3 2 2 2 1 null {code} > Fix nulls first sorting behavior > > > Key: HIVE-23706 > URL: https://issues.apache.org/jira/browse/HIVE-23706 > Project: Hive > Issue Type: Bug > Components: Parser >Reporter: Krisztian Kasa >Assignee: Krisztian Kasa >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {code} > INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2); > select a from t order by a desc; > {code} > instead of {code} > null, 3, 2 ,2 ,2, 1 > {code} > should return > {code} > 3, 2, 2, 2, 1, null > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics
[ https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-23668: --- Attachment: HIVE-23668.03.patch Status: Patch Available (was: In Progress) > Clean up Task for Hive Metrics > -- > > Key: HIVE-23668 > URL: https://issues.apache.org/jira/browse/HIVE-23668 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23668.01.patch, HIVE-23668.02.patch, > HIVE-23668.03.patch > > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)