date:20200617

[jira] [Work logged] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?focusedWorklogId=447650&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447650
 ]

ASF GitHub Bot logged work on HIVE-23720:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 05:53
Start Date: 18/Jun/20 05:53
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 closed pull request #1144:
URL: https://github.com/apache/hive/pull/1144


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 447650)
Time Spent: 20m  (was: 10m)

> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Now SQLOperation cancels the background task only when this condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT)
> The condition evaluates to false when the state is OperationState.CANCELED or 
> OperationState.TIMEDOUT, but operations in such states should still stop their 
> background tasks to release resources.
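The guard quoted above can be exercised in isolation. The following is a minimal sketch, not the actual Hive code: the class name, the reduced OperationState enum, and the guardFromIssue helper are hypothetical stand-ins, and only the boolean expression itself is taken from the issue description.

```java
// Hypothetical stand-in for illustration; not the real SQLOperation class.
class CancelConditionSketch {
    // Reduced stand-in for Hive's OperationState (only a few states are modeled).
    enum OperationState { RUNNING, CANCELED, TIMEDOUT, CLOSED }

    // Same boolean expression as the guard quoted in the issue description.
    static boolean guardFromIssue(boolean shouldRunAsync, OperationState state) {
        return shouldRunAsync
            && state != OperationState.CANCELED
            && state != OperationState.TIMEDOUT;
    }

    public static void main(String[] args) {
        // Print, for each modeled state, whether the guarded cancel branch would run.
        for (OperationState state : OperationState.values()) {
            System.out.println(state + " -> " + guardFromIssue(true, state));
        }
    }
}
```

For CANCELED and TIMEDOUT the guard is false, so whatever the guarded block does (cancelling the background task, according to the description) is skipped in exactly the two states the issue is about.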



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139107#comment-17139107
 ] 

Zhihua Deng edited comment on HIVE-23720 at 6/18/20, 5:48 AM:
--

Further research is needed, as I see that releaseDriverContext() is not called.

https://issues.apache.org/jira/browse/HIVE-16426


was (Author: dengzh):
Further research is needed, as I see that releaseDriverContext() was not 
called.

https://issues.apache.org/jira/browse/HIVE-16426

> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now SQLOperation cancels the background task only when the condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Commented] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139107#comment-17139107
 ] 

Zhihua Deng commented on HIVE-23720:


Further research is needed, as I see that releaseDriverContext() was not 
called.

https://issues.apache.org/jira/browse/HIVE-16426

> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now SQLOperation cancels the background task only when the condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Updated] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-23720:
---
Description: 
Now SQLOperation cancels the background task only when the condition is met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != OperationState.TIMEDOUT)

The condition evaluates to false when the state is OperationState.CANCELED or 
OperationState.TIMEDOUT, but operations in such states should still stop their 
background tasks to release resources.

  was:
 Now SQLOperation cancels the background task only when the condition is met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition evaluates to false when the state is OperationState.CANCELED or 
OperationState.TIMEDOUT, but operations in such states should stop their 
background tasks to release resources, rather than depending on the driver to 
check its own status and stop the background task.


> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now SQLOperation cancels the background task only when the condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Updated] (HIVE-23722) Emit operation's drilldown link to client

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23722:
--
Labels: pull-request-available  (was: )

> Emit operation's drilldown link to client
> -
>
> Key: HIVE-23722
> URL: https://issues.apache.org/jira/browse/HIVE-23722
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now the HiveServer2 web UI provides a drilldown link to many collected 
> metrics and messages about an operation, but it is not easy for an end user to 
> find the target URL of a submitted query. Limited knowledge of the deployment, 
> an HA-based environment (such as one using LVS for balancing or routing), and 
> multiple running queries can make this more difficult. This Jira provides a 
> way to emit the link to the interested end user when enabled.




[jira] [Work logged] (HIVE-23722) Emit operation's drilldown link to client

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23722?focusedWorklogId=447646&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447646
 ]

ASF GitHub Bot logged work on HIVE-23722:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 05:35
Start Date: 18/Jun/20 05:35
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1145:
URL: https://github.com/apache/hive/pull/1145


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY)
   For more details, please see 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute
   





Issue Time Tracking
---

Worklog Id: (was: 447646)
Remaining Estimate: 0h
Time Spent: 10m

> Emit operation's drilldown link to client
> -
>
> Key: HIVE-23722
> URL: https://issues.apache.org/jira/browse/HIVE-23722
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now the HiveServer2 web UI provides a drilldown link to many collected 
> metrics and messages about an operation, but it is not easy for an end user to 
> find the target URL of a submitted query. Limited knowledge of the deployment, 
> an HA-based environment (such as one using LVS for balancing or routing), and 
> multiple running queries can make this more difficult. This Jira provides a 
> way to emit the link to the interested end user when enabled.




[jira] [Updated] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-23720:
---
Description: 
 Now SQLOperation cancels the background task only when the condition is met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition evaluates to false when the state is OperationState.CANCELED or 
OperationState.TIMEDOUT, but operations in such states should stop their 
background tasks to release resources, rather than depending on the driver to 
check its own status and stop the background task.

  was:
 Now SQLOperation cancels the background task only when the condition is met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition is evaluated to false when state is OperationState.CANCELED or 
OperationState.TIMEDOUT,  but operations in such states should stop the 
background tasks to release resources.


> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  Now SQLOperation cancels the background task only when the condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition evaluates to false when the state is OperationState.CANCELED or 
> OperationState.TIMEDOUT, but operations in such states should stop their 
> background tasks to release resources, rather than depending on the driver to 
> check its own status and stop the background task.




[jira] [Updated] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-23720:
---
Summary: Background task may not be interrupted when operation being 
canceled or timeout  (was: Background task should be interrupted when operation 
being canceled or timeout)

> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  Now SQLOperation cancels the background task only when the condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Updated] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-23720:
---
Description: 
 Now SQLOperation cancels the background task only when the condition is met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition is evaluated to false when state is OperationState.CANCELED or 
OperationState.TIMEDOUT,  but operations in such states should stop the 
background tasks to release resources.

  was:
Currently SQLOperation cancels the background task only when the condition is 
met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition is evaluated to false when state is OperationState.CANCELED or 
OperationState.TIMEDOUT,  but operations in such states should stop the 
background tasks to release resources.


> Background task should be interrupted when operation being canceled or timeout
> --
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  Now SQLOperation cancels the background task only when the condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Work logged] (HIVE-21218) KafkaSerDe doesn't support topics created via Confluent Avro serializer

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-21218?focusedWorklogId=447643&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447643
 ]

ASF GitHub Bot logged work on HIVE-21218:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 04:48
Start Date: 18/Jun/20 04:48
Worklog Time Spent: 10m 
  Work Description: OneCricketeer commented on pull request #526:
URL: https://github.com/apache/hive/pull/526#issuecomment-645769927


   @b-slim What was the status on this?





Issue Time Tracking
---

Worklog Id: (was: 447643)
Time Spent: 15h 50m  (was: 15h 40m)

> KafkaSerDe doesn't support topics created via Confluent Avro serializer
> ---
>
> Key: HIVE-21218
> URL: https://issues.apache.org/jira/browse/HIVE-21218
> Project: Hive
>  Issue Type: Bug
>  Components: kafka integration, Serializers/Deserializers
>Affects Versions: 3.1.1
>Reporter: Milan Baran
>Assignee: David McGinnis
>Priority: Major
>  Labels: pull-request-available
> Attachments: 
> 0001-HIVE-21818-Adding-ability-for-Kafka-Handler-to-proce.patch, 
> HIVE-21218.10.patch, HIVE-21218.11.patch, HIVE-21218.12.patch, 
> HIVE-21218.13.patch, HIVE-21218.2.patch, HIVE-21218.3.patch, 
> HIVE-21218.4.patch, HIVE-21218.5.patch, HIVE-21218.6.patch, 
> HIVE-21218.7.patch, HIVE-21218.8.patch, HIVE-21218.9.patch, HIVE-21218.patch
>
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>
> According to [Google 
> groups|https://groups.google.com/forum/#!topic/confluent-platform/JYhlXN0u9_A],
>  the Confluent Avro serializer uses a proprietary format for the Kafka value: 
> <magic_byte 0x00><4 bytes of schema ID><regular Avro bytes for an object that conforms to the schema>. 
> This format does not cause any problem for the Confluent Kafka deserializer, 
> which respects the format. For the Hive Kafka handler, however, it is a problem 
> to correctly deserialize the Kafka value, because Hive uses a custom 
> deserializer from bytes to objects and ignores the Kafka consumer ser/deser 
> classes provided via table properties.
> It would be nice to support the Confluent format with its magic byte.
> It would also be great to support Schema Registry.
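To make the framing concrete, here is a minimal sketch of splitting such a Kafka value into its header and Avro payload. It assumes the Confluent wire format of one magic byte (0), a 4-byte big-endian schema ID, and then the Avro binary data. The class and method names are hypothetical, and this is not the code from the attached patches.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Hive or the Confluent client.
class ConfluentFramingSketch {
    static final byte MAGIC_BYTE = 0x0;
    static final int HEADER_LENGTH = 5; // 1 magic byte + 4-byte schema ID

    // Reads the schema ID from the header, validating length and magic byte.
    static int schemaId(byte[] kafkaValue) {
        if (kafkaValue.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("Value shorter than the 5-byte header");
        }
        ByteBuffer buf = ByteBuffer.wrap(kafkaValue); // ByteBuffer is big-endian by default
        byte magic = buf.get();
        if (magic != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unexpected magic byte: " + magic);
        }
        return buf.getInt();
    }

    // Returns the bytes after the header, i.e. what a plain Avro decoder would expect.
    static byte[] avroPayload(byte[] kafkaValue) {
        schemaId(kafkaValue); // validates the header first
        return Arrays.copyOfRange(kafkaValue, HEADER_LENGTH, kafkaValue.length);
    }

    public static void main(String[] args) {
        byte[] value = {0, 0, 0, 0, 42, 10, 20, 30};
        System.out.println("schema ID: " + schemaId(value));
        System.out.println("payload:   " + Arrays.toString(avroPayload(value)));
    }
}
```

A deserializer that feeds the whole value to an Avro decoder without skipping these five bytes would misread the header as record data, which matches the failure described in the issue.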




[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL

2020-06-17 Thread YulongZ (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YulongZ updated HIVE-23721:
---
Environment: 
Hadoop 3.1 (1700+ nodes)
YARN 3.1 (with timelineserver enabled, https enabled)
Hive 3.1 (15 HS2 instances)
6+ YARN Applications every day

> MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
> ---
>
> Key: HIVE-23721
> URL: https://issues.apache.org/jira/browse/HIVE-23721
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
> Environment: Hadoop 3.1 (1700+ nodes)
> YARN 3.1 (with timelineserver enabled, https enabled)
> Hive 3.1 (15 HS2 instances)
> 6+ YARN Applications every day
>Reporter: YulongZ
>Priority: Critical
>
> Since Hive 3.0, catalogs have been added to the Hive metastore: many metastore 
> schema tables gained a "catName" column, and the table indexes gained "catName" 
> as well.
> In MetaStoreDirectSql.ensureDbInit(), the two queries below
> "
>   initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
>   initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));
> "
> should use "catName == ''" instead of "dbName == ''", because "catName" is the 
> first index column.
> When the metastore data grows large, for example when the 
> MPartitionColumnStatistics table has millions of rows, the query 
> "newQuery(MPartitionColumnStatistics.class, "dbName == ''")" runs very slowly 
> in the metastore, and "show tables" in HiveServer2 runs very slowly too.
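The argument about the first index column can be illustrated with a sorted map standing in for a composite index on (catName, dbName). This is only an analogy under stated assumptions: the class, the sample catalog and database names, and the key encoding are hypothetical, and how a real metastore database plans these queries depends on the database and its actual index definitions.

```java
import java.util.TreeMap;

// Hypothetical analogy: a TreeMap ordered by "catName<SEP>dbName" plays the role
// of a composite index whose leading column is catName.
class LeadingColumnSketch {
    static final char SEP = '\u0000';

    static TreeMap<String, Integer> buildIndex() {
        TreeMap<String, Integer> index = new TreeMap<>();
        int rowId = 0;
        for (String cat : new String[] {"hive", "spark"}) {
            for (String db : new String[] {"default", "sales", "web"}) {
                index.put(cat + SEP + db, rowId++);
            }
        }
        return index;
    }

    // Filter on the leading column: the matches form one contiguous key range.
    static int rangeSizeForCat(TreeMap<String, Integer> index, String cat) {
        return index.subMap(cat + SEP, cat + (char) (SEP + 1)).size();
    }

    // Filter on the second column only: matches are scattered, so every key is inspected.
    static int[] scanForDb(TreeMap<String, Integer> index, String db) {
        String suffix = SEP + db;
        int visited = 0;
        int matched = 0;
        for (String key : index.keySet()) {
            visited++;
            if (key.endsWith(suffix)) {
                matched++;
            }
        }
        return new int[] {visited, matched};
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> index = buildIndex();
        System.out.println("catName == 'hive' range size: " + rangeSizeForCat(index, "hive"));
        System.out.println("catName == '' range size:     " + rangeSizeForCat(index, ""));
        int[] scan = scanForDb(index, "web");
        System.out.println("dbName == 'web' visited/matched: " + scan[0] + "/" + scan[1]);
    }
}
```

In this model the empty-string filter on the leading column resolves to an empty range without touching any entry, while a filter on the second column alone has to walk all keys, which is the kind of cost the issue attributes to the "dbName == ''" queries on large tables.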




[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL

2020-06-17 Thread YulongZ (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YulongZ updated HIVE-23721:
---
Priority: Critical  (was: Minor)

> MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
> ---
>
> Key: HIVE-23721
> URL: https://issues.apache.org/jira/browse/HIVE-23721
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: YulongZ
>Priority: Critical
>
> Since Hive 3.0, catalogs have been added to the Hive metastore: many metastore 
> schema tables gained a "catName" column, and the table indexes gained "catName" 
> as well.
> In MetaStoreDirectSql.ensureDbInit(), the two queries below
> "
>   initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
>   initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));
> "
> should use "catName == ''" instead of "dbName == ''", because "catName" is the 
> first index column.
> When the metastore data grows large, for example when the 
> MPartitionColumnStatistics table has millions of rows, the query 
> "newQuery(MPartitionColumnStatistics.class, "dbName == ''")" runs very slowly 
> in the metastore, and "show tables" in HiveServer2 runs very slowly too.




[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL

2020-06-17 Thread YulongZ (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YulongZ updated HIVE-23721:
---
Description: 
Since Hive 3.0, catalogs have been added to the Hive metastore: many metastore 
schema tables gained a "catName" column, and the table indexes gained "catName" 
as well.

In MetaStoreDirectSql.ensureDbInit(), the two queries below
"
  initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
  initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));
"
should use "catName == ''" instead of "dbName == ''", because "catName" is the 
first index column.

When the metastore data grows large, for example when the 
MPartitionColumnStatistics table has millions of rows, the query 
"newQuery(MPartitionColumnStatistics.class, "dbName == ''")" runs very slowly 
in the metastore, and "show tables" in HiveServer2 runs very slowly too.

  was:Since Hive 3.0, catalogs have been added to the Hive metastore: many 
metastore schema tables gained a "catName" column, and the table indexes gained 
"catName" as well. For Hive 2.0,


> MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
> ---
>
> Key: HIVE-23721
> URL: https://issues.apache.org/jira/browse/HIVE-23721
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: YulongZ
>Priority: Minor
>
> Since Hive 3.0, catalogs have been added to the Hive metastore: many metastore 
> schema tables gained a "catName" column, and the table indexes gained "catName" 
> as well.
> In MetaStoreDirectSql.ensureDbInit(), the two queries below
> "
>   initQueries.add(pm.newQuery(MTableColumnStatistics.class, "dbName == ''"));
>   initQueries.add(pm.newQuery(MPartitionColumnStatistics.class, "dbName == ''"));
> "
> should use "catName == ''" instead of "dbName == ''", because "catName" is the 
> first index column.
> When the metastore data grows large, for example when the 
> MPartitionColumnStatistics table has millions of rows, the query 
> "newQuery(MPartitionColumnStatistics.class, "dbName == ''")" runs very slowly 
> in the metastore, and "show tables" in HiveServer2 runs very slowly too.




[jira] [Work logged] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?focusedWorklogId=447636&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447636
 ]

ASF GitHub Bot logged work on HIVE-23720:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 04:08
Start Date: 18/Jun/20 04:08
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #1144:
URL: https://github.com/apache/hive/pull/1144


   …g canceled or timeout
   
   





Issue Time Tracking
---

Worklog Id: (was: 447636)
Remaining Estimate: 0h
Time Spent: 10m

> Background task should be interrupted when operation being canceled or timeout
> --
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently SQLOperation cancels the background task only when the condition is 
> met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Updated] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23720:
--
Labels: pull-request-available  (was: )

> Background task should be interrupted when operation being canceled or timeout
> --
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently SQLOperation cancels the background task only when the condition is 
> met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Updated] (HIVE-23721) MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL

2020-06-17 Thread YulongZ (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YulongZ updated HIVE-23721:
---
Description: Since Hive 3.0, catalogs have been added to the Hive metastore: 
many metastore schema tables gained a "catName" column, and the table indexes 
gained "catName" as well. For Hive 2.0,

> MetaStoreDirectSql.ensureDbInit() need to optimize QuerySQL
> ---
>
> Key: HIVE-23721
> URL: https://issues.apache.org/jira/browse/HIVE-23721
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: YulongZ
>Priority: Minor
>
> Since Hive 3.0, catalogs have been added to the Hive metastore: many metastore 
> schema tables gained a "catName" column, and the table indexes gained "catName" 
> as well. For Hive 2.0,




[jira] [Updated] (HIVE-23720) Background task should be interrupted when operation being canceled or timeout

2020-06-17 Thread Zhihua Deng (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-23720:
---
Description: 
Currently SQLOperation cancels the background task only when the condition is 
met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The condition is evaluated to false when state is OperationState.CANCELED or 
OperationState.TIMEDOUT,  but operations in such states should stop the 
background tasks to release resources.

  was:
Currently SQLOperation cancels the background task only when the condition is 
met:

if (shouldRunAsync() && state != OperationState.CANCELED && state != 
OperationState.TIMEDOUT)

The conditions is evaluated to false when state is OperationState.CANCELED or 
OperationState.TIMEDOUT,  but operations in such states should stop the 
background tasks to release resources.


> Background task should be interrupted when operation being canceled or timeout
> --
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>
> Currently SQLOperation cancels the background task only when the condition is 
> met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition is evaluated to false when state is OperationState.CANCELED or 
> OperationState.TIMEDOUT,  but operations in such states should stop the 
> background tasks to release resources.




[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details

2020-06-17 Thread Anishek Agarwal (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anishek Agarwal updated HIVE-23585:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1 merged to master 

> Retrieve replication instance metrics details
> -
>
> Key: HIVE-23585
> URL: https://issues.apache.org/jira/browse/HIVE-23585
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, 
> HIVE-23585.03.patch, Replication Metrics.pdf
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-23708) MergeFileTask.execute() need to close jobclient

2020-06-17 Thread YulongZ (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YulongZ updated HIVE-23708:
---
Attachment: HIVE-23708.patch

> MergeFileTask.execute() need to close jobclient
> ---
>
> Key: HIVE-23708
> URL: https://issues.apache.org/jira/browse/HIVE-23708
> Project: Hive
>  Issue Type: Bug
>Affects Versions: All Versions
> Environment: Hadoop 3.1 (1700+ nodes)
> YARN 3.1 (with timelineserver enabled, https enabled)
> Hive 3.1 (15 HS2 instances)
> 6+ YARN Applications every day
>Reporter: YulongZ
>Priority: Critical
> Attachments: HIVE-23708.patch
>
>
> When YARN uses HTTPS, MergeFileTask causes a growing number of threads named 
> "ReloadingX509TrustManager" in HiveServer2. These threads are never 
> interrupted, and HS2 becomes abnormal.
> In MergeFileTask.execute(DriverContext driverContext), a JobClient is created 
> but not closed, which causes the issue above.
>  




[jira] [Commented] (HIVE-23452) Exception occur when a SQL query across data stored in two relational DB by JDBCStorageHandler with Tez

2020-06-17 Thread De Li (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138970#comment-17138970
 ] 

De Li commented on HIVE-23452:
--

It seems to be resolved by HIVE-20652, but this still needs further testing.

> Exception occur when a SQL query across data stored in two relational DB by 
> JDBCStorageHandler with Tez
> ---
>
> Key: HIVE-23452
> URL: https://issues.apache.org/jira/browse/HIVE-23452
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 3.1.0
>Reporter: De Li
>Priority: Major
>
> An exception occurs when a SQL query spans data stored in two relational DBs 
> via JDBCStorageHandler with Tez. It seems that Tez uses an incorrect JDBC 
> driver; the query works when run with MR. 




[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447602
 ]

ASF GitHub Bot logged work on HIVE-23704:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 02:06
Start Date: 18/Jun/20 02:06
Worklog Time Spent: 10m 
  Work Description: belugabehr opened a new pull request #1127:
URL: https://github.com/apache/hive/pull/1127


   





Issue Time Tracking
---

Worklog Id: (was: 447602)
Time Spent: 1h 10m  (was: 1h)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes the authHeaderBase64 (which is a Base-64 string), converts 
> it into bytes, and then tries to decode those bytes. That is incorrect. It 
> should convert the Base-64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because 
> the string that is being decoded is not actually Base-64 (see attached image). 
> It has a stray space and a colon. Again, the existing code doesn't care 
> because it's not parsing Base-64 text, it is parsing the bytes generated by 
> converting Base-64 text to bytes.
> I'm not sure what effect this has or what security issues this may present, 
> but it's definitely not correct.
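
As a rough illustration of decoding the header string directly, here is a minimal sketch that uses the JDK's `java.util.Base64` instead of the Commons Codec call quoted above. The class and method names are hypothetical, and this is not the patch from the pull request. The basic JDK decoder rejects characters outside the Base-64 alphabet, so a stray space or colon surfaces as an `IllegalArgumentException` rather than being silently tolerated.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AuthHeaderDecodeSketch {

  /**
   * Decodes a Base-64 "user:password" token straight from the header string.
   * Throws IllegalArgumentException if the input is not valid Base-64.
   */
  static String[] decodeCredentials(String authHeaderBase64) {
    byte[] decoded = Base64.getDecoder().decode(authHeaderBase64);
    String authHeaderString = new String(decoded, StandardCharsets.UTF_8);
    // Limit of 2 keeps any ':' inside the password intact (an assumption
    // about the desired behavior, not something the issue specifies).
    return authHeaderString.split(":", 2);
  }

  public static void main(String[] args) {
    String header = Base64.getEncoder()
        .encodeToString("user:secret".getBytes(StandardCharsets.UTF_8));
    String[] creds = decodeCredentials(header);
    System.out.println(creds[0] + " / " + creds[1]);

    try {
      decodeCredentials("dXNlcjpz ZWNyZXQ:"); // stray space and colon
    } catch (IllegalArgumentException e) {
      System.out.println("rejected malformed header");
    }
  }
}
```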




[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447601
 ]

ASF GitHub Bot logged work on HIVE-23704:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 02:05
Start Date: 18/Jun/20 02:05
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #1127:
URL: https://github.com/apache/hive/pull/1127


   





Issue Time Tracking
---

Worklog Id: (was: 447601)
Time Spent: 1h  (was: 50m)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes the authHeaderBase64 (which is a Base-64 string), converts 
> it into bytes, and then tries to decode those bytes. That is incorrect. It 
> should convert the Base-64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because 
> the string that is being decoded is not actually Base-64 (see attached image). 
> It has a stray space and a colon. Again, the existing code doesn't care 
> because it's not parsing Base-64 text, it is parsing the bytes generated by 
> converting Base-64 text to bytes.
> I'm not sure what effect this has or what security issues this may present, 
> but it's definitely not correct.




[jira] [Work logged] (HIVE-20172) StatsUpdater failed with GSS Exception while trying to connect to remote metastore

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20172?focusedWorklogId=447569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447569
 ]

ASF GitHub Bot logged work on HIVE-20172:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 00:24
Start Date: 18/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #400:
URL: https://github.com/apache/hive/pull/400


   





Issue Time Tracking
---

Worklog Id: (was: 447569)
Time Spent: 20m  (was: 10m)

> StatsUpdater failed with GSS Exception while trying to connect to remote 
> metastore
> --
>
> Key: HIVE-20172
> URL: https://issues.apache.org/jira/browse/HIVE-20172
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.1.1
> Environment: Hive-1.2.1,Hive2.1,java8
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20172.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> StatsUpdater task failed with GSS Exception while trying to connect to remote 
> Metastore.
> {code}
> org.apache.thrift.transport.TTransportException: GSS initiate failed 
>     at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
>     at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)
>     at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)
>     at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3526)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3558)
>     at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533)
>     at org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:300)
>     at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
>     at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:177)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
> ) 
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
>     at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
> {code}
> Since the metastore client is running inside HMS, there is no need to connect 
> to a remote URI.




[jira] [Work logged] (HIVE-23238) FIX PreemptionQueueComparator edge cases

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23238?focusedWorklogId=447573=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447573
 ]

ASF GitHub Bot logged work on HIVE-23238:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 00:24
Start Date: 18/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #985:
URL: https://github.com/apache/hive/pull/985#issuecomment-645695696


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 447573)
Time Spent: 20m  (was: 10m)

> FIX PreemptionQueueComparator edge cases
> 
>
> Key: HIVE-23238
> URL: https://issues.apache.org/jira/browse/HIVE-23238
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: llap
>
> Attachments: HIVE-23238.01.patch, HIVE-23238.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Properly handle preemption comparator edge cases where tasks are of the same 
> type and have the same number of upstream tasks.
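
One way to picture the edge case is a comparator that keeps comparing after the type and upstream-task count tie. The sketch below is not the LLAP `PreemptionQueueComparator`: the `Task` fields, the direction of each comparison, and the start-time and id tie-breakers are all hypothetical and chosen only to show a deterministic ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PreemptionOrderSketch {

  /** Hypothetical task record; the field names are illustrative only. */
  static final class Task {
    final String id;
    final boolean finishable;
    final int upstreamTasks;
    final long startTime;

    Task(String id, boolean finishable, int upstreamTasks, long startTime) {
      this.id = id;
      this.finishable = finishable;
      this.upstreamTasks = upstreamTasks;
      this.startTime = startTime;
    }
  }

  /** Orders tasks so that the first element is the first preemption candidate. */
  static final Comparator<Task> PREEMPTION_ORDER = new Comparator<Task>() {
    @Override
    public int compare(Task a, Task b) {
      // 1. Same "type" check: non-finishable tasks come first (assumed rule).
      int c = Boolean.compare(a.finishable, b.finishable);
      if (c != 0) return c;
      // 2. More pending upstream tasks first (assumed rule).
      c = Integer.compare(b.upstreamTasks, a.upstreamTasks);
      if (c != 0) return c;
      // 3. Edge case: same type and same upstream count. Break the tie on
      //    start time (most recently started first), then on id.
      c = Long.compare(b.startTime, a.startTime);
      if (c != 0) return c;
      return a.id.compareTo(b.id);
    }
  };

  /** Returns the task ids in preemption order without mutating the input. */
  static List<String> preemptionOrder(List<Task> tasks) {
    List<Task> sorted = new ArrayList<>(tasks);
    sorted.sort(PREEMPTION_ORDER);
    List<String> ids = new ArrayList<>();
    for (Task t : sorted) {
      ids.add(t.id);
    }
    return ids;
  }
}
```

Without the final tie-breakers, two tasks that agree on the first two criteria would compare as equal, and their relative position in a queue would depend on insertion order.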




[jira] [Work logged] (HIVE-20131) SQL Script changes for creating txn write notification in 3.2.0 files

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20131?focusedWorklogId=447572=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447572
 ]

ASF GitHub Bot logged work on HIVE-20131:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 00:24
Start Date: 18/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #398:
URL: https://github.com/apache/hive/pull/398


   





Issue Time Tracking
---

Worklog Id: (was: 447572)
Time Spent: 20m  (was: 10m)

> SQL Script changes for creating  txn write notification in 3.2.0 files 
> ---
>
> Key: HIVE-20131
> URL: https://issues.apache.org/jira/browse/HIVE-20131
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 4.0.0
>
> Attachments: HIVE-20131.01.patch, HIVE-20131.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> 1. Change the partition name size from 1024 to 767 (MySQL 5.6 and earlier 
> support keys with a maximum length of 767).
> 2. Remove the txn_write_notification_log table creation from the 3.1.0 
> scripts and add new scripts for 3.2.0.
> 3. Remove the 3.1.0-to-4.0.0 file and instead add files for 3.2.0-to-4.0.0 and 
> 3.1.0-to-3.2.0.
> 4. Change the metastore init schema XML file to take 4.0.0 instead of 3.1.0 
> as the current version.




[jira] [Work logged] (HIVE-20044) Arrow Serde should pad char values and handle empty strings correctly

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20044?focusedWorklogId=447570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447570
 ]

ASF GitHub Bot logged work on HIVE-20044:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 00:24
Start Date: 18/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #397:
URL: https://github.com/apache/hive/pull/397


   





Issue Time Tracking
---

Worklog Id: (was: 447570)
Time Spent: 0.5h  (was: 20m)

> Arrow Serde should pad char values and handle empty strings correctly
> -
>
> Key: HIVE-20044
> URL: https://issues.apache.org/jira/browse/HIVE-20044
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20044.1.branch-3.patch, HIVE-20044.1.patch, 
> HIVE-20044.1.patch, HIVE-20044.2.patch, HIVE-20044.3.patch, 
> HIVE-20044.3.patch, HIVE-20044.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When Arrow Serde serializes char values, it loses padding. Also, when it 
> counts empty strings, it sometimes reports a smaller number. It should pad 
> char values and handle empty strings correctly.
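
The padding requirement can be shown with a small helper. This is only a sketch of what "pad char values" means for a fixed-length `char(n)` column, not the Arrow Serde code; `padChar` is a hypothetical name, and length is counted in Java `char` units for simplicity.

```java
public class CharPaddingSketch {

  /**
   * Pads a value with trailing spaces up to the declared char(n) length.
   * Values that are already long enough are returned unchanged.
   */
  static String padChar(String value, int length) {
    if (value == null) {
      return null;
    }
    StringBuilder sb = new StringBuilder(value);
    while (sb.length() < length) {
      sb.append(' ');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Brackets make the trailing spaces visible.
    System.out.println("[" + padChar("ab", 5) + "]");
    System.out.println("[" + padChar("", 3) + "]");
  }
}
```

Note that an empty string still yields a value of the full declared length, which is one reason empty strings deserve explicit handling.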




[jira] [Work logged] (HIVE-20044) Arrow Serde should pad char values and handle empty strings correctly

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20044?focusedWorklogId=447571=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447571
 ]

ASF GitHub Bot logged work on HIVE-20044:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 00:24
Start Date: 18/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #396:
URL: https://github.com/apache/hive/pull/396


   





Issue Time Tracking
---

Worklog Id: (was: 447571)
Time Spent: 40m  (was: 0.5h)

> Arrow Serde should pad char values and handle empty strings correctly
> -
>
> Key: HIVE-20044
> URL: https://issues.apache.org/jira/browse/HIVE-20044
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20044.1.branch-3.patch, HIVE-20044.1.patch, 
> HIVE-20044.1.patch, HIVE-20044.2.patch, HIVE-20044.3.patch, 
> HIVE-20044.3.patch, HIVE-20044.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When Arrow Serde serializes char values, it loses padding. Also, when it 
> counts empty strings, it sometimes reports a smaller number. It should pad 
> char values and handle empty strings correctly.




[jira] [Work logged] (HIVE-20070) ptest optimization - Replicate ACID/MM tables write operations.

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20070?focusedWorklogId=447574=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447574
 ]

ASF GitHub Bot logged work on HIVE-20070:
-

Author: ASF GitHub Bot
Created on: 18/Jun/20 00:24
Start Date: 18/Jun/20 00:24
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #395:
URL: https://github.com/apache/hive/pull/395


   





Issue Time Tracking
---

Worklog Id: (was: 447574)
Time Spent: 20m  (was: 10m)

> ptest optimization  - Replicate ACID/MM tables write operations.
> 
>
> Key: HIVE-20070
> URL: https://issues.apache.org/jira/browse/HIVE-20070
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl, Transactions
>Affects Versions: 3.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 4.0.0
>
> Attachments: HIVE-20070.01.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Change the test to do an incremental replication for each operation, instead 
> of only one incremental replication at the end.




[jira] [Work logged] (HIVE-23717) In jdbcUrl add config to create External + purge table by default

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23717?focusedWorklogId=447559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447559
 ]

ASF GitHub Bot logged work on HIVE-23717:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 23:00
Start Date: 17/Jun/20 23:00
Worklog Time Spent: 10m 
  Work Description: xiaomengzhang opened a new pull request #1143:
URL: https://github.com/apache/hive/pull/1143


   …efault
   
   External + purge tables are more backward compatible with the old
   managed tables in CDH and HDP 2.
   So add a JDBC config "defaultExternalTable". When the value is true,
   set "hive.create.as.acid" and "hive.create.as.insert.only" to false
   at the session level. In such a session, a table created by default is
   an external purge table.
   
   Test:
   Add unit test testDefaultExternal in TestJdbcDriver2.java
   
   Change-Id: I3b1adc3eb63596ebc1955d116498745fa9356547
   
   





Issue Time Tracking
---

Worklog Id: (was: 447559)
Remaining Estimate: 0h
Time Spent: 10m

> In jdbcUrl add config to create External + purge table by default 
> --
>
> Key: HIVE-23717
> URL: https://issues.apache.org/jira/browse/HIVE-23717
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Affects Versions: 3.1.0
>Reporter: Xiaomeng Zhang
>Assignee: Xiaomeng Zhang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> External + purge tables are more backward compatible with the old managed 
> tables.
> Applications can use an HS2 URL that sets a session-level property making 
> external-purge tables the default table type.
> As part of this we need a notion of "session level only" config 
> parameter(s).
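
The mapping described in the pull request can be sketched as a small function from URL parameters to session-level overrides. The parameter name `defaultExternalTable` and the two `hive.create.as.*` properties come from the pull request text; the class name, the method name, and the idea of passing the parameters as a map are assumptions for illustration, not the actual HiveServer2 code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DefaultExternalTableSketch {

  /**
   * Translates connection parameters into session-level config overrides.
   * Returns an empty map when defaultExternalTable is absent or not "true".
   */
  static Map<String, String> sessionOverrides(Map<String, String> urlParams) {
    Map<String, String> overrides = new LinkedHashMap<>();
    // Boolean.parseBoolean(null) is false, so a missing parameter is a no-op.
    if (Boolean.parseBoolean(urlParams.get("defaultExternalTable"))) {
      overrides.put("hive.create.as.acid", "false");
      overrides.put("hive.create.as.insert.only", "false");
    }
    return overrides;
  }

  public static void main(String[] args) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("defaultExternalTable", "true");
    System.out.println(sessionOverrides(params));
  }
}
```

Keeping the overrides in a separate map makes it easy to apply them to one session only, which matches the "session level only" notion mentioned in the issue.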




[jira] [Updated] (HIVE-23717) In jdbcUrl add config to create External + purge table by default

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23717:
--
Labels: pull-request-available  (was: )

> In jdbcUrl add config to create External + purge table by default 
> --
>
> Key: HIVE-23717
> URL: https://issues.apache.org/jira/browse/HIVE-23717
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Affects Versions: 3.1.0
>Reporter: Xiaomeng Zhang
>Assignee: Xiaomeng Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> External + purge tables are more backward compatible with the old managed 
> tables.
> Applications can use an HS2 URL that sets a session-level property making 
> external-purge tables the default table type.
> As part of this we need a notion of "session level only" config 
> parameter(s).




[jira] [Updated] (HIVE-23718) Extract transaction handling from Driver

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23718:
--
Labels: pull-request-available  (was: )

> Extract transaction handling from Driver
> 
>
> Key: HIVE-23718
> URL: https://issues.apache.org/jira/browse/HIVE-23718
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-23718) Extract transaction handling from Driver

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23718?focusedWorklogId=447547=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447547
 ]

ASF GitHub Bot logged work on HIVE-23718:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 21:58
Start Date: 17/Jun/20 21:58
Worklog Time Spent: 10m 
  Work Description: miklosgergely opened a new pull request #1142:
URL: https://github.com/apache/hive/pull/1142


   





Issue Time Tracking
---

Worklog Id: (was: 447547)
Remaining Estimate: 0h
Time Spent: 10m

> Extract transaction handling from Driver
> 
>
> Key: HIVE-23718
> URL: https://issues.apache.org/jira/browse/HIVE-23718
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Assigned] (HIVE-23718) Extract transaction handling from Driver

2020-06-17 Thread Miklos Gergely (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-23718:
-


> Extract transaction handling from Driver
> 
>
> Key: HIVE-23718
> URL: https://issues.apache.org/jira/browse/HIVE-23718
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>





[jira] [Work logged] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23467?focusedWorklogId=447523=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447523
 ]

ASF GitHub Bot logged work on HIVE-23467:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 20:52
Start Date: 17/Jun/20 20:52
Worklog Time Spent: 10m 
  Work Description: sam-an-cloudera commented on a change in pull request 
#1133:
URL: https://github.com/apache/hive/pull/1133#discussion_r441826009



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
##
@@ -105,6 +105,7 @@ protected DateFormat initialValue() {
   public static final String DB_EMPTY_MARKER = "!";
 
   public static final String EXTERNAL_TABLE_PURGE = "external.table.purge";
+  public static final String EXTERNAL_TABLE_AUTODELETE = "external.table.autodelete";

Review comment:
   After discussion with @nrg4878 and @thejasmn , we thought it's best that 
we don't add this new option because "external.table.purge" is widely used. 
Introducing the new autodelete might cause confusion. 







Issue Time Tracking
---

Worklog Id: (was: 447523)
Time Spent: 0.5h  (was: 20m)

> Add a skip.trash config for HMS to skip trash when deleting external table 
> data
> ---
>
> Key: HIVE-23467
> URL: https://issues.apache.org/jira/browse/HIVE-23467
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Sam An
>Assignee: Yu-Wen Lai
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We have an auto.purge flag, which means skip trash. It can be confusing, as we 
> also have 'external.table.purge'='true' to indicate that table data is deleted 
> when this table property is set. 
> We should make the meaning clearer by introducing a skip trash alias/option. 
> Additionally, we shall add an alias for external.table.purge, name it 
> external.table.autodelete, and document it more prominently, so as to 
> maintain backward compatibility and make the meaning of auto deletion of 
> data more obvious. 
> The net effect of these 2 changes will be as follows. If the user sets 
> 'external.table.autodelete'='true', 
> the table data will be removed when the table is dropped. If 
> 'skip.trash'='true' 
> is set, HMS will not move the table data to the trash folder when removing the 
> files. This will result in faster removal, especially when the underlying FS is 
> S3. 
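
The proposed semantics can be summarized as a decision over table properties. The property names are taken from the issue text; the enum, the method names, and the choice to treat `auto.purge` and `skip.trash` as equivalent are assumptions made for this sketch, and it does not reflect what was finally merged.

```java
import java.util.HashMap;
import java.util.Map;

public class ExternalTableDropSketch {

  /** What happens to an external table's data when the table is dropped. */
  enum DataAction { KEEP_DATA, MOVE_TO_TRASH, DELETE_SKIPPING_TRASH }

  static DataAction onDrop(Map<String, String> tblProps) {
    // Either the existing property or its proposed alias enables deletion.
    boolean deleteData = isTrue(tblProps, "external.table.purge")
        || isTrue(tblProps, "external.table.autodelete");
    if (!deleteData) {
      return DataAction.KEEP_DATA;
    }
    // Either the existing flag or its proposed alias bypasses the trash.
    boolean skipTrash = isTrue(tblProps, "auto.purge")
        || isTrue(tblProps, "skip.trash");
    return skipTrash ? DataAction.DELETE_SKIPPING_TRASH : DataAction.MOVE_TO_TRASH;
  }

  private static boolean isTrue(Map<String, String> props, String key) {
    return Boolean.parseBoolean(props.get(key));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("external.table.autodelete", "true");
    props.put("skip.trash", "true");
    System.out.println(onDrop(props));
  }
}
```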




[jira] [Commented] (HIVE-23612) Option for HiveStrictManagedMigration to impersonate a user for FS operations

2020-06-17 Thread Jason Dere (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138822#comment-17138822
 ] 

Jason Dere commented on HIVE-23612:
---

+1

> Option for HiveStrictManagedMigration to impersonate a user for FS operations
> -
>
> Key: HIVE-23612
> URL: https://issues.apache.org/jira/browse/HIVE-23612
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23612.0.patch, HIVE-23612.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The HiveStrictManagedMigration tool can be used to move HDFS paths and to 
> change ownership on said paths. It may be beneficial to do such file system 
> operations as a different user than the one running the tool.
> Moreover, while creating the external DB directory, the tool will chown the 
> new directory to the user set as DB owner in HMS. If this is unset, no chown 
> command is used. In this case we should make the 'hive' user the directory 
> owner.




[jira] [Assigned] (HIVE-23717) In jdbcUrl add config to create External + purge table by default

2020-06-17 Thread Xiaomeng Zhang (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Zhang reassigned HIVE-23717:
-


> In jdbcUrl add config to create External + purge table by default 
> --
>
> Key: HIVE-23717
> URL: https://issues.apache.org/jira/browse/HIVE-23717
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Affects Versions: 3.1.0
>Reporter: Xiaomeng Zhang
>Assignee: Xiaomeng Zhang
>Priority: Major
>
> External + purge tables are more backward compatible with the old managed 
> tables.
> Applications can use an HS2 URL that sets a session-level property making 
> external-purge tables the default table type.
> As part of this we need a notion of "session level only" config 
> parameter(s).




[jira] [Work logged] (HIVE-23467) Add a skip.trash config for HMS to skip trash when deleting external table data

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23467?focusedWorklogId=447466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447466
 ]

ASF GitHub Bot logged work on HIVE-23467:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 19:11
Start Date: 17/Jun/20 19:11
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1133:
URL: https://github.com/apache/hive/pull/1133#issuecomment-645567880


   So if the goal is to deprecate "external.table.purge" going forward in favor 
of "external.table.autodelete", can we also remove all the current references 
in the code and convert them to the new property? All new tables should then 
be created using the new property.





Issue Time Tracking
---

Worklog Id: (was: 447466)
Time Spent: 20m  (was: 10m)

> Add a skip.trash config for HMS to skip trash when deleting external table 
> data
> ---
>
> Key: HIVE-23467
> URL: https://issues.apache.org/jira/browse/HIVE-23467
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Sam An
>Assignee: Yu-Wen Lai
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have an auto.purge flag, which means skip trash. It can be confusing, as we 
> also have 'external.table.purge'='true' to indicate that table data is deleted 
> when this table property is set. 
> We should make the meaning clearer by introducing a skip trash alias/option. 
> Additionally, we shall add an alias for external.table.purge, name it 
> external.table.autodelete, and document it more prominently, so as to 
> maintain backward compatibility and make the meaning of auto deletion of 
> data more obvious. 
> The net effect of these 2 changes will be as follows. If the user sets 
> 'external.table.autodelete'='true', 
> the table data will be removed when the table is dropped. If 
> 'skip.trash'='true' 
> is set, HMS will not move the table data to the trash folder when removing the 
> files. This will result in faster removal, especially when the underlying FS is 
> S3. 




[jira] [Updated] (HIVE-23715) Fix zookeeper ssl keystore password handling issues

2020-06-17 Thread Peter Varga (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Varga updated HIVE-23715:
---
Status: Patch Available  (was: In Progress)

> Fix zookeeper ssl keystore password handling issues
> ---
>
> Key: HIVE-23715
> URL: https://issues.apache.org/jira/browse/HIVE-23715
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In HIVE-23045 Zookeeper SSL communication support was introduced, but the 
> password config for the keystore and truststore is not handled correctly if 
> they are stored in jceks.
> Also, the ZooKeeperTokenStore does not handle the fallback to the global 
> ZooKeeper configurations well.




[jira] [Work logged] (HIVE-23715) Fix zookeeper ssl keystore password handling issues

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23715?focusedWorklogId=447445=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447445
 ]

ASF GitHub Bot logged work on HIVE-23715:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 18:12
Start Date: 17/Jun/20 18:12
Worklog Time Spent: 10m 
  Work Description: pvargacl opened a new pull request #1141:
URL: https://github.com/apache/hive/pull/1141


   
   HIVE-23715: Fix zookeeper ssl keystore password handling issues





Issue Time Tracking
---

Worklog Id: (was: 447445)
Remaining Estimate: 0h
Time Spent: 10m

> Fix zookeeper ssl keystore password handling issues
> ---
>
> Key: HIVE-23715
> URL: https://issues.apache.org/jira/browse/HIVE-23715
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In HIVE-23045 Zookeeper SSL communication support was introduced, but the 
> password config for the keystore and truststore is not handled correctly if 
> they are stored in jceks.
> Also, the ZooKeeperTokenStore does not handle the fallback to the global 
> ZooKeeper configurations well.




[jira] [Updated] (HIVE-23715) Fix zookeeper ssl keystore password handling issues

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23715:
--
Labels: pull-request-available  (was: )

> Fix zookeeper ssl keystore password handling issues
> ---
>
> Key: HIVE-23715
> URL: https://issues.apache.org/jira/browse/HIVE-23715
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In HIVE-23045 Zookeeper SSL communication support was introduced, but the 
> password config for the keystore and truststore is not handled correctly if 
> they are stored in jceks.
> Also, the ZooKeeperTokenStore does not handle the fallback to the global 
> ZooKeeper configurations well.




[jira] [Assigned] (HIVE-23716) Support Anti Join in Hive

2020-06-17 Thread mahesh kumar behera (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-23716:
--


> Support Anti Join in Hive 
> --
>
> Key: HIVE-23716
> URL: https://issues.apache.org/jira/browse/HIVE-23716
> Project: Hive
>  Issue Type: Bug
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> Currently Hive does not support anti join. A query for an anti join is 
> converted to a left outer join, and a null filter on the right-side join key 
> is added to get the desired result. This causes:
>  # Extra computation: the left outer join projects the redundant columns 
> from the right side. Along with that, filtering is done to remove the redundant 
> rows. This can be avoided with an anti join, as an anti join will project 
> only the required columns and rows from the left-side table.
>  # Extra shuffle: with an anti join, moving duplicate records to the join 
> node can be avoided at the child node. This can significantly reduce the amount 
> of data movement if the number of distinct rows (join keys) is significant.
>  # Extra memory usage: for a map-based anti join, a hash set is 
> sufficient, as just the key is required to check whether a record matches the 
> join condition. For a left join, we need the key and the non-key columns 
> as well, and thus a hash table is required.
> For a query like
> {code:java}
>  select wr_order_number FROM web_returns LEFT JOIN web_sales  ON 
> wr_order_number = ws_order_number WHERE ws_order_number IS NULL;{code}
> The number of distinct ws_order_number values in the web_sales table in a 
> typical 10TB TPCDS setup is just 10% of the total records. So when we convert 
> this query to an anti join, only 600 million rows, instead of 7 billion, are 
> moved to the join node.
> In the current patch, just one conversion is done. The pattern 
> project->filter->left-join is converted to project->anti-join. This takes 
> care of subqueries with a “not exists” clause. Queries with “not exists” 
> are converted first to filter + left-join and then to anti 
> join. Queries with “not in” are not handled in the current patch.
> On the execution side, both merge join and map join with vectorized execution 
> are supported for anti join.
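
To make the memory and computation argument above concrete, here is a minimal in-memory Java sketch. It is not Hive's operator code, and all names are hypothetical. It contrasts an anti join, which only needs a set of right-side keys, with a left outer join followed by a null filter, which needs a key-to-rows table and materializes matched rows only to discard them.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AntiJoinSketch {
    // Anti join: emit left keys with no match on the right.
    // Only the right-side join keys are needed, so a HashSet is enough.
    static List<Integer> antiJoin(List<Integer> leftKeys, Collection<Integer> rightKeys) {
        Set<Integer> seen = new HashSet<>(rightKeys);
        List<Integer> out = new ArrayList<>();
        for (Integer k : leftKeys) {
            if (!seen.contains(k)) {
                out.add(k);
            }
        }
        return out;
    }

    // Left outer join followed by a "right side IS NULL" filter. This needs a
    // hash table from key to right-side rows and builds rows the filter drops.
    static List<Integer> leftJoinThenNullFilter(List<Integer> leftKeys,
                                                Map<Integer, List<String>> rightRows) {
        List<SimpleEntry<Integer, String>> joined = new ArrayList<>();
        for (Integer k : leftKeys) {
            List<String> matches = rightRows.getOrDefault(k, Collections.emptyList());
            if (matches.isEmpty()) {
                joined.add(new SimpleEntry<>(k, null)); // unmatched: right side is NULL
            }
            for (String payload : matches) {
                joined.add(new SimpleEntry<>(k, payload)); // matched rows, dropped below
            }
        }
        List<Integer> out = new ArrayList<>();
        for (SimpleEntry<Integer, String> row : joined) {
            if (row.getValue() == null) {
                out.add(row.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> returns = Arrays.asList(1, 2, 3, 4);
        Map<Integer, List<String>> sales = new HashMap<>();
        sales.put(2, Arrays.asList("a", "b"));
        sales.put(4, Arrays.asList("c"));
        System.out.println(antiJoin(returns, sales.keySet()));
        System.out.println(leftJoinThenNullFilter(returns, sales));
    }
}
```

Both methods return the same keys. The difference is in what has to be held in memory and shipped to the join, which is the point of the three items above.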




[jira] [Updated] (HIVE-23698) Compiler support for row-level filtering on filterPredicates

2020-06-17 Thread Panagiotis Garefalakis (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-23698:
--
Description: 
Similar to what we currently do for StorageHandlers, we should pushdown the 
static expression for row-level filtering when the file-format supports the 
feature (ORC).
I propose to split the  filterExpr to residual and pushed predicate. If 
predicate is completely pushed then we remove the operator.

  was:
Similar to what we currently do for StorageHandlers, we should pushdown the 
static expression for row-level filtering when the file-format supports the 
feature (ORC).

I propose to split the  filterExpr to residual and pushed predicate. If 
predicate is completely pushed then we remove the operator.
If it is only partially pushed, we do not update the filter, as that could trigger 
constantFolding (and thus clear existing TsFilters).


> Compiler support for row-level filtering on filterPredicates
> 
>
> Key: HIVE-23698
> URL: https://issues.apache.org/jira/browse/HIVE-23698
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Similar to what we currently do for StorageHandlers, we should pushdown the 
> static expression for row-level filtering when the file-format supports the 
> feature (ORC).
> I propose to split the  filterExpr to residual and pushed predicate. If 
> predicate is completely pushed then we remove the operator.




[jira] [Updated] (HIVE-23698) Compiler support for row-level filtering on filterPredicates

2020-06-17 Thread Panagiotis Garefalakis (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-23698:
--
Description: 
Similar to what we currently do for StorageHandlers, we should pushdown the 
static expression for row-level filtering when the file-format supports the 
feature (ORC).

I propose to split the  filterExpr to residual and pushed predicate. If 
predicate is completely pushed then we remove the operator.
If it is only partially pushed, we do not update the filter, as that could trigger 
constantFolding (and thus clear existing TsFilters).

> Compiler support for row-level filtering on filterPredicates
> 
>
> Key: HIVE-23698
> URL: https://issues.apache.org/jira/browse/HIVE-23698
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Similar to what we currently do for StorageHandlers, we should pushdown the 
> static expression for row-level filtering when the file-format supports the 
> feature (ORC).
> I propose to split the  filterExpr to residual and pushed predicate. If 
> predicate is completely pushed then we remove the operator.
> If it is only partially pushed, we do not update the filter, as that could trigger 
> constantFolding (and thus clear existing TsFilters).




[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447430=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447430
 ]

ASF GitHub Bot logged work on HIVE-23704:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 17:50
Start Date: 17/Jun/20 17:50
Worklog Time Spent: 10m 
  Work Description: belugabehr edited a comment on pull request #1127:
URL: https://github.com/apache/hive/pull/1127#issuecomment-645524894


   Existing code works because the Commons Codec Base64 implementation ignores 
invalid characters:
   
   
https://github.com/apache/commons-codec/blob/41c6f486fd4f5c2450c9311c40dbbf7e576d2907/src/main/java/org/apache/commons/codec/binary/Base64.java#L621





Issue Time Tracking
---

Worklog Id: (was: 447430)
Time Spent: 50m  (was: 40m)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes authHeaderBase64 (which is a Base64 string), 
> converts it into bytes, and then tries to decode those bytes. That is 
> incorrect. It should convert the Base64 string directly into bytes.
> I tried to do this as part of [HIVE-22676], and the tests were failing because 
> the string being decoded is not actually Base64 (see attached image). 
> It has a stray space and a colon. Again, the existing code doesn't care, 
> because it's not parsing Base64 text; it is parsing the bytes generated by 
> converting Base64 text to bytes.
> I'm not sure what effect this has or what security issues this may present, but 
> it's definitely not correct.
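
The difference between a decoder that skips non-Base64 characters and one that rejects them can be reproduced with the JDK alone. The sketch below uses java.util.Base64 as a stand-in, not the Commons Codec class that ThriftHttpServlet calls: the JDK's MIME decoder ignores characters outside the Base64 alphabet, while its basic decoder throws IllegalArgumentException. The header value is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64StrictnessSketch {
    // Lenient decoding: characters outside the Base64 alphabet are skipped.
    static String decodeLenient(String s) {
        return new String(Base64.getMimeDecoder().decode(s), StandardCharsets.UTF_8);
    }

    // Strict decoding: any character outside the alphabet is rejected.
    static boolean strictAccepts(String s) {
        try {
            Base64.getDecoder().decode(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String clean = "dXNlcjpwYXNz";     // Base64 of "user:pass"
        String dirty = " dXNlcjpw:YXNz";   // same text with a stray space and colon
        System.out.println(decodeLenient(dirty));
        System.out.println(strictAccepts(clean));
        System.out.println(strictAccepts(dirty));
    }
}
```

A lenient decoder hides malformed input like this, which would explain why tests only start failing once a stricter decoding path is used.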




[jira] [Work logged] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23704?focusedWorklogId=447429=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447429
 ]

ASF GitHub Bot logged work on HIVE-23704:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 17:49
Start Date: 17/Jun/20 17:49
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #1127:
URL: https://github.com/apache/hive/pull/1127#issuecomment-645524894


   Existing code works because the Commons Codec Base64 implementation ignores 
invalid characters:
   
   
https://github.com/apache/commons-codec/blob/41c6f486fd4f5c2450c9311c40dbbf7e576d2907/src/main/java/org/apache/commons/codec/binary/Base64.java#L640





Issue Time Tracking
---

Worklog Id: (was: 447429)
Time Spent: 40m  (was: 0.5h)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes authHeaderBase64 (which is a Base64 string), 
> converts it into bytes, and then tries to decode those bytes. That is 
> incorrect. It should convert the Base64 string directly into bytes.
> I tried to do this as part of [HIVE-22676], and the tests were failing because 
> the string being decoded is not actually Base64 (see attached image). 
> It has a stray space and a colon. Again, the existing code doesn't care, 
> because it's not parsing Base64 text; it is parsing the bytes generated by 
> converting Base64 text to bytes.
> I'm not sure what effect this has or what security issues this may present, but 
> it's definitely not correct.




[jira] [Work logged] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23700?focusedWorklogId=447427=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447427
 ]

ASF GitHub Bot logged work on HIVE-23700:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 17:45
Start Date: 17/Jun/20 17:45
Worklog Time Spent: 10m 
  Work Description: frankgh opened a new pull request #1140:
URL: https://github.com/apache/hive/pull/1140


   Handle IllegalArgumentExceptions thrown by the File constructor when the
   jar URI is not supported. This fixes the static initialization of the
   HiveConf class when four conditions are met:
   
   1. hive-site.xml is not present on the classpath
   2. hive-site.xml is not present on the "HIVE_CONF_DIR" directory
   3. hive-site.xml is not present on the "HIVE_HOME" directory
   4. the jar URI is not absolute, is opaque, has a scheme that is null or not
      "file", has a non-null authority, fragment, or query, or has an empty
      path.
   





Issue Time Tracking
---

Worklog Id: (was: 447427)
Remaining Estimate: 119.5h  (was: 119h 40m)
Time Spent: 0.5h  (was: 20m)

> HiveConf static initialization fails when JAR URI is opaque
> ---
>
> Key: HIVE-23700
> URL: https://issues.apache.org/jira/browse/HIVE-23700
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.7
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23700.1.patch
>
>   Original Estimate: 120h
>  Time Spent: 0.5h
>  Remaining Estimate: 119.5h
>
> HiveConf static initialization fails when the jar URI is opaque, for example 
> when it's embedded as a fat jar in a spring boot application. Then 
> initialization of the HiveConf static block fails and the HiveConf class does 
> not get classloaded. The opaque URI in my case looks like this 
> _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_
> HiveConf#findConfigFile should be able to handle the `IllegalArgumentException` 
> that the `File` constructor throws when given such a jar `URI`.
> To surface this issue three conditions need to be met.
> 1. hive-site.xml should not be on the classpath
> 2. hive-site.xml should not be on "HIVE_CONF_DIR"
> 3. hive-site.xml should not be on "HIVE_HOME"
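
The failure mode can be reproduced outside Hive with a few lines of Java. This is only a sketch of the handling suggested above, not the patch itself: the helper name is hypothetical, and the URI is the example from the report. `new File(URI)` throws `IllegalArgumentException` for opaque URIs, so the helper turns that into an empty result.

```java
import java.io.File;
import java.net.URI;
import java.util.Optional;

class OpaqueJarUriSketch {
    // Hypothetical helper: returns the File for a URI, or empty when the
    // File(URI) constructor rejects it (opaque, non-"file" scheme, and so on).
    static Optional<File> toFile(URI uri) {
        try {
            return Optional.of(new File(uri));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        URI nested = URI.create(
            "jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/");
        System.out.println(nested.isOpaque());
        System.out.println(toFile(nested).isPresent());
        System.out.println(toFile(URI.create("file:/tmp/hive-site.xml")).isPresent());
    }
}
```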




[jira] [Work logged] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23700?focusedWorklogId=447415=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447415
 ]

ASF GitHub Bot logged work on HIVE-23700:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 17:25
Start Date: 17/Jun/20 17:25
Worklog Time Spent: 10m 
  Work Description: frankgh opened a new pull request #1139:
URL: https://github.com/apache/hive/pull/1139


   Handle IllegalArgumentExceptions thrown by the File constructor when the
   jar URI is not supported. This fixes the static initialization of the
   HiveConf class when four conditions are met:
   
   1. hive-site.xml is not present on the classpath
   2. hive-site.xml is not present on the "HIVE_CONF_DIR" directory
   3. hive-site.xml is not present on the "HIVE_HOME" directory
   4. the jar URI is not absolute, is opaque, has a scheme that is null or not
      "file", has a non-null authority, fragment, or query, or has an empty
      path.





Issue Time Tracking
---

Worklog Id: (was: 447415)
Remaining Estimate: 119h 40m  (was: 119h 50m)
Time Spent: 20m  (was: 10m)

> HiveConf static initialization fails when JAR URI is opaque
> ---
>
> Key: HIVE-23700
> URL: https://issues.apache.org/jira/browse/HIVE-23700
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.7
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23700.1.patch
>
>   Original Estimate: 120h
>  Time Spent: 20m
>  Remaining Estimate: 119h 40m
>
> HiveConf static initialization fails when the jar URI is opaque, for example 
> when it's embedded as a fat jar in a spring boot application. Then 
> initialization of the HiveConf static block fails and the HiveConf class does 
> not get classloaded. The opaque URI in my case looks like this 
> _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_
> HiveConf#findConfigFile should be able to handle the `IllegalArgumentException` 
> that the `File` constructor throws when given such a jar `URI`.
> To surface this issue three conditions need to be met.
> 1. hive-site.xml should not be on the classpath
> 2. hive-site.xml should not be on "HIVE_CONF_DIR"
> 3. hive-site.xml should not be on "HIVE_HOME"




[jira] [Updated] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name

2020-06-17 Thread Thomas Poepping (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping updated HIVE-22928:
---
Status: Patch Available  (was: In Progress)

Resubmitting .5.patch as .6.patch; hopefully the PreCommit job picks it up this 
time.

> Allow hive.exec.stagingdir to be a fully qualified directory name
> -
>
> Key: HIVE-22928
> URL: https://issues.apache.org/jira/browse/HIVE-22928
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Hive
>Affects Versions: 3.1.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>Priority: Minor
> Attachments: HIVE-22928.2.patch, HIVE-22928.3.patch, 
> HIVE-22928.4.patch, HIVE-22928.5.patch, HIVE-22928.6.patch, HIVE-22928.patch
>
>
> Currently, {{hive.exec.stagingdir}} can only be set as a relative directory 
> name that, for operations like {{insert}} or {{insert overwrite}}, will be 
> placed either under the table directory or the partition directory. 
> For cases where an HDFS cluster is small but the data being inserted is very 
> large (greater than the capacity of the HDFS cluster, as mentioned in a 
> comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their 
> staging directory to be an explicit blobstore path (or any filesystem path), 
> rather than relying on Hive to intelligently build the blobstore path based 
> on an interpretation of the job. We may lose locality guarantees, but because 
> renames are just as expensive on blobstores no matter what the prefix is, 
> this isn't considered a terribly large loss (assuming only blobstore 
> customers use this functionality).
> Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually 
> suffice in this case, as the stagingdir is not the same.
> This commit enables Hive customers to set an absolute location for all 
> staging directories. For instances where the configured stagingdir scheme is 
> not the same as the scheme for the table location, the default stagingdir 
> configuration is used. This avoids a cross-filesystem rename, which is 
> impossible anyway.
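
The fallback rule in the last paragraph above can be sketched as follows. This is not the patch's code: the class and method names are hypothetical, the default name ".hive-staging" is an assumption, and the sketch only distinguishes scheme-qualified values from relative names.

```java
import java.net.URI;

class StagingDirSketch {
    // Assumed default relative staging directory name.
    static final String DEFAULT_STAGING_DIR = ".hive-staging";

    // Hypothetical resolver: picks the staging location for a table.
    static URI resolve(String configured, URI tableLocation) {
        URI cfg = URI.create(configured);
        if (cfg.getScheme() == null) {
            // Relative name: placed under the table (or partition) directory.
            return child(tableLocation, configured);
        }
        if (cfg.getScheme().equalsIgnoreCase(tableLocation.getScheme())) {
            // Same filesystem scheme: use the fully qualified directory as-is.
            return cfg;
        }
        // Different scheme: fall back to the default to avoid a
        // cross-filesystem rename.
        return child(tableLocation, DEFAULT_STAGING_DIR);
    }

    private static URI child(URI parent, String name) {
        String base = parent.toString();
        return URI.create(base.endsWith("/") ? base + name : base + "/" + name);
    }

    public static void main(String[] args) {
        URI table = URI.create("hdfs://nn/warehouse/t");
        System.out.println(resolve(".hive-staging", table));
        System.out.println(resolve("s3a://bucket/staging", table));
        System.out.println(resolve("s3a://bucket/staging", URI.create("s3a://bucket/warehouse/t")));
    }
}
```

Comparing only the scheme is a simplification. A real implementation would likely also need to compare the authority, since two locations with the same scheme can still be different filesystems.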




[jira] [Updated] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name

2020-06-17 Thread Thomas Poepping (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping updated HIVE-22928:
---
Status: In Progress  (was: Patch Available)

> Allow hive.exec.stagingdir to be a fully qualified directory name
> -
>
> Key: HIVE-22928
> URL: https://issues.apache.org/jira/browse/HIVE-22928
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Hive
>Affects Versions: 3.1.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>Priority: Minor
> Attachments: HIVE-22928.2.patch, HIVE-22928.3.patch, 
> HIVE-22928.4.patch, HIVE-22928.5.patch, HIVE-22928.6.patch, HIVE-22928.patch
>
>
> Currently, {{hive.exec.stagingdir}} can only be set as a relative directory 
> name that, for operations like {{insert}} or {{insert overwrite}}, will be 
> placed either under the table directory or the partition directory. 
> For cases where an HDFS cluster is small but the data being inserted is very 
> large (greater than the capacity of the HDFS cluster, as mentioned in a 
> comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their 
> staging directory to be an explicit blobstore path (or any filesystem path), 
> rather than relying on Hive to intelligently build the blobstore path based 
> on an interpretation of the job. We may lose locality guarantees, but because 
> renames are just as expensive on blobstores no matter what the prefix is, 
> this isn't considered a terribly large loss (assuming only blobstore 
> customers use this functionality).
> Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually 
> suffice in this case, as the stagingdir is not the same.
> This commit enables Hive customers to set an absolute location for all 
> staging directories. For instances where the configured stagingdir scheme is 
> not the same as the scheme for the table location, the default stagingdir 
> configuration is used. This avoids a cross-filesystem rename, which is 
> impossible anyway.




[jira] [Updated] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name

2020-06-17 Thread Thomas Poepping (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping updated HIVE-22928:
---
Attachment: HIVE-22928.6.patch

> Allow hive.exec.stagingdir to be a fully qualified directory name
> -
>
> Key: HIVE-22928
> URL: https://issues.apache.org/jira/browse/HIVE-22928
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Hive
>Affects Versions: 3.1.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>Priority: Minor
> Attachments: HIVE-22928.2.patch, HIVE-22928.3.patch, 
> HIVE-22928.4.patch, HIVE-22928.5.patch, HIVE-22928.6.patch, HIVE-22928.patch
>
>
> Currently, {{hive.exec.stagingdir}} can only be set as a relative directory 
> name that, for operations like {{insert}} or {{insert overwrite}}, will be 
> placed either under the table directory or the partition directory. 
> For cases where an HDFS cluster is small but the data being inserted is very 
> large (greater than the capacity of the HDFS cluster, as mentioned in a 
> comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their 
> staging directory to be an explicit blobstore path (or any filesystem path), 
> rather than relying on Hive to intelligently build the blobstore path based 
> on an interpretation of the job. We may lose locality guarantees, but because 
> renames are just as expensive on blobstores no matter what the prefix is, 
> this isn't considered a terribly large loss (assuming only blobstore 
> customers use this functionality).
> Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually 
> suffice in this case, as the stagingdir is not the same.
> This commit enables Hive customers to set an absolute location for all 
> staging directories. For instances where the configured stagingdir scheme is 
> not the same as the scheme for the table location, the default stagingdir 
> configuration is used. This avoids a cross-filesystem rename, which is 
> impossible anyway.




[jira] [Work started] (HIVE-23715) Fix zookeeper ssl keystore password handling issues

2020-06-17 Thread Peter Varga (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23715 started by Peter Varga.
--
> Fix zookeeper ssl keystore password handling issues
> ---
>
> Key: HIVE-23715
> URL: https://issues.apache.org/jira/browse/HIVE-23715
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>
> In HIVE-23045 Zookeeper SSL communication support was introduced, but the 
> password config for the keystore and truststore is not handled correctly if 
> they are stored in jceks.
> Also, the ZooKeeperTokenStore does not handle the fallback to the global 
> ZooKeeper configurations well.




[jira] [Assigned] (HIVE-23715) Fix zookeeper ssl keystore password handling issues

2020-06-17 Thread Peter Varga (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Varga reassigned HIVE-23715:
--


> Fix zookeeper ssl keystore password handling issues
> ---
>
> Key: HIVE-23715
> URL: https://issues.apache.org/jira/browse/HIVE-23715
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>
> In HIVE-23045 Zookeeper SSL communication support was introduced, but the 
> password config for the keystore and truststore is not handled correctly if 
> they are stored in jceks.
> Also, the ZooKeeperTokenStore does not handle the fallback to the global 
> ZooKeeper configurations well.




[jira] [Updated] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23700:
--
Labels: pull-request-available  (was: )

> HiveConf static initialization fails when JAR URI is opaque
> ---
>
> Key: HIVE-23700
> URL: https://issues.apache.org/jira/browse/HIVE-23700
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.7
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23700.1.patch
>
>   Original Estimate: 120h
>  Time Spent: 10m
>  Remaining Estimate: 119h 50m
>
> HiveConf static initialization fails when the jar URI is opaque, for example 
> when it's embedded as a fat jar in a spring boot application. Then 
> initialization of the HiveConf static block fails and the HiveConf class does 
> not get classloaded. The opaque URI in my case looks like this 
> _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_
> HiveConf#findConfigFile should be able to handle the `IllegalArgumentException` 
> that the `File` constructor throws when given such a jar `URI`.
> To surface this issue three conditions need to be met.
> 1. hive-site.xml should not be on the classpath
> 2. hive-site.xml should not be on "HIVE_CONF_DIR"
> 3. hive-site.xml should not be on "HIVE_HOME"




[jira] [Work logged] (HIVE-23700) HiveConf static initialization fails when JAR URI is opaque

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23700?focusedWorklogId=447365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447365
 ]

ASF GitHub Bot logged work on HIVE-23700:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 15:42
Start Date: 17/Jun/20 15:42
Worklog Time Spent: 10m 
  Work Description: frankgh opened a new pull request #1138:
URL: https://github.com/apache/hive/pull/1138


   Handle IllegalArgumentExceptions thrown by the File constructor when the
   jar URI is not supported. This fixes the static initialization of the
   HiveConf class when four conditions are met:
   
   1. hive-site.xml is not present in the classpath
   2. hive-site.xml is not present on the "HIVE_CONF_DIR" directory
   3. hive-site.xml is not present on the "HIVE_HOME" directory
   4. the jar URI is not absolute, is opaque, has a scheme that is null or not
      "file", has a non-null authority, fragment, or query, or has an empty
      path.
   





Issue Time Tracking
---

Worklog Id: (was: 447365)
Remaining Estimate: 119h 50m  (was: 120h)
Time Spent: 10m

> HiveConf static initialization fails when JAR URI is opaque
> ---
>
> Key: HIVE-23700
> URL: https://issues.apache.org/jira/browse/HIVE-23700
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.7
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Minor
> Attachments: HIVE-23700.1.patch
>
>   Original Estimate: 120h
>  Time Spent: 10m
>  Remaining Estimate: 119h 50m
>
> HiveConf static initialization fails when the jar URI is opaque, for example 
> when it's embedded as a fat jar in a spring boot application. Then 
> initialization of the HiveConf static block fails and the HiveConf class does 
> not get classloaded. The opaque URI in my case looks like this 
> _jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/_
> HiveConf#findConfigFile should be able to handle `IllegalArgumentException` 
> when the jar `URI` provided to `File` throws the exception.
> To surface this issue three conditions need to be met.
> 1. hive-site.xml should not be on the classpath
> 2. hive-site.xml should not be on "HIVE_CONF_DIR"
> 3. hive-site.xml should not be on "HIVE_HOME"
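
The failure mode described above can be reproduced with the JDK alone. The following is a minimal sketch, not the HiveConf#findConfigFile code or the attached patch: the class name JarUriExample and the helper toFileOrNull are hypothetical, and returning null on failure is only one possible way to handle the exception.

```java
import java.io.File;
import java.net.URI;

class JarUriExample {
    /**
     * Tries to map a URI to a local File. Returns null when the File(URI)
     * constructor rejects the URI, for example an opaque "jar:file:..." URI.
     * (Hypothetical helper; returning null is an assumption of this sketch.)
     */
    static File toFileOrNull(URI uri) {
        try {
            return new File(uri);
        } catch (IllegalArgumentException e) {
            // File(URI) requires an absolute, hierarchical "file" URI with a
            // non-empty path and no authority, query, or fragment.
            return null;
        }
    }

    public static void main(String[] args) {
        // The nested fat-jar URI quoted in the issue description.
        URI nested = URI.create(
            "jar:file:/usr/local/server/some-service-jar.jar!/BOOT-INF/lib/hive-common-2.3.7.jar!/");
        System.out.println("opaque: " + nested.isOpaque());
        System.out.println("nested jar as File: " + toFileOrNull(nested));

        // A plain file URI is accepted by the constructor.
        URI plain = URI.create("file:/tmp/hive-common-2.3.7.jar");
        System.out.println("plain jar as File: " + toFileOrNull(plain));
    }
}
```

Catching the exception at the point where the File is built keeps a class initializer from failing just because the jar location cannot be expressed as a local file.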




[jira] [Commented] (HIVE-20890) ACID: Allow whole table ReadLocks to skip all partition locks

2020-06-17 Thread Peter Vary (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-20890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138524#comment-17138524
 ] 

Peter Vary commented on HIVE-20890:
---

Thanks [~dkuzmenko] for the notification on the duplicate!

Do you plan to create a pull request for this?

2 quick questions:
* You added the check for the {{TABLE}} - don't we still want the table-level 
lock to be there, or am I missing something?
{code}
case TABLE:
  t = input.getTable();
  if (!fullTableLock.contains(t)) {
    continue;
  }
{code}
* We are supposed to read the conf like this nowadays:
{code}
HiveConf.getIntVar(conf, ConfVars.HIVE_LOCKS_PARTITION_THRESHOLD);
{code}

Thanks again,
Peter

> ACID: Allow whole table ReadLocks to skip all partition locks
> -
>
> Key: HIVE-20890
> URL: https://issues.apache.org/jira/browse/HIVE-20890
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal Vijayaraghavan
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-20890.1.patch, HIVE-20890.2.patch, 
> HIVE-20890.3.patch, HIVE-20890.4.patch
>
>
> HIVE-19369 proposes adding an EXCL_WRITE lock which does not wait for any 
> SHARED_READ locks for read operations - in the presence of that lock, the 
> insert overwrite no longer takes an exclusive lock.
> The only exclusive operation will be a schema change or drop table, which 
> should take an exclusive lock on the entire table directly.
> {code}
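> explain locks select * from tpcds_bin_partitioned_orc_1000.store_sales where ss_sold_date_sk=2452626
> +-----------------------------------------------------------------------------------+
> | Explain                                                                           |
> +-----------------------------------------------------------------------------------+
> | LOCK INFORMATION:                                                                 |
> | tpcds_bin_partitioned_orc_1000.store_sales -> SHARED_READ                         |
> | tpcds_bin_partitioned_orc_1000.store_sales.ss_sold_date_sk=2452626 -> SHARED_READ |
> +-----------------------------------------------------------------------------------+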
> {code}
> So the per-partition SHARED_READ locks are no longer necessary if the lock 
> builder already includes the table-wide SHARED_READ locks.
> The removal of entire partitions is the only part which needs to be taken 
> care of within these semantics, as row removal instead of directory removal 
> (i.e. "drop partition" -> "truncate partition", and have the truncation trigger 
> a whole directory cleaner, so that the partition disappears when there are 0 
> rows left).
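
The idea of skipping per-partition SHARED_READ locks when a table-wide SHARED_READ is already requested can be sketched as below. This is an illustration only, not the HIVE-20890 patch: ReadLockPruning, LockRequest, and prune are hypothetical names, and real lock components carry more state than a table and partition name.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReadLockPruning {
    /** Hypothetical shared-read lock request; partition is null for a table-level lock. */
    static final class LockRequest {
        final String table;
        final String partition;

        LockRequest(String table, String partition) {
            this.table = table;
            this.partition = partition;
        }

        boolean isTableLevel() {
            return partition == null;
        }

        @Override
        public String toString() {
            return isTableLevel() ? table : table + "." + partition;
        }
    }

    /** Drops partition-level requests whose table already has a table-level request. */
    static List<LockRequest> prune(List<LockRequest> sharedReads) {
        Set<String> fullTableLock = new HashSet<>();
        for (LockRequest r : sharedReads) {
            if (r.isTableLevel()) {
                fullTableLock.add(r.table);
            }
        }
        List<LockRequest> kept = new ArrayList<>();
        for (LockRequest r : sharedReads) {
            if (!r.isTableLevel() && fullTableLock.contains(r.table)) {
                continue; // covered by the table-wide lock
            }
            kept.add(r);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<LockRequest> requests = new ArrayList<>();
        requests.add(new LockRequest("tpcds_bin_partitioned_orc_1000.store_sales", null));
        requests.add(new LockRequest("tpcds_bin_partitioned_orc_1000.store_sales",
            "ss_sold_date_sk=2452626"));
        System.out.println(prune(requests));
    }
}
```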




[jira] [Resolved] (HIVE-23714) Add new configuration for lock escalation

2020-06-17 Thread Peter Vary (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-23714.
---
Resolution: Duplicate

Duplicates: HIVE-20890

> Add new configuration for lock escalation
> -
>
> Key: HIVE-23714
> URL: https://issues.apache.org/jira/browse/HIVE-23714
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> It would be good to have an option so that, once a given number of locks is 
> reached on a partitioned table, we can escalate the lock and request a 
> table-level lock instead of multiple partition-level locks.
> This is part of the solution proposed on HIVE-21354
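
The escalation described above could look roughly like the sketch below. Everything here is an assumption made for illustration: LockEscalation and locksFor are hypothetical names, the threshold is a plain int rather than a Hive configuration value, and escalation is assumed to happen when the partition count exceeds the threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LockEscalation {
    /**
     * Returns the lock targets for a query touching the given partitions.
     * If more partitions are touched than the threshold allows, a single
     * table-level target is returned instead of one target per partition.
     */
    static List<String> locksFor(String table, List<String> partitions, int threshold) {
        List<String> locks = new ArrayList<>();
        if (partitions.size() > threshold) {
            locks.add(table); // escalate to one table-level lock
            return locks;
        }
        for (String partition : partitions) {
            locks.add(table + "/" + partition);
        }
        return locks;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("val2=foo", "val2=bar", "val2=baz");
        System.out.println(locksFor("test1", partitions, 2));
        System.out.println(locksFor("test1", partitions, 5));
    }
}
```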




[jira] [Commented] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-17 Thread David Mollitor (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138490#comment-17138490
 ] 

David Mollitor commented on HIVE-23704:
---

Existing code works because the Commons Codec Base64 implementation ignores 
invalid characters:

https://github.com/apache/commons-codec/blob/41c6f486fd4f5c2450c9311c40dbbf7e576d2907/src/main/java/org/apache/commons/codec/binary/Base64.java#L640

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes the authHeaderBase64 (which is a base-64 string), 
> converts it into bytes, and then tries to decode those bytes.  That is 
> incorrect.  It should convert the base-64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because 
> the string that is being decoded is not actually Base-64 (see attached image). 
> It has a stray space and a colon.  Again, the existing code doesn't care 
> because it's not parsing Base-64 text, it is parsing the bytes generated by 
> converting base-64 text to bytes.
> I'm not sure what effect this has, or what security issues this may present, 
> but it's definitely not correct.
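
The difference between a lenient and a strict Base64 decoder can be shown with the JDK's java.util.Base64, used here as a stand-in for the Commons Codec class in the snippet above (an assumption of this sketch, since the two libraries are not identical). The MIME decoder skips characters outside the Base64 alphabet, while the basic decoder rejects them. Base64Leniency and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Leniency {
    /** Strict decoding: any character outside the Base64 alphabet is an error. */
    static String decodeStrict(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    /** Lenient decoding: characters outside the Base64 alphabet are skipped. */
    static String decodeLenient(String encoded) {
        return new String(Base64.getMimeDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String clean = Base64.getEncoder()
            .encodeToString("user:pass".getBytes(StandardCharsets.UTF_8));
        // Simulate a header value with a stray space and a colon in it.
        String dirty = clean.substring(0, 4) + " :" + clean.substring(4);

        System.out.println("lenient: " + decodeLenient(dirty));
        try {
            System.out.println("strict: " + decodeStrict(dirty));
        } catch (IllegalArgumentException e) {
            System.out.println("strict decoder rejected input: " + e.getMessage());
        }
    }
}
```

A lenient decoder hides malformed headers, which matches the observation that the existing code "doesn't care" about the stray characters.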




[jira] [Updated] (HIVE-23704) Thrift HTTP Server Does Not Handle Auth Handle Correctly

2020-06-17 Thread David Mollitor (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-23704:
--
Priority: Major  (was: Critical)

> Thrift HTTP Server Does Not Handle Auth Handle Correctly
> 
>
> Key: HIVE-23704
> URL: https://issues.apache.org/jira/browse/HIVE-23704
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 3.1.2, 2.3.7
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Base64NegotiationError.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code:java|title=ThriftHttpServlet.java}
>   private String[] getAuthHeaderTokens(HttpServletRequest request,
>       String authType) throws HttpAuthenticationException {
>     String authHeaderBase64 = getAuthHeader(request, authType);
>     String authHeaderString = StringUtils.newStringUtf8(
>         Base64.decodeBase64(authHeaderBase64.getBytes()));
>     String[] creds = authHeaderString.split(":");
>     return creds;
>   }
> {code}
> So here, it takes the authHeaderBase64 (which is a base-64 string), 
> converts it into bytes, and then tries to decode those bytes.  That is 
> incorrect.  It should convert the base-64 string directly into bytes.
> I tried to do this as part of [HIVE-22676] and the tests were failing because 
> the string that is being decoded is not actually Base-64 (see attached image). 
> It has a stray space and a colon.  Again, the existing code doesn't care 
> because it's not parsing Base-64 text, it is parsing the bytes generated by 
> converting base-64 text to bytes.
> I'm not sure what effect this has, or what security issues this may present, 
> but it's definitely not correct.




[jira] [Commented] (HIVE-23585) Retrieve replication instance metrics details

2020-06-17 Thread Pravin Sinha (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138465#comment-17138465
 ] 

Pravin Sinha commented on HIVE-23585:
-

+1

> Retrieve replication instance metrics details
> -
>
> Key: HIVE-23585
> URL: https://issues.apache.org/jira/browse/HIVE-23585
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, 
> HIVE-23585.03.patch, Replication Metrics.pdf
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-23714) Add new configuration for lock escalation

2020-06-17 Thread Denys Kuzmenko (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138463#comment-17138463
 ] 

Denys Kuzmenko commented on HIVE-23714:
---

[~pvary], https://issues.apache.org/jira/browse/HIVE-20890 addresses the same 
issue, but only for read operations

> Add new configuration for lock escalation
> -
>
> Key: HIVE-23714
> URL: https://issues.apache.org/jira/browse/HIVE-23714
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> It would be good to have an option so that, once a given number of locks is 
> reached on a partitioned table, we can escalate the lock and request a 
> table-level lock instead of multiple partition-level locks.
> This is part of the solution proposed on HIVE-21354




[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-06-17 Thread LuGuangMing (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LuGuangMing updated HIVE-22753:
---
Description: 
In case of exception in SQLOperation, operational log does not get cleared up. 
This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to 
OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!

 

Allocation tree

!image-2020-01-21-11-18-37-294.png|width=425,height=178!

 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=951,height=285!

 

Each HushableRandomAccessFileAppender holds internal ref to 
RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak.

Related ticket: HIVE-18820

  was:
In case of exception in SQLOperation, operational log does not get cleared up. 
This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to 
OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!

 

Allocation tree

!image-2020-01-21-11-18-37-294.png|width=425,height=178!

 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=698,height=209!

 

Each HushableRandomAccessFileAppender holds internal ref to 
RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak.

Related ticket: HIVE-18820


> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, 
> HIVE-22753.3.patch, HIVE-22753.4.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of exception in SQLOperation, operational log does not get cleared 
> up. This causes gradual build up of HushableRandomAccessFileAppender causing 
> HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=951,height=285!
>  
> Each HushableRandomAccessFileAppender holds internal ref to 
> RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem 
> leak.
> Related ticket: HIVE-18820
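
The leak pattern described above, where per-operation appenders are registered but not removed when the operation fails, is usually avoided by doing the cleanup in a finally block. The sketch below is a generic illustration under stated assumptions: OperationLogCleanup, Appender, and runOperation are hypothetical names and do not mirror the HiveServer2 or Log4j classes; only the 256 KB buffer size is taken from the description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class OperationLogCleanup {
    /** Hypothetical stand-in for a per-operation appender holding a 256 KB buffer. */
    static final class Appender {
        final byte[] buffer = new byte[256 * 1024];
    }

    /** Registry of live appenders, keyed by operation id. */
    static final Map<String, Appender> APPENDERS = new ConcurrentHashMap<>();

    /** Runs the operation and always unregisters its appender afterwards. */
    static void runOperation(String operationId, Runnable body) {
        APPENDERS.put(operationId, new Appender());
        try {
            body.run();
        } finally {
            // Executed on normal completion and on exceptions alike, so a
            // failing operation does not leave its buffer reachable.
            APPENDERS.remove(operationId);
        }
    }

    public static void main(String[] args) {
        try {
            runOperation("op-1", () -> {
                throw new IllegalStateException("query failed");
            });
        } catch (IllegalStateException e) {
            System.out.println("operation failed: " + e.getMessage());
        }
        System.out.println("appenders still registered: " + APPENDERS.size());
    }
}
```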




[jira] [Updated] (HIVE-22753) Fix gradual mem leak: Operationlog related appenders should be cleared up on errors

2020-06-17 Thread LuGuangMing (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LuGuangMing updated HIVE-22753:
---
Description: 
In case of exception in SQLOperation, operational log does not get cleared up. 
This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to 
OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!

 

Allocation tree

!image-2020-01-21-11-18-37-294.png|width=425,height=178!

 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=671,height=201!

 

Each HushableRandomAccessFileAppender holds internal ref to 
RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak.

Related ticket: HIVE-18820

  was:
In case of exception in SQLOperation, operational log does not get cleared up. 
This causes gradual build up of HushableRandomAccessFileAppender causing HS2 to 
OOM after some time.

!image-2020-01-21-11-14-37-911.png|width=431,height=267!

 

Allocation tree

!image-2020-01-21-11-18-37-294.png|width=425,height=178!

 

Prod instance mem

!image-2020-01-21-11-17-59-279.png|width=951,height=285!

 

Each HushableRandomAccessFileAppender holds internal ref to 
RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem leak.

Related ticket: HIVE-18820


> Fix gradual mem leak: Operationlog related appenders should be cleared up on 
> errors 
> 
>
> Key: HIVE-22753
> URL: https://issues.apache.org/jira/browse/HIVE-22753
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22753.1.patch, HIVE-22753.2.patch, 
> HIVE-22753.3.patch, HIVE-22753.4.patch, image-2020-01-21-11-14-37-911.png, 
> image-2020-01-21-11-17-59-279.png, image-2020-01-21-11-18-37-294.png
>
>
> In case of exception in SQLOperation, operational log does not get cleared 
> up. This causes gradual build up of HushableRandomAccessFileAppender causing 
> HS2 to OOM after some time.
> !image-2020-01-21-11-14-37-911.png|width=431,height=267!
>  
> Allocation tree
> !image-2020-01-21-11-18-37-294.png|width=425,height=178!
>  
> Prod instance mem
> !image-2020-01-21-11-17-59-279.png|width=671,height=201!
>  
> Each HushableRandomAccessFileAppender holds internal ref to 
> RandomAccessFileAppender which holds a 256 KB bytebuffer, causing the mem 
> leak.
> Related ticket: HIVE-18820




[jira] [Comment Edited] (HIVE-23712) metadata-only queries return incorrect results with empty partition

2020-06-17 Thread Jira



[ 
https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138436#comment-17138436
 ] 

László Bodor edited comment on HIVE-23712 at 6/17/20, 1:35 PM:
---

The root cause is that in 
[MetadataOnlyOptimizer|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java#L124]
 the TS operator of the test1 table is considered subject to metadata-only 
optimization, and later 
[NullScanTaskDispatcher|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java#L106]
 finds a non-empty folder for this partition because of ACID operations, with 
files:
{code}
hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delete_delta_003_003_
hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delta_002_002_
{code}

I'm not sure about the perfect solution at the moment, but maybe the following 
scenario should be excluded somehow from metadata-only optimization:
1. there is a partitioned table:
create table test1 (id int, val string) partitioned by (val2 string) STORED AS 
ORC TBLPROPERTIES ('transactional'='true');
2. in a distinct query, only the partition column is selected:
{code}
select distinct val2, current_timestamp, 'metadata true' as query from test1;
{code}
In this case tsOp.getNeededColumnIDs() is empty (the partition column is not 
present in the needed columns).



was (Author: abstractdog):
the root cause is that in 
[MetadataOnlyOptimizer|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java#L124]
 the TS operator of test1 table is considered to be subject of metadata-only 
optimization and later 
[NullScanTaskDispatcher|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java#L106]
 find a non-empty folder for this partition because of ACID operations, with 
files:
{code}
hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delete_delta_003_003_
hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delta_002_002_
{code}

not sure about the perfect solution at the moment, but maybe the following 
scenario should be excluded somehow from metadata-only optimization:
1. there is a partitioned table:
create table test1 (id int, val string) partitioned by (val2 string) STORED AS 
ORC TBLPROPERTIES ('transactional'='true');
2. in a distinct query, only the partitioned column is selected:
select distinct val2, current_timestamp, 'metadata true' as query from test1;
{code}
in this case tsOp.getNeededColumnIDs() is empty (partition column is not 
present in needed columns)
{code}


> metadata-only queries return incorrect results with empty partition
> ---
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Similarly to HIVE-15397, queries can return incorrect results for 
> metadata-only queries, here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED 
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> In this case, 2 rows are returned instead of 1 after a delete with 
> metadata-only optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128




[jira] [Assigned] (HIVE-23714) Add new configuration for lock escalation

2020-06-17 Thread Peter Vary (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-23714:
-


> Add new configuration for lock escalation
> -
>
> Key: HIVE-23714
> URL: https://issues.apache.org/jira/browse/HIVE-23714
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> It would be good to have an option so that, once a given number of locks is 
> reached on a partitioned table, we can escalate the lock and request a 
> table-level lock instead of multiple partition-level locks.
> This is part of the solution proposed on HIVE-21354




[jira] [Commented] (HIVE-23712) metadata-only queries return incorrect results with empty partition

2020-06-17 Thread Jira



[ 
https://issues.apache.org/jira/browse/HIVE-23712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138436#comment-17138436
 ] 

László Bodor commented on HIVE-23712:
-

The root cause is that in 
[MetadataOnlyOptimizer|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java#L124]
 the TS operator of the test1 table is considered subject to metadata-only 
optimization, and later 
[NullScanTaskDispatcher|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java#L106]
 finds a non-empty folder for this partition because of ACID operations, with 
files:
{code}
hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delete_delta_003_003_
hdfs://localhost:58447/build/ql/test/data/warehouse/test1/val2=bar/delta_002_002_
{code}

I'm not sure about the perfect solution at the moment, but maybe the following 
scenario should be excluded somehow from metadata-only optimization:
1. there is a partitioned table:
create table test1 (id int, val string) partitioned by (val2 string) STORED AS 
ORC TBLPROPERTIES ('transactional'='true');
2. in a distinct query, only the partition column is selected:
{code}
select distinct val2, current_timestamp, 'metadata true' as query from test1;
{code}
In this case tsOp.getNeededColumnIDs() is empty (the partition column is not 
present in the needed columns).


> metadata-only queries return incorrect results with empty partition
> ---
>
> Key: HIVE-23712
> URL: https://issues.apache.org/jira/browse/HIVE-23712
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Similarly to HIVE-15397, queries can return incorrect results for 
> metadata-only queries, here is a repro scenario which affects master:
> {code}
> set hive.support.concurrency=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.optimize.metadataonly=true;
> create table test1 (id int, val string) partitioned by (val2 string) STORED 
> AS ORC TBLPROPERTIES ('transactional'='true');
> describe formatted test1;
> alter table test1 add partition (val2='foo');
> alter table test1 add partition (val2='bar');
> insert into test1 partition (val2='foo') values (1, 'abc');
> select distinct val2, current_timestamp from test1;
> insert into test1 partition (val2='bar') values (1, 'def');
> delete from test1 where val2 = 'bar';
> select '--> hive.optimize.metadataonly=true';
> select distinct val2, current_timestamp from test1;
> set hive.optimize.metadataonly=false;
> select '--> hive.optimize.metadataonly=false';
> select distinct val2, current_timestamp from test1;
> select current_timestamp, * from test1;
> {code}
> In this case, 2 rows are returned instead of 1 after a delete with 
> metadata-only optimization:
> https://github.com/abstractdog/hive/commit/a7f03513564d01f7c3ba4aa61c4c6537100b4d3f#diff-cb23043000831f41fe7041cb38f82224R114-R128




[jira] [Resolved] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread David Mollitor (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor resolved HIVE-23026.
---
Fix Version/s: (was: 3.0.0)
   (was: 2.0.0)
   3.2.0
   4.0.0
   2.4.0
   Resolution: Fixed

Thanks so much [~xiejiajun].  PR has been merged across versions.

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Assignee: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.0, 4.0.0, 3.2.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it inconvenient to locate the problem SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Work logged] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23026?focusedWorklogId=447275=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447275
 ]

ASF GitHub Bot logged work on HIVE-23026:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 13:33
Start Date: 17/Jun/20 13:33
Worklog Time Spent: 10m 
  Work Description: belugabehr merged pull request #1083:
URL: https://github.com/apache/hive/pull/1083


   





Issue Time Tracking
---

Worklog Id: (was: 447275)
Time Spent: 5h 10m  (was: 5h)

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Assignee: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 3.0.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it inconvenient to locate the problem SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Work logged] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23026?focusedWorklogId=447274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447274
 ]

ASF GitHub Bot logged work on HIVE-23026:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 13:32
Start Date: 17/Jun/20 13:32
Worklog Time Spent: 10m 
  Work Description: belugabehr merged pull request #1080:
URL: https://github.com/apache/hive/pull/1080


   





Issue Time Tracking
---

Worklog Id: (was: 447274)
Time Spent: 5h  (was: 4h 50m)

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Assignee: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 3.0.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it inconvenient to locate the problem SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Work logged] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23026?focusedWorklogId=447273=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447273
 ]

ASF GitHub Bot logged work on HIVE-23026:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 13:30
Start Date: 17/Jun/20 13:30
Worklog Time Spent: 10m 
  Work Description: belugabehr merged pull request #1082:
URL: https://github.com/apache/hive/pull/1082


   





Issue Time Tracking
---

Worklog Id: (was: 447273)
Time Spent: 4h 50m  (was: 4h 40m)

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Assignee: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 3.0.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it inconvenient to locate the problem SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Commented] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread David Mollitor (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138430#comment-17138430
 ] 

David Mollitor commented on HIVE-23026:
---

Pushed to master (4.x)

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Assignee: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 3.0.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it inconvenient to locate the problem SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Updated] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread David Mollitor (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-23026:
--
Summary: Allow for custom YARN application name for TEZ queries  (was: 
support add a yarn application name for tez on hiveserver2)

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 3.0.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it inconvenient to locate the problem SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Assigned] (HIVE-23026) Allow for custom YARN application name for TEZ queries

2020-06-17 Thread David Mollitor (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor reassigned HIVE-23026:
-

Assignee: Jake Xie

> Allow for custom YARN application name for TEZ queries
> --
>
> Key: HIVE-23026
> URL: https://issues.apache.org/jira/browse/HIVE-23026
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Jake Xie
>Assignee: Jake Xie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 3.0.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Currently, Tez on HiveServer2 cannot specify a YARN application name, which 
> makes it hard to locate the problematic SQL, so I added a configuration 
> item to support setting the Tez job name.




[jira] [Commented] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread Jesus Camacho Rodriguez (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138426#comment-17138426
 ] 

Jesus Camacho Rodriguez commented on HIVE-23706:


+1

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of 
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return 
> {code}
> null, 3, 2 ,2 ,2, 1
> {code}
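
The ordering requested in this issue can be sketched in plain Java. This is only an illustration of the "NULL sorts first when descending" semantics, assuming NULL is treated as larger than every non-null value; it is not Hive's implementation, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class, not part of Hive.
final class NullOrderingSketch {

    // Descending order with NULLs first: the behavior the issue asks for.
    static List<Integer> sortDescNullsFirst(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.nullsFirst(Comparator.<Integer>reverseOrder()));
        return sorted;
    }

    // Descending order with NULLs last: the behavior reported as wrong.
    static List<Integer> sortDescNullsLast(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.nullsLast(Comparator.<Integer>reverseOrder()));
        return sorted;
    }

    public static void main(String[] args) {
        // Same values as the INSERT statement in the issue description.
        List<Integer> rows = Arrays.asList(1, null, 3, 2, 2, 2);
        System.out.println(sortDescNullsLast(rows));  // [3, 2, 2, 2, 1, null]
        System.out.println(sortDescNullsFirst(rows)); // [null, 3, 2, 2, 2, 1]
    }
}
```

The two comparators differ only in where null lands, which mirrors the difference between the "instead of" and "should return" outputs quoted above.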




[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details

2020-06-17 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23585:
---
Status: In Progress  (was: Patch Available)

> Retrieve replication instance metrics details
> -
>
> Key: HIVE-23585
> URL: https://issues.apache.org/jira/browse/HIVE-23585
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, 
> HIVE-23585.03.patch, Replication Metrics.pdf
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-23585) Retrieve replication instance metrics details

2020-06-17 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23585:
---
Attachment: HIVE-23585.03.patch
Status: Patch Available  (was: In Progress)

> Retrieve replication instance metrics details
> -
>
> Key: HIVE-23585
> URL: https://issues.apache.org/jira/browse/HIVE-23585
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23585.01.patch, HIVE-23585.02.patch, 
> HIVE-23585.03.patch, Replication Metrics.pdf
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Resolved] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q

2020-06-17 Thread Miklos Gergely (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely resolved HIVE-23418.
---
Resolution: Fixed

Merged to master, thank you [~kgyrtkirk]

> Investigate why msck command found different partitions at repair.q, 
> msck_repair*, partition_discovery.q
> 
>
> Key: HIVE-23418
> URL: https://issues.apache.org/jira/browse/HIVE-23418
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Check [https://reviews.apache.org/r/72485/] for details.




[jira] [Work logged] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23418?focusedWorklogId=447262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447262
 ]

ASF GitHub Bot logged work on HIVE-23418:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:58
Start Date: 17/Jun/20 12:58
Worklog Time Spent: 10m 
  Work Description: miklosgergely merged pull request #1128:
URL: https://github.com/apache/hive/pull/1128


   





Issue Time Tracking
---

Worklog Id: (was: 447262)
Time Spent: 20m  (was: 10m)

> Investigate why msck command found different partitions at repair.q, 
> msck_repair*, partition_discovery.q
> 
>
> Key: HIVE-23418
> URL: https://issues.apache.org/jira/browse/HIVE-23418
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Check [https://reviews.apache.org/r/72485/] for details.




[jira] [Work logged] (HIVE-23696) DB Metadata and Progress column not taking the defined length

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23696?focusedWorklogId=447258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447258
 ]

ASF GitHub Bot logged work on HIVE-23696:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:53
Start Date: 17/Jun/20 12:53
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk closed pull request #1110:
URL: https://github.com/apache/hive/pull/1110


   





Issue Time Tracking
---

Worklog Id: (was: 447258)
Time Spent: 1h 10m  (was: 1h)

> DB Metadata and Progress column not taking the defined length
> -
>
> Key: HIVE-23696
> URL: https://issues.apache.org/jira/browse/HIVE-23696
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23696.01.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Caused by: org.datanucleus.exceptions.NucleusUserException: Attempt to store 
> value 
> "{"dbName":"testAcidTablesReplLoadBootstrapIncr_1592205875387","replicationType":"BOOTSTRAP","stagingDir":"hdfs://localhost:65158/tmp/org_apache_hadoop_hive_ql_parse_TestReplicationScenarios_245261428230295/hrepl0/dGVzdGFjaWR0YWJsZXNyZXBsbG9hZGJvb3RzdHJhcGluY3JfMTU5MjIwNTg3NTM4Nw==/0/hive","lastReplId":25}"
>  in column "RM_METADATA" that has maximum length of 255. Please correct your 
> data!




[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447257
 ]

ASF GitHub Bot logged work on HIVE-23706:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:52
Start Date: 17/Jun/20 12:52
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1131:
URL: https://github.com/apache/hive/pull/1131#discussion_r441520582



##
File path: ql/src/test/results/clientpositive/llap/order_null.q.out
##
@@ -116,12 +116,12 @@ POSTHOOK: query: SELECT x.* FROM src_null_n1 x ORDER BY b 
desc, a asc
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src_null_n1
  A masked pattern was here 
-2  B
-1  A
-2  A
 2  NULL
 3  NULL
 NULL   NULL
+2  B
+1  A
+2  A

Review comment:
   I agree that the description is confusing; I changed it accordingly.







Issue Time Tracking
---

Worklog Id: (was: 447257)
Time Spent: 50m  (was: 40m)

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of 
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return 
> {code}
> null, 3, 2 ,2 ,2, 1
> {code}




[jira] [Commented] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q

2020-06-17 Thread Zoltan Haindrich (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138411#comment-17138411
 ] 

Zoltan Haindrich commented on HIVE-23418:
-

+1

> Investigate why msck command found different partitions at repair.q, 
> msck_repair*, partition_discovery.q
> 
>
> Key: HIVE-23418
> URL: https://issues.apache.org/jira/browse/HIVE-23418
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Check [https://reviews.apache.org/r/72485/] for details.




[jira] [Assigned] (HIVE-23418) Investigate why msck command found different partitions at repair.q, msck_repair*, partition_discovery.q

2020-06-17 Thread Zoltan Haindrich (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-23418:
---

Assignee: Miklos Gergely

> Investigate why msck command found different partitions at repair.q, 
> msck_repair*, partition_discovery.q
> 
>
> Key: HIVE-23418
> URL: https://issues.apache.org/jira/browse/HIVE-23418
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Check [https://reviews.apache.org/r/72485/] for details.




[jira] [Resolved] (HIVE-23711) Some IDE generated files should not be checked for license header by rat plugin

2020-06-17 Thread Zoltan Haindrich (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-23711.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

pushed to master. Thank you Laszlo!

> Some IDE generated files should not be checked for license header by rat 
> plugin
> ---
>
> Key: HIVE-23711
> URL: https://issues.apache.org/jira/browse/HIVE-23711
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: rat.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As attached in  [^rat.txt], there was an incorrect rat check:
> {code}
> Files with unapproved licenses:
>   /Users/lbodor/apache/hive/shims/common/.factorypath
> {code}
> In this patch, I take care of .factorypath and some other files 
> that rat should ignore, since they are generated by the IDE and are 
> already ignored by git.




[jira] [Work logged] (HIVE-23711) Some IDE generated files should not be checked for license header by rat plugin

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23711?focusedWorklogId=447251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447251
 ]

ASF GitHub Bot logged work on HIVE-23711:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:46
Start Date: 17/Jun/20 12:46
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #1136:
URL: https://github.com/apache/hive/pull/1136


   





Issue Time Tracking
---

Worklog Id: (was: 447251)
Time Spent: 20m  (was: 10m)

> Some IDE generated files should not be checked for license header by rat 
> plugin
> ---
>
> Key: HIVE-23711
> URL: https://issues.apache.org/jira/browse/HIVE-23711
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: rat.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As attached in  [^rat.txt], there was an incorrect rat check:
> {code}
> Files with unapproved licenses:
>   /Users/lbodor/apache/hive/shims/common/.factorypath
> {code}
> In this patch, I take care of .factorypath and some other files 
> that rat should ignore, since they are generated by the IDE and are 
> already ignored by git.




[jira] [Commented] (HIVE-23711) Some IDE generated files should not be checked for license header by rat plugin

2020-06-17 Thread Zoltan Haindrich (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138406#comment-17138406
 ] 

Zoltan Haindrich commented on HIVE-23711:
-

+1

> Some IDE generated files should not be checked for license header by rat 
> plugin
> ---
>
> Key: HIVE-23711
> URL: https://issues.apache.org/jira/browse/HIVE-23711
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Attachments: rat.txt
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As attached in  [^rat.txt], there was an incorrect rat check:
> {code}
> Files with unapproved licenses:
>   /Users/lbodor/apache/hive/shims/common/.factorypath
> {code}
> In this patch, I take care of .factorypath and some other files 
> that rat should ignore, since they are generated by the IDE and are 
> already ignored by git.




[jira] [Work logged] (HIVE-20291) Allow HiveStreamingConnection to receive a WriteId

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20291?focusedWorklogId=447249=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447249
 ]

ASF GitHub Bot logged work on HIVE-20291:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:45
Start Date: 17/Jun/20 12:45
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #406:
URL: https://github.com/apache/hive/pull/406


   





Issue Time Tracking
---

Worklog Id: (was: 447249)
Time Spent: 20m  (was: 10m)

> Allow HiveStreamingConnection to receive a WriteId
> --
>
> Key: HIVE-20291
> URL: https://issues.apache.org/jira/browse/HIVE-20291
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20291.1.patch, HIVE-20291.10.patch, 
> HIVE-20291.11.patch, HIVE-20291.2.patch, HIVE-20291.3.patch, 
> HIVE-20291.4.patch, HIVE-20291.5.patch, HIVE-20291.6.patch, 
> HIVE-20291.7.patch, HIVE-20291.8.patch, HIVE-20291.9.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the writeId is received externally, it won't need to open connections to 
> the metastore. It won't be able to commit in this case either, so the commit 
> must be done by the entity passing the writeId.




[jira] [Updated] (HIVE-23689) Bump Tez version to 0.9.2

2020-06-17 Thread Zoltan Haindrich (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-23689:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~jagatsingh]!

> Bump Tez version to 0.9.2
> -
>
> Key: HIVE-23689
> URL: https://issues.apache.org/jira/browse/HIVE-23689
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jagat Singh
>Assignee: Jagat Singh
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23689.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Bump Tez version to 0.9.2 from 0.9.1




[jira] [Work logged] (HIVE-23689) Bump Tez version to 0.9.2

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23689?focusedWorklogId=447247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447247
 ]

ASF GitHub Bot logged work on HIVE-23689:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:43
Start Date: 17/Jun/20 12:43
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #1108:
URL: https://github.com/apache/hive/pull/1108


   





Issue Time Tracking
---

Worklog Id: (was: 447247)
Time Spent: 20m  (was: 10m)

> Bump Tez version to 0.9.2
> -
>
> Key: HIVE-23689
> URL: https://issues.apache.org/jira/browse/HIVE-23689
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jagat Singh
>Assignee: Jagat Singh
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23689.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Bump Tez version to 0.9.2 from 0.9.1




[jira] [Assigned] (HIVE-23138) Run q test with TestMiniLlapLocalCliDriver by default

2020-06-17 Thread Miklos Gergely (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-23138:
-

Assignee: Ashutosh Chauhan  (was: Miklos Gergely)

> Run q test with TestMiniLlapLocalCliDriver by default
> -
>
> Key: HIVE-23138
> URL: https://issues.apache.org/jira/browse/HIVE-23138
> Project: Hive
>  Issue Type: Improvement
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> TestCliDriver, the current default driver for q tests, runs tests on MR, 
> which is used less and less. Instead, we should test everything by 
> default using TestMiniLlapLocalCliDriver.




[jira] [Resolved] (HIVE-23329) Investigate why the results have changed for correlationoptimizer14.q

2020-06-17 Thread Miklos Gergely (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely resolved HIVE-23329.
---
Release Note: It is working fine with LLAP, and it was erroneous with MR. 
As we are planning to remove MR in the near future anyway, there is no point in 
investigating.
Assignee: Miklos Gergely
  Resolution: Won't Do

> Investigate why the results have changed for correlationoptimizer14.q
> -
>
> Key: HIVE-23329
> URL: https://issues.apache.org/jira/browse/HIVE-23329
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>
> Find out why the result set is different for correlationoptimizer14.q after 
> moving to TestMiniLlapLocalCliDriver. Check 
> [https://reviews.apache.org/r/72421/#comment308835] for details.




[jira] [Work logged] (HIVE-23703) Major QB compaction with multiple FileSinkOperators results in data loss and one original file

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23703?focusedWorklogId=447242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447242
 ]

ASF GitHub Bot logged work on HIVE-23703:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:35
Start Date: 17/Jun/20 12:35
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #1134:
URL: https://github.com/apache/hive/pull/1134#discussion_r441510102



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
##
@@ -334,6 +334,18 @@ public void initializeBucketPaths(int filesIdx, String 
taskId, boolean isNativeT
 if (!isMmTable && !isDirectInsert) {
   if (!bDynParts && !isSkewedStoredAsSubDirectories) {
 finalPaths[filesIdx] = new Path(parent, taskWithExt);
+if (conf.isCompactionTable()) {
+  // tables used in compaction are external and non-acid. We need 
to keep track of

Review comment:
   Done

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -4123,9 +4125,28 @@ private static void copyFiles(final HiveConf conf, final 
FileSystem destFs,
   }
   throw new HiveException(e);
 }
-  } else {
+  else {

Review comment:
   Typo, done.

##
File path: ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveCopyFiles.java
##
@@ -83,7 +83,8 @@ public void testRenameNewFilesOnSameFileSystem() throws 
IOException {
 FileSystem targetFs = targetPath.getFileSystem(hiveConf);
 
 try {
-  Hive.copyFiles(hiveConf, sourcePath, targetPath, targetFs, 
isSourceLocal, NO_ACID, false,null, false, false, false);
+  Hive.copyFiles(hiveConf, sourcePath, targetPath, targetFs, 
isSourceLocal, NO_ACID, false,null,

Review comment:
   Done







Issue Time Tracking
---

Worklog Id: (was: 447242)
Time Spent: 1h 10m  (was: 1h)

> Major QB compaction with multiple FileSinkOperators results in data loss and 
> one original file
> --
>
> Key: HIVE-23703
> URL: https://issues.apache.org/jira/browse/HIVE-23703
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Critical
>  Labels: compaction, pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> h4. Problems
> Example:
> {code:java}
> drop table if exists tbl2;
> create transactional table tbl2 (a int, b int) clustered by (a) into 4 
> buckets stored as ORC 
> TBLPROPERTIES('transactional'='true','transactional_properties'='default');
> insert into tbl2 values(1,2),(1,3),(1,4),(2,2),(2,3),(2,4);
> insert into tbl2 values(3,2),(3,3),(3,4),(4,2),(4,3),(4,4);
> insert into tbl2 values(5,2),(5,3),(5,4),(6,2),(6,3),(6,4);{code}
> E.g. in the example above, bucketId=0 when a=2 and a=6.
> 1. Data loss 
>  In non-acid tables, an operator's temp files are named with their task id. 
> Because of this snippet, temp files in the FileSinkOperator for compaction 
> tables are identified by their bucket_id.
> {code:java}
> if (conf.isCompactionTable()) {
>  fsp.initializeBucketPaths(filesIdx, AcidUtils.BUCKET_PREFIX + 
> String.format(AcidUtils.BUCKET_DIGITS, bucketId),
>  isNativeTable(), isSkewedStoredAsSubDirectories);
>  } else {
>  fsp.initializeBucketPaths(filesIdx, taskId, isNativeTable(), 
> isSkewedStoredAsSubDirectories);
>  }
> {code}
> So 2 temp files containing data with a=2 and a=6 will be named bucket_0 and 
> not 00_0 and 00_1 as they would normally.
>  In FileSinkOperator.commit, when data with a=2, filename: bucket_0 is moved 
> from _task_tmp.-ext-10002 to _tmp.-ext-10002, it overwrites the files already 
> there with a=6 data, because it too is named bucket_0. You can see in the 
> logs:
> {code:java}
>  WARN [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Target 
> path 
> file:.../hive/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnNoBuckets-1591107230237/warehouse/testmajorcompaction/base_002_v013/.hive-staging_hive_2020-06-02_07-15-21_771_8551447285061957908-1/_tmp.-ext-10002/bucket_0
>  with a size 610 exists. Trying to delete it.
> {code}
> 2. Results in one original file
>  OrcFileMergeOperator merges the results of the FSOp into 1 file named 
> 00_0.
> h4. Fix
> 1. FSOp will store data as: taskid/bucketId. e.g. 0_0/bucket_0
> 2. OrcMergeFileOp,

[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447240=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447240
 ]

ASF GitHub Bot logged work on HIVE-23706:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:33
Start Date: 17/Jun/20 12:33
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #1131:
URL: https://github.com/apache/hive/pull/1131#discussion_r441509021



##
File path: ql/src/test/results/clientpositive/llap/order_null.q.out
##
@@ -116,12 +116,12 @@ POSTHOOK: query: SELECT x.* FROM src_null_n1 x ORDER BY b 
desc, a asc
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src_null_n1
  A masked pattern was here 
-2  B
-1  A
-2  A
 2  NULL
 3  NULL
 NULL   NULL
+2  B
+1  A
+2  A

Review comment:
   OK, now I understand better, thanks for the clarification.
   
   The `hive.default.nulls.last` property is a bit confusing. From its name and 
description I got the impression that NULLS LAST is the default behavior for 
both ASC and DESC, but apparently that is not the case. 
   
   Yes, the new results are in sync with Postgres, since the latter uses the 
[semantics that you mentioned](https://www.postgresql.org/docs/12/queries-order.html). 
Note that not every DBMS follows this rule.







Issue Time Tracking
---

Worklog Id: (was: 447240)
Time Spent: 40m  (was: 0.5h)

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of 
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return 
> {code}
> null, 3, 2 ,2 ,2, 1
> {code}




[jira] [Updated] (HIVE-23593) Schemainit fails with NoSuchFieldError

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-23593:
--
Labels: pull-request-available  (was: )

> Schemainit fails with NoSuchFieldError 
> ---
>
> Key: HIVE-23593
> URL: https://issues.apache.org/jira/browse/HIVE-23593
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue comes from a Calcite-related class; this is interesting because 
> ql already has a shaded Calcite.
> {code}
> Caused by: java.lang.NoSuchFieldError: operands
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:192)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:98)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[calcite-core-1.21.0.jar:1.21.0]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.HiveRexExecutorImpl.reduce(HiveRexExecutorImpl.java:56)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.foldExpression(HiveFunctionHelper.java:544)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createConstantObjectInspector(HiveFunctionHelper.java:452)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createObjectInspector(HiveFunctionHelper.java:435)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getReturnType(HiveFunctionHelper.java:124)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:647)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}
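
A `NoSuchFieldError` is a linkage error: the JVM could not resolve a field, by name and type, on the class that was actually loaded at runtime. One way to investigate such a report is to check which location a class was loaded from and what type a field has there. The sketch below is a generic diagnostic under that assumption; it is not taken from the issue, the nested `Call` class is a hypothetical stand-in for a library class, and nothing here confirms the root cause of HIVE-23593.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Hive or Calcite.
final class ClasspathProbeSketch {

    // Stand-in for a library class with a public field named "operands".
    static final class Call {
        public final java.util.List<String> operands = java.util.Collections.emptyList();
    }

    // Returns the declared type name of a public field, or null if absent.
    static String fieldType(Class<?> owner, String name) {
        try {
            return owner.getField(name).getType().getName();
        } catch (NoSuchFieldException e) {
            return null;
        }
    }

    // Returns the jar or directory a class was loaded from, when known.
    static String origin(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "unknown (bootstrap class or no code source)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println(fieldType(Call.class, "operands"));
        System.out.println(origin(Call.class));
        System.out.println(origin(String.class));
    }
}
```

Running the same two checks against the real class named in a stack trace shows whether the loaded copy comes from the expected jar and whether the field it exposes matches what the caller was compiled against.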




[jira] [Work logged] (HIVE-23593) Schemainit fails with NoSuchFieldError

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23593?focusedWorklogId=447236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447236
 ]

ASF GitHub Bot logged work on HIVE-23593:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:27
Start Date: 17/Jun/20 12:27
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1137:
URL: https://github.com/apache/hive/pull/1137


   





Issue Time Tracking
---

Worklog Id: (was: 447236)
Remaining Estimate: 0h
Time Spent: 10m

> Schemainit fails with NoSuchFieldError 
> ---
>
> Key: HIVE-23593
> URL: https://issues.apache.org/jira/browse/HIVE-23593
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue comes from a Calcite-related class; this is interesting because 
> ql already has a shaded Calcite.
> {code}
> Caused by: java.lang.NoSuchFieldError: operands
>     at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:192) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ExprNodeConverter.visitCall(ExprNodeConverter.java:98) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) ~[calcite-core-1.21.0.jar:1.21.0]
>     at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRexExecutorImpl.reduce(HiveRexExecutorImpl.java:56) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.foldExpression(HiveFunctionHelper.java:544) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createConstantObjectInspector(HiveFunctionHelper.java:452) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.createObjectInspector(HiveFunctionHelper.java:435) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getReturnType(HiveFunctionHelper.java:124) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:647) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {code}




[jira] [Work logged] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23706?focusedWorklogId=447234=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447234
 ]

ASF GitHub Bot logged work on HIVE-23706:
-

Author: ASF GitHub Bot
Created on: 17/Jun/20 12:23
Start Date: 17/Jun/20 12:23
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #1131:
URL: https://github.com/apache/hive/pull/1131#discussion_r441503698



##
File path: ql/src/test/results/clientpositive/llap/order_null.q.out
##
@@ -116,12 +116,12 @@ POSTHOOK: query: SELECT x.* FROM src_null_n1 x ORDER BY b desc, a asc
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@src_null_n1
 #### A masked pattern was here ####
-2  B
-1  A
-2  A
 2  NULL
 3  NULL
 NULL   NULL
+2  B
+1  A
+2  A

Review comment:
   I am not sure how up to date that document is, but as far as I know we 
want NULLs last as the default for ASC and NULLs first as the default for DESC.
   This can be controlled by the setting `hive.default.nulls.last`; its default 
value is `true`.
   
   I also ran these queries on Postgres and got the same results as with this 
patch.
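
   The ordering described in this comment can be illustrated outside Hive with plain Java comparators. This is only a sketch of the intended result order, not Hive's implementation: the class and method names are hypothetical, and the sample values are the ones from the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration class; not part of Hive.
class NullOrderingSketch {

    // Ascending order with NULLs placed last (the default described for ASC).
    static List<Integer> sortAscNullsLast(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
        return copy;
    }

    // Descending order with NULLs placed first (the default described for DESC).
    static List<Integer> sortDescNullsFirst(List<Integer> values) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(Comparator.nullsFirst(Comparator.<Integer>reverseOrder()));
        return copy;
    }

    public static void main(String[] args) {
        // Same values as: INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
        List<Integer> a = Arrays.asList(1, null, 3, 2, 2, 2);
        System.out.println(sortAscNullsLast(a));   // [1, 2, 2, 2, 3, null]
        System.out.println(sortDescNullsFirst(a)); // [null, 3, 2, 2, 2, 1]
    }
}
```

   The descending result matches the "should return" output given in the issue description; an explicit NULLS FIRST or NULLS LAST clause in a query would correspond to choosing the other wrapper.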







Issue Time Tracking
---

Worklog Id: (was: 447234)
Time Spent: 0.5h  (was: 20m)

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of 
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return 
> {code}
> null, 3, 2 ,2 ,2, 1
> {code}




[jira] [Resolved] (HIVE-23331) Investigate why the results have changed for authorization_9.q

2020-06-17 Thread Miklos Gergely (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely resolved HIVE-23331.
---
Release Note: Whatever caused it back then is no longer causing it. Even 
in the migration commit it was added without the extra lines. We'll open a new 
Jira if it occurs again.
Assignee: Miklos Gergely
  Resolution: Cannot Reproduce

> Investigate why the results have changed for authorization_9.q
> --
>
> Key: HIVE-23331
> URL: https://issues.apache.org/jira/browse/HIVE-23331
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>
> Find out why the result set is different for authorization_9.q after moving 
> to TestMiniLlapLocalCliDriver. Check 
> [https://reviews.apache.org/r/72421/#comment308838|https://reviews.apache.org/r/72421/#comment308835]
>  for details.




[jira] [Commented] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread Krisztian Kasa (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138385#comment-17138385
 ] 

Krisztian Kasa commented on HIVE-23706:
---

[~zabetak]
Thanks for pointing that out. The goal is to change the behavior when NULLS 
FIRST/LAST is not specified explicitly.
 

> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of 
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return 
> {code}
> null, 3, 2 ,2 ,2, 1
> {code}




[jira] [Resolved] (HIVE-23333) Investigate why the results have changed for char_udf1.q

2020-06-17 Thread Miklos Gergely (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely resolved HIVE-23333.
---
Release Note: The difference arises because the MR result is wrong while the 
LLAP result is correct. As MR is planned to be removed anyway, there is no 
point in investigating it.
Assignee: Miklos Gergely
  Resolution: Won't Do

> Investigate why the results have changed for char_udf1.q
> 
>
> Key: HIVE-23333
> URL: https://issues.apache.org/jira/browse/HIVE-23333
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>
> Find out why the result set is different for char_udf1.q after moving to 
> TestMiniLlapLocalCliDriver. Check 
> [https://reviews.apache.org/r/72421/#comment308875|https://reviews.apache.org/r/72421/#comment308835]
>  for details.




[jira] [Updated] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread Krisztian Kasa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-23706:
--
Description: 
{code}
INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);

select a from t order by a desc;
{code}
instead of 
{code}
3, 2, 2, 2, 1, null
{code}
should return 
{code}
null, 3, 2 ,2 ,2, 1
{code}

  was:
{code}
INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);

select a from t order by a desc;
{code}
instead of {code}
null, 3, 2 ,2 ,2, 1
{code}
should return 
{code}
3, 2, 2, 2, 1, null
{code}


> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of 
> {code}
> 3, 2, 2, 2, 1, null
> {code}
> should return 
> {code}
> null, 3, 2 ,2 ,2, 1
> {code}




[jira] [Updated] (HIVE-23706) Fix nulls first sorting behavior

2020-06-17 Thread Krisztian Kasa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-23706:
--
Description: 
{code}
INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);

select a from t order by a desc;
{code}
instead of {code}
null, 3, 2 ,2 ,2, 1
{code}
should return 
{code}
3, 2, 2, 2, 1, null
{code}

  was:
{code}
INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);

SELECT a FROM t ORDER BY a DESC NULLS FIRST
{code}
should return 
{code}
3
2
2
2
1
null
{code}


> Fix nulls first sorting behavior
> 
>
> Key: HIVE-23706
> URL: https://issues.apache.org/jira/browse/HIVE-23706
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code}
> INSERT INTO t(a) VALUES (1), (null), (3), (2), (2), (2);
> select a from t order by a desc;
> {code}
> instead of {code}
> null, 3, 2 ,2 ,2, 1
> {code}
> should return 
> {code}
> 3, 2, 2, 2, 1, null
> {code}




[jira] [Updated] (HIVE-23668) Clean up Task for Hive Metrics

2020-06-17 Thread Aasha Medhi (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-23668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-23668:
---
Attachment: HIVE-23668.03.patch
Status: Patch Available  (was: In Progress)

> Clean up Task for Hive Metrics
> --
>
> Key: HIVE-23668
> URL: https://issues.apache.org/jira/browse/HIVE-23668
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23668.01.patch, HIVE-23668.02.patch, 
> HIVE-23668.03.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




