[jira] [Resolved] (HIVE-24344) [cache store] Add valid flag in table wrapper for all constraint

2020-11-26 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma resolved HIVE-24344.
--
Resolution: Duplicate

> [cache store] Add valid flag in table wrapper for all constraint 
> -
>
> Key: HIVE-24344
> URL: https://issues.apache.org/jira/browse/HIVE-24344
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> Currently, if we get null for a constraint value, we fall back to the raw store 
> to validate whether NULL is correct or not. We can add a valid flag which states 
> that a NULL constraint value is correct and thus reduce raw store calls.
> DOD
> Add a flag for all 6 constraints in cachedstore
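The proposed flag can be sketched outside Hive with a tiny illustrative wrapper (the class and method names below are hypothetical, not CachedStore code): a boolean records whether the cached answer is authoritative, so a legitimately-null constraint result no longer forces a raw-store call.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: distinguish "constraints not yet cached" (must fall back
// to the raw store) from "cached as absent" (the cache is authoritative).
public class ConstraintCache {
    private List<String> primaryKeys;   // may legitimately be null when absent
    private boolean primaryKeysValid;   // true once the cache holds the answer

    public void setPrimaryKeys(List<String> keys) {
        this.primaryKeys = keys;        // null/empty is a valid cached answer
        this.primaryKeysValid = true;   // cache is now authoritative
    }

    // Optional.empty() means "not cached yet": the caller consults the raw store.
    public Optional<List<String>> getPrimaryKeys() {
        return primaryKeysValid
            ? Optional.of(primaryKeys == null ? Collections.<String>emptyList() : primaryKeys)
            : Optional.empty();
    }
}
```

The same shape would be repeated for each of the six constraint types.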



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work stopped] (HIVE-24344) [cache store] Add valid flag in table wrapper for all constraint

2020-11-26 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24344 stopped by Ashish Sharma.

> [cache store] Add valid flag in table wrapper for all constraint 
> -
>
> Key: HIVE-24344
> URL: https://issues.apache.org/jira/browse/HIVE-24344
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>
> Description
> Currently, if we get null for a constraint value, we fall back to the raw store 
> to validate whether NULL is correct or not. We can add a valid flag which states 
> that a NULL constraint value is correct and thus reduce raw store calls.
> DOD
> Add a flag for all 6 constraints in cachedstore





[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=517244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517244
 ]

ASF GitHub Bot logged work on HIVE-24436:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 05:38
Start Date: 27/Nov/20 05:38
Worklog Time Spent: 10m 
  Work Description: wangyum commented on pull request #1715:
URL: https://github.com/apache/hive/pull/1715#issuecomment-734654148


   cc @sunchao  @iemejia @viirya @dongjoon-hyun 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517244)
Time Spent: 0.5h  (was: 20m)

> Fix Avro NULL_DEFAULT_VALUE compatibility issue
> ---
>
> Key: HIVE-24436
> URL: https://issues.apache.org/jira/browse/HIVE-24436
> Project: Hive
>  Issue Type: Improvement
>  Components: Avro
>Affects Versions: 2.3.8
>Reporter: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Exception1:
> {noformat}
> - create hive serde table with Catalog
> *** RUN ABORTED ***
>   java.lang.NoSuchMethodError: 'void 
> org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
> java.lang.String, org.codehaus.jackson.JsonNode)'
>   at 
> org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
>   at 
> org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
>   at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
> {noformat}
> Exception2:
> {noformat}
> - alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
>   org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.avro.AvroRuntimeException: Unknown datum class: class 
> org.codehaus.jackson.node.NullNode;
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
>   at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
>   at 
> org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680)
> {noformat}





[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=517243&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517243
 ]

ASF GitHub Bot logged work on HIVE-24436:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 05:36
Start Date: 27/Nov/20 05:36
Worklog Time Spent: 10m 
  Work Description: wangyum commented on a change in pull request #1715:
URL: https://github.com/apache/hive/pull/1715#discussion_r531391483



##
File path: 
serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
##
@@ -235,14 +236,14 @@ private Schema createAvroArray(TypeInfo typeInfo) {
   private List<Schema.Field> getFields(Schema.Field schemaField) {
     List<Schema.Field> fields = new ArrayList<Schema.Field>();
 
-    JsonNode nullDefault = JsonNodeFactory.instance.nullNode();
+    JsonProperties.Null nullDefault = JsonProperties.NULL_VALUE;
     if (schemaField.schema().getType() == Schema.Type.RECORD) {
       for (Schema.Field field : schemaField.schema().getFields()) {
         fields.add(new Schema.Field(field.name(), field.schema(), field.doc(), nullDefault));
       }
     } else {
       fields.add(new Schema.Field(schemaField.name(), schemaField.schema(), schemaField.doc(),
-  nullDefault));
+nullDefault));

Review comment:
   Related code:
   
https://github.com/apache/avro/blob/release-1.8.2/lang/java/avro/src/main/java/org/apache/avro/Schema.java#L421-L424
   
   
https://github.com/apache/avro/blob/release-1.8.2/lang/java/avro/src/main/java/org/apache/avro/util/internal/JacksonUtils.java#L57
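The linked Avro sources show how field defaults are converted internally; with Avro 1.9+, the Jackson-based `Schema.Field` constructor is gone and `JsonProperties.NULL_VALUE` is the supported sentinel for a null default. A minimal sketch, assuming Avro 1.9+ on the classpath (the demo class and its helper are illustrative, not Hive code):

```java
import java.util.Arrays;

import org.apache.avro.JsonProperties;
import org.apache.avro.Schema;

public class NullDefaultDemo {
    // Builds an optional (nullable) string field whose default is null, using the
    // JsonProperties.NULL_VALUE sentinel instead of a Jackson NullNode.
    static Schema.Field nullableStringField(String name) {
        Schema nullableString = Schema.createUnion(Arrays.asList(
                Schema.create(Schema.Type.NULL), Schema.create(Schema.Type.STRING)));
        return new Schema.Field(name, nullableString, null, JsonProperties.NULL_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(nullableStringField("col").hasDefaultValue());
    }
}
```

Passing a raw `null` instead of the sentinel would mean "no default", which is why the PR substitutes the sentinel.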
   







Issue Time Tracking
---

Worklog Id: (was: 517243)
Time Spent: 20m  (was: 10m)

> Fix Avro NULL_DEFAULT_VALUE compatibility issue
> ---
>
> Key: HIVE-24436
> URL: https://issues.apache.org/jira/browse/HIVE-24436
> Project: Hive
>  Issue Type: Improvement
>  Components: Avro
>Affects Versions: 2.3.8
>Reporter: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Exception1:
> {noformat}
> - create hive serde table with Catalog
> *** RUN ABORTED ***
>   java.lang.NoSuchMethodError: 'void 
> org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
> java.lang.String, org.codehaus.jackson.JsonNode)'
>   at 
> org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
>   at 
> org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
>   at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
> {noformat}
> Exception2:
> {noformat}
> - alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
>   org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.avro.AvroRuntimeException: Unknown datum class: class 
> org.codehaus.jackson.node.NullNode;
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
>   at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
>   at 
> org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
>  

[jira] [Work logged] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24436?focusedWorklogId=517241&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517241
 ]

ASF GitHub Bot logged work on HIVE-24436:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 05:32
Start Date: 27/Nov/20 05:32
Worklog Time Spent: 10m 
  Work Description: wangyum opened a new pull request #1715:
URL: https://github.com/apache/hive/pull/1715


   ### What changes were proposed in this pull request?
   
   This PR replaces `null` with `JsonProperties.NULL_VALUE` to fix two compatibility 
issues:
   1. java.lang.NoSuchMethodError: 'void 
org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
java.lang.String, org.codehaus.jackson.JsonNode)'
  ```
  - create hive serde table with Catalog
  *** RUN ABORTED ***
java.lang.NoSuchMethodError: 'void 
org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
  java.lang.String, org.codehaus.jackson.JsonNode)'
at 
org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
at 
org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
at 
org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
at 
org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
at 
org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
at 
org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
at 
org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
at 
org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
at 
org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
  ```
   2. org.apache.avro.AvroRuntimeException: Unknown datum class: class 
org.codehaus.jackson.node.NullNode
  ```
  - alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
  org.apache.avro.AvroRuntimeException: Unknown datum class: class 
org.codehaus.jackson.node.NullNode;
at 
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
at 
org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
at 
org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
at 
org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
at 
org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at 
org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
at 
org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680)
  ```
   
   ### Why are the changes needed?
   
   For compatibility with Avro 1.9.x and Avro 1.10.0.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   
   ### How was this patch tested?
   
   Build and run Spark test:
   ```
mvn -Dtest=none 
-DwildcardSuites=org.apache.spark.sql.hive.execution.HiveDDLSuite test
   ```
   
   





Issue Time Tracking
---

Worklog Id: (was: 517241)
Remaining Estimate: 0h
Time Spent: 10m

> Fix Avro NULL_DEFAULT_VALUE compatibility issue
> ---
>
> Key: HIVE-24436
> URL: https://issues.apache.org/jira/browse/HIVE-24436
> Project: Hive
>  Issue Type: Improvement
>  Components: Avro
>Affects Versions: 2.3.8
>Reporter: Yuming Wang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Exception1:
> {noformat}
> - create hive serde table with Catalog
> *** RUN ABORTED ***
>   java.lang.NoSuchMethodError: 'void 
> org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
> java.lang.String, org.codehaus.jackson.JsonNode)'
>   

[jira] [Updated] (HIVE-24436) Fix Avro NULL_DEFAULT_VALUE compatibility issue

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24436:
--
Labels: pull-request-available  (was: )

> Fix Avro NULL_DEFAULT_VALUE compatibility issue
> ---
>
> Key: HIVE-24436
> URL: https://issues.apache.org/jira/browse/HIVE-24436
> Project: Hive
>  Issue Type: Improvement
>  Components: Avro
>Affects Versions: 2.3.8
>Reporter: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Exception1:
> {noformat}
> - create hive serde table with Catalog
> *** RUN ABORTED ***
>   java.lang.NoSuchMethodError: 'void 
> org.apache.avro.Schema$Field.<init>(java.lang.String, org.apache.avro.Schema, 
> java.lang.String, org.codehaus.jackson.JsonNode)'
>   at 
> org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.createAvroField(TypeInfoToSchema.java:76)
>   at 
> org.apache.hadoop.hive.serde2.avro.TypeInfoToSchema.convert(TypeInfoToSchema.java:61)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.getSchemaFromCols(AvroSerDe.java:170)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:114)
>   at 
> org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:83)
>   at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:533)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:450)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:437)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:281)
>   at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:263)
> {noformat}
> Exception2:
> {noformat}
> - alter hive serde table add columns -- partitioned - AVRO *** FAILED ***
>   org.apache.spark.sql.AnalysisException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.avro.AvroRuntimeException: Unknown datum class: class 
> org.codehaus.jackson.node.NullNode;
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
>   at 
> org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:245)
>   at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createTable(ExternalCatalogWithListener.scala:94)
>   at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:346)
>   at 
> org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:166)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
>   at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3680)
> {noformat}





[jira] [Work logged] (HIVE-24411) Make ThreadPoolExecutorWithOomHook more awareness of OutOfMemoryError

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24411?focusedWorklogId=517216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517216
 ]

ASF GitHub Bot logged work on HIVE-24411:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 02:24
Start Date: 27/Nov/20 02:24
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1695:
URL: https://github.com/apache/hive/pull/1695#issuecomment-734548436


   @kgyrtkirk could you please take a look? thanks!





Issue Time Tracking
---

Worklog Id: (was: 517216)
Time Spent: 40m  (was: 0.5h)

> Make ThreadPoolExecutorWithOomHook more awareness of OutOfMemoryError
> -
>
> Key: HIVE-24411
> URL: https://issues.apache.org/jira/browse/HIVE-24411
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently the ThreadPoolExecutorWithOomHook invokes some OOM hooks and stops 
> HiveServer2 in case of OutOfMemoryError when executing the tasks. The 
> exception is obtained by calling the method _future.get()_; however, it will 
> never be an instance of OutOfMemoryError, as the original error is wrapped in 
> an ExecutionException (see the method _report_ in class FutureTask).
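The wrapping the description refers to is easy to reproduce with the JDK alone. This minimal sketch (class and method names are hypothetical) shows that the value thrown from `future.get()` is an `ExecutionException` whose *cause* is the `OutOfMemoryError`, so a direct `instanceof OutOfMemoryError` check on it never fires:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OomWrapDemo {
    // Returns the exception thrown by future.get() when the task dies with
    // an OutOfMemoryError (simulated here; no real memory exhaustion).
    static Throwable runAndCatch() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> future = pool.submit(
                (Runnable) () -> { throw new OutOfMemoryError("simulated"); });
        try {
            future.get();
            return null;
        } catch (ExecutionException e) {
            return e;   // FutureTask.report wraps the Throwable in ExecutionException
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Throwable t = runAndCatch();
        System.out.println(t instanceof OutOfMemoryError);            // the naive check
        System.out.println(t.getCause() instanceof OutOfMemoryError); // the correct check
    }
}
```

An OOM-aware executor therefore has to inspect `getCause()` (or override `afterExecute`) rather than test the thrown exception itself.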





[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517215
 ]

ASF GitHub Bot logged work on HIVE-24424:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 02:18
Start Date: 27/Nov/20 02:18
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #1704:
URL: https://github.com/apache/hive/pull/1704#issuecomment-734539984


   Thanks @miklosgergely for the review.  Can you please take a look once more? 
I believe I've addressed your comments.





Issue Time Tracking
---

Worklog Id: (was: 517215)
Time Spent: 1h 10m  (was: 1h)

> Use PreparedStatements in DbNotificationListener getNextNLId
> 
>
> Key: HIVE-24424
> URL: https://issues.apache.org/jira/browse/HIVE-24424
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Simplify the code, remove debug logging concatenation, and make it more 
> readable.





[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517214
 ]

ASF GitHub Bot logged work on HIVE-24424:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 02:17
Start Date: 27/Nov/20 02:17
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1704:
URL: https://github.com/apache/hive/pull/1704#discussion_r531349263



##
File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
##
@@ -970,28 +971,44 @@ private static void close(ResultSet rs) {
     }
   }
 
-  private long getNextNLId(Statement stmt, SQLGenerator sqlGenerator, String sequence)
+  /**
+   * Get the next notification log ID.
+   *
+   * @return The next ID to use for a notification log message
+   * @throws SQLException if a database access error occurs or this method is
+   *           called on a closed connection
+   * @throws MetaException if the sequence table is not properly initialized
+   */
+  private long getNextNLId(Connection con, SQLGenerator sqlGenerator, String sequence)
       throws SQLException, MetaException {
-    String s = sqlGenerator.addForUpdateClause("select \"NEXT_VAL\" from " +
-        "\"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = " + quoteString(sequence));
-    LOG.debug("Going to execute query <" + s + ">");
-    ResultSet rs = null;
-    try {
-      rs = stmt.executeQuery(s);
-      if (!rs.next()) {
-        throw new MetaException("Transaction database not properly configured, can't find next NL id.");
+    final String seq_sql = "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?";

Review comment:
   Fixed.  Thanks!
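The pattern adopted in the diff above can be sketched in isolation with plain JDBC (the `NextIdDao` class and its error message are illustrative, not the actual Hive code): the sequence name is bound as a parameter rather than concatenated and quoted, and try-with-resources replaces manual `close` calls:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch of the PreparedStatement pattern from the review above.
public class NextIdDao {
    // The sequence name is a '?' placeholder, never concatenated into the SQL.
    static final String SEQ_SQL =
        "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?";

    static long getNextId(Connection con, String sequence) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SEQ_SQL)) {
            ps.setString(1, sequence);   // driver handles quoting and escaping
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("Sequence " + sequence + " not initialized");
                }
                return rs.getLong(1);
            }
        }
    }
}
```

Besides readability, binding the value avoids both SQL injection and per-call string building for the debug log.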







Issue Time Tracking
---

Worklog Id: (was: 517214)
Time Spent: 1h  (was: 50m)

> Use PreparedStatements in DbNotificationListener getNextNLId
> 
>
> Key: HIVE-24424
> URL: https://issues.apache.org/jira/browse/HIVE-24424
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Simplify the code, remove debug logging concatenation, and make it more 
> readable.





[jira] [Work logged] (HIVE-23891) Using UNION sql clause and speculative execution can cause file duplication in Tez

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23891?focusedWorklogId=517204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517204
 ]

ASF GitHub Bot logged work on HIVE-23891:
-

Author: ASF GitHub Bot
Created on: 27/Nov/20 00:44
Start Date: 27/Nov/20 00:44
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1294:
URL: https://github.com/apache/hive/pull/1294#issuecomment-734518903


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.





Issue Time Tracking
---

Worklog Id: (was: 517204)
Time Spent: 2h 10m  (was: 2h)

> Using UNION sql clause and speculative execution can cause file duplication 
> in Tez
> --
>
> Key: HIVE-23891
> URL: https://issues.apache.org/jira/browse/HIVE-23891
> Project: Hive
>  Issue Type: Bug
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23891.1.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Hello, 
> the specific scenario when this can happen:
>  - the execution engine is Tez;
>  - speculative execution is on;
>  - the query inserts into a table and the last step is a UNION sql clause;
> The problem is that Tez creates an extra layer of subdirectories when there 
> is a UNION. Later, when deduplicating, Hive doesn't take that into account 
> and only deduplicates folders but not the files inside.
> So for a query like this:
> {code:sql}
> insert overwrite table union_all
> select * from union_first_part
> union all
> select * from union_second_part;
> {code}
> The folder structure afterwards will be like this (a possible example):
> {code:java}
> .../union_all/HIVE_UNION_SUBDIR_1/00_0
> .../union_all/HIVE_UNION_SUBDIR_1/00_1
> .../union_all/HIVE_UNION_SUBDIR_2/00_1
> {code}
> The attached patch increases the number of folder levels that Hive will check 
> recursively for duplicates when we have a UNION in Tez.
> Feel free to reach out if you have any questions :).
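The depth issue described above can be reproduced with a stdlib-only sketch (hypothetical demo class, using the example layout from the description): a listing capped at one directory level never sees the duplicated files inside the `HIVE_UNION_SUBDIR_*` directories, while one extra level does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class UnionDepthDemo {
    // Recreates the example layout: two union subdirectories; 00_1 appears in
    // both HIVE_UNION_SUBDIR_1 (speculative duplicate) and HIVE_UNION_SUBDIR_2.
    static Path makeLayout() throws IOException {
        Path table = Files.createTempDirectory("union_all");
        Files.createDirectories(table.resolve("HIVE_UNION_SUBDIR_1"));
        Files.createDirectories(table.resolve("HIVE_UNION_SUBDIR_2"));
        Files.createFile(table.resolve("HIVE_UNION_SUBDIR_1/00_0"));
        Files.createFile(table.resolve("HIVE_UNION_SUBDIR_1/00_1"));
        Files.createFile(table.resolve("HIVE_UNION_SUBDIR_2/00_1"));
        return table;
    }

    // Counts regular files visible when the walk is capped at maxDepth levels.
    static long filesAtDepth(Path root, int maxDepth) throws IOException {
        try (Stream<Path> s = Files.walk(root, maxDepth)) {
            return s.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path t = makeLayout();
        System.out.println(filesAtDepth(t, 1)); // depth-1 scan sees no files at all
        System.out.println(filesAtDepth(t, 2)); // the files live one level deeper
    }
}
```

This mirrors why the patch increases the number of folder levels Hive checks recursively when a UNION is present under Tez.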





[jira] [Work logged] (HIVE-23965) Improve plan regression tests using TPCDS30TB metastore dump and custom configs

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23965?focusedWorklogId=517196&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517196
 ]

ASF GitHub Bot logged work on HIVE-23965:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 22:24
Start Date: 26/Nov/20 22:24
Worklog Time Spent: 10m 
  Work Description: zabetak commented on pull request #1714:
URL: https://github.com/apache/hive/pull/1714#issuecomment-734496312


   This is the same PR as https://github.com/apache/hive/pull/1347 plus an 
extra commit 
https://github.com/apache/hive/pull/1714/commits/df6e610c7f7b11b0bf06b500b25613c1a811c055
 to handle metastore upgrades without the need to rebuild and publish the 
docker image.

   The initial PR (https://github.com/apache/hive/pull/1347) was reverted from 
master since tests were failing. Between the pre-commit and post-commit runs, 
some commits affected the schema of the metastore, thus leading to these 
failures. 





Issue Time Tracking
---

Worklog Id: (was: 517196)
Time Spent: 5h 40m  (was: 5.5h)

> Improve plan regression tests using TPCDS30TB metastore dump and custom 
> configs
> ---
>
> Key: HIVE-23965
> URL: https://issues.apache.org/jira/browse/HIVE-23965
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: master355.tgz
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> The existing regression tests (HIVE-12586) based on TPC-DS have certain 
> shortcomings:
> The table statistics do not reflect cardinalities from a specific TPC-DS 
> scale factor (SF). Some tables are from a 30TB dataset, others from a 200GB 
> dataset, and others from a 3GB dataset. This mix leads to plans that may 
> never appear when using an actual TPC-DS dataset. 
> The existing statistics do not contain information about partitions, something 
> that can have a big impact on the resulting plans.
> The existing regression tests rely more or less on the default 
> configuration (hive-site.xml). In real-life scenarios, though, some of the 
> configurations differ and may impact the choices of the optimizer.
> This issue aims to address the above shortcomings by using a curated 
> TPCDS30TB metastore dump along with some custom hive configurations. 





[jira] [Work logged] (HIVE-23965) Improve plan regression tests using TPCDS30TB metastore dump and custom configs

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23965?focusedWorklogId=517195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517195
 ]

ASF GitHub Bot logged work on HIVE-23965:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 22:19
Start Date: 26/Nov/20 22:19
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request #1714:
URL: https://github.com/apache/hive/pull/1714


   
   ### What changes were proposed in this pull request and why?
   1. Add new perf driver, TestTezTPCDS30TBCliDriver, relying on a dockerized 
metastore.
   2. Use Dockerized postgres metastore with TPC-DS 30TB dump
   3. Remove old drivers (with and without constraints), related classes
   (e.g., MetastoreDumpUtility), and resources.
   4. Use Hive config properties obtained and curated from real-life usages
   
   5. Allow AbstractCliConfig to override metastore DB type
   6. Rework CorePerfCliDriver to allow pre-initialized metastores
   7. Remove redundant logs in System.err. Logging and throwing an exception
   is an anti-pattern.
   8. Replace assertions with exceptions and improve the messages.
   
   9. Upgrade postgres JDBC driver to version 42.2.14 to be compatible
   with the docker image used
   10. Disable queries 14 (HIVE-24167), 30 (HIVE-23964)
   11. Re-enable CBO plan tests for queries 44, 45, 67, 70, 86
   
   The queries were disabled as part of HIVE-20718. They were supposed to
   be fixed in Calcite 1.18.0 and currently Hive is in 1.21.0 so it is not
   surprising that they pass.
   
   12. Add missing queries: cbo_query41, cbo_query62, query62
   
   ### Does this PR introduce _any_ user-facing change?
   No, except the fact that old TestTezPerf drivers no longer exist. 
   
   ### How was this patch tested?
   `mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver`
   





Issue Time Tracking
---

Worklog Id: (was: 517195)
Time Spent: 5.5h  (was: 5h 20m)

> Improve plan regression tests using TPCDS30TB metastore dump and custom 
> configs
> ---
>
> Key: HIVE-23965
> URL: https://issues.apache.org/jira/browse/HIVE-23965
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: master355.tgz
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> The existing regression tests (HIVE-12586) based on TPC-DS have certain 
> shortcomings:
> The table statistics do not reflect cardinalities from a specific TPC-DS 
> scale factor (SF). Some tables are from a 30TB dataset, others from a 200GB 
> dataset, and others from a 3GB dataset. This mix leads to plans that may 
> never appear when using an actual TPC-DS dataset. 
> The existing statistics do not contain information about partitions, 
> something that can have a big impact on the resulting plans.
> The existing regression tests rely more or less on the default 
> configuration (hive-site.xml). In real-life scenarios, though, some of the 
> configurations differ and may impact the choices of the optimizer.
> This issue aims to address the above shortcomings by using a curated 
> TPCDS30TB metastore dump along with some custom hive configurations. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=517128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517128
 ]

ASF GitHub Bot logged work on HIVE-24259:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 17:20
Start Date: 26/Nov/20 17:20
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1610:
URL: https://github.com/apache/hive/pull/1610#discussion_r531158744



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -2844,15 +2814,10 @@ public SQLAllTableConstraints 
getAllTableConstraints(String catName, String dbNa
   return rawStore.getAllTableConstraints(catName, dbName, tblName);
 }
 
-Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
-if (tbl == null) {
-  // The table containing the constraints is not yet loaded in cache
-  return rawStore.getAllTableConstraints(catName, dbName, tblName);
-}
 SQLAllTableConstraints constraints = 
sharedCache.listCachedAllTableConstraints(catName, dbName, tblName);
 
-// if any of the constraint value is missing then there might be the case 
of partial constraints are stored in cached.
-// So fall back to raw store for correct values
+/* If constraint value is missing then there might be the case that table 
is not stored in cached or

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517128)
Time Spent: 1h 40m  (was: 1.5h)

> [CachedStore] Optimise get constraints call by removing redundant table check 
> --
>
> Key: HIVE-24259
> URL: https://issues.apache.org/jira/browse/HIVE-24259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Description -
> Problem - 
> 1. Redundant check whether the table is present or not
> 2. Currently, in order to get all constraints from the cached store, 6 
> different calls are made within the cached store, which leads to 6 different 
> calls to the raw store
>  
> DOD
> 1. Check only once whether the table exists in the cached store.
> 2. Instead of fetching individual constraints from the cached store, add a 
> method which returns all constraints at once and, if the data is not 
> consistent, falls back to the raw store.  
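The consolidation described in the DOD above — one cache lookup plus a single raw-store fallback when the cached constraints are incomplete — can be sketched as follows. This is an illustrative sketch only; all class and method names here are hypothetical stand-ins for the real CachedStore/SharedCache code, and the cache is modeled as a plain map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SQLAllTableConstraints: a null list means
// "this constraint type was not (fully) loaded into the cache".
class AllTableConstraints {
    List<String> primaryKeys;
    List<String> foreignKeys;
}

class ConstraintStore {
    private final Map<String, AllTableConstraints> cache = new HashMap<>();
    private final Map<String, AllTableConstraints> rawStore = new HashMap<>();

    void putRaw(String table, AllTableConstraints c) { rawStore.put(table, c); }
    void putCached(String table, AllTableConstraints c) { cache.put(table, c); }

    // Single entry point: one cache lookup instead of six separate ones,
    // and a single fallback decision when any constraint list is missing.
    AllTableConstraints getAllTableConstraints(String table) {
        AllTableConstraints cached = cache.get(table);
        if (cached == null || cached.primaryKeys == null || cached.foreignKeys == null) {
            // Table not cached, or only partial constraints cached:
            // fall back to the raw store once for consistent values.
            return rawStore.get(table);
        }
        return cached;
    }
}
```

The point of the sketch is the shape of the decision, not the data model: the fallback check happens once per call, instead of once per constraint type.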



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=517126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517126
 ]

ASF GitHub Bot logged work on HIVE-24259:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 17:20
Start Date: 26/Nov/20 17:20
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1610:
URL: https://github.com/apache/hive/pull/1610#discussion_r531158652



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
##
@@ -2397,7 +2397,7 @@ public SQLAllTableConstraints 
listCachedAllTableConstraints(String catName, Stri
 
   public List listCachedForeignKeys(String catName, String 
foreignDbName, String foreignTblName,
String parentDbName, String 
parentTblName) {
-List keys = new ArrayList<>();
+List keys = null;

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517126)
Time Spent: 1.5h  (was: 1h 20m)

> [CachedStore] Optimise get constraints call by removing redundant table check 
> --
>
> Key: HIVE-24259
> URL: https://issues.apache.org/jira/browse/HIVE-24259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Description -
> Problem - 
> 1. Redundant check whether the table is present or not
> 2. Currently, in order to get all constraints from the cached store, 6 
> different calls are made within the cached store, which leads to 6 different 
> calls to the raw store
>  
> DOD
> 1. Check only once whether the table exists in the cached store.
> 2. Instead of fetching individual constraints from the cached store, add a 
> method which returns all constraints at once and, if the data is not 
> consistent, falls back to the raw store.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24259) [CachedStore] Optimise get constraints call by removing redundant table check

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24259?focusedWorklogId=517125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517125
 ]

ASF GitHub Bot logged work on HIVE-24259:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 17:20
Start Date: 26/Nov/20 17:20
Worklog Time Spent: 10m 
  Work Description: ashish-kumar-sharma commented on a change in pull 
request #1610:
URL: https://github.com/apache/hive/pull/1610#discussion_r531158438



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
##
@@ -2836,14 +2836,32 @@ long getPartsFound() {
   @Override
   public SQLAllTableConstraints getAllTableConstraints(String catName, String 
dbName, String tblName)
   throws MetaException, NoSuchObjectException {
-SQLAllTableConstraints sqlAllTableConstraints = new 
SQLAllTableConstraints();
-sqlAllTableConstraints.setPrimaryKeys(getPrimaryKeys(catName, dbName, 
tblName));
-sqlAllTableConstraints.setForeignKeys(getForeignKeys(catName, null, null, 
dbName, tblName));
-sqlAllTableConstraints.setUniqueConstraints(getUniqueConstraints(catName, 
dbName, tblName));
-
sqlAllTableConstraints.setDefaultConstraints(getDefaultConstraints(catName, 
dbName, tblName));
-sqlAllTableConstraints.setCheckConstraints(getCheckConstraints(catName, 
dbName, tblName));
-
sqlAllTableConstraints.setNotNullConstraints(getNotNullConstraints(catName, 
dbName, tblName));
-return sqlAllTableConstraints;
+
+catName = StringUtils.normalizeIdentifier(catName);
+dbName = StringUtils.normalizeIdentifier(dbName);
+tblName = StringUtils.normalizeIdentifier(tblName);
+if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && 
rawStore.isActiveTransaction())) {
+  return rawStore.getAllTableConstraints(catName, dbName, tblName);
+}
+
+Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName);
+if (tbl == null) {
+  // The table containing the constraints is not yet loaded in cache
+  return rawStore.getAllTableConstraints(catName, dbName, tblName);
+}
+SQLAllTableConstraints constraints = 
sharedCache.listCachedAllTableConstraints(catName, dbName, tblName);
+
+// if any of the constraint value is missing then there might be the case 
of partial constraints are stored in cached.
+// So fall back to raw store for correct values
+if (constraints != null && 
CollectionUtils.isNotEmpty(constraints.getPrimaryKeys()) && CollectionUtils

Review comment:
   Added flag





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517125)
Time Spent: 1h 20m  (was: 1h 10m)

> [CachedStore] Optimise get constraints call by removing redundant table check 
> --
>
> Key: HIVE-24259
> URL: https://issues.apache.org/jira/browse/HIVE-24259
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Description -
> Problem - 
> 1. Redundant check whether the table is present or not
> 2. Currently, in order to get all constraints from the cached store, 6 
> different calls are made within the cached store, which leads to 6 different 
> calls to the raw store
>  
> DOD
> 1. Check only once whether the table exists in the cached store.
> 2. Instead of fetching individual constraints from the cached store, add a 
> method which returns all constraints at once and, if the data is not 
> consistent, falls back to the raw store.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24389) Trailing zeros of constant decimal numbers are removed

2020-11-26 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-24389.
---
Resolution: Fixed

Pushed to master, Thanks [~jcamachorodriguez] for review.

> Trailing zeros of constant decimal numbers are removed
> --
>
> Key: HIVE-24389
> URL: https://issues.apache.org/jira/browse/HIVE-24389
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In some cases Hive removes trailing zeros of constant decimal numbers
> {code}
> select cast(1.1 as decimal(22, 2)) 
> 1.1
> {code}
> In this case *WritableConstantHiveDecimalObjectInspector* is used and this 
> object inspector takes its wrapped HiveDecimal scale instead of the scale 
> specified in the wrapped typeinfo: 
> {code}
> this = {WritableConstantHiveDecimalObjectInspector@14415} 
>  value = {HiveDecimalWritable@14426} "1.1"
>  typeInfo = {DecimalTypeInfo@14421} "decimal(22,2)"{code}
> However, in the case of an expression with an aggregate function, 
> *WritableHiveDecimalObjectInspector* is used
> {code}
> select cast(sum(1.1) as decimal(22, 2))
> 1.10
> {code}
> {code}
> o = {HiveDecimalWritable@16633} "1.1"
> oi = {WritableHiveDecimalObjectInspector@16634} 
>  typeInfo = {DecimalTypeInfo@16640} "decimal(22,2)"
> {code}
> Casting the expressions to string
> {code:java}
> select cast(cast(1.1 as decimal(22, 2)) as string), cast(cast(sum(1.1) as 
> decimal(22, 2)) as string)
> 1.1   1.10
> {code}
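The scale behaviour described in this issue mirrors plain java.math.BigDecimal semantics: a decimal literal keeps the scale it was written with unless it is explicitly rescaled to the declared type's scale. A minimal illustration of that distinction (this is not Hive code, just the underlying decimal semantics):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// "1.1" and "1.10" are numerically equal, but BigDecimal.toString()
// preserves scale - so a cast to decimal(22,2) should rescale the value
// to 2 fractional digits in order to display the trailing zero.
class DecimalScaleDemo {
    static String castToDecimalScale2(String literal) {
        // Rescale to the declared scale, as a cast to decimal(22,2) should.
        return new BigDecimal(literal).setScale(2, RoundingMode.HALF_UP).toString();
    }

    public static void main(String[] args) {
        System.out.println(new BigDecimal("1.1"));      // 1.1  (scale of the literal)
        System.out.println(castToDecimalScale2("1.1")); // 1.10 (scale of the target type)
    }
}
```

This is why taking the scale from the wrapped HiveDecimal (the literal) instead of from the DecimalTypeInfo (the declared type) drops the trailing zero.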



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24389) Trailing zeros of constant decimal numbers are removed

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24389?focusedWorklogId=517095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517095
 ]

ASF GitHub Bot logged work on HIVE-24389:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 14:59
Start Date: 26/Nov/20 14:59
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #1676:
URL: https://github.com/apache/hive/pull/1676


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517095)
Time Spent: 2h  (was: 1h 50m)

> Trailing zeros of constant decimal numbers are removed
> --
>
> Key: HIVE-24389
> URL: https://issues.apache.org/jira/browse/HIVE-24389
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In some cases Hive removes trailing zeros of constant decimal numbers
> {code}
> select cast(1.1 as decimal(22, 2)) 
> 1.1
> {code}
> In this case *WritableConstantHiveDecimalObjectInspector* is used and this 
> object inspector takes its wrapped HiveDecimal scale instead of the scale 
> specified in the wrapped typeinfo: 
> {code}
> this = {WritableConstantHiveDecimalObjectInspector@14415} 
>  value = {HiveDecimalWritable@14426} "1.1"
>  typeInfo = {DecimalTypeInfo@14421} "decimal(22,2)"{code}
> However, in the case of an expression with an aggregate function, 
> *WritableHiveDecimalObjectInspector* is used
> {code}
> select cast(sum(1.1) as decimal(22, 2))
> 1.10
> {code}
> {code}
> o = {HiveDecimalWritable@16633} "1.1"
> oi = {WritableHiveDecimalObjectInspector@16634} 
>  typeInfo = {DecimalTypeInfo@16640} "decimal(22,2)"
> {code}
> Casting the expressions to string
> {code:java}
> select cast(cast(1.1 as decimal(22, 2)) as string), cast(cast(sum(1.1) as 
> decimal(22, 2)) as string)
> 1.1   1.10
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22782) Consolidate metastore call to fetch constraints

2020-11-26 Thread Sankar Hariappan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202575#comment-17202575
 ] 

Sankar Hariappan edited comment on HIVE-22782 at 11/26/20, 2:32 PM:


PR merged to master. 
Thanks [~ashish-kumar-sharma] for the contribution!


was (Author: sankarh):
PR merged to master. 
Thanks [~ashish-kumar-sharma] for the constribution!

> Consolidate metastore call to fetch constraints
> ---
>
> Key: HIVE-22782
> URL: https://issues.apache.org/jira/browse/HIVE-22782
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Currently separate calls are made to metastore to fetch constraints like Pk, 
> fk, not null etc. Since planner always retrieve these constraints we should 
> retrieve all of them in one call.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24435:
--
Labels: pull-request-available  (was: )

> Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
> -
>
> Key: HIVE-24435
> URL: https://issues.apache.org/jira/browse/HIVE-24435
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
> create table t (d string);
> insert into t values('2020-11-16 22:18:40 UTC');
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> set hive.fetch.task.conversion=none;
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> {code}
> results:
> {code}
> -- std udf:
> >2020-11-16 22:18:40 UTC<   1605593920  2020-11-16 22:18:40 
> >2020-11-16
> -- vectorized udf
> >2020-11-16 22:18:40 UTC<   NULLNULLNULL
> {code}
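For reference, the expected (non-vectorized) behaviour for this test string can be reproduced with plain java.time: parse the string together with its zone and convert to epoch seconds, which is what unix_timestamp(d) does. A pure-UTC interpretation of "2020-11-16 22:18:40 UTC" gives 1605565120; the 1605593920 shown above is exactly 8 hours later, consistent with a session time zone having been applied. This sketch is not the Hive implementation, just the reference semantics:

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Parses the issue's test string and converts it to Unix seconds.
// "VV" matches a zone ID such as "UTC", so the trailing zone in the
// string pins the result and makes it deterministic.
class UnixTimestampDemo {
    static long toUnixSeconds(String s) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss VV");
        return ZonedDateTime.parse(s, fmt).toEpochSecond();
    }

    public static void main(String[] args) {
        System.out.println(toUnixSeconds("2020-11-16 22:18:40 UTC")); // 1605565120
    }
}
```

The vectorized path returning NULL for the same input, while the row-mode UDF parses it, is the inconsistency this issue tracks.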



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24435?focusedWorklogId=517035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517035
 ]

ASF GitHub Bot logged work on HIVE-24435:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 12:02
Start Date: 26/Nov/20 12:02
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1713:
URL: https://github.com/apache/hive/pull/1713


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517035)
Remaining Estimate: 0h
Time Spent: 10m

> Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
> -
>
> Key: HIVE-24435
> URL: https://issues.apache.org/jira/browse/HIVE-24435
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code}
> create table t (d string);
> insert into t values('2020-11-16 22:18:40 UTC');
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> set hive.fetch.task.conversion=none;
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> {code}
> results:
> {code}
> -- std udf:
> >2020-11-16 22:18:40 UTC<   1605593920  2020-11-16 22:18:40 
> >2020-11-16
> -- vectorized udf
> >2020-11-16 22:18:40 UTC<   NULLNULLNULL
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24409) Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24409?focusedWorklogId=517027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517027
 ]

ASF GitHub Bot logged work on HIVE-24409:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 11:49
Start Date: 26/Nov/20 11:49
Worklog Time Spent: 10m 
  Work Description: maheshk114 merged pull request #1708:
URL: https://github.com/apache/hive/pull/1708


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517027)
Time Spent: 0.5h  (was: 20m)

> Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc
> --
>
> Key: HIVE-24409
> URL: https://issues.apache.org/jira/browse/HIVE-24409
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2020-11-23 at 10.52.49 AM.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> !Screenshot 2020-11-23 at 10.52.49 AM.png|width=858,height=493!  
> Lines of interest:
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L535]
>  (non-vectorized path due to stats)
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java#L581]
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart

2020-11-26 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-24435:

Comment: was deleted

(was: looks like there are some things here
* unix_timestamp is deprecated - recommends to use current_timestamp - however 
current_timestamp doesnt take any argument; so its puzzling to suggest to use 
that
* GenericUDFUnixTimeStamp  has some implementation; but it also extends 
GenericUDFToUnixTimeStamp which also has a few vectorized implementations 
attached
* unix_timestamp behaves 100% the same as to_unix_timestamp in case an argument 
is specfied)

> Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
> -
>
> Key: HIVE-24435
> URL: https://issues.apache.org/jira/browse/HIVE-24435
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> {code}
> create table t (d string);
> insert into t values('2020-11-16 22:18:40 UTC');
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> set hive.fetch.task.conversion=none;
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> {code}
> results:
> {code}
> -- std udf:
> >2020-11-16 22:18:40 UTC<   1605593920  2020-11-16 22:18:40 
> >2020-11-16
> -- vectorized udf
> >2020-11-16 22:18:40 UTC<   NULLNULLNULL
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart

2020-11-26 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239211#comment-17239211
 ] 

Zoltan Haindrich commented on HIVE-24435:
-

looks like there are some things here
* unix_timestamp is deprecated - the deprecation message recommends using 
current_timestamp - however current_timestamp doesn't take any argument, so 
it's puzzling to suggest using that
* GenericUDFUnixTimeStamp has some implementation; but it also extends 
GenericUDFToUnixTimeStamp, which also has a few vectorized implementations 
attached
* unix_timestamp behaves 100% the same as to_unix_timestamp in case an argument 
is specified

> Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
> -
>
> Key: HIVE-24435
> URL: https://issues.apache.org/jira/browse/HIVE-24435
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> {code}
> create table t (d string);
> insert into t values('2020-11-16 22:18:40 UTC');
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> set hive.fetch.task.conversion=none;
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> {code}
> results:
> {code}
> -- std udf:
> >2020-11-16 22:18:40 UTC<   1605593920  2020-11-16 22:18:40 
> >2020-11-16
> -- vectorized udf
> >2020-11-16 22:18:40 UTC<   NULLNULLNULL
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart

2020-11-26 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-24435:
---


> Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
> -
>
> Key: HIVE-24435
> URL: https://issues.apache.org/jira/browse/HIVE-24435
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> {code}
> create table t (d string);
> insert into t values('2020-11-16 22:18:40 UTC');
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> set hive.fetch.task.conversion=none;
> select
>   '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
> to_date(from_unixtime(unix_timestamp(d)))
> from t
> ;
> {code}
> results:
> {code}
> -- std udf:
> >2020-11-16 22:18:40 UTC<   1605593920  2020-11-16 22:18:40 
> >2020-11-16
> -- vectorized udf
> >2020-11-16 22:18:40 UTC<   NULLNULLNULL
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517001
 ]

ASF GitHub Bot logged work on HIVE-24424:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 11:09
Start Date: 26/Nov/20 11:09
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on a change in pull request 
#1704:
URL: https://github.com/apache/hive/pull/1704#discussion_r530952029



##
File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
##
@@ -970,28 +971,44 @@ private static void close(ResultSet rs) {
 }
   }
 
-  private long getNextNLId(Statement stmt, SQLGenerator sqlGenerator, String 
sequence)
+  /**
+   * Get the next notification log ID.
+   *
+   * @return The next ID to use for a notification log message
+   * @throws SQLException if a database access error occurs or this method is
+   *   called on a closed connection
+   * @throws MetaException if the sequence table is not properly initialized
+   */
+  private long getNextNLId(Connection con, SQLGenerator sqlGenerator, String 
sequence)
   throws SQLException, MetaException {
-String s = sqlGenerator.addForUpdateClause("select \"NEXT_VAL\" from " +
-"\"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = " + 
quoteString(sequence));
-LOG.debug("Going to execute query <" + s + ">");
-ResultSet rs = null;
-try {
-  rs = stmt.executeQuery(s);
-  if (!rs.next()) {
-throw new MetaException("Transaction database not properly configured, 
can't find next NL id.");
+final String seq_sql = "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where 
\"SEQUENCE_NAME\" = ?";
+final String upd_sql = "update \"SEQUENCE_TABLE\" set \"NEXT_VAL\" = ? 
where \"SEQUENCE_NAME\" = ?";
+
+final String sou_sql = sqlGenerator.addForUpdateClause(seq_sql);

Review comment:
   Please use camelCase for variables within a function.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517001)
Time Spent: 50m  (was: 40m)

> Use PreparedStatements in DbNotificationListener getNextNLId
> 
>
> Key: HIVE-24424
> URL: https://issues.apache.org/jira/browse/HIVE-24424
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Simplify the code, remove debug logging concatenation, and make it more 
> readable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24424) Use PreparedStatements in DbNotificationListener getNextNLId

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24424?focusedWorklogId=517000&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-517000
 ]

ASF GitHub Bot logged work on HIVE-24424:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 11:08
Start Date: 26/Nov/20 11:08
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on a change in pull request 
#1704:
URL: https://github.com/apache/hive/pull/1704#discussion_r530951752



##
File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
##
@@ -970,28 +971,44 @@ private static void close(ResultSet rs) {
 }
   }
 
-  private long getNextNLId(Statement stmt, SQLGenerator sqlGenerator, String 
sequence)
+  /**
+   * Get the next notification log ID.
+   *
+   * @return The next ID to use for a notification log message
+   * @throws SQLException if a database access error occurs or this method is
+   *   called on a closed connection
+   * @throws MetaException if the sequence table is not properly initialized
+   */
+  private long getNextNLId(Connection con, SQLGenerator sqlGenerator, String 
sequence)
   throws SQLException, MetaException {
-String s = sqlGenerator.addForUpdateClause("select \"NEXT_VAL\" from " +
-"\"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = " + 
quoteString(sequence));
-LOG.debug("Going to execute query <" + s + ">");
-ResultSet rs = null;
-try {
-  rs = stmt.executeQuery(s);
-  if (!rs.next()) {
-throw new MetaException("Transaction database not properly configured, 
can't find next NL id.");
+final String seq_sql = "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where 
\"SEQUENCE_NAME\" = ?";

Review comment:
   These two are constants, please extract them.
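The suggestion above — extract the two SQL strings as constants and bind values through PreparedStatement placeholders instead of concatenation — might look roughly like this. It is a sketch only: the table and column names follow the diff, but the FOR UPDATE clause added by sqlGenerator and the surrounding error handling are omitted, and the method is never actually run against a database here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch of getNextNLId with the SQL extracted to constants and values
// bound via PreparedStatement parameters rather than string concatenation.
class SequenceHelper {
    static final String SELECT_NEXT_VAL_SQL =
        "select \"NEXT_VAL\" from \"SEQUENCE_TABLE\" where \"SEQUENCE_NAME\" = ?";
    static final String UPDATE_NEXT_VAL_SQL =
        "update \"SEQUENCE_TABLE\" set \"NEXT_VAL\" = ? where \"SEQUENCE_NAME\" = ?";

    // Small helper: count the bind placeholders in a statement.
    static int countPlaceholders(String sql) {
        int n = 0;
        for (char c : sql.toCharArray()) {
            if (c == '?') {
                n++;
            }
        }
        return n;
    }

    static long getNextNLId(Connection con, String sequence) throws SQLException {
        try (PreparedStatement select = con.prepareStatement(SELECT_NEXT_VAL_SQL);
             PreparedStatement update = con.prepareStatement(UPDATE_NEXT_VAL_SQL)) {
            select.setString(1, sequence);
            try (ResultSet rs = select.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("Sequence " + sequence + " not initialized");
                }
                long id = rs.getLong(1);
                // Bump the sequence for the next caller.
                update.setLong(1, id + 1);
                update.setString(2, sequence);
                update.executeUpdate();
                return id;
            }
        }
    }
}
```

Besides readability, binding the sequence name as a parameter avoids the quoting helper entirely and lets the driver handle escaping.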





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 517000)
Time Spent: 40m  (was: 0.5h)

> Use PreparedStatements in DbNotificationListener getNextNLId
> 
>
> Key: HIVE-24424
> URL: https://issues.apache.org/jira/browse/HIVE-24424
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Simplify the code, remove debug logging concatenation, and make it more 
> readable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24423) Improve DbNotificationListener Thread

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24423?focusedWorklogId=516999&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516999
 ]

ASF GitHub Bot logged work on HIVE-24423:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 11:01
Start Date: 26/Nov/20 11:01
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on a change in pull request 
#1703:
URL: https://github.com/apache/hive/pull/1703#discussion_r530947449



##
File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
##
@@ -1242,64 +1244,50 @@ private void process(NotificationEvent event, 
ListenerEvent listenerEvent) throw
   }
 
   private static class CleanerThread extends Thread {
-private RawStore rs;
+private final RawStore rs;
 private int ttl;
-private boolean shouldRun = true;
 private long sleepTime;
 
 CleanerThread(Configuration conf, RawStore rs) {
   super("DB-Notification-Cleaner");
-  this.rs = rs;
-  boolean isReplEnabled = MetastoreConf.getBoolVar(conf, ConfVars.REPLCMENABLED);
-  if(isReplEnabled){
-setTimeToLive(MetastoreConf.getTimeVar(conf, ConfVars.REPL_EVENT_DB_LISTENER_TTL,
-TimeUnit.SECONDS));
-  }
-  else {
-setTimeToLive(MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_TTL,
-TimeUnit.SECONDS));
-  }
-  setCleanupInterval(MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL,
-  TimeUnit.MILLISECONDS));
   setDaemon(true);
+  this.rs = Objects.requireNonNull(rs);
+
+  boolean isReplEnabled = MetastoreConf.getBoolVar(conf, ConfVars.REPLCMENABLED);
+  ConfVars ttlConf = (isReplEnabled) ? ConfVars.REPL_EVENT_DB_LISTENER_TTL : ConfVars.EVENT_DB_LISTENER_TTL;
+  setTimeToLive(MetastoreConf.getTimeVar(conf, ttlConf, TimeUnit.SECONDS));
+  setCleanupInterval(
+  MetastoreConf.getTimeVar(conf, ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, TimeUnit.MILLISECONDS));
 }
 
 @Override
 public void run() {
-  while (shouldRun) {
+  while (true) {
+LOG.debug("Cleaner thread running");
 try {
   rs.cleanNotificationEvents(ttl);
   rs.cleanWriteNotificationEvents(ttl);
 } catch (Exception ex) {
-  //catching exceptions here makes sure that the thread doesn't die in case of unexpected
-  //exceptions
-  LOG.warn("Exception received while cleaning notifications: ", ex);
+  LOG.warn("Exception received while cleaning notifications", ex);

Review comment:
   What if an interruption occurs while the CleanerThread is inside this try 
block? Wouldn't the InterruptedException be caught by this catch block, so the 
thread would just keep going?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 516999)
Time Spent: 20m  (was: 10m)

> Improve DbNotificationListener Thread
> -
>
> Key: HIVE-24423
> URL: https://issues.apache.org/jira/browse/HIVE-24423
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Clean up and simplify {{DbNotificationListener}} thread class.
> Most importantly, stop the thread and wait for it to finish before launching 
> a new thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
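The interruption concern raised in the review above can be illustrated with a small stand-alone sketch. This is not the DbNotificationListener code (`CleanerLoop` and `runUntilInterrupted` are made-up names): the point is that a catch-all block swallows InterruptedException unless the loop restores the interrupt flag and checks it.

```java
public class CleanerLoop {
    /**
     * Runs 'work' repeatedly until the thread is interrupted. The broad catch
     * keeps unexpected failures from killing the thread, but
     * InterruptedException is handled separately so the loop still terminates.
     */
    public static int runUntilInterrupted(Runnable work, long sleepMillis) {
        int iterations = 0;
        while (!Thread.currentThread().isInterrupted()) {
            try {
                work.run();
                iterations++;
                Thread.sleep(sleepMillis); // throws InterruptedException promptly
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // restore the flag; loop exits
            } catch (Exception ex) {
                // swallow other failures so the cleaner keeps running
            }
        }
        return iterations;
    }

    public static void main(String[] args) throws Exception {
        Thread t = new Thread(() -> runUntilInterrupted(() -> { }, 10));
        t.start();
        Thread.sleep(50);
        t.interrupt();
        t.join(1000);
        System.out.println(t.isAlive() ? "still running" : "stopped");
    }
}
```

If the catch-all swallowed InterruptedException without restoring the interrupt status, the while condition would never become false and the thread would run forever after an interrupt.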


[jira] [Resolved] (HIVE-24359) Hive Compaction hangs because of doAs when worker set to HS2

2020-11-26 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-24359.
--
Resolution: Duplicate

HIVE-24410 duplicates this, and is Fixed.

> Hive Compaction hangs because of doAs when worker set to HS2
> 
>
> Key: HIVE-24359
> URL: https://issues.apache.org/jira/browse/HIVE-24359
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Reporter: Chiran Ravani
>Priority: Critical
>
> When creating a managed table and inserting data using Impala, with the 
> compaction worker set to HiveServer2, in a secured environment (Kerberized 
> cluster) the worker thread hangs indefinitely, expecting the user to provide 
> Kerberos credentials from STDIN.
> The problem appears to be that no login context is sent from HS2 to 
> HMS as part of QueryCompactor, and the HS2 JVM has the property 
> javax.security.auth.useSubjectCredsOnly set to false, which causes it 
> to prompt for logins via stdin; however, setting it to true also does not help, 
> as the context does not seem to be passed in any case.
> Below is what is observed in the HS2 jstack; as you can see, the thread is 
> waiting on stdin in "com.sun.security.auth.module.Krb5LoginModule.promptForName"
> {code}
> "c570-node2.abc.host.com-44_executor" #47 daemon prio=1 os_prio=0 
> tid=0x01506000 nid=0x1348 runnable [0x7f1beea95000]
>java.lang.Thread.State: RUNNABLE
> at java.io.FileInputStream.readBytes(Native Method)
> at java.io.FileInputStream.read(FileInputStream.java:255)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <0x9fa38c90> (a java.io.BufferedInputStream)
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
> - locked <0x8c7d5010> (a java.io.InputStreamReader)
> at java.io.InputStreamReader.read(InputStreamReader.java:184)
> at java.io.BufferedReader.fill(BufferedReader.java:161)
> at java.io.BufferedReader.readLine(BufferedReader.java:324)
> - locked <0x8c7d5010> (a java.io.InputStreamReader)
> at java.io.BufferedReader.readLine(BufferedReader.java:389)
> at 
> com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153)
> at 
> com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120)
> at 
> com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862)
> at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708)
> at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
> at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
> at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
> at sun.security.jgss.GSSUtil.login(GSSUtil.java:258)
> at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146)
> at 
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
> at 
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
> at 
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
> at 
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)

[jira] [Work logged] (HIVE-24409) Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24409?focusedWorklogId=516954=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516954
 ]

ASF GitHub Bot logged work on HIVE-24409:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 09:03
Start Date: 26/Nov/20 09:03
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on pull request #1708:
URL: https://github.com/apache/hive/pull/1708#issuecomment-734168724


   LGTM. +1



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 516954)
Time Spent: 20m  (was: 10m)

> Use LazyBinarySerDe2 in PlanUtils::getReduceValueTableDesc
> --
>
> Key: HIVE-24409
> URL: https://issues.apache.org/jira/browse/HIVE-24409
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screenshot 2020-11-23 at 10.52.49 AM.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> !Screenshot 2020-11-23 at 10.52.49 AM.png|width=858,height=493!  
> Lines of interest:
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java#L535]
>  (non-vectorized path due to stats)
>  
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java#L581]
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values

2020-11-26 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-24433:
--
Description: 
The PartitionKeyValue is getting converted to lowercase in the below 2 places.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728]

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851]

Because of this, the TXN_COMPONENTS & HIVE_LOCKS tables do not have entries 
with the proper partition values.

When the query completes, the entry moves from TXN_COMPONENTS to 
COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the partition 
& considers it an invalid partition.
{code:java}
create table abc(name string) partitioned by(city string) stored as orc tblproperties('transactional'='true');
insert into abc partition(city='Bangalore') values('aaa');
{code}
Example entry in COMPLETED_TXN_COMPONENTS
{noformat}
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
| CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION  | CTC_TIMESTAMP       | CTC_WRITEID | CTC_UPDATE_DELETE |
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
|         2 | default      | abc       | city=bangalore | 2020-11-25 09:26:59 |           1 | N                 |
+-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
{noformat}
 

AutoCompaction fails to get triggered with below error
{code:java}
2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(98)) - Checking to see if we should compact default.abc.city=bangalore
2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(155)) - Can't find partition default.compaction_test.city=bangalore, assuming it has been dropped and moving on{code}
I verified the below 4 SQLs with my PR; all of them produced the correct PartitionKeyValue

i.e., COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore"
{code:java}
insert into table abc PARTITION(CitY='Bangalore') values('Dan');
insert overwrite table abc partition(CiTy='Bangalore') select Name from abc;
update table abc set Name='xy' where CiTy='Bangalore';
delete from abc where CiTy='Bangalore';{code}

  was:
PartionKeyValue is getting converted into lowerCase in below 2 places.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728]

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851]

Because of which TXN_COMPONENTS & HIVE_LOCKS tables are not having entries from 
proper partition values.

When query completes, the entry moves from TXN_COMPONENTS to 
COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the partition 
& considers it as invalid partition
{code:java}
create table abc(name string) partitioned by(city string) stored as orc 
tblproperties('transactional'='true');
insert into abc partition(city='Bangalore') values('aaa');
{code}
Example entry in COMPLETED_TXN_COMPONENTS
{noformat}
+---+--++---+-+-+---+
| CTC_TXNID | CTC_DATABASE | CTC_TABLE          | CTC_PARTITION     | 
CTC_TIMESTAMP       | CTC_WRITEID | CTC_UPDATE_DELETE |
+---+--++---+-+-+---+
|         2 | default      | abc    | city=bangalore    | 2020-11-25 09:26:59 | 
          1 | N                 |
+---+--++---+-+-+---+
{noformat}
 

AutoCompaction fails to get triggered with below error
{code:java}
2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator 
(Initiator.java:run(98)) - Checking to see if we should compact 
default.abc.city=bangalore
2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator 
(Initiator.java:run(155)) - Can't find partition 
default.compaction_test.city=bhubaneshwar, assuming it has been dropped and 
moving on{code}
I verifed below 4 SQL's with my PR, those all produced correct PartitionKeyValue

i.e, COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore"
{code:java}
insert into table abc PARTITION(CitY='Bangalore') values('Dan');
insert overwrite table abc partition(CiTy='Bangalore') select Name from abc;
update table abc set Name='xy' where CiTy='Bangalore';
delete from abc where CiTy='Bangalore';{code}


> AutoCompaction is not getting 

[jira] [Work logged] (HIVE-24433) AutoCompaction is not getting triggered for CamelCase Partition Values

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24433?focusedWorklogId=516952=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516952
 ]

ASF GitHub Bot logged work on HIVE-24433:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 08:44
Start Date: 26/Nov/20 08:44
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #1712:
URL: https://github.com/apache/hive/pull/1712#issuecomment-734158908


   Hi @nareshpr, thanks for picking this up. Could you add a test case to 
TestInitiator that covers this use case?
   cc @klcopp  @deniskuzZ 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 516952)
Time Spent: 20m  (was: 10m)

> AutoCompaction is not getting triggered for CamelCase Partition Values
> --
>
> Key: HIVE-24433
> URL: https://issues.apache.org/jira/browse/HIVE-24433
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The PartitionKeyValue is getting converted to lowercase in the below 2 places.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2728]
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2851]
> Because of this, the TXN_COMPONENTS & HIVE_LOCKS tables do not have entries 
> with the proper partition values.
> When the query completes, the entry moves from TXN_COMPONENTS to 
> COMPLETED_TXN_COMPONENTS. Hive AutoCompaction will not recognize the 
> partition & considers it an invalid partition.
> {code:java}
> create table abc(name string) partitioned by(city string) stored as orc tblproperties('transactional'='true');
> insert into abc partition(city='Bangalore') values('aaa');
> {code}
> Example entry in COMPLETED_TXN_COMPONENTS
> {noformat}
> +-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
> | CTC_TXNID | CTC_DATABASE | CTC_TABLE | CTC_PARTITION  | CTC_TIMESTAMP       | CTC_WRITEID | CTC_UPDATE_DELETE |
> +-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
> |         2 | default      | abc       | city=bangalore | 2020-11-25 09:26:59 |           1 | N                 |
> +-----------+--------------+-----------+----------------+---------------------+-------------+-------------------+
> {noformat}
>  
> AutoCompaction fails to get triggered with below error
> {code:java}
> 2020-11-25T09:35:10,364 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(98)) - Checking to see if we should compact default.abc.city=bangalore
> 2020-11-25T09:35:10,380 INFO [Thread-9]: compactor.Initiator (Initiator.java:run(155)) - Can't find partition default.compaction_test.city=bhubaneshwar, assuming it has been dropped and moving on{code}
> I verified the below 4 SQLs with my PR; all of them produced the correct 
> PartitionKeyValue
> i.e., COMPLETED_TXN_COMPONENTS.CTC_PARTITION="city=Bangalore"
> {code:java}
> insert into table abc PARTITION(CitY='Bangalore') values('Dan');
> insert overwrite table abc partition(CiTy='Bangalore') select Name from abc;
> update table abc set Name='xy' where CiTy='Bangalore';
> delete from abc where CiTy='Bangalore';{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24410) Query-based compaction hangs because of doAs

2020-11-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24410?focusedWorklogId=516951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-516951
 ]

ASF GitHub Bot logged work on HIVE-24410:
-

Author: ASF GitHub Bot
Created on: 26/Nov/20 08:43
Start Date: 26/Nov/20 08:43
Worklog Time Spent: 10m 
  Work Description: klcopp merged pull request #1693:
URL: https://github.com/apache/hive/pull/1693


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 516951)
Time Spent: 3h  (was: 2h 50m)

> Query-based compaction hangs because of doAs
> 
>
> Key: HIVE-24410
> URL: https://issues.apache.org/jira/browse/HIVE-24410
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> QB compaction runs within a doAs +and+ hive.server2.enable.doAs is set to 
> true (as of HIVE-24089). On a secure cluster with Worker threads running in 
> HS2, this results in the HMS client not receiving a login context during 
> compaction queries, so Kerberos prompts for a login via stdin, which causes 
> the worker thread to hang until it times out:
> {code:java}
> "node-x.com-44_executor" #47 daemon prio=1 os_prio=0 tid=0x01506000 
> nid=0x1348 runnable [0x7f1beea95000]
>java.lang.Thread.State: RUNNABLE
> at java.io.FileInputStream.readBytes(Native Method)
> at java.io.FileInputStream.read(FileInputStream.java:255)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <0x9fa38c90> (a java.io.BufferedInputStream)
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
> - locked <0x8c7d5010> (a java.io.InputStreamReader)
> at java.io.InputStreamReader.read(InputStreamReader.java:184)
> at java.io.BufferedReader.fill(BufferedReader.java:161)
> at java.io.BufferedReader.readLine(BufferedReader.java:324)
> - locked <0x8c7d5010> (a java.io.InputStreamReader)
> at java.io.BufferedReader.readLine(BufferedReader.java:389)
> at 
> com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153)
> at 
> com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120)
> at 
> com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862)
> at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708)
> at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
> at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
> at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
> at sun.security.jgss.GSSUtil.login(GSSUtil.java:258)
> at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146)
> at 
> 

[jira] [Resolved] (HIVE-24410) Query-based compaction hangs because of doAs

2020-11-26 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-24410.
--
Resolution: Fixed

Committed to master branch. Thanks [~pvargacl] and [~pvary] for reviewing!

> Query-based compaction hangs because of doAs
> 
>
> Key: HIVE-24410
> URL: https://issues.apache.org/jira/browse/HIVE-24410
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> QB compaction runs within a doAs +and+ hive.server2.enable.doAs is set to 
> true (as of HIVE-24089). On a secure cluster with Worker threads running in 
> HS2, this results in the HMS client not receiving a login context during 
> compaction queries, so Kerberos prompts for a login via stdin, which causes 
> the worker thread to hang until it times out:
> {code:java}
> "node-x.com-44_executor" #47 daemon prio=1 os_prio=0 tid=0x01506000 
> nid=0x1348 runnable [0x7f1beea95000]
>java.lang.Thread.State: RUNNABLE
> at java.io.FileInputStream.readBytes(Native Method)
> at java.io.FileInputStream.read(FileInputStream.java:255)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <0x9fa38c90> (a java.io.BufferedInputStream)
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
> - locked <0x8c7d5010> (a java.io.InputStreamReader)
> at java.io.InputStreamReader.read(InputStreamReader.java:184)
> at java.io.BufferedReader.fill(BufferedReader.java:161)
> at java.io.BufferedReader.readLine(BufferedReader.java:324)
> - locked <0x8c7d5010> (a java.io.InputStreamReader)
> at java.io.BufferedReader.readLine(BufferedReader.java:389)
> at 
> com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153)
> at 
> com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120)
> at 
> com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862)
> at 
> com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708)
> at 
> com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
> at 
> javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
> at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
> at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
> at sun.security.jgss.GSSUtil.login(GSSUtil.java:258)
> at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341)
> at 
> sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336)
> at 
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146)
> at 
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
> at 
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
> at 
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
> at 
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
> at 
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
> at 
>