[jira] [Updated] (HIVE-24109) Load partitions in parallel for managed tables in the bootstrap phase

2020-10-21 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24109:
---
Status: In Progress  (was: Patch Available)

> Load partitions in parallel for managed tables in the bootstrap phase
> -
>
> Key: HIVE-24109
> URL: https://issues.apache.org/jira/browse/HIVE-24109
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24109.01.patch, HIVE-24109.02.patch, 
> HIVE-24109.03.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24109) Load partitions in parallel for managed tables in the bootstrap phase

2020-10-21 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24109:
---
Attachment: HIVE-24109.03.patch
Status: Patch Available  (was: In Progress)

> Load partitions in parallel for managed tables in the bootstrap phase
> -
>
> Key: HIVE-24109
> URL: https://issues.apache.org/jira/browse/HIVE-24109
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24109.01.patch, HIVE-24109.02.patch, 
> HIVE-24109.03.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=503523&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503523
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 04:00
Start Date: 22/Oct/20 04:00
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on pull request #1577:
URL: https://github.com/apache/hive/pull/1577#issuecomment-714206578


   Thanks for revising the patch, @mustafaiman. The recent patch LGTM. +1 pending 
tests.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503523)
Time Spent: 1h 20m  (was: 1h 10m)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In cloud environments, cleaning the scratchdir at the end of a query may take a 
> long time. This causes the client to hang for up to a minute even after the 
> results have been streamed back. During this time the client just waits for the 
> cleanup to finish. Cleanup can instead take place in the background in HiveServer.
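A minimal sketch of the idea, with hypothetical names (the actual change lives in the linked pull request): HiveServer2 hands the scratchdir deletion off to a background thread and lets the query complete immediately.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: hand scratchdir deletion to a background daemon thread
// so the client is not blocked on slow (e.g. cloud object store) deletes.
public class ScratchDirCleaner {
    private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "scratchdir-cleanup");
        t.setDaemon(true);
        return t;
    });

    /** Schedule deletion and return immediately; the query can report completion now. */
    public void cleanupAsync(Runnable deleteScratchDir) {
        pool.submit(deleteScratchDir);
    }

    /** Drain pending cleanups, e.g. on HiveServer2 shutdown. */
    public void shutdown() {
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The trade-off is that a crash before the background delete runs can leave orphaned scratch directories, which is why HiveServer already sweeps stale scratchdirs on startup.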





[jira] [Commented] (HIVE-13693) Multi-insert query drops Filter before file output when there is a.val <> b.val

2020-10-21 Thread zhaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218703#comment-17218703
 ] 

zhaolong commented on HIVE-13693:
-

[~jcamachorodriguez] Hello, I found that this patch may convert a semi join into a 
left outer join. The result of a NOT EXISTS query is incorrect when CBO is disabled.

The left side is the 3.1.0 branch, the right side is the 1.2.1 branch, and the query 
is like select ... from a, b, c, d ... where a.xx = b.xx and c.a = '' and c.b = '' 
NOT EXISTS (select 1 from d where d.xx = ''). The semi join is converted to a left 
join, and some filter conditions are missed.

!image-2020-10-22-10-46-04-195.png!

> Multi-insert query drops Filter before file output when there is a.val <> 
> b.val
> ---
>
> Key: HIVE-13693
> URL: https://issues.apache.org/jira/browse/HIVE-13693
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0, 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: HIVE-13693.01.patch, HIVE-13693.01.patch, 
> HIVE-13693.02.patch, HIVE-13693.patch
>
>
> To reproduce:
> {noformat}
> CREATE TABLE T_A ( id STRING, val STRING ); 
> CREATE TABLE T_B ( id STRING, val STRING ); 
> CREATE TABLE join_result_1 ( ida STRING, vala STRING, idb STRING, valb STRING 
> ); 
> CREATE TABLE join_result_3 ( ida STRING, vala STRING, idb STRING, valb STRING 
> ); 
> INSERT INTO TABLE T_A 
> VALUES ('Id_1', 'val_101'), ('Id_2', 'val_102'), ('Id_3', 'val_103'); 
> INSERT INTO TABLE T_B 
> VALUES ('Id_1', 'val_103'), ('Id_2', 'val_104'); 
> explain
> FROM T_A a LEFT JOIN T_B b ON a.id = b.id
> INSERT OVERWRITE TABLE join_result_1
> SELECT a.*, b.*
> WHERE b.id = 'Id_1' AND b.val = 'val_103'
> INSERT OVERWRITE TABLE join_result_3
> SELECT a.*, b.*
> WHERE b.val = 'val_104' AND b.id = 'Id_2' AND a.val <> b.val;
> {noformat}
> The (wrong) plan is the following:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-3 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-3
>   Stage-4 depends on stages: Stage-0
>   Stage-1 depends on stages: Stage-3
>   Stage-5 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
> Tez
>   DagId: haha_20160504140944_174465c9-5d1a-42f9-9665-fae02eeb2767:2
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (SIMPLE_EDGE)
>   DagName: 
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: a
>   Statistics: Num rows: 3 Data size: 36 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: id (type: string)
> sort order: +
> Map-reduce partition columns: id (type: string)
> Statistics: Num rows: 3 Data size: 36 Basic stats: 
> COMPLETE Column stats: NONE
> value expressions: val (type: string)
> Map 3 
> Map Operator Tree:
> TableScan
>   alias: b
>   Statistics: Num rows: 2 Data size: 24 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: id (type: string)
> sort order: +
> Map-reduce partition columns: id (type: string)
> Statistics: Num rows: 2 Data size: 24 Basic stats: 
> COMPLETE Column stats: NONE
> value expressions: val (type: string)
> Reducer 2 
> Reduce Operator Tree:
>   Merge Join Operator
> condition map:
>  Left Outer Join0 to 1
> keys:
>   0 id (type: string)
>   1 id (type: string)
> outputColumnNames: _col0, _col1, _col6
> Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: _col0 (type: string), _col1 (type: string), 
> 'Id_1' (type: string), 'val_103' (type: string)
>   outputColumnNames: _col0, _col1, _col2, _col3
>   Statistics: Num rows: 3 Data size: 39 Basic stats: COMPLETE 
> Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 3 Data size: 39 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: 
> 

[jira] [Work logged] (HIVE-24042) Fix typo in MetastoreConf.java

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24042?focusedWorklogId=503503&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503503
 ]

ASF GitHub Bot logged work on HIVE-24042:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 02:43
Start Date: 22/Oct/20 02:43
Worklog Time Spent: 10m 
  Work Description: yx91490 commented on pull request #1406:
URL: https://github.com/apache/hive/pull/1406#issuecomment-714185296


   ping @gm7y8





Issue Time Tracking
---

Worklog Id: (was: 503503)
Time Spent: 0.5h  (was: 20m)

> Fix typo in MetastoreConf.java
> --
>
> Key: HIVE-24042
> URL: https://issues.apache.org/jira/browse/HIVE-24042
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: yx91490
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-24042.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Fix typo in MetastoreConf.java: correct word "riven" in package name to 
> "hadoop.hive.metastore".





[jira] [Work logged] (HIVE-24292) hive webUI should support keystoretype by config

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24292?focusedWorklogId=503486&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503486
 ]

ASF GitHub Bot logged work on HIVE-24292:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 02:06
Start Date: 22/Oct/20 02:06
Worklog Time Spent: 10m 
  Work Description: yongzhi commented on a change in pull request #1594:
URL: https://github.com/apache/hive/pull/1594#discussion_r509836264



##
File path: 
service/src/test/org/apache/hive/service/server/TestHS2HttpServerPamConfiguration.java
##
@@ -48,6 +48,7 @@
   private static HiveConf hiveConf = null;
   private static String keyStorePassword = "123456";
   private static String keyFileName = "myKeyStore";
+  private static String keyStoreType = "jks";

Review comment:
   Will fix







Issue Time Tracking
---

Worklog Id: (was: 503486)
Time Spent: 0.5h  (was: 20m)

> hive webUI should support keystoretype by config
> 
>
> Key: HIVE-24292
> URL: https://issues.apache.org/jira/browse/HIVE-24292
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We need a property to pass in the keystore type in the webUI too.





[jira] [Updated] (HIVE-24294) TezSessionPool sessions can throw AssertionError

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24294:
--
Labels: pull-request-available  (was: )

> TezSessionPool sessions can throw AssertionError
> 
>
> Key: HIVE-24294
> URL: https://issues.apache.org/jira/browse/HIVE-24294
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Whenever default TezSessionPool sessions are reopened for some reason, we set 
> dagResources to null before close and set it back in open:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503
> If there is an exception in sessionState.close(), we do not restore the 
> dagResources but move the session back to TezSessionPool. E.g., exception 
> trace when sessionState.close() failed:
> {code:java}
> 2020-10-15T09:20:28,749 INFO  [HiveServer2-Background-Pool: Thread-25451]: 
> client.TezClient (:()) - Failed to shutdown Tez Session via proxy
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, 
> finalApplicationStatus=SUCCEEDED, 
> trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, 
> diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, 
> sessionTimeoutInterval=60 ms
> Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0   
>  at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) 
> at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) 
> at org.apache.tez.client.TezClient.stop(TezClient.java:743) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code}
> Because of this, all new queries using this corrupted session fail with the 
> below exception
> {code:java}
> Caused by: java.lang.AssertionError: Ensure called on an unitialized (or 
> closed) session 41774265-b7da-4d58-84a8-1bedfd597aec
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code}
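A simplified, hypothetical sketch of the fix: keep the extracted resources and restore them in a finally block, so even a failing close() cannot return a session with null dagResources to the pool (the names below are illustrative stand-ins, not the actual TezSessionPoolManager code).

```java
// Illustrative stand-in for TezSessionState: 'resources' plays the role of dagResources.
class PooledSession {
    Object resources = new Object();

    Object extractResources() {
        Object r = resources;
        resources = null;
        return r;
    }

    void close() {
        // Simulates sessionState.close() failing, e.g. SessionNotRunning from the Tez AM.
        throw new RuntimeException("Failed to shutdown Tez Session via proxy");
    }

    void reopen() {
        Object saved = extractResources();
        try {
            close();
        } catch (RuntimeException e) {
            // log and continue; the session object will be reused
        } finally {
            resources = saved; // restore on every path, not just the happy one
        }
    }
}
```

With the restore in the exception path as well, a session returned to the pool after a failed close() still has its resources and later ensureLocalResources() calls no longer hit the AssertionError.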





[jira] [Work logged] (HIVE-24294) TezSessionPool sessions can throw AssertionError

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24294?focusedWorklogId=503473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503473
 ]

ASF GitHub Bot logged work on HIVE-24294:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 01:39
Start Date: 22/Oct/20 01:39
Worklog Time Spent: 10m 
  Work Description: nareshpr opened a new pull request #1596:
URL: https://github.com/apache/hive/pull/1596


   





Issue Time Tracking
---

Worklog Id: (was: 503473)
Remaining Estimate: 0h
Time Spent: 10m

> TezSessionPool sessions can throw AssertionError
> 
>
> Key: HIVE-24294
> URL: https://issues.apache.org/jira/browse/HIVE-24294
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Whenever default TezSessionPool sessions are reopened for some reason, we set 
> dagResources to null before close and set it back in open:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503
> If there is an exception in sessionState.close(), we do not restore the 
> dagResources but move the session back to TezSessionPool. E.g., exception 
> trace when sessionState.close() failed:
> {code:java}
> 2020-10-15T09:20:28,749 INFO  [HiveServer2-Background-Pool: Thread-25451]: 
> client.TezClient (:()) - Failed to shutdown Tez Session via proxy
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, 
> finalApplicationStatus=SUCCEEDED, 
> trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, 
> diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, 
> sessionTimeoutInterval=60 ms
> Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0   
>  at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) 
> at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) 
> at org.apache.tez.client.TezClient.stop(TezClient.java:743) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code}
> Because of this, all new queries using this corrupted session fail with the 
> below exception
> {code:java}
> Caused by: java.lang.AssertionError: Ensure called on an unitialized (or 
> closed) session 41774265-b7da-4d58-84a8-1bedfd597aec
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code}





[jira] [Assigned] (HIVE-24294) TezSessionPool sessions can throw AssertionError

2020-10-21 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R reassigned HIVE-24294:
-


> TezSessionPool sessions can throw AssertionError
> 
>
> Key: HIVE-24294
> URL: https://issues.apache.org/jira/browse/HIVE-24294
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>
> Whenever default TezSessionPool sessions are reopened for some reason, we set 
> dagResources to null before close and set it back in open:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503
> If there is an exception in sessionState.close(), we do not restore the 
> dagResources but move the session back to TezSessionPool. E.g., exception 
> trace when sessionState.close() failed:
> {code:java}
> 2020-10-15T09:20:28,749 INFO  [HiveServer2-Background-Pool: Thread-25451]: 
> client.TezClient (:()) - Failed to shutdown Tez Session via proxy
> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
> applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, 
> finalApplicationStatus=SUCCEEDED, 
> trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, 
> diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, 
> sessionTimeoutInterval=60 ms
> Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0   
>  at 
> org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) 
> at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) 
> at org.apache.tez.client.TezClient.stop(TezClient.java:743) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531)
>  
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code}
> Because of this, all new queries using this corrupted session fail with the 
> below exception
> {code:java}
> Caused by: java.lang.AssertionError: Ensure called on an unitialized (or 
> closed) session 41774265-b7da-4d58-84a8-1bedfd597aec
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code}





[jira] [Work logged] (HIVE-24043) Retain original path info in Warehouse.makeSpecFromName()'s logger

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24043?focusedWorklogId=503465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503465
 ]

ASF GitHub Bot logged work on HIVE-24043:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 00:58
Start Date: 22/Oct/20 00:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1407:
URL: https://github.com/apache/hive/pull/1407


   





Issue Time Tracking
---

Worklog Id: (was: 503465)
Time Spent: 0.5h  (was: 20m)

> Retain original path info in Warehouse.makeSpecFromName()'s logger
> --
>
> Key: HIVE-24043
> URL: https://issues.apache.org/jira/browse/HIVE-24043
> Project: Hive
>  Issue Type: Improvement
>Reporter: yx91490
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24043.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The warn logger in Warehouse.makeSpecFromName() does not retain the original 
> path info, for example:
> {code:java}
> 20/08/07 14:32:28 WARN warehouse: Cannot create partition spec from 
> hdfs://nameservice/; missing keys [dt1]
> {code}
> The log content was expected to be the full HDFS path, but only 'hdfs://nameservice' is logged.
>  





[jira] [Work logged] (HIVE-24042) Fix typo in MetastoreConf.java

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24042?focusedWorklogId=503466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503466
 ]

ASF GitHub Bot logged work on HIVE-24042:
-

Author: ASF GitHub Bot
Created on: 22/Oct/20 00:58
Start Date: 22/Oct/20 00:58
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1406:
URL: https://github.com/apache/hive/pull/1406


   





Issue Time Tracking
---

Worklog Id: (was: 503466)
Time Spent: 20m  (was: 10m)

> Fix typo in MetastoreConf.java
> --
>
> Key: HIVE-24042
> URL: https://issues.apache.org/jira/browse/HIVE-24042
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: yx91490
>Priority: Trivial
>  Labels: pull-request-available
> Attachments: HIVE-24042.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Fix typo in MetastoreConf.java: correct word "riven" in package name to 
> "hadoop.hive.metastore".





[jira] [Work logged] (HIVE-24292) hive webUI should support keystoretype by config

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24292?focusedWorklogId=503446&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503446
 ]

ASF GitHub Bot logged work on HIVE-24292:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 23:56
Start Date: 21/Oct/20 23:56
Worklog Time Spent: 10m 
  Work Description: risdenk commented on a change in pull request #1594:
URL: https://github.com/apache/hive/pull/1594#discussion_r509800907



##
File path: 
service/src/test/org/apache/hive/service/server/TestHS2HttpServerPamConfiguration.java
##
@@ -48,6 +48,7 @@
   private static HiveConf hiveConf = null;
   private static String keyStorePassword = "123456";
   private static String keyFileName = "myKeyStore";
+  private static String keyStoreType = "jks";

Review comment:
   You might want this to be `KeyStore.getDefaultType()` depending on the 
JDK being used? I think in most cases this should be ok though.
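For context on the suggestion: `KeyStore.getDefaultType()` reads the `keystore.type` security property, which is `jks` on older JDKs but `pkcs12` on JDK 9 and later, so a hard-coded "jks" can diverge from the platform default.

```java
import java.security.KeyStore;
import java.security.Security;

public class DefaultKeyStoreType {
    public static void main(String[] args) {
        // The default keystore type comes from the "keystore.type" entry
        // in the JDK's java.security configuration.
        System.out.println(Security.getProperty("keystore.type"));
        // KeyStore.getDefaultType() returns that property, falling back to "jks".
        System.out.println(KeyStore.getDefaultType());
    }
}
```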







Issue Time Tracking
---

Worklog Id: (was: 503446)
Time Spent: 20m  (was: 10m)

> hive webUI should support keystoretype by config
> 
>
> Key: HIVE-24292
> URL: https://issues.apache.org/jira/browse/HIVE-24292
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We need a property to pass in the keystore type in the webUI too.





[jira] [Work logged] (HIVE-24109) Load partitions in parallel for managed tables in the bootstrap phase

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24109?focusedWorklogId=503445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503445
 ]

ASF GitHub Bot logged work on HIVE-24109:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 23:54
Start Date: 21/Oct/20 23:54
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on pull request #1529:
URL: https://github.com/apache/hive/pull/1529#issuecomment-714015286


   Can you please rebase the patch?





Issue Time Tracking
---

Worklog Id: (was: 503445)
Time Spent: 20m  (was: 10m)

> Load partitions in parallel for managed tables in the bootstrap phase
> -
>
> Key: HIVE-24109
> URL: https://issues.apache.org/jira/browse/HIVE-24109
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24109.01.patch, HIVE-24109.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-24287) Cookie Signer class should use SHA-512 instead of SHA-256 for cookie signature

2020-10-21 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala resolved HIVE-24287.
--
Resolution: Fixed

> Cookie Signer class should use SHA-512 instead of SHA-256 for cookie signature
> ---
>
> Key: HIVE-24287
> URL: https://issues.apache.org/jira/browse/HIVE-24287
> Project: Hive
>  Issue Type: Bug
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> private static final String SHA_STRING = "SHA-256"; should use SHA-512 instead





[jira] [Resolved] (HIVE-23969) Table owner info not being passed during show tables in database.

2020-10-21 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala resolved HIVE-23969.
--
Resolution: Fixed

> Table owner info not being passed during show tables in database.
> -
>
> Key: HIVE-23969
> URL: https://issues.apache.org/jira/browse/HIVE-23969
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
> Attachments: Screen Shot 2020-07-31 at 10.55.51 AM.png, Screen Shot 
> 2020-07-31 at 10.56.25 AM.png, Screen Shot 2020-07-31 at 10.56.51 AM.png
>
>
> Table owner information is not being passed in HiveMetaStore. As a result, 
> even though a user is the owner of tables, without a Ranger policy the user 
> is unable to view the tables they created themselves.





[jira] [Updated] (HIVE-24293) Integer overflow in llap collision mask

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24293:
--
Labels: pull-request-available  (was: )

> Integer overflow in llap collision mask
> ---
>
> Key: HIVE-24293
> URL: https://issues.apache.org/jira/browse/HIVE-24293
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If multiple threads put the same buffer to the cache, only one succeeds. The 
> other one detects this, and replaces its own buffer. This is marked by a bit 
> mask encoded in a long, where the collided buffers are marked with a 1.





[jira] [Work logged] (HIVE-24293) Integer overflow in llap collision mask

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24293?focusedWorklogId=503411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503411
 ]

ASF GitHub Bot logged work on HIVE-24293:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 21:45
Start Date: 21/Oct/20 21:45
Worklog Time Spent: 10m 
  Work Description: asinkovits opened a new pull request #1595:
URL: https://github.com/apache/hive/pull/1595


   
   
   
   
   ### What changes were proposed in this pull request?
   bugfix
   
   
   ### Why are the changes needed?
   If multiple threads put the same buffer into the cache, only one succeeds. The 
others detect this and replace their own buffers. This is tracked by a bit mask 
encoded in a long, where the collided buffers are marked with a 1. Because the 
mask is built by shifting an int literal 1, the shift can overflow, so some 
buffers are not removed after a collision and the reference count drops below 
zero, which is not a valid state.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Unit test added
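The class of bug can be shown in isolation (illustrative snippet, not the actual llap cache source): a Java shift on an `int` uses only the low five bits of the shift distance, so a mask for buffer indices of 32 or more lands on the wrong bit unless the shifted literal is a `long`.

```java
public class CollisionMaskDemo {
    public static void main(String[] args) {
        int index = 40;
        // int shift: the distance is taken mod 32, so 1 << 40 is really 1 << 8.
        long wrongMask = 1 << index;   // 256, marks the wrong buffer
        // long shift: the distance is taken mod 64, giving the intended bit 40.
        long rightMask = 1L << index;  // 1099511627776
        System.out.println(wrongMask + " vs " + rightMask);
    }
}
```

Widening the literal to `1L` before shifting is the usual one-character fix for this pattern.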





Issue Time Tracking
---

Worklog Id: (was: 503411)
Remaining Estimate: 0h
Time Spent: 10m

> Integer overflow in llap collision mask
> ---
>
> Key: HIVE-24293
> URL: https://issues.apache.org/jira/browse/HIVE-24293
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If multiple threads put the same buffer to the cache, only one succeeds. The 
> other one detects this, and replaces its own buffer. This is marked by a bit 
> mask encoded in a long, where the collided buffers are marked with a 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24293) Integer overflow in llap collision mask

2020-10-21 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-24293:
---
Description: If multiple threads put the same buffer to the cache, only one 
succeeds. The other one detects this, and replaces its own buffer. This is 
marked by a bit mask encoded in a long, where the collided buffers are marked 
with a 1.

> Integer overflow in llap collision mask
> ---
>
> Key: HIVE-24293
> URL: https://issues.apache.org/jira/browse/HIVE-24293
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> If multiple threads put the same buffer to the cache, only one succeeds. The 
> other one detects this, and replaces its own buffer. This is marked by a bit 
> mask encoded in a long, where the collided buffers are marked with a 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-24293) Integer overflow in llap collision mask

2020-10-21 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24293 started by Antal Sinkovits.
--
> Integer overflow in llap collision mask
> ---
>
> Key: HIVE-24293
> URL: https://issues.apache.org/jira/browse/HIVE-24293
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> If multiple threads put the same buffer to the cache, only one succeeds. The 
> other one detects this, and replaces its own buffer. This is marked by a bit 
> mask encoded in a long, where the collided buffers are marked with a 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24293) Integer overflow in llap collision mask

2020-10-21 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits reassigned HIVE-24293:
--


> Integer overflow in llap collision mask
> ---
>
> Key: HIVE-24293
> URL: https://issues.apache.org/jira/browse/HIVE-24293
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?focusedWorklogId=503379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503379
 ]

ASF GitHub Bot logged work on HIVE-24275:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 20:10
Start Date: 21/Oct/20 20:10
Worklog Time Spent: 10m 
  Work Description: kishendas commented on a change in pull request #1583:
URL: https://github.com/apache/hive/pull/1583#discussion_r509643679



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -244,7 +249,32 @@ private void removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
     obsoleteDirs.addAll(dir.getAbortedDirectories());
     List<Path> filesToDelete = new ArrayList<>(obsoleteDirs.size());
     StringBuilder extraDebugInfo = new StringBuilder("[");
+    boolean delayedCleanupEnabled = conf.getBoolVar(HiveConf.ConfVars.HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
+
     for (Path stat : obsoleteDirs) {
+      if (delayedCleanupEnabled) {
+        String filename = stat.toString();
+        if (filename.startsWith(AcidUtils.BASE_PREFIX)) {
+          long writeId = AcidUtils.ParsedBase.parseBase(stat).getWriteId();
+          if (ci.type == CompactionType.MINOR) {
+            LOG.info("Skipping base dir " + stat + " as this cleanup is for minor compaction"
+                + ", compaction id " + ci.id);
+            continue;
+          } else if (writeId > writeIdList.getHighWatermark()) {
+            LOG.info("Skipping base dir " + stat + " deletion as WriteId of this base dir is"
+                + " greater than highWaterMark for compaction id " + ci.id);
+            continue;
+          }
+        }
+        else if (filename.startsWith(AcidUtils.DELTA_PREFIX) || filename.startsWith(AcidUtils.DELETE_DELTA_PREFIX)) {
+          AcidUtils.ParsedDelta delta = AcidUtils.parsedDelta(stat, fs);
+          if (delta.getMaxWriteId() > writeIdList.getHighWatermark()) {

Review comment:
   Please add relevant comments in the code wherever it's not obvious. 
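
   As a side note for reviewers, the skip logic above boils down to extracting write 
ids from ACID directory names. A rough, hypothetical re-implementation follows (the 
real Hive code uses AcidUtils.ParsedBase.parseBase and AcidUtils.parsedDelta; the 
directory layouts assumed here are base_<writeId> and delta_<minWriteId>_<maxWriteId>):

```java
public class AcidDirNames {

    // Parse the write id out of a base directory name such as "base_0000007".
    // Any further "_"-separated suffix after the write id is ignored.
    static long baseWriteId(String dirName) {
        if (!dirName.startsWith("base_")) {
            throw new IllegalArgumentException("not a base dir: " + dirName);
        }
        return Long.parseLong(dirName.substring("base_".length()).split("_")[0]);
    }

    // Parse the max write id out of a delta directory name such as
    // "delta_0000003_0000007" (layout assumed: delta_<min>_<max>).
    static long deltaMaxWriteId(String dirName) {
        String[] parts = dirName.split("_");
        return Long.parseLong(parts[2]);
    }

    public static void main(String[] args) {
        System.out.println(baseWriteId("base_0000007"));              // 7
        System.out.println(deltaMaxWriteId("delta_0000003_0000007")); // 7
    }
}
```

   A directory is then skipped when its parsed write id exceeds the cleaner's 
high-water mark, exactly as in the diff above.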

##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -244,7 +249,32 @@ private void removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
     obsoleteDirs.addAll(dir.getAbortedDirectories());
     List<Path> filesToDelete = new ArrayList<>(obsoleteDirs.size());
     StringBuilder extraDebugInfo = new StringBuilder("[");
+    boolean delayedCleanupEnabled = conf.getBoolVar(HiveConf.ConfVars.HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
+
     for (Path stat : obsoleteDirs) {
+      if (delayedCleanupEnabled) {
+        String filename = stat.toString();
+        if (filename.startsWith(AcidUtils.BASE_PREFIX)) {
+          long writeId = AcidUtils.ParsedBase.parseBase(stat).getWriteId();
+          if (ci.type == CompactionType.MINOR) {
+            LOG.info("Skipping base dir " + stat + " as this cleanup is for minor compaction"
+                + ", compaction id " + ci.id);
+            continue;
+          } else if (writeId > writeIdList.getHighWatermark()) {
+            LOG.info("Skipping base dir " + stat + " deletion as WriteId of this base dir is"
+                + " greater than highWaterMark for compaction id " + ci.id);
+            continue;
+          }
+        }
+        else if (filename.startsWith(AcidUtils.DELTA_PREFIX) || filename.startsWith(AcidUtils.DELETE_DELTA_PREFIX)) {
+          AcidUtils.ParsedDelta delta = AcidUtils.parsedDelta(stat, fs);

Review comment:
   It would be helpful to extract this logic to a separate method. 

##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -3058,6 +3058,11 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 
     HIVE_COMPACTOR_CLEANER_RUN_INTERVAL("hive.compactor.cleaner.run.interval", "5000ms",
         new TimeValidator(TimeUnit.MILLISECONDS), "Time between runs of the cleaner thread"),
+    HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED("hive.compactor.delayed.cleanup.enabled", false,
+        "When enabled, cleanup of obsolete files/dirs after compaction can be delayed. This delay \n" +
+        " can be configured by hive configuration hive.compactor.cleaner.retention.time.seconds"),
+    HIVE_COMPACTOR_CLEANER_RETENTION_TIME_SECONDS("hive.compactor.cleaner.retention.time.seconds", "300s",

Review comment:
   It might be better to change the name to 
"HIVE_COMPACTOR_CLEANER_RETENTION_TIME", since the value would indicate whether 
it's in seconds or milliseconds. 
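
   For what it's worth, the retention gate these two configs describe reduces to a 
simple time comparison; a minimal sketch, assuming the cleaner records when a 
compaction entry was marked READY_FOR_CLEANING (names are illustrative, not Hive's API):

```java
public class RetentionCheck {

    // A compaction entry becomes eligible for cleaning only once the
    // configured retention time has elapsed since it was marked
    // READY_FOR_CLEANING. All values are epoch milliseconds.
    static boolean isReadyToClean(long markedReadyAtMillis, long retentionMillis, long nowMillis) {
        return nowMillis - markedReadyAtMillis >= retentionMillis;
    }

    public static void main(String[] args) {
        long marked = 1_000_000L;
        long retention = 300_000L; // 300s, the proposed default
        System.out.println(isReadyToClean(marked, retention, marked + 100_000L)); // false
        System.out.println(isReadyToClean(marked, retention, marked + 300_000L)); // true
    }
}
```

   Naming the config without a unit suffix (as suggested above) works because the 
TimeValidator-style values like "300s" carry their own unit.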





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:

[jira] [Commented] (HIVE-24292) hive webUI should support keystoretype by config

2020-10-21 Thread Yongzhi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218534#comment-17218534
 ] 

Yongzhi Chen commented on HIVE-24292:
-

[~krisden], [~ngangam] could you review this change? Thanks

> hive webUI should support keystoretype by config
> 
>
> Key: HIVE-24292
> URL: https://issues.apache.org/jira/browse/HIVE-24292
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need a property to pass in the keystore type in the WebUI too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24292) hive webUI should support keystoretype by config

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24292:
--
Labels: pull-request-available  (was: )

> hive webUI should support keystoretype by config
> 
>
> Key: HIVE-24292
> URL: https://issues.apache.org/jira/browse/HIVE-24292
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need a property to pass in the keystore type in the WebUI too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24292) hive webUI should support keystoretype by config

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24292?focusedWorklogId=503360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503360
 ]

ASF GitHub Bot logged work on HIVE-24292:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 19:14
Start Date: 21/Oct/20 19:14
Worklog Time Spent: 10m 
  Work Description: yongzhi opened a new pull request #1594:
URL: https://github.com/apache/hive/pull/1594


   Unit test
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503360)
Remaining Estimate: 0h
Time Spent: 10m

> hive webUI should support keystoretype by config
> 
>
> Key: HIVE-24292
> URL: https://issues.apache.org/jira/browse/HIVE-24292
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We need a property to pass in the keystore type in the WebUI too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23962) Make bin/hive pick user defined jdbc url

2020-10-21 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-23962.
--
Fix Version/s: 4.0.0
 Assignee: Naveen Gangam
   Resolution: Fixed

Fix has been committed to master. Closing the jira. Thanks [~vihangk1] and 
[~Xiaomeng Zhang] for the initial work and the patch.

> Make bin/hive pick user defined jdbc url 
> -
>
> Key: HIVE-23962
> URL: https://issues.apache.org/jira/browse/HIVE-23962
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Xiaomeng Zhang
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently the hive command triggers bin/hive, which runs "beeline" by default.
> We want to pass an env variable so that users can define which URL beeline uses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23962) Make bin/hive pick user defined jdbc url

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23962?focusedWorklogId=503309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503309
 ]

ASF GitHub Bot logged work on HIVE-23962:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 17:13
Start Date: 21/Oct/20 17:13
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #1591:
URL: https://github.com/apache/hive/pull/1591#issuecomment-713725949


   Thanks Vihang for the review and commit.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503309)
Time Spent: 1h 20m  (was: 1h 10m)

> Make bin/hive pick user defined jdbc url 
> -
>
> Key: HIVE-23962
> URL: https://issues.apache.org/jira/browse/HIVE-23962
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Xiaomeng Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently the hive command triggers bin/hive, which runs "beeline" by default.
> We want to pass an env variable so that users can define which URL beeline uses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23962) Make bin/hive pick user defined jdbc url

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23962?focusedWorklogId=503303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503303
 ]

ASF GitHub Bot logged work on HIVE-23962:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 16:58
Start Date: 21/Oct/20 16:58
Worklog Time Spent: 10m 
  Work Description: vihangk1 merged pull request #1591:
URL: https://github.com/apache/hive/pull/1591


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503303)
Time Spent: 1h 10m  (was: 1h)

> Make bin/hive pick user defined jdbc url 
> -
>
> Key: HIVE-23962
> URL: https://issues.apache.org/jira/browse/HIVE-23962
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Xiaomeng Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently the hive command triggers bin/hive, which runs "beeline" by default.
> We want to pass an env variable so that users can define which URL beeline uses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24292) hive webUI should support keystoretype by config

2020-10-21 Thread Yongzhi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-24292:
---


> hive webUI should support keystoretype by config
> 
>
> Key: HIVE-24292
> URL: https://issues.apache.org/jira/browse/HIVE-24292
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>
> We need a property to pass in the keystore type in the WebUI too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24173) notification cleanup interval value changes depending upon replication enabled or not.

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24173?focusedWorklogId=503273&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503273
 ]

ASF GitHub Bot logged work on HIVE-24173:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 16:00
Start Date: 21/Oct/20 16:00
Worklog Time Spent: 10m 
  Work Description: ArkoSharma opened a new pull request #1593:
URL: https://github.com/apache/hive/pull/1593


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503273)
Remaining Estimate: 0h
Time Spent: 10m

> notification cleanup interval value changes depending upon replication 
> enabled or not.
> --
>
> Key: HIVE-24173
> URL: https://issues.apache.org/jira/browse/HIVE-24173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently we use hive.metastore.event.db.listener.timetolive to determine how 
> long the events are stored in the RDBMS backing HMS. We should have another 
> configuration for the same purpose in the context of replication, so that a 
> longer time can be configured there; otherwise we can default to 1 day.
> hive.repl.cm.enabled can be used to identify whether replication is enabled. 
> If enabled, use the new configuration property to determine the TTL for 
> events in the RDBMS; else use hive.metastore.event.db.listener.timetolive.
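
The proposed selection logic can be sketched as follows (method and parameter names 
are illustrative, not Hive's API):

```java
public class EventTtlChooser {

    // When replication (hive.repl.cm.enabled) is on, use the longer,
    // replication-specific TTL for notification events; otherwise fall back
    // to hive.metastore.event.db.listener.timetolive.
    static long eventTtlSeconds(boolean replEnabled, long replTtlSeconds, long defaultTtlSeconds) {
        return replEnabled ? replTtlSeconds : defaultTtlSeconds;
    }

    public static void main(String[] args) {
        long replTtl = 24 * 60 * 60;   // 1 day, the proposed default for replication
        long defaultTtl = 2 * 60 * 60; // example value for the existing config
        System.out.println(eventTtlSeconds(true, replTtl, defaultTtl));  // 86400
        System.out.println(eventTtlSeconds(false, replTtl, defaultTtl)); // 7200
    }
}
```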



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24173) notification cleanup interval value changes depending upon replication enabled or not.

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24173:
--
Labels: pull-request-available  (was: )

> notification cleanup interval value changes depending upon replication 
> enabled or not.
> --
>
> Key: HIVE-24173
> URL: https://issues.apache.org/jira/browse/HIVE-24173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Arko Sharma
>Assignee: Arko Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently we use hive.metastore.event.db.listener.timetolive to determine how 
> long the events are stored in rdbms backing hms. We should have another 
> configuration for the same purpose in context of replication so that we have 
> longer time configured for that otherwise we can default to a 1 day.
> hive.repl.cm.enabled can be used to identify if replication is enabled or 
> not. if enabled use the new configuration property to determine ttl for 
> events in rdbms else use hive.metastore.event.db.listener.timetolive for ttl.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24256) REPL LOAD fails because of unquoted column name

2020-10-21 Thread Viacheslav Avramenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viacheslav Avramenko updated HIVE-24256:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> REPL LOAD fails because of unquoted column name
> ---
>
> Key: HIVE-24256
> URL: https://issues.apache.org/jira/browse/HIVE-24256
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Viacheslav Avramenko
>Assignee: Viacheslav Avramenko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-24256.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is an unquoted column name, NWI_TABLE, in one of the SQL queries 
> executed during REPL LOAD.
> This causes the command to fail when Postgres is used for the metastore.
> {code:sql}
> SELECT \"NWI_NEXT\" FROM \"NEXT_WRITE_ID\" WHERE \"NWI_DATABASE\" = ? AND 
> NWI_TABLE = ?
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24256) REPL LOAD fails because of unquoted column name

2020-10-21 Thread Viacheslav Avramenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218327#comment-17218327
 ] 

Viacheslav Avramenko commented on HIVE-24256:
-

Thanks, [~pvary] and [Miklos Gergely|https://reviews.apache.org/users/miki.gergely/]!

> REPL LOAD fails because of unquoted column name
> ---
>
> Key: HIVE-24256
> URL: https://issues.apache.org/jira/browse/HIVE-24256
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Viacheslav Avramenko
>Assignee: Viacheslav Avramenko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-24256.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is an unquoted column name, NWI_TABLE, in one of the SQL queries 
> executed during REPL LOAD.
> This causes the command to fail when Postgres is used for the metastore.
> {code:sql}
> SELECT \"NWI_NEXT\" FROM \"NEXT_WRITE_ID\" WHERE \"NWI_DATABASE\" = ? AND 
> NWI_TABLE = ?
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?focusedWorklogId=503213&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503213
 ]

ASF GitHub Bot logged work on HIVE-24275:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:22
Start Date: 21/Oct/20 14:22
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #1583:
URL: https://github.com/apache/hive/pull/1583#discussion_r509330534



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -272,4 +302,22 @@ private void removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
   fs.delete(dead, true);
 }
   }
+
+  /**
+   * Check whether the user-configured retention time for the cleanup of obsolete
+   * directories/files for the table has passed.
+   *
+   * @param ci CompactionInfo
+   * @return true if the retention time has passed and it is OK to clean, else false
+   */
+  public boolean isReadyToCleanWithRetentionPolicy(CompactionInfo ci) {

Review comment:
   This whole check could be added to the WHERE clause of findReadyToClean's SQL 
query. It could use TxnDbUtil.getEpochFn, so there would be no need for new 
fields in CompactionInfo.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503213)
Time Spent: 1h  (was: 50m)

> Configurations to delay the deletion of obsolete files by the Cleaner
> -
>
> Key: HIVE-24275
> URL: https://issues.apache.org/jira/browse/HIVE-24275
> Project: Hive
>  Issue Type: New Feature
>Reporter: Kishen Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Whenever compaction happens, the cleaner immediately deletes the older obsolete 
> files. In certain cases it would be beneficial to retain these for a certain 
> period, for example if you are serving file metadata from a cache and 
> don't want to invalidate the cache during compaction for performance reasons.
> For this purpose we should introduce a configuration, 
> hive.compactor.delayed.cleanup.enabled, which when enabled will delay 
> cleaning up obsolete files. There should be a separate configuration, 
> CLEANER_RETENTION_TIME, to specify the duration for which we should retain 
> these older obsolete files.
> It might be beneficial to have one more configuration, 
> hive.compactor.aborted.txn.delayed.cleanup.enabled, to decide whether to 
> retain files involved in an aborted transaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?focusedWorklogId=503210&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503210
 ]

ASF GitHub Bot logged work on HIVE-24275:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:18
Start Date: 21/Oct/20 14:18
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #1583:
URL: https://github.com/apache/hive/pull/1583#discussion_r509327437



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
##
@@ -237,10 +238,22 @@ public void markCompacted(CompactionInfo info) throws MetaException {
   try {
     dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
     stmt = dbConn.createStatement();
-    String s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_STATE\" = '" + READY_FOR_CLEANING + "', "
+    long now = getDbTime(dbConn);
+    String s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_META_INFO\" = " + now + ", \"CQ_STATE\" = '" + READY_FOR_CLEANING + "', "

Review comment:
   You should use a new field for that, and you can also leverage 
TxnDbUtil.getEpochFn.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503210)
Time Spent: 50m  (was: 40m)

> Configurations to delay the deletion of obsolete files by the Cleaner
> -
>
> Key: HIVE-24275
> URL: https://issues.apache.org/jira/browse/HIVE-24275
> Project: Hive
>  Issue Type: New Feature
>Reporter: Kishen Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Whenever compaction happens, the cleaner immediately deletes the older obsolete 
> files. In certain cases it would be beneficial to retain these for a certain 
> period, for example if you are serving file metadata from a cache and 
> don't want to invalidate the cache during compaction for performance reasons.
> For this purpose we should introduce a configuration, 
> hive.compactor.delayed.cleanup.enabled, which when enabled will delay 
> cleaning up obsolete files. There should be a separate configuration, 
> CLEANER_RETENTION_TIME, to specify the duration for which we should retain 
> these older obsolete files.
> It might be beneficial to have one more configuration, 
> hive.compactor.aborted.txn.delayed.cleanup.enabled, to decide whether to 
> retain files involved in an aborted transaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?focusedWorklogId=503209&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503209
 ]

ASF GitHub Bot logged work on HIVE-24275:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:15
Start Date: 21/Oct/20 14:15
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #1583:
URL: https://github.com/apache/hive/pull/1583#discussion_r509324748



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -229,6 +233,7 @@ private static String idWatermark(CompactionInfo ci) {
   private void removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
       throws IOException, NoSuchObjectException, MetaException {
 Path locPath = new Path(location);
+FileSystem fs = locPath.getFileSystem(conf);

Review comment:
   This fs can be passed to getAcidState.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503209)
Time Spent: 40m  (was: 0.5h)

> Configurations to delay the deletion of obsolete files by the Cleaner
> -
>
> Key: HIVE-24275
> URL: https://issues.apache.org/jira/browse/HIVE-24275
> Project: Hive
>  Issue Type: New Feature
>Reporter: Kishen Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Whenever compaction happens, the cleaner immediately deletes the older obsolete 
> files. In certain cases it would be beneficial to retain these for a certain 
> period, for example if you are serving file metadata from a cache and 
> don't want to invalidate the cache during compaction for performance reasons.
> For this purpose we should introduce a configuration, 
> hive.compactor.delayed.cleanup.enabled, which when enabled will delay 
> cleaning up obsolete files. There should be a separate configuration, 
> CLEANER_RETENTION_TIME, to specify the duration for which we should retain 
> these older obsolete files.
> It might be beneficial to have one more configuration, 
> hive.compactor.aborted.txn.delayed.cleanup.enabled, to decide whether to 
> retain files involved in an aborted transaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?focusedWorklogId=503208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503208
 ]

ASF GitHub Bot logged work on HIVE-24275:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:12
Start Date: 21/Oct/20 14:12
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #1583:
URL: https://github.com/apache/hive/pull/1583#discussion_r509322335



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -244,7 +249,32 @@ private void removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
     obsoleteDirs.addAll(dir.getAbortedDirectories());
     List<Path> filesToDelete = new ArrayList<>(obsoleteDirs.size());
     StringBuilder extraDebugInfo = new StringBuilder("[");
+    boolean delayedCleanupEnabled = conf.getBoolVar(HiveConf.ConfVars.HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
+
     for (Path stat : obsoleteDirs) {
+      if (delayedCleanupEnabled) {
+        String filename = stat.toString();
+        if (filename.startsWith(AcidUtils.BASE_PREFIX)) {
+          long writeId = AcidUtils.ParsedBase.parseBase(stat).getWriteId();
+          if (ci.type == CompactionType.MINOR) {
+            LOG.info("Skipping base dir " + stat + " as this cleanup is for minor compaction"
+                + ", compaction id " + ci.id);
+            continue;
+          } else if (writeId > writeIdList.getHighWatermark()) {
+            LOG.info("Skipping base dir " + stat + " deletion as WriteId of this base dir is"
+                + " greater than highWaterMark for compaction id " + ci.id);
+            continue;
+          }
+        }
+        else if (filename.startsWith(AcidUtils.DELTA_PREFIX) || filename.startsWith(AcidUtils.DELETE_DELTA_PREFIX)) {
+          AcidUtils.ParsedDelta delta = AcidUtils.parsedDelta(stat, fs);
+          if (delta.getMaxWriteId() > writeIdList.getHighWatermark()) {

Review comment:
   I am not sure about this check. I guess it is here to prepare for the case 
when there were two compactions and we are doing the cleanup of the first one, 
and we don't want to clean up the files that were compacted by the second one. 
But the cleaner's validWriteId list is capped by the minOpenTxnId, so if 
everything was committed, writeIdList.getHighWatermark() will be NEXT_WRITE_ID - 1 
and it won't prevent cleaning the output of the second compaction. Maybe you can 
use the highestWriteId from the CompactionInfo? Not sure.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 503208)
Time Spent: 0.5h  (was: 20m)

> Configurations to delay the deletion of obsolete files by the Cleaner
> -
>
> Key: HIVE-24275
> URL: https://issues.apache.org/jira/browse/HIVE-24275
> Project: Hive
>  Issue Type: New Feature
>Reporter: Kishen Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Whenever compaction happens, the cleaner immediately deletes the older obsolete 
> files. In certain cases it would be beneficial to retain these for a certain 
> period. For example: if you are serving the file metadata from a cache and 
> don't want to invalidate the cache during compaction, for performance 
> reasons. 
> For this purpose we should introduce a configuration, 
> hive.compactor.delayed.cleanup.enabled, which if enabled will delay 
> cleaning up obsolete files. There should be a separate configuration, 
> CLEANER_RETENTION_TIME, to specify the duration for which we should retain 
> these older obsolete files. 
> It might be beneficial to have one more configuration to decide whether to 
> retain files involved in an aborted transaction: 
> hive.compactor.aborted.txn.delayed.cleanup.enabled.
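A minimal sketch of the retention idea described above. This is illustrative only: the class, field, and method names are assumptions, not Hive's Cleaner API. Obsolete directories younger than the retention period are skipped so they can be revisited on a later cleaner cycle.

```java
import java.util.ArrayList;
import java.util.List;

public class RetentionFilter {

  // Hypothetical holder for an obsolete directory and its last-modified time.
  static final class ObsoleteDir {
    final String path;
    final long modificationTime; // epoch millis

    ObsoleteDir(String path, long modificationTime) {
      this.path = path;
      this.modificationTime = modificationTime;
    }
  }

  /**
   * Keep only directories whose age exceeds the retention period.
   * Newer directories are skipped, so cached file metadata stays valid
   * for a while after compaction instead of being invalidated at once.
   */
  static List<String> eligibleForDeletion(List<ObsoleteDir> candidates,
                                          long nowMillis, long retentionMillis) {
    List<String> toDelete = new ArrayList<>();
    for (ObsoleteDir dir : candidates) {
      if (nowMillis - dir.modificationTime >= retentionMillis) {
        toDelete.add(dir.path);
      }
    }
    return toDelete;
  }
}
```

A directory skipped here is simply picked up by a subsequent cleaner run once its age crosses the retention threshold.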



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24275) Configurations to delay the deletion of obsolete files by the Cleaner

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24275?focusedWorklogId=503205=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503205
 ]

ASF GitHub Bot logged work on HIVE-24275:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:04
Start Date: 21/Oct/20 14:04
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on a change in pull request #1583:
URL: https://github.com/apache/hive/pull/1583#discussion_r509315841



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##
@@ -244,7 +249,32 @@ private void removeFiles(String location, ValidWriteIdList writeIdList, Compacti
 obsoleteDirs.addAll(dir.getAbortedDirectories());
 List filesToDelete = new ArrayList<>(obsoleteDirs.size());
 StringBuilder extraDebugInfo = new StringBuilder("[");
+boolean delayedCleanupEnabled = conf.getBoolVar(HiveConf.ConfVars.HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED);
+
 for (Path stat : obsoleteDirs) {
+  if (delayedCleanupEnabled) {
+    String filename = stat.toString();
+    if (filename.startsWith(AcidUtils.BASE_PREFIX)) {
+      long writeId = AcidUtils.ParsedBase.parseBase(stat).getWriteId();
+      if (ci.type == CompactionType.MINOR) {
+        LOG.info("Skipping base dir " + stat + " as this cleanup is for minor compaction"
+            + ", compaction id " + ci.id);
+        continue;
+      } else if (writeId > writeIdList.getHighWatermark()) {
+        LOG.info("Skipping base dir " + stat + " deletion as WriteId of this base dir is"
+            + " greater than highWaterMark for compaction id " + ci.id);
+        continue;
+      }
+    } else if (filename.startsWith(AcidUtils.DELTA_PREFIX) || filename.startsWith(AcidUtils.DELETE_DELTA_PREFIX)) {
+      AcidUtils.ParsedDelta delta = AcidUtils.parsedDelta(stat, fs);

Review comment:
   There is a ParsedDeltaLight in AcidUtils, which is a cheaper way to parse 
out the maxWriteId. This parsedDelta method will issue an FS call to check if 
the delta is in raw format.
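The cost difference follows from ACID directory naming: a delta directory name already encodes its write-id range (delta_<minWriteId>_<maxWriteId>), so the max write id can be recovered from the path alone, with no filesystem call. A hedged, self-contained sketch of that name-only parsing (not the actual AcidUtils code, just the naming convention it relies on):

```java
public class DeltaName {
  /**
   * Parse the max write id out of a delta directory name such as
   * "delta_0000001_0000005" or "delete_delta_0000001_0000005".
   * No filesystem access is needed; the id is part of the name itself.
   */
  static long maxWriteId(String dirName) {
    String rest = dirName.startsWith("delete_delta_")
        ? dirName.substring("delete_delta_".length())
        : dirName.substring("delta_".length());
    String[] parts = rest.split("_");
    // parts[0] = min write id, parts[1] = max write id; a statement-id
    // suffix may follow in real directory names but is ignored here.
    return Long.parseLong(parts[1]);
  }
}
```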







Issue Time Tracking
---

Worklog Id: (was: 503205)
Time Spent: 20m  (was: 10m)






[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=503203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503203
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 14:00
Start Date: 21/Oct/20 14:00
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r509312017



##
File path: ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java
##
@@ -95,7 +95,7 @@ public static SessionState setUpSessionState(HiveConf conf, String user, boolean
 if (sessionState == null) {
   // Note: we assume that workers run on the same threads repeatedly, so we can set up
   //   the session here and it will be reused without explicitly storing in the worker.
-  sessionState = new SessionState(conf, user);
+  sessionState = new SessionState(conf, user, true);

Review comment:
   Background threads do not need async delete. Many compaction tests 
specifically have sync assumptions. I don't see any benefit in moving background 
operations to an async cleanup model.







Issue Time Tracking
---

Worklog Id: (was: 503203)
Time Spent: 1h 10m  (was: 1h)

> Move scratchdir cleanup to background
> -
>
> Key: HIVE-24270
> URL: https://issues.apache.org/jira/browse/HIVE-24270
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mustafa Iman
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In a cloud environment, scratchdir cleaning at the end of the query may take a 
> long time. This causes the client to hang for up to 1 minute even after the 
> results were streamed back. During this time the client just waits for cleanup 
> to finish. Cleanup can take place in the background in HiveServer.
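The improvement boils down to handing the recursive delete to a background executor in the server instead of running it on the query thread. A minimal sketch of the pattern (hypothetical class name; not the HiveServer2 implementation):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncScratchDirCleaner {
  // Single background worker so slow deletes cannot delay query completion.
  private final ExecutorService cleanupPool = Executors.newSingleThreadExecutor();

  /** Schedule recursive deletion of a scratch dir and return immediately. */
  Future<?> deleteAsync(Path scratchDir) {
    return cleanupPool.submit(() -> {
      try {
        // Walk in reverse order so files are removed before their parent dirs.
        Files.walk(scratchDir)
            .sorted(Comparator.reverseOrder())
            .forEach(p -> p.toFile().delete());
      } catch (Exception e) {
        // Best effort: a leaked scratch dir is preferable to failing the query.
      }
    });
  }

  void shutdown() {
    cleanupPool.shutdown();
  }
}
```

The caller gets its result back as soon as the delete is queued; the returned Future is only needed by code (such as tests) that must wait for the cleanup to actually finish.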





[jira] [Work logged] (HIVE-24284) NPE when parsing druid logs using Hive

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24284?focusedWorklogId=503122=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503122
 ]

ASF GitHub Bot logged work on HIVE-24284:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 10:12
Start Date: 21/Oct/20 10:12
Worklog Time Spent: 10m 
  Work Description: maheshk114 merged pull request #1586:
URL: https://github.com/apache/hive/pull/1586


   





Issue Time Tracking
---

Worklog Id: (was: 503122)
Time Spent: 20m  (was: 10m)

> NPE when parsing druid logs using Hive
> --
>
> Key: HIVE-24284
> URL: https://issues.apache.org/jira/browse/HIVE-24284
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As per the current syslog parser, it always expects a valid proc id. But 
> as per RFC 3164 and RFC 5424, the proc id can be skipped, so Hive should handle 
> it by using the NILVALUE/an empty string in case the proc id is null.
>  
> {code:java}
> Caused by: java.lang.NullPointerException: null
> at java.lang.String.(String.java:566)
> at 
> org.apache.hadoop.hive.ql.log.syslog.SyslogParser.createEvent(SyslogParser.java:361)
> at 
> org.apache.hadoop.hive.ql.log.syslog.SyslogParser.readEvent(SyslogParser.java:326)
> at 
> org.apache.hadoop.hive.ql.log.syslog.SyslogSerDe.deserialize(SyslogSerDe.java:95)
>  {code}
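A hedged sketch of the null-safe handling the description calls for (not the actual SyslogParser code): map a missing PROCID to the RFC 5424 NILVALUE instead of passing null to new String(...).

```java
public class SyslogField {
  static final String NILVALUE = "-"; // RFC 5424 placeholder for "no value"

  /** Convert a possibly-null raw PROCID field into a safe string. */
  static String procId(byte[] raw) {
    if (raw == null) {
      return NILVALUE; // avoids the NPE thrown by new String(null)
    }
    return new String(raw, java.nio.charset.StandardCharsets.UTF_8);
  }
}
```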





[jira] [Commented] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218186#comment-17218186
 ] 

Denys Kuzmenko commented on HIVE-21052:
---

[~vpnvishv], do you plan to cherry-pick this change into branch-3.1? We did it 
a bit differently in master (no p-type compaction); it would be great if you 
could take a look.

> Make sure transactions get cleaned if they are aborted before addPartitions 
> is called
> -
>
> Key: HIVE-21052
> URL: https://issues.apache.org/jira/browse/HIVE-21052
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Aborted Txn w_Direct Write.pdf, HIVE-21052.1.patch, 
> HIVE-21052.10.patch, HIVE-21052.11.patch, HIVE-21052.12.patch, 
> HIVE-21052.2.patch, HIVE-21052.3.patch, HIVE-21052.4.patch, 
> HIVE-21052.5.patch, HIVE-21052.6.patch, HIVE-21052.7.patch, 
> HIVE-21052.8.patch, HIVE-21052.9.patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> If the transaction is aborted between openTxn and addPartitions and data has 
> been written on the table the transaction manager will think it's an empty 
> transaction and no cleaning will be done.
> This is currently an issue in the streaming API and in micromanaged tables. 
> As proposed by [~ekoifman] this can be solved by:
> * Writing an entry with a special marker to TXN_COMPONENTS at openTxn and 
> when addPartitions is called remove this entry from TXN_COMPONENTS and add 
> the corresponding partition entry to TXN_COMPONENTS.
> * If the cleaner finds an entry with a special marker in TXN_COMPONENTS that 
> specifies that a transaction was opened and aborted, it must generate 
> jobs for the worker for every possible partition available.
> cc [~ewohlstadter]
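The proposed marker protocol can be modelled in a few lines. This toy in-memory model is purely illustrative (the real implementation is SQL against the TXN_COMPONENTS table): a marker row is written at openTxn, replaced by real partition rows at addPartitions, and a surviving marker on an aborted txn tells the cleaner it must cover every possible partition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TxnComponentsModel {
  // A single MARKER entry means "txn opened, partitions not yet known".
  static final String MARKER = "_MARKER_";

  // txnId -> component entries (stand-in for TXN_COMPONENTS rows).
  final Map<Long, List<String>> txnComponents = new HashMap<>();

  /** openTxn writes the special marker entry. */
  void openTxn(long txnId) {
    txnComponents.put(txnId, new ArrayList<>(List.of(MARKER)));
  }

  /** addPartitions swaps the marker for the actual partition entries. */
  void addPartitions(long txnId, List<String> partitions) {
    List<String> entries = txnComponents.get(txnId);
    entries.remove(MARKER);
    entries.addAll(partitions);
  }

  /**
   * True if an aborted txn still carries the marker, i.e. it was aborted
   * before addPartitions and the cleaner must generate jobs for every
   * possible partition.
   */
  boolean needsFullCleanup(long txnId) {
    return txnComponents.getOrDefault(txnId, List.of()).contains(MARKER);
  }
}
```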





[jira] [Comment Edited] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218183#comment-17218183
 ] 

Denys Kuzmenko edited comment on HIVE-21052 at 10/21/20, 8:23 AM:
--

Pushed to master.
Thanks [~jmarhuen], [~vpnvishv], [~asomani] for the contribution and 
[~ekoifman], [~pvarga], [~klcopp] for the review!


was (Author: dkuzmenko):
Pushed to master.
Thanks [~jmarhuen], [~vpnvishv], [~asomani] for the contribution and [~pvarga], 
[~klcopp] for the review!






[jira] [Comment Edited] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218183#comment-17218183
 ] 

Denys Kuzmenko edited comment on HIVE-21052 at 10/21/20, 8:20 AM:
--

Pushed to master.
Thanks [~jmarhuen], [~vpnvishv], [~asomani] for the contribution and [~pvarga], 
[~klcopp] for the review!


was (Author: dkuzmenko):
Pushed to master.
Thanks [~jmarhuen], [~vpnvishv] for the contribution and [~pvarga], [~klcopp] 
for the review!






[jira] [Commented] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218183#comment-17218183
 ] 

Denys Kuzmenko commented on HIVE-21052:
---

Pushed to master.
Thanks [~jmarhuen], [~vpnvishv] for the contribution and [~pvarga], [~klcopp] 
for the review!






[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=503053=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503053
 ]

ASF GitHub Bot logged work on HIVE-21052:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 07:51
Start Date: 21/Oct/20 07:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged pull request #1548:
URL: https://github.com/apache/hive/pull/1548


   





Issue Time Tracking
---

Worklog Id: (was: 503053)
Time Spent: 12h 10m  (was: 12h)






[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=503051=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503051
 ]

ASF GitHub Bot logged work on HIVE-21052:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 07:37
Start Date: 21/Oct/20 07:37
Worklog Time Spent: 10m 
  Work Description: klcopp commented on a change in pull request #1548:
URL: https://github.com/apache/hive/pull/1548#discussion_r509054430



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
##
@@ -414,76 +403,30 @@ public void markCleaned(CompactionInfo info) throws MetaException {
   * aborted TXN_COMPONENTS above tc_writeid (and consequently about aborted txns).
   * See {@link ql.txn.compactor.Cleaner.removeFiles()}
   */
-    s = "SELECT DISTINCT \"TXN_ID\" FROM \"TXNS\", \"TXN_COMPONENTS\" WHERE \"TXN_ID\" = \"TC_TXNID\" "
-        + "AND \"TXN_STATE\" = " + TxnStatus.ABORTED + " AND \"TC_DATABASE\" = ? AND \"TC_TABLE\" = ?";
-    if (info.highestWriteId != 0) s += " AND \"TC_WRITEID\" <= ?";
-    if (info.partName != null) s += " AND \"TC_PARTITION\" = ?";
-
+    s = "DELETE FROM \"TXN_COMPONENTS\" WHERE \"TC_TXNID\" IN (" +

Review comment:
   never mind, LGTM







Issue Time Tracking
---

Worklog Id: (was: 503051)
Time Spent: 12h  (was: 11h 50m)






[jira] [Work logged] (HIVE-24256) REPL LOAD fails because of unquoted column name

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24256?focusedWorklogId=503050=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503050
 ]

ASF GitHub Bot logged work on HIVE-24256:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 07:31
Start Date: 21/Oct/20 07:31
Worklog Time Spent: 10m 
  Work Description: pvary merged pull request #1569:
URL: https://github.com/apache/hive/pull/1569


   





Issue Time Tracking
---

Worklog Id: (was: 503050)
Time Spent: 0.5h  (was: 20m)

> REPL LOAD fails because of unquoted column name
> ---
>
> Key: HIVE-24256
> URL: https://issues.apache.org/jira/browse/HIVE-24256
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Viacheslav Avramenko
>Assignee: Viacheslav Avramenko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-24256.01.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is an unquoted column name, NWI_TABLE, in one of the SQL queries 
> executed during REPL LOAD. 
> This causes the command to fail when Postgres is used for the metastore.
> {code:sql}
> SELECT \"NWI_NEXT\" FROM \"NEXT_WRITE_ID\" WHERE \"NWI_DATABASE\" = ? AND 
> NWI_TABLE = ?
> {code}
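The fix is to quote the identifier like its neighbours; Postgres folds unquoted identifiers to lower case, so bare NWI_TABLE would not match the column created as "NWI_TABLE". A sketch of the corrected statement as a Java string constant (hypothetical constant name):

```java
public class QuotedSql {
  // Every identifier is quoted; unquoted NWI_TABLE would be folded to
  // nwi_table by Postgres and fail to match the actual column name.
  static final String SELECT_NEXT_WRITE_ID =
      "SELECT \"NWI_NEXT\" FROM \"NEXT_WRITE_ID\""
          + " WHERE \"NWI_DATABASE\" = ? AND \"NWI_TABLE\" = ?";
}
```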





[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=503049=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503049
 ]

ASF GitHub Bot logged work on HIVE-21052:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 07:20
Start Date: 21/Oct/20 07:20
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1548:
URL: https://github.com/apache/hive/pull/1548#discussion_r508809944



##
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
##
@@ -589,4 +593,9 @@ private void checkInterrupt() throws InterruptedException {
   throw new InterruptedException("Compaction execution is interrupted");
 }
   }
-}
+
+  private static boolean isDynPartAbort(Table t, CompactionInfo ci) {

Review comment:
   Those are actually 2 different methods; the only common part is the check 
for isDynPart. Also, there is no CompactionUtils, only CompactorUtil, which 
contains the thread factory stuff.







Issue Time Tracking
---

Worklog Id: (was: 503049)
Time Spent: 11h 50m  (was: 11h 40m)






[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background

2020-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=503046=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-503046
 ]

ASF GitHub Bot logged work on HIVE-24270:
-

Author: ASF GitHub Bot
Created on: 21/Oct/20 07:00
Start Date: 21/Oct/20 07:00
Worklog Time Spent: 10m 
  Work Description: nareshpr commented on a change in pull request #1577:
URL: https://github.com/apache/hive/pull/1577#discussion_r509033974



##
File path: ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java
##
@@ -95,7 +95,7 @@ public static SessionState setUpSessionState(HiveConf conf, String user, boolean
 if (sessionState == null) {
   // Note: we assume that workers run on the same threads repeatedly, so we can set up
   //   the session here and it will be reused without explicitly storing in the worker.
-  sessionState = new SessionState(conf, user);
+  sessionState = new SessionState(conf, user, true);

Review comment:
   Are we targeting specific queries like auto-gather background stats 
threads & compaction? Why are we not providing a config to toggle sync/async 
delete?







Issue Time Tracking
---

Worklog Id: (was: 503046)
Time Spent: 1h  (was: 50m)



