[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=784107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784107
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/Jun/22 09:55
Start Date: 23/Jun/22 09:55
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #3307:
URL: https://github.com/apache/hive/pull/3307




Issue Time Tracking
---

Worklog Id: (was: 784107)
Time Spent: 10.5h  (was: 10h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783767=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783767
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 22/Jun/22 08:35
Start Date: 22/Jun/22 08:35
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r903452007


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3225,6 +3237,14 @@ public static String getPathSuffix(long txnId) {
     return (SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId));
   }
 
+  public static boolean isExclusiveCTASEnabled(Configuration conf) {
+    return HiveConf.getBoolVar(conf, ConfVars.TXN_CTAS_X_LOCK);
+  }
+
+  public static boolean isExclusiveCTASEnabled(Table t, Configuration conf) {
+    return HiveConf.getBoolVar(conf, ConfVars.TXN_CTAS_X_LOCK) && isTransactionalTable(t);

Review Comment:
   If you'll be submitting new fixes, could we please rename this method to
`isTableExclusiveCTASEnabled` to avoid confusion with `isExclusiveCTASEnabled`?
Alternatively, just remove `isExclusiveCTASEnabled`: it's used in only one
place, so you could access the conf there directly.
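
   For illustration, the two helpers might look like this after the rename.
This is a minimal sketch, not the PR's code: a plain `Map` stands in for
`Configuration`, a boolean stands in for `isTransactionalTable(t)`, and the
property key is a placeholder rather than the real name behind
`ConfVars.TXN_CTAS_X_LOCK`.

```java
import java.util.Map;

// Hypothetical stand-ins only: the real helpers take Hive's Configuration and Table.
final class ExclusiveCtasFlags {
  // Placeholder key (assumption); not the actual property name in HiveConf.
  static final String TXN_CTAS_X_LOCK = "placeholder.txn.ctas.x.lock";

  // Config-only check, usable where no table object is available.
  static boolean isExclusiveCTASEnabled(Map<String, String> conf) {
    return Boolean.parseBoolean(conf.getOrDefault(TXN_CTAS_X_LOCK, "false"));
  }

  // Table-aware check: the flag must be on AND the table must be transactional.
  static boolean isTableExclusiveCTASEnabled(boolean transactionalTable, Map<String, String> conf) {
    return isExclusiveCTASEnabled(conf) && transactionalTable;
  }

  public static void main(String[] args) {
    Map<String, String> conf = Map.of(TXN_CTAS_X_LOCK, "true");
    System.out.println(isExclusiveCTASEnabled(conf));
    System.out.println(isTableExclusiveCTASEnabled(false, conf));
  }
}
```

   With distinct names, a call site shows at a glance whether the table's
transactional property is part of the decision.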





Issue Time Tracking
---

Worklog Id: (was: 783767)
Time Spent: 10h 20m  (was: 10h 10m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783765=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783765
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 22/Jun/22 08:29
Start Date: 22/Jun/22 08:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r903446307


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(t, conf)) {
+          compBuilder.setExclWrite();
+          compBuilder.setOperationType(DataOperationType.NO_TXN);
+          break;
+        }

Review Comment:
   I do not see a `continue` keyword.





Issue Time Tracking
---

Worklog Id: (was: 783765)
Time Spent: 10h 10m  (was: 10h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783753
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 22/Jun/22 08:03
Start Date: 22/Jun/22 08:03
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r903420540


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(t, conf)) {
+          compBuilder.setExclWrite();
+          compBuilder.setOperationType(DataOperationType.NO_TXN);
+          break;
+        }

Review Comment:
   Yes, it should continue if it is not a transactional table or the config is 
not enabled.
   This was the old behavior as well.
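
   To make the control flow concrete, here is a small self-contained sketch
of the fall-through. The enum, the method, and the returned labels are
hypothetical; in particular, treating the next case as generic insert
handling is an assumption, since the hunk does not show what follows
`case CTAS:`.

```java
// Illustrative only: mimics "break when exclusive, otherwise fall through".
final class CtasFallThrough {
  enum WriteType { CTAS, INSERT }

  // Returns a label describing which branch handled the write type.
  static String handle(WriteType type, boolean exclusiveCtas) {
    switch (type) {
      case CTAS:
        if (exclusiveCtas) {
          // Exclusive path: leaves the switch here (a `break` in the real code).
          return "EXCL_WRITE/NO_TXN";
        }
        // No break on purpose: when the check fails, control falls through
        // to the next case, which is the pre-existing behaviour.
      case INSERT:
        return "DEFAULT_INSERT_HANDLING";
      default:
        throw new IllegalArgumentException("unexpected write type: " + type);
    }
  }

  public static void main(String[] args) {
    System.out.println(handle(WriteType.CTAS, true));
    System.out.println(handle(WriteType.CTAS, false));
  }
}
```

   Java's `continue` only applies to loops, so inside a `switch` the
"continue otherwise" behaviour comes from omitting `break`.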





Issue Time Tracking
---

Worklog Id: (was: 783753)
Time Spent: 10h  (was: 9h 50m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783749=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783749
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 22/Jun/22 07:52
Start Date: 22/Jun/22 07:52
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r903406430


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(t, conf)) {
+          compBuilder.setExclWrite();
+          compBuilder.setOperationType(DataOperationType.NO_TXN);
+          break;
+        }

Review Comment:
   should we continue otherwise?





Issue Time Tracking
---

Worklog Id: (was: 783749)
Time Spent: 9h 50m  (was: 9h 40m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783735
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 22/Jun/22 07:26
Start Date: 22/Jun/22 07:26
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r903379054


##
serde/if/test/complex.thrift:
##
@@ -1,3 +1,4 @@
+

Review Comment:
   nit: space





Issue Time Tracking
---

Worklog Id: (was: 783735)
Time Spent: 9h 40m  (was: 9.5h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783140
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 22:24
Start Date: 20/Jun/22 22:24
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901999420


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(conf) && AcidUtils.isTransactionalTable(t)) {

Review Comment:
   No, since we do not have table data at the other places where the 
isExclusiveCTASEnabled method is called.





Issue Time Tracking
---

Worklog Id: (was: 783140)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=783128=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783128
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 21:47
Start Date: 20/Jun/22 21:47
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901984683


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3163,6 +3168,7 @@ private boolean shouldUpdateTxnComponent(long txnid, LockRequest rqst, LockCompo
         case INSERT:
         case UPDATE:
         case DELETE:
+        case CTAS:

Review Comment:
   It was NO_TXN before.





Issue Time Tracking
---

Worklog Id: (was: 783128)
Time Spent: 9h 20m  (was: 9h 10m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782901=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782901
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 09:20
Start Date: 20/Jun/22 09:20
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901442197


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(conf) && AcidUtils.isTransactionalTable(t)) {

Review Comment:
   Could we move `AcidUtils.isTransactionalTable(t)` check inside of 
isExclusiveCTASEnabled?
   
   AcidUtils.isExclusiveCTASEnabled(t, conf)
   





Issue Time Tracking
---

Worklog Id: (was: 782901)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782898=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782898
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 09:19
Start Date: 20/Jun/22 09:19
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901442197


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(conf) && AcidUtils.isTransactionalTable(t)) {

Review Comment:
   Could we move `AcidUtils.isTransactionalTable(t)` check inside of 
isExclusiveCTASEnabled?





Issue Time Tracking
---

Worklog Id: (was: 782898)
Time Spent: 9h  (was: 8h 50m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782879
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:51
Start Date: 20/Jun/22 08:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901411913


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3111,11 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isExclusiveCTAS(List<LockComponent> lockComponents, HiveConf conf) {
+    return lockComponents.stream().anyMatch(lc -> DataOperationType.CTAS == lc.getOperationType()

Review Comment:
   use `Set<WriteEntity>` outputs.getType
   
   isExclusiveCTAS(work.getOutputs())
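
   A rough sketch of what checking the outputs instead of the lock components
could look like. `Output` and `WriteType` below are hypothetical stand-ins
for Hive's `WriteEntity` and its write type, and the single-argument
signature simply follows the suggested call; any config check is left out
here.

```java
import java.util.Set;

// Illustrative only: scans the statement outputs for a CTAS write.
final class ExclusiveCtasCheck {
  enum WriteType { CTAS, INSERT, UPDATE }

  // Hypothetical stand-in for WriteEntity: carries just the write type.
  static final class Output {
    private final WriteType writeType;

    Output(WriteType writeType) {
      this.writeType = writeType;
    }

    WriteType getWriteType() {
      return writeType;
    }
  }

  // True if any output of the statement is written by a CTAS.
  static boolean isExclusiveCTAS(Set<Output> outputs) {
    return outputs.stream().anyMatch(o -> o.getWriteType() == WriteType.CTAS);
  }

  public static void main(String[] args) {
    Set<Output> outputs = Set.of(new Output(WriteType.CTAS));
    System.out.println(isExclusiveCTAS(outputs));
  }
}
```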
   





Issue Time Tracking
---

Worklog Id: (was: 782879)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782877=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782877
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:50
Start Date: 20/Jun/22 08:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901411913


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3111,11 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isExclusiveCTAS(List<LockComponent> lockComponents, HiveConf conf) {
+    return lockComponents.stream().anyMatch(lc -> DataOperationType.CTAS == lc.getOperationType()

Review Comment:
   use `Set<WriteEntity>` outputs.getType
   
   isExclusiveCTAS(work.getOutputs())
   



##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/OperationType.java:
##
@@ -34,7 +34,8 @@ public enum OperationType {
   INSERT('i', DataOperationType.INSERT),
   UPDATE('u', DataOperationType.UPDATE),
   DELETE('d', DataOperationType.DELETE),
-  COMPACT('c', null);
+  COMPACT('c', null),
+  CTAS('t', DataOperationType.CTAS);

Review Comment:
   remove this





Issue Time Tracking
---

Worklog Id: (was: 782877)
Time Spent: 8h 40m  (was: 8.5h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782876
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:47
Start Date: 20/Jun/22 08:47
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901411252


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3086,6 +3086,13 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
             output.getWriteType().name()));
         break;
 
+      case CTAS:
+        if (AcidUtils.isExclusiveCTASEnabled(conf) && AcidUtils.isTransactionalTable(t)) {
+          compBuilder.setExclWrite();
+          compBuilder.setOperationType(DataOperationType.CTAS);

Review Comment:
   set to `NO_TXN`





Issue Time Tracking
---

Worklog Id: (was: 782876)
Time Spent: 8.5h  (was: 8h 20m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782875
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:41
Start Date: 20/Jun/22 08:41
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901405027


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -310,6 +310,11 @@ abstract class TxnHandler implements TxnStore, TxnStore.MutexAPI {
       "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\" = " + TxnStatus.ABORTED +
       " GROUP BY \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" HAVING COUNT(\"TXN_ID\") > ?";
 
+  private static final String EX_CTAS_ERR_MSG =

Review Comment:
   rename to `EXCL_CTAS_ERR_MSG`





Issue Time Tracking
---

Worklog Id: (was: 782875)
Time Spent: 8h 20m  (was: 8h 10m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782874
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:40
Start Date: 20/Jun/22 08:40
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901399288


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3163,6 +3168,7 @@ private boolean shouldUpdateTxnComponent(long txnid, LockRequest rqst, LockCompo
         case INSERT:
         case UPDATE:
         case DELETE:
+        case CTAS:

Review Comment:
   Was the operationType for CTAS = 'i' before, so that we had to include it
here? I think this should be removed if we didn't insert before.





Issue Time Tracking
---

Worklog Id: (was: 782874)
Time Spent: 8h 10m  (was: 8h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782873=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782873
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:35
Start Date: 20/Jun/22 08:35
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901399288


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3163,6 +3168,7 @@ private boolean shouldUpdateTxnComponent(long txnid, LockRequest rqst, LockCompo
         case INSERT:
         case UPDATE:
         case DELETE:
+        case CTAS:

Review Comment:
   was the operationType for CTAS = 'i' before so we had to include it here?





Issue Time Tracking
---

Worklog Id: (was: 782873)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782870=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782870
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:33
Start Date: 20/Jun/22 08:33
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901397117


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5280,23 +5291,24 @@ is performed on that db (e.g. show tables, created table, etc).
 LOG.debug("Failure to acquire lock({} intLockId:{} {}), blocked by ({})", JavaUtils.lockIdToString(extLockId),
 intLockId, JavaUtils.txnIdToString(txnId), blockedBy);
 
-if (zeroWaitReadEnabled && isValidTxn(txnId)) {
+if ((zeroWaitReadEnabled || isExclusiveCTAS) && isValidTxn(txnId)) {
   LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
-  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
-
-  if (lockType == LockType.SHARED_READ) {
-String cleanupQuery = "DELETE FROM \"HIVE_LOCKS\" WHERE \"HL_LOCK_EXT_ID\" = " + extLockId;
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
 
-LOG.debug("Going to execute query: <" + cleanupQuery + ">");
-stmt.executeUpdate(cleanupQuery);
+  if (lockType == LockType.SHARED_READ || isExclusiveCTAS) {
+if (!isExclusiveCTAS) {

Review Comment:
   delete from HIVE_LOCKS should be done for isExclusiveCTAS as well
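
   As a sketch of the suggested shape (not the actual TxnHandler code): the
nested conditions from the hunk are collapsed into one predicate, and the
cleanup statement is built whenever that predicate holds, for zero-wait
reads and exclusive CTAS alike. The class, method names, and local
`LockType` enum are hypothetical; only the query text is taken from the
hunk.

```java
// Illustrative only: plain booleans replace the real transaction/lock state.
final class LockCleanupSketch {
  enum LockType { SHARED_READ, SHARED_WRITE, EXCL_WRITE }

  // Mirrors: ((zeroWaitReadEnabled || isExclusiveCTAS) && isValidTxn(txnId))
  //          followed by (lockType == SHARED_READ || isExclusiveCTAS).
  static boolean entersCleanupBranch(boolean zeroWaitReadEnabled, boolean isExclusiveCTAS,
                                     boolean validTxn, LockType lockType) {
    if ((zeroWaitReadEnabled || isExclusiveCTAS) && validTxn) {
      return lockType == LockType.SHARED_READ || isExclusiveCTAS;
    }
    return false;
  }

  // Query text as in the removed lines; under the suggestion it would run
  // for every request that enters the branch, with no extra CTAS exclusion.
  static String cleanupQuery(long extLockId) {
    return "DELETE FROM \"HIVE_LOCKS\" WHERE \"HL_LOCK_EXT_ID\" = " + extLockId;
  }

  public static void main(String[] args) {
    if (entersCleanupBranch(false, true, true, LockType.EXCL_WRITE)) {
      System.out.println(cleanupQuery(42L));
    }
  }
}
```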





Issue Time Tracking
---

Worklog Id: (was: 782870)
Time Spent: 7h 50m  (was: 7h 40m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782864=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782864
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:24
Start Date: 20/Jun/22 08:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901388096


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3163,6 +3168,7 @@ private boolean shouldUpdateTxnComponent(long txnid, LockRequest rqst, LockCompo
         case INSERT:
         case UPDATE:
         case DELETE:
+        case CTAS:

Review Comment:
   do we need new operationType here?





Issue Time Tracking
---

Worklog Id: (was: 782864)
Time Spent: 7.5h  (was: 7h 20m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782865=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782865
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/Jun/22 08:24
Start Date: 20/Jun/22 08:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r901388685


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -4678,6 +4678,12 @@ public static enum ConfVars {
     HIVE_ACID_DIRECT_INSERT_ENABLED("hive.acid.direct.insert.enabled", true,
         "Enable writing the data files directly to the table's final destination instead of the staging directory."
         + "This optimization only applies on INSERT operations on ACID tables."),
+    HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED("hive.acid.check.for.concurrent.ctas.enabled", false,

Review Comment:
   unused config





Issue Time Tracking
---

Worklog Id: (was: 782865)
Time Spent: 7h 40m  (was: 7.5h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782251
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 17/Jun/22 06:37
Start Date: 17/Jun/22 06:37
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r899807718


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
             return response;
           }
         }
+
+        if (checkForConcurrentCtas && isValidTxn(txnId)) {
+          LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+              .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+          if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {
+
+            String deleteBlockedByTxnComp = "DELETE  FROM \"TXN_COMPONENTS\" WHERE" + " \"TC_TXNID\"=" + txnId;

Review Comment:
   Realized that the cleaner will take care of this. I have removed the delete 
query in the recent commit.





Issue Time Tracking
---

Worklog Id: (was: 782251)
Time Spent: 7h 10m  (was: 7h)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=782252=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782252
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 17/Jun/22 06:37
Start Date: 17/Jun/22 06:37
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r899808001


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
             return response;
           }
         }
+
+        if (checkForConcurrentCtas && isValidTxn(txnId)) {

Review Comment:
   done





Issue Time Tracking
---

Worklog Id: (was: 782252)
Time Spent: 7h 20m  (was: 7h 10m)



[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781817=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781817
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 15/Jun/22 19:59
Start Date: 15/Jun/22 19:59
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r898367198


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3112,15 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isCTASOperation(List<LockComponent> lockComponents, HiveConf conf) {
+    boolean isCtas = false;
+    for (LockComponent lock : lockComponents) {
+      if (lock.getOperationType().name().equals(OperationType.CTAS.name()) &&
+          conf.getBoolVar(ConfVars.HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED))

Review Comment:
   Done.





Issue Time Tracking
---

Worklog Id: (was: 781817)
Time Spent: 7h  (was: 6h 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781808
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 15/Jun/22 19:40
Start Date: 15/Jun/22 19:40
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r898345184


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -4678,6 +4678,9 @@ public static enum ConfVars {
     HIVE_ACID_DIRECT_INSERT_ENABLED("hive.acid.direct.insert.enabled", true,
         "Enable writing the data files directly to the table's final destination instead of the staging directory."
         + "This optimization only applies on INSERT operations on ACID tables."),
+    HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED("hive.acid.check.for.concurrent.ctas.enabled", false,

Review Comment:
   done





Issue Time Tracking
---

Worklog Id: (was: 781808)
Time Spent: 6h 50m  (was: 6h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781807=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781807
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 15/Jun/22 19:39
Start Date: 15/Jun/22 19:39
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r898344978


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -420,6 +420,7 @@ LockState acquireLocks(QueryPlan plan, Context ctx, String username, boolean isB
     }
     List<LockComponent> lockComponents = AcidUtils.makeLockComponents(plan.getOutputs(), plan.getInputs(),
         ctx.getOperation(), conf);
+    rqstBuilder.setCheckForConcurrentCtas(AcidUtils.isCTASOperation(lockComponents, conf));

Review Comment:
   sure





Issue Time Tracking
---

Worklog Id: (was: 781807)
Time Spent: 6h 40m  (was: 6.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781806=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781806
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 15/Jun/22 19:39
Start Date: 15/Jun/22 19:39
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r898344710


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
       return response;
     }
   }
+
+  if (checkForConcurrentCtas && isValidTxn(txnId)) {
+    LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+        .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+    if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {
+
+      String deleteBlockedByTxnComp = "DELETE  FROM \"TXN_COMPONENTS\" WHERE" + " \"TC_TXNID\"=" + txnId;

Review Comment:
   No, a txn abort still leaves entries behind in TXN_COMPONENTS.
   This happens even when an insert query is aborted, so it looks like a bug.
   
   In the output below, TXN_ID = 2 was aborted, but TXN_COMPONENTS still has an entry for it.
   ```
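   mysql> select * from TXNS;
   +--------+-----------+---------------+--------------------+----------+-----------+----------------+---------------+---------------------+----------+
   | TXN_ID | TXN_STATE | TXN_STARTED   | TXN_LAST_HEARTBEAT | TXN_USER | TXN_HOST  | TXN_AGENT_INFO | TXN_META_INFO | TXN_HEARTBEAT_COUNT | TXN_TYPE |
   +--------+-----------+---------------+--------------------+----------+-----------+----------------+---------------+---------------------+----------+
   |      0 | c         |             0 |                  0 |          |           | NULL           | NULL          |                NULL |     NULL |
   |      1 | c         | 1655320638716 |      1655320638716 | hive     | localhost | NULL           | NULL          |                NULL |        0 |
   |      2 | a         | 1655320644989 |      1655320664896 | hive     | localhost | NULL           | NULL          |                NULL |        0 |
   |      3 | c         | 1655320688403 |      1655320688403 | hive     | localhost | NULL           | NULL          |                NULL |        0 |
   |      4 | c         | 1655320719660 |      1655320719660 | hive     | localhost | NULL           | NULL          |                NULL |        0 |
   +--------+-----------+---------------+--------------------+----------+-----------+----------------+---------------+---------------------+----------+
   
   mysql> select * from TXN_COMPONENTS;
   +----------+-------------+----------+--------------+-------------------+------------+
   | TC_TXNID | TC_DATABASE | TC_TABLE | TC_PARTITION | TC_OPERATION_TYPE | TC_WRITEID |
   +----------+-------------+----------+--------------+-------------------+------------+
   |        2 | default     | t1       | NULL         | i                 |          1 |
   +----------+-------------+----------+--------------+-------------------+------------+
   1 row in set (0.00 sec)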
   ```
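
The observation above amounts to cross-referencing the two tables: a TXN_COMPONENTS row is a leftover when its TC_TXNID points at a transaction whose TXN_STATE is 'a'. A minimal sketch of that check follows, using in-memory collections rather than the metastore database. The class and method names are made up for illustration and are not part of Hive; the sample values are the ones from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models TXNS and TXN_COMPONENTS as in-memory collections.
class LeftoverTxnComponents {

  // Returns the TC_TXNID values whose owning transaction is in state 'a' (aborted).
  static List<Long> forAbortedTxns(Map<Long, Character> txnStates, List<Long> componentTxnIds) {
    List<Long> leftovers = new ArrayList<>();
    for (Long txnId : componentTxnIds) {
      Character state = txnStates.get(txnId);
      if (state != null && state == 'a') {
        leftovers.add(txnId);
      }
    }
    return leftovers;
  }

  public static void main(String[] args) {
    // TXN_ID -> TXN_STATE, copied from the TXNS output above.
    Map<Long, Character> txnStates = new LinkedHashMap<>();
    txnStates.put(0L, 'c');
    txnStates.put(1L, 'c');
    txnStates.put(2L, 'a');
    txnStates.put(3L, 'c');
    txnStates.put(4L, 'c');

    // TC_TXNID values, copied from the TXN_COMPONENTS output above.
    List<Long> componentTxnIds = Arrays.asList(2L);

    System.out.println(forAbortedTxns(txnStates, componentTxnIds)); // prints [2]
  }
}
```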





Issue Time Tracking
---

Worklog Id: (was: 781806)
Time Spent: 6.5h  (was: 6h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781103=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781103
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:26
Start Date: 14/Jun/22 11:26
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896696959


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
       return response;
     }
   }
+
+  if (checkForConcurrentCtas && isValidTxn(txnId)) {

Review Comment:
   Embed this check in the zeroWaitReadEnabled check above:
   
   if ((zeroWaitReadEnabled || isExclusiveCTAS) && isValidTxn(txnId)) {
   
     if (lockType == LockType.SHARED_READ || isExclusiveCTAS) {
   
   
   Make sure to set a proper error message via setErrorMessage in the exclusive CTAS case.
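
The combined guard suggested above can be sketched in isolation as follows. This is only an illustration of the control flow, assuming the caller has already established that the requested lock is blocked. The class, the `LockKind` enum, and both message strings are hypothetical stand-ins, not the actual TxnHandler code or its error messages.

```java
import java.util.Optional;

// Illustrative only: isolates the fail-fast decision from the surrounding lock-checking code.
class FailFastGuardSketch {

  // Hypothetical stand-in for the metastore lock type enum.
  enum LockKind { SHARED_READ, SHARED_WRITE, EXCL_WRITE, EXCLUSIVE }

  // Returns a reason when a blocked request should be rejected immediately,
  // or an empty Optional when it should keep waiting.
  static Optional<String> failFastReason(boolean zeroWaitReadEnabled, boolean isExclusiveCtas,
                                         boolean validTxn, LockKind requested) {
    if ((zeroWaitReadEnabled || isExclusiveCtas) && validTxn) {
      if (isExclusiveCtas) {
        // Hypothetical message: the exclusive CTAS case gets its own error text.
        return Optional.of("Failed to initiate a concurrent CTAS operation on the same table");
      }
      if (requested == LockKind.SHARED_READ) {
        // Hypothetical message for the zero-wait read case.
        return Optional.of("Unable to acquire read lock without waiting");
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(failFastReason(false, true, true, LockKind.EXCL_WRITE).isPresent());
    System.out.println(failFastReason(true, false, true, LockKind.EXCL_WRITE).isPresent());
  }
}
```

Checking `isExclusiveCtas` before the lock type is a choice made here so that each case returns a distinct message; the two inner checks together are equivalent to the single `||` condition in the comment.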





Issue Time Tracking
---

Worklog Id: (was: 781103)
Time Spent: 6h 20m  (was: 6h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781102=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781102
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:24
Start Date: 14/Jun/22 11:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896696959


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
       return response;
     }
   }
+
+  if (checkForConcurrentCtas && isValidTxn(txnId)) {

Review Comment:
   Embed this check in the zeroWaitReadEnabled check above:
   
   if ((zeroWaitReadEnabled || isExclusiveCTAS) && isValidTxn(txnId)) {
   
     if (lockType == LockType.SHARED_READ || isExclusiveCTAS) {
   
   





Issue Time Tracking
---

Worklog Id: (was: 781102)
Time Spent: 6h 10m  (was: 6h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781101=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781101
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:20
Start Date: 14/Jun/22 11:20
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896679958


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3225,6 +3242,9 @@ public static String getPathSuffix(long txnId) {
 return (SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId));
   }
 
+  public static boolean isNoRenameCtasEnabled(Configuration conf) {

Review Comment:
   Could we please rename it to
   
   isExclusiveCTASEnabled
   
   Should we also check here whether the table is transactional?





Issue Time Tracking
---

Worklog Id: (was: 781101)
Time Spent: 6h  (was: 5h 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781100=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781100
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:20
Start Date: 14/Jun/22 11:20
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896679958


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3225,6 +3242,9 @@ public static String getPathSuffix(long txnId) {
 return (SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId));
   }
 
+  public static boolean isNoRenameCtasEnabled(Configuration conf) {

Review Comment:
   Could we please rename it to
   
   isExclusiveCTASEnabled
   
   Should we check here whether the table is transactional?





Issue Time Tracking
---

Worklog Id: (was: 781100)
Time Spent: 5h 50m  (was: 5h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781099=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781099
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:18
Start Date: 14/Jun/22 11:18
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896684186


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3112,15 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isCTASOperation(List<LockComponent> lockComponents, HiveConf conf) {
+    boolean isCtas = false;
+    for (LockComponent lock : lockComponents) {
+      if (lock.getOperationType().name().equals(OperationType.CTAS.name()) &&
+          conf.getBoolVar(ConfVars.HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED))

Review Comment:
   
   return lockComponents.stream().anyMatch(lc -> DataOperationType.CTAS == lc.getOperationType() && isExclusiveCTASEnabled(conf))
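
To show how the loop in the patch and the stream form suggested above relate, here is a small self-contained sketch. `OpType` and `Component` are hypothetical stand-ins for Hive's operation type enum and lock component class, and the boolean parameter stands in for the config lookup. As a variation on the suggestion, the flag check is hoisted out of the lambda, since it does not depend on the element.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: compares the loop form with the stream form on stand-in types.
class ExclusiveCtasSketch {

  // Hypothetical stand-in for the operation type enum.
  enum OpType { SELECT, INSERT, CTAS }

  // Hypothetical stand-in for a lock component; only the operation type matters here.
  static final class Component {
    private final OpType operationType;

    Component(OpType operationType) {
      this.operationType = operationType;
    }

    OpType getOperationType() {
      return operationType;
    }
  }

  // Loop form, shaped like the original patch: scan every component and remember a match.
  static boolean loopForm(List<Component> components, boolean xLockEnabled) {
    boolean isCtas = false;
    for (Component c : components) {
      if (c.getOperationType() == OpType.CTAS && xLockEnabled) {
        isCtas = true;
      }
    }
    return isCtas;
  }

  // Stream form: anyMatch stops at the first CTAS component; enums are compared with ==.
  static boolean streamForm(List<Component> components, boolean xLockEnabled) {
    return xLockEnabled
        && components.stream().anyMatch(c -> c.getOperationType() == OpType.CTAS);
  }

  public static void main(String[] args) {
    List<Component> withCtas = Arrays.asList(new Component(OpType.SELECT), new Component(OpType.CTAS));
    List<Component> empty = Collections.emptyList();
    System.out.println(loopForm(withCtas, true) == streamForm(withCtas, true));
    System.out.println(loopForm(empty, true) == streamForm(empty, true));
  }
}
```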
   





Issue Time Tracking
---

Worklog Id: (was: 781099)
Time Spent: 5h 40m  (was: 5.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781098=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781098
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:17
Start Date: 14/Jun/22 11:17
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896672973


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -420,6 +420,7 @@ LockState acquireLocks(QueryPlan plan, Context ctx, String username, boolean isB
     }
     List<LockComponent> lockComponents = AcidUtils.makeLockComponents(plan.getOutputs(), plan.getInputs(),
         ctx.getOperation(), conf);
+    rqstBuilder.setCheckForConcurrentCtas(AcidUtils.isCTASOperation(lockComponents, conf));

Review Comment:
   could we rename it to `isExclusiveCTAS`
   
   rqstBuilder.setExclusiveCTAS(AcidUtils.isExclusiveCTAS(lockComponents, conf))





Issue Time Tracking
---

Worklog Id: (was: 781098)
Time Spent: 5.5h  (was: 5h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781096=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781096
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:16
Start Date: 14/Jun/22 11:16
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896684186


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3112,15 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isCTASOperation(List<LockComponent> lockComponents, HiveConf conf) {
+    boolean isCtas = false;
+    for (LockComponent lock : lockComponents) {
+      if (lock.getOperationType().name().equals(OperationType.CTAS.name()) &&
+          conf.getBoolVar(ConfVars.HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED))

Review Comment:
   
   boolean isExclusiveCTAS = lockComponents.stream().anyMatch(lc -> DataOperationType.CTAS == lc.getOperationType() && isExclusiveCTASEnabled(conf))
   





Issue Time Tracking
---

Worklog Id: (was: 781096)
Time Spent: 5h 20m  (was: 5h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781094=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781094
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:09
Start Date: 14/Jun/22 11:09
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896684186


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3112,15 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isCTASOperation(List<LockComponent> lockComponents, HiveConf conf) {
+    boolean isCtas = false;
+    for (LockComponent lock : lockComponents) {
+      if (lock.getOperationType().name().equals(OperationType.CTAS.name()) &&
+          conf.getBoolVar(ConfVars.HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED))

Review Comment:
   Why not use the isNoRenameCtasEnabled/isExclusiveCTASEnabled helper method?





Issue Time Tracking
---

Worklog Id: (was: 781094)
Time Spent: 5h 10m  (was: 5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781092=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781092
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:08
Start Date: 14/Jun/22 11:08
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896683204


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3104,6 +3112,15 @@ Seems much cleaner if each stmt is identified as a particular HiveOperation (whi
     return lockComponents;
   }
 
+  public static boolean isCTASOperation(List<LockComponent> lockComponents, HiveConf conf) {

Review Comment:
   could we rename it to
   
   isExclusiveCTAS
   





Issue Time Tracking
---

Worklog Id: (was: 781092)
Time Spent: 5h  (was: 4h 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781089=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781089
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:03
Start Date: 14/Jun/22 11:03
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896679958


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -3225,6 +3242,9 @@ public static String getPathSuffix(long txnId) {
 return (SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId));
   }
 
+  public static boolean isNoRenameCtasEnabled(Configuration conf) {

Review Comment:
   could we please rename it to
   
   isExclusiveCTASEnabled
   





Issue Time Tracking
---

Worklog Id: (was: 781089)
Time Spent: 4h 50m  (was: 4h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781088=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781088
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 11:01
Start Date: 14/Jun/22 11:01
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896678287


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -4678,6 +4678,9 @@ public static enum ConfVars {
     HIVE_ACID_DIRECT_INSERT_ENABLED("hive.acid.direct.insert.enabled", true,
         "Enable writing the data files directly to the table's final destination instead of the staging directory."
         + "This optimization only applies on INSERT operations on ACID tables."),
+    HIVE_ACID_CHECK_FOR_CONCURRENT_CTAS_ENABLED("hive.acid.check.for.concurrent.ctas.enabled", false,

Review Comment:
   please rename to
   
   TXN_CTAS_X_LOCK("hive.txn.xlock.ctas"
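
As a rough illustration of what the renamed entry would look like to a caller, here is a sketch of an enum-backed boolean flag with a default. It uses a plain `Map` instead of Hive's configuration classes; `XLockConfSketch`, `ConfVar`, and this `getBoolVar` are hypothetical. The default of `false` is an assumption carried over from the entry in the diff above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a tiny enum-backed boolean flag lookup, not Hive's HiveConf.
class XLockConfSketch {

  enum ConfVar {
    // Key as proposed in the review; default assumed to stay false as in the diff.
    TXN_CTAS_X_LOCK("hive.txn.xlock.ctas", false);

    final String key;
    final boolean defaultValue;

    ConfVar(String key, boolean defaultValue) {
      this.key = key;
      this.defaultValue = defaultValue;
    }
  }

  // Returns the configured value, falling back to the enum's default when the key is unset.
  static boolean getBoolVar(Map<String, String> conf, ConfVar var) {
    String raw = conf.get(var.key);
    return raw == null ? var.defaultValue : Boolean.parseBoolean(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(getBoolVar(conf, ConfVar.TXN_CTAS_X_LOCK)); // unset, so the default applies
    conf.put("hive.txn.xlock.ctas", "true");
    System.out.println(getBoolVar(conf, ConfVar.TXN_CTAS_X_LOCK));
  }
}
```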
   





Issue Time Tracking
---

Worklog Id: (was: 781088)
Time Spent: 4h 40m  (was: 4.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781085=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781085
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 10:55
Start Date: 14/Jun/22 10:55
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896672973


##
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##
@@ -420,6 +420,7 @@ LockState acquireLocks(QueryPlan plan, Context ctx, String username, boolean isB
     }
     List<LockComponent> lockComponents = AcidUtils.makeLockComponents(plan.getOutputs(), plan.getInputs(),
         ctx.getOperation(), conf);
+    rqstBuilder.setCheckForConcurrentCtas(AcidUtils.isCTASOperation(lockComponents, conf));

Review Comment:
   could we rename it to `isExclusiveCTAS`





Issue Time Tracking
---

Worklog Id: (was: 781085)
Time Spent: 4.5h  (was: 4h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781081
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 10:50
Start Date: 14/Jun/22 10:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896668783


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
       return response;
     }
   }
+
+  if (checkForConcurrentCtas && isValidTxn(txnId)) {
+    LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+        .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+    if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   no need for that, this is already embedded in checkForConcurrentCtas





Issue Time Tracking
---

Worklog Id: (was: 781081)
Time Spent: 4h 20m  (was: 4h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=781080=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781080
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 14/Jun/22 10:48
Start Date: 14/Jun/22 10:48
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r896667206


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5297,6 +5303,28 @@ is performed on that db (e.g. show tables, created table, etc).
       return response;
     }
   }
+
+  if (checkForConcurrentCtas && isValidTxn(txnId)) {
+    LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+        .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+    if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {
+
+      String deleteBlockedByTxnComp = "DELETE  FROM \"TXN_COMPONENTS\" WHERE" + " \"TC_TXNID\"=" + txnId;

Review Comment:
   No need for the TXN_COMPONENTS cleanup; the txn abort should do the trick.





Issue Time Tracking
---

Worklog Id: (was: 781080)
Time Spent: 4h 10m  (was: 4h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=780336=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780336
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:06
Start Date: 10/Jun/22 15:06
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r894630680


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -4667,6 +4667,9 @@ public static enum ConfVars {
     HIVE_ACID_DIRECT_INSERT_ENABLED("hive.acid.direct.insert.enabled", true,
         "Enable writing the data files directly to the table's final destination instead of the staging directory."
         + "This optimization only applies on INSERT operations on ACID tables."),
+HIVE_ACID_NO_RENAME_CTAS_ENABLED("hive.acid.no.rename.ctas.enabled", false,

Review Comment:
   Changed the conf name to hive.acid.check.for.concurrent.ctas.enabled, as it is better suited here.
   
   The Direct CTAS patch does not introduce a new conf; it depends on hive.acid.direct.insert.enabled.





Issue Time Tracking
---

Worklog Id: (was: 780336)
Time Spent: 4h  (was: 3h 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=780333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780333
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:02
Start Date: 10/Jun/22 15:02
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r894626694


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3087,7 +3090,24 @@ private void insertTxnComponents(long txnid, LockRequest rqst, Connection dbConn
   String tblName = normalizeCase(lc.getTablename());
   String partName = normalizePartitionCase(lc.getPartitionname());
   OperationType opType = OperationType.fromDataOperationType(lc.getOperationType());
-
+  if (opType.getSqlConst().equals(OperationType.CTAS.getSqlConst())) {

Review Comment:
   Reverted to the checkLocks method in the new commit.





Issue Time Tracking
---

Worklog Id: (was: 780333)
Time Spent: 3h 40m  (was: 3.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=780334=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780334
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 10/Jun/22 15:03
Start Date: 10/Jun/22 15:03
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r894627428


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -60,6 +60,7 @@
 import javax.sql.DataSource;
 
 import com.google.common.collect.ImmutableList;
+import jline.internal.Log;

Review Comment:
   IntelliJ auto-imported this; I have removed it now.





Issue Time Tracking
---

Worklog Id: (was: 780334)
Time Spent: 3h 50m  (was: 3h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=779800=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779800
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 09/Jun/22 07:57
Start Date: 09/Jun/22 07:57
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r893190254


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -3087,7 +3090,24 @@ private void insertTxnComponents(long txnid, LockRequest rqst, Connection dbConn
   String tblName = normalizeCase(lc.getTablename());
   String partName = normalizePartitionCase(lc.getPartitionname());
   OperationType opType = OperationType.fromDataOperationType(lc.getOperationType());
-
+  if (opType.getSqlConst().equals(OperationType.CTAS.getSqlConst())) {

Review Comment:
   Why do you even need to create locks for CTAS with this approach? That won't 
work, as SELECT doesn't use any locking and we shouldn't add one. Why can't we 
stick to the checkLock method and only check for EXCL_WRITE and EXCLUSIVE locks 
in the case of a CTAS operation?





Issue Time Tracking
---

Worklog Id: (was: 779800)
Time Spent: 3.5h  (was: 3h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=779799=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779799
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 09/Jun/22 07:51
Start Date: 09/Jun/22 07:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r893184493


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -60,6 +60,7 @@
 import javax.sql.DataSource;
 
 import com.google.common.collect.ImmutableList;
+import jline.internal.Log;

Review Comment:
   We should probably use `org.slf4j.Logger`, as everywhere else.





Issue Time Tracking
---

Worklog Id: (was: 779799)
Time Spent: 3h 20m  (was: 3h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=779797=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779797
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 09/Jun/22 07:47
Start Date: 09/Jun/22 07:47
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r893180482


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -4667,6 +4667,9 @@ public static enum ConfVars {
 HIVE_ACID_DIRECT_INSERT_ENABLED("hive.acid.direct.insert.enabled", true,
 "Enable writing the data files directly to the table's final destination instead of the staging directory."
 + "This optimization only applies on INSERT operations on ACID tables."),
+HIVE_ACID_NO_RENAME_CTAS_ENABLED("hive.acid.no.rename.ctas.enabled", false,

Review Comment:
   You'll probably need to rebase, as the CTAS change is already merged.





Issue Time Tracking
---

Worklog Id: (was: 779797)
Time Spent: 3h 10m  (was: 3h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=779352=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779352
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 08/Jun/22 08:27
Start Date: 08/Jun/22 08:27
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r892070374


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   I was able to optimize it by moving this check to a much earlier step, right 
before we call checkLocks. This prevents an unnecessary insert into the 
TXN_COMPONENTS table and saves us at least 3 to 4 subsequent queries to the 
metastore.





Issue Time Tracking
---

Worklog Id: (was: 779352)
Time Spent: 3h  (was: 2h 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777328=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777328
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:33
Start Date: 02/Jun/22 07:33
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead to the checkLocks 
method; it's already a performance-sensitive part. I think we should try to 
optimize: if it's a CTAS, we know that it can only be blocked by another 
artificial CTAS or a DROP DATABASE (EXCLUSIVE + EXCL_WRITE), so there is no 
need to run the expensive checkLock `BIG` query. That would also mean we can 
just give up without checking the TXNS table for the type of the blocking TXN.
   Also, I would expect IOW (INSERT OVERWRITE) to behave similarly to CTAS. 
Currently it doesn't fail and is executed in sequential order; however, it 
doesn't require any extra cleanup in case of failure. So I am OK with the 
selected approach, but we should try to optimize if possible.
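
   A rough sketch of the narrower check suggested above, not the actual TxnHandler code. The enum, the `HeldLock` class and `ctasIsBlocked` are hypothetical stand-ins, and the assumption that only EXCL_WRITE and EXCLUSIVE locks can block a CTAS is taken from the comment itself:

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class CtasLockCheckSketch {
  // Simplified stand-in for the metastore lock types; not the real Thrift enum.
  enum LockType { SHARED_READ, SHARED_WRITE, EXCL_WRITE, EXCLUSIVE }

  // Hypothetical in-memory view of a lock that is already held.
  static final class HeldLock {
    final String db;
    final String table; // null models a database-level lock
    final LockType type;

    HeldLock(String db, String table, LockType type) {
      this.db = db;
      this.table = table;
      this.type = type;
    }
  }

  // Per the review comment, only these lock types can block a CTAS.
  private static final Set<LockType> BLOCKS_CTAS =
      EnumSet.of(LockType.EXCL_WRITE, LockType.EXCLUSIVE);

  // True if a held lock on the same table (or on the whole database) blocks a CTAS.
  static boolean ctasIsBlocked(List<HeldLock> held, String db, String table) {
    for (HeldLock l : held) {
      boolean sameTarget = l.db.equals(db) && (l.table == null || l.table.equals(table));
      if (sameTarget && BLOCKS_CTAS.contains(l.type)) {
        return true;
      }
    }
    return false;
  }
}
```

   The point of the sketch is only that the CTAS path can filter on two lock types instead of evaluating the full compatibility matrix; how the held locks are fetched is left out on purpose.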





Issue Time Tracking
---

Worklog Id: (was: 777328)
Time Spent: 2h 50m  (was: 2h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777325
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:28
Start Date: 02/Jun/22 07:28
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead to the checkLocks 
method; it's already a performance-sensitive part. I think we should try to 
optimize: if it's a CTAS, we know that it can only be blocked by another 
artificial CTAS or a DROP DATABASE (EXCLUSIVE + EXCL_WRITE), so there is no 
need to run the expensive checkLock `BIG` query.
   Also, I would expect IOW (INSERT OVERWRITE) to behave similarly to CTAS. 
Currently it doesn't fail and is executed in sequential order; however, it 
doesn't require any extra cleanup in case of failure. So I am OK with the 
selected approach, but we should try to optimize if possible.





Issue Time Tracking
---

Worklog Id: (was: 777325)
Time Spent: 2h 40m  (was: 2.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777322
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:26
Start Date: 02/Jun/22 07:26
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead to the checkLocks 
method; it's already a performance-sensitive part. I think we should try to 
optimize: if it's a CTAS, we know that it can only be blocked by another 
artificial CTAS or a DROP DATABASE (EXCLUSIVE + EXCL_WRITE), so there is no 
need to run the expensive checkLock `BIG` query.
   Also, I would expect IOW (INSERT OVERWRITE) to behave similarly to CTAS. 
Currently it doesn't fail and is executed in sequential order.





Issue Time Tracking
---

Worklog Id: (was: 777322)
Time Spent: 2.5h  (was: 2h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777316
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:22
Start Date: 02/Jun/22 07:22
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887645373


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -13910,7 +13911,13 @@ private void addDbAndTabToOutputs(String[] qualifiedTabName, TableType type,
 for(Map.Entry serdeMap : storageFormat.getSerdeProps().entrySet()){
   t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue());
 }
-outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK));
+if (tblProps != null &&
+tblProps.get(TABLE_IS_CTAS) == "true" &&

Review Comment:
   Could we use `Boolean.parseBoolean(..)` instead of comparing with the string?
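
   For illustration, a minimal self-contained sketch of why `==` is fragile here and what the `Boolean.parseBoolean` variant looks like. The class name, the helper `isCtas` and the key value `"created_with_ctas"` (borrowed from an earlier revision of this diff) are assumptions, not the code that was finally merged:

```java
import java.util.HashMap;
import java.util.Map;

final class CtasFlagCheck {
  // Assumed key; the PR refers to it through a TABLE_IS_CTAS constant.
  static final String TABLE_IS_CTAS = "created_with_ctas";

  // Null-safe and case-insensitive: Boolean.parseBoolean(null) returns false.
  static boolean isCtas(Map<String, String> tblProps) {
    return tblProps != null && Boolean.parseBoolean(tblProps.get(TABLE_IS_CTAS));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    // new String(...) yields a distinct object, as a value read at runtime typically would be.
    props.put(TABLE_IS_CTAS, new String("true"));

    // Reference comparison: depends on object identity, not on the characters.
    System.out.println(props.get(TABLE_IS_CTAS) == "true");
    // Content-based check.
    System.out.println(isCtas(props));
  }
}
```

   `==` on strings compares references, so it only succeeds when both sides happen to be the same interned object; `Boolean.parseBoolean` compares content, ignores case and treats a missing property as false.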





Issue Time Tracking
---

Worklog Id: (was: 777316)
Time Spent: 2h 20m  (was: 2h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777312=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777312
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:15
Start Date: 02/Jun/22 07:15
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887639411


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   I don't really like that we are adding extra overhead to the checkLocks 
method; it's already a performance-sensitive part. I think we should try to 
optimize: if it's a CTAS, we know that it can only be blocked by another 
artificial CTAS or a DROP DATABASE, so there is no need to run the expensive 
checkLock `BIG` query.
   Also, I would expect IOW (INSERT OVERWRITE) to behave similarly to CTAS. 
Currently it doesn't fail and is executed in sequential order.





Issue Time Tracking
---

Worklog Id: (was: 777312)
Time Spent: 2h 10m  (was: 2h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=777136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777136
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 19:59
Start Date: 01/Jun/22 19:59
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r887249049


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CreateTableOperation.java:
##
@@ -99,7 +99,8 @@ public int execute() throws HiveException {
   createTableNonReplaceMode(tbl);
 }
 
-DDLUtils.addIfAbsentByName(new WriteEntity(tbl, WriteEntity.WriteType.DDL_NO_LOCK), context);
+  DDLUtils.addIfAbsentByName(new WriteEntity(tbl, WriteEntity.WriteType.DDL_NO_LOCK), context);

Review Comment:
   Removed the code that populates outputs in CreateTableOperation (from the 
previous commit) and retained it only in SemanticAnalyzer. An extra space crept 
in when removing that code. 
   Will remove the extra line and space.





Issue Time Tracking
---

Worklog Id: (was: 777136)
Time Spent: 2h  (was: 1h 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-06-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=776814=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776814
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 11:42
Start Date: 01/Jun/22 11:42
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r886705645


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CreateTableOperation.java:
##
@@ -99,7 +99,8 @@ public int execute() throws HiveException {
   createTableNonReplaceMode(tbl);
 }
 
-DDLUtils.addIfAbsentByName(new WriteEntity(tbl, WriteEntity.WriteType.DDL_NO_LOCK), context);
+  DDLUtils.addIfAbsentByName(new WriteEntity(tbl, WriteEntity.WriteType.DDL_NO_LOCK), context);

Review Comment:
   What changed here, an extra space?





Issue Time Tracking
---

Worklog Id: (was: 776814)
Time Spent: 1h 50m  (was: 1h 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773781=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773781
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:20
Start Date: 23/May/22 22:20
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879908488


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -13900,7 +13901,11 @@ private void addDbAndTabToOutputs(String[] qualifiedTabName, TableType type,
 for(Map.Entry serdeMap : storageFormat.getSerdeProps().entrySet()){
   t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue());
 }
-outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK));
+if (tblProps.get("created_with_ctas") == "true") {
+  outputs.add(new WriteEntity(t, WriteType.CTAS));

Review Comment:
   In CreateTableOperation, DDLUtils.addIfAbsentByName populates the output 
only if it was missed in the SemanticAnalyzer stage. 
   
   So I think removing this step in CreateTableOperation and retaining it only 
in SemanticAnalyzer would be fine.
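
   To make the "only if it was missed" behaviour concrete, here is a small sketch of add-if-absent-by-name semantics. It is an assumption-laden model, not Hive's DDLUtils: the `Entity` class and the case-insensitive name match are illustrative choices:

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class AddIfAbsentSketch {
  // Hypothetical stand-in for a write entity, identified by its name.
  static final class Entity {
    final String name;
    final String writeType;

    Entity(String name, String writeType) {
      this.name = name;
      this.writeType = writeType;
    }
  }

  // Adds the candidate only when no entity with the same name is present yet.
  static boolean addIfAbsentByName(Set<Entity> outputs, Entity candidate) {
    for (Entity e : outputs) {
      if (e.name.equalsIgnoreCase(candidate.name)) {
        return false; // registered earlier; the existing entry and its write type win
      }
    }
    outputs.add(candidate);
    return true;
  }

  public static void main(String[] args) {
    Set<Entity> outputs = new LinkedHashSet<>();
    // Registered first, as the analysis stage would do.
    addIfAbsentByName(outputs, new Entity("db.t", "CTAS"));
    // A later attempt with a different write type is ignored.
    boolean added = addIfAbsentByName(outputs, new Entity("db.t", "DDL_NO_LOCK"));
    System.out.println(added + " " + outputs.iterator().next().writeType);
  }
}
```

   Under this model, an entity registered during analysis makes the later call a no-op, which is consistent with dropping the second registration.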





Issue Time Tracking
---

Worklog Id: (was: 773781)
Time Spent: 1h 40m  (was: 1.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773780
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:17
Start Date: 23/May/22 22:17
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879906851


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == LockState.ACQUIRED) {

Review Comment:
   
   We do not know at what stage the 1st query might abort. Since that is 
non-deterministic, we need to make an assumption.
   
   When this is enabled via the conf, we are optimistic about the outcome and 
assume the 1st query always succeeds. With this assumption, we can fail the 
2nd concurrent CTAS query early and avoid the unnecessary move tasks and 
cleanup that the 2nd query would incur if it were to continue until the commit 
stage. Also, the 2nd user does not have to wait a long time to find out that 
the query failed.
   
   When this feature is disabled, the query runs with the pessimistic 
assumption that the 1st query can abort. As a result, the 2nd query is not 
failed until the commit stage. This results in a lot of overhead and cleanup 
associated with the failed query. It may also make the user wait a long time 
only to find out that the query failed, which I think is not ideal.
   
   This was my thought process. Would this be fine?
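
   The two modes described above can be sketched as a small decision function. This is only a model of the reasoning, mirroring the `EXCL_WRITE` / `ACQUIRED` condition in the diff; the enums, the `Outcome` type and `onBlocked` are hypothetical names:

```java
final class CtasFailEarlySketch {
  // Simplified stand-ins for the metastore enums; not the real Thrift types.
  enum LockType { SHARED_READ, SHARED_WRITE, EXCL_WRITE, EXCLUSIVE }
  enum LockState { ACQUIRED, WAITING }

  // FAIL_EARLY: reject the 2nd CTAS right away. KEEP_GOING: let it run; a conflict surfaces later.
  enum Outcome { FAIL_EARLY, KEEP_GOING }

  static Outcome onBlocked(boolean failEarlyEnabled, LockType requested, LockState blockerState) {
    // Optimistic mode: assume the lock holder will commit, so waiting is pointless.
    if (failEarlyEnabled
        && requested == LockType.EXCL_WRITE
        && blockerState == LockState.ACQUIRED) {
      return Outcome.FAIL_EARLY;
    }
    // Pessimistic mode (or a different lock situation): the holder may still abort.
    return Outcome.KEEP_GOING;
  }
}
```

   The trade-off is the one stated in the comment: failing early saves the move tasks and cleanup of the 2nd query, at the cost of rejecting it even in the case where the 1st query would have aborted.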
   





Issue Time Tracking
---

Worklog Id: (was: 773780)
Time Spent: 1.5h  (was: 1h 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773775=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773775
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:08
Start Date: 23/May/22 22:08
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879902730


##
ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java:
##
@@ -18,24 +18,13 @@
 package org.apache.hadoop.hive.ql.lockmgr;
 
 import org.apache.commons.lang3.StringUtils;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.*;
 import org.apache.hadoop.hive.common.JavaUtils;
 import org.apache.hadoop.hive.common.ValidTxnList;
 import org.apache.hadoop.hive.common.ValidWriteIdList;
 import org.apache.hadoop.hive.metastore.MetastoreTaskThread;
-import org.apache.hadoop.hive.metastore.api.AddDynamicPartitions;
-import org.apache.hadoop.hive.metastore.api.AllocateTableWriteIdsRequest;
-import org.apache.hadoop.hive.metastore.api.AllocateTableWriteIdsResponse;
-import org.apache.hadoop.hive.metastore.api.DataOperationType;
-import org.apache.hadoop.hive.metastore.api.LockState;
-import org.apache.hadoop.hive.metastore.api.LockType;
-import org.apache.hadoop.hive.metastore.api.ShowLocksRequest;
-import org.apache.hadoop.hive.metastore.api.ShowLocksResponse;
-import org.apache.hadoop.hive.metastore.api.ShowLocksResponseElement;
-import org.apache.hadoop.hive.metastore.api.TxnType;
-import org.apache.hadoop.hive.metastore.api.CommitTxnRequest;
+import org.apache.hadoop.hive.metastore.Warehouse;
+import org.apache.hadoop.hive.metastore.api.*;

Review Comment:
   This was due to IntelliJ's auto-import. Will remove the wildcard imports.





Issue Time Tracking
---

Worklog Id: (was: 773775)
Time Spent: 1h 20m  (was: 1h 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773774
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:08
Start Date: 23/May/22 22:08
Worklog Time Spent: 10m 
  Work Description: simhadri-g commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879902647


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -13900,7 +13901,11 @@ private void addDbAndTabToOutputs(String[] qualifiedTabName, TableType type,
 for(Map.Entry serdeMap : storageFormat.getSerdeProps().entrySet()){
   t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue());
 }
-outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK));
+if (tblProps.get("created_with_ctas") == "true") {

Review Comment:
Thanks for the review :)
Yes, will update this.





Issue Time Tracking
---

Worklog Id: (was: 773774)
Time Spent: 1h 10m  (was: 1h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773310=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773310
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:19
Start Date: 23/May/22 07:19
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879099718


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -13900,7 +13901,11 @@ private void addDbAndTabToOutputs(String[] qualifiedTabName, TableType type,
 for(Map.Entry serdeMap : storageFormat.getSerdeProps().entrySet()){
   t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue());
 }
-outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK));
+if (tblProps.get("created_with_ctas") == "true") {
+  outputs.add(new WriteEntity(t, WriteType.CTAS));

Review Comment:
   Why (and when) is this needed? We are populating outputs in 
CreateTableOperation as well.





Issue Time Tracking
---

Worklog Id: (was: 773310)
Time Spent: 1h  (was: 50m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773309=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773309
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:15
Start Date: 23/May/22 07:15
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879096827


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java:
##
@@ -5283,6 +5284,39 @@ is performed on that db (e.g. show tables, created 
table, etc).
 return response;
   }
 }
+
+if (isValidTxn(txnId)) {
+  LockType lockType = LockTypeUtil.getLockTypeFromEncoding(lockChar)
+  .orElseThrow(() -> new MetaException("Unknown lock type: " + 
lockChar));
+
+  if (lockType == LockType.EXCL_WRITE && blockedBy.state == 
LockState.ACQUIRED) {

Review Comment:
   I don't think we should give up if there is already a CTAS on the same 
table. What happens if the 1st CTAS aborts?
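
One possible reading of this concern, as a sketch only: the waiting CTAS would fail only once the blocking transaction has committed, would proceed if the blocker aborted, and would otherwise keep waiting. The class `CtasConflictPolicy` and its enums are hypothetical and do not reflect the actual TxnHandler logic.

```java
// Hypothetical model of the reviewer's concern; not Hive's TxnHandler logic.
final class CtasConflictPolicy {
  // State of the transaction that holds the blocking exclusive lock.
  enum BlockerState { OPEN, COMMITTED, ABORTED }

  // What the waiting CTAS could do next.
  enum Decision { KEEP_WAITING, FAIL_TABLE_EXISTS, PROCEED }

  private CtasConflictPolicy() {}

  static Decision decide(BlockerState blocker) {
    switch (blocker) {
      case COMMITTED:
        // The table now exists, so the second CTAS cannot succeed.
        return Decision.FAIL_TABLE_EXISTS;
      case ABORTED:
        // The first CTAS created nothing; the second one may go ahead.
        return Decision.PROCEED;
      default:
        // Outcome of the first CTAS is still unknown.
        return Decision.KEEP_WAITING;
    }
  }

  public static void main(String[] args) {
    for (BlockerState s : BlockerState.values()) {
      System.out.println(s + " -> " + decide(s));
    }
  }
}
```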





Issue Time Tracking
---

Worklog Id: (was: 773309)
Time Spent: 50m  (was: 40m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773304=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773304
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:01
Start Date: 23/May/22 07:01
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879086482


##
ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java:
##
@@ -18,24 +18,13 @@
 package org.apache.hadoop.hive.ql.lockmgr;
 
 import org.apache.commons.lang3.StringUtils;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.*;

Review Comment:
   please remove wildcard imports



##
ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java:
##
@@ -18,24 +18,13 @@
 package org.apache.hadoop.hive.ql.lockmgr;
 
 import org.apache.commons.lang3.StringUtils;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.*;
 import org.apache.hadoop.hive.common.JavaUtils;
 import org.apache.hadoop.hive.common.ValidTxnList;
 import org.apache.hadoop.hive.common.ValidWriteIdList;
 import org.apache.hadoop.hive.metastore.MetastoreTaskThread;
-import org.apache.hadoop.hive.metastore.api.AddDynamicPartitions;
-import org.apache.hadoop.hive.metastore.api.AllocateTableWriteIdsRequest;
-import org.apache.hadoop.hive.metastore.api.AllocateTableWriteIdsResponse;
-import org.apache.hadoop.hive.metastore.api.DataOperationType;
-import org.apache.hadoop.hive.metastore.api.LockState;
-import org.apache.hadoop.hive.metastore.api.LockType;
-import org.apache.hadoop.hive.metastore.api.ShowLocksRequest;
-import org.apache.hadoop.hive.metastore.api.ShowLocksResponse;
-import org.apache.hadoop.hive.metastore.api.ShowLocksResponseElement;
-import org.apache.hadoop.hive.metastore.api.TxnType;
-import org.apache.hadoop.hive.metastore.api.CommitTxnRequest;
+import org.apache.hadoop.hive.metastore.Warehouse;
+import org.apache.hadoop.hive.metastore.api.*;

Review Comment:
   please remove wildcard imports





Issue Time Tracking
---

Worklog Id: (was: 773304)
Time Spent: 0.5h  (was: 20m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773303=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773303
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:01
Start Date: 23/May/22 07:01
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879086214


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -13900,7 +13901,11 @@ private void addDbAndTabToOutputs(String[] 
qualifiedTabName, TableType type,
 for(Map.Entry serdeMap : 
storageFormat.getSerdeProps().entrySet()){
   t.setSerdeParam(serdeMap.getKey(), serdeMap.getValue());
 }
-outputs.add(new WriteEntity(t, WriteEntity.WriteType.DDL_NO_LOCK));
+if (tblProps.get("created_with_ctas") == "true") {

Review Comment:
   is there any constant for "created_with_ctas"?
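
A minimal sketch of how this comment could be addressed, under assumptions: the property key is pulled into a named constant, and the value is compared with `equals` instead of `==`, because `==` on Java strings compares object references and can be false for an equal value read from a map. The class `CtasProps` and the constant `CREATED_WITH_CTAS` are hypothetical names, not taken from the Hive code base.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative and not from the Hive code base.
final class CtasProps {
  // Assumed constant for the table property key used in the diff above.
  static final String CREATED_WITH_CTAS = "created_with_ctas";

  private CtasProps() {}

  // Null-safe check: "true".equals(null) is false, so a missing key means "not CTAS".
  static boolean isCreatedWithCtas(Map<String, String> tblProps) {
    return tblProps != null && "true".equals(tblProps.get(CREATED_WITH_CTAS));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    // new String(...) forces a distinct object, as a parsed property value would be.
    props.put(CREATED_WITH_CTAS, new String("true"));
    System.out.println(isCreatedWithCtas(props));
  }
}
```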





Issue Time Tracking
---

Worklog Id: (was: 773303)
Time Spent: 20m  (was: 10m)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=773306=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773306
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:02
Start Date: 23/May/22 07:02
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3307:
URL: https://github.com/apache/hive/pull/3307#discussion_r879087417


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/CreateTableOperation.java:
##
@@ -98,8 +98,11 @@ public int execute() throws HiveException {
   }
   createTableNonReplaceMode(tbl);
 }
-
-DDLUtils.addIfAbsentByName(new WriteEntity(tbl, 
WriteEntity.WriteType.DDL_NO_LOCK), context);
+if 
(context.getQueryState().getCommandType().equals("CREATETABLE_AS_SELECT")) {

Review Comment:
   should we generalize this to ACID create statements, not only CTAS? 





Issue Time Tracking
---

Worklog Id: (was: 773306)
Time Spent: 40m  (was: 0.5h)

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26244) Implementing locking for concurrent ctas

2022-05-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26244?focusedWorklogId=772887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772887
 ]

ASF GitHub Bot logged work on HIVE-26244:
-

Author: ASF GitHub Bot
Created on: 20/May/22 15:30
Start Date: 20/May/22 15:30
Worklog Time Spent: 10m 
  Work Description: simhadri-g opened a new pull request, #3307:
URL: https://github.com/apache/hive/pull/3307

   
   
   ### What changes were proposed in this pull request?
   
   1. Address the issue with concurrent CTAS operations that create a table 
with the same name.
   2. Introduce a new dataOperationType for CTAS(t).
   3. Change the lock taken by the CTAS operation for transactional tables 
from DDL_NO_LOCK to EXCL_WRITE.
   4. Check for entries in the TXN_COMPONENTS table when a concurrent CTAS 
operation (creating the same table) is blocked, and fail that operation early 
because it is unnecessary.
   
   ### Why are the changes needed?
   
   
   1. Currently, the CTAS operation does not acquire any lock (i.e. it uses 
DDL_NO_LOCK).
   2. Suppose there are two concurrent CTAS operations on the same target 
table. They race to commit first. The query that commits first succeeds, 
whereas the query that has yet to commit fails with a "table already exists" 
exception. This causes the unnecessary overhead of cleaning up any data 
written by the failed query, plus a significant number of unnecessary move 
tasks.
   
   With this PR, a CTAS operation on a transactional table acquires an 
EXCL_WRITE lock, and any other concurrent CTAS query that tries to create the 
same table fails early.
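
The fail-early behaviour described in this pull request can be illustrated with a small in-memory sketch. It is not Hive's lock manager: `CtasLockTable`, `tryExclusive`, and `release` are hypothetical names, and a `ConcurrentHashMap` stands in for the metastore's lock tables.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the metastore lock tables; all names are hypothetical.
final class CtasLockTable {
  // Maps a qualified table name to the transaction id holding its exclusive lock.
  private final ConcurrentHashMap<String, Long> holders = new ConcurrentHashMap<>();

  // Returns true if txnId holds the exclusive lock after the call,
  // false if another transaction already holds it (the caller fails early).
  boolean tryExclusive(String table, long txnId) {
    Long prev = holders.putIfAbsent(table, txnId);
    return prev == null || prev == txnId;
  }

  // Releases the lock only if txnId is the current holder (on commit or abort).
  void release(String table, long txnId) {
    holders.remove(table, txnId);
  }

  public static void main(String[] args) {
    CtasLockTable locks = new CtasLockTable();
    System.out.println(locks.tryExclusive("db.t", 1L)); // first CTAS takes the lock
    System.out.println(locks.tryExclusive("db.t", 2L)); // concurrent CTAS is rejected before writing data
    locks.release("db.t", 1L);                          // first CTAS finishes or aborts
    System.out.println(locks.tryExclusive("db.t", 2L)); // a retry can now take the lock
  }
}
```

In this sketch the second query is turned away before it writes anything, so there is no data to clean up and no move task to run.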
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   A new conf is introduced: hive.acid.no.rename.ctas.enabled
   
   ### How was this patch tested?
   
   Unit tests and Q tests.




Issue Time Tracking
---

Worklog Id: (was: 772887)
Remaining Estimate: 0h
Time Spent: 10m

> Implementing locking for concurrent ctas
> 
>
> Key: HIVE-26244
> URL: https://issues.apache.org/jira/browse/HIVE-26244
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri G
>Assignee: Simhadri G
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



