[jira] [Work logged] (HIVE-25189) Cache the validWriteIdList in query cache before fetching tables from HMS

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25189?focusedWorklogId=608272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608272
 ]

ASF GitHub Bot logged work on HIVE-25189:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 05:37
Start Date: 08/Jun/21 05:37
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2342:
URL: https://github.com/apache/hive/pull/2342#issuecomment-856458040


   @deniskuzZ: if I understand correctly, at this point we have collected every 
table used by the query in the first phase of compilation. If so, this would be 
a good place to request the locks for the tables and prevent/reduce race 
conditions during compilation.
   @scarlin-cloudera, @deniskuzZ what do you think?
   
   Thanks, Peter 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 608272)
Time Spent: 1h  (was: 50m)

> Cache the validWriteIdList in query cache before fetching tables from HMS
> -
>
> Key: HIVE-25189
> URL: https://issues.apache.org/jira/browse/HIVE-25189
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For a small performance boost at compile time, we should fetch the 
> validWriteIdList before fetching the tables.  HMS allows these to be batched 
> together in one call.  This prevents the getTable API from being called 
> twice; currently the first call passes in a null validWriteIdList.
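The round-trip saving described above can be sketched in plain Java. This is an illustrative simulation only: the class, method, and cache names below are hypothetical stand-ins, not the actual HMS client API.

```java
// Illustrative sketch of the HIVE-25189 idea: prefetch the validWriteIdList
// for every table of the query in one batched HMS call, so the later
// per-table getTable() lookups do not need a second metastore round trip.
// All names here are hypothetical, not the real Hive/HMS APIs.
import java.util.*;

public class WriteIdPrefetchSketch {
    static int hmsRoundTrips = 0;                         // counts simulated HMS calls
    static Map<String, String> writeIdCache = new HashMap<>();

    // One batched call returns write-id lists for all tables of the query.
    static void prefetchValidWriteIds(List<String> tables) {
        hmsRoundTrips++;
        for (String t : tables) writeIdCache.put(t, "writeIds(" + t + ")");
    }

    // getTable() needs a validWriteIdList; on a cache miss it would have to
    // call HMS once for the ids and once more for the table itself.
    static String getTable(String name) {
        String ids = writeIdCache.get(name);
        if (ids == null) {                                // cache miss: extra round trip
            hmsRoundTrips++;
            ids = "writeIds(" + name + ")";
        }
        hmsRoundTrips++;                                  // the table fetch itself
        return name + " [" + ids + "]";
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("db.t1", "db.t2");
        prefetchValidWriteIds(tables);                    // 1 batched call
        for (String t : tables) getTable(t);              // 2 table fetches, no extra id calls
        System.out.println("round trips: " + hmsRoundTrips);  // 3 instead of 4
    }
}
```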



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25204?focusedWorklogId=608255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608255
 ]

ASF GitHub Bot logged work on HIVE-25204:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 05:10
Start Date: 08/Jun/21 05:10
Worklog Time Spent: 10m 
  Work Description: maheshk114 opened a new pull request #2365:
URL: https://github.com/apache/hive/pull/2365


   




Issue Time Tracking
---

Worklog Id: (was: 608255)
Remaining Estimate: 0h
Time Spent: 10m

> Reduce overhead of adding notification log for update partition column 
> statistics
> -
>
> Key: HIVE-25204
> URL: https://issues.apache.org/jira/browse/HIVE-25204
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: perfomance
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The notification logs for partition column statistics can be optimised by 
> adding them in batches. In the current implementation they are added one by 
> one, causing multiple SQL executions in the backend RDBMS. These SQL 
> executions can be batched to reduce the execution time.
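The batching idea can be illustrated with a small self-contained sketch. The statement shapes below are illustrative only; the real change would go through the HMS notification-log code path and its backing RDBMS.

```java
// Sketch of batching notification-log inserts (HIVE-25204). Instead of one
// INSERT per partition's column-stats event (N round trips), the events are
// collected and written with a single multi-row INSERT (1 round trip).
// Table and column names here are illustrative, not the exact HMS schema.
import java.util.*;

public class NotificationBatchSketch {
    // one-by-one: one SQL statement (and one round trip) per event
    static List<String> oneByOne(List<String> partitions) {
        List<String> stmts = new ArrayList<>();
        for (String p : partitions) {
            stmts.add("INSERT INTO NOTIFICATION_LOG (EVENT_TYPE, MESSAGE) VALUES " +
                      "('UPDATE_PART_COL_STATS', '" + p + "')");
        }
        return stmts;
    }

    // batched: a single multi-row statement for the whole event list
    static String batched(List<String> partitions) {
        StringJoiner values = new StringJoiner(", ");
        for (String p : partitions) {
            values.add("('UPDATE_PART_COL_STATS', '" + p + "')");
        }
        return "INSERT INTO NOTIFICATION_LOG (EVENT_TYPE, MESSAGE) VALUES " + values;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("p=1", "p=2", "p=3");
        System.out.println(oneByOne(parts).size() + " statements one-by-one");
        System.out.println("1 statement batched: " + batched(parts));
    }
}
```

With JDBC, the same effect is usually achieved via `PreparedStatement.addBatch()`/`executeBatch()` rather than string concatenation.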





[jira] [Updated] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25204:
--
Labels: perfomance pull-request-available  (was: perfomance)

> Reduce overhead of adding notification log for update partition column 
> statistics
> -
>
> Key: HIVE-25204
> URL: https://issues.apache.org/jira/browse/HIVE-25204
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: perfomance, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The notification logs for partition column statistics can be optimised by 
> adding them in batches. In the current implementation they are added one by 
> one, causing multiple SQL executions in the backend RDBMS. These SQL 
> executions can be batched to reduce the execution time.





[jira] [Updated] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-07 Thread Haymant Mangla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haymant Mangla updated HIVE-25154:
--
Attachment: HIVE-25154.patch

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25154.patch
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608218&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608218
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 02:07
Start Date: 08/Jun/21 02:07
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r647057749



##
File path: ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java
##
@@ -608,6 +607,63 @@ private void testNoStatsUpdateForReplTable(String tblNamePrefix, String txnPrope
     msClient.close();
   }
 
+  @Test(timeout=8)
+  public void testNoStatsUpdateForSimpleFailoverDb() throws Exception {
+    testNoStatsUpdateForFailoverDb("simple", "");
+  }
+
+  @Test(timeout=8)
+  public void testNoStatsUpdateForTxnFailoverDb() throws Exception {
+    testNoStatsUpdateForFailoverDb("txn",
+        "TBLPROPERTIES (\"transactional\"=\"true\",\"transactional_properties\"=\"insert_only\")");
+  }
+
+  private void testNoStatsUpdateForFailoverDb(String tblNamePrefix, String txnProperty) throws Exception {
+    // Set high worker count so we get a longer queue.
+    hiveConf.setInt(MetastoreConf.ConfVars.STATS_AUTO_UPDATE_WORKER_COUNT.getVarname(), 4);
+    String tblWOStats = tblNamePrefix + "_repl_failover_nostats";
+    String ptnTblWOStats = tblNamePrefix + "_ptn_repl_failover_nostats";
+    String dbName = ss.getCurrentDatabase();
+    StatsUpdaterThread su = createUpdater();
+    IMetaStoreClient msClient = new HiveMetaStoreClient(hiveConf);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVESTATSAUTOGATHER, false);
+    hiveConf.setBoolVar(HiveConf.ConfVars.HIVESTATSCOLAUTOGATHER, false);
+
+    executeQuery("create table " + tblWOStats + "(i int, s string) " + txnProperty);
+    executeQuery("insert into " + tblWOStats + "(i, s) values (1, 'test')");
+    verifyStatsUpToDate(tblWOStats, Lists.newArrayList("i"), msClient, false);
+
+    executeQuery("create table " + ptnTblWOStats + "(s string) partitioned by (i int) " + txnProperty);
+    executeQuery("insert into " + ptnTblWOStats + "(i, s) values (1, 'test')");
+    executeQuery("insert into " + ptnTblWOStats + "(i, s) values (2, 'test2')");
+    executeQuery("insert into " + ptnTblWOStats + "(i, s) values (3, 'test3')");
+    verifyPartStatsUpToDate(3, 1, msClient, ptnTblWOStats, false);
+
+    assertTrue(su.runOneIteration());
+    Assert.assertEquals(2, su.getQueueLength());
+    executeQuery("alter database " + dbName + " set dbproperties('" + ReplConst.REPL_FAILOVER_ENABLED + "'='true')");
+    // StatsUpdaterThread would not run analyze commands for the tables which
+    // were inserted before the failover property was enabled for that database
+    drainWorkQueue(su, 2);
+    verifyStatsUpToDate(tblWOStats, Lists.newArrayList("i"), msClient, false);
+    verifyPartStatsUpToDate(3, 1, msClient, ptnTblWOStats, false);
+    Assert.assertEquals(0, su.getQueueLength());
+
+    executeQuery("create table new_table(s string) partitioned by (i int) " + txnProperty);
+    executeQuery("insert into new_table(i, s) values (4, 'test4')");
+
+    assertFalse(su.runOneIteration());
+    Assert.assertEquals(0, su.getQueueLength());
+    verifyStatsUpToDate(tblWOStats, Lists.newArrayList("i"), msClient, false);
+    verifyPartStatsUpToDate(3, 1, msClient, ptnTblWOStats, false);
+
+    executeQuery("alter database " + dbName + " set dbproperties('" + ReplConst.REPL_FAILOVER_ENABLED + "'='')");
+    executeQuery("drop table " + tblWOStats);

Review comment:
   Actually, it would make sense to re-verify that stats updates happen again 
after the REPL_FAILOVER_ENABLED property is removed. 






Issue Time Tracking
---

Worklog Id: (was: 608218)
Time Spent: 4h 40m  (was: 4.5h)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-24073) Execution exception in sort-merge semijoin

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera resolved HIVE-24073.

Resolution: Fixed

> Execution exception in sort-merge semijoin
> --
>
> Key: HIVE-24073
> URL: https://issues.apache.org/jira/browse/HIVE-24073
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Reporter: Jesus Camacho Rodriguez
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Working on HIVE-24041, we trigger an additional SJ conversion that leads to 
> this exception at execution time:
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite 
> nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1063)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:685)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:462)
>   ... 16 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite 
> nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1037)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1060)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to 
> overwrite nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:564)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:243)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887)
>   at 
> org.apache.hadoop.hive.ql.exec.TezDummyStoreOperator.process(TezDummyStoreOperator.java:49)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1003)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1020)
>   ... 23 more
> {code}
> To reproduce, just set {{hive.auto.convert.sortmerge.join}} to {{true}} in 
> the last query in {{auto_sortmerge_join_10.q}} after HIVE-24041 has been 
> merged.





[jira] [Work logged] (HIVE-24073) Execution exception in sort-merge semijoin

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24073?focusedWorklogId=608217&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608217
 ]

ASF GitHub Bot logged work on HIVE-24073:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 02:06
Start Date: 08/Jun/21 02:06
Worklog Time Spent: 10m 
  Work Description: maheshk114 merged pull request #1476:
URL: https://github.com/apache/hive/pull/1476


   




Issue Time Tracking
---

Worklog Id: (was: 608217)
Time Spent: 1h 40m  (was: 1.5h)

> Execution exception in sort-merge semijoin
> --
>
> Key: HIVE-24073
> URL: https://issues.apache.org/jira/browse/HIVE-24073
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Reporter: Jesus Camacho Rodriguez
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Working on HIVE-24041, we trigger an additional SJ conversion that leads to 
> this exception at execution time:
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite 
> nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1063)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:685)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:462)
>   ... 16 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite 
> nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1037)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1060)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to 
> overwrite nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:564)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:243)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887)
>   at 
> org.apache.hadoop.hive.ql.exec.TezDummyStoreOperator.process(TezDummyStoreOperator.java:49)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1003)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1020)
>   ... 23 more
> {code}
> To reproduce, just set {{hive.auto.convert.sortmerge.join}} to {{true}} in 
> the last query in {{auto_sortmerge_join_10.q}} after HIVE-24041 has been 
> merged.





[jira] [Work logged] (HIVE-25154) Disable StatsUpdaterThread and PartitionManagementTask for db that is being failoved over.

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25154?focusedWorklogId=608215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608215
 ]

ASF GitHub Bot logged work on HIVE-25154:
-

Author: ASF GitHub Bot
Created on: 08/Jun/21 02:04
Start Date: 08/Jun/21 02:04
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2311:
URL: https://github.com/apache/hive/pull/2311#discussion_r647056770



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java
##
@@ -169,7 +168,7 @@ public void run() {
           // this always runs in 'sync' mode where partitions can be added and dropped
           MsckInfo msckInfo = new MsckInfo(table.getCatName(), table.getDbName(), table.getTableName(),
               null, null, true, true, true, retentionSeconds);
-          executorService.submit(new MsckThread(msckInfo, msckConf, qualifiedTableName, countDownLatch));
+          executorService.submit(new MsckThread(msckInfo, msckConf, qualifiedTableName, countDownLatch, msc));

Review comment:
   Sorry, it looks like I missed this change earlier. Sharing an HMS client 
across threads might not be a good idea: if all the threads make calls at the 
same time, the results might get mixed up. It would be better to use a 
separate client for each of them.
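The per-thread-client alternative being suggested can be sketched with a `ThreadLocal`. The `Client` type below is a stub standing in for a metastore client such as `IMetaStoreClient`; this is a sketch of the pattern, not the actual fix.

```java
// Sketch of the reviewer's suggestion: give each worker thread its own
// metastore client instead of sharing one across the pool. Client is a stub
// standing in for IMetaStoreClient / HiveMetaStoreClient.
import java.util.*;
import java.util.concurrent.*;

public class PerThreadClientSketch {
    static class Client {                       // stub for a metastore client
        final long owner = Thread.currentThread().getId();
    }

    // each worker thread lazily creates and keeps its own client,
    // so concurrent calls can never interleave on one connection
    static final ThreadLocal<Client> CLIENT = ThreadLocal.withInitial(Client::new);

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Set<Long> owners = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                Client c = CLIENT.get();        // thread-private instance
                owners.add(c.owner);
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("distinct clients created: " + owners.size());
    }
}
```

An alternative with the same effect is constructing a fresh client inside each `MsckThread` and closing it when the task completes.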






Issue Time Tracking
---

Worklog Id: (was: 608215)
Time Spent: 4.5h  (was: 4h 20m)

> Disable StatsUpdaterThread and PartitionManagementTask for db that is being 
> failoved over.
> --
>
> Key: HIVE-25154
> URL: https://issues.apache.org/jira/browse/HIVE-25154
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25215) tables_with_x_aborted_transactions should count partition/unpartitioned tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25215:
--
Labels: pull-request-available  (was: )

> tables_with_x_aborted_transactions should count partition/unpartitioned tables
> --
>
> Key: HIVE-25215
> URL: https://issues.apache.org/jira/browse/HIVE-25215
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Initiator compares each partition's number of aborts to 
> hive.compactor.abortedtxn.threshold, so tables_with_x_aborted_transactions 
> should reflect the number of partitions/unpartitioned tables with more than x 
> aborts, instead of the number of tables with more than x aborts.
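The counting change can be sketched as follows. The nested-map input shape is hypothetical (a stand-in for what the metrics query returns); the real metric lives in the compaction metrics code.

```java
// Sketch of the HIVE-25215 metric: count partitions (and unpartitioned
// tables) whose abort count exceeds the threshold, rather than counting
// whole tables. Input shape is hypothetical:
// table name -> (partition name, or a placeholder for unpartitioned -> aborted txn count)
import java.util.*;

public class AbortedTxnMetricSketch {
    static int unitsOverThreshold(Map<String, Map<String, Integer>> aborts, int threshold) {
        int count = 0;
        for (Map<String, Integer> partitions : aborts.values()) {
            for (int abortedTxns : partitions.values()) {
                if (abortedTxns > threshold) count++;  // one count per partition, not per table
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> aborts = new HashMap<>();
        aborts.put("t1", Map.of("p=1", 5, "p=2", 12, "p=3", 11)); // 2 partitions over
        aborts.put("t2", Map.of("<unpartitioned>", 20));          // 1 unpartitioned table over
        // per-partition counting matches what the Initiator actually compares
        System.out.println(unitsOverThreshold(aborts, 10));       // 3, vs. 2 if counted per table
    }
}
```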





[jira] [Work logged] (HIVE-25215) tables_with_x_aborted_transactions should count partition/unpartitioned tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25215?focusedWorklogId=608159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608159
 ]

ASF GitHub Bot logged work on HIVE-25215:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 23:27
Start Date: 07/Jun/21 23:27
Worklog Time Spent: 10m 
  Work Description: asinkovits opened a new pull request #2363:
URL: https://github.com/apache/hive/pull/2363


   …/unpartitioned tables
   
   
   
   ### What changes were proposed in this pull request?
   
   Consolidate the aborted-txn threshold between the compactor Initiator and 
the compactor metrics.
   
   ### Why are the changes needed?
   
   This subtask is part of the compaction observability initiative.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Unit tests were added




Issue Time Tracking
---

Worklog Id: (was: 608159)
Remaining Estimate: 0h
Time Spent: 10m

> tables_with_x_aborted_transactions should count partition/unpartitioned tables
> --
>
> Key: HIVE-25215
> URL: https://issues.apache.org/jira/browse/HIVE-25215
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Initiator compares each partition's number of aborts to 
> hive.compactor.abortedtxn.threshold, so tables_with_x_aborted_transactions 
> should reflect the number of partitions/unpartitioned tables with more than x 
> aborts, instead of the number of tables with more than x aborts.





[jira] [Work started] (HIVE-25215) tables_with_x_aborted_transactions should count partition/unpartitioned tables

2021-06-07 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25215 started by Antal Sinkovits.
--
> tables_with_x_aborted_transactions should count partition/unpartitioned tables
> --
>
> Key: HIVE-25215
> URL: https://issues.apache.org/jira/browse/HIVE-25215
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> The Initiator compares each partition's number of aborts to 
> hive.compactor.abortedtxn.threshold, so tables_with_x_aborted_transactions 
> should reflect the number of partitions/unpartitioned tables with more than x 
> aborts, instead of the number of tables with more than x aborts.





[jira] [Assigned] (HIVE-25215) tables_with_x_aborted_transactions should count partition/unpartitioned tables

2021-06-07 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits reassigned HIVE-25215:
--


> tables_with_x_aborted_transactions should count partition/unpartitioned tables
> --
>
> Key: HIVE-25215
> URL: https://issues.apache.org/jira/browse/HIVE-25215
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> The Initiator compares each partition's number of aborts to 
> hive.compactor.abortedtxn.threshold, so tables_with_x_aborted_transactions 
> should reflect the number of partitions/unpartitioned tables with more than x 
> aborts, instead of the number of tables with more than x aborts.





[jira] [Updated] (HIVE-25081) Put metrics collection behind a feature flag

2021-06-07 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25081:
---
Parent: HIVE-24824
Issue Type: Sub-task  (was: Bug)

> Put metrics collection behind a feature flag
> 
>
> Key: HIVE-25081
> URL: https://issues.apache.org/jira/browse/HIVE-25081
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Most metrics we're creating are collected in AcidMetricsService, which is 
> behind a feature flag. However there are some metrics that are collected 
> outside of the service. These should be behind a feature flag in addition to 
> hive.metastore.metrics.enabled.





[jira] [Commented] (HIVE-21489) EXPLAIN command throws ClassCastException in Hive

2021-06-07 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358879#comment-17358879
 ] 

Ramesh Kumar Thangarajan commented on HIVE-21489:
-

[~daijy] This issue can still be reproduced when Storage Based Authorization 
is used, and it would be better if we could fix it. The above patch seems 
reasonable to me. 

cc [~hashutosh] [~thejas]

> EXPLAIN command throws ClassCastException in Hive
> -
>
> Key: HIVE-21489
> URL: https://issues.apache.org/jira/browse/HIVE-21489
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.4
>Reporter: Ping Lu
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21489.1.patch, HIVE-21489.2.patch
>
>
> I'm trying to run commands like explain select * from src in hive-2.3.4, but 
> it fails with the ClassCastException: 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer cannot be cast to 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer
> Steps to reproduce:
> 1) hive.execution.engine is the default value mr
> 2) hive.security.authorization.enabled is set to true, and 
> hive.security.authorization.manager is set to 
> org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
> 3) start hivecli and run the command: explain select * from src
> I debugged the code and found that the change from HIVE-18778 causes the 
> above ClassCastException. If I set hive.in.test to true, the explain command 
> executes successfully.
> Now I have one question: since hive.in.test cannot be modified at runtime, 
> how can I run the explain command with the default authorization in 
> hive-2.3.4?





[jira] [Assigned] (HIVE-25213) Implement List getTables() for existing connectors.

2021-06-07 Thread Dantong Dong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dantong Dong reassigned HIVE-25213:
---

Assignee: Dantong Dong  (was: Naveen Gangam)

> Implement List getTables() for existing connectors.
> --
>
> Key: HIVE-25213
> URL: https://issues.apache.org/jira/browse/HIVE-25213
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Naveen Gangam
>Assignee: Dantong Dong
>Priority: Major
>
> In the initial implementation, connector providers do not implement the 
> getTables(string pattern) SPI; we had deferred it for later. Only 
> getTableNames() and getTable() were implemented. 





[jira] [Assigned] (HIVE-25214) Add hive authorization support for Data connectors.

2021-06-07 Thread Dantong Dong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dantong Dong reassigned HIVE-25214:
---

Assignee: Dantong Dong  (was: Naveen Gangam)

> Add hive authorization support for Data connectors.
> ---
>
> Key: HIVE-25214
> URL: https://issues.apache.org/jira/browse/HIVE-25214
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Naveen Gangam
>Assignee: Dantong Dong
>Priority: Major
>
> We need to add authorization support for data connectors in Hive. The default 
> behavior should be:
> 1) Connectors can be created/dropped by users in the admin role.
> 2) Connectors have READ and WRITE permissions.
> *   READ permission is required to fetch a connector object or fetch all 
> connector names. So to create a REMOTE database using a connector, users will 
> need READ permission on the connector. DDL queries like "show connectors" and 
> "describe " will check for read access on the connector as well.
> *   WRITE permission is required to alter/drop a connector. DDL queries 
> like "alter connector" and "drop connector" will need WRITE access on the 
> connector.
> With this support in place, Ranger can integrate with it.
>
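A minimal sketch of the READ/WRITE model described in this issue follows. The enum, map, and operation names are hypothetical; real enforcement would plug into Hive's authorization interfaces (which is what allows Ranger integration).

```java
// Hypothetical sketch of the connector permission model: READ for fetching,
// listing, or using a connector, WRITE for alter/drop. All names are
// illustrative only, not the actual HiveAuthorizer API.
import java.util.*;

public class ConnectorAuthSketch {
    enum Perm { READ, WRITE }

    // privileges granted per user on a given connector
    static final Map<String, Set<Perm>> grants = new HashMap<>();

    static boolean authorize(String user, String op) {
        Set<Perm> granted = grants.getOrDefault(user, Set.of());
        switch (op) {
            case "SHOW_CONNECTORS":          // listing and describing need READ
            case "DESCRIBE_CONNECTOR":
            case "CREATE_REMOTE_DATABASE":   // using a connector also needs READ
                return granted.contains(Perm.READ);
            case "ALTER_CONNECTOR":          // mutating operations need WRITE
            case "DROP_CONNECTOR":
                return granted.contains(Perm.WRITE);
            default:
                return false;                // unknown ops are denied
        }
    }

    public static void main(String[] args) {
        grants.put("analyst", EnumSet.of(Perm.READ));
        grants.put("admin", EnumSet.of(Perm.READ, Perm.WRITE));
        System.out.println(authorize("analyst", "DESCRIBE_CONNECTOR")); // true
        System.out.println(authorize("analyst", "DROP_CONNECTOR"));     // false
    }
}
```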





[jira] [Assigned] (HIVE-25214) Add hive authorization support for Data connectors.

2021-06-07 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-25214:



> Add hive authorization support for Data connectors.
> ---
>
> Key: HIVE-25214
> URL: https://issues.apache.org/jira/browse/HIVE-25214
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>
> We need to add authorization support for data connectors in Hive. The default 
> behavior should be:
> 1) Connectors can be created/dropped by users in the admin role.
> 2) Connectors have READ and WRITE permissions.
> *   READ permission is required to fetch a connector object or fetch all 
> connector names. So to create a REMOTE database using a connector, users will 
> need READ permission on the connector. DDL queries like "show connectors" and 
> "describe " will check for read access on the connector as well.
> *   WRITE permission is required to alter/drop a connector. DDL queries 
> like "alter connector" and "drop connector" will need WRITE access on the 
> connector.
> With this support in place, Ranger can integrate with it.
>





[jira] [Assigned] (HIVE-25213) Implement List getTables() for existing connectors.

2021-06-07 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-25213:



> Implement List getTables() for existing connectors.
> --
>
> Key: HIVE-25213
> URL: https://issues.apache.org/jira/browse/HIVE-25213
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>
> In the initial implementation, connector providers do not implement the 
> getTables(string pattern) SPI; we had deferred it for later. Only 
> getTableNames() and getTable() were implemented. 





[jira] [Work logged] (HIVE-25211) Create database throws NPE

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25211?focusedWorklogId=608097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608097
 ]

ASF GitHub Bot logged work on HIVE-25211:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 20:32
Start Date: 07/Jun/21 20:32
Worklog Time Spent: 10m 
  Work Description: yongzhi opened a new pull request #2362:
URL: https://github.com/apache/hive/pull/2362


   
   
   ### What changes were proposed in this pull request?
   Fix the NPE thrown when the managed path is NULL, by adding a null check.
   
   ### Why are the changes needed?
   Without the fix, CREATE DATABASE may fail with an NPE.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
Manual tests
   




Issue Time Tracking
---

Worklog Id: (was: 608097)
Remaining Estimate: 0h
Time Spent: 10m

> Create database throws NPE
> --
>
> Key: HIVE-25211
> URL: https://issues.apache.org/jira/browse/HIVE-25211
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> <11>1 2021-06-06T17:32:48.964Z 
> metastore-0.metastore-service.warehouse-1622998329-9klr.svc.cluster.local 
> metastore 1 5ad83e8e-bf89-4ad3-b1fb-51c73c7133b7 [mdc@18060 
> class="metastore.RetryingHMSHandler" level="ERROR" thread="pool-9-thread-16"] 
> MetaException(message:java.lang.NullPointerException)
>   
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:8115)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1629)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy31.create_database(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:16795)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:16779)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:643)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:638)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:638)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:120)
>   at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:128)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:491)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:480)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:476)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$9.run(HiveMetaStore.java:1556)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$9.run(HiveMetaStore.java:1554)
>   at 

[jira] [Updated] (HIVE-25211) Create database throws NPE

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25211:
--
Labels: pull-request-available  (was: )

> Create database throws NPE
> --
>
> Key: HIVE-25211
> URL: https://issues.apache.org/jira/browse/HIVE-25211
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> <11>1 2021-06-06T17:32:48.964Z 
> metastore-0.metastore-service.warehouse-1622998329-9klr.svc.cluster.local 
> metastore 1 5ad83e8e-bf89-4ad3-b1fb-51c73c7133b7 [mdc@18060 
> class="metastore.RetryingHMSHandler" level="ERROR" thread="pool-9-thread-16"] 
> MetaException(message:java.lang.NullPointerException)
>   
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:8115)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1629)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy31.create_database(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:16795)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:16779)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:643)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:638)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:638)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:120)
>   at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:128)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:491)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:480)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:476)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$9.run(HiveMetaStore.java:1556)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$9.run(HiveMetaStore.java:1554)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:1554)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1618)
>   ... 21 more





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=608045&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608045
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 18:31
Start Date: 07/Jun/21 18:31
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #2351:
URL: https://github.com/apache/hive/pull/2351#issuecomment-856166212


   @szlta: quick question: Would it be possible to create a test where we 
concurrently try to modify the schema through Hive and change the schema 
through the Iceberg Java API?




Issue Time Tracking
---

Worklog Id: (was: 608045)
Time Spent: 3.5h  (was: 3h 20m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Since Iceberg tables count as non-native Hive tables, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Assigned] (HIVE-25211) Create database throws NPE

2021-06-07 Thread Yongzhi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-25211:
---

Assignee: Yongzhi Chen

> Create database throws NPE
> --
>
> Key: HIVE-25211
> URL: https://issues.apache.org/jira/browse/HIVE-25211
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
>
> <11>1 2021-06-06T17:32:48.964Z 
> metastore-0.metastore-service.warehouse-1622998329-9klr.svc.cluster.local 
> metastore 1 5ad83e8e-bf89-4ad3-b1fb-51c73c7133b7 [mdc@18060 
> class="metastore.RetryingHMSHandler" level="ERROR" thread="pool-9-thread-16"] 
> MetaException(message:java.lang.NullPointerException)
>   
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:8115)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1629)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy31.create_database(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:16795)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:16779)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:643)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:638)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:638)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:120)
>   at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:128)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:491)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:480)
>   at 
> org.apache.hadoop.hive.metastore.Warehouse.deleteDir(Warehouse.java:476)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$9.run(HiveMetaStore.java:1556)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$9.run(HiveMetaStore.java:1554)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:1554)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1618)
>   ... 21 more





[jira] [Assigned] (HIVE-25212) Precision of the result set varying depending on the predicate

2021-06-07 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das reassigned HIVE-25212:
--


>  Precision of the result set varying depending on the predicate
> ---
>
> Key: HIVE-25212
> URL: https://issues.apache.org/jira/browse/HIVE-25212
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>
> Hive: Precision of the result set varying depending on the predicate
> Problem Statement:
> {noformat}
> SELECT t1.c1 FROM t1 WHERE 
> ((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1)) UNION 
> ALL SELECT t1.c1 FROM t1 WHERE NOT 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) UNION 
> ALL SELECT ALL t1.c1 FROM t1 WHERE 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) IS 
> NULL;
> -- result: [ 0.159323060, 
> 0.852237280 ]
> SELECT t1.c1 FROM t1 WHERE ((0.265)!=(t1.c1)) UNION ALL SELECT t1.c1 FROM t1 
> WHERE NOT (((0.265)!=(t1.c1))) UNION ALL SELECT ALL t1.c1 FROM t1 WHERE 
> (((0.265)!=(t1.c1))) IS NULL;
> -- result: [ 0.15932306, 0.85223728] {noformat}
> Steps to reproduce:
> {noformat}
> DROP DATABASE IF EXISTS database0 CASCADE;
> CREATE DATABASE database0;
> use database0;
> CREATE TABLE t1(c0 FLOAT NOT NULL, c1 DECIMAL(9,8) NOT NULL);
> -- Number of Inserts for this run: 2;
> INSERT INTO t1(c0, c1) VALUES(0.037977062, 0.15932306);
> INSERT INTO t1(c0, c1) VALUES(0.65065473, 0.85223728);
> SELECT t1.c1 FROM t1 WHERE 
> ((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1)) UNION 
> ALL SELECT t1.c1 FROM t1 WHERE NOT 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) UNION 
> ALL SELECT ALL t1.c1 FROM t1 WHERE 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) IS 
> NULL;
> -- result: [ 0.159323060, 
> 0.852237280 ]
> SELECT t1.c1 FROM t1 WHERE ((0.265)!=(t1.c1)) UNION ALL SELECT t1.c1 FROM t1 
> WHERE NOT (((0.265)!=(t1.c1))) UNION ALL SELECT ALL t1.c1 FROM t1 WHERE 
> (((0.265)!=(t1.c1))) IS NULL;
> -- result: [ 0.15932306, 0.85223728] {noformat}
> Observations:
>  If the NOT NULL constraint is removed, the result sets match (ideally, the 
> result should not depend on the constraint).
> Similarity with Impala:
>  Result is as expected
> {noformat}
> impala database0> SELECT t1.c1 FROM t1 WHERE 
> ((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1)) UNION 
> ALL SELECT t1.c1 FROM t1 WHERE NOT 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) UNION 
> ALL SELECT ALL t1.c1 FROM t1 WHERE 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) IS 
> NULL;
> +------------+
> | c1         |
> +------------+
> | 0.15932306 |
> | 0.85223728 |
> +------------+
> rows_count: 2 Time taken: 936ms
> impala database0> SELECT t1.c1 FROM t1 WHERE ((0.265)!=(t1.c1)) UNION ALL 
> SELECT t1.c1 FROM t1 WHERE NOT (((0.265)!=(t1.c1))) UNION ALL SELECT ALL 
> t1.c1 FROM t1 WHERE (((0.265)!=(t1.c1))) IS NULL;
> +------------+
> | c1         |
> +------------+
> | 0.85223728 |
> | 0.15932306 |
> +------------+
> rows_count: 2 Time taken: 887ms{noformat}
> Similarity with Postgres:
> {noformat}
> temp=# SELECT t1.c1 FROM t1 WHERE 
> ((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1)) UNION 
> ALL SELECT t1.c1 FROM t1 WHERE NOT 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) UNION 
> ALL SELECT ALL t1.c1 FROM t1 WHERE 
> (((0.26568513289718753700441311593749560415744781494140625)!=(t1.c1))) IS 
> NULL;
>  c1
> ------------
>  0.15932306
>  0.85223728
> (2 rows)
> temp=# SELECT t1.c1 FROM t1 WHERE ((0.265)!=(t1.c1)) UNION ALL SELECT t1.c1 
> FROM t1 WHERE NOT (((0.265)!=(t1.c1))) UNION ALL SELECT ALL t1.c1 FROM t1 
> WHERE (((0.265)!=(t1.c1))) IS NULL;
>  c1
> ------------
>  0.15932306
>  0.85223728
> (2 rows) {noformat}
> Is this the expected behavior in Hive?
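The extra trailing zero in the first Hive result comes from the scale of the derived comparison type, not from a different numeric value. A small BigDecimal illustration (this is not Hive's type-derivation code; it only shows how widening a DECIMAL(9,8) value to a higher common scale changes the printed form while leaving the value numerically equal):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalScaleSketch {

    /** Widen a decimal value to a larger scale without changing its value. */
    static BigDecimal widenScale(BigDecimal v, int scale) {
        return v.setScale(scale, RoundingMode.UNNECESSARY);
    }

    public static void main(String[] args) {
        BigDecimal c1 = new BigDecimal("0.15932306");      // a DECIMAL(9,8) value
        BigDecimal widened = widenScale(c1, 9);            // common scale with a wider literal
        System.out.println(c1);                            // 0.15932306
        System.out.println(widened);                       // 0.159323060
        System.out.println(c1.compareTo(widened) == 0);    // true: numerically equal
        System.out.println(c1.equals(widened));            // false: scales differ
    }
}
```

So both result sets contain the same numbers; only the scale carried by the result type differs, which is why the output depends on the precision of the literal in the predicate.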





[jira] [Work logged] (HIVE-25209) SELECT query with SUM function producing unexpected result

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25209?focusedWorklogId=608023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608023
 ]

ASF GitHub Bot logged work on HIVE-25209:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 17:58
Start Date: 07/Jun/21 17:58
Worklog Time Spent: 10m 
  Work Description: soumyakanti3578 opened a new pull request #2360:
URL: https://github.com/apache/hive/pull/2360


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 608023)
Remaining Estimate: 0h
Time Spent: 10m

> SELECT query with SUM function producing unexpected result
> --
>
> Key: HIVE-25209
> URL: https://issues.apache.org/jira/browse/HIVE-25209
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive: SELECT query with SUM function producing unexpected result
> Problem Statement:
> {noformat}
> SELECT SUM(1) FROM t1;
>  result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
>  result: null {noformat}
> Steps to reproduce:
> {noformat}
> DROP DATABASE IF EXISTS db5 CASCADE;
> CREATE DATABASE db5;
> use db5;
> CREATE TABLE IF NOT EXISTS t1(c0 boolean, c1 boolean);
> SELECT SUM(1) FROM t1;
> -- result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
> -- result: null {noformat}
> Observations:
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 = t1.c1; -- will result in null
> Similarity with Postgres:
>  both queries result in null
> Similarity with Impala:
>  both queries result in null

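Per the SQL standard, SUM ignores NULL inputs and yields NULL when no non-NULL rows remain, which matches the Postgres and Impala results quoted above. A minimal Java simulation of that semantics (this is not Hive's implementation; names are illustrative):

```java
import java.util.Arrays;
import java.util.List;

public class SqlSumSketch {

    /** SQL-style SUM: ignores NULLs; returns NULL when no non-NULL input rows exist. */
    static Long sqlSum(List<Long> values) {
        boolean sawValue = false;
        long total = 0;
        for (Long v : values) {
            if (v != null) {
                sawValue = true;
                total += v;
            }
        }
        return sawValue ? total : null;
    }

    public static void main(String[] args) {
        // SUM(1) over an empty table: no rows at all, so standard SQL yields NULL, not 0.
        System.out.println(sqlSum(List.of()));                       // null
        // Outer SUM over the three NULL branch results is also NULL.
        System.out.println(sqlSum(Arrays.asList(null, null, null))); // null
        // With actual rows, SUM behaves as usual.
        System.out.println(sqlSum(Arrays.asList(1L, 1L)));           // 2
    }
}
```

Under this reading, Hive returning 0 for `SELECT SUM(1) FROM t1` on an empty table is the outlier; the nested-UNION query already agrees with the other engines.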




[jira] [Updated] (HIVE-25209) SELECT query with SUM function producing unexpected result

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25209:
--
Labels: pull-request-available  (was: )

> SELECT query with SUM function producing unexpected result
> --
>
> Key: HIVE-25209
> URL: https://issues.apache.org/jira/browse/HIVE-25209
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive: SELECT query with SUM function producing unexpected result
> Problem Statement:
> {noformat}
> SELECT SUM(1) FROM t1;
>  result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
>  result: null {noformat}
> Steps to reproduce:
> {noformat}
> DROP DATABASE IF EXISTS db5 CASCADE;
> CREATE DATABASE db5;
> use db5;
> CREATE TABLE IF NOT EXISTS t1(c0 boolean, c1 boolean);
> SELECT SUM(1) FROM t1;
> -- result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
> -- result: null {noformat}
> Observations:
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 = t1.c1; -- will result in null
> Similarity with Postgres:
>  both queries result in null
> Similarity with Impala:
>  both queries result in null





[jira] [Resolved] (HIVE-25210) oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table

2021-06-07 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-25210.
-
Resolution: Not A Problem

> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table
> 
>
> Key: HIVE-25210
> URL: https://issues.apache.org/jira/browse/HIVE-25210
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table





[jira] [Assigned] (HIVE-25210) oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table

2021-06-07 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-25210:
---


> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table
> 
>
> Key: HIVE-25210
> URL: https://issues.apache.org/jira/browse/HIVE-25210
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> oracle.sql.CLOB cannot be cast to java.lang.String in PARTITION_PARAMS table





[jira] [Assigned] (HIVE-25209) SELECT query with SUM function producing unexpected result

2021-06-07 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das reassigned HIVE-25209:
--


> SELECT query with SUM function producing unexpected result
> --
>
> Key: HIVE-25209
> URL: https://issues.apache.org/jira/browse/HIVE-25209
> Project: Hive
>  Issue Type: Bug
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>
> Hive: SELECT query with SUM function producing unexpected result
> Problem Statement:
> {noformat}
> SELECT SUM(1) FROM t1;
>  result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
>  result: null {noformat}
> Steps to reproduce:
> {noformat}
> DROP DATABASE IF EXISTS db5 CASCADE;
> CREATE DATABASE db5;
> use db5;
> CREATE TABLE IF NOT EXISTS t1(c0 boolean, c1 boolean);
> SELECT SUM(1) FROM t1;
> -- result: 0
> SELECT SUM(agg0) FROM (
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE NOT (t1.c0) UNION ALL 
> SELECT SUM(1) as agg0 FROM t1 WHERE (t1.c0) IS NULL
> ) as asdf;
> -- result: null {noformat}
> Observations:
> SELECT SUM(1) as agg0 FROM t1 WHERE t1.c0 = t1.c1; -- will result in null
> Similarity with Postgres:
>  both queries result in null
> Similarity with Impala:
>  both queries result in null





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607884
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:53
Start Date: 07/Jun/21 13:53
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646608593



##
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java
##
@@ -783,6 +785,43 @@ public void testDropHiveTableWithoutUnderlyingTable() throws IOException {
 shell.executeStatement("DROP TABLE " + identifier);
   }
 
+  @Test
+  public void testAlterTableAddColumns() throws Exception {
+    Assume.assumeTrue("Iceberg - alter table/add column is only relevant for HiveCatalog",
+        testTableType == TestTables.TestTableType.HIVE_CATALOG);
+
+    TableIdentifier identifier = TableIdentifier.of("default", "customers");
+
+    // Create HMS table with a property to be translated
+    shell.executeStatement(String.format("CREATE EXTERNAL TABLE default.customers " +
+        "STORED BY ICEBERG " +
+        "TBLPROPERTIES ('%s'='%s', '%s'='%s', '%s'='%s')",
+        InputFormatConfig.TABLE_SCHEMA, SchemaParser.toJson(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA),
+        InputFormatConfig.PARTITION_SPEC, PartitionSpecParser.toJson(SPEC),
+        InputFormatConfig.EXTERNAL_TABLE_PURGE, "false"));
+
+    shell.executeStatement("ALTER TABLE default.customers ADD COLUMNS " +

Review comment:
   Yeah I'd refrain from doing a thorough check of that here






Issue Time Tracking
---

Worklog Id: (was: 607884)
Time Spent: 3h 20m  (was: 3h 10m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Since Iceberg tables count as non-native Hive tables, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607881&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607881
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:52
Start Date: 07/Jun/21 13:52
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646608043



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -265,9 +287,12 @@ public void commitAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable
       HiveTableUtil.importFiles(preAlterTableProperties.tableLocation, preAlterTableProperties.format,
           partitionSpecProxy, preAlterTableProperties.partitionKeys, catalogProperties, conf);
     } else {
-      Map<String, String> contextProperties = context.getProperties();
-      if (contextProperties.containsKey(ALTER_TABLE_OPERATION_TYPE) &&
-          allowedAlterTypes.contains(contextProperties.get(ALTER_TABLE_OPERATION_TYPE))) {
+      if (isMatchingAlterOp(AlterTableType.ADDCOLS, context) && updateSchema != null) {

Review comment:
   Ok this is now stored as a state of this hook.
   I usually like to be more restrictive, so I'd rather leave the ADDCOL op 
type check (btw I'm not sure how valid this case is, but if there are no new 
columns, then we have a case for ADDCOL op type with null updateSchema ;) )
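The restrictive double check discussed here, matching operation type plus a non-null schema update, can be sketched as follows (simplified, hypothetical types for illustration; this is not the actual HiveIcebergMetaHook code):

```java
import java.util.HashMap;
import java.util.Map;

public class AlterOpGuardSketch {

    enum AlterTableType { ADDCOLS, RENAME, ADDPROPS }

    /** Hypothetical key under which the alter-operation type is carried in the hook context. */
    static final String ALTER_TABLE_OPERATION_TYPE = "alterTableOpType";

    static boolean isMatchingAlterOp(AlterTableType expected, Map<String, String> context) {
        return expected.name().equals(context.get(ALTER_TABLE_OPERATION_TYPE));
    }

    /**
     * Commit only when the op is ADDCOLS *and* a schema update was actually prepared.
     * An ADDCOLS statement that adds no new columns leaves the update null, which is
     * exactly why the reviewer keeps both conditions.
     */
    static boolean shouldCommitSchemaUpdate(Map<String, String> context, Object updateSchema) {
        return isMatchingAlterOp(AlterTableType.ADDCOLS, context) && updateSchema != null;
    }

    public static void main(String[] args) {
        Map<String, String> ctx = new HashMap<>();
        ctx.put(ALTER_TABLE_OPERATION_TYPE, "ADDCOLS");
        System.out.println(shouldCommitSchemaUpdate(ctx, new Object())); // true
        System.out.println(shouldCommitSchemaUpdate(ctx, null));        // false: no columns added
        ctx.put(ALTER_TABLE_OPERATION_TYPE, "RENAME");
        System.out.println(shouldCommitSchemaUpdate(ctx, new Object())); // false: wrong op type
    }
}
```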






Issue Time Tracking
---

Worklog Id: (was: 607881)
Time Spent: 3h 10m  (was: 3h)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Since Iceberg tables count as non-native Hive tables, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25195) Store Iceberg write commit and ctas information in QueryState

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25195?focusedWorklogId=607878&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607878
 ]

ASF GitHub Bot logged work on HIVE-25195:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:49
Start Date: 07/Jun/21 13:49
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2347:
URL: https://github.com/apache/hive/pull/2347#discussion_r646605093



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java
##
@@ -381,27 +379,23 @@ private void collectCommitInformation(TezWork work) throws IOException, TezExcep
         if (child.isDirectory() && child.getPath().getName().contains(jobIdPrefix)) {
           // folder name pattern is queryID-jobID, we're removing the queryID part to get the jobID
           String jobIdStr = child.getPath().getName().substring(jobConf.get("hive.query.id").length() + 1);
-          // get all target tables this vertex wrote to
+
           List<String> tables = new ArrayList<>();
+          Map<String, String> icebergProperties = new HashMap<>();
           for (Map.Entry<String, String> entry : jobConf) {
-            if (entry.getKey().startsWith("iceberg.mr.serialized.table.")) {
-              tables.add(entry.getKey().substring("iceberg.mr.serialized.table.".length()));
+            if (entry.getKey().startsWith(ICEBERG_SERIALIZED_TABLE_PREFIX)) {
+              // get all target tables this vertex wrote to
+              tables.add(entry.getKey().substring(ICEBERG_SERIALIZED_TABLE_PREFIX.length()));
+            } else if (entry.getKey().startsWith("iceberg.mr.")) {
+              // find iceberg props in jobConf as they can be needed, but not available, during job commit
+              icebergProperties.put(entry.getKey(), entry.getValue());
             }
           }
-          // save information for each target table (jobID, task num, query state)
-          for (String table : tables) {
-            sessionConf.set(HIVE_TEZ_COMMIT_JOB_ID_PREFIX + table, jobIdStr);
-            sessionConf.setInt(HIVE_TEZ_COMMIT_TASK_COUNT_PREFIX + table,
-                status.getProgress().getSucceededTaskCount());
-          }
+          // save information for each target table

Review comment:
   Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607878)
Time Spent: 3.5h  (was: 3h 20m)

> Store Iceberg write commit and ctas information in QueryState 
> --
>
> Key: HIVE-25195
> URL: https://issues.apache.org/jira/browse/HIVE-25195
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> We should replace the current method of passing Iceberg write commit-related 
> information (jobID, task num) and CTAS info via prefixed keys in the session 
> conf. The QueryState object now offers a cleaner way to do this. It should 
> make the code easier to maintain and guard against accidental session conf 
> pollution.
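The session-conf-pollution problem described above can be illustrated with a minimal, self-contained sketch. The `QueryStateSketch` class and its methods below are hypothetical stand-ins, not Hive's actual `QueryState` API; the point is only that per-query state needs no prefixed keys and no later cleanup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-query state object: commit metadata is
// scoped to a single query instead of being written into the shared session conf.
class QueryStateSketch {
    private final Map<String, String> commitInfo = new HashMap<>();

    void setCommitJobId(String table, String jobId) {
        commitInfo.put(table, jobId);
    }

    String getCommitJobId(String table) {
        return commitInfo.get(table);
    }
}

public class QueryStateDemo {
    public static void main(String[] args) {
        // Old style: prefixed keys land in a session-wide map shared by all
        // queries, and must be cleaned up to avoid leaking between queries.
        Map<String, String> sessionConf = new HashMap<>();
        sessionConf.put("hive.tez.commit.job.id.db.tbl", "job_123");

        // New style: each query carries its own state, discarded with the query.
        QueryStateSketch queryState = new QueryStateSketch();
        queryState.setCommitJobId("db.tbl", "job_123");

        System.out.println(queryState.getCommitJobId("db.tbl"));
    }
}
```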





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607874&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607874
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:46
Start Date: 07/Jun/21 13:46
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646602703



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -2039,4 +2189,29 @@ private static IntegerColumnStatistics deserializeIntColumnStatistics(List

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable the same feature in the vectorized ORC batch 
> reader.





[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607873&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607873
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:46
Start Date: 07/Jun/21 13:46
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #2291:
URL: https://github.com/apache/hive/pull/2291#discussion_r646602792



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -656,10 +656,9 @@ public void createCatalog(Catalog cat) throws MetaException {
   }
 
   @Override
-  public void alterCatalog(String catName, Catalog cat)
-      throws MetaException, InvalidOperationException {
+  public void alterCatalog(String catName, Catalog cat) {
     if (!cat.getName().equals(catName)) {
-      throw new InvalidOperationException("You cannot change a catalog's name");
+      throw new HiveMetaRuntimeException("You cannot change a catalog's name: " + cat.getName() + " -> " + catName);

Review comment:
   Thanks for the input on the error text. I've altered it a bit to make it 
clearer.






Issue Time Tracking
---

Worklog Id: (was: 607873)
Time Spent: 2h 20m  (was: 2h 10m)

> Remove Thrift Exceptions From RawStore alterCatalog
> ---
>
> Key: HIVE-25128
> URL: https://issues.apache.org/jira/browse/HIVE-25128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> {code:java|title=RawStore.java}
>   /**
>* Alter an existing catalog.  Only description and location can be 
> changed, and the change of
>* location is for internal use only.
>* @param catName name of the catalog to alter.
>* @param cat new version of the catalog.
>* @throws MetaException something went wrong, usually in the database.
>* @throws InvalidOperationException attempt to change something about the 
> catalog that is not
>* changeable, like the name.
>*/
>   void alterCatalog(String catName, Catalog cat) throws MetaException, 
> InvalidOperationException;
> {code}
> Please check out parent task [HIVE-25126] for the motivation here, but I 
> would like to remove all Thrift-based exceptions from the {{RawStore}} 
> interface, including MetaException and InvalidOperationException. These 
> should be replaced with something that is specific to Hive and not tied to 
> the RPC layer.
> I propose introducing a RuntimeException called HiveMetaRuntimeException 
> and a subclass HiveMetaDataAccessException to replace these.
> HiveMetaDataAccessException = unable to load data from the underlying data store
> HiveMetaRuntimeException = generic exception for anything thrown by the 
> RawStore but not specifically handled
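A minimal sketch of the proposed hierarchy, based only on the description above (the constructors and the demo class are illustrative assumptions, not the actual patch):

```java
// Generic unchecked exception for RawStore failures not handled specifically.
class HiveMetaRuntimeException extends RuntimeException {
    HiveMetaRuntimeException(String message) {
        super(message);
    }

    HiveMetaRuntimeException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Thrown when the metastore's underlying data store cannot be accessed.
class HiveMetaDataAccessException extends HiveMetaRuntimeException {
    HiveMetaDataAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

public class ExceptionDemo {
    public static void main(String[] args) {
        try {
            throw new HiveMetaDataAccessException("datastore unreachable",
                    new java.io.IOException("connection refused"));
        } catch (HiveMetaRuntimeException e) {
            // Callers may catch the base type to handle any metastore failure.
            System.out.println(e.getMessage());
        }
    }
}
```

Because both types are unchecked, `RawStore` method signatures would no longer need to declare them, which is the point of removing the Thrift exceptions from the interface.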





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607870&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607870
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:44
Start Date: 07/Jun/21 13:44
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646601459



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -2039,4 +2189,29 @@ private static IntegerColumnStatistics deserializeIntColumnStatistics(List

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable the same feature in the vectorized ORC batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607869
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:42
Start Date: 07/Jun/21 13:42
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646598948



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -1748,7 +1946,7 @@ public int compareTo(CompressedOwid other) {
   assert shouldReadDeleteDeltasWithLlap(conf, true);
 }
 deleteReaderValue = new DeleteReaderValue(readerData.reader, deleteDeltaFile, readerOptions, bucket,
-        validWriteIdList, isBucketedTable, conf, keyInterval, orcSplit, numRows, cacheTag, fileId);
+        validWriteIdList, isBucketedTable, conf, keyInterval, orcSplit, numRows, cacheTag, fileId);

Review comment:
   unnecessary space






Issue Time Tracking
---

Worklog Id: (was: 607869)
Time Spent: 2.5h  (was: 2h 20m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable the same feature in the vectorized ORC batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607865
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:38
Start Date: 07/Jun/21 13:38
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646596265



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -959,6 +989,20 @@ public boolean next(NullWritable key, VectorizedRowBatch value) throws IOExcepti
       int ix = rbCtx.findVirtualColumnNum(VirtualColumn.ROWID);
       value.cols[ix] = recordIdColumnVector;
     }
+    if (rowIsDeletedProjected) {
+      if (fetchDeletedRows) {

Review comment:
   tbh we could even do the second check as part of the Set method (as we 
do already for cardinality 0) and simplify the logic here






Issue Time Tracking
---

Worklog Id: (was: 607865)
Time Spent: 2h 20m  (was: 2h 10m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable the same feature in the vectorized ORC batch 
> reader.





[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607866
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:38
Start Date: 07/Jun/21 13:38
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #2291:
URL: https://github.com/apache/hive/pull/2291#discussion_r646596253



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -656,10 +656,9 @@ public void createCatalog(Catalog cat) throws MetaException {
   }
 
   @Override
-  public void alterCatalog(String catName, Catalog cat)
-      throws MetaException, InvalidOperationException {
+  public void alterCatalog(String catName, Catalog cat) {
     if (!cat.getName().equals(catName)) {
-      throw new InvalidOperationException("You cannot change a catalog's name");
+      throw new HiveMetaRuntimeException("You cannot change a catalog's name: " + cat.getName() + " -> " + catName);

Review comment:
   @nrg4878 `InvalidOperationException` is a Thrift exception, so I'm 
trying to get it out of this class.
   
   I am open to also making this a POJO `IllegalArgumentException` as well, but 
this would apply to quite a few methods so it probably should be done in 
another PR.






Issue Time Tracking
---

Worklog Id: (was: 607866)
Time Spent: 2h 10m  (was: 2h)

> Remove Thrift Exceptions From RawStore alterCatalog
> ---
>
> Key: HIVE-25128
> URL: https://issues.apache.org/jira/browse/HIVE-25128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> {code:java|title=RawStore.java}
>   /**
>* Alter an existing catalog.  Only description and location can be 
> changed, and the change of
>* location is for internal use only.
>* @param catName name of the catalog to alter.
>* @param cat new version of the catalog.
>* @throws MetaException something went wrong, usually in the database.
>* @throws InvalidOperationException attempt to change something about the 
> catalog that is not
>* changeable, like the name.
>*/
>   void alterCatalog(String catName, Catalog cat) throws MetaException, 
> InvalidOperationException;
> {code}
> Please check out parent task [HIVE-25126] for the motivation here, but I 
> would like to remove all Thrift-based exceptions from the {{RawStore}} 
> interface, including MetaException and InvalidOperationException. These 
> should be replaced with something that is specific to Hive and not tied to 
> the RPC layer.
> I propose introducing a RuntimeException called HiveMetaRuntimeException 
> and a subclass HiveMetaDataAccessException to replace these.
> HiveMetaDataAccessException = unable to load data from the underlying data store
> HiveMetaRuntimeException = generic exception for anything thrown by the 
> RawStore but not specifically handled





[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607862&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607862
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:36
Start Date: 07/Jun/21 13:36
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #2291:
URL: https://github.com/apache/hive/pull/2291#discussion_r646593817



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaRuntimeException.java
##
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore;
+
+import com.google.errorprone.annotations.FormatMethod;
+
+/**
+ * This is the root of all Hive Runtime Exceptions. If a client can reasonably
+ * be expected to recover from an exception, make it a checked exception. If a
+ * client cannot do anything to recover from the exception, make it an 
unchecked
+ * exception.
+ */
+public class HiveMetaRuntimeException extends RuntimeException {

Review comment:
   @nrg4878 Yes.  My hope would be to use this as the base class for all 
HMS Runtime Exceptions.  DataAccess is just one such type which also serves to 
demonstrate the pattern I'm going for.






Issue Time Tracking
---

Worklog Id: (was: 607862)
Time Spent: 2h  (was: 1h 50m)

> Remove Thrift Exceptions From RawStore alterCatalog
> ---
>
> Key: HIVE-25128
> URL: https://issues.apache.org/jira/browse/HIVE-25128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {code:java|title=RawStore.java}
>   /**
>* Alter an existing catalog.  Only description and location can be 
> changed, and the change of
>* location is for internal use only.
>* @param catName name of the catalog to alter.
>* @param cat new version of the catalog.
>* @throws MetaException something went wrong, usually in the database.
>* @throws InvalidOperationException attempt to change something about the 
> catalog that is not
>* changeable, like the name.
>*/
>   void alterCatalog(String catName, Catalog cat) throws MetaException, 
> InvalidOperationException;
> {code}
> Please check out parent task [HIVE-25126] for the motivation here, but I 
> would like to remove all Thrift-based exceptions from the {{RawStore}} 
> interface, including MetaException and InvalidOperationException. These 
> should be replaced with something that is specific to Hive and not tied to 
> the RPC layer.
> I propose introducing a RuntimeException called HiveMetaRuntimeException 
> and a subclass HiveMetaDataAccessException to replace these.
> HiveMetaDataAccessException = unable to load data from the underlying data store
> HiveMetaRuntimeException = generic exception for anything thrown by the 
> RawStore but not specifically handled





[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607860&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607860
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:35
Start Date: 07/Jun/21 13:35
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2291:
URL: https://github.com/apache/hive/pull/2291#issuecomment-855934805


   @nrg4878 This is a really heavy commit to change all the method signatures, 
though I do have another PR ready to go: #2290.
   
   As you've seen from the initial interest in this patch, I didn't want to put 
the time into all of the work without some acknowledgement and buy-in on the 
current direction I'm trying to take this in.  I still prefer the piecemeal 
approach for that reason. This PR is just a bit more foundational because it 
includes the Exceptions I would like to move forward with.




Issue Time Tracking
---

Worklog Id: (was: 607860)
Time Spent: 1h 50m  (was: 1h 40m)

> Remove Thrift Exceptions From RawStore alterCatalog
> ---
>
> Key: HIVE-25128
> URL: https://issues.apache.org/jira/browse/HIVE-25128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code:java|title=RawStore.java}
>   /**
>* Alter an existing catalog.  Only description and location can be 
> changed, and the change of
>* location is for internal use only.
>* @param catName name of the catalog to alter.
>* @param cat new version of the catalog.
>* @throws MetaException something went wrong, usually in the database.
>* @throws InvalidOperationException attempt to change something about the 
> catalog that is not
>* changeable, like the name.
>*/
>   void alterCatalog(String catName, Catalog cat) throws MetaException, 
> InvalidOperationException;
> {code}
> Please check out parent task [HIVE-25126] for the motivation here, but I 
> would like to remove all Thrift-based Exceptions from the {{RawStore}} 
> interface to include MetaException and InvalidOperationException. These 
> should be replaced with something that is specific to Hive and not tied to 
> the RPC layer.
> I propose instead introducing RuntimeExceptions called 
> HiveMetaRuntimeException and sub-class HiveMetaDataAccessException to replace 
> these.
> HiveMetaDataAccessException  = Unable to load data from underlying data store
> HiveMetaRuntimeException = Generic exception for something that was thrown by 
> the RawStore but not specifically handled





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607859&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607859
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:34
Start Date: 07/Jun/21 13:34
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646592844



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -983,7 +1027,7 @@ private void copyFromBase(VectorizedRowBatch value) {
       System.arraycopy(payloadStruct.fields, 0, value.cols, 0, value.getDataColumnCount());
     }
     if (rowIdProjected) {
-      recordIdColumnVector.fields[0] = vectorizedRowBatchBase.cols[OrcRecordUpdater.ORIGINAL_WRITEID];
+      recordIdColumnVector.fields[0] = vectorizedRowBatchBase.cols[fetchDeletedRows ? OrcRecordUpdater.CURRENT_WRITEID : OrcRecordUpdater.ORIGINAL_WRITEID];

Review comment:
   would love a comment about the different WRITEID here






Issue Time Tracking
---

Worklog Id: (was: 607859)
Time Spent: 2h 10m  (was: 2h)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable the same feature in the vectorized ORC batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607858&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607858
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:34
Start Date: 07/Jun/21 13:34
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646592084



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -959,6 +989,20 @@ public boolean next(NullWritable key, VectorizedRowBatch value) throws IOExcepti
       int ix = rbCtx.findVirtualColumnNum(VirtualColumn.ROWID);
       value.cols[ix] = recordIdColumnVector;
     }
+    if (rowIsDeletedProjected) {
+      if (fetchDeletedRows) {

Review comment:
   if (!fetchDeletedRows || notDeletedBitSet.cardinality() == vectorizedRowBatchBase.size)
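The combined guard suggested above can be sketched in isolation (the method wrapper and variable names are illustrative; in the reader this would be an inline condition): the deleted-row projection work is skipped either when deleted-row fetching is off or when no row in the batch was deleted.

```java
import java.util.BitSet;

public class DeletedRowsGuard {
    // True when the deleted-rows projection can be skipped entirely:
    // either the feature is off, or every row in the batch survived.
    static boolean skipDeletedProjection(boolean fetchDeletedRows,
                                         BitSet notDeletedBitSet, int batchSize) {
        return !fetchDeletedRows || notDeletedBitSet.cardinality() == batchSize;
    }

    public static void main(String[] args) {
        BitSet notDeleted = new BitSet();
        notDeleted.set(0, 10); // all 10 rows of the batch survive

        System.out.println(skipDeletedProjection(true, notDeleted, 10)); // no deletes

        notDeleted.clear(3);   // row 3 was deleted
        System.out.println(skipDeletedProjection(true, notDeleted, 10));
    }
}
```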






Issue Time Tracking
---

Worklog Id: (was: 607858)
Time Spent: 2h  (was: 1h 50m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable the same feature in the vectorized ORC batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607854&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607854
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:32
Start Date: 07/Jun/21 13:32
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646590439



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -959,6 +989,20 @@ public boolean next(NullWritable key, VectorizedRowBatch value) throws IOExcepti
       int ix = rbCtx.findVirtualColumnNum(VirtualColumn.ROWID);

Review comment:
   we could probably do the same optimization here






Issue Time Tracking
---

Worklog Id: (was: 607854)
Time Spent: 1h 50m  (was: 1h 40m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607856&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607856
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:32
Start Date: 07/Jun/21 13:32
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #2291:
URL: https://github.com/apache/hive/pull/2291#discussion_r646590431



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaDataAccessException.java
##
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore;
+
+/**
+ * Hive Exception when the metastore's underlying storage mechanism cannot be
+ * accessed.
+ */
+public class HiveMetaDataAccessException extends HiveMetaRuntimeException {

Review comment:
   @nrg4878 Hey, thanks for the review.
   
   I was looking at some other frameworks for inspiration here.  I think the 
idea here is that even on a WRITE operation, if the database is down or there 
is a timeout to acquire the lock for writing, this is still an "access" 
exception.  I would love to do some sub-classing and make errors more specific 
to certain conditions, but I think the current naming is a good starting point.
   
   
https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/dao/DataAccessException.html






Issue Time Tracking
---

Worklog Id: (was: 607856)
Time Spent: 1h 40m  (was: 1.5h)

> Remove Thrift Exceptions From RawStore alterCatalog
> ---
>
> Key: HIVE-25128
> URL: https://issues.apache.org/jira/browse/HIVE-25128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {code:java|title=RawStore.java}
>   /**
>* Alter an existing catalog.  Only description and location can be 
> changed, and the change of
>* location is for internal use only.
>* @param catName name of the catalog to alter.
>* @param cat new version of the catalog.
>* @throws MetaException something went wrong, usually in the database.
>* @throws InvalidOperationException attempt to change something about the 
> catalog that is not
>* changeable, like the name.
>*/
>   void alterCatalog(String catName, Catalog cat) throws MetaException, 
> InvalidOperationException;
> {code}
> Please check out parent task [HIVE-25126] for the motivation here, but I 
> would like to remove all Thrift-based Exceptions from the {{RawStore}} 
> interface to include MetaException and InvalidOperationException. These 
> should be replaced with something that is specific to Hive and not tied to 
> the RPC layer.
> I propose instead introducing RuntimeExceptions called 
> HiveMetaRuntimeException and sub-class HiveMetaDataAccessException to replace 
> these.
> HiveMetaDataAccessException  = Unable to load data from underlying data store
> HiveMetaRuntimeException = Generic exception for something that was thrown by 
> the RawStore but not specifically handled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607855
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:32
Start Date: 07/Jun/21 13:32
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #2291:
URL: https://github.com/apache/hive/pull/2291#discussion_r646590431



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaDataAccessException.java
##
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore;
+
+/**
+ * Hive Exception when the metastore's underlying storage mechanism cannot be
+ * accessed.
+ */
+public class HiveMetaDataAccessException extends HiveMetaRuntimeException {

Review comment:
   @nrg4878 Hey, thanks for the review.
   
   I was looking at some other frameworks for inspiration here. I think the 
idea here is that even on a WRITE operation, if the database is down or there 
is a timeout while acquiring the lock for writing, it is still an "access" 
exception. I would love to do some sub-classing and make the errors more 
specific to certain conditions, but I think this is a good starting point.
   
   
https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/dao/DataAccessException.html






Issue Time Tracking
---

Worklog Id: (was: 607855)
Time Spent: 1.5h  (was: 1h 20m)

> Remove Thrift Exceptions From RawStore alterCatalog
> ---
>
> Key: HIVE-25128
> URL: https://issues.apache.org/jira/browse/HIVE-25128
> Project: Hive
>  Issue Type: Sub-task
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {code:java|title=RawStore.java}
>   /**
>* Alter an existing catalog.  Only description and location can be 
> changed, and the change of
>* location is for internal use only.
>* @param catName name of the catalog to alter.
>* @param cat new version of the catalog.
>* @throws MetaException something went wrong, usually in the database.
>* @throws InvalidOperationException attempt to change something about the 
> catalog that is not
>* changeable, like the name.
>*/
>   void alterCatalog(String catName, Catalog cat) throws MetaException, 
> InvalidOperationException;
> {code}
> Please check out parent task [HIVE-25126] for the motivation here, but I 
> would like to remove all Thrift-based Exceptions from the {{RawStore}} 
> interface to include MetaException and InvalidOperationException. These 
> should be replaced with something that is specific to Hive and not tied to 
> the RPC layer.
> I propose instead introducing RuntimeExceptions called 
> HiveMetaRuntimeException and sub-class HiveMetaDataAccessException to replace 
> these.
> HiveMetaDataAccessException  = Unable to load data from underlying data store
> HiveMetaRuntimeException = Generic exception for something that was thrown by 
> the RawStore but not specifically handled





[jira] [Updated] (HIVE-25208) Refactor Iceberg commit to the MoveTask/MoveWork

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25208:
--
Labels: pull-request-available  (was: )

> Refactor Iceberg commit to the MoveTask/MoveWork
> 
>
> Key: HIVE-25208
> URL: https://issues.apache.org/jira/browse/HIVE-25208
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Instead of committing Iceberg changes in `DefaultMetaHook.preCommitInsert`, we 
> should commit in MoveWork so that we use the same flow as normal tables.





[jira] [Work logged] (HIVE-25208) Refactor Iceberg commit to the MoveTask/MoveWork

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25208?focusedWorklogId=607845&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607845
 ]

ASF GitHub Bot logged work on HIVE-25208:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 13:04
Start Date: 07/Jun/21 13:04
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #2359:
URL: https://github.com/apache/hive/pull/2359


   ### What changes were proposed in this pull request?
   We should use the MoveTask to commit the changes (inserts/insert overwrites)
   
   ### Why are the changes needed?
   MoveTask is used for multiple things, like stat generation. When we removed 
the Move tasks, we caused several unseen issues. We should reintroduce the 
MoveTask.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Unit tests




Issue Time Tracking
---

Worklog Id: (was: 607845)
Remaining Estimate: 0h
Time Spent: 10m

> Refactor Iceberg commit to the MoveTask/MoveWork
> 
>
> Key: HIVE-25208
> URL: https://issues.apache.org/jira/browse/HIVE-25208
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Instead of committing Iceberg changes in `DefaultMetaHook.preCommitInsert`, we 
> should commit in MoveWork so that we use the same flow as normal tables.





[jira] [Updated] (HIVE-25193) Vectorized Query Execution: ClassCastException when use nvl() function which default_value is decimal type

2021-06-07 Thread qiang.bi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qiang.bi updated HIVE-25193:

Status: Patch Available  (was: Open)

> Vectorized Query Execution: ClassCastException when use nvl() function which 
> default_value is decimal type
> --
>
> Key: HIVE-25193
> URL: https://issues.apache.org/jira/browse/HIVE-25193
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: qiang.bi
>Assignee: qiang.bi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25193.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Problem statement:
> {code:java}
> set hive.vectorized.execution.enabled = true;
> select nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) 
> corrected_price,
> from dw_mdm_sync_asset;
> {code}
>  The error log:
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>  at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setElement(BytesColumnVector.java:504)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorCoalesce.evaluate(VectorCoalesce.java:124)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.CastStringToDouble.evaluate(CastStringToDouble.java:83)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ... 28 more{code}
>  The problem HiveQL:
> {code:java}
> nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) corrected_price
> {code}
>  The problem expression:
> {code:java}
> CastStringToDouble(col 39:string)(children: VectorCoalesce(columns [37, 
> 38])(children: VectorUDFAdaptor(get_json_object(_col14, '$.correctedPrice')) 
> -> 37:string, ConstantVectorExpression(val 0.88) -> 38:decimal(2,2)) -> 
> 39:string) -> 40:double
> {code}
>  The problem code:
> {code:java}
> public class VectorCoalesce extends VectorExpression {
>   ...
>   @Override
>   public void evaluate(VectorizedRowBatch batch) throws HiveException {
>     if (childExpressions != null) {
>       super.evaluateChildren(batch);
>     }
>     int[] sel = batch.selected;
>     int n = batch.size;
>     ColumnVector outputColVector = batch.cols[outputColumnNum];
>     boolean[] outputIsNull = outputColVector.isNull;
>     if (n <= 0) {
>       // Nothing to do
>       return;
>     }
>     if (unassignedBatchIndices == null || n > unassignedBatchIndices.length) {
>       // (Re)allocate larger to be a multiple of 1024 (DEFAULT_SIZE).
>       final int roundUpSize =
>           ((n + VectorizedRowBatch.DEFAULT_SIZE - 1) / VectorizedRowBatch.DEFAULT_SIZE)
>           * VectorizedRowBatch.DEFAULT_SIZE;
>       unassignedBatchIndices = new int[roundUpSize];
>     }
>     // We do not need to do a column reset since we are carefully changing the output.
>     outputColVector.isRepeating = false;
>     // CONSIDER: Should be do this for all vector expressions that can
>     //           work on BytesColumnVector output columns???
>     outputColVector.init();
>     final int columnCount = inputColumns.length;
>     /*
>      * Process the input columns to find a non-NULL value for each row.
>      *
>      * We track the unassigned batchIndex of the rows that have not received
>      * a non-NULL value yet.  Similar to a selected array.
>      */
>     boolean isAllUnassigned = true;
>     int unassignedColumnCount = 0;
>     for (int k = 0; k < inputColumns.length; k++) {
>       ColumnVector cv = batch.cols[inputColumns[k]];
>       if (cv.isRepeating) {
>         if (cv.noNulls || !cv.isNull[0]) {
>           /*
>            * With a repeating value we can finish all remaining rows.
>            */
>           if (isAllUnassigned) {
>             // No other columns provided non-NULL values.  We can return repeated output.
>             outputIsNull[0] = false;
>             outputColVector.setElement(0, 0, cv);
>             outputColVector.isRepeating = true;
>             return;
>           } else {
>             // Some rows have already been assigned values.  Assign the remaining.
>             // We cannot use copySelected method here.
>             for (int i = 0; i < unassignedColumnCount; i++) {
>               final int batchIndex = unassignedBatchIndices[i];
>               outputIsNull[batchIndex]
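The cast failure reported above can be modeled outside Hive. The sketch below uses hypothetical classes that only mirror the shape of Hive's column vectors (they are not the real implementations): the output vector's `setElement` assumes the input vector has the same concrete type, so when `nvl`'s default value arrives as a decimal-typed constant while the coalesce output is string-typed, the blind cast fails exactly like the stack trace shows.

```java
// Simplified model of the type mismatch; hypothetical classes, not Hive's actual vectors.
abstract class ColumnVector { }

class BytesColumnVector extends ColumnVector {
  String[] vector = new String[1];

  // Mirrors the shape of BytesColumnVector.setElement: it casts the input
  // vector to its own type before copying, so a mismatched input type throws.
  void setElement(int outIdx, int inIdx, ColumnVector input) {
    BytesColumnVector in = (BytesColumnVector) input; // ClassCastException here
    vector[outIdx] = in.vector[inIdx];
  }
}

class DecimalColumnVector extends ColumnVector {
  java.math.BigDecimal[] vector = { new java.math.BigDecimal("0.88") };
}

public class CoalesceCastDemo {
  public static void main(String[] args) {
    BytesColumnVector out = new BytesColumnVector();
    try {
      // nvl(string_expr, 0.88): the constant default arrives as a decimal vector.
      out.setElement(0, 0, new DecimalColumnVector());
    } catch (ClassCastException e) {
      System.out.println("ClassCastException, as in the report");
    }
  }
}
```

The fix direction implied by the patch is to make the constant's output type agree with the coalesce result type instead of relying on the cast.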

[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607840
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:53
Start Date: 07/Jun/21 12:53
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646558980



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -948,7 +978,7 @@ public boolean next(NullWritable key, VectorizedRowBatch 
value) throws IOExcepti
   // This loop fills up the selected[] vector with all the index positions 
that are selected.
   for (int setBitIndex = selectedBitSet.nextSetBit(0), selectedItr = 0;
setBitIndex >= 0;
-   setBitIndex = selectedBitSet.nextSetBit(setBitIndex+1), 
++selectedItr) {
+   setBitIndex = selectedBitSet.nextSetBit(setBitIndex + 1), 
++selectedItr) {

Review comment:
   change not needed
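The hunk under review uses the standard `BitSet` traversal idiom to fill the `selected[]` array with set-bit positions. A standalone sketch of that idiom (illustrative only, outside Hive):

```java
import java.util.Arrays;
import java.util.BitSet;

public class SelectedFromBitSet {
  // Collect the indices of all set bits, the way the reader fills selected[].
  static int[] selected(BitSet bits) {
    int[] sel = new int[bits.cardinality()];
    for (int setBitIndex = bits.nextSetBit(0), selectedItr = 0;
        setBitIndex >= 0;
        setBitIndex = bits.nextSetBit(setBitIndex + 1), ++selectedItr) {
      sel[selectedItr] = setBitIndex;
    }
    return sel;
  }

  public static void main(String[] args) {
    BitSet bits = new BitSet();
    bits.set(1);
    bits.set(4);
    bits.set(7);
    System.out.println(Arrays.toString(selected(bits))); // [1, 4, 7]
  }
}
```

`nextSetBit(i + 1)` returns -1 once no set bit remains, which terminates the loop without scanning the whole bit range.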






Issue Time Tracking
---

Worklog Id: (was: 607840)
Time Spent: 1h 40m  (was: 1.5h)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607839
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:52
Start Date: 07/Jun/21 12:52
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646558742



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -932,8 +960,10 @@ public boolean next(NullWritable key, VectorizedRowBatch 
value) throws IOExcepti
 }
 
 // Case 2- find rows which have been deleted.
+BitSet notDeletedBitSet = fetchDeletedRows ? (BitSet) 
selectedBitSet.clone() : selectedBitSet;

Review comment:
   lets add a comment above saying when/why we clone the BitSet
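The clone in the hunk matters because the deleted-row detection clears bits for deleted rows; when deleted rows must still be returned (`fetchDeletedRows`), the original selection has to survive so those rows can be emitted with `ROW__IS__DELETED` set. A minimal standalone illustration with plain `java.util.BitSet` (outside Hive; the variable names echo the patch but the scenario is assumed):

```java
import java.util.BitSet;

public class CloneBitSetDemo {
  public static void main(String[] args) {
    BitSet selectedBitSet = new BitSet();
    selectedBitSet.set(0, 4); // rows 0..3 selected by the reader

    boolean fetchDeletedRows = true;
    // Work on a copy so marking deletions does not drop rows from the
    // original selection; without fetchDeletedRows we can mutate in place.
    BitSet notDeletedBitSet =
        fetchDeletedRows ? (BitSet) selectedBitSet.clone() : selectedBitSet;
    notDeletedBitSet.clear(2); // row 2 was deleted

    System.out.println(selectedBitSet.cardinality());   // 4: all rows still emitted
    System.out.println(notDeletedBitSet.cardinality()); // 3: drives ROW__IS__DELETED
  }
}
```

If the clone were skipped while `fetchDeletedRows` is true, row 2 would silently vanish from the batch instead of surfacing as a deleted row.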






Issue Time Tracking
---

Worklog Id: (was: 607839)
Time Spent: 1.5h  (was: 1h 20m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Updated] (HIVE-25193) Vectorized Query Execution: ClassCastException when use nvl() function which default_value is decimal type

2021-06-07 Thread qiang.bi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qiang.bi updated HIVE-25193:

Attachment: HIVE-25193.1.patch

> Vectorized Query Execution: ClassCastException when use nvl() function which 
> default_value is decimal type
> --
>
> Key: HIVE-25193
> URL: https://issues.apache.org/jira/browse/HIVE-25193
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: qiang.bi
>Assignee: qiang.bi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-25193.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Problem statement:
> {code:java}
> set hive.vectorized.execution.enabled = true;
> select nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) 
> corrected_price,
> from dw_mdm_sync_asset;
> {code}
>  The error log:
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector
>  at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setElement(BytesColumnVector.java:504)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorCoalesce.evaluate(VectorCoalesce.java:124)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.CastStringToDouble.evaluate(CastStringToDouble.java:83)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ... 28 more{code}
>  The problem HiveQL:
> {code:java}
> nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) corrected_price
> {code}
>  The problem expression:
> {code:java}
> CastStringToDouble(col 39:string)(children: VectorCoalesce(columns [37, 
> 38])(children: VectorUDFAdaptor(get_json_object(_col14, '$.correctedPrice')) 
> -> 37:string, ConstantVectorExpression(val 0.88) -> 38:decimal(2,2)) -> 
> 39:string) -> 40:double
> {code}
>  The problem code:
> {code:java}
> public class VectorCoalesce extends VectorExpression {
>   ...
>   @Override
>   public void evaluate(VectorizedRowBatch batch) throws HiveException {
>     if (childExpressions != null) {
>       super.evaluateChildren(batch);
>     }
>     int[] sel = batch.selected;
>     int n = batch.size;
>     ColumnVector outputColVector = batch.cols[outputColumnNum];
>     boolean[] outputIsNull = outputColVector.isNull;
>     if (n <= 0) {
>       // Nothing to do
>       return;
>     }
>     if (unassignedBatchIndices == null || n > unassignedBatchIndices.length) {
>       // (Re)allocate larger to be a multiple of 1024 (DEFAULT_SIZE).
>       final int roundUpSize =
>           ((n + VectorizedRowBatch.DEFAULT_SIZE - 1) / VectorizedRowBatch.DEFAULT_SIZE)
>           * VectorizedRowBatch.DEFAULT_SIZE;
>       unassignedBatchIndices = new int[roundUpSize];
>     }
>     // We do not need to do a column reset since we are carefully changing the output.
>     outputColVector.isRepeating = false;
>     // CONSIDER: Should be do this for all vector expressions that can
>     //           work on BytesColumnVector output columns???
>     outputColVector.init();
>     final int columnCount = inputColumns.length;
>     /*
>      * Process the input columns to find a non-NULL value for each row.
>      *
>      * We track the unassigned batchIndex of the rows that have not received
>      * a non-NULL value yet.  Similar to a selected array.
>      */
>     boolean isAllUnassigned = true;
>     int unassignedColumnCount = 0;
>     for (int k = 0; k < inputColumns.length; k++) {
>       ColumnVector cv = batch.cols[inputColumns[k]];
>       if (cv.isRepeating) {
>         if (cv.noNulls || !cv.isNull[0]) {
>           /*
>            * With a repeating value we can finish all remaining rows.
>            */
>           if (isAllUnassigned) {
>             // No other columns provided non-NULL values.  We can return repeated output.
>             outputIsNull[0] = false;
>             outputColVector.setElement(0, 0, cv);
>             outputColVector.isRepeating = true;
>             return;
>           } else {
>             // Some rows have already been assigned values.  Assign the remaining.
>             // We cannot use copySelected method here.
>             for (int i = 0; i < unassignedColumnCount; i++) {
>               final int batchIndex = unassignedBatchIndices[i];
>               outputIsNull[batchIndex] =

[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607837
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:49
Start Date: 07/Jun/21 12:49
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646556329



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -303,6 +314,12 @@ private VectorizedOrcAcidRowBatchReader(JobConf conf, 
OrcSplit orcSplit, Reporte
   VectorizedRowBatch.DEFAULT_SIZE, null, null, null);
 }
 rowIdProjected = areRowIdsProjected(rbCtx);
+rowIsDeletedProjected = isVirtualColumnProjected(rbCtx, 
VirtualColumn.ROWISDELETED);
+if (rowIsDeletedProjected) {
+  rowIsDeletedVector = new RowIsDeletedColumnVector();

Review comment:
   Lets explicitly pass VectorizedRowBatch.DEFAULT_SIZE to make this more 
obvious






Issue Time Tracking
---

Worklog Id: (was: 607837)
Time Spent: 1h 20m  (was: 1h 10m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607833&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607833
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:47
Start Date: 07/Jun/21 12:47
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646554194



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -892,13 +913,20 @@ public boolean next(NullWritable key, VectorizedRowBatch 
value) throws IOExcepti
 } catch (Exception e) {
   throw new IOException("error iterating", e);
 }
-if(!includeAcidColumns) {
+if (!includeAcidColumns) {
   //if here, we don't need to filter anything wrt acid metadata columns
   //in fact, they are not even read from file/llap
   value.size = vectorizedRowBatchBase.size;
   value.selected = vectorizedRowBatchBase.selected;
   value.selectedInUse = vectorizedRowBatchBase.selectedInUse;
   copyFromBase(value);
+
+  if (rowIsDeletedProjected) {
+rowIsDeletedVector.clear();
+int ix = rbCtx.findVirtualColumnNum(VirtualColumn.ROWISDELETED);

Review comment:
   Why do we have to recompute this for every batch? Lets store this along 
with rowIsDeletedProjected flag






Issue Time Tracking
---

Worklog Id: (was: 607833)
Time Spent: 1h  (was: 50m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607834&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607834
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:47
Start Date: 07/Jun/21 12:47
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646554194



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -892,13 +913,20 @@ public boolean next(NullWritable key, VectorizedRowBatch 
value) throws IOExcepti
 } catch (Exception e) {
   throw new IOException("error iterating", e);
 }
-if(!includeAcidColumns) {
+if (!includeAcidColumns) {
   //if here, we don't need to filter anything wrt acid metadata columns
   //in fact, they are not even read from file/llap
   value.size = vectorizedRowBatchBase.size;
   value.selected = vectorizedRowBatchBase.selected;
   value.selectedInUse = vectorizedRowBatchBase.selectedInUse;
   copyFromBase(value);
+
+  if (rowIsDeletedProjected) {
+rowIsDeletedVector.clear();
+int ix = rbCtx.findVirtualColumnNum(VirtualColumn.ROWISDELETED);

Review comment:
   Why do we have to recompute **ix** for every batch? Lets store this 
along with rowIsDeletedProjected flag






Issue Time Tracking
---

Worklog Id: (was: 607834)
Time Spent: 1h 10m  (was: 1h)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607832&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607832
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:43
Start Date: 07/Jun/21 12:43
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646551472



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -303,6 +314,12 @@ private VectorizedOrcAcidRowBatchReader(JobConf conf, 
OrcSplit orcSplit, Reporte
   VectorizedRowBatch.DEFAULT_SIZE, null, null, null);
 }
 rowIdProjected = areRowIdsProjected(rbCtx);
+rowIsDeletedProjected = isVirtualColumnProjected(rbCtx, 
VirtualColumn.ROWISDELETED);

Review comment:
   lets move this to a Utility function as areRowIdsProjected() above






Issue Time Tracking
---

Worklog Id: (was: 607832)
Time Spent: 50m  (was: 40m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in vectorized orc batch 
> reader.





[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607830&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607830
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:41
Start Date: 07/Jun/21 12:41
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646550206



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -281,16 +285,23 @@ private VectorizedOrcAcidRowBatchReader(JobConf conf, 
OrcSplit orcSplit, Reporte
 deleteEventReaderOptions.range(0, Long.MAX_VALUE);
 deleteEventReaderOptions.searchArgument(null, null);
 keyInterval = findMinMaxKeys(orcSplit, conf, deleteEventReaderOptions);
+fetchDeletedRows = conf.getBoolean(Constants.ACID_FETCH_DELETED_ROWS, 
false);
 DeleteEventRegistry der;
 try {
   // See if we can load all the relevant delete events from all the
   // delete deltas in memory...
+  ColumnizedDeleteEventRegistry.OriginalWriteIdLoader writeIdLoader;
+  if (fetchDeletedRows) {
+writeIdLoader = new ColumnizedDeleteEventRegistry.BothWriteIdLoader();

Review comment:
   Maybe rename to something more explicit like 
OriginalAndCurrentWriteIdLoader?
   
   Also lets add some comment above explaining the logic






Issue Time Tracking
---

Worklog Id: (was: 607830)
Time Spent: 40m  (was: 0.5h)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in the vectorized ORC batch 
> reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25208) Refactor Iceberg commit to the MoveTask/MoveWork

2021-06-07 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-25208:
-


> Refactor Iceberg commit to the MoveTask/MoveWork
> 
>
> Key: HIVE-25208
> URL: https://issues.apache.org/jira/browse/HIVE-25208
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> Instead of committing Iceberg changes in `DefaultMetaHook.preCommitInsert` we 
> should commit in MoveWork so we are using the same flow as normal tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25193) Vectorized Query Execution: ClassCastException when use nvl() function which default_value is decimal type

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25193:
--
Labels: pull-request-available  (was: )

> Vectorized Query Execution: ClassCastException when use nvl() function which 
> default_value is decimal type
> --
>
> Key: HIVE-25193
> URL: https://issues.apache.org/jira/browse/HIVE-25193
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: qiang.bi
>Assignee: qiang.bi
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Problem statement:
> {code:java}
> set hive.vectorized.execution.enabled = true;
> select nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) 
> corrected_price,
> from dw_mdm_sync_asset;
> {code}
>  The error log:
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVectorCaused by: 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setElement(BytesColumnVector.java:504)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorCoalesce.evaluate(VectorCoalesce.java:124)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.CastStringToDouble.evaluate(CastStringToDouble.java:83)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ... 28 more{code}
>  The problem HiveQL:
> {code:java}
> nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) corrected_price
> {code}
>  The problem expression:
> {code:java}
> CastStringToDouble(col 39:string)(children: VectorCoalesce(columns [37, 
> 38])(children: VectorUDFAdaptor(get_json_object(_col14, '$.correctedPrice')) 
> -> 37:string, ConstantVectorExpression(val 0.88) -> 38:decimal(2,2)) -> 
> 39:string) -> 40:double
> {code}
>  The problem code:
> {code:java}
> public class VectorCoalesce extends VectorExpression {
>   ...
>   @Override
>   public void evaluate(VectorizedRowBatch batch) throws HiveException {
>     if (childExpressions != null) {
>       super.evaluateChildren(batch);
>     }
>     int[] sel = batch.selected;
>     int n = batch.size;
>     ColumnVector outputColVector = batch.cols[outputColumnNum];
>     boolean[] outputIsNull = outputColVector.isNull;
>     if (n <= 0) {
>       // Nothing to do
>       return;
>     }
>     if (unassignedBatchIndices == null || n > unassignedBatchIndices.length) {
>       // (Re)allocate larger to be a multiple of 1024 (DEFAULT_SIZE).
>       final int roundUpSize =
>           ((n + VectorizedRowBatch.DEFAULT_SIZE - 1) / VectorizedRowBatch.DEFAULT_SIZE)
>           * VectorizedRowBatch.DEFAULT_SIZE;
>       unassignedBatchIndices = new int[roundUpSize];
>     }
>     // We do not need to do a column reset since we are carefully changing the output.
>     outputColVector.isRepeating = false;
>     // CONSIDER: Should be do this for all vector expressions that can
>     //   work on BytesColumnVector output columns???
>     outputColVector.init();
>     final int columnCount = inputColumns.length;
>     /*
>      * Process the input columns to find a non-NULL value for each row.
>      *
>      * We track the unassigned batchIndex of the rows that have not received
>      * a non-NULL value yet.  Similar to a selected array.
>      */
>     boolean isAllUnassigned = true;
>     int unassignedColumnCount = 0;
>     for (int k = 0; k < inputColumns.length; k++) {
>       ColumnVector cv = batch.cols[inputColumns[k]];
>       if (cv.isRepeating) {
>         if (cv.noNulls || !cv.isNull[0]) {
>           /*
>            * With a repeating value we can finish all remaining rows.
>            */
>           if (isAllUnassigned) {
>             // No other columns provided non-NULL values.  We can return repeated output.
>             outputIsNull[0] = false;
>             outputColVector.setElement(0, 0, cv);
>             outputColVector.isRepeating = true;
>             return;
>           } else {
>             // Some rows have already been assigned values.  Assign the remaining.
>             // We cannot use copySelected method here.
>             for (int i = 0; i < unassignedColumnCount; i++) {
>               final int batchIndex = unassignedBatchIndices[i];
>               outputIsNull[batchIndex] = false;  // Our 
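The roundUpSize arithmetic in the quoted code rounds the batch size up to the next multiple of VectorizedRowBatch.DEFAULT_SIZE. A standalone check of that expression (the class and method names here are mine; 1024 is the DEFAULT_SIZE value the quoted comment states):

```java
public class RoundUpDemo {
    // Mirrors ((n + DEFAULT_SIZE - 1) / DEFAULT_SIZE) * DEFAULT_SIZE from
    // the VectorCoalesce snippet above; assumes DEFAULT_SIZE == 1024.
    static final int DEFAULT_SIZE = 1024;

    static int roundUp(int n) {
        return ((n + DEFAULT_SIZE - 1) / DEFAULT_SIZE) * DEFAULT_SIZE;
    }

    public static void main(String[] args) {
        // Any n in (0, 1024] rounds to 1024; 1025 rounds to 2048.
        if (roundUp(1) != 1024) throw new AssertionError();
        if (roundUp(1024) != 1024) throw new AssertionError();
        if (roundUp(1025) != 2048) throw new AssertionError();
        System.out.println("ok");
    }
}
```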

[jira] [Work logged] (HIVE-25193) Vectorized Query Execution: ClassCastException when use nvl() function which default_value is decimal type

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25193?focusedWorklogId=607828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607828
 ]

ASF GitHub Bot logged work on HIVE-25193:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:38
Start Date: 07/Jun/21 12:38
Worklog Time Spent: 10m 
  Work Description: FoolishWall opened a new pull request #2358:
URL: https://github.com/apache/hive/pull/2358


   
   
   ### What changes were proposed in this pull request?
   
   Changed the evaluate() function in 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorCoalesce.java.
   
   ### Why are the changes needed?
   
   When hive.vectorized.execution.enabled is set to true and the nvl() function 
is used with a decimal-typed default_value, the error log is as follows:
   
   `Caused by: java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.BytesColumnVectorCaused by: 
java.lang.ClassCastException: 
org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector at 
org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setElement(BytesColumnVector.java:504)
 at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorCoalesce.evaluate(VectorCoalesce.java:124)
 at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
 at 
org.apache.hadoop.hive.ql.exec.vector.expressions.CastStringToDouble.evaluate(CastStringToDouble.java:83)
 at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
 ... 28 more`
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Precommit tests.
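As a hedged side note (not part of this PR), the failure can be avoided on affected versions by keeping both nvl() branches as strings, so VectorCoalesce only handles BytesColumnVector inputs; this sketch reuses the table and column names from the repro above:

```sql
set hive.vectorized.execution.enabled = true;

-- Passing the default as a string keeps VectorCoalesce's inputs uniformly
-- string-typed; the cast to double then happens after the coalesce.
select cast(nvl(get_json_object(attr_json, '$.correctedPrice'), '0.88') as double)
       corrected_price
from dw_mdm_sync_asset;
```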


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607828)
Remaining Estimate: 0h
Time Spent: 10m

> Vectorized Query Execution: ClassCastException when use nvl() function which 
> default_value is decimal type
> --
>
> Key: HIVE-25193
> URL: https://issues.apache.org/jira/browse/HIVE-25193
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: qiang.bi
>Assignee: qiang.bi
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Problem statement:
> {code:java}
> set hive.vectorized.execution.enabled = true;
> select nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) 
> corrected_price,
> from dw_mdm_sync_asset;
> {code}
>  The error log:
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVectorCaused by: 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector at 
> org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setElement(BytesColumnVector.java:504)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorCoalesce.evaluate(VectorCoalesce.java:124)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.CastStringToDouble.evaluate(CastStringToDouble.java:83)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
>  ... 28 more{code}
>  The problem HiveQL:
> {code:java}
> nvl(get_json_object(attr_json,'$.correctedPrice'),0.88) corrected_price
> {code}
>  The problem expression:
> {code:java}
> CastStringToDouble(col 39:string)(children: VectorCoalesce(columns [37, 
> 38])(children: VectorUDFAdaptor(get_json_object(_col14, '$.correctedPrice')) 
> -> 37:string, ConstantVectorExpression(val 0.88) -> 38:decimal(2,2)) -> 
> 39:string) -> 40:double
> {code}
>  The problem code:
> {code:java}
> public class VectorCoalesce extends VectorExpression {  
>   ...   
>   @Override
>   public void evaluate(VectorizedRowBatch batch) throws HiveException {if 
> (childExpressions != null) {
>   super.evaluateChildren(batch);
> }int[] sel = batch.selected;
> int n = batch.size;
> ColumnVector outputColVector = batch.cols[outputColumnNum];
> boolean[] outputIsNull = outputColVector.isNull;
> if (n <= 0) {
> 

[jira] [Resolved] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér resolved HIVE-25179.
--
Resolution: Fixed

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Enhance the table create syntax with support for partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field), identity_field ) STORED BY ICEBERG;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358566#comment-17358566
 ] 

László Pintér commented on HIVE-25179:
--

Merged into master. Thanks, [~szita] [~mbod] [~pvary] for the review!

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Enhance the table create syntax with support for partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field), identity_field ) STORED BY ICEBERG;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25179?focusedWorklogId=607826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607826
 ]

ASF GitHub Bot logged work on HIVE-25179:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:34
Start Date: 07/Jun/21 12:34
Worklog Time Spent: 10m 
  Work Description: lcspinter merged pull request #2333:
URL: https://github.com/apache/hive/pull/2333


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607826)
Time Spent: 5h  (was: 4h 50m)

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Enhance the table create syntax with support for partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field), identity_field ) STORED BY ICEBERG;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607822
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:25
Start Date: 07/Jun/21 12:25
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646538850



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestVectorizedOrcAcidRowBatchReader.java
##
@@ -961,26 +966,41 @@ private void testDeleteEventOriginalFiltering2() throws 
Exception {
 
   @Test
   public void testVectorizedOrcAcidRowBatchReader() throws Exception {
+setupTestData();
+
+
testVectorizedOrcAcidRowBatchReader(ColumnizedDeleteEventRegistry.class.getName());
+
+// To test the SortMergedDeleteEventRegistry, we need to explicitly set the
+// HIVE_TRANSACTIONAL_NUM_EVENTS_IN_MEMORY constant to a smaller value.
+int oldValue = 
conf.getInt(HiveConf.ConfVars.HIVE_TRANSACTIONAL_NUM_EVENTS_IN_MEMORY.varname, 
100);
+
conf.setInt(HiveConf.ConfVars.HIVE_TRANSACTIONAL_NUM_EVENTS_IN_MEMORY.varname, 
1000);
+
testVectorizedOrcAcidRowBatchReader(SortMergedDeleteEventRegistry.class.getName());
+
+// Restore the old value.
+
conf.setInt(HiveConf.ConfVars.HIVE_TRANSACTIONAL_NUM_EVENTS_IN_MEMORY.varname, 
oldValue);
+  }
+
+  private void setupTestData() throws IOException {
 conf.set("bucket_count", "1");
-  conf.set(ValidTxnList.VALID_TXNS_KEY,
-  new ValidReadTxnList(new long[0], new BitSet(), 1000, 
Long.MAX_VALUE).writeToString());
+conf.set(ValidTxnList.VALID_TXNS_KEY,
+new ValidReadTxnList(new long[0], new BitSet(), 1000, 
Long.MAX_VALUE).writeToString());
 
 int bucket = 0;
 AcidOutputFormat.Options options = new AcidOutputFormat.Options(conf)
-.filesystem(fs)
-.bucket(bucket)
-.writingBase(false)
-.minimumWriteId(1)
-.maximumWriteId(NUM_OWID)
-.inspector(inspector)
-.reporter(Reporter.NULL)
-.recordIdColumn(1)
-.finalDestination(root);
+.filesystem(fs)

Review comment:
   nit. revert spaces




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607822)
Time Spent: 0.5h  (was: 20m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in the vectorized ORC batch 
> reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25179?focusedWorklogId=607821&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607821
 ]

ASF GitHub Bot logged work on HIVE-25179:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:24
Start Date: 07/Jun/21 12:24
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2333:
URL: https://github.com/apache/hive/pull/2333#discussion_r646538065



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java
##
@@ -46,27 +50,60 @@ private IcebergTableUtil() {
* @return an Iceberg table
*/
   static Table getTable(Configuration configuration, Properties properties) {
-Table table = null;
-QueryState queryState = null;
 String tableIdentifier = properties.getProperty(Catalogs.NAME);
-if (SessionState.get() != null) {
-  queryState = 
SessionState.get().getQueryState(configuration.get(HiveConf.ConfVars.HIVEQUERYID.varname));
-  if (queryState != null) {
-table = (Table) queryState.getResource(tableIdentifier);
-  } else {
-LOG.debug("QueryState is not available in SessionState. Loading {} 
from configured catalog.", tableIdentifier);
-  }
-} else {
-  LOG.debug("SessionState is not available. Loading {} from configured 
catalog.", tableIdentifier);
-}
+return SessionStateUtil.getResource(configuration, 
tableIdentifier).filter(o -> o instanceof Table)
+.map(o -> (Table) o).orElseGet(() -> {
+  LOG.debug("Iceberg table {} is not found in QueryState. Loading 
table from configured catalog",
+  tableIdentifier);
+  Table tab = Catalogs.loadTable(configuration, properties);
+  SessionStateUtil.addResource(configuration, tableIdentifier, tab);
+  return tab;
+});
+  }
 
-if (table == null) {
-  table = Catalogs.loadTable(configuration, properties);
-  if (queryState != null) {
-queryState.addResource(tableIdentifier, table);
-  }
+  /**
+   * Create {@link PartitionSpec} based on the partition information stored in
+   * {@link 
org.apache.hadoop.hive.ql.parse.PartitionTransform.PartitionTransformSpec}.
+   * @param configuration a Hadoop configuration
+   * @param schema iceberg table schema
+   * @return iceberg partition spec, always non-null
+   */
+  public static PartitionSpec spec(Configuration configuration, Schema schema) 
{
+List partitionTransformSpecList 
= SessionStateUtil
+.getResource(configuration, 
hive_metastoreConstants.PARTITION_TRANSFORM_SPEC)
+.map(o -> (List) 
o).orElseGet(() -> null);
+
+if (partitionTransformSpecList == null) {
+  LOG.debug("Iceberg partition transform spec is not found in 
QueryState.");

Review comment:
   I think this log line will be printed every time when we create or alter 
a table and there is no partition spec.
   This could be inconvenient




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607821)
Time Spent: 4h 50m  (was: 4h 40m)

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Enhance the table create syntax with support for partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field), identity_field ) STORED BY ICEBERG;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24715) Increase bucketId range

2021-06-07 Thread GuangMing Lu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-24715:

Issue Type: Improvement  (was: Bug)

> Increase bucketId range
> ---
>
> Key: HIVE-24715
> URL: https://issues.apache.org/jira/browse/HIVE-24715
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Bucket Id range increase.pdf
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?focusedWorklogId=607819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607819
 ]

ASF GitHub Bot logged work on HIVE-24991:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:22
Start Date: 07/Jun/21 12:22
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2264:
URL: https://github.com/apache/hive/pull/2264#discussion_r646536788



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -1940,39 +2091,38 @@ public boolean isEmpty() {
 }
 @Override
 public void findDeletedRecords(ColumnVector[] cols, int size, BitSet 
selectedBitSet) {
-  if (rowIds == null || compressedOwids == null) {
+  if (rowIds == null || writeIds == null || writeIds.isEmpty()) {
 return;
   }
   // Iterate through the batch and for each (owid, rowid) in the batch
   // check if it is deleted or not.
 
   long[] originalWriteIdVector =
-  cols[OrcRecordUpdater.ORIGINAL_WRITEID].isRepeating ? null
-  : ((LongColumnVector) 
cols[OrcRecordUpdater.ORIGINAL_WRITEID]).vector;
+  cols[OrcRecordUpdater.ORIGINAL_WRITEID].isRepeating ? null

Review comment:
   Lets avoid changing the tabs/spaces below




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607819)
Time Spent: 20m  (was: 10m)

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table property 
> *acid.fetch.deleted.rows* is true.
> The goal of this jira is to enable this feature in the vectorized ORC batch 
> reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25205) Reduce overhead of partition column stat updation during batch loading of partitions.

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25205:
---
Summary: Reduce overhead of partition column stat updation during batch 
loading of partitions.  (was: Reduce overhead of adding write notification log 
during batch loading of partition.)

> Reduce overhead of partition column stat updation during batch loading of 
> partitions.
> -
>
> Key: HIVE-25205
> URL: https://issues.apache.org/jira/browse/HIVE-25205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
>
> During batch loading of partitions, write notification logs are added for 
> each partition. This causes delays in execution because a separate HMS call 
> is made per partition. This can be optimised by adding a new API in HMS that 
> accepts a batch of partitions, so the whole batch can be added together to the 
> backend database. Once HMS has a batch of notification logs, the code can be 
> optimised to add the logs with a single call to the backend RDBMS. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24663) Reduce overhead of partition column stat updation during batch loading of partitions.

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-24663:
---
Summary: Reduce overhead of partition column stat updation during batch 
loading of partitions.  (was: Reduce overhead of partition column stats 
updation.)

> Reduce overhead of partition column stat updation during batch loading of 
> partitions.
> -
>
> Key: HIVE-24663
> URL: https://issues.apache.org/jira/browse/HIVE-24663
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> When a large number of partitions (>20K) is processed, ColStatsProcessor runs 
> into DB issues: 
> {{ db.setPartitionColumnStatistics(request);}} gets stuck for hours, and in 
> some cases Postgres stops processing. 
> It would be good to introduce small batches for stats gathering in 
> ColStatsProcessor instead of a single bulk update.
> Ref: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L181
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L199
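The batching idea above can be sketched generically. This is an illustration only (the class and method names are mine, not Hive's): it shows how a >20K-entry stats list could be split into fixed-size slices before invoking db.setPartitionColumnStatistics once per slice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper sketching the ColStatsProcessor batching proposal:
// split one huge request list into fixed-size batches and issue one
// metastore call per batch instead of a single bulk update.
public class StatsBatcher {
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            // subList is a view; each batch covers [i, i + batchSize).
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> stats = new ArrayList<>();
        for (int i = 0; i < 25000; i++) stats.add(i);
        List<List<Integer>> batches = toBatches(stats, 10000);
        // 25000 entries in batches of 10000 -> 3 batches (10000, 10000, 5000).
        if (batches.size() != 3 || batches.get(2).size() != 5000) {
            throw new AssertionError("unexpected batching");
        }
        System.out.println("batches=" + batches.size());
    }
}
```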



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24663) Reduce overhead of partition column stat updation during batch loading of partitions.

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera resolved HIVE-24663.

Resolution: Fixed

> Reduce overhead of partition column stat updation during batch loading of 
> partitions.
> -
>
> Key: HIVE-24663
> URL: https://issues.apache.org/jira/browse/HIVE-24663
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> When a large number of partitions (>20K) is processed, ColStatsProcessor runs 
> into DB issues: 
> {{ db.setPartitionColumnStatistics(request);}} gets stuck for hours, and in 
> some cases Postgres stops processing. 
> It would be good to introduce small batches for stats gathering in 
> ColStatsProcessor instead of a single bulk update.
> Ref: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L181
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L199



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HIVE-25205) Reduce overhead of adding write notification log during batch loading of partition..

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reopened HIVE-25205:


> Reduce overhead of adding write notification log during batch loading of 
> partition..
> 
>
> Key: HIVE-25205
> URL: https://issues.apache.org/jira/browse/HIVE-25205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
>
> During batch loading of partitions, write notification logs are added for 
> each partition. This causes delays in execution because a separate HMS call 
> is made per partition. This can be optimised by adding a new API in HMS that 
> accepts a batch of partitions, so the whole batch can be added together to the 
> backend database. Once HMS has a batch of notification logs, the code can be 
> optimised to add the logs with a single call to the backend RDBMS. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25205) Reduce overhead of adding write notification log during batch loading of partition..

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25205:
---
Summary: Reduce overhead of adding write notification log during batch 
loading of partition..  (was: Reduce overhead of partition column stat updation 
during batch loading of partitions.)

> Reduce overhead of adding write notification log during batch loading of 
> partition..
> 
>
> Key: HIVE-25205
> URL: https://issues.apache.org/jira/browse/HIVE-25205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
>
> During batch loading of partitions, write notification logs are added for 
> each partition. This causes delays in execution because a separate HMS call 
> is made per partition. This can be optimised by adding a new API in HMS that 
> accepts a batch of partitions, so the whole batch can be added together to the 
> backend database. Once HMS has a batch of notification logs, the code can be 
> optimised to add the logs with a single call to the backend RDBMS. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25205) Reduce overhead of partition column stat updation during batch loading of partitions.

2021-06-07 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera resolved HIVE-25205.

Resolution: Fixed

> Reduce overhead of partition column stat updation during batch loading of 
> partitions.
> -
>
> Key: HIVE-25205
> URL: https://issues.apache.org/jira/browse/HIVE-25205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
>
> During batch loading of partitions, a write notification log is added for 
> each partition. This delays execution, since a separate call to HMS 
> is made for each partition. This can be optimised by adding a new HMS API 
> that accepts a batch of partitions, so the whole batch can be written to the 
> backend database together. Once HMS has a batch of notification logs, the 
> code can be optimised to add them with a single call to the backend RDBMS. 





[jira] [Work logged] (HIVE-24663) Reduce overhead of partition column stats updation.

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24663?focusedWorklogId=607809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607809
 ]

ASF GitHub Bot logged work on HIVE-24663:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 12:03
Start Date: 07/Jun/21 12:03
Worklog Time Spent: 10m 
  Work Description: maheshk114 merged pull request #2266:
URL: https://github.com/apache/hive/pull/2266


   




Issue Time Tracking
---

Worklog Id: (was: 607809)
Time Spent: 8h 50m  (was: 8h 40m)

> Reduce overhead of partition column stats updation.
> ---
>
> Key: HIVE-24663
> URL: https://issues.apache.org/jira/browse/HIVE-24663
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance, pull-request-available
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> When a large number of partitions (>20K) is processed, ColStatsProcessor runs 
> into DB issues: 
> {{db.setPartitionColumnStatistics(request)}} gets stuck for hours, and in 
> some cases Postgres stops processing. 
> It would be good to introduce small batches for stats gathering in 
> ColStatsProcessor instead of a bulk update.
> Ref: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L181
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L199
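
The small-batch approach suggested above could look like the following sketch; the `partition` helper mirrors Guava's `Lists.partition`, and the batch size of 1000 is an illustrative assumption, not a Hive default:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of splitting one bulk db.setPartitionColumnStatistics() call
// into small batches. partition() mirrors Guava's Lists.partition; the batch
// size used in main() is an illustrative assumption, not a Hive default.
public class StatsBatcher {

    /** Split the input into consecutive sublists of at most batchSize items. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> partitions = new ArrayList<>();
        for (int i = 0; i < 25000; i++) {
            partitions.add("ds=" + i);
        }
        // 25000 partitions become 25 DB round trips of 1000 stats objects each,
        // keeping every statement small enough for the backend to digest.
        System.out.println(partition(partitions, 1000).size() + " batches");
    }
}
```

Each batch would then be submitted as its own `setPartitionColumnStatistics` request, so a slow or failing backend only has to absorb 1000 stats objects at a time instead of 25000.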





[jira] [Work logged] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25179?focusedWorklogId=607806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607806
 ]

ASF GitHub Bot logged work on HIVE-25179:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 11:59
Start Date: 07/Jun/21 11:59
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2333:
URL: https://github.com/apache/hive/pull/2333#discussion_r646521390



##
File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexerParent.g
##
@@ -385,6 +385,7 @@ KW_DATACONNECTORS: 'CONNECTORS';
 KW_TYPE: 'TYPE';
 KW_URL: 'URL';
 KW_REMOTE: 'REMOTE';
+KW_SPEC: 'SPEC';

Review comment:
   Great, thanks!






Issue Time Tracking
---

Worklog Id: (was: 607806)
Time Spent: 4h 40m  (was: 4.5h)

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Enhance table create syntax with support to partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field bucket), identity_field ) STORED BY ICEBERG;
> {code}





[jira] [Work logged] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25179?focusedWorklogId=607790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607790
 ]

ASF GitHub Bot logged work on HIVE-25179:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 11:19
Start Date: 07/Jun/21 11:19
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on a change in pull request #2333:
URL: https://github.com/apache/hive/pull/2333#discussion_r646497257



##
File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexerParent.g
##
@@ -385,6 +385,7 @@ KW_DATACONNECTORS: 'CONNECTORS';
 KW_TYPE: 'TYPE';
 KW_URL: 'URL';
 KW_REMOTE: 'REMOTE';
+KW_SPEC: 'SPEC';

Review comment:
   No. I added `KW_SPEC` as a non-reserved keyword. 






Issue Time Tracking
---

Worklog Id: (was: 607790)
Time Spent: 4.5h  (was: 4h 20m)

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Enhance table create syntax with support to partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field bucket), identity_field ) STORED BY ICEBERG;
> {code}





[jira] [Work logged] (HIVE-25194) Add support for STORED AS ORC/PARQUET/AVRO for Iceberg

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25194?focusedWorklogId=607789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607789
 ]

ASF GitHub Bot logged work on HIVE-25194:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 11:17
Start Date: 07/Jun/21 11:17
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on a change in pull request #2348:
URL: https://github.com/apache/hive/pull/2348#discussion_r646495865



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -13456,6 +13457,26 @@ ASTNode analyzeCreateTable(
   }
 }
 
+HiveStorageHandler handler = null;
+try {
+  handler = HiveUtils.getStorageHandler(conf, 
storageFormat.getStorageHandler());
+} catch (HiveException e) {
+  throw new SemanticException("Failed to load storage handler:  " + 
e.getMessage());
+}
+
+if (handler != null) {
+  String fileFormatPropertyKey = handler.getFileFormatPropertyKey();
+  if (fileFormatPropertyKey != null) {
+if (tblProps != null && tblProps.containsKey(fileFormatPropertyKey) && 
storageFormat.getSerdeProps() != null &&

Review comment:
   In the `StorageFormat` class I store the file format in the serde 
properties, because I do not have a reference to the table properties (they 
might not have been parsed yet). 
   






Issue Time Tracking
---

Worklog Id: (was: 607789)
Time Spent: 2.5h  (was: 2h 20m)

> Add support for STORED AS ORC/PARQUET/AVRO for Iceberg
> --
>
> Key: HIVE-25194
> URL: https://issues.apache.org/jira/browse/HIVE-25194
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently we have to specify the file format in TBLPROPERTIES in Iceberg 
> create table statements.
> The ideal syntax would be:
> CREATE TABLE tbl STORED BY ICEBERG STORED AS ORC ...
> One complication is that currently STORED BY and STORED AS are not permitted 
> within the same query, so that needs to be amended.





[jira] [Work logged] (HIVE-25179) Support all partition transforms for Iceberg in create table

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25179?focusedWorklogId=607783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607783
 ]

ASF GitHub Bot logged work on HIVE-25179:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 11:01
Start Date: 07/Jun/21 11:01
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2333:
URL: https://github.com/apache/hive/pull/2333#discussion_r646485008



##
File path: parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexerParent.g
##
@@ -385,6 +385,7 @@ KW_DATACONNECTORS: 'CONNECTORS';
 KW_TYPE: 'TYPE';
 KW_URL: 'URL';
 KW_REMOTE: 'REMOTE';
+KW_SPEC: 'SPEC';

Review comment:
   Would this mean that columns called `spec` can only be referenced with 
backticks from now on?






Issue Time Tracking
---

Worklog Id: (was: 607783)
Time Spent: 4h 20m  (was: 4h 10m)

> Support all partition transforms for Iceberg in create table
> 
>
> Key: HIVE-25179
> URL: https://issues.apache.org/jira/browse/HIVE-25179
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Enhance table create syntax with support to partition transforms:
> {code:sql}
> CREATE TABLE ... PARTITIONED BY SPEC( year(year_field), month(month_field), 
> day(day_field), hour(hour_field), truncate(3, truncate_field), bucket(5, 
> bucket_field bucket), identity_field ) STORED BY ICEBERG;
> {code}





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607745&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607745
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:49
Start Date: 07/Jun/21 09:49
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646436391



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -265,9 +287,12 @@ public void 
commitAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTable
   HiveTableUtil.importFiles(preAlterTableProperties.tableLocation, 
preAlterTableProperties.format,
   partitionSpecProxy, preAlterTableProperties.partitionKeys, 
catalogProperties, conf);
 } else {
-  Map<String, String> contextProperties = context.getProperties();
-  if (contextProperties.containsKey(ALTER_TABLE_OPERATION_TYPE) &&
-  
allowedAlterTypes.contains(contextProperties.get(ALTER_TABLE_OPERATION_TYPE))) {
+  if (isMatchingAlterOp(AlterTableType.ADDCOLS, context) && updateSchema 
!= null) {

Review comment:
   `updateSchema` can only be non-null if 
`isMatchingAlterOp(AlterTableType.ADDCOLS, context)` was true in preAlterTable, 
so maybe `if (updateSchema != null)` is enough here? That might make this part 
also work generically for other schema update operations, like drop and rename, 
since you just have to call `commit()` here regardless of the operation type.






Issue Time Tracking
---

Worklog Id: (was: 607745)
Time Spent: 3h  (was: 2h 50m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607743
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:44
Start Date: 07/Jun/21 09:44
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646433022



##
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java
##
@@ -783,6 +785,43 @@ public void testDropHiveTableWithoutUnderlyingTable() 
throws IOException {
 shell.executeStatement("DROP TABLE " + identifier);
   }
 
+  @Test
+  public void testAlterTableAddColumns() throws Exception {
+Assume.assumeTrue("Iceberg - alter table/add column is only relevant for 
HiveCatalog",
+testTableType == TestTables.TestTableType.HIVE_CATALOG);
+
+TableIdentifier identifier = TableIdentifier.of("default", "customers");
+
+// Create HMS table with a property to be translated
+shell.executeStatement(String.format("CREATE EXTERNAL TABLE 
default.customers " +
+"STORED BY ICEBERG " +
+"TBLPROPERTIES ('%s'='%s', '%s'='%s', '%s'='%s')",
+InputFormatConfig.TABLE_SCHEMA, 
SchemaParser.toJson(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA),
+InputFormatConfig.PARTITION_SPEC, PartitionSpecParser.toJson(SPEC),
+InputFormatConfig.EXTERNAL_TABLE_PURGE, "false"));
+
+shell.executeStatement("ALTER TABLE default.customers ADD COLUMNS " +
+"(newintcol int, newstringcol string COMMENT 'Column with 
description')");
+
+org.apache.iceberg.Table icebergTable = testTables.loadTable(identifier);
+org.apache.hadoop.hive.metastore.api.Table hmsTable = 
shell.metastore().getTable("default", "customers");
+
+List<FieldSchema> icebergSchema = 
HiveSchemaUtil.convert(icebergTable.schema());
+List<FieldSchema> hmsSchema = hmsTable.getSd().getCols();
+
+List<FieldSchema> expectedSchema = Lists.newArrayList(
+new FieldSchema("customer_id", "bigint", null),
+new FieldSchema("first_name", "string", "This is first name"),
+new FieldSchema("last_name", "string", "This is last name"),
+new FieldSchema("newintcol", "int", null),
+new FieldSchema("newstringcol", "string", "Column with description"));
+
+Assert.assertEquals(expectedSchema, icebergSchema);
+Assert.assertEquals(icebergSchema, hmsSchema);
+
+shell.executeStatement("DROP TABLE " + identifier);

Review comment:
   You can remove this drop, since all tables are automatically dropped 
after each test case






Issue Time Tracking
---

Worklog Id: (was: 607743)
Time Spent: 2h 50m  (was: 2h 40m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607742
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:44
Start Date: 07/Jun/21 09:44
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646432618



##
File path: 
iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java
##
@@ -783,6 +785,43 @@ public void testDropHiveTableWithoutUnderlyingTable() 
throws IOException {
 shell.executeStatement("DROP TABLE " + identifier);
   }
 
+  @Test
+  public void testAlterTableAddColumns() throws Exception {
+Assume.assumeTrue("Iceberg - alter table/add column is only relevant for 
HiveCatalog",
+testTableType == TestTables.TestTableType.HIVE_CATALOG);
+
+TableIdentifier identifier = TableIdentifier.of("default", "customers");
+
+// Create HMS table with a property to be translated
+shell.executeStatement(String.format("CREATE EXTERNAL TABLE 
default.customers " +
+"STORED BY ICEBERG " +
+"TBLPROPERTIES ('%s'='%s', '%s'='%s', '%s'='%s')",
+InputFormatConfig.TABLE_SCHEMA, 
SchemaParser.toJson(HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA),
+InputFormatConfig.PARTITION_SPEC, PartitionSpecParser.toJson(SPEC),
+InputFormatConfig.EXTERNAL_TABLE_PURGE, "false"));
+
+shell.executeStatement("ALTER TABLE default.customers ADD COLUMNS " +

Review comment:
   There's a bit of overlapping work on that here: 
https://jira.cloudera.com/browse/CDPD-25369, which should cover add/drop/rename 
column scenarios.






Issue Time Tracking
---

Worklog Id: (was: 607742)
Time Spent: 2h 40m  (was: 2.5h)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607740
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:38
Start Date: 07/Jun/21 09:38
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646427858



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -213,7 +219,9 @@ public void 
commitDropTable(org.apache.hadoop.hive.metastore.api.Table hmsTable,
   @Override
   public void preAlterTable(org.apache.hadoop.hive.metastore.api.Table 
hmsTable, EnvironmentContext context)
   throws MetaException {
-super.preAlterTable(hmsTable, context);
+if (!isSupportedAlterOperation(context)) {
+  super.preAlterTable(hmsTable, context);

Review comment:
   Since we've added ADDPROPS and DROPPROPS to our allowed list as well, 
maybe we can remove the super call? It seems like it wouldn't add any 
additional value at this point?






Issue Time Tracking
---

Worklog Id: (was: 607740)
Time Spent: 2.5h  (was: 2h 20m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607739
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:35
Start Date: 07/Jun/21 09:35
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646424725



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -310,6 +335,26 @@ public void 
rollbackAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTab
 }
   }
 
+  private static boolean isSupportedAlterOperation(EnvironmentContext context) 
{
+for (AlterTableType op : SUPPORTED_ALTER_OPS) {

Review comment:
   nit maybe: `return SUPPORTED_ALTER_OPS.stream().anyMatch(op -> 
isMatchingAlterOp(op, context));`
   
   or if the null checks are already here, it could even just be:
   
   `return SUPPORTED_ALTER_OPS.stream().anyMatch(op -> 
op.name().equals(contextProperties.get(ALTER_TABLE_OPERATION_TYPE)));`
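
   The reviewer's suggested `anyMatch` pattern can be sketched as a runnable example; here `AlterOp` and the property key are simplified stand-ins for Hive's `AlterTableType` and `ALTER_TABLE_OPERATION_TYPE`, not the actual classes:

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hedged sketch of the stream-based supported-operation check. AlterOp and
// ALTER_TABLE_OPERATION_TYPE are illustrative stand-ins for Hive's
// AlterTableType enum and EnvironmentContext property key.
public class AlterOpCheck {

    enum AlterOp { ADDCOLS, ADDPROPS, DROPPROPS, RENAME }

    static final Set<AlterOp> SUPPORTED_ALTER_OPS =
        EnumSet.of(AlterOp.ADDCOLS, AlterOp.ADDPROPS, AlterOp.DROPPROPS);

    static final String ALTER_TABLE_OPERATION_TYPE = "alterTableOpType";

    /** True if the context properties name a supported alter operation. */
    static boolean isSupportedAlterOperation(Map<String, String> contextProperties) {
        String opType = contextProperties.get(ALTER_TABLE_OPERATION_TYPE);
        // anyMatch replaces an explicit loop over SUPPORTED_ALTER_OPS
        return opType != null &&
            SUPPORTED_ALTER_OPS.stream().anyMatch(op -> op.name().equals(opType));
    }

    public static void main(String[] args) {
        System.out.println(isSupportedAlterOperation(
            Map.of(ALTER_TABLE_OPERATION_TYPE, "ADDCOLS")));  // supported
        System.out.println(isSupportedAlterOperation(
            Map.of(ALTER_TABLE_OPERATION_TYPE, "RENAME")));   // not supported
        System.out.println(isSupportedAlterOperation(Map.of())); // key absent
    }
}
```

   The null check keeps the stream expression safe when the context carries no operation type at all.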






Issue Time Tracking
---

Worklog Id: (was: 607739)
Time Spent: 2h 20m  (was: 2h 10m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Updated] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25117:
--
Affects Version/s: 4.0.0

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This only reproduces when there is at least 1 buffered batch, so we needed 
> 2 rows with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Work logged] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?focusedWorklogId=607738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607738
 ]

ASF GitHub Bot logged work on HIVE-25117:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:34
Start Date: 07/Jun/21 09:34
Worklog Time Spent: 10m 
  Work Description: pgaref merged pull request #2286:
URL: https://github.com/apache/hive/pull/2286


   




Issue Time Tracking
---

Worklog Id: (was: 607738)
Time Spent: 1h  (was: 50m)

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This only reproduces when there is at least 1 buffered batch, so we needed 
> 2 rows with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Resolved] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25117.
---
Resolution: Fixed

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This only reproduces when there is at least 1 buffered batch, so we needed 
> 2 rows with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Updated] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25117:
--
Fix Version/s: 4.0.0

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This only reproduces when there is at least 1 buffered batch, so we needed 
> 2 rows with 1 row/batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}
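The cast failure above can be reproduced in miniature with empty stand-in classes (these are not the real Hive vector types, which carry actual column data): both vector classes extend a common base, so the compiler accepts the downcast and the type mismatch only surfaces at runtime.

```java
// Empty stand-ins for Hive's column-vector hierarchy, for illustration only.
class ColumnVector {}
class LongColumnVector extends ColumnVector {}
class DecimalColumnVector extends ColumnVector {}

class CastDemo {
  // Mirrors the assumption in copyNonSelectedColumnVector: the source is
  // downcast to LongColumnVector, which fails for a sibling subclass.
  static String copyAsLong(ColumnVector source) {
    try {
      LongColumnVector target = (LongColumnVector) source;
      return "ok";
    } catch (ClassCastException e) {
      return "ClassCastException";
    }
  }
}
```

Passing a `DecimalColumnVector` where a `LongColumnVector` is assumed reproduces the exception shape seen in the stack trace.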





[jira] [Commented] (HIVE-25117) Vector PTF ClassCastException with Decimal64

2021-06-07 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358477#comment-17358477
 ] 

Panagiotis Garefalakis commented on HIVE-25117:
---

Resolved via https://github.com/apache/hive/pull/2286 
Thanks [~rameshkumar] for the patch! 

> Vector PTF ClassCastException with Decimal64
> 
>
> Key: HIVE-25117
> URL: https://issues.apache.org/jira/browse/HIVE-25117
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: vector_ptf_classcast_exception.q
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Only reproduces when there is at least 1 buffered batch, so we needed 2 rows 
> with 1 row per batch:
> {code:java}
> set hive.vectorized.testing.reducer.batch.size=1;
> {code}
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.copyNonSelectedColumnVector(VectorizedBatchUtil.java:664)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.forwardBufferedBatches(VectorPTFGroupBatches.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFGroupBatches.fillGroupResultsAndForward(VectorPTFGroupBatches.java:318)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.ptf.VectorPTFOperator.process(VectorPTFOperator.java:403)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:497)
> {code}





[jira] [Resolved] (HIVE-25203) HiveQueryResultSet and client operation are not expected to be closed twice

2021-06-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved HIVE-25203.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> HiveQueryResultSet and client operation are not expected to be closed twice
> ---
>
> Key: HIVE-25203
> URL: https://issues.apache.org/jira/browse/HIVE-25203
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While testing retry scenarios of HIVE-24786, we found that 
> HiveQueryResultSet.close() is called twice, which is not expected. There are 
> 2 different issues here:
> 1. ResultSet should not handle Statement as in HiveQueryResultSet:
> {code}
> if (this.statement != null && (this.statement instanceof HiveStatement)) {
>   HiveStatement s = (HiveStatement) this.statement;
>   s.closeClientOperation();
> {code}
> The hierarchy of Connection(HiveConnection) -> Statement(HiveStatement) -> 
> ResultSet(HiveQueryResultSet) should be respected in the sense that the parent 
> may handle the child but not the other way around, except in a single case, where 
> the state of the result set has an effect on statement's state, which is 
> [Statement.closeOnCompletion|https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#closeOnCompletion()],
>  which was introduced by HIVE-22698.
> The above logic was introduced by 
> [HIVE-4974|https://github.com/apache/hive/blame/master/jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java#L276].
>  Its intention was to make children able to return their parents, but that 
> doesn't mean they should handle their parents' lifecycle.
> 2. Also, HiveStatement should close HiveQueryResultSet only if it's not 
> already closed, so it would make sense to check ResultSet.isClosed() before 
> closing. This is for the very same reason as another change above, to avoid 
> duplicated close logic. 
> Background: under normal circumstances, a close operation is idempotent, so we 
> should not worry about side effects of calling it twice. However, while 
> testing HIVE-24786, we found strange issues where, in case of a 
> SocketTimeoutException, a code path was hit in the JDBC client that caused 
> HiveStatement.closeClientOperation() to be called twice, which led to a 
> WARNING on the HS2 side. This is not expected, as the operation close is protected 
> by a stmtHandle != null check, yet it ran twice. To avoid situations like 
> this, cleaning up duplicated close calls would help.
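The two fixes described in the issue (parent-only close handling plus a guard on an already-closed child) can be sketched with toy classes; this is a hypothetical model for illustration, not the actual Hive patch:

```java
// Toy model of the close-once rule: the parent Statement closes its child
// ResultSet only if it is not already closed, and its own teardown (the
// stand-in for closeClientOperation()) runs at most once.
class SketchResultSet {
  private boolean closed;
  boolean isClosed() { return closed; }
  void close() { closed = true; }
}

class SketchStatement {
  private final SketchResultSet resultSet = new SketchResultSet();
  private boolean closed;
  private int teardownRuns;  // counts stand-in closeClientOperation() calls

  SketchResultSet getResultSet() { return resultSet; }
  int getTeardownRuns() { return teardownRuns; }

  void close() {
    if (closed) {
      return;                      // idempotent: a second close() is a no-op
    }
    if (!resultSet.isClosed()) {
      resultSet.close();           // parent closes child, never the reverse
    }
    teardownRuns++;                // would be closeClientOperation() in Hive
    closed = true;
  }
}
```

Calling `close()` twice on the sketch leaves the teardown counter at 1, which is exactly the behaviour the issue asks for on the JDBC side.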





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607737&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607737
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:33
Start Date: 07/Jun/21 09:33
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646424725



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -310,6 +335,26 @@ public void rollbackAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTab
 }
   }

+  private static boolean isSupportedAlterOperation(EnvironmentContext context) {
+    for (Enum op : SUPPORTED_ALTER_OPS) {

Review comment:
   nit maybe: `return SUPPORTED_ALTER_OPS.stream().anyMatch(op -> 
isMatchingAlterOp(op, context));`
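The suggested one-liner is behaviourally identical to the loop in the patch; a self-contained toy (invented enum constants and a plain equality predicate standing in for the real EnvironmentContext match) demonstrates the equivalence:

```java
import java.util.List;

class AlterOpMatch {
  enum AlterOp { ADDCOLS, REPLACE_COLS, ADDPROPS }

  // Invented sample set; the real SUPPORTED_ALTER_OPS lives in HiveIcebergMetaHook.
  static final List<AlterOp> SUPPORTED_ALTER_OPS = List.of(AlterOp.ADDCOLS, AlterOp.ADDPROPS);

  // Loop form, as in the patch under review.
  static boolean loopForm(AlterOp candidate) {
    for (AlterOp op : SUPPORTED_ALTER_OPS) {
      if (op == candidate) {
        return true;
      }
    }
    return false;
  }

  // Stream form, as the review suggests: anyMatch short-circuits just like
  // the early return in the loop.
  static boolean streamForm(AlterOp candidate) {
    return SUPPORTED_ALTER_OPS.stream().anyMatch(op -> op == candidate);
  }
}
```

Both forms return the same result for every input, so the change is purely stylistic.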




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607737)
Time Spent: 2h 10m  (was: 2h)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25203) HiveQueryResultSet and client operation are not expected to be closed twice

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25203?focusedWorklogId=607735&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607735
 ]

ASF GitHub Bot logged work on HIVE-25203:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:33
Start Date: 07/Jun/21 09:33
Worklog Time Spent: 10m 
  Work Description: abstractdog merged pull request #2353:
URL: https://github.com/apache/hive/pull/2353


   




Issue Time Tracking
---

Worklog Id: (was: 607735)
Time Spent: 20m  (was: 10m)

> HiveQueryResultSet and client operation are not expected to be closed twice
> ---
>
> Key: HIVE-25203
> URL: https://issues.apache.org/jira/browse/HIVE-25203
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> While testing retry scenarios of HIVE-24786, we found that 
> HiveQueryResultSet.close() is called twice, which is not expected. There are 
> 2 different issues here:
> 1. ResultSet should not handle Statement as in HiveQueryResultSet:
> {code}
> if (this.statement != null && (this.statement instanceof HiveStatement)) {
>   HiveStatement s = (HiveStatement) this.statement;
>   s.closeClientOperation();
> {code}
> The hierarchy of Connection(HiveConnection) -> Statement(HiveStatement) -> 
> ResultSet(HiveQueryResultSet) should be respected in the sense that the parent 
> may handle the child but not the other way around, except in a single case, where 
> the state of the result set has an effect on statement's state, which is 
> [Statement.closeOnCompletion|https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#closeOnCompletion()],
>  which was introduced by HIVE-22698.
> The above logic was introduced by 
> [HIVE-4974|https://github.com/apache/hive/blame/master/jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java#L276].
>  Its intention was to make children able to return their parents, but that 
> doesn't mean they should handle their parents' lifecycle.
> 2. Also, HiveStatement should close HiveQueryResultSet only if it's not 
> already closed, so it would make sense to check ResultSet.isClosed() before 
> closing. This is for the very same reason as another change above, to avoid 
> duplicated close logic. 
> Background: under normal circumstances, a close operation is idempotent, so we 
> should not worry about side effects of calling it twice. However, while 
> testing HIVE-24786, we found strange issues where, in case of a 
> SocketTimeoutException, a code path was hit in the JDBC client that caused 
> HiveStatement.closeClientOperation() to be called twice, which led to a 
> WARNING on the HS2 side. This is not expected, as the operation close is protected 
> by a stmtHandle != null check, yet it ran twice. To avoid situations like 
> this, cleaning up duplicated close calls would help.





[jira] [Work logged] (HIVE-25203) HiveQueryResultSet and client operation are not expected to be closed twice

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25203?focusedWorklogId=607736&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607736
 ]

ASF GitHub Bot logged work on HIVE-25203:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:33
Start Date: 07/Jun/21 09:33
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #2353:
URL: https://github.com/apache/hive/pull/2353#issuecomment-855772120


   merged, thanks for the review @prasanthj, @pgaref !




Issue Time Tracking
---

Worklog Id: (was: 607736)
Time Spent: 0.5h  (was: 20m)

> HiveQueryResultSet and client operation are not expected to be closed twice
> ---
>
> Key: HIVE-25203
> URL: https://issues.apache.org/jira/browse/HIVE-25203
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While testing retry scenarios of HIVE-24786, we found that 
> HiveQueryResultSet.close() is called twice, which is not expected. There are 
> 2 different issues here:
> 1. ResultSet should not handle Statement as in HiveQueryResultSet:
> {code}
> if (this.statement != null && (this.statement instanceof HiveStatement)) {
>   HiveStatement s = (HiveStatement) this.statement;
>   s.closeClientOperation();
> {code}
> The hierarchy of Connection(HiveConnection) -> Statement(HiveStatement) -> 
> ResultSet(HiveQueryResultSet) should be respected in the sense that the parent 
> may handle the child but not the other way around, except in a single case, where 
> the state of the result set has an effect on statement's state, which is 
> [Statement.closeOnCompletion|https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#closeOnCompletion()],
>  which was introduced by HIVE-22698.
> The above logic was introduced by 
> [HIVE-4974|https://github.com/apache/hive/blame/master/jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java#L276].
>  Its intention was to make children able to return their parents, but that 
> doesn't mean they should handle their parents' lifecycle.
> 2. Also, HiveStatement should close HiveQueryResultSet only if it's not 
> already closed, so it would make sense to check ResultSet.isClosed() before 
> closing. This is for the very same reason as another change above, to avoid 
> duplicated close logic. 
> Background: under normal circumstances, a close operation is idempotent, so we 
> should not worry about side effects of calling it twice. However, while 
> testing HIVE-24786, we found strange issues where, in case of a 
> SocketTimeoutException, a code path was hit in the JDBC client that caused 
> HiveStatement.closeClientOperation() to be called twice, which led to a 
> WARNING on the HS2 side. This is not expected, as the operation close is protected 
> by a stmtHandle != null check, yet it ran twice. To avoid situations like 
> this, cleaning up duplicated close calls would help.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607734
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:32
Start Date: 07/Jun/21 09:32
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646423854



##
File path: 
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
##
@@ -310,6 +335,26 @@ public void rollbackAlterTable(org.apache.hadoop.hive.metastore.api.Table hmsTab
 }
   }
 
+  private static boolean isSupportedAlterOperation(EnvironmentContext context) {
+    for (Enum op : SUPPORTED_ALTER_OPS) {
+      if (isMatchingAlterOp(op, context)) {
+        return true;
+      }
+    }
+    return false;
+  }
+
+  private static boolean isMatchingAlterOp(Enum alterOperation, EnvironmentContext context) {
+    if (context == null) {

Review comment:
   these two null checks could be extracted into 
`isSupportedAlterOperation`, since if they're null they'll always be false so 
no need for looping






Issue Time Tracking
---

Worklog Id: (was: 607734)
Time Spent: 2h  (was: 1h 50m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607729
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:21
Start Date: 07/Jun/21 09:21
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646415970



##
File path: 
iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveSchemaUtil.java
##
@@ -134,6 +136,28 @@ public static Type convert(TypeInfo typeInfo) {
 return HiveSchemaConverter.convert(typeInfo, false);
   }
 
+  /**
+   * Produces the difference of two FieldSchema lists by only taking into account the field name and type.
+   * @param from List of fields to subtract from
+   * @param to List of fields to subtract
+   * @return the result list of difference
+   */
+  public static List<FieldSchema> schemaDifference(List<FieldSchema> from, List<FieldSchema> to) {
+    List<FieldSchema> result = new LinkedList<>(from);
+    Iterator<FieldSchema> it = result.iterator();
+    while (it.hasNext()) {

Review comment:
   nit: maybe use Streams with filter and collect?
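A streams-with-filter-and-collect variant along the lines of the nit could look as follows (a toy `Field` record stands in for Hive's FieldSchema; only name and type are compared, as the javadoc states):

```java
import java.util.List;
import java.util.stream.Collectors;

class SchemaDiffSketch {
  // Toy stand-in for org.apache.hadoop.hive.metastore.api.FieldSchema.
  record Field(String name, String type) {}

  // Stream version of schemaDifference: keep every field of `from` that has
  // no name+type match in `to`, instead of mutating a list via its iterator.
  static List<Field> schemaDifference(List<Field> from, List<Field> to) {
    return from.stream()
        .filter(f -> to.stream().noneMatch(t ->
            t.name().equals(f.name()) && t.type().equals(f.type())))
        .collect(Collectors.toList());
  }
}
```

The filter predicate expresses the set-difference intent directly, avoiding the explicit `Iterator.remove()` bookkeeping of the loop form.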






Issue Time Tracking
---

Worklog Id: (was: 607729)
Time Spent: 1h 50m  (was: 1h 40m)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Work logged] (HIVE-25200) Alter table add columns support for Iceberg tables

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25200?focusedWorklogId=607728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607728
 ]

ASF GitHub Bot logged work on HIVE-25200:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 09:18
Start Date: 07/Jun/21 09:18
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2351:
URL: https://github.com/apache/hive/pull/2351#discussion_r646413108



##
File path: 
iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveSchemaUtil.java
##
@@ -134,6 +136,28 @@ public static Type convert(TypeInfo typeInfo) {
 return HiveSchemaConverter.convert(typeInfo, false);
   }
 
+  /**
+   * Produces the difference of two FieldSchema lists by only taking into account the field name and type.
+   * @param from List of fields to subtract from
+   * @param to List of fields to subtract
+   * @return the result list of difference
+   */
+  public static List<FieldSchema> schemaDifference(List<FieldSchema> from, List<FieldSchema> to) {

Review comment:
   These parameter names suggest to me that `to` represents some destination 
collection, and `from` is the source collection. Can we rename them a bit? 
Maybe something like `baseFields` and `fieldsToRemove`?






Issue Time Tracking
---

Worklog Id: (was: 607728)
Time Spent: 1h 40m  (was: 1.5h)

> Alter table add columns support for Iceberg tables
> --
>
> Key: HIVE-25200
> URL: https://issues.apache.org/jira/browse/HIVE-25200
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Since Iceberg counts as a non-native Hive table, the addColumn operation 
> needs to be implemented with the help of Hive meta hooks.





[jira] [Comment Edited] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358449#comment-17358449
 ] 

Panagiotis Garefalakis edited comment on HIVE-25180 at 6/7/21, 8:50 AM:


Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba] and 
[~kgyrtkirk] for the review!


was (Author: pgaref):
Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba]

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25194) Add support for STORED AS ORC/PARQUET/AVRO for Iceberg

2021-06-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25194?focusedWorklogId=607720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607720
 ]

ASF GitHub Bot logged work on HIVE-25194:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 08:48
Start Date: 07/Jun/21 08:48
Worklog Time Spent: 10m 
  Work Description: marton-bod commented on a change in pull request #2348:
URL: https://github.com/apache/hive/pull/2348#discussion_r646390216



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -13456,6 +13457,26 @@ ASTNode analyzeCreateTable(
   }
 }
 
+    HiveStorageHandler handler = null;
+    try {
+      handler = HiveUtils.getStorageHandler(conf, storageFormat.getStorageHandler());
+    } catch (HiveException e) {
+      throw new SemanticException("Failed to load storage handler:  " + e.getMessage());
+    }
+
+    if (handler != null) {
+      String fileFormatPropertyKey = handler.getFileFormatPropertyKey();
+      if (fileFormatPropertyKey != null) {
+        if (tblProps != null && tblProps.containsKey(fileFormatPropertyKey) && storageFormat.getSerdeProps() != null &&

Review comment:
   A quick clarifying question: this section handles the case when both 
`STORED AS ORC` and `TBLPROPERTIES` are defined in the same DDL query? If so, 
why do we need the `&& storageFormat.getSerdeProps() != null` part? If we used 
`||` instead, would that work too for validating both the tblproperties and the 
serdeproperties case?






Issue Time Tracking
---

Worklog Id: (was: 607720)
Time Spent: 2h 20m  (was: 2h 10m)

> Add support for STORED AS ORC/PARQUET/AVRO for Iceberg
> --
>
> Key: HIVE-25194
> URL: https://issues.apache.org/jira/browse/HIVE-25194
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently we have to specify the file format in TBLPROPERTIES in Iceberg 
> create table statements.
> The ideal syntax would be:
> CREATE TABLE tbl STORED BY ICEBERG STORED AS ORC ...
> One complication is that currently STORED BY and STORED AS are not permitted 
> within the same query, so that needs to be amended.





[jira] [Resolved] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis resolved HIVE-25180.
---
Resolution: Fixed

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358449#comment-17358449
 ] 

Panagiotis Garefalakis commented on HIVE-25180:
---

Resolved via https://github.com/apache/hive/pull/2345 thanks [~Csaba]

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25180:
--
Affects Version/s: 4.0.0

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis updated HIVE-25180:
--
Fix Version/s: 4.0.0

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-25180) Update netty to 4.1.60.Final

2021-06-07 Thread Panagiotis Garefalakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Panagiotis Garefalakis reassigned HIVE-25180:
-

Assignee: Csaba Juhász

> Update netty to 4.1.60.Final
> 
>
> Key: HIVE-25180
> URL: https://issues.apache.org/jira/browse/HIVE-25180
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>





