[jira] [Assigned] (HIVE-24784) Insert into table throws Invalid column reference when select is followed by any other operation

2021-06-16 Thread Sruthi M (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sruthi M reassigned HIVE-24784:
---

Assignee: Sruthi M

> Insert into table throws Invalid column reference when select is followed by 
> any other operation
> -----------------------------------------------------------------------------
>
> Key: HIVE-24784
> URL: https://issues.apache.org/jira/browse/HIVE-24784
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: All Versions
> Reporter: Alagappan Maruthappan
> Assignee: Sruthi M
> Priority: Minor
>
>
> To Reproduce:
> {code:sql}
> create table foo (a int, b string);
> create table bar (a int, b string);
> explain insert into foo (a, b) select a, b from bar order by b;
> 21/02/15 23:26:58 ERROR ql.Driver: FAILED: SemanticException [Error 10004]: 
> Line 1:61 Invalid table alias or column reference 'b': (possible column names 
> are: _col0, _col1)
> 21/02/15 23:26:58 ERROR ql.Driver: FAILED: SemanticException [Error 10004]: 
> Line 1:61 Invalid table alias or column reference 'b': (possible column names 
> are: _col0, _col1)
> {code}
> Any clause that follows the select (order by, cluster by, distribute by) 
> throws this exception.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25261) RetryingHMSHandler should wrap the MetaException with short description of the target

2021-06-16 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-25261:
--


> RetryingHMSHandler should wrap the MetaException with short description of 
> the target
> -----------------------------------------------------------------------------
>
> Key: HIVE-25261
> URL: https://issues.apache.org/jira/browse/HIVE-25261
> Project: Hive
> Issue Type: Bug
> Components: Standalone Metastore
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Major
>
> [RetryingMetaStoreClient|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L267-L276]
>  relies on the message of the MetaException to decide whether to retry a 
> failed operation. However, RetryingHMSHandler wraps only the message of the 
> underlying exception in the MetaException, so the client may be unable to 
> retry against other metastore instances.
> For example, given this exception:
> {code:java}
> Caused by: javax.jdo.JDOFatalUserException: Persistence Manager has been closed
>  at org.datanucleus.api.jdo.JDOPersistenceManager.assertIsOpen(JDOPersistenceManager.java:2235)
>  at org.datanucleus.api.jdo.JDOPersistenceManager.evictAll(JDOPersistenceManager.java:481)
>  at org.apache.hadoop.hive.metastore.ObjectStore.rollbackTransaction(ObjectStore.java:635)
>  at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:1415)
>  at sun.reflect.GeneratedMethodAccessor153.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
> {code}
> RetryingHMSHandler will throw a MetaException with the message 'Persistence 
> Manager has been closed', which does not match the recoverable pattern 
> defined in the client.
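
To make the failure mode concrete, here is a minimal Java sketch of a message-based retry check. The class name `RetryDecisionSketch`, the method `isRecoverable`, and the regular expression are assumptions made for illustration only; the real check is in the RetryingMetaStoreClient code linked above and may differ.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: not the actual RetryingMetaStoreClient code.
final class RetryDecisionSketch {
    // Assumed pattern: a message that names a JDO or Thrift transport/protocol
    // exception is treated as recoverable.
    private static final Pattern RECOVERABLE =
        Pattern.compile("(?s).*(JDO[a-zA-Z]*|TProtocol|TTransport)Exception.*");

    // Decides from the message text alone whether the operation may be retried.
    static boolean isRecoverable(String message) {
        return message != null && RECOVERABLE.matcher(message).matches();
    }

    public static void main(String[] args) {
        // Only the message of the cause was wrapped: the exception class name is lost.
        String messageOnly = "Persistence Manager has been closed";
        // The same message prefixed with a short description of the cause.
        String withCause =
            "javax.jdo.JDOFatalUserException: Persistence Manager has been closed";

        System.out.println(isRecoverable(messageOnly)); // false
        System.out.println(isRecoverable(withCause));   // true
    }
}
```

Under this assumed pattern, the bare message is not retried, while the same message prefixed with the class name of the cause is. That is the motivation for wrapping the MetaException with a short description of the target.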





[jira] [Assigned] (HIVE-25250) Fix TestHS2ImpersonationWithRemoteMS.testImpersonation

2021-06-16 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-25250:


Assignee: Ashish Sharma

> Fix TestHS2ImpersonationWithRemoteMS.testImpersonation
> ------------------------------------------------------
>
> Key: HIVE-25250
> URL: https://issues.apache.org/jira/browse/HIVE-25250
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Assignee: Ashish Sharma
> Priority: Major
>
> http://ci.hive.apache.org/job/hive-flaky-check/235/testReport/org.apache.hive.service/TestHS2ImpersonationWithRemoteMS/testImpersonation/





[jira] [Commented] (HIVE-25250) Fix TestHS2ImpersonationWithRemoteMS.testImpersonation

2021-06-16 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364682#comment-17364682
 ] 

Ashish Sharma commented on HIVE-25250:
--

https://github.com/apache/hive/pull/2404

> Fix TestHS2ImpersonationWithRemoteMS.testImpersonation
> ------------------------------------------------------
>
> Key: HIVE-25250
> URL: https://issues.apache.org/jira/browse/HIVE-25250
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Priority: Major
>
> http://ci.hive.apache.org/job/hive-flaky-check/235/testReport/org.apache.hive.service/TestHS2ImpersonationWithRemoteMS/testImpersonation/





[jira] [Commented] (HIVE-25250) Fix TestHS2ImpersonationWithRemoteMS.testImpersonation

2021-06-16 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17364675#comment-17364675
 ] 

Ashish Sharma commented on HIVE-25250:
--

[~kgyrtkirk] Can I pick up this ticket?

> Fix TestHS2ImpersonationWithRemoteMS.testImpersonation
> ------------------------------------------------------
>
> Key: HIVE-25250
> URL: https://issues.apache.org/jira/browse/HIVE-25250
> Project: Hive
> Issue Type: Bug
> Reporter: Zoltan Haindrich
> Priority: Major
>
> http://ci.hive.apache.org/job/hive-flaky-check/235/testReport/org.apache.hive.service/TestHS2ImpersonationWithRemoteMS/testImpersonation/





[jira] [Updated] (HIVE-24944) When the default engine of the hiveserver is MR and the tez engine is set by the client, the client TEZ progress log cannot be printed normally

2021-06-16 Thread ZhangQiDong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangQiDong updated HIVE-24944:
---
Attachment: HIVE-24944.002.patch

> When the default engine of the hiveserver is MR and the tez engine is set by 
> the client, the client TEZ progress log cannot be printed normally
> -----------------------------------------------------------------------------
>
> Key: HIVE-24944
> URL: https://issues.apache.org/jira/browse/HIVE-24944
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 3.1.0, 4.0.0
> Reporter: ZhangQiDong
> Assignee: ZhangQiDong
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24944.001.patch, HIVE-24944.002.patch
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> HiveServer is configured with MR as the default execution engine. When the 
> client sets hive.execution.engine=tez, the client does not print the Tez 
> progress log.





[jira] [Updated] (HIVE-24944) When the default engine of the hiveserver is MR and the tez engine is set by the client, the client TEZ progress log cannot be printed normally

2021-06-16 Thread ZhangQiDong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangQiDong updated HIVE-24944:
---
Attachment: (was: HIVE-24944.002.patch)

> When the default engine of the hiveserver is MR and the tez engine is set by 
> the client, the client TEZ progress log cannot be printed normally
> -----------------------------------------------------------------------------
>
> Key: HIVE-24944
> URL: https://issues.apache.org/jira/browse/HIVE-24944
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 3.1.0, 4.0.0
> Reporter: ZhangQiDong
> Assignee: ZhangQiDong
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24944.001.patch
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> HiveServer is configured with MR as the default execution engine. When the 
> client sets hive.execution.engine=tez, the client does not print the Tez 
> progress log.





[jira] [Updated] (HIVE-24944) When the default engine of the hiveserver is MR and the tez engine is set by the client, the client TEZ progress log cannot be printed normally

2021-06-16 Thread ZhangQiDong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZhangQiDong updated HIVE-24944:
---
Attachment: HIVE-24944.002.patch

> When the default engine of the hiveserver is MR and the tez engine is set by 
> the client, the client TEZ progress log cannot be printed normally
> -----------------------------------------------------------------------------
>
> Key: HIVE-24944
> URL: https://issues.apache.org/jira/browse/HIVE-24944
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 3.1.0, 4.0.0
> Reporter: ZhangQiDong
> Assignee: ZhangQiDong
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24944.001.patch, HIVE-24944.002.patch
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> HiveServer is configured with MR as the default execution engine. When the 
> client sets hive.execution.engine=tez, the client does not print the Tez 
> progress log.





[jira] [Updated] (HIVE-25260) Provide tableId for AllocWriteIdEvent

2021-06-16 Thread Yu-Wen Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu-Wen Lai updated HIVE-25260:
--
Summary: Provide tableId for AllocWriteIdEvent  (was: Provide tableId for 
ALLOC_WRITE_ID_EVENT)

> Provide tableId for AllocWriteIdEvent
> -------------------------------------
>
> Key: HIVE-25260
> URL: https://issues.apache.org/jira/browse/HIVE-25260
> Project: Hive
> Issue Type: Improvement
> Reporter: Yu-Wen Lai
> Assignee: Yu-Wen Lai
> Priority: Major
>
> For event-based incremental refreshing in an external cache, we need the 
> table ID to make sure we don't update the write ID on the wrong table.
> After this patch, we will check whether the table exists when allocating 
> write IDs. If the table doesn't exist, a MetaException will be thrown.
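
Why the table ID matters can be sketched in a few lines of Java. A cache keyed only by table name cannot tell a dropped-and-recreated table from the original one, whereas comparing IDs can. Every name below (`WriteIdCacheSketch`, `CachedTable`, `onAllocWriteId`) is hypothetical and not part of Hive's API; this is an illustration of the idea, not the patch itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an external cache consuming write-ID allocation events.
final class WriteIdCacheSketch {
    // Cached state for one table: its ID and the highest write ID seen so far.
    static final class CachedTable {
        final long tableId;
        long writeId;

        CachedTable(long tableId, long writeId) {
            this.tableId = tableId;
            this.writeId = writeId;
        }
    }

    private final Map<String, CachedTable> byName = new HashMap<>();

    void put(String name, long tableId, long writeId) {
        byName.put(name, new CachedTable(tableId, writeId));
    }

    // Applies the event only if it targets the table instance that is cached.
    // Returns true when the cached write ID was updated.
    boolean onAllocWriteId(String name, long eventTableId, long newWriteId) {
        CachedTable cached = byName.get(name);
        if (cached == null || cached.tableId != eventTableId) {
            return false; // unknown table, or same name but a different table
        }
        cached.writeId = Math.max(cached.writeId, newWriteId);
        return true;
    }

    long writeIdOf(String name) {
        return byName.get(name).writeId;
    }
}
```

With only the table name in the event, the second case in `onAllocWriteId` could not be detected, and the write ID of a recreated table would be applied to the stale cache entry.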





[jira] [Assigned] (HIVE-25260) Provide tableId for ALLOC_WRITE_ID_EVENT

2021-06-16 Thread Yu-Wen Lai (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu-Wen Lai reassigned HIVE-25260:
-


> Provide tableId for ALLOC_WRITE_ID_EVENT
> ----------------------------------------
>
> Key: HIVE-25260
> URL: https://issues.apache.org/jira/browse/HIVE-25260
> Project: Hive
> Issue Type: Improvement
> Reporter: Yu-Wen Lai
> Assignee: Yu-Wen Lai
> Priority: Major
>
> For event-based incremental refreshing in an external cache, we need the 
> table ID to make sure we don't update the write ID on the wrong table.
> After this patch, we will check whether the table exists when allocating 
> write IDs. If the table doesn't exist, a MetaException will be thrown.





[jira] [Updated] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25258:
-
Description: 
The query-based MINOR compaction uses the following sort order in its inner 
query: `bucket`, `originalTransaction`, `rowId`, as can be seen in the 
[code|https://github.com/apache/hive/blob/d0bbe76ad626244802d062b0a93a9f1cd4fc5f20/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionQueryBuilder.java#L474-L476].

However, the rows should be ordered by originalTransactionId, 
bucketProperty and rowId; otherwise the delete deltas cannot be applied 
correctly. This is also the order in which the MR MAJOR and MR MINOR 
compactions write the rows. 
The sort order used by the query-based MINOR compaction can lead to 
duplicated rows when the compaction runs after multiple merge statements. 
The issue can be reproduced, for example, by running the following queries:
{noformat}
CREATE TABLE transactions(id int,value string) STORED AS ORC TBLPROPERTIES 
('transactional'='true');
INSERT INTO transactions VALUES
(1, 'value_01'),(2, 'value_02'),(3, 'value_03'),(4, 'value_04'),(5, 
'value_05'),(6, 'value_06'),(7, 'value_07'),(8, 'value_08');


CREATE TABLE merge_source_1(ID int,value string) STORED AS ORC;
INSERT INTO merge_source_1 VALUES (1, 'newvalue_1'),(2, 'newvalue_2'),(4, 
'newvalue_4'),(6, 'newvalue_6'),(9, 'value_9'),(10, 'value_10'),(11, 
'value_11'),(12, 'value_12');

MERGE INTO transactions AS T USING merge_source_1 AS S ON T.ID = S.ID 
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value 
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);


CREATE TABLE merge_source_2(ID int, value string) STORED AS ORC;
INSERT INTO merge_source_2 VALUES
  (2, 'newestvalue_2'),(4, 'newestvalue_4'),(6, 'newestvalue_6'),(10, 
'newestvalue_10'),(11, 'newestvalue_11'),(13, 'value_13'),(14, 'value_14');

MERGE INTO transactions AS T 
USING merge_source_2 AS S
ON T.ID = S.ID
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);


ALTER TABLE transactions COMPACT 'MINOR';


CREATE TABLE merge_source_3(ID int, value string) STORED AS ORC;
INSERT INTO merge_source_3 VALUES
  (1, 'latestvalue_1'),(4, 'latestvalue_4'),(5, 'latestvalue_5'),(9, 
'latestvalue_9'),(11, 'latestvalue_11'),(13, 'latestvalue_13'),(15, 'value_15');

MERGE INTO transactions AS T 
USING merge_source_3 AS S
ON T.ID = S.ID
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);

ALTER TABLE transactions COMPACT 'MINOR';
{noformat}

Running a select after the second compaction finished will return duplicated 
rows:
{noformat}
select * from transactions order by id;

+------------------+---------------------+
| transactions.id  | transactions.value  |
+------------------+---------------------+
| 1                | newvalue_1          |
| 1                | latestvalue_1       |
| 2                | newestvalue_2       |
| 2                | newvalue_2          |
| 3                | value_03            |
| 4                | latestvalue_4       |
| 4                | newvalue_4          |
| 5                | latestvalue_5       |
| 6                | newvalue_6          |
| 6                | newestvalue_6       |
| 7                | value_07            |
| 8                | value_08            |
| 9                | latestvalue_9       |
| 10               | newestvalue_10      |
| 11               | latestvalue_11      |
| 12               | value_12            |
| 13               | latestvalue_13      |
| 14               | value_14            |
| 15               | value_15            |
+------------------+---------------------+
{noformat}

If the same queries are run with MR MINOR compaction, instead of the 
query-based MINOR compaction, the select will return the correct result:
{noformat}
+------------------+---------------------+
| transactions.id  | transactions.value  |
+------------------+---------------------+
| 1                | latestvalue_1       |
| 2                | newestvalue_2       |
| 3                | value_03            |
| 4                | latestvalue_4       |
| 5                | latestvalue_5       |
| 6                | newestvalue_6       |
| 7                | value_07            |
| 8                | value_08            |
| 9                | latestvalue_9       |
| 10               | newestvalue_10      |
| 11               | latestvalue_11      |
| 12               | value_12            |
| 13               | latestvalue_13      |
| 14               | value_14            |
| 15               | value_15            |
+------------------+---------------------+
{noformat}

The content of the bucket files in the delta and delete delta directories after 
the query-based and MR 

[jira] [Updated] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25258:
-
Description: 
The query-based MINOR compaction uses the following sort order in its inner 
query: `bucket`, `originalTransaction`, `rowId`, as can be seen in the 
[code|https://github.com/apache/hive/blob/d0bbe76ad626244802d062b0a93a9f1cd4fc5f20/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactionQueryBuilder.java#L474-L476].

However, the rows should be ordered by originalTransactionId, 
bucketProperty and rowId; otherwise the delete deltas cannot be applied 
correctly. This is also the order in which the MR MAJOR and MR MINOR 
compactions write the rows. 
The sort order used by the query-based MINOR compaction can lead to 
duplicated rows when the compaction runs after multiple merge statements. 
The issue can be reproduced, for example, by running the following queries:
{noformat}
CREATE TABLE transactions(id int,value string) STORED AS ORC TBLPROPERTIES 
('transactional'='true');
INSERT INTO transactions VALUES
(1, 'value_01'),(2, 'value_02'),(3, 'value_03'),(4, 'value_04'),(5, 
'value_05'),(6, 'value_06'),(7, 'value_07'),(8, 'value_08');


CREATE TABLE merge_source_1(ID int,value string) STORED AS ORC;
INSERT INTO merge_source_1 VALUES (1, 'newvalue_1'),(2, 'newvalue_2'),(4, 
'newvalue_4'),(6, 'newvalue_6'),(9, 'value_9'),(10, 'value_10'),(11, 
'value_11'),(12, 'value_12');

MERGE INTO transactions AS T USING merge_source_1 AS S ON T.ID = S.ID 
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value 
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);


CREATE TABLE merge_source_2(ID int, value string) STORED AS ORC;
INSERT INTO merge_source_2 VALUES
  (2, 'newestvalue_2'),(4, 'newestvalue_4'),(6, 'newestvalue_6'),(10, 
'newestvalue_10'),(11, 'newestvalue_11'),(13, 'value_13'),(14, 'value_14');

MERGE INTO transactions AS T 
USING merge_source_2 AS S
ON T.ID = S.ID
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);


ALTER TABLE transactions COMPACT 'MINOR';


CREATE TABLE merge_source_3(ID int, value string) STORED AS ORC;
INSERT INTO merge_source_3 VALUES
  (1, 'latestvalue_1'),(4, 'latestvalue_4'),(5, 'latestvalue_5'),(9, 
'latestvalue_9'),(11, 'latestvalue_11'),(13, 'latestvalue_13'),(15, 'value_15');

MERGE INTO transactions AS T 
USING merge_source_3 AS S
ON T.ID = S.ID
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);

ALTER TABLE transactions COMPACT 'MINOR';
{noformat}

Running a select after the second compaction finished will return duplicated 
rows:
{noformat}
select * from transactions order by id;

+------------------+---------------------+
| transactions.id  | transactions.value  |
+------------------+---------------------+
| 1                | newvalue_1          |
| 1                | latestvalue_1       |
| 2                | newestvalue_2       |
| 2                | newvalue_2          |
| 3                | value_03            |
| 4                | latestvalue_4       |
| 4                | newvalue_4          |
| 5                | latestvalue_5       |
| 6                | newvalue_6          |
| 6                | newestvalue_6       |
| 7                | value_07            |
| 8                | value_08            |
| 9                | latestvalue_9       |
| 10               | newestvalue_10      |
| 11               | latestvalue_11      |
| 12               | value_12            |
| 13               | latestvalue_13      |
| 14               | value_14            |
| 15               | value_15            |
+------------------+---------------------+
{noformat}

If the same queries are run with MR MINOR compaction, instead of the 
query-based MINOR compaction, the select will return the correct result:
{noformat}
+------------------+---------------------+
| transactions.id  | transactions.value  |
+------------------+---------------------+
| 1                | newvalue_1          |
| 2                | newestvalue_2       |
| 3                | value_03            |
| 4                | newestvalue_4       |
| 5                | value_05            |
| 6                | newestvalue_6       |
| 7                | value_07            |
| 8                | value_08            |
| 9                | value_9             |
| 10               | newestvalue_10      |
| 11               | newestvalue_11      |
| 12               | value_12            |
| 13               | value_13            |
| 14               | value_14            |
+------------------+---------------------+
{noformat}


  was:Details will be added soon.


> Incorrect row order after query-based MINOR compaction
> --
>

[jira] [Updated] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25258:
-
Fix Version/s: 4.0.0

> Incorrect row order after query-based MINOR compaction
> ------------------------------------------------------
>
> Key: HIVE-25258
> URL: https://issues.apache.org/jira/browse/HIVE-25258
> Project: Hive
> Issue Type: Bug
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>






[jira] [Updated] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25258:
-
Component/s: Transactions

> Incorrect row order after query-based MINOR compaction
> ------------------------------------------------------
>
> Key: HIVE-25258
> URL: https://issues.apache.org/jira/browse/HIVE-25258
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>






[jira] [Updated] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25258:
-
Description: Detail will be added soon.

> Incorrect row order after query-based MINOR compaction
> ------------------------------------------------------
>
> Key: HIVE-25258
> URL: https://issues.apache.org/jira/browse/HIVE-25258
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>
> Detail will be added soon.





[jira] [Updated] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25258:
-
Description: Details will be added soon.  (was: Detail will be added soon.)

> Incorrect row order after query-based MINOR compaction
> ------------------------------------------------------
>
> Key: HIVE-25258
> URL: https://issues.apache.org/jira/browse/HIVE-25258
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>
> Details will be added soon.





[jira] [Updated] (HIVE-25257) Incorrect row order validation for query-based MAJOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25257:
-
Component/s: Transactions

> Incorrect row order validation for query-based MAJOR compaction
> ---------------------------------------------------------------
>
> Key: HIVE-25257
> URL: https://issues.apache.org/jira/browse/HIVE-25257
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.0.0
>
> The insert query of the query-based MAJOR compaction contains this function 
> call: "validate_acid_sort_order(ROW__ID.writeId, ROW__ID.bucketId, 
> ROW__ID.rowId)".
> It validates that the rows are in the correct order. The validation is done 
> by the GenericUDFValidateAcidSortOrder class, which assumes that the rows are 
> in increasing order by bucketProperty, originalTransactionId and rowId. 
> However, the rows should be ordered by originalTransactionId, bucketProperty 
> and rowId; otherwise the delete deltas cannot be applied correctly. This is 
> also the order in which the MR MAJOR compaction writes the rows and in which 
> the split groups are created for the query-based MAJOR compaction. The 
> mismatch causes no problem as long as there is only one bucketProperty in the 
> files, but as soon as a file contains multiple bucketProperties, the 
> validation fails. This can be reproduced by running multiple merge statements 
> one after another.
> For example:
> {noformat}
> CREATE TABLE transactions (id int,value string) STORED AS ORC TBLPROPERTIES 
> ('transactional'='true');
> INSERT INTO transactions VALUES
> (1, 'value_1'),
> (2, 'value_2'),
> (3, 'value_3'),
> (4, 'value_4'),
> (5, 'value_5');
> CREATE TABLE merge_source_1(ID int,value string) STORED AS ORC;
> INSERT INTO merge_source_1 VALUES 
> (1, 'newvalue_1'),
> (2, 'newvalue_2'),
> (3, 'newvalue_3'),
> (6, 'value_6'),
> (7, 'value_7');
> MERGE INTO transactions AS T USING merge_source_1 AS S ON T.ID = S.ID 
> WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
> value = S.value 
> WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);
> CREATE TABLE merge_source_2(
>  ID int,
>  value string)
> STORED AS ORC;
> INSERT INTO merge_source_2 VALUES
> (1, 'newestvalue_1'),
> (2, 'newestvalue_2'),
> (5, 'newestvalue_5'),
> (7, 'newestvalue_7'),
> (8, 'value_18');
> MERGE INTO transactions AS T 
> USING merge_source_2 AS S
> ON T.ID = S.ID
> WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
> value = S.value
> WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);
> ALTER TABLE transactions COMPACT 'MAJOR';
> {noformat}
> The MAJOR compaction will fail with the following error:
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Wrong sort order 
> of Acid rows detected for the rows: 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@4d3ef25e
>  and 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@1c9df436
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder.evaluate(GenericUDFValidateAcidSortOrder.java:80)
> {noformat}
> So the validation doesn't check for the correct row order. The correct order 
> is originalTransactionId, bucketProperty, rowId.





[jira] [Updated] (HIVE-25257) Incorrect row order validation for query-based MAJOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25257:
-
Fix Version/s: 4.0.0

> Incorrect row order validation for query-based MAJOR compaction
> ---
>
> Key: HIVE-25257
> URL: https://issues.apache.org/jira/browse/HIVE-25257
> Project: Hive
>  Issue Type: Bug
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
> Fix For: 4.0.0
>
>





[jira] [Assigned] (HIVE-25258) Incorrect row order after query-based MINOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora reassigned HIVE-25258:



> Incorrect row order after query-based MINOR compaction
> --
>
> Key: HIVE-25258
> URL: https://issues.apache.org/jira/browse/HIVE-25258
> Project: Hive
>  Issue Type: Bug
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>






[jira] [Work started] (HIVE-25257) Incorrect row order validation for query-based MAJOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25257 started by Marta Kuczora.

> Incorrect row order validation for query-based MAJOR compaction
> ---
>
> Key: HIVE-25257
> URL: https://issues.apache.org/jira/browse/HIVE-25257
> Project: Hive
>  Issue Type: Bug
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>





[jira] [Updated] (HIVE-25257) Incorrect row order validation for query-based MAJOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora updated HIVE-25257:
-
Description: 
The insert query of the query-based MAJOR compaction contains this function 
call: "validate_acid_sort_order(ROW__ID.writeId, ROW__ID.bucketId, 
ROW__ID.rowId)".
Its purpose is to validate that the rows are in the correct order. The 
validation is done by the GenericUDFValidateAcidSortOrder class, which assumes 
that the rows are in increasing order by bucketProperty, originalTransactionId 
and rowId. 

However, the rows should actually be ordered by originalTransactionId, 
bucketProperty and rowId, otherwise the delete deltas cannot be applied 
correctly. This is also the order in which the MR MAJOR compaction writes the 
rows and in which the split groups are created for the query-based MAJOR 
compaction. The mismatch causes no issue as long as there is only one 
bucketProperty in the files, but as soon as there are multiple 
bucketProperties in the same file, the validation fails. This can be 
reproduced by running multiple MERGE statements one after another.
For example:
{noformat}
CREATE TABLE transactions (id int,value string) STORED AS ORC TBLPROPERTIES 
('transactional'='true');

INSERT INTO transactions VALUES
(1, 'value_1'),
(2, 'value_2'),
(3, 'value_3'),
(4, 'value_4'),
(5, 'value_5');

CREATE TABLE merge_source_1(ID int,value string) STORED AS ORC;
INSERT INTO merge_source_1 VALUES 
(1, 'newvalue_1'),
(2, 'newvalue_2'),
(3, 'newvalue_3'),
(6, 'value_6'),
(7, 'value_7');

MERGE INTO transactions AS T USING merge_source_1 AS S ON T.ID = S.ID 
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value 
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);

CREATE TABLE merge_source_2(
 ID int,
 value string)
STORED AS ORC;

INSERT INTO merge_source_2 VALUES
(1, 'newestvalue_1'),
(2, 'newestvalue_2'),
(5, 'newestvalue_5'),
(7, 'newestvalue_7'),
(8, 'value_18');

MERGE INTO transactions AS T 
USING merge_source_2 AS S
ON T.ID = S.ID
WHEN MATCHED AND (T.value != S.value AND S.value IS NOT NULL) THEN UPDATE SET 
value = S.value
WHEN NOT MATCHED THEN INSERT VALUES (S.ID, S.value);

ALTER TABLE transactions COMPACT 'MAJOR';
{noformat}
The MAJOR compaction will fail with the following error:
{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Wrong sort order 
of Acid rows detected for the rows: 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@4d3ef25e
 and 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder$WriteIdRowId@1c9df436
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFValidateAcidSortOrder.evaluate(GenericUDFValidateAcidSortOrder.java:80)
{noformat}
So the validation doesn't check for the correct row order. The correct order is 
originalTransactionId, bucketProperty, rowId.
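
The difference between the two orderings can be illustrated with a small, self-contained Java sketch. It is not Hive's GenericUDFValidateAcidSortOrder; the class AcidSortOrderSketch, the RowKey type and both comparators are hypothetical names made up for illustration, and the field names simply follow the description above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: a hypothetical stand-in, not Hive's validation UDF. */
final class AcidSortOrderSketch {

    /** Hypothetical row key; field names follow the issue description. */
    static final class RowKey {
        final long originalTransactionId;
        final int bucketProperty;
        final long rowId;

        RowKey(long originalTransactionId, int bucketProperty, long rowId) {
            this.originalTransactionId = originalTransactionId;
            this.bucketProperty = bucketProperty;
            this.rowId = rowId;
        }
    }

    /** The order the description calls correct. */
    static final Comparator<RowKey> EXPECTED_ORDER =
        Comparator.<RowKey>comparingLong(k -> k.originalTransactionId)
            .thenComparingInt(k -> k.bucketProperty)
            .thenComparingLong(k -> k.rowId);

    /** The order the description says the validation currently assumes. */
    static final Comparator<RowKey> BUCKET_FIRST_ORDER =
        Comparator.<RowKey>comparingInt(k -> k.bucketProperty)
            .thenComparingLong(k -> k.originalTransactionId)
            .thenComparingLong(k -> k.rowId);

    /** Returns true when no row compares greater than its successor. */
    static boolean isSorted(List<RowKey> rows, Comparator<RowKey> order) {
        for (int i = 1; i < rows.size(); i++) {
            if (order.compare(rows.get(i - 1), rows.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two bucketProperty values in one file, written in the expected order.
        List<RowKey> rows = Arrays.asList(
            new RowKey(1, 1, 0), new RowKey(1, 2, 0), new RowKey(2, 1, 0));
        System.out.println(isSorted(rows, EXPECTED_ORDER));
        System.out.println(isSorted(rows, BUCKET_FIRST_ORDER));
    }
}
```

Running main prints true and then false: the same rows pass a check keyed on originalTransactionId first but fail one keyed on bucketProperty first. With only a single bucketProperty value both checks agree, which matches the observation that the problem only appears once a file contains multiple bucketProperties.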

> Incorrect row order validation for query-based MAJOR compaction
> ---
>
> Key: HIVE-25257
> URL: https://issues.apache.org/jira/browse/HIVE-25257
> Project: Hive
>  Issue Type: Bug
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>

[jira] [Assigned] (HIVE-25257) Incorrect row order validation for query-based MAJOR compaction

2021-06-16 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora reassigned HIVE-25257:



> Incorrect row order validation for query-based MAJOR compaction
> ---
>
> Key: HIVE-25257
> URL: https://issues.apache.org/jira/browse/HIVE-25257
> Project: Hive
>  Issue Type: Bug
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>






[jira] [Commented] (HIVE-25254) Upgrade to tez 0.10.1

2021-06-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364232#comment-17364232
 ] 

László Bodor commented on HIVE-25254:
-

https://github.com/apache/hive/pull/2398

> Upgrade to tez 0.10.1
> -
>
> Key: HIVE-25254
> URL: https://issues.apache.org/jira/browse/HIVE-25254
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Assigned] (HIVE-25256) Support ALTER TABLE CHANGE COLUMN for Iceberg

2021-06-16 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod reassigned HIVE-25256:
-


> Support ALTER TABLE CHANGE COLUMN for Iceberg
> -
>
> Key: HIVE-25256
> URL: https://issues.apache.org/jira/browse/HIVE-25256
> Project: Hive
>  Issue Type: New Feature
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>
> To support renaming a single column or changing its data type, we should add 
> ALTER TABLE CHANGE COLUMN support for Iceberg tables.





[jira] [Assigned] (HIVE-25255) Support ALTER TABLE REPLACE COLUMNS for Iceberg

2021-06-16 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod reassigned HIVE-25255:
-


> Support ALTER TABLE REPLACE COLUMNS for Iceberg
> ---
>
> Key: HIVE-25255
> URL: https://issues.apache.org/jira/browse/HIVE-25255
> Project: Hive
>  Issue Type: New Feature
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>






[jira] [Work started] (HIVE-25254) Upgrade to tez 0.10.1

2021-06-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25254 started by László Bodor.
---
> Upgrade to tez 0.10.1
> -
>
> Key: HIVE-25254
> URL: https://issues.apache.org/jira/browse/HIVE-25254
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Assigned] (HIVE-25254) Upgrade to tez 0.10.1

2021-06-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-25254:
---

Assignee: László Bodor

> Upgrade to tez 0.10.1
> -
>
> Key: HIVE-25254
> URL: https://issues.apache.org/jira/browse/HIVE-25254
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Assigned] (HIVE-25253) Incremental rewrite of partitioned insert only materialized views

2021-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-25253:
-


> Incremental rewrite of partitioned insert only materialized views
> -
>
> Key: HIVE-25253
> URL: https://issues.apache.org/jira/browse/HIVE-25253
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
>






[jira] [Updated] (HIVE-25253) Incremental rebuild of partitioned insert only materialized views

2021-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-25253:
--
Summary: Incremental rebuild of partitioned insert only materialized views  
(was: Incremental rewrite of partitioned insert only materialized views)

> Incremental rebuild of partitioned insert only materialized views
> -
>
> Key: HIVE-25253
> URL: https://issues.apache.org/jira/browse/HIVE-25253
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
>






[jira] [Assigned] (HIVE-24991) Enable fetching deleted rows in vectorized mode

2021-06-16 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-24991:
-

Assignee: Krisztian Kasa

> Enable fetching deleted rows in vectorized mode
> ---
>
> Key: HIVE-24991
> URL: https://issues.apache.org/jira/browse/HIVE-24991
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> HIVE-24855 enables loading deleted rows from ORC tables when the table 
> property *acid.fetch.deleted.rows* is true.
> The goal of this Jira is to enable this feature in the vectorized ORC batch 
> reader.





[jira] [Updated] (HIVE-24804) Introduce check: RANGE with offset PRECEDING/FOLLOWING requires at least one ORDER BY column

2021-06-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24804:

Summary: Introduce check: RANGE with offset PRECEDING/FOLLOWING requires at 
least one ORDER BY column  (was: Introduce check: RANGE with offset 
PRECEDING/FOLLOWING requires exactly one ORDER BY column)

> Introduce check: RANGE with offset PRECEDING/FOLLOWING requires at least one 
> ORDER BY column
> 
>
> Key: HIVE-24804
> URL: https://issues.apache.org/jira/browse/HIVE-24804
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, in Hive, we can run a windowing function with a RANGE 
> specification but without an ORDER BY clause:
> {code}
> create table vector_ptf_part_simple_text(p_mfgr string, p_name string, 
> p_retailprice double, rowindex string);
> select p_mfgr, p_name, rowindex,
> count(*) over(partition by p_mfgr range between 1 preceding and current row) 
> as cs1,
> count(*) over(partition by p_mfgr range between 3 preceding and current row) 
> as cs2
> from vector_ptf_part_simple_text;
> {code}
> This is confusing: without an ORDER BY clause the range has no context, 
> because we don't know which column the range should be calculated on.
> The same query tested on Postgres throws an exception:
> {code}
> create table vector_ptf_part_simple_text(p_mfgr varchar(10), p_name 
> varchar(10), p_retailprice integer, rowindex varchar(10));
> select p_mfgr, p_name, rowindex,
> count(*) over(partition by p_mfgr range between 1 preceding and current row) 
> as cs1,
> count(*) over(partition by p_mfgr range between 3 preceding and current row) 
> as cs2
> from vector_ptf_part_simple_text;
> *RANGE with offset PRECEDING/FOLLOWING requires exactly one ORDER BY column*
> {code}
> further references:
> https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts
> {code}
> RANGE: Computes the window frame based on a logical range of rows around the 
> current row, based on the current row’s ORDER BY key value. The provided 
> range value is added or subtracted to the current row's key value to define a 
> starting or ending range boundary for the window frame. In a range-based 
> window frame, there must be exactly one expression in the ORDER BY clause, 
> and the expression must have a numeric type.
> {code}
> https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/window-functions-frames.html
> {code}
> Without ORDER BY: The default frame includes all partition rows (because, 
> without ORDER BY, all partition rows are peers). The default is equivalent to 
> this frame specification:
> RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> {code}
> I believe this default only makes sense if you don't specify a range; 
> otherwise the SQL statement expresses something different from what the 
> engine returns.
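
Why a RANGE frame with an offset needs an ordering key can be seen from a minimal Java sketch of the frame computation for a single partition. This is an illustration under stated assumptions, not Hive's windowing code: the class RangeFrameSketch and the method countInRange are hypothetical, and the sketch assumes an ascending numeric ORDER BY key and the frame "RANGE BETWEEN offset PRECEDING AND CURRENT ROW".

```java
/** Illustrative only: a hypothetical helper, not Hive's PTF implementation. */
final class RangeFrameSketch {

    /**
     * For each row of one partition, counts the rows whose ORDER BY key lies in
     * [key - offset, key]. This mirrors
     * count(*) over (order by key range between offset preceding and current row).
     */
    static int[] countInRange(double[] orderKeys, double offset) {
        int[] counts = new int[orderKeys.length];
        for (int i = 0; i < orderKeys.length; i++) {
            double low = orderKeys[i] - offset;   // frame start depends on the key value
            double high = orderKeys[i];           // frame end is the current row's key
            for (double other : orderKeys) {
                if (other >= low && other <= high) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] keys = {1, 2, 2, 5};
        System.out.println(java.util.Arrays.toString(countInRange(keys, 1)));
        System.out.println(java.util.Arrays.toString(countInRange(keys, 3)));
    }
}
```

The frame boundaries are computed from the key value of the current row, so the method cannot even be called without an orderKeys argument. That is the sense in which "1 PRECEDING" has nothing to refer to when the query has no ORDER BY column.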





[jira] [Work logged] (HIVE-25252) All new compaction metrics should be lower case

2021-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?focusedWorklogId=611762&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611762
 ]

ASF GitHub Bot logged work on HIVE-25252:
-

Author: ASF GitHub Bot
Created on: 16/Jun/21 07:16
Start Date: 16/Jun/21 07:16
Worklog Time Spent: 10m 
  Work Description: asinkovits opened a new pull request #2397:
URL: https://github.com/apache/hive/pull/2397


   
   
   
   ### What changes were proposed in this pull request?
   
   All newly introduced compaction related metrics should be lower case.
   
   ### Why are the changes needed?
   
   Some consumers of the metrics only accept lower case names.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Unit test


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 611762)
Remaining Estimate: 0h
Time Spent: 10m

> All new compaction metrics should be lower case
> ---
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> E.g:
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
> compaction_cleaner_cycle_MINOR -> compaction_cleaner_cycle_minor
> compaction_cleaner_cycle_MAJOR -> compaction_cleaner_cycle_major
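
One way such names could be produced is sketched below in Java. The class MetricNameSketch, the enum CompactionType and the method metricName are hypothetical and only illustrate the renaming listed above; they are not taken from the Hive code base or from the pull request.

```java
import java.util.Locale;

/** Illustrative only: hypothetical names, not Hive's metrics code. */
final class MetricNameSketch {

    /** Stand-in for the compaction type whose name ends up in the metric. */
    enum CompactionType { MINOR, MAJOR }

    /**
     * Builds a metric name and lower-cases it. Locale.ROOT keeps the result
     * independent of the JVM default locale (for example, a Turkish default
     * locale would otherwise turn "I" into a dotless "ı").
     */
    static String metricName(String prefix, CompactionType type) {
        return (prefix + "_" + type.name()).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(metricName("compaction_worker_cycle", CompactionType.MINOR));
        System.out.println(metricName("compaction_cleaner_cycle", CompactionType.MAJOR));
    }
}
```

Running main prints compaction_worker_cycle_minor and compaction_cleaner_cycle_major.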





[jira] [Updated] (HIVE-25252) All new compaction metrics should be lower case

2021-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25252:
--
Labels: pull-request-available  (was: )

> All new compaction metrics should be lower case
> ---
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> E.g:
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
> compaction_cleaner_cycle_MINOR -> compaction_cleaner_cycle_minor
> compaction_cleaner_cycle_MAJOR -> compaction_cleaner_cycle_major





[jira] [Work logged] (HIVE-25086) Create Ranger Deny Policy for replication db in all cases if hive.repl.ranger.target.deny.policy is set to true.

2021-06-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25086?focusedWorklogId=611760&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611760
 ]

ASF GitHub Bot logged work on HIVE-25086:
-

Author: ASF GitHub Bot
Created on: 16/Jun/21 07:11
Start Date: 16/Jun/21 07:11
Worklog Time Spent: 10m 
  Work Description: aasha commented on a change in pull request #2240:
URL: https://github.com/apache/hive/pull/2240#discussion_r652410662



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ranger/RangerRestClientImpl.java
##
@@ -79,32 +79,29 @@
   public RangerExportPolicyList exportRangerPolicies(String sourceRangerEndpoint,
                                                      String dbName,
                                                      String rangerHiveServiceName,
-                                                     HiveConf hiveConf)throws SemanticException {
+                                                     HiveConf hiveConf)throws Exception {
     LOG.info("Ranger endpoint for cluster " + sourceRangerEndpoint);
     if (StringUtils.isEmpty(rangerHiveServiceName)) {
       throw new SemanticException(ErrorMsg.REPL_INVALID_CONFIG_FOR_SERVICE.format("Ranger Service Name " +
         "cannot be empty", ReplUtils.REPL_RANGER_SERVICE));
     }
+    String finalUrl = getRangerExportUrl(sourceRangerEndpoint, rangerHiveServiceName, dbName);
+    LOG.debug("Url to export policies from source Ranger: {}", finalUrl);
     Retryable retryable = Retryable.builder()
       .withHiveConf(hiveConf).withFailOnException(RuntimeException.class)
-      .withRetryOnException(URISyntaxException.class).build();
+      .withRetryOnException(Exception.class).build();

Review comment:
   RuntimeException is a subclass of Exception. If you retry on Exception, 
RuntimeException will also get retried
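
The class relationship behind this comment can be checked with plain Java. The sketch below is hypothetical (the class RetryClassificationSketch and the method matches are made-up names) and does not reproduce the matching logic of Hive's Retryable, which the diff above does not show. It only demonstrates that a type-based match on Exception also matches every RuntimeException, whereas a match on URISyntaxException does not.

```java
import java.net.URISyntaxException;

/** Illustrative only: a hypothetical matcher, not Hive's Retryable. */
final class RetryClassificationSketch {

    /** True when the thrown object is an instance of the configured type or of a subclass. */
    static boolean matches(Class<? extends Throwable> configured, Throwable thrown) {
        return configured.isInstance(thrown);
    }

    public static void main(String[] args) {
        Throwable unchecked = new IllegalStateException("unchecked failure");
        Throwable checked = new URISyntaxException("bad uri", "illegal character");

        // RuntimeException extends Exception, so this prints true.
        System.out.println(Exception.class.isAssignableFrom(RuntimeException.class));

        // Old retry type: only the checked URISyntaxException matches.
        System.out.println(matches(URISyntaxException.class, checked));
        System.out.println(matches(URISyntaxException.class, unchecked));

        // New retry type: both the checked and the unchecked exception match.
        System.out.println(matches(Exception.class, checked));
        System.out.println(matches(Exception.class, unchecked));
    }
}
```

Running main prints true, true, false, true, true. With Exception as the retry type, an unchecked exception matches both the fail-on type (RuntimeException) and the retry type, so the outcome depends on which list the retry mechanism consults first.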






Issue Time Tracking
---

Worklog Id: (was: 611760)
Time Spent: 4h 10m  (was: 4h)

> Create Ranger Deny Policy for replication db in all cases if 
> hive.repl.ranger.target.deny.policy is set to true.
> 
>
> Key: HIVE-25086
> URL: https://issues.apache.org/jira/browse/HIVE-25086
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25252) All new compaction metrics should be lower case

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25252:
---
Summary: All new compaction metrics should be lower case  (was: All new 
compaction metrics should be lower cased)

> All new compaction metrics should be lower case
> ---
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> E.g:
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
> compaction_cleaner_cycle_MINOR -> compaction_cleaner_cycle_minor
> compaction_cleaner_cycle_MAJOR -> compaction_cleaner_cycle_major





[jira] [Updated] (HIVE-25252) All new compaction metrics should be lower cased

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25252:
---
Summary: All new compaction metrics should be lower cased  (was: All new 
metrics should be lower cased)

> All new compaction metrics should be lower cased
> 
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> E.g:
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
> compaction_cleaner_cycle_MINOR -> compaction_cleaner_cycle_minor
> compaction_cleaner_cycle_MAJOR -> compaction_cleaner_cycle_major





[jira] [Updated] (HIVE-25252) All new metrics should be lower cased

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25252:
---
Description: 
E.g:
compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
compaction_cleaner_cycle_MINOR -> compaction_cleaner_cycle_minor
compaction_cleaner_cycle_MAJOR -> compaction_cleaner_cycle_major

  was:
E.g:
compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major


> All new metrics should be lower cased
> -
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>
> E.g:
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major
> compaction_cleaner_cycle_MINOR -> compaction_cleaner_cycle_minor
> compaction_cleaner_cycle_MAJOR -> compaction_cleaner_cycle_major





[jira] [Work started] (HIVE-25252) All new metrics should be lower cased

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25252 started by Antal Sinkovits.
--
> All new metrics should be lower cased
> -
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
> Issue Type: Sub-task
> Reporter: Antal Sinkovits
> Assignee: Antal Sinkovits
> Priority: Major
>
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major





[jira] [Updated] (HIVE-25252) All new metrics should be lower cased

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25252:
---
Description: 
E.g:
compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major

  was:
compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major


> All new metrics should be lower cased
> -
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
> Issue Type: Sub-task
> Reporter: Antal Sinkovits
> Assignee: Antal Sinkovits
> Priority: Major
>
> E.g:
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major





[jira] [Updated] (HIVE-25252) All new metrics should be lower cased

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25252:
---
Parent: HIVE-24824
Issue Type: Sub-task  (was: Bug)

> All new metrics should be lower cased
> -
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
> Issue Type: Sub-task
> Reporter: Antal Sinkovits
> Assignee: Antal Sinkovits
> Priority: Major
>
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major





[jira] [Assigned] (HIVE-25252) All new mewLower case

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits reassigned HIVE-25252:
--


> All new mewLower case
> -
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
> Issue Type: Bug
> Reporter: Antal Sinkovits
> Assignee: Antal Sinkovits
> Priority: Major
>
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major





[jira] [Updated] (HIVE-25252) All new metrics should be lower cased

2021-06-16 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits updated HIVE-25252:
---
Summary: All new metrics should be lower cased  (was: All new mewLower case)

> All new metrics should be lower cased
> -
>
> Key: HIVE-25252
> URL: https://issues.apache.org/jira/browse/HIVE-25252
> Project: Hive
> Issue Type: Bug
> Reporter: Antal Sinkovits
> Assignee: Antal Sinkovits
> Priority: Major
>
> compaction_worker_cycle_MINOR -> compaction_worker_cycle_minor
> compaction_worker_cycle_MAJOR -> compaction_worker_cycle_major


