Venugopal Reddy K created IMPALA-13141:
------------------------------------------
Summary: Partition transactional table is not updated on alter
partition when hms_event_incremental_refresh_transactional_table is disabled
Key: IMPALA-13141
URL: https://issues.apache.org/jira/browse/IMPALA-13141
Project: IMPALA
Issue Type: Bug
Reporter: Venugopal Reddy K
Partition transactional table is not updated on alter partition when
hms_event_incremental_refresh_transactional_table is disabled.
*Observations:*
1. In case of AlterPartitionEvent, this issue occurs when
hms_event_incremental_refresh_transactional_table is disabled.
2. In case of BatchPartitionEvent(when. more than 1
AlterPartitionEvent are batched together), this issue occurs without disabling
hms_event_incremental_refresh_transactional_table.
*Steps to reproduce:*
1. Create partitioned table and add some partitions from hive:
Note: This step can be from impala too.
{code:java}
0: jdbc:hive2://localhost:11050> create table s(i int, j int, p int);
0: jdbc:hive2://localhost:11050> insert into s values(1,10,100),(2,20,200);
{code}
{code:java}
0: jdbc:hive2://localhost:11050> create table test1(i int, j int) partitioned
by(p int) tblproperties ('transactional'='true',
'transactional_properties'='insert_only');
0: jdbc:hive2://localhost:11050> set hive.exec.dynamic.partition.mode=nonstrict;
0: jdbc:hive2://localhost:11050> insert into test partition(p) select * from s;
0: jdbc:hive2://localhost:11050> show partitions test;
+------------+
| partition |
+------------+
| p=100 |
| p=200 |
+------------+
0: jdbc:hive2://localhost:11050> desc formatted test partition(p=100);
+-----------------------------------+----------------------------------------------------+-----------------------+
| col_name | data_type
| comment |
+-----------------------------------+----------------------------------------------------+-----------------------+
| i | int
| |
| j | int
| |
| | NULL
| NULL |
| # Partition Information | NULL
| NULL |
| # col_name | data_type
| comment |
| p | int
| |
| | NULL
| NULL |
| # Detailed Partition Information | NULL
| NULL |
| Partition Value: | [100]
| NULL |
| Database: | default
| NULL |
| Table: | test
| NULL |
| CreateTime: | Fri Jun 07 14:21:17 IST 2024
| NULL |
| LastAccessTime: | UNKNOWN
| NULL |
| Location: |
hdfs://localhost:20500/test-warehouse/managed/test/p=100 | NULL
|
| Partition Parameters: | NULL
| NULL |
| | numFiles
| 1 |
| | totalSize
| 5 |
| | transient_lastDdlTime
| 1717750277 |
| | NULL
| NULL |
| # Storage Information | NULL
| NULL |
| SerDe Library: |
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL |
| InputFormat: | org.apache.hadoop.mapred.TextInputFormat
| NULL |
| OutputFormat: |
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL
|
| Compressed: | No
| NULL |
| Num Buckets: | -1
| NULL |
| Bucket Columns: | []
| NULL |
| Sort Columns: | []
| NULL |
| Storage Desc Params: | NULL
| NULL |
| | serialization.format
| 1 |
+-----------------------------------+----------------------------------------------------+-----------------------+
0: jdbc:hive2://localhost:11050>
{code}
2. Check for the partitions and its locations from impala shell:
{code:java}
[localhost:21050] default> show partitions test;
Query: show partitions test
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+----------------------------------------------------------+-----------+
| p | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format |
Incremental stats | Location |
EC Policy |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+----------------------------------------------------------+-----------+
| 100 | -1 | 1 | 5B | NOT CACHED | NOT CACHED | TEXT |
false | hdfs://localhost:20500/test-warehouse/managed/test/p=100 |
NONE |
| 200 | -1 | 1 | 5B | NOT CACHED | NOT CACHED | TEXT |
false | hdfs://localhost:20500/test-warehouse/managed/test/p=200 |
NONE |
| Total | -1 | 2 | 10B | 0B | | |
| |
|
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+----------------------------------------------------------+-----------+
Fetched 3 row(s) in 4.43s
[localhost:21050] default>
[localhost:21050] default> select * from test;
Query: select * from test
Query submitted at: 2024-06-07 14:28:13 (Coordinator:
http://kvenureddy-OptiPlex-7000:25000)
Query state can be monitored at:
http://kvenureddy-OptiPlex-7000:25000/query_plan?query_id=024dff2843ae25bf:6d5792d100000000
+---+----+-----+
| i | j | p |
+---+----+-----+
| 2 | 20 | 200 |
| 1 | 10 | 100 |
+---+----+-----+
Fetched 2 row(s) in 0.34s
[localhost:21050] default> select * from test where p=100;
Query: select * from test where p=100
Query submitted at: 2024-06-07 14:28:19 (Coordinator:
http://kvenureddy-OptiPlex-7000:25000)
Query state can be monitored at:
http://kvenureddy-OptiPlex-7000:25000/query_plan?query_id=fa42921d722ecc8a:6e7e545400000000
+---+----+-----+
| i | j | p |
+---+----+-----+
| 1 | 10 | 100 |
+---+----+-----+
Fetched 1 row(s) in 0.11s
[localhost:21050] default>
{code}
3. Alter partition location from hive:
{code:java}
0: jdbc:hive2://localhost:11050> alter table test1 partition(p=100) set
location '/dummylocation100';
0: jdbc:hive2://localhost:11050> desc formatted test partition(p=100);
+-----------------------------------+----------------------------------------------------+-----------------------+
| col_name | data_type
| comment |
+-----------------------------------+----------------------------------------------------+-----------------------+
| i | int
| |
| j | int
| |
| | NULL
| NULL |
| # Partition Information | NULL
| NULL |
| # col_name | data_type
| comment |
| p | int
| |
| | NULL
| NULL |
| # Detailed Partition Information | NULL
| NULL |
| Partition Value: | [100]
| NULL |
| Database: | default
| NULL |
| Table: | test
| NULL |
| CreateTime: | Fri Jun 07 14:21:17 IST 2024
| NULL |
| LastAccessTime: | UNKNOWN
| NULL |
| Location: | hdfs://localhost:20500/dummylocation100
| NULL |
| Partition Parameters: | NULL
| NULL |
| | last_modified_by
| kvenureddy |
| | last_modified_time
| 1717750893 |
| | numFiles
| 1 |
| | totalSize
| 5 |
| | transient_lastDdlTime
| 1717750893 |
| | NULL
| NULL |
| # Storage Information | NULL
| NULL |
| SerDe Library: |
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL |
| InputFormat: | org.apache.hadoop.mapred.TextInputFormat
| NULL |
| OutputFormat: |
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL
|
| Compressed: | No
| NULL |
| Num Buckets: | -1
| NULL |
| Bucket Columns: | []
| NULL |
| Sort Columns: | []
| NULL |
| Storage Desc Params: | NULL
| NULL |
| | serialization.format
| 1 |
+-----------------------------------+----------------------------------------------------+-----------------------+
0: jdbc:hive2://localhost:11050> {code}
4. Check for the partition location from impala shell. It is not updated.
{code:java}
[localhost:21050] default> show partitions test;
Query: show partitions test
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+----------------------------------------------------------+-----------+
| p | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format |
Incremental stats | Location |
EC Policy |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+----------------------------------------------------------+-----------+
| 100 | -1 | 1 | 5B | NOT CACHED | NOT CACHED | TEXT |
false | hdfs://localhost:20500/test-warehouse/managed/test/p=100 |
NONE |
| 200 | -1 | 1 | 5B | NOT CACHED | NOT CACHED | TEXT |
false | hdfs://localhost:20500/test-warehouse/managed/test/p=200 |
NONE |
| Total | -1 | 2 | 10B | 0B | | |
| |
|
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+----------------------------------------------------------+-----------+
Fetched 3 row(s) in 0.01s
[localhost:21050] default>
[localhost:21050] default> select * from test where p=100;
Query: select * from test where p=100
Query submitted at: 2024-06-07 14:35:06 (Coordinator:
http://kvenureddy-OptiPlex-7000:25000)
Query state can be monitored at:
http://kvenureddy-OptiPlex-7000:25000/query_plan?query_id=7042fda005972a2b:a82fd60700000000
+---+----+-----+
| i | j | p |
+---+----+-----+
| 1 | 10 | 100 |
+---+----+-----+
Fetched 1 row(s) in 0.11s
[localhost:21050] default>
{code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)