Quanlong Huang created IMPALA-13691:
---------------------------------------
Summary: Processing INSERT event failed by partition values mismatch
Key: IMPALA-13691
URL: https://issues.apache.org/jira/browse/IMPALA-13691
Project: IMPALA
Issue Type: Bug
Components: Catalog
Reporter: Quanlong Huang
Assignee: Quanlong Huang
Create a partitioned table:
{code:sql}
create external table test_part (i int) partitioned by (s string);{code}
Add the following partition folders inside the table location:
{code:bash}
TBL_DIR=hdfs://localhost:20500/test-warehouse/test_part
hdfs dfs -mkdir "$TBL_DIR/s=2024-09-09 00%25253A00%25253A00"
hdfs dfs -mkdir "$TBL_DIR/s=2024-09-09 00%253A00%253A00"
hdfs dfs -mkdir "$TBL_DIR/s=2024-09-09 00%3A00%3A00"
hdfs dfs -mkdir "$TBL_DIR/s=2024-09-10 00%25253A00%25253A00"
hdfs dfs -mkdir "$TBL_DIR/s=2024-09-10 00%253A00%253A00"
hdfs dfs -mkdir "$TBL_DIR/s=2024-09-10 00%3A00%3A00"
hdfs dfs -mkdir "$TBL_DIR/s=2025-01-21 00%253A00%253A00"
hdfs dfs -mkdir "$TBL_DIR/s=2025-01-21 00%3A00%3A00"
hdfs dfs -mkdir "$TBL_DIR/s=2025-01-22 00%3A00%3A00"{code}
In Impala, create the partitions with ALTER TABLE RECOVER PARTITIONS:
{code:sql}
impala> alter table test_part recover partitions;{code}
The partition values are inconsistent with the partition folders (each value is the folder name with one level of percent-encoding removed):
{noformat}
Query: show partitions test_part
+-----------------------------+-------+--------+--------+--------------+-------------------+---------+-------------------+-----------------------------------------------------------------------------------+-----------+
| s                           | #Rows | #Files | Size   | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                                                          | EC Policy |
+-----------------------------+-------+--------+--------+--------------+-------------------+---------+-------------------+-----------------------------------------------------------------------------------+-----------+
| 2024-09-09 00%253A00%253A00 | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2024-09-09 00%25253A00%25253A00 | NONE      |
| 2024-09-09 00%3A00%3A00     | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2024-09-09 00%253A00%253A00     | NONE      |
| 2024-09-09 00:00:00         | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2024-09-09 00%3A00%3A00         | NONE      |
| 2024-09-10 00%253A00%253A00 | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2024-09-10 00%25253A00%25253A00 | NONE      |
| 2024-09-10 00%3A00%3A00     | -1    | 4      | 1.70KB | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2024-09-10 00%253A00%253A00     | NONE      |
| 2024-09-10 00:00:00         | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2024-09-10 00%3A00%3A00         | NONE      |
| 2025-01-21 00%3A00%3A00     | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2025-01-21 00%253A00%253A00     | NONE      |
| 2025-01-21 00:00:00         | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2025-01-21 00%3A00%3A00         | NONE      |
| 2025-01-22 00:00:00         | -1    | 0      | 0B     | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/test_part/s=2025-01-22 00%3A00%3A00         | NONE      |
| Total                       | -1    | 4      | 1.70KB | 0B           |                   |         |                   |                                                                                   |           |
+-----------------------------+-------+--------+--------+--------------+-------------------+---------+-------------------+-----------------------------------------------------------------------------------+-----------+{noformat}
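Comparing the s and Location columns, each partition value looks like its folder name with exactly one level of percent-decoding applied. The relationship can be sketched in bash as follows. Note that `unescape_once` is a hypothetical helper written for illustration only, not the code Impala or Hive actually runs, and it assumes the input contains no backslashes and that every `%` starts a two-digit hex escape.

```bash
# Hypothetical helper: apply ONE level of percent-decoding to a string.
# Assumes no backslashes in the input and that every '%' starts a %XX escape.
unescape_once() {
  local s="$1"
  # Rewrite each '%' as '\x' so printf %b decodes the two hex digits after it.
  printf '%b\n' "${s//%/\\x}"
}

# Folder name suffixes taken from the reproduction steps.
for dir in \
  "2024-09-09 00%25253A00%25253A00" \
  "2024-09-09 00%253A00%253A00" \
  "2024-09-09 00%3A00%3A00"; do
  echo "folder s=$dir -> value $(unescape_once "$dir")"
done
```

For the three 2024-09-09 folders, the decoded strings match the values listed by SHOW PARTITIONS.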
Insert into one partition in Hive:
{code:sql}
hive> insert into test_part partition(s="2024-09-10 00%3A00%3A00") values (0);{code}
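To see which folder this INSERT is likely to touch, it helps to apply the encoding in the other direction. The sketch below assumes that Hive percent-encodes `%` and `:` when it builds a partition directory name. That is a simplification, since the real escaping rules cover more characters, and `escape_once` is a hypothetical helper, not a Hive API.

```bash
# Hypothetical helper: percent-encode only '%' and ':' in a partition value.
# This is a simplified stand-in for Hive's path escaping, for illustration only.
escape_once() {
  local s="$1"
  s="${s//%/%25}"   # escape '%' first so the escapes added below stay intact
  s="${s//:/%3A}"
  printf '%s\n' "$s"
}

# Partition value used in the Hive INSERT statement.
echo "s=$(escape_once "2024-09-10 00%3A00%3A00")"
```

Under that assumption the value maps to the folder `s=2024-09-10 00%253A00%253A00`, the same location that SHOW PARTITIONS reports for the partition value `2024-09-10 00%3A00%3A00`.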
The EventProcessor in catalogd failed to process the INSERT event:
{noformat}
E0124 12:37:52.303791 1926240 MetastoreEventsProcessor.java:1098] Unexpected exception received while processing event
Java exception follows:
java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:129)
    at org.apache.impala.catalog.HdfsTable.reloadPartitions(HdfsTable.java:3054)
    at org.apache.impala.catalog.HdfsTable.reloadPartitionsFromNames(HdfsTable.java:2946)
    at org.apache.impala.service.CatalogOpExecutor.reloadPartitionsFromNamesIfExists(CatalogOpExecutor.java:5092)
    at org.apache.impala.service.CatalogOpExecutor.reloadPartitionsIfExist(CatalogOpExecutor.java:5021)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadPartitions(MetastoreEvents.java:1112)
    at org.apache.impala.catalog.events.MetastoreEvents$InsertEvent.processPartitionInserts(MetastoreEvents.java:1671)
    at org.apache.impala.catalog.events.MetastoreEvents$InsertEvent.processTableEvent(MetastoreEvents.java:1653)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.process(MetastoreEvents.java:1339)
    at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:701)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1336)
    at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:1079)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
E0124 12:37:52.306558 1926240 MetastoreEventsProcessor.java:1436] Event id: 38879
Event Type: INSERT
Event time: 1737693396
Database name: default
Table name: test_part
Event message:
H4sIAAAAAAAAAO1WW2+bMBj9KxPV3ggxJoUQaQ9pS7VM3VqlTJu0TJELTvHkYGqbTl2V/z5fSBsovajawx6ah9h85/D5u/mIW0dgfo25M3FkwclKToZDyjJECybkJAbjwHENhWT4jJMyIxWiiqys+YVac7xCNZXqUaILirUbLOSyQvzOlt5U2p58T5P5l+nJMp0enCRb8PTi1yfBSoXfLhx/4UzUIiRXm8W9p4WzcRcObKPNyRYL2thVjUrKyksLjixIAu3Bj4IojAM/ijW0vwsBbQkfWCJr4TizmyZKKtTZkx8N4PruwwRIb+Ck1EFvfvZARb4SrQZAsA/AUBdi8BtxXLBa4GGnLp3cGb/0UIWyAnsFyhmrvIJcY++KeoR56q2rGkvvM6o4zs/s06ysannM+BrJVsFe7/G0lh2XTaHl6uV1hq+IQk1qjr0mio8KP8f8CLfqtEaV7Ztx7G5X4N6qlmw0cdxpcEMwDYt7m28xH7zgBM3zn5mo3QNhBzObButmZLGejHzYKn9vlo+PsegdY7USc8M2u4V5JPdAu90qgHk9nX9NDEFyVAqCS7mkSMijnKZkjQ3l/qoa4unBp8Pp2fRgdjJLZ8m5oSiX82R65Kr123yWKo9NiTvBtsXH5uM/mEk/6lxHNUANd9zSEGNqMibhaPs+DHdNMBzDzUYJXSXLPpnrtNvXDgAcDUA88ME7AN4H0+Zv4fSJxVMC2JGIXgWM4ZMK+B/q3VB8aFcI7psa2eVNDd/U8NVqGD0Z7EgP+7NCBcehVTQmET0nfyw4joCxlvX6mFAsEo5EzfEhy3FuCG3YmOBWx+JHBQnsSs3AN0IjVUBConV1d8mDOHRVZSuKMv0NtkJUYEVc6ZNUnv/2Ag6B+S3BMmPVzTLY29tTwvUXNRWTVGIKAAA=
W0124 12:37:52.306648 1926240 MetastoreEventsProcessor.java:1067] Event processing is skipped since status is ERROR. Last synced event id is 38878{noformat}
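The event message in the log appears to be a base64-encoded gzip stream (`H4sI` is the base64 form of the gzip magic bytes), so it can be inspected from the shell. A minimal sketch, assuming GNU `base64` and `gzip` are installed; `decode_event_message` is a hypothetical helper name, and the sample payload below is made up for the round trip and is not the content of the real event.

```bash
# Hypothetical helper: decode a base64-encoded, gzip-compressed event message.
decode_event_message() {
  printf '%s' "$1" | base64 -d | gzip -dc
}

# Round-trip demo with a made-up payload (NOT the real event content).
sample=$(printf '%s' '{"demo":"payload"}' | gzip -c | base64 | tr -d '\n')
decode_event_message "$sample"
echo
```

To inspect the failing event, pass the string that follows "Event message:" in the catalogd log to `decode_event_message`.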
Note that to reproduce the issue after IMPALA-12832, you need to launch
catalogd with "--invalidate_metadata_on_event_processing_failure=false".
CC [~hemanth619], [~VenuReddy]