[
https://issues.apache.org/jira/browse/IMPALA-12356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17797138#comment-17797138
]
Venugopal Reddy K commented on IMPALA-12356:
--------------------------------------------
[~stigahuang] Agree with you on Fix #1. Will modify it accordingly.
I observed 3 other issues during testing. Will fix them in a separate Jira.
1. Create a partitioned table (either from Impala or Hive) and make sure the
table is loaded.
2. Alter the partition from Impala:
{code:sql}
alter table my_part partition(p=0) set location '/tmp2';{code}
We add the serviceId and version number to the partition params, update HMS,
and add the version to the partition's inFlightEvent. So the resulting event is
treated as a self event in event processing. {color:#FF0000}*The issue here is
that we are adding the version to the table's inFlightEvent as well (this is not
correct, as we do not receive an alter table event from HMS in this
case)*{color}. So the table's inFlightEvent has a dangling version number.
[https://github.com/apache/impala/blob/a6de494f24c47fbd679a037341ae0a34b9f696ff/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1303C1-L1305C78]
3. Now alter the partition from Hive beeline:
{noformat}
alter table my_part partition(p=0) set location '/tmp3';{noformat}
{color:#FF0000}*While processing this alter partition event, we add the version
to the partition's inFlightEvent (this is an issue).*{color}
4. Alter the partition from Hive beeline again:
{noformat}
alter table my_part partition(p=0) set location '/tmp4';{noformat}
Now this alter partition event is treated as a self event because of the
inFlightEvent issue in step 3.
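Steps 3 and 4 can be replayed with a small model. This is only a sketch under stated assumptions, not the actual event processing code: the class `SelfEventReplaySketch`, its members, and the version value 100 are hypothetical. It assumes that an alter from Hive keeps the existing partition params, so the event still carries the old version, that the serviceId already matches, and that a self event is detected by finding and removing that version in the partition's in-flight list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the partition-level self-event check.
// None of these names come from Impala's code base.
final class SelfEventReplaySketch {
  // Versions the partition is still waiting to see in an HMS event.
  private final List<Long> partitionInFlight = new ArrayList<>();
  // Version left in the partition params by an earlier Impala operation (assumed).
  private final long versionInParams;
  // true models the behaviour described in step 3, false models a fix.
  private final boolean addVersionWhileProcessing;

  SelfEventReplaySketch(long versionInParams, boolean addVersionWhileProcessing) {
    this.versionInParams = versionInParams;
    this.addVersionWhileProcessing = addVersionWhileProcessing;
  }

  /** Handles one ALTER_PARTITION event from Hive; returns true if it is skipped. */
  boolean skipsHiveAlterPartitionEvent() {
    // Self-event check: consume the version if it is in flight.
    if (partitionInFlight.remove(Long.valueOf(versionInParams))) {
      return true;
    }
    // The event is applied. Re-adding the version here is the problem in step 3.
    if (addVersionWhileProcessing) {
      partitionInFlight.add(versionInParams);
    }
    return false;
  }

  public static void main(String[] args) {
    SelfEventReplaySketch current = new SelfEventReplaySketch(100L, true);
    System.out.println("step 3 skipped: " + current.skipsHiveAlterPartitionEvent());
    System.out.println("step 4 skipped: " + current.skipsHiveAlterPartitionEvent());

    SelfEventReplaySketch fixed = new SelfEventReplaySketch(100L, false);
    System.out.println("step 3 skipped (fixed): " + fixed.skipsHiveAlterPartitionEvent());
    System.out.println("step 4 skipped (fixed): " + fixed.skipsHiveAlterPartitionEvent());
  }
}
```

In this model the step-3 event is applied but leaves the version in flight, so the step-4 event is wrongly skipped. When the version is not added during event processing, both events are applied.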
Add a new partition to the table with:
{noformat}
alter table my_part add partition (p=10);{noformat}
{color:#FF0000}*We are adding the version to the table's inFlightEvent (but we
do not receive an alter table event from HMS in this case). So the table's
inFlightEvent has a dangling version number.*{color}
[https://github.com/apache/impala/blob/a6de494f24c47fbd679a037341ae0a34b9f696ff/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1140C1-L1145C82]
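The dangling table version can be illustrated with a simplified model. This is a sketch only: the class `InFlightSketch`, its fields and methods, and the version value 100 are hypothetical and are not taken from CatalogOpExecutor. It assumes that a self event is detected by finding and removing the event's version in the matching in-flight list, and that HMS sends only a partition-level event for these operations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; these are not Impala's actual classes.
final class InFlightSketch {
  /** Versions registered by a local DDL that still wait for their HMS event. */
  static final class InFlightEvents {
    private final List<Long> versions = new ArrayList<>();

    void add(long version) {
      versions.add(version);
    }

    /** Removes the version if present; true means the event is a self event. */
    boolean consume(long version) {
      return versions.remove(Long.valueOf(version));
    }

    List<Long> pending() {
      return new ArrayList<>(versions);
    }
  }

  final InFlightEvents tableInFlight = new InFlightEvents();
  final InFlightEvents partitionInFlight = new InFlightEvents();

  /** Models a partition-level DDL issued from Impala. */
  void partitionDdlFromImpala(long version, boolean alsoRegisterOnTable) {
    partitionInFlight.add(version);
    if (alsoRegisterOnTable) {
      // Behaviour described in the comment: the table gets the version too.
      tableInFlight.add(version);
    }
  }

  /** Only a partition-level event arrives, so only the partition list is consumed. */
  boolean onPartitionEvent(long version) {
    return partitionInFlight.consume(version);
  }

  public static void main(String[] args) {
    InFlightSketch sketch = new InFlightSketch();
    sketch.partitionDdlFromImpala(100L, true);
    System.out.println("self event: " + sketch.onPartitionEvent(100L));
    System.out.println("partition pending: " + sketch.partitionInFlight.pending());
    System.out.println("table pending: " + sketch.tableInFlight.pending());
  }
}
```

After the partition event is consumed, the partition list is empty while the table list still holds the version, because no table-level event ever arrives to remove it.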
> Partition created by INSERT will make the next ALTER_PARTITION event on it
> always treated as self-event
> -------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-12356
> URL: https://issues.apache.org/jira/browse/IMPALA-12356
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Venugopal Reddy K
> Priority: Critical
> Labels: ramp-up
>
> In Impala, create a partitioned table and create one partition in it using
> {*}INSERT{*}:
> {noformat}
> create table my_part (i int) partitioned by (p int) stored as parquet;
> insert into my_part partition(p=0) values (0),(1),(2);
> show partitions my_part
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> | p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                          | EC Policy |
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> | 0     | -1    | 1      | 358B | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/my_part/p=0 | NONE      |
> | Total | -1    | 1      | 358B | 0B           |                   |         |                   |                                                   |           |
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> {noformat}
> In Hive, describe the partition. We can see parameters of
> "impala.events.catalogServiceId" and "impala.events.catalogVersion" added by
> Impala. This is ok.
> {noformat}
> hive> desc formatted my_part partition(p=0);
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> | col_name | data_type | comment |
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> | i | int | |
> | | NULL | NULL |
> | # Partition Information | NULL | NULL |
> | # col_name | data_type | comment |
> | p | int | |
> | | NULL | NULL |
> | # Detailed Partition Information | NULL | NULL |
> | Partition Value: | [0] | NULL |
> | Database: | default | NULL |
> | Table: | my_part | NULL |
> | CreateTime: | Wed Aug 09 15:24:50 CST 2023 | NULL |
> | LastAccessTime: | UNKNOWN | NULL |
> | Location: | hdfs://localhost:20500/test-warehouse/my_part/p=0 | NULL |
> | Partition Parameters: | NULL | NULL |
> | | impala.events.catalogServiceId | eab33ebb8a14cfd:8b2bdc12df3568df |
> | | impala.events.catalogVersion | 1882 |
> | | numFiles | 1 |
> | | totalSize | 358 |
> | | transient_lastDdlTime | 1691565890 |
> | | NULL | NULL |
> | # Storage Information | NULL | NULL |
> | SerDe Library: | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL |
> | InputFormat: | org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat | NULL |
> | OutputFormat: | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL |
> | Compressed: | No | NULL |
> | Num Buckets: | 0 | NULL |
> | Bucket Columns: | [] | NULL |
> | Sort Columns: | [] | NULL |
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> {noformat}
> Now run an ALTER statement on the partition in Hive, e.g. changing the
> location:
> {code:sql}
> alter table my_part partition(p=0) set location '/tmp';{code}
> Impala will skip the ALTER_PARTITION event since it's considered as a
> self-event. In catalogd logs:
> {noformat}
> I0809 15:30:19.628449 29844 MetastoreEvents.java:628] EventId: 8351549 EventType: ALTER_PARTITION Incremented events skipped counter to 12
> I0809 15:30:19.628616 29844 MetastoreEvents.java:628] EventId: 8351549 EventType: ALTER_PARTITION Not processing the event as it is a self-event{noformat}