[ 
https://issues.apache.org/jira/browse/IMPALA-12356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17797138#comment-17797138
 ] 

Venugopal Reddy K commented on IMPALA-12356:
--------------------------------------------

[~stigahuang] Agree with you on Fix #1. Will modify it accordingly.
 
I observed 3 other issues during testing and will fix them in a separate Jira. 
Steps to reproduce:
1. Create a partitioned table (from either Impala or Hive) and make sure the 
table is loaded.
2. Alter the partition from Impala:
{code:sql}
alter table my_part partition(p=0) set location '/tmp2';{code}
The serviceId and version number are added to the partition params, the update 
is sent to HMS, and the version is added to the partition's inFlightEvent, so 
the resulting event is treated as a self-event during event processing. 
{color:#FF0000}*The issue here is that we also add the version to the table's 
inFlightEvent, which is incorrect because we do not receive an alter table 
event from HMS in this case.*{color} As a result, the table's inFlightEvent is 
left with a dangling version number.
[https://github.com/apache/impala/blob/a6de494f24c47fbd679a037341ae0a34b9f696ff/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1303C1-L1305C78]
 

3. Now alter the partition from Hive beeline:
{noformat}
alter table my_part partition(p=0) set location '/tmp3';{noformat}
{color:#FF0000}*While processing this alter partition event, we add to the 
partition's inFlightEvent, which is an issue.*{color}
4. Alter the partition from Hive beeline again:
{noformat}
alter table my_part partition(p=0) set location '/tmp4';{noformat}
This alter partition event is now treated as a self-event because of the 
inFlightEvent issue in step 3.
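
For illustration, steps 3 and 4 can be modelled with a minimal sketch. It is not the catalogd code: the names SelfEventSketch, InFlight and processExternalAlter, the service id and the buggyReAdd flag are all hypothetical. It assumes that a self-event is detected by matching the service id and version carried in the partition params against a list of pending versions, and that the step-3 problem amounts to re-adding the version found in those params.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of self-event detection. All names are hypothetical,
// not Impala's actual classes.
class SelfEventSketch {
    static final String SERVICE_ID = "catalog-A"; // assumed id of this catalogd

    // Versions that a catalog object expects HMS to echo back in an event.
    static final class InFlight {
        private final List<Long> versions = new ArrayList<>();

        void add(long version) { versions.add(version); }

        // An event counts as a self-event if it carries our service id and a
        // version that is still pending; the matched version is consumed.
        boolean isSelfEvent(String serviceId, long version) {
            return SERVICE_ID.equals(serviceId)
                && versions.remove(Long.valueOf(version));
        }

        List<Long> pending() { return new ArrayList<>(versions); }
    }

    // Processes an ALTER_PARTITION event that originated outside Impala.
    // Returns true if the event was skipped as a self-event.
    // buggyReAdd models the assumed step-3 behaviour: the version found in
    // the partition params is put back into the partition's in-flight list.
    static boolean processExternalAlter(InFlight partition, long versionInParams,
                                        boolean buggyReAdd) {
        boolean skipped = partition.isSelfEvent(SERVICE_ID, versionInParams);
        if (!skipped && buggyReAdd) {
            partition.add(versionInParams);
        }
        return skipped;
    }

    public static void main(String[] args) {
        InFlight partition = new InFlight();
        long staleVersion = 1882L; // left in the params by an earlier Impala write

        System.out.println("step 3 skipped: "
            + processExternalAlter(partition, staleVersion, true));
        System.out.println("step 4 skipped: "
            + processExternalAlter(partition, staleVersion, true));
    }
}
```

With buggyReAdd set to true, the step-3 event is processed but leaves a pending version behind, so the step-4 event is skipped. With it set to false, neither event is skipped.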
 
 
Add a new partition to the table with:
{noformat}
alter table my_part add partition (p=10);{noformat}
{color:#FF0000}*We add the version to the table's inFlightEvent, but we do not 
receive an alter table event from HMS in this case. So the table's 
inFlightEvent has a dangling version number.*{color}

[https://github.com/apache/impala/blob/a6de494f24c47fbd679a037341ae0a34b9f696ff/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1140C1-L1145C82]
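
The dangling version number can be pictured with a small sketch. This is only a simplified model under assumptions, not the CatalogOpExecutor code: DanglingVersionSketch, InFlight, registerPartitionDdl and the alsoRegisterOnTable flag are hypothetical names, and the version 2001 is an arbitrary example value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model, not Impala's actual classes. All names are hypothetical.
class DanglingVersionSketch {
    // Pending versions that an object expects HMS to echo back in an event.
    static final class InFlight {
        private final List<Long> versions = new ArrayList<>();

        void add(long version) { versions.add(version); }

        // Consumes the version if it is pending; true means self-event.
        boolean consume(long version) {
            return versions.remove(Long.valueOf(version));
        }

        List<Long> pending() { return new ArrayList<>(versions); }
    }

    // Models a partition-level DDL issued from Impala. alsoRegisterOnTable
    // reproduces the reported behaviour of tracking the version on the
    // table as well as on the partition.
    static void registerPartitionDdl(InFlight table, InFlight partition,
                                     long version, boolean alsoRegisterOnTable) {
        partition.add(version);
        if (alsoRegisterOnTable) {
            table.add(version);
        }
    }

    public static void main(String[] args) {
        InFlight table = new InFlight();
        InFlight partition = new InFlight();
        long version = 2001L; // arbitrary example version

        registerPartitionDdl(table, partition, version, true);

        // HMS emits only a partition-level event, which the partition consumes.
        boolean selfEvent = partition.consume(version);

        System.out.println("partition event is self-event: " + selfEvent);
        System.out.println("partition pending: " + partition.pending());
        // No table-level event arrives, so nothing consumes the table's entry.
        System.out.println("table pending: " + table.pending());
    }
}
```

In this model the partition's list ends up empty while the table's list still holds 2001, because no table-level event arrives to consume it. Passing false for alsoRegisterOnTable leaves both lists empty.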

 

> Partition created by INSERT will make the next ALTER_PARTITION event on it 
> always treated as self-event
> -------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-12356
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12356
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Venugopal Reddy K
>            Priority: Critical
>              Labels: ramp-up
>
> In Impala, create a partitioned table and create one partition in it using 
> {*}INSERT{*}:
> {noformat}
> create table my_part (i int) partitioned by (p int) stored as parquet;
> insert into my_part partition(p=0) values (0),(1),(2);
> show partitions my_part
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> | p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                          | EC Policy |
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> | 0     | -1    | 1      | 358B | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/my_part/p=0 | NONE      |
> | Total | -1    | 1      | 358B | 0B           |                   |         |                   |                                                   |           |
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> {noformat}
> In Hive, describe the partition. We can see parameters of 
> "impala.events.catalogServiceId" and "impala.events.catalogVersion" added by 
> Impala. This is ok.
> {noformat}
> hive> desc formatted my_part partition(p=0);
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> |             col_name              |                     data_type                      |              comment              |
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> | i                                 | int                                                |                                   |
> |                                   | NULL                                               | NULL                              |
> | # Partition Information           | NULL                                               | NULL                              |
> | # col_name                        | data_type                                          | comment                           |
> | p                                 | int                                                |                                   |
> |                                   | NULL                                               | NULL                              |
> | # Detailed Partition Information  | NULL                                               | NULL                              |
> | Partition Value:                  | [0]                                                | NULL                              |
> | Database:                         | default                                            | NULL                              |
> | Table:                            | my_part                                            | NULL                              |
> | CreateTime:                       | Wed Aug 09 15:24:50 CST 2023                       | NULL                              |
> | LastAccessTime:                   | UNKNOWN                                            | NULL                              |
> | Location:                         | hdfs://localhost:20500/test-warehouse/my_part/p=0  | NULL                              |
> | Partition Parameters:             | NULL                                               | NULL                              |
> |                                   | impala.events.catalogServiceId                     | eab33ebb8a14cfd:8b2bdc12df3568df  |
> |                                   | impala.events.catalogVersion                       | 1882                              |
> |                                   | numFiles                                           | 1                                 |
> |                                   | totalSize                                          | 358                               |
> |                                   | transient_lastDdlTime                              | 1691565890                        |
> |                                   | NULL                                               | NULL                              |
> | # Storage Information             | NULL                                               | NULL                              |
> | SerDe Library:                    | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL                              |
> | InputFormat:                      | org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat | NULL                              |
> | OutputFormat:                     | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL                              |
> | Compressed:                       | No                                                 | NULL                              |
> | Num Buckets:                      | 0                                                  | NULL                              |
> | Bucket Columns:                   | []                                                 | NULL                              |
> | Sort Columns:                     | []                                                 | NULL                              |
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> {noformat}
> Now run an ALTER statement on the partition in Hive, e.g. changing the 
> location:
> {code:sql}
> alter table my_part partition(p=0) set location '/tmp';{code}
> Impala will skip the ALTER_PARTITION event since it's considered as a 
> self-event. In catalogd logs:
> {noformat}
> I0809 15:30:19.628449 29844 MetastoreEvents.java:628] EventId: 8351549 EventType: ALTER_PARTITION Incremented events skipped counter to 12
> I0809 15:30:19.628616 29844 MetastoreEvents.java:628] EventId: 8351549 EventType: ALTER_PARTITION Not processing the event as it is a self-event
> {noformat}



