[ 
https://issues.apache.org/jira/browse/IMPALA-12356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17797138#comment-17797138
 ] 

Venugopal Reddy K edited comment on IMPALA-12356 at 12/15/23 10:57 AM:
-----------------------------------------------------------------------

[~stigahuang] I agree with you on Fix #1 and will modify it accordingly.
 
I observed 3 other issues during testing and will fix them in a separate Jira.
1. Create a partitioned table (from either Impala or Hive) and make sure the 
table is loaded.
2. Alter the partition from Impala:
{code:sql}
alter table my_part partition(p=0) set location '/tmp2';{code}
The serviceId and version number are added to the partition params, the update 
is sent to HMS, and the version is added to the partition's inFlightEvent, so 
the resulting event is treated as a self-event during event processing. *The 
issue here is that we also add the version to the table's inFlightEvent. This 
is incorrect because we do not receive an ALTER_TABLE event from HMS in this 
case, so the table's inFlightEvent is left with a dangling version number.*
[https://github.com/apache/impala/blob/a6de494f24c47fbd679a037341ae0a34b9f696ff/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1303C1-L1305C78]
 

3. Now alter the partition from Hive beeline:
{noformat}
alter table my_part partition(p=0) set location '/tmp3';{noformat}
*While processing this ALTER_PARTITION event, we add an entry to the 
partition's inFlightEvent, which is an issue.*
4. Alter the partition from Hive beeline again:
{noformat}
alter table my_part partition(p=0) set location '/tmp4';{noformat}
This ALTER_PARTITION event is now treated as a self-event because of the 
inFlightEvent issue in step 3.
 
 
Similarly, when adding a new partition to the table with
{noformat}
alter table my_part add partition (p=10);{noformat}
*we add the version to the table's inFlightEvent, but we do not receive an 
ALTER_TABLE event from HMS in this case, so the table's inFlightEvent is left 
with a dangling version number.*

[https://github.com/apache/impala/blob/a6de494f24c47fbd679a037341ae0a34b9f696ff/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1140C1-L1145C82]
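
To make the sequence in steps 2-4 easier to follow, here is a minimal Python sketch of the self-event bookkeeping. It is a simplified model, not Impala's actual code: the class {{InFlightVersions}}, the functions {{is_self_event}} and {{demo}}, the service id "svc-1" and the version 100 are all made up for illustration. It also assumes that the parameters Impala wrote on the partition are still carried by later events from Hive, and that the entry added in step 3 matches the version carried by the next event.

```python
from dataclasses import dataclass, field

SERVICE_ID_KEY = "impala.events.catalogServiceId"
VERSION_KEY = "impala.events.catalogVersion"


@dataclass
class InFlightVersions:
    """Hypothetical stand-in for the in-flight version list of a table or partition."""
    versions: list = field(default_factory=list)

    def add(self, version):
        self.versions.append(version)

    def remove_if_present(self, version):
        # Consume one matching entry, if any.
        if version in self.versions:
            self.versions.remove(version)
            return True
        return False


def is_self_event(params, my_service_id, in_flight):
    """Simplified check: same service id and the version is still in flight."""
    if params.get(SERVICE_ID_KEY) != my_service_id:
        return False
    version = params.get(VERSION_KEY)
    if version is None:
        return False
    return in_flight.remove_if_present(int(version))


def demo():
    service_id = "svc-1"  # made-up value
    table_in_flight = InFlightVersions()
    part_in_flight = InFlightVersions()
    # Assumption: params written by Impala stay on the partition, so events
    # generated by later Hive operations carry them as well.
    part_params = {SERVICE_ID_KEY: service_id, VERSION_KEY: "100"}

    # Step 2: Impala alters the partition and registers the version on both lists.
    part_in_flight.add(100)
    table_in_flight.add(100)  # never consumed: no alter table event arrives
    step2 = is_self_event(part_params, service_id, part_in_flight)

    # Step 3: Hive alters the partition. The partition list is empty, so the
    # event is processed as an external one.
    step3 = is_self_event(part_params, service_id, part_in_flight)
    # Modelled problem: processing that event adds an entry again.
    part_in_flight.add(100)

    # Step 4: the next Hive alter now matches the stale entry.
    step4 = is_self_event(part_params, service_id, part_in_flight)
    return step2, step3, step4, table_in_flight.versions
```

In this model, {{demo()}} reports step 2 as a self-event, step 3 as external, step 4 as a self-event (the misclassification), and leaves version 100 dangling in the table's list.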

 



> Partition created by INSERT will make the next ALTER_PARTITION event on it 
> always treated as self-event
> -------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-12356
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12356
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Venugopal Reddy K
>            Priority: Critical
>              Labels: ramp-up
>
> In Impala, create a partitioned table and create one partition in it using 
> {*}INSERT{*}:
> {noformat}
> create table my_part (i int) partitioned by (p int) stored as parquet;
> insert into my_part partition(p=0) values (0),(1),(2);
> show partitions my_part
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> | p     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format  | Incremental stats | Location                                          | EC Policy |
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> | 0     | -1    | 1      | 358B | NOT CACHED   | NOT CACHED        | PARQUET | false             | hdfs://localhost:20500/test-warehouse/my_part/p=0 | NONE      |
> | Total | -1    | 1      | 358B | 0B           |                   |         |                   |                                                   |           |
> +-------+-------+--------+------+--------------+-------------------+---------+-------------------+---------------------------------------------------+-----------+
> {noformat}
> In Hive, describe the partition. We can see parameters of 
> "impala.events.catalogServiceId" and "impala.events.catalogVersion" added by 
> Impala. This is ok.
> {noformat}
> hive> desc formatted my_part partition(p=0);
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> |             col_name              |                     data_type                      |              comment              |
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> | i                                 | int                                                |                                   |
> |                                   | NULL                                               | NULL                              |
> | # Partition Information           | NULL                                               | NULL                              |
> | # col_name                        | data_type                                          | comment                           |
> | p                                 | int                                                |                                   |
> |                                   | NULL                                               | NULL                              |
> | # Detailed Partition Information  | NULL                                               | NULL                              |
> | Partition Value:                  | [0]                                                | NULL                              |
> | Database:                         | default                                            | NULL                              |
> | Table:                            | my_part                                            | NULL                              |
> | CreateTime:                       | Wed Aug 09 15:24:50 CST 2023                       | NULL                              |
> | LastAccessTime:                   | UNKNOWN                                            | NULL                              |
> | Location:                         | hdfs://localhost:20500/test-warehouse/my_part/p=0  | NULL                              |
> | Partition Parameters:             | NULL                                               | NULL                              |
> |                                   | impala.events.catalogServiceId                     | eab33ebb8a14cfd:8b2bdc12df3568df  |
> |                                   | impala.events.catalogVersion                       | 1882                              |
> |                                   | numFiles                                           | 1                                 |
> |                                   | totalSize                                          | 358                               |
> |                                   | transient_lastDdlTime                              | 1691565890                        |
> |                                   | NULL                                               | NULL                              |
> | # Storage Information             | NULL                                               | NULL                              |
> | SerDe Library:                    | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe | NULL                              |
> | InputFormat:                      | org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat | NULL                              |
> | OutputFormat:                     | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat | NULL                              |
> | Compressed:                       | No                                                 | NULL                              |
> | Num Buckets:                      | 0                                                  | NULL                              |
> | Bucket Columns:                   | []                                                 | NULL                              |
> | Sort Columns:                     | []                                                 | NULL                              |
> +-----------------------------------+----------------------------------------------------+-----------------------------------+
> {noformat}
> Now run an ALTER statement on the partition in Hive, e.g. changing the 
> location:
> {code:sql}
> alter table my_part partition(p=0) set location '/tmp';{code}
> Impala will skip the ALTER_PARTITION event since it's considered as a 
> self-event. In catalogd logs:
> {noformat}
> I0809 15:30:19.628449 29844 MetastoreEvents.java:628] EventId: 8351549 
> EventType: ALTER_PARTITION Incremented events skipped counter to 12
> I0809 15:30:19.628616 29844 MetastoreEvents.java:628] EventId: 8351549 
> EventType: ALTER_PARTITION Not processing the event as it is a 
> self-event{noformat}


