[ 
https://issues.apache.org/jira/browse/IMPALA-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486721#comment-17486721
 ] 

ASF subversion and git services commented on IMPALA-10923:
----------------------------------------------------------

Commit d59ec73990d89bfa4d4fa3d8fe598d53eb2918b7 in impala's branch 
refs/heads/master from Yu-Wen Lai
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d59ec73 ]

IMPALA-11093: Fine grained table refreshing doesn't refresh table file
metadata

When data is inserted into an ACID partitioned table from Hive, the
generated events look like open_txn -> alter_partition -> commit_txn.

Previously we assumed that the partition object carried by the
alter_partition event has a write id < the current write id. That
assumption is not valid: the partition object actually carries the
write id allocated in this transaction. As a result, when the
commit_txn event is processed, the partition's write id equals the
write id of the cached partition. So we need to change the '<'
condition to '<='.
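
The effect of the boundary change can be illustrated with a minimal
sketch. It is written in Python purely for illustration and assumes the
check compares the cached partition's write id with the write id seen
while processing commit_txn. The function names are hypothetical and are
not taken from the Impala code base.

```python
def should_refresh_old(cached_write_id: int, committed_write_id: int) -> bool:
    """Hypothetical pre-fix check: strict '<' comparison."""
    return cached_write_id < committed_write_id


def should_refresh_new(cached_write_id: int, committed_write_id: int) -> bool:
    """Hypothetical post-fix check: '<=' also covers equal write ids."""
    return cached_write_id <= committed_write_id


# Assumed event sequence: open_txn -> alter_partition -> commit_txn.
# The transaction allocates write id 5 (an arbitrary example value).
allocated_write_id = 5

# alter_partition carries the write id allocated in this transaction,
# so the cached partition already holds write id 5 at commit time.
cached_write_id = allocated_write_id

# With '<' the commit is skipped; with '<=' file metadata is refreshed.
skipped_before_fix = not should_refresh_old(cached_write_id, allocated_write_id)
refreshed_after_fix = should_refresh_new(cached_write_id, allocated_write_id)
```

In this sketch the two checks differ only when the write ids are equal,
which is exactly the case the commit message describes.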

Tests:
After IMPALA-10923, we now refresh file metadata while processing
commit events. Therefore, we can add back the test disabled in
IMPALA-9057.

Change-Id: Idabeb522525c45f000ca0992348660fa5a5d4d2d
Reviewed-on: http://gerrit.cloudera.org:8080/18175
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Sourabh Goyal <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>


> Fine grained table refreshing at partition level events for transactional 
> tables
> --------------------------------------------------------------------------------
>
>                 Key: IMPALA-10923
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10923
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>
> To keep transactional tables consistent, we currently refresh the whole 
> table even when a change affects only a single partition. That is too 
> expensive and can delay event processing.
> To enable fine-grained table refreshing, this proposal makes three main 
> changes.
>  # Maintain validWriteIdList in Catalogd for transactional tables. We will 
> track write id changes by AllocWriteIdEvents, CommitTxnEvents, and 
> AbortTxnEvents.
>  # Trigger partition-level refreshing for addPartitionEvents, 
> dropPartitionEvents, and AlterPartitionEvents.
>  # Introduce a config *incremental_refresh_acid*, which switches 
> fine-grained table refreshing on or off.
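
The write id tracking in the first item of the list above can be
pictured with a simplified model. This is a sketch only: the class and
method names are hypothetical, and it assumes a valid write id list is
made up of a high watermark plus sets of open and aborted write ids. It
is not the Catalogd implementation.

```python
class WriteIdTracker:
    """Hypothetical, simplified per-table model of a valid write id list."""

    def __init__(self) -> None:
        self.high_watermark = 0   # highest write id allocated so far
        self.open_ids = set()     # allocated but not yet committed/aborted
        self.aborted_ids = set()  # write ids of aborted transactions

    def on_alloc_write_id(self, write_id: int) -> None:
        # An allocation event opens a new write id.
        self.open_ids.add(write_id)
        self.high_watermark = max(self.high_watermark, write_id)

    def on_commit_txn(self, write_ids) -> None:
        # A commit event makes the transaction's write ids valid.
        for write_id in write_ids:
            self.open_ids.discard(write_id)

    def on_abort_txn(self, write_ids) -> None:
        # An abort event marks the transaction's write ids as aborted.
        for write_id in write_ids:
            self.open_ids.discard(write_id)
            self.aborted_ids.add(write_id)

    def is_valid(self, write_id: int) -> bool:
        # Valid means: allocated, not open, and not aborted.
        return (
            write_id <= self.high_watermark
            and write_id not in self.open_ids
            and write_id not in self.aborted_ids
        )


tracker = WriteIdTracker()
for wid in (1, 2, 3):
    tracker.on_alloc_write_id(wid)
tracker.on_commit_txn([1])  # write id 1 becomes valid
tracker.on_abort_txn([2])   # write id 2 is aborted; 3 stays open
```

Keeping this state per table is what would let a single partition be
refreshed against a known set of valid write ids, rather than reloading
the whole table to restore consistency.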



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
