[ https://issues.apache.org/jira/browse/IMPALA-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486721#comment-17486721 ]
ASF subversion and git services commented on IMPALA-10923:
----------------------------------------------------------
Commit d59ec73990d89bfa4d4fa3d8fe598d53eb2918b7 in impala's branch
refs/heads/master from Yu-Wen Lai
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d59ec73 ]
IMPALA-11093: Fine grained table refreshing doesn't refresh table file
metadata
If we insert data into an ACID partitioned table from Hive, the
generated events are open_txn -> alter_partition -> commit_txn.
Previously we assumed that the partition object in the alter_partition
event has a write id < the current write id. That assumption is not
valid: the partition object actually carries the write id allocated
in this transaction. That means that in the commit_txn event, we will
see a partition whose write id equals the write id of the cached
partition. So we need to change the '<' condition to '<='.
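The effect of the boundary change can be illustrated with a minimal sketch.
This is not the code from the commit: the class and method names below are
hypothetical, and the sketch only assumes that the check compares the cached
partition's write id with the write id seen while processing commit_txn.

```java
// Hypothetical illustration only; not Impala's actual catalog code.
final class WriteIdCheck {
    private WriteIdCheck() {}

    // Old condition: reload only when the cached write id is strictly
    // smaller than the write id seen in the commit_txn event.
    public static boolean shouldReloadOld(long cachedWriteId, long committedWriteId) {
        return cachedWriteId < committedWriteId;
    }

    // New condition: also reload when the two ids are equal, which is the
    // case after alter_partition has already cached the partition with the
    // write id allocated in the same transaction.
    public static boolean shouldReload(long cachedWriteId, long committedWriteId) {
        return cachedWriteId <= committedWriteId;
    }

    public static void main(String[] args) {
        long writeId = 5L; // same id in the cached partition and in commit_txn
        System.out.println("old check reloads: " + shouldReloadOld(writeId, writeId));
        System.out.println("new check reloads: " + shouldReload(writeId, writeId));
    }
}
```

With equal ids, the strict comparison skips the partition, so its file
metadata would stay stale. The inclusive comparison picks it up.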
Tests:
After IMPALA-10923, we now refresh file metadata while processing
commit events. Therefore, we can add back the test disabled in
IMPALA-9057.
Change-Id: Idabeb522525c45f000ca0992348660fa5a5d4d2d
Reviewed-on: http://gerrit.cloudera.org:8080/18175
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Sourabh Goyal <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>
> Fine grained table refreshing at partition level events for transactional
> tables
> --------------------------------------------------------------------------------
>
> Key: IMPALA-10923
> URL: https://issues.apache.org/jira/browse/IMPALA-10923
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Yu-Wen Lai
> Assignee: Yu-Wen Lai
> Priority: Major
>
> To keep transactional tables consistent, we currently refresh the whole
> table even when a change affects only a single partition. That is too
> expensive and can lengthen the delay in event processing.
> To enable fine-grained table refreshing, this proposal makes three main
> changes:
> # Maintain a validWriteIdList in Catalogd for transactional tables. We will
> track write id changes through AllocWriteIdEvents, CommitTxnEvents, and
> AbortTxnEvents.
> # Trigger partition-level refreshing for addPartitionEvents,
> dropPartitionEvents, and AlterPartitionEvents.
> # Introduce a config *incremental_refresh_acid*, which switches
> fine-grained table refreshing on or off.
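
The write id tracking described in the first item could be sketched roughly
as follows. This is a simplified, hypothetical model, not Hive's
ValidWriteIdList class or Impala's implementation: the class name, the method
names, and the per-id state map are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of per-table write id tracking.
final class WriteIdTracker {
    private enum State { OPEN, COMMITTED, ABORTED }

    private final Map<Long, State> states = new HashMap<>();
    private long highWatermark = 0L;

    // AllocWriteIdEvent: a transaction was given a write id for this table.
    public void onAllocWriteId(long writeId) {
        states.put(writeId, State.OPEN);
        highWatermark = Math.max(highWatermark, writeId);
    }

    // CommitTxnEvent: the write id becomes visible to readers.
    public void onCommitTxn(long writeId) {
        states.replace(writeId, State.COMMITTED);
    }

    // AbortTxnEvent: the write id must never become visible.
    public void onAbortTxn(long writeId) {
        states.replace(writeId, State.ABORTED);
    }

    // Data written under a write id is readable only once it is committed.
    public boolean isValid(long writeId) {
        return states.get(writeId) == State.COMMITTED;
    }

    public long getHighWatermark() {
        return highWatermark;
    }
}
```

In this model, a write id that was never allocated is treated as not valid,
and commit or abort events for unknown ids are ignored.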