Sourabh Goyal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )

Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
......................................................................


Patch Set 1:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/17858/1/common/thrift/BackendGflags.thrift
File common/thrift/BackendGflags.thrift:

http://gerrit.cloudera.org:8080/#/c/17858/1/common/thrift/BackendGflags.thrift@219
PS1, Line 219:   97: required bool incremental_refresh_acid
nit: rename it to hms_event_incremental_refresh or 
hms_event_incremental_refresh_transactional_table

Also where are we setting the default value of this flag?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2582
PS1, Line 2582:       List<TPartitionKeyValue> tPartSpec, long writeId, boolean 
loadFileMetadata,
I feel it would be cleaner if we don't add writeId and eventId as parameters 
here and instead call this method only if these checks pass in the caller. 
Otherwise the caller of this method would need to know writeId and eventId 
before calling this method which does not seem right. We can add 
loadFileMetadata here though.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/TableWriteId.java
File fe/src/main/java/org/apache/impala/catalog/TableWriteId.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/TableWriteId.java@29
PS1, Line 29:     this.tableName = TableName.parse(tableName);
We are probably expecting tableName to be fullyQualified but not enforcing that 
check here. Instead of TableName, we can store String DbName, String tableName 
separately. .


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@20
PS1, Line 20: import com.amazonaws.services.servicequotas.model.MetricInfo;
nit: remove it if not being used?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@184
PS1, Line 184:         switch (metastoreEventType) {
this seems incomplete. Are we missing anything here? We didn't create objs for 
ALLOC_WRITE_ID_EVENT, COMMIT_TXN and ABORT_TXN anywhere in this method.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1458
PS1, Line 1458:   public static class AcidAddPartitionEvent extends 
AddPartitionEvent {
1. Where is AcidAddPartitionEvent getting initialized?
2. Instead of creating a new event i.e AcidAddPartitionEvent, we can modify the 
process() logic in AddPartitionEvent itself based on the incremental refresh 
flag


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1717
PS1, Line 1717:           if (tbl_.getCreateEventId() > eventId_) {
IMO, there is no need to acquire read lock on table to read createEventId since 
it is set only once when creating table in db. Instead, we can make 
createEventId member variable in table as volatile to make sure we are always 
reading most recent value.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1735
PS1, Line 1735:       if (msTbl_ == null) {
can we move the check tbl_.getCreateEventId() > eventId_ here ? As msTbl_ is 
not being used anywhere else,  better to avoid it.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1738
PS1, Line 1738:       List<Long> writeIds = txnToWriteIdList_.stream()
txnToWriteIdList_ does not seem to be set anywhere in the event. Where are we 
reading its values from?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1786
PS1, Line 1786:         throw new CatalogException("Failed to get write event 
infos", e);
nit: Add txnId in the message?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1789
PS1, Line 1789:         commitTxnMessage_.addWriteEventInfo(writeEventInfoList);
Didn't understand the purpose of this.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1793
PS1, Line 1793:           throw new CatalogException("Commit event processing 
error", e);
nit: add more info like txnId etc.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1800
PS1, Line 1800:       Set<TableWriteId> tableWriteIds = 
catalog_.getWriteIds(txnId_);
if the tableWriteIds don't exist in catalog cache (say because of cache 
eviction etc), we should fetch it from HMS.


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1861
PS1, Line 1861:       Set<TableWriteId> tableWriteIds = 
catalog_.getWriteIds(txnId_);
Same as my previous comment in CommitTxnEvent, if txnId does not exist in 
catalog cache, we should fetch it from HMS.



--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 1
Gerrit-Owner: Yu-Wen Lai <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sourabh Goyal <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Yu-Wen Lai <[email protected]>
Gerrit-Comment-Date: Thu, 23 Sep 2021 00:48:53 +0000
Gerrit-HasComments: Yes

Reply via email to