[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

Yu-Wen Lai (Code Review) Wed, 20 Oct 2021 14:52:38 -0700

Yu-Wen Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17858 )


Change subject: IMPALA-10923: Fine grained table refreshing at partition level 
events for transactional tables
......................................................................


Patch Set 7:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@97
PS1, Line 97: protected final ConcurrentHashMap<Long, Set<TableWriteId>> 
txnToWriteIds_ =
            :       new ConcurrentHashMap<>();
> Thanks for the clarification. "the new HMS API getAllWriteEventInfo only re
getAllWriteEventInfo just return the data stored in the table 
TXN_WRITE_NOTIFICATION_LOG. AFAIK, HS2 calls add_write_notification_log that 
inserts records into TXN_WRITE_NOTIFICATION_LOG only for DML for transactional 
tables. I tried  few queries locally like "drop constraint", and they advance 
write id but don't add write notification log.

I tried to reduce the memory footprint here by saving write ids for 
transactional partitioned tables only. Besides, this map's size is just 
proportional to the simultaneous open transactions. Despite I don't have any 
real data points, we might not have a huge number of simultaneous "open" 
transactions?


http://gerrit.cloudera.org:8080/#/c/17858/1/fe/src/main/java/org/apache/impala/catalog/Catalog.java@97
PS1, Line 97: protected final ConcurrentHashMap<Long, Set<TableWriteId>> 
txnToWriteIds_ =
            :       new ConcurrentHashMap<>();
> @Yu-Wen: In addition to what Vihang asked, how would we handle the followin
@Sourabh
Good question. Since I don't see a way to retrieve back the missing write id, 
we might accept that this write id remains open. When next time a request with 
writeIdList that has this write id as committed, we will reload the whole table 
because the writeIdList of the request is considered more recent. In some 
sense, the table cache is considered stale when the write id is not marked 
committed.


http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2891
PS6, Line 2891: case
> do we need a default: clause which throws a exception?
Done


http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/catalog/Table.java@142
PS6, Line 142: volatile
> not sure why we need this?
@Vihang
I call getCreateEventId() in AllocWriteIdEvent without acquiring lock. Is there 
any chance createEventId will be set after the table is loaded? If not, we 
don't need this.


http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2066
PS6, Line 2066: catalog_.removeWriteIds(txnId_);
> This line must be in finally block otherwise we are leaking memory in case
Thank you for catching this.


http://gerrit.cloudera.org:8080/#/c/17858/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/17858/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2057
PS7, Line 2057:         commitTxnMessage_.addWriteEventInfo(writeEventInfoList);
> Why are we modifying commitTxnMesage? Can't we get all the required info fr
@Sourabh
I actually imitated the code from hive repl: 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CommitTxnHandler.java#L166.
The upside is that I don't have to parse table and partition objects. It is 
done by CommitTxnMessage. As I can see from the hive code, it seems like this 
function is used like this by design.


http://gerrit.cloudera.org:8080/#/c/17858/7/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2080
PS7, Line 2080:       
Preconditions.checkNotNull(commitTxnMessage_.getPartitions());
> Why are we checking for non null partitions? Wouldn't unpartitioned table h
As long as we have called addWriteEventInfo, this would be empty list even for 
unpartitioned table. So, this is just to check we have added write event info. 
I can change the check to other variables to avoid confusion.


http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4256
PS6, Line 4256: !(table instanceof HdfsTable)
> do we need a check for MVs here?
Thanks for the reminder. I added a check in CommitTxnEvent.


http://gerrit.cloudera.org:8080/#/c/17858/6/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4271
PS6, Line 4271: // HMS doesn't set partition object's write id correctly, so we 
set
              :         // it here as workaround.
              :         long writeId = writeIds.get(it.nextIndex());
              :         Partition part = it.next();
              :         part.setWriteId(writeId);
> This could lead to confusion, for instance, we may be sending the partition
I need to check if there is another way to fulfill my need here. I use this 
because I reload partitions from "event"'s partition objects without getting 
latest partitions from HMS so I need to check write id to avoid reloading an 
older partition object. Meanwhile, this can ensure the event processing for 
CommitTxnEvent is idempotent.


http://gerrit.cloudera.org:8080/#/c/17858/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17858/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4160
PS7, Line 4160:       List<Partition> partsFromEvent, boolean loadFromEvent, 
boolean loadFileMetadata,
> Looks like if loadFromEvent is true, then in HdfsTable we avoid making HMS
OK. I will separate another API for this usage.


http://gerrit.cloudera.org:8080/#/c/17858/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4233
PS7, Line 4233:   public int 
addCommittedWriteIdsAndReloadPartitionsIfExist(long eventId, String dbName,
> This api is not making much sense to me here. More so since it is public.
Could you elaborate the concern? It seems like we don't directly manipulate 
cache in MetastoreEvents but expose the API from CatalogOpExecutor or 
CatalogServiceCatalog as a convention.



--
To view, visit http://gerrit.cloudera.org:8080/17858
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Gerrit-Change-Number: 17858
Gerrit-PatchSet: 7
Gerrit-Owner: Yu-Wen Lai <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sourabh Goyal <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Yu-Wen Lai <[email protected]>
Gerrit-Comment-Date: Wed, 20 Oct 2021 21:52:28 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10923: Fine grained table refreshing at partition level events for transactional tables

Reply via email to