Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15818 )

Change subject: IMPALA-9512: Full ACID Milestone 2: Validate rows against the 
valid write id list
......................................................................


Patch Set 16:

(6 comments)

Sorry for the delayed review comments. Feel free to ignore or follow-up in a 
separate as you see appropriate. Most questions are basically are for my 
understanding as I am not very familiar with ACID.

http://gerrit.cloudera.org:8080/#/c/15818/13/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/15818/13/fe/src/main/java/org/apache/impala/catalog/Table.java@580
PS13, Line 580:     if (getMetaStoreTable() != null &&
Is it intentional that we always send the ValidWriteIdList irrespective of 
whether we requested the hms table or not? If yes, would be good to a comment 
with the reason.


http://gerrit.cloudera.org:8080/#/c/15818/13/fe/src/main/java/org/apache/impala/util/AcidUtils.java
File fe/src/main/java/org/apache/impala/util/AcidUtils.java:

http://gerrit.cloudera.org:8080/#/c/15818/13/fe/src/main/java/org/apache/impala/util/AcidUtils.java@387
PS13, Line 387:        if (parsedDelta.minWriteId <= baseWriteId) {
              :           Preconditions.checkState(parsedDelta.maxWriteId <= 
baseWriteId);
              :           it.remove();
              :           continue;
              :         }
we could use some comments here. Is the code removing all the delta directories 
which are superceded by the base directory? Just for my understand when would 
that happen?


http://gerrit.cloudera.org:8080/#/c/15818/13/fe/src/main/java/org/apache/impala/util/AcidUtils.java@398
PS13, Line 398: MetaException
Change to CatalogException?


http://gerrit.cloudera.org:8080/#/c/15818/15/fe/src/main/java/org/apache/impala/util/AcidUtils.java
File fe/src/main/java/org/apache/impala/util/AcidUtils.java:

http://gerrit.cloudera.org:8080/#/c/15818/15/fe/src/main/java/org/apache/impala/util/AcidUtils.java@191
PS15, Line 191:           if (!isTxnValid(pd.visibilityTxnId)) return false;
I am curious to understand this. The writeIdList that we load is pertaining to 
the current transaction state of the system. But during the load we fetch the 
ValidTxnList again. IIUC it seems like there are some transactions which do not 
update the write id and hence we need to look at the transaction ids 
specifically to determine if we need to skip certain files. I see that in 
IMPALA-9045 we added this check for base directories to ignore files when a 
major compaction is in open or aborted state. Does the same logic apply to 
delta files too? Are there more cases where we need to rely on validTxnIdList?


http://gerrit.cloudera.org:8080/#/c/15818/15/fe/src/main/java/org/apache/impala/util/AcidUtils.java@246
PS15, Line 246: visibilityTxnId
Just for sake my understanding, is this needed to support minor compactions 
similar to what IMPALA-9045 did for base directories?


http://gerrit.cloudera.org:8080/#/c/15818/15/fe/src/main/java/org/apache/impala/util/AcidUtils.java@380
PS15, Line 380: baseWriteId
Is it possible that baseWriteId is equal to SENTINEL_BASE_WRITE_ID here?



--
To view, visit http://gerrit.cloudera.org:8080/15818
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5ed74585a2d73ebbcee763b0545be4412926299d
Gerrit-Change-Number: 15818
Gerrit-PatchSet: 16
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 21 May 2020 19:39:31 +0000
Gerrit-HasComments: Yes

Reply via email to