Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15818 )

Change subject: IMPALA-9512: Full ACID Milestone 2: Validate each row against 
the valid write id list
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15818/1/be/src/exec/acid-metadata-utils.cc
File be/src/exec/acid-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/15818/1/be/src/exec/acid-metadata-utils.cc@113
PS1, Line 113:   for (int64_t i = min_write_id; i <= max_write_id; ++i) {
> Aren't we too pessimistic here?
Currently we don't open a transaction for table loading, and we fetch the valid 
write ids before the valid transaction ids. So in theory we can end up seeing a 
compacted delta that still contains open write ids.

But probably we should just open a transaction in HdfsTable.load() and use the 
txn id to fetch the valid write id list and the valid txn list.

The other thing is that AFAIK it's not guaranteed that a compacted directory 
has a visibilityTxnId. But in that case it'd be risky to read the directory 
anyway, since we cannot be sure whether the compaction has finished or not.

So probably we can assume that there'll always be a visibilityTxnId for 
compacted directories. This would make row validation unnecessary. Let's 
double-check this with the Hive devs, because Hive always validates the rows.
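
For context, here is a minimal sketch of the kind of pessimistic range check 
the quoted loop performs. It is not the code from the patch: the names 
WriteIdListSketch, RangeValidity and CheckRange are hypothetical, and the list 
is assumed to be a high watermark plus a sorted vector of open or aborted 
write ids.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a valid write id list (not Impala's actual class).
// Assumption: ids above the high watermark are invisible, and
// invalid_write_ids holds the open or aborted ids in ascending order.
struct WriteIdListSketch {
  int64_t high_water_mark;
  std::vector<int64_t> invalid_write_ids;

  bool IsWriteIdValid(int64_t write_id) const {
    if (write_id > high_water_mark) return false;
    return !std::binary_search(
        invalid_write_ids.begin(), invalid_write_ids.end(), write_id);
  }
};

enum class RangeValidity { ALL, SOME, NONE };

// Walks every write id in [min_write_id, max_write_id], as the loop on
// line 113 does, and reports whether all, some, or none of them are valid.
// An empty range (min > max) is reported as NONE.
RangeValidity CheckRange(const WriteIdListSketch& list,
                         int64_t min_write_id, int64_t max_write_id) {
  bool any_valid = false;
  bool any_invalid = false;
  for (int64_t i = min_write_id; i <= max_write_id; ++i) {
    if (list.IsWriteIdValid(i)) {
      any_valid = true;
    } else {
      any_invalid = true;
    }
    // Stop early once the range is known to be mixed.
    if (any_valid && any_invalid) return RangeValidity::SOME;
  }
  return any_valid ? RangeValidity::ALL : RangeValidity::NONE;
}
```

In this sketch only a SOME result would force per-row validation. If compacted 
directories always carry a visibilityTxnId, that case should not occur for 
them.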



--
To view, visit http://gerrit.cloudera.org:8080/15818
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5ed74585a2d73ebbcee763b0545be4412926299d
Gerrit-Change-Number: 15818
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 28 Apr 2020 20:09:33 +0000
Gerrit-HasComments: Yes
