Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15818 )

Change subject: IMPALA-9512: Full ACID Milestone 2: Validate each row against the valid write id list
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15818/1/be/src/exec/acid-metadata-utils.cc
File be/src/exec/acid-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/15818/1/be/src/exec/acid-metadata-utils.cc@113
PS1, Line 113:   for (int64_t i = min_write_id; i <= max_write_id; ++i) {
> Aren't we too pessimistic here?
Currently we don't open a transaction for table loading. And we fetch valid write ids before transaction ids. So in theory it can happen that we can see a compacted delta with open write ids.

But probably we should just open a transaction in HdfsTable.load() and use the txn id to fetch the valid write id list and the valid txn list.

The other thing is that AFAIK it's not guaranteed that a compacted directory has a visibilityTxnId. But in that case it'd be risky to read it anyway since we cannot be sure whether it's done or not. So probably we can assume that there'll always be a visibilityTxnId for compacted directories.

This would make row-validation unnecessary. Let's double-check it with Hive devs because Hive always validates the rows.


--
To view, visit http://gerrit.cloudera.org:8080/15818
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5ed74585a2d73ebbcee763b0545be4412926299d
Gerrit-Change-Number: 15818
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 28 Apr 2020 20:09:33 +0000
Gerrit-HasComments: Yes
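
To make the row-validation question in the comment above concrete, here is a minimal sketch of how a per-id loop of the shape quoted at line 113 might classify a write id range. This is not the patch's code: `WriteIdListSketch`, `RangeValidity` and `ClassifyRange` are hypothetical names, and the list is assumed to be just a high watermark plus a set of open or aborted ids.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical, simplified stand-in for a valid write id list.
// Assumption: an id is valid if it is at or below the high watermark
// and is not listed as open or aborted.
struct WriteIdListSketch {
  int64_t high_watermark;
  std::unordered_set<int64_t> invalid_ids;  // open or aborted ids

  bool IsWriteIdValid(int64_t write_id) const {
    return write_id <= high_watermark && invalid_ids.count(write_id) == 0;
  }
};

enum class RangeValidity { ALL, SOME, NONE };

// Checks every id in [min_write_id, max_write_id], the pessimistic approach
// the review comment questions. Only a SOME result would require validating
// individual rows.
inline RangeValidity ClassifyRange(const WriteIdListSketch& list,
                                   int64_t min_write_id, int64_t max_write_id) {
  assert(min_write_id <= max_write_id);
  bool any_valid = false;
  bool any_invalid = false;
  for (int64_t i = min_write_id; i <= max_write_id; ++i) {
    if (list.IsWriteIdValid(i)) {
      any_valid = true;
    } else {
      any_invalid = true;
    }
    // A mixed range is already decided; stop scanning early.
    if (any_valid && any_invalid) return RangeValidity::SOME;
  }
  return any_valid ? RangeValidity::ALL : RangeValidity::NONE;
}
```

Under the assumption discussed above (compacted directories always carry a visibilityTxnId and loading happens inside a transaction), a compacted range would never come back as `SOME`, which is why row validation might be skippable.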
