[
https://issues.apache.org/jira/browse/HIVE-29272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034449#comment-18034449
]
Marta Kuczora commented on HIVE-29272:
--------------------------------------
Thanks a lot [~dkuzmenko] for the review and for merging the fix.
> Query-based MINOR compaction should not consider minOpenWriteId
> ---------------------------------------------------------------
>
> Key: HIVE-29272
> URL: https://issues.apache.org/jira/browse/HIVE-29272
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 4.0.0, 4.1.0
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Labels: Compaction, data-loss, hive-4.2.0-must, pull-request-available
> Fix For: 4.2.0
>
>
> In certain scenarios the query-based MINOR compaction produces an empty delta
> directory. On full ACID tables it is automatically cleaned up, as if no
> compaction had happened, but on insert-only tables it causes data loss.
> This issue happens if there is both an aborted and an open transaction on the
> compacted table.
> Let’s see an example:
> * Run an insert which creates delta_0000001_0000001 (writeId=1)
> * Start an insert and abort the transaction (writeId=2)
> * Run an insert which creates delta_0000003_0000003 (writeId=3)
> * Run an insert which creates delta_0000004_0000004 (writeId=4), but before
> it finishes, start the MINOR compaction
> * When the compaction is finished the table will contain the following files:
> {noformat}
> delta_0000001_0000001
> delta_0000001_0000001/000000_0
> delta_0000001_0000003
> delta_0000003_0000003
> delta_0000003_0000003/000000_0
> delta_0000004_0000004
> delta_0000004_0000004/000000_0
> {noformat}
> * It can be seen that the delta_0000001_0000003 directory (which was
> produced by the compactor) is empty.
> * When the Cleaner runs, it will remove delta_0000001_0000001 and
> delta_0000003_0000003, so the data in them will be lost.
> This happens because of this check in the MINOR compaction:
>
> {code:java}
> long minWriteID = validWriteIdList.getMinOpenWriteId() == null
>     ? 1 : validWriteIdList.getMinOpenWriteId();
> long highWatermark = validWriteIdList.getHighWatermark();
> List<AcidUtils.ParsedDelta> deltas = dir.getCurrentDirectories().stream()
>     .filter(delta -> delta.isDeleteDelta() == isDeleteDelta
>         && delta.getMaxWriteId() <= highWatermark
>         && delta.getMinWriteId() >= minWriteID)
>     .collect(Collectors.toList());
> if (deltas.isEmpty()) {
>   query.setLength(0); // no alter query needed; clear StringBuilder
>   return;
> }
> {code}
> If the table has both aborted and open transactions, the minOpenWriteId will
> be set; in the example it is 4.
> When the ValidCompactorWriteIdList is created in
> TxnUtils.createValidCompactWriteIdList, the highWatermark is set to
> minOpenWriteId - 1, which ensures that the compaction range stays below
> minOpenWriteId.
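> A simplified sketch of that adjustment (my paraphrase of the logic in
> TxnUtils.createValidCompactWriteIdList, not the verbatim code):
> {code:java}
> // Cap the compactor's highWatermark just below the first open write id,
> // so nothing at or above an open transaction is ever compacted.
> long highWatermark = validWriteIdList.getHighWatermark();
> Long minOpenWriteId = validWriteIdList.getMinOpenWriteId();
> if (minOpenWriteId != null) {
>   highWatermark = Math.min(highWatermark, minOpenWriteId - 1);
> }
> {code}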
> But in the minor compaction's code the minOpenWriteId is used as the lower
> limit, so it tries to compact deltas that are above this value. This is not
> correct; it looks like a misunderstanding of what the minOpenWriteId value
> means.
> In the example the compaction should consider delta_1 and delta_3, but neither
> of them fulfills the condition "delta.getMinWriteId() >= minWriteID", as
> minWriteID = minOpenWriteId = 4 here.
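> The effect can be reproduced in isolation with the example's write id ranges
> (a hypothetical standalone snippet, ignoring the isDeleteDelta part of the
> filter; not Hive code):
> {code:java}
> // Buggy filter with the example's values: deltas [1,1] and [3,3],
> // minWriteID = 4 (the open txn), highWatermark = 3 (already capped).
> long minWriteID = 4;
> long highWatermark = 3;
> long[][] deltaRanges = { {1, 1}, {3, 3} }; // {minWriteId, maxWriteId}
> for (long[] d : deltaRanges) {
>   boolean selected = d[1] <= highWatermark && d[0] >= minWriteID;
>   System.out.println("delta_" + d[0] + "_" + d[1] + " selected: " + selected);
> }
> // Both print "selected: false", so no data is moved and the compactor's
> // output directory delta_0000001_0000003 stays empty.
> {code}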
> This check in the MINOR compaction code is not correct. I think it is safe to
> leave out the check against minOpenWriteId, as the highWatermark is already
> adjusted to it.
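> A sketch of that change (the merged PR is authoritative; this only
> illustrates the idea of dropping the lower bound):
> {code:java}
> // The highWatermark is already capped at minOpenWriteId - 1, so the
> // upper bound alone is enough to keep open writes out of the compaction.
> List<AcidUtils.ParsedDelta> deltas = dir.getCurrentDirectories().stream()
>     .filter(delta -> delta.isDeleteDelta() == isDeleteDelta
>         && delta.getMaxWriteId() <= highWatermark)
>     .collect(Collectors.toList());
> {code}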
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)