[
https://issues.apache.org/jira/browse/HIVE-29601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on HIVE-29601 started by Marta Kuczora.
--------------------------------------------
> ACID: Cleaner finds base directories valid with writeId above the cleaner
> highWaterMark
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-29601
> URL: https://issues.apache.org/jira/browse/HIVE-29601
> Project: Hive
> Issue Type: Task
> Affects Versions: 4.2.0
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 4.3.0
>
>
> When the cleaner selects the base directories, all bases are validated by the
> AcidUtils.isValidBase method.
> {code:java}
> private static boolean isValidBase(ParsedBaseLight parsedBase,
>     ValidWriteIdList writeIdList, FileSystem fs,
>     HdfsDirSnapshot dirSnapshot) throws IOException {
>   boolean isValidBase;
>   if (dirSnapshot != null && dirSnapshot.isValidBase() != null) {
>     isValidBase = dirSnapshot.isValidBase();
>   } else {
>     if (parsedBase.getWriteId() == Long.MIN_VALUE) {
>       // such base is created by 1st compaction in case of non-acid to acid table conversion
>       // By definition there are no open txns with id < 1.
>       isValidBase = true;
>     } else if (writeIdList.getMinOpenWriteId() != null
>         && parsedBase.getWriteId() <= writeIdList.getMinOpenWriteId()) {
>       isValidBase = true;
>     } else if (isCompactedBase(parsedBase, fs, dirSnapshot)) {
>       isValidBase = writeIdList.isValidBase(parsedBase.getWriteId());
>     } else {
>       // if here, it's a result of IOW
>       isValidBase = writeIdList.isWriteIdValid(parsedBase.getWriteId());
>     }
>     if (dirSnapshot != null) {
>       dirSnapshot.setIsValidBase(isValidBase);
>     }
>   }
>   return isValidBase;
> }
> {code}
> The following condition doesn't consider the cleaner's highWaterMark:
> {code:java}
> else if (writeIdList.getMinOpenWriteId() != null
>     && parsedBase.getWriteId() <= writeIdList.getMinOpenWriteId()) {
>   isValidBase = true;
> }
> {code}
> So if minOpenWriteId is set and is greater than the highWaterMark, base
> directories with a writeId above the highWaterMark are considered valid.
> This can lead to cases where the cleaner deletes bases above its own
> highWaterMark.
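
To make the gap concrete, here is a minimal, self-contained sketch of the shortcut branch quoted above next to one possible bounded variant. It is not Hive code and not necessarily the fix that will be applied: the class and method names are hypothetical, the highWaterMark is passed in as a plain long, and the value 3 for it is an assumption.

```java
class BaseShortcutSketch {

    // Mirrors the shortcut branch quoted above: only minOpenWriteId is consulted.
    static boolean shortcutAsReported(long baseWriteId, Long minOpenWriteId) {
        return minOpenWriteId != null && baseWriteId <= minOpenWriteId;
    }

    // Hypothetical variant: the shortcut applies only up to the cleaner's highWaterMark.
    static boolean shortcutBounded(long baseWriteId, Long minOpenWriteId, long highWaterMark) {
        return shortcutAsReported(baseWriteId, minOpenWriteId) && baseWriteId <= highWaterMark;
    }

    public static void main(String[] args) {
        Long minOpenWriteId = 10L; // an insert with writeId 10 is still open
        long highWaterMark = 3L;   // assumed highWaterMark of the cleaner run
        for (long baseWriteId : new long[] {3L, 6L, 9L}) {
            System.out.println("base_" + baseWriteId
                + " asReported=" + shortcutAsReported(baseWriteId, minOpenWriteId)
                + " bounded=" + shortcutBounded(baseWriteId, minOpenWriteId, highWaterMark));
        }
    }
}
```

With these assumed values the unbounded shortcut accepts all three bases, while the bounded variant accepts only the base at or below the highWaterMark.
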
> This issue can lead to data loss as well. If the base directory with the
> highest writeId is the result of an aborted insert-overwrite, and there is an
> open write transaction with a higher writeId, the writeId <= minOpenWriteId
> check will be true, so that base will be selected as the best valid base. At
> this point in the code, the non-compacted base directories are not checked
> for whether they are aborted. That check happens only later, in the
> writeIdList.isWriteIdValid call. If this base directory is selected as the
> best base, the cleaner can clean up delta directories, and later this base
> will be cleaned up as well, so the data is lost.
> Example:
> * insert 3 deltas and run major compaction (id=1)
> * insert 3 deltas and run major compaction (id=2)
> * insert 3 deltas and run major compaction (id=3)
> * an insert is started, and it stays open while the cleaner runs for the
> first time
> * at this point we have the following directories:
> ** base_3_v0000004
> ** base_6_v0000008
> ** base_9_v0000012
> ** delta_0000001_0000001
> ** delta_0000002_0000002
> ** delta_0000003_0000003
> ** delta_0000004_0000004
> ** delta_0000005_0000005
> ** delta_0000006_0000006
> ** delta_0000007_0000007
> ** delta_0000008_0000008
> ** delta_0000009_0000009
> ** delta_0000010_0000010
> * cleaner runs the first time
> It will pick compaction (id=1) from the queue. The cleaner should delete only
> the directories made obsolete by that compaction. In this example these are
> only delta_0000001_0000001, delta_0000002_0000002 and
> delta_0000003_0000003.
> At this point minOpenWriteId=10, and because of the check in the
> isValidBase method, base_9_v0000012 will be selected as the latest valid
> base, and base_3_v0000004 and base_6_v0000008 will be deleted.
> * the open insert is committed, so minOpenWriteId is no longer set
> * cleaner runs for the second compaction
> It tries to find a base below its highWaterMark (it would be
> base_6_v0000008), but since that base has been deleted, the cleaner fails with
> an ACID_NOT_ENOUGH_HISTORY error.
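
The first cleaner run in the example above can be replayed with a small stand-alone sketch. It is a simplified model and not the cleaner's actual implementation: the helper names are hypothetical, "obsolete" is modelled simply as every base below the selected one, and a highWaterMark of 3 for compaction id=1 is an assumption inferred from the deltas that compaction made obsolete.

```java
import java.util.List;
import java.util.function.LongPredicate;
import java.util.stream.Collectors;

class CleanerExampleSketch {

    // Highest base writeId accepted by the given validity rule, or -1 if none is accepted.
    static long bestBase(List<Long> bases, LongPredicate isValid) {
        return bases.stream().mapToLong(Long::longValue).filter(isValid).max().orElse(-1L);
    }

    // Bases below the selected best base are treated as obsolete in this sketch.
    static List<Long> obsoleteBases(List<Long> bases, long best) {
        return bases.stream().filter(b -> b < best).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> bases = List.of(3L, 6L, 9L); // base_3, base_6, base_9
        long minOpenWriteId = 10L;              // the insert that is still open
        long highWaterMark = 3L;                // assumed for compaction id=1

        // Rule as reported: only minOpenWriteId is consulted.
        long reported = bestBase(bases, w -> w <= minOpenWriteId);
        // Hypothetical rule that also respects the highWaterMark.
        long bounded = bestBase(bases, w -> w <= minOpenWriteId && w <= highWaterMark);

        System.out.println("as reported: best=base_" + reported
            + " obsolete=" + obsoleteBases(bases, reported));
        System.out.println("bounded:     best=base_" + bounded
            + " obsolete=" + obsoleteBases(bases, bounded));
    }
}
```

Under these assumptions the reported rule selects base_9 and marks bases 3 and 6 as obsolete, which matches the deletion described in the example. The bounded rule selects base_3 and leaves the other bases in place.
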
> If there is an aborted insert overwrite between the last compaction and the
> open insert, its base directory would be selected as the latest valid base,
> and delta_0000001_0000001, delta_0000002_0000002, delta_0000003_0000003 and
> the other base directories would be deleted. The data in
> delta_0000001_0000001, delta_0000002_0000002 and delta_0000003_0000003 would
> then be lost.
> The shortcut to check only against the minOpenWriteId was added in this Jira:
> https://issues.apache.org/jira/browse/HIVE-22754
--
This message was sent by Atlassian Jira
(v8.20.10#820010)