[
https://issues.apache.org/jira/browse/HUDI-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517227#comment-17517227
]
Ethan Guo commented on HUDI-3637:
---------------------------------
After revisiting the relevant logic, I confirmed that the compaction and
clustering logic is correct in using getLatestFileSlices(). The mismatch in
this case should not cause any correctness issue and should be handled at the
validation layer.
At a high level, getLatestFileSlices() fetches the latest file slices for
committed base files and filters out any file slice whose base instant time is
uncommitted. Uncommitted log files in the latest file slices may still be
included, but they are skipped during log reading and merging by the following
logic in "AbstractHoodieLogRecordReader":
{code:java}
if (logBlock.getBlockType() != CORRUPT_BLOCK
    && logBlock.getBlockType() != COMMAND_BLOCK) {
  if (!completedInstantsTimeline.containsOrBeforeTimelineStarts(instantTime)
      || inflightInstantsTimeline.containsInstant(instantTime)) {
    // hit an uncommitted block possibly from a failed write, move to the
    // next one and skip processing this one
    continue;
  }
  if (instantRange.isPresent() && !instantRange.get().isInRange(instantTime)) {
    // filter the log block by instant range
    continue;
  }
}
{code}
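To make the skip condition concrete, here is a minimal self-contained sketch
of the first check. It does not use Hudi's classes: the class name, the plain
sets standing in for the two timelines, and the "firstRetainedInstant"
parameter are all hypothetical. It also assumes that
containsOrBeforeTimelineStarts() means "the instant is in the completed
timeline, or it is older than the first instant the timeline still retains",
and that instant times are fixed-width strings that sort lexicographically.

```java
import java.util.Set;

// Simplified stand-in for the timeline checks; these are NOT Hudi's classes.
class LogBlockFilterSketch {

  // Assumed meaning of containsOrBeforeTimelineStarts(): the instant is
  // completed, or it is older than the first instant the timeline retains.
  static boolean containsOrBeforeTimelineStarts(Set<String> completed,
                                                String firstRetainedInstant,
                                                String instantTime) {
    return completed.contains(instantTime)
        || instantTime.compareTo(firstRetainedInstant) < 0;
  }

  // Same shape as the first "if" in the quoted snippet: skip a block whose
  // instant is not known to be completed, or is still inflight.
  static boolean shouldSkip(String instantTime,
                            Set<String> completed,
                            Set<String> inflight,
                            String firstRetainedInstant) {
    return !containsOrBeforeTimelineStarts(completed, firstRetainedInstant, instantTime)
        || inflight.contains(instantTime);
  }

  public static void main(String[] args) {
    Set<String> completed = Set.of("002", "003");
    Set<String> inflight = Set.of("005");
    String firstRetainedInstant = "002";

    // 001: older than the timeline start, treated as committed -> skip=false
    // 003: completed                                           -> skip=false
    // 004: in neither timeline (e.g. rolled back)              -> skip=true
    // 005: inflight                                            -> skip=true
    for (String t : new String[] {"001", "003", "004", "005"}) {
      System.out.println(t + " skip="
          + shouldSkip(t, completed, inflight, firstRetainedInstant));
    }
  }
}
```

The point of the sketch is that a block is read only when its instant is
positively known to be committed; both an inflight instant and an instant that
has disappeared from the timeline end up skipped.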
At the concurrency control layer, when two concurrent commits try to touch
the same file group, one of them fails, which guarantees correctness.
Take the following three cases as examples:
> Case 1
{code:java}
writer 1: DC1 (inflight) lf1 added ->
about to commit, conflict resolution DC1 fails
writer 2: schedule compaction (include bf1 lf1){code}
Writer 1 starts a deltacommit (DC1), which is inflight, and log file 1 is
written. After that, writer 2 schedules compaction, which includes base file 1
and the corresponding log file 1.
When DC1 is about to commit later on, conflict resolution detects that it
touches the same file group as the compaction does, so DC1 fails.
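The conflict check in this case can be pictured as a set intersection over
file group IDs. The sketch below is only an illustration under that
assumption; the class, method, and file group names ("fg1", "fg2") are
hypothetical, and Hudi's actual conflict resolution is more involved than
this.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustration only: conflict detection modeled as overlapping file groups.
class ConflictCheckSketch {

  // Returns the file group IDs touched by both operations.
  // An empty result means the committing writer has no conflict.
  static Set<String> overlappingFileGroups(Set<String> committing,
                                           Set<String> other) {
    Set<String> overlap = new TreeSet<>(committing);
    overlap.retainAll(other);
    return overlap;
  }

  public static void main(String[] args) {
    // DC1 wrote lf1 into file group fg1; the scheduled compaction plan
    // covers fg1 (bf1 + lf1) and, hypothetically, another group fg2.
    Set<String> dc1 = Set.of("fg1");
    Set<String> compaction = Set.of("fg1", "fg2");

    Set<String> overlap = overlappingFileGroups(dc1, compaction);
    if (!overlap.isEmpty()) {
      // prints: DC1 fails, conflicting file groups: [fg1]
      System.out.println("DC1 fails, conflicting file groups: " + overlap);
    }
  }
}
```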
> Case 2
{code:java}
writer 1: DC1 (inflight) lf1 added ->
about to commit, conflict resolution DC1 fails. DC1 is rolled back
writer 2: schedule compaction (include bf1 lf1)
execution{code}
Writer 1 starts a deltacommit (DC1), which is inflight, and log file 1 is
written. After that, writer 2 schedules compaction, which includes base file 1
and the corresponding log file 1. When DC1 is about to commit later on,
conflict resolution detects that it touches the same file group as the
compaction does, so DC1 fails. DC1 is then rolled back, with a rollback
command block added to the file group. Now DC1 does not exist in the timeline.
Later on, when compaction is executed, log file 1 is still excluded by the if
condition above, because DC1 is not in the completed timeline.
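The sequence in Case 2 can be walked through with a toy timeline. This is a
sketch under stated assumptions, not Hudi code: the class, its methods, and
the instant name "C0" for the commit that produced base file 1 are all
hypothetical, and the "before the timeline starts" branch of the real
condition is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Toy timeline for Case 2; names and structure are illustrative only.
class Case2TimelineSketch {
  enum State { INFLIGHT, COMPLETED }

  // Hypothetical in-memory timeline: instant time -> state.
  private final Map<String, State> instants = new HashMap<>();

  void begin(String instantTime) { instants.put(instantTime, State.INFLIGHT); }
  void complete(String instantTime) { instants.put(instantTime, State.COMPLETED); }
  // A rollback removes the instant from the timeline altogether.
  void rollback(String instantTime) { instants.remove(instantTime); }

  // A log block is read only if its instant is completed; inflight and
  // missing instants are both treated as uncommitted.
  boolean isBlockReadable(String instantTime) {
    return instants.get(instantTime) == State.COMPLETED;
  }

  public static void main(String[] args) {
    Case2TimelineSketch timeline = new Case2TimelineSketch();
    timeline.complete("C0");  // commit that produced base file 1
    timeline.begin("DC1");    // writer 1 writes log file 1

    System.out.println(timeline.isBlockReadable("DC1")); // false (inflight)

    timeline.rollback("DC1"); // DC1 fails conflict resolution, rolled back

    // Compaction executes after the rollback: lf1 is still not read.
    System.out.println(timeline.isBlockReadable("DC1")); // false (absent)
    System.out.println(timeline.isBlockReadable("C0"));  // true
  }
}
```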
> Case 3
{code:java}
writer 1: DC1 (inflight) lf1 added
-> about to commit, conflict resolution DC1 fails
writer 2: schedule compaction (include bf1 lf1) commit
(excluding lf1) succeeds{code}
Writer 1 starts a deltacommit (DC1), which is inflight, and log file 1 is
written. After that, writer 2 schedules compaction, which includes base file 1
and the corresponding log file 1.
When the compaction is executed, log file 1 is excluded because the instant
time recorded in its log block is DC1, which is still inflight. When DC1 is
about to commit later on, conflict resolution detects that it touches the same
file group as the compaction does, so DC1 fails.
> Check file listing from FS vs metadata table when compaction in pending and
> inflight
> ------------------------------------------------------------------------------------
>
> Key: HUDI-3637
> URL: https://issues.apache.org/jira/browse/HUDI-3637
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Ethan Guo
> Assignee: Ethan Guo
> Priority: Blocker
> Fix For: 0.11.0
>
>
> HoodieMetadataTableValidator validation of the latest base files and file
> slices fails due to the following (from MT, log files are missing, compared
> to FS view). The validation failure may be due to the inflight compaction.
> Need to investigate whether this affects the file listing for write
> operations. The behavior is that after some instants, the validation can
> pass, so MT correctness is guaranteed, but the file listing view may have a
> bug.
> {code:java}
> file slices from metadata: [FileSlice
> {fileGroupId=HoodieFileGroupId{partitionPath='2022/1/28',
> fileId='769bf7ac-d6d0-452c-bf54-bbe7e8381766-0'},
> baseCommitTime=20220314001058266,
> baseFile='HoodieBaseFile{fullPath=file:/Users/ethan/Work/scripts/mt_rollout_testing/deploy_c_multi_writer/c2_mor_010nomt_011mt/test_table/2022/1/28/769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_2-47-485_20220314001058266.parquet,
> fileLen=106839698, BootstrapBaseFile=null}', logFiles='[]'}]
> file slices from file system and base files: [FileSlice
> {fileGroupId=HoodieFileGroupId{partitionPath='2022/1/28',
> fileId='769bf7ac-d6d0-452c-bf54-bbe7e8381766-0'},
> baseCommitTime=20220314001058266,
> baseFile='HoodieBaseFile{fullPath=file:/Users/ethan/Work/scripts/mt_rollout_testing/deploy_c_multi_writer/c2_mor_010nomt_011mt/test_table/2022/1/28/769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_2-47-485_20220314001058266.parquet,
> fileLen=106839698, BootstrapBaseFile=null}',
> logFiles='[HoodieLogFile{pathStr='file:/Users/ethan/Work/scripts/mt_rollout_testing/deploy_c_multi_writer/c2_mor_010nomt_011mt/test_table/2022/1/28/.769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_20220314001058266.log.1_2-111-954',
> fileLen=51607682}]'}]
> 22/03/14 00:33:03 ERROR HoodieMetadataTableValidator: Metadata table
> validation failed for 2022/1/28 due to HoodieValidationException {code}
> Compaction:
> {code:java}
> Partition Path │ FileId │ Base-Instant │ Data File Path │ Total Delta Files │ getMetrics
> 2022/1/28 │ 769bf7ac-d6d0-452c-bf54-bbe7e8381766-0 │ 20220314001058266 │ 769bf7ac-d6d0-452c-bf54-bbe7e8381766-0_2-47-485_20220314001058266.parquet │ 1 │ {TOTAL_LOG_FILES=1.0, TOTAL_IO_READ_MB=151.0, TOTAL_LOG_FILES_SIZE=5.1607682E7, TOTAL_IO_WRITE_MB=101.0, TOTAL_IO_MB=252.0}
> {code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)