[
https://issues.apache.org/jira/browse/HUDI-5863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-5863:
----------------------------
Description:
We observe that for MOR (Merge-on-Read) tables, occasionally (<10% for large
tables with frequent updates and compactions), the deltacommit that follows a
compaction commit adds a new log file to the old file slice instead of the
latest file slice in the corresponding file group. This happens only when both
the metadata table and the timeline server are enabled; if either is disabled,
the problem does not show up.
Deeper analysis of the code shows that the file system view at the timeline
server may serve a stale view, which causes the issue. The reason is that the
sync of HoodieMetadataFileSystemView is not atomic when the metadata table is
enabled:
{code:java}
// AbstractTableFileSystemView
@Override
public void sync() {
  HoodieTimeline oldTimeline = getTimeline();
  HoodieTimeline newTimeline =
      metaClient.reloadActiveTimeline().filterCompletedOrMajorOrMinorCompactionInstants();
  try {
    writeLock.lock();
    runSync(oldTimeline, newTimeline);
  } finally {
    writeLock.unlock();
  }
}

// HoodieMetadataFileSystemView
@Override
public void sync() {
  super.sync();           // acquires and releases the write lock internally
  tableMetadata.reset();  // runs after the write lock has already been released
}
{code}
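To make the non-atomicity concrete, here is a minimal, self-contained sketch. It is not Hudi code: the class AtomicSyncSketch, its two integer fields (standing in for the view's timeline and the metadata table listing), and both sync methods are hypothetical names introduced only for illustration. The first variant has the same shape as the code quoted above, so a reader that arrives between the two steps sees a mixed state. The second variant shows one possible way to close that window by doing both steps under the same write lock; it is an assumption for illustration, not necessarily the fix applied for this issue.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class AtomicSyncSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int timelineVersion = 0;  // hypothetical stand-in for the view's timeline
    private int metadataVersion = 0;  // hypothetical stand-in for the metadata listing

    /** Reader: true if both halves of the state belong to the same version. */
    boolean isConsistent() {
        lock.readLock().lock();
        try {
            return timelineVersion == metadataVersion;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Same shape as the quoted code: the second step runs after the lock is released. */
    void nonAtomicSync(int newVersion, Runnable betweenSteps) {
        lock.writeLock().lock();
        try {
            timelineVersion = newVersion;
        } finally {
            lock.writeLock().unlock();
        }
        betweenSteps.run();            // window in which readers can see a mixed state
        metadataVersion = newVersion;  // analogous to the reset after super.sync()
    }

    /** Possible remedy (an assumption): perform both steps under one write lock. */
    void atomicSync(int newVersion, Runnable betweenSteps) {
        lock.writeLock().lock();
        try {
            timelineVersion = newVersion;
            betweenSteps.run();        // readers on other threads are locked out here
            metadataVersion = newVersion;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Simulates a reader arriving between the two steps of the non-atomic sync. */
    public static boolean staleReadPossibleWithNonAtomicSync() {
        AtomicSyncSketch view = new AtomicSyncSketch();
        AtomicBoolean sawMixedState = new AtomicBoolean(false);
        view.nonAtomicSync(1, () -> sawMixedState.set(!view.isConsistent()));
        return sawMixedState.get();
    }

    /** Checks that a reader thread cannot take the read lock in the middle of the atomic sync. */
    public static boolean readerLockedOutDuringAtomicSync() {
        AtomicSyncSketch view = new AtomicSyncSketch();
        AtomicBoolean readerGotLock = new AtomicBoolean(true);
        view.atomicSync(1, () -> {
            Thread reader = new Thread(() -> {
                boolean acquired = view.lock.readLock().tryLock();
                readerGotLock.set(acquired);
                if (acquired) {
                    view.lock.readLock().unlock();
                }
            });
            reader.start();
            try {
                reader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return !readerGotLock.get() && view.isConsistent();
    }

    public static void main(String[] args) {
        System.out.println("non-atomic sync exposes mixed state: "
            + staleReadPossibleWithNonAtomicSync());
        System.out.println("atomic sync locks readers out mid-update: "
            + readerLockedOutDuringAtomicSync());
    }
}
```

In this model the mixed state corresponds to the stale view described above: the timeline half has advanced past the compaction while the metadata half still reflects the older listing.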
> Fix the file system view serving stale view at the timeline server
> ------------------------------------------------------------------
>
> Key: HUDI-5863
> URL: https://issues.apache.org/jira/browse/HUDI-5863
> Project: Apache Hudi
> Issue Type: Bug
> Components: timeline-server, writer-core
> Reporter: Ethan Guo
> Assignee: Ethan Guo
> Priority: Blocker
> Fix For: 0.13.1
>
>
>