[
https://issues.apache.org/jira/browse/HUDI-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-5433:
--------------------------------------
Description:
We trigger compaction in the MDT (metadata table) only when there are no
pending inflight instants apart from the one that is currently updating the
MDT. The code snippet below is used for that check.
{code:java}
List<HoodieInstant> pendingInstants =
    dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
        .findInstantsBefore(instantTime).getInstants();
{code}
As you can see, we use "findInstantsBefore", which does not always yield the
right results: it only considers pending instants with a timestamp earlier than
instantTime, so other pending instants can be missed.
So, we need to find all inflight instants and check whether there are any other
than the current commit that is updating the MDT. If there are any, we should
defer compaction.
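The difference between the two checks can be sketched in plain Java. This is
only an illustration under stated assumptions, not the actual Hudi change: the
Instant class, the method names, and the use of fixed-width timestamp strings
compared lexicographically are all hypothetical stand-ins for the timeline API.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PendingInstantCheck {

    /** Hypothetical stand-in for a timeline instant (not the Hudi HoodieInstant class). */
    public static final class Instant {
        final String timestamp;   // assumed fixed-width, so string order matches time order
        final boolean completed;  // false means requested or inflight

        public Instant(String timestamp, boolean completed) {
            this.timestamp = timestamp;
            this.completed = completed;
        }
    }

    /** Mirrors the existing check: pending instants strictly before instantTime. */
    public static List<String> pendingBefore(List<Instant> timeline, String instantTime) {
        return timeline.stream()
            .filter(i -> !i.completed)
            .filter(i -> i.timestamp.compareTo(instantTime) < 0)
            .map(i -> i.timestamp)
            .collect(Collectors.toList());
    }

    /** Sketch of the proposed check: every pending instant except the current one. */
    public static List<String> pendingExcludingCurrent(List<Instant> timeline, String instantTime) {
        return timeline.stream()
            .filter(i -> !i.completed)
            .filter(i -> !i.timestamp.equals(instantTime))
            .map(i -> i.timestamp)
            .collect(Collectors.toList());
    }

    /** Compaction is deferred whenever any other pending instant exists. */
    public static boolean shouldDeferCompaction(List<Instant> timeline, String instantTime) {
        return !pendingExcludingCurrent(timeline, instantTime).isEmpty();
    }
}
```

For a timeline with a completed instant 001, the current inflight instant 002,
and another inflight instant 003, pendingBefore(timeline, "002") is empty,
while shouldDeferCompaction(timeline, "002") reports that compaction should be
deferred because of 003.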
Impact:
Writes to the MDT might fail if an inflight instant was missed and was later
rolled back. Users then have to disable the MDT to make progress.
was:
We trigger compaction in the MDT (metadata table) only when there are no
pending inflight instants apart from the one that is currently updating the
MDT. The code snippet below is used for that check.
{code:java}
List<HoodieInstant> pendingInstants =
    dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
        .findInstantsBefore(instantTime).getInstants();
{code}
As you can see, we use "findInstantsBefore", which does not always yield the
right results.
So, we need to find all inflight instants and check whether there are any other
than the current commit that is updating the MDT. If there are any, we should
defer compaction.
> Fix the way we deduce the pending instants for MDT writes
> ---------------------------------------------------------
>
> Key: HUDI-5433
> URL: https://issues.apache.org/jira/browse/HUDI-5433
> Project: Apache Hudi
> Issue Type: Bug
> Components: metadata
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Blocker
> Fix For: 0.13.0
>
>
> We trigger compaction in the MDT (metadata table) only when there are no
> pending inflight instants apart from the one that is currently updating the
> MDT. The code snippet below is used for that check.
>
> {code:java}
> List<HoodieInstant> pendingInstants =
>     dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
>         .findInstantsBefore(instantTime).getInstants();
> {code}
> As you can see, we use "findInstantsBefore", which does not always yield the
> right results: it only considers pending instants with a timestamp earlier
> than instantTime, so other pending instants can be missed.
>
> So, we need to find all inflight instants and check whether there are any
> other than the current commit that is updating the MDT. If there are any, we
> should defer compaction.
> Impact:
> Writes to the MDT might fail if an inflight instant was missed and was later
> rolled back. Users then have to disable the MDT to make progress.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)