[ 
https://issues.apache.org/jira/browse/HUDI-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-5433:
--------------------------------------
    Description: 
We trigger compaction in the MDT only when there are no pending inflight 
instants apart from the one that is currently updating the MDT. We use the code 
snippet below for this check. 

 
{code:java}
List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline()
    .filterInflightsAndRequested()
    .findInstantsBefore(instantTime)
    .getInstants();
{code}
As you can see, we use "findInstantsBefore", which may not always yield the 
right results.
 
So we need to find all inflight instants and check whether there are any other 
than the current commit that is updating the MDT. If there are any, we should 
defer compaction.
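
The difference between the two checks can be sketched as follows. This is a 
minimal, self-contained illustration, not the actual Hudi patch: 
TimelineInstant, pendingBefore, pendingExcludingCurrent and 
shouldDeferCompaction are hypothetical names, and the scenario (a concurrent 
writer with a later timestamp that is still inflight) is an assumed example of 
a case that a strictly-before filter would miss.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PendingInstantCheck {

    // Hypothetical stand-in for a timeline instant: a timestamp plus a completed flag.
    public static final class TimelineInstant {
        final String timestamp;
        final boolean completed;

        public TimelineInstant(String timestamp, boolean completed) {
            this.timestamp = timestamp;
            this.completed = completed;
        }
    }

    // Models the current check (assumption): pending instants strictly before the current one.
    public static List<String> pendingBefore(List<TimelineInstant> timeline, String current) {
        return timeline.stream()
            .filter(i -> !i.completed)
            .filter(i -> i.timestamp.compareTo(current) < 0)
            .map(i -> i.timestamp)
            .collect(Collectors.toList());
    }

    // Models the proposed check: every pending instant except the current one.
    public static List<String> pendingExcludingCurrent(List<TimelineInstant> timeline, String current) {
        return timeline.stream()
            .filter(i -> !i.completed)
            .filter(i -> !i.timestamp.equals(current))
            .map(i -> i.timestamp)
            .collect(Collectors.toList());
    }

    // Compaction is deferred whenever any other pending instant exists.
    public static boolean shouldDeferCompaction(List<TimelineInstant> timeline, String current) {
        return !pendingExcludingCurrent(timeline, current).isEmpty();
    }

    public static void main(String[] args) {
        List<TimelineInstant> timeline = Arrays.asList(
            new TimelineInstant("001", true),    // completed commit
            new TimelineInstant("002", false),   // current commit updating the MDT
            new TimelineInstant("003", false));  // assumed concurrent writer, still inflight

        System.out.println(pendingBefore(timeline, "002"));           // prints []
        System.out.println(pendingExcludingCurrent(timeline, "002")); // prints [003]
        System.out.println(shouldDeferCompaction(timeline, "002"));   // prints true
    }
}
```

In this example the strictly-before filter reports no pending instants and 
would allow compaction, while the exclude-current filter sees instant 003 and 
defers it.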

Impact:

Writes to the MDT might fail if an inflight instant was missed and was later 
rolled back. Users then have to disable the MDT to make progress.

  was:
We trigger compaction in the MDT only when there are no pending inflight 
instants apart from the one that is currently updating the MDT. We use the code 
snippet below for this check. 

 
{code:java}
List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline()
    .filterInflightsAndRequested()
    .findInstantsBefore(instantTime)
    .getInstants();
{code}
As you can see, we use "findInstantsBefore", which may not always yield the 
right results.
 
So we need to find all inflight instants and check whether there are any other 
than the current commit that is updating the MDT. If there are any, we should 
defer compaction.


> Fix the way we deduce the pending instants for MDT writes
> ---------------------------------------------------------
>
>                 Key: HUDI-5433
>                 URL: https://issues.apache.org/jira/browse/HUDI-5433
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.13.0
>
>
> We trigger compaction in the MDT only when there are no pending inflight 
> instants apart from the one that is currently updating the MDT. We use the 
> code snippet below for this check. 
>  
> {code:java}
> List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline()
>     .filterInflightsAndRequested()
>     .findInstantsBefore(instantTime)
>     .getInstants();
> {code}
> As you can see, we use "findInstantsBefore", which may not always yield the 
> right results.
>  
> So we need to find all inflight instants and check whether there are any 
> other than the current commit that is updating the MDT. If there are any, we 
> should defer compaction.
> Impact:
> Writes to the MDT might fail if an inflight instant was missed and was later 
> rolled back. Users then have to disable the MDT to make progress.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
