[
https://issues.apache.org/jira/browse/HUDI-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Y Ethan Guo updated HUDI-8138:
------------------------------
Fix Version/s: 1.1.0
> Filtering of clustering replacecommits should be resilient to ongoing
> replacecommit rollbacks
> ---------------------------------------------------------------------------------------------
>
> Key: HUDI-8138
> URL: https://issues.apache.org/jira/browse/HUDI-8138
> Project: Apache Hudi
> Issue Type: Wish
> Reporter: Krishen Bhan
> Priority: Trivial
> Fix For: 1.1.0
>
>
> *Issue*
> When a writer creates an AbstractFileSystem via
> `org.apache.hudi.common.table.view.AbstractTableFileSystemView#init`, the API
> `org.apache.hudi.common.util.ClusteringUtils#getAllPendingClusteringPlans`
> is called which checks wether a repalcecommit plan is clustering. In a
> similar manner, when a writer identifies failed instants to rollback, it
> calls
> `org.apache.hudi.client.BaseHoodieTableServiceClient#getInflightTimelineExcludeCompactionAndClustering`
> which uses `org.apache.hudi.common.util.ClusteringUtils#isClusteringInstant`
> to check wether the replacecommt plan is clustering.
> This since prior to
> [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-7905?filter=allissues]
> , both insert_overwrite and clustering operations use the replacecommit
> timeline action type.
> If a writer is using these APIs while (non-clustering) instants are being
> rolled back, these writers will unnecessarily fail with an exception, since
> in between filtering the timeline for inflight replacecommits and reading the
> plan metadata from DFS, the replacecommit.requested can be deleted by a
> concurrent rollback (since it is legal to rollback a non-clustering
> replacecommit plan).
> *Scenario*
> For example, when an ingestion job executes the insert/upsert phase, before
> it begins to map each input record into file group buckets it first
> cross-checks the input records and the file groups they belong to with the
> files modified by pending clustering instants. The following sequence events
> can lead to the ingestion job failing
> # There is a failed non-clustering replacecommit (RC) on timeline
> # Job A starts an ingestion commit. During the execution of ingestion, the
> upsert execution step finds RC on timeline. Because the
> replacecommit.requested shows that RC isn’t a clustering and doesn’t have any
> overlapping file groups with Job A’s in progress commit.
> # Job B starts, and same as Job A it finds RC. It begins to check wether RC
> has any pending clustering groups that could conflict with Job B’s
> in-progress commit
> # Job A completes its commit, and does its post-commit phase. This includes
> a lazy clean, where it rolls back RC, completely removing it from timeline
> # Job B attempts to open RC’s replacecommit.requested file, but fails with
> a file-not-found error due to the file no longer existing
> *Resolution*
> This limitation can be resolved by identifying specific APIs where HUDI
> filters a set of inflight replacecommits for instants that are clustering.
> The two cases mentioned above are specific APIs in HUDI, but there can
> potentially be more.
> Each case can be handled by updating the implementation to not suppress a
> file-not-found error. In other words, if a repalcecommit.requested no longer
> exists then it will be assumed that it was a non-clustering replacecommit.
> This should be a safe assumption, since if the replacecommit.requested
> belonged to a clustering operation then it would not have been deleted.
> Although locking/synchronization might also potentially resolve this issue
> (by having HUDI filter replacecommits + read all repalcecommit.requested
> files under a table lock), it is likely not a feasible solution since HUDI
> readers will not be able to use HUDI Multiwriter OCC semantics
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)