Hi all,

I have created a PR, https://github.com/apache/hudi/pull/12856, for
RFC 90, which proposes making clustering plans "cancellable".

Background:

Clustering is a table service that optimizes the table/file layout in
Hudi to speed up read queries. Clustering plans can delay ingestion
writes from updating a dataset with recent data if potential write
conflicts are detected. Furthermore, a clustering plan that is not
executed to completion for a long time (due to repeated failures,
application misconfiguration, or insufficient resources) will degrade
the read/write performance of a dataset by delaying clean and
archival. This RFC proposes support for "cancellable" clustering
plans. Such plans would give Hudi a way to fully cancel a clustering
plan so that other table services and ingestion writers can proceed,
avoiding possible starvation (based on user needs).

Thanks. Any feedback would be appreciated.
-- 
From, Krishen Bhan
