Re: [PR] [RFC-79] Improving reliability of concurrent table service executions and rollbacks [hudi]

via GitHub Mon, 09 Dec 2024 17:07:35 -0800


nsivabalan commented on code in PR #11555:
URL: https://github.com/apache/hudi/pull/11555#discussion_r1876888974



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.

Review Comment:
   Currently this is an issue only for clustering. Pending compaction is no 
longer an issue with NBCC (non-blocking concurrency control) in 1.x. We can 
leave the title as is, but the motivation can focus on clustering. We could 
call out that, with NBCC becoming the norm in 1.x, this should not be an issue 
for compaction.



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and sucessfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implictly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table services plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement of HUDI needing to execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Becuase an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will cause delays in adding 
recent upstream records to the dataset as well as unecessairly take away 
resources (such as Spark executors in the case of the Spark engine) from other 
applications in the data lake. A cancellable table service plan should avoid 
this situation by preventing itself from being committed if a conflicting 
ingestion job has been comitted already, and cancel itself. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to
  infer that a conflicting inflight table service plan is cancellable, and 
therefore can be ignored when attempting to commit the instant. 
+
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explictly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this not 
require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable tale service plan.
+
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up

Review Comment:
   typo "cancellable" -> "cancelled" 



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and sucessfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implictly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table services plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement of HUDI needing to execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Becuase an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will cause delays in adding 
recent upstream records to the dataset as well as unecessairly take away 
resources (such as Spark executors in the case of the Spark engine) from other 
applications in the data lake. A cancellable table service plan should avoid 
this situation by preventing itself from being committed if a conflicting 
ingestion job has been comitted already, and cancel itself. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to
  infer that a conflicting inflight table service plan is cancellable, and 
therefore can be ignored when attempting to commit the instant. 
+
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explictly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this not 
require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable tale service plan.
+
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up
+Although cancellation (be it via an explict request or due to a write 
conflict) can ensure that a table service write is never committed, there still 
needs to be a mechanism to have its data and instant files cleaned up 
permenantly. At minumum the table service writer itself should be able to do 
this cleanup, but this is not sufficient as orchestration/transient 
failrures/resource allocation can prevent table service writers from 
re-attempting their write. Clean can be used to guarantee that an incomplete 
cancellable plan is eventually cleaned up, since datasets that undergo 
clustering are anyway expected to undergo regular clean operations. Because an 
inflight plan remaining on the timeline can degrade performance of reads/writes 
(as mentioned earlier), a cancellable table service plan should be elligible to 
be targeted for cleanup if HUDI clean deems that it has remained inflight for 
too long (or some other critera).
+Note that a failed table service should still be able to be safely cleaned up 
immediately  - the goal here is just to make sure an inflight plan won't stay 
on the timeline for an unbounded amount of time but also won't be likely to be 
prematurely cleaned up by clean before it has a chance to be executed.
+
+## Design
+### Enabling a plan to be cancellable
+To satisfy goal (A), a new config flag "cancellable" can be added to a table 
service plan. A writer that intends to schedule a cancellable table service 
plan can enable the flag in the serialized plan metadata. Any writer executing 
the plan can infer that the plan is cancellable, and when trying to commit the 
instant should abort if it detects that any ingestion write or table service 
plan (without cancellable config flag) is targeting the same file groups. As a 
future optimization, the cancellable table writer can use early conflict 
detection (instead of waiting until committing the instant) to repeatadly poll 
for any conflicting write appearing on timeline, and abort earlier if needed.
+On the other side in ingestion write, the commit finalization flow for 
ingestion writers can be updated to ignore any inflight table service plans if 
they are cancellable.
+For the purpose of this design proposal, consider an ingestion job as having 
three steps:
+1. Schedule itself on the timeline with a new instant time in a .requested file
+2. Process/record tag incoming records, build a workload profile, and write 
the updating/replaced file groups to a "inflight" instant file on the timeline. 
Check for conflicts and abort if needed.
+3. Perform write conflict checks and commit the instant on the timeline
+
+The aforementioned changes to ingestion and table service flow will ensure 
that in the event of a conflicting ingestion and cancellable table service 
writer, the ingestion job will take precedence (and cause the cancellable table 
service instant to eventually fail) as long as a cancellable table service 
hasn't be completed before (2). Since if the cancellable table service has 
already been completed before (2), the ingestion job will see that a completed 
instant (a cancellable table service action) conflicts with its ongoing 
inflight write, and therefore it would not be legal to proceed. 
+
+### Adding a cancel action and abort state for cancellable plans
+This proposed design will also involve adding a new instant state and interal 
hoodie metadata directory, by making the following changes:
+* Add an ".aborted" state type for cancellable table service plan. This state 
is terminal and an instant can only be transitioned to .*commit or .aborted 
(not both)
+* Add a new .hoodie/.cancel folder, where each file corresponds to an instant 
time that a writer requested for cancellation. As will be detailed below, a 
writer can cancel an inflight plan by adding the instant to this directoy and 
execution of table service will not allow an instant to be comitted if it 
appears in this /.cancel directoy (or has been already transitioned to .aborted 
state)
+
+The new /.cancel folder will enable goal (B) by allowing writers to 
permentantly prevent an ongoing cancelltable table service write from 
comitting, without needing to block/wait for the table service writer (or rely 
on a conflicting ingestion write appearing and taking precedence). Once an 
instant is requested for cancellation (added to /.cancel) it cannot be 
"un-cancelled" - it must be eventually transitioned to aborted state. As an 
optimization, ingestion and non-cancellable table service flows will be updated 
such that during write-conflict detection, it will create an entry in /.cancel 
for any cancellable plans with a detected write conflict and will ignore any 
candidate inflight plans that have an entry in /.cancel. 
+
+The new ".aborted" state will allow writers to infer wether a cancelled table 
service plan needs to still have it's partial data writes cleaned up from the 
dataset, which is needed for (C). The additional design change below will 
complete the remaining requirement for (C) of eventual cleanup.
+  
+### Handling cancellation of plans
+An additional config "cancellation-policy" can be added to the table service 
plan to indicate when it is ellgible to be permanently cancelled by writers 
other than the one responsbible for executing the table service. This policy 
can be a threshold of hours or instants on timeline, where if that # of 
hours/instants have elapsed since the plan was scheduled, any call to clean 
will cleanup the instant. This policy should be configured by the writer 
scheduling a cancellable table service, based on the amount of time they expect 
the plan to remain on the timeline before being picked up for execution. For 
example, if the plan is expected to have its execution deferred to a few hours 
later, then the cancellation-policy should be lenient in allowing the plan to 
remain many hours on the timeline before being subject to clean's cancellation. 
Note that this cancellation policy is not a repalacement for determining wether 
a table service plan is currently being executed - as with ingestion writes, 
permanent cleanup of a cancellable table service plan will only start 
once it is confirmed that a ongoing writer is no longer progressing it. 

Review Comment:
   We also need to account for failed rollback attempts. 
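
   To make the concern concrete, here is a minimal sketch of one way the 
quoted design could behave when a rollback attempt fails. It is an 
illustration under stated assumptions, not Hudi code: `Plan`, `State`, 
`may_commit`, `clean_should_abort`, `abort`, the `writer_active` flag (a 
stand-in for a heartbeat check), and the hours-based threshold are all 
hypothetical names. The idea is that the entry in /.cancel is written first and 
the instant only moves to the terminal aborted state after rollback succeeds, 
so a failed attempt can be retried by a later clean.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"
    ABORTED = "aborted"


@dataclass
class Plan:
    # All fields are hypothetical; they only mirror concepts from the RFC text.
    instant_time: str
    cancellable: bool
    state: State
    hours_since_scheduled: float
    writer_active: bool  # stand-in for "an ongoing writer is still progressing it"


def may_commit(plan, cancel_requests):
    """A plan cannot commit once it is terminal or has an entry in /.cancel."""
    if plan.state in (State.ABORTED, State.COMPLETED):
        return False
    return not (plan.cancellable and plan.instant_time in cancel_requests)


def clean_should_abort(plan, cancel_requests, max_hours):
    """Decide whether clean may start permanent cleanup of this plan."""
    if not plan.cancellable or plan.state in (State.ABORTED, State.COMPLETED):
        return False
    if plan.writer_active:
        # The policy is not a replacement for checking for a live writer.
        return False
    requested = plan.instant_time in cancel_requests
    expired = plan.hours_since_scheduled >= max_hours
    return requested or expired


def abort(plan, cancel_requests, rollback):
    """Cancel irrevocably, then roll back; only a successful rollback is terminal."""
    cancel_requests.add(plan.instant_time)  # cannot be "un-cancelled"
    rollback(plan)  # may raise; the plan then stays non-terminal for a retry
    plan.state = State.ABORTED
```

   With this ordering, a plan whose rollback failed still has its /.cancel 
entry, so it can never commit, and the next clean sees it as eligible again 
regardless of the time threshold.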



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,140 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and sucessfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implictly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table services plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement of HUDI needing to execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Becuase an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will cause delays in adding 
recent upstream records to the dataset as well as unecessairly take away 
resources (such as Spark executors in the case of the Spark engine) from other 
applications in the data lake. A cancellable table service plan should avoid 
this situation by preventing itself from being committed if a conflicting 
ingestion job has been comitted already, and cancel itself. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to
  infer that a conflicting inflight table service plan is cancellable, and 
therefore can be ignored when attempting to commit the instant. 

Review Comment:
   Note on "In conjunction, any ingestion writer or non-cancellable table 
service writer should be able to infer that a conflicting inflight table 
service plan is cancellable, and therefore can be ignored when attempting to 
commit the instant.": there should not be any conflict between multiple table 
services, because while scheduling clustering or compaction we ignore all file 
groups that are part of pending table service plans. So, unless we relax that 
constraint, this should not be an issue. 
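
   As a rough illustration of that scheduling constraint (a sketch only; 
`schedulable_file_groups` and the dictionary of pending plans are hypothetical 
names, not Hudi APIs), candidate file groups that already belong to a pending 
table service plan are simply dropped before a new plan is built:

```python
def schedulable_file_groups(candidates, pending_plans):
    """Return candidates that are not part of any pending table service plan.

    candidates: list of file group ids considered for a new plan.
    pending_plans: dict mapping instant time -> iterable of file group ids
                   already claimed by a pending clustering/compaction plan.
    """
    claimed = set()
    for file_groups in pending_plans.values():
        claimed.update(file_groups)
    # Preserve the original candidate order while filtering.
    return [fg for fg in candidates if fg not in claimed]
```

   Because two plans built this way never share a file group, a conflict 
between table services would only arise if this filter were relaxed.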



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and sucessfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implictly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table services plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement of HUDI needing to execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Becuase an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will cause delays in adding 
recent upstream records to the dataset as well as unecessairly take away 
resources (such as Spark executors in the case of the Spark engine) from other 
applications in the data lake. A cancellable table service plan should avoid 
this situation by preventing itself from being committed if a conflicting 
ingestion job has been comitted already, and cancel itself. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to
  infer that a conflicting inflight table service plan is cancellable, and 
therefore can be ignored when attempting to commit the instant. 
+
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explictly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this not 
require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable tale service plan.

Review Comment:
   Does "ongoing concurrent writer" here refer to the table service execution 
worker? We sometimes overload the term "concurrent writer". At first glance, I 
thought it was another concurrent ingestion writer. 



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up
+Although cancellation (be it via an explict request or due to a write 
conflict) can ensure that a table service write is never committed, there still 
needs to be a mechanism to have its data and instant files cleaned up 
permenantly. At minumum the table service writer itself should be able to do 
this cleanup, but this is not sufficient as orchestration/transient 
failrures/resource allocation can prevent table service writers from 
re-attempting their write. Clean can be used to guarantee that an incomplete 
cancellable plan is eventually cleaned up, since datasets that undergo 
clustering are anyway expected to undergo regular clean operations. Because an 
inflight plan remaining on the timeline can degrade performance of reads/writes 
(as mentioned earlier), a cancellable table service plan should be elligible to 
be targeted for cleanup if HUDI clean deems that it has remained inflight for 
too long (or some other critera).
+Note that a failed table service should still be able to be safely cleaned up 
immediately  - the goal here is just to make sure an inflight plan won't stay 
on the timeline for an unbounded amount of time but also won't be likely to be 
prematurely cleaned up by clean before it has a chance to be executed.
+
+## Design
+### Enabling a plan to be cancellable
+To satisfy goal (A), a new config flag "cancellable" can be added to a table 
service plan. A writer that intends to schedule a cancellable table service 
plan can enable the flag in the serialized plan metadata. Any writer executing 
the plan can infer that the plan is cancellable, and when trying to commit the 
instant should abort if it detects that any ingestion write or table service 
plan (without cancellable config flag) is targeting the same file groups. As a 
future optimization, the cancellable table writer can use early conflict 
detection (instead of waiting until committing the instant) to repeatadly poll 
for any conflicting write appearing on timeline, and abort earlier if needed.

Review Comment:
   Let's make sure we align on one of the approaches: 
   a. Either there is no ".cancel" request, and so the clustering job, at the 
end, inspects all other ingestion commits that completed and aborts itself if 
any of them conflict. 
   OR 
   b. We have ".cancel", and so the clustering job never has to inspect other 
ingestion commits while trying to complete. If there is any overlap between an 
ingestion writer and a pending cancellable clustering plan, the ingestion 
writer must add a cancel request without fail. 
   
   I am inclined towards (b), so that clustering does not need to wait until 
the very end. 
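   
   To make (b) concrete, here is a rough sketch of how I picture the two 
sides interacting. It is a toy in-memory model only: `Timeline`, 
`schedule_cancellable`, `commit_ingestion` and `try_commit_table_service` are 
hypothetical names for illustration, not existing Hudi APIs, and it ignores 
locking and the case where the clustering job commits first. 

```python
from dataclasses import dataclass, field


@dataclass
class Timeline:
    """Toy stand-in for the timeline (hypothetical, not Hudi's API)."""
    cancellable_plans: dict = field(default_factory=dict)  # instant -> file groups
    cancel_requests: set = field(default_factory=set)      # instants with ".cancel"
    completed: set = field(default_factory=set)

    def schedule_cancellable(self, instant, file_groups):
        self.cancellable_plans[instant] = set(file_groups)

    def commit_ingestion(self, instant, file_groups):
        """Ingestion side: add a cancel request for every overlapping pending
        cancellable plan, then commit instead of aborting."""
        touched = set(file_groups)
        for plan_instant, plan_groups in self.cancellable_plans.items():
            if plan_instant not in self.completed and plan_groups & touched:
                self.cancel_requests.add(plan_instant)
        self.completed.add(instant)

    def try_commit_table_service(self, instant):
        """Table service side: check only for a cancel request on its own
        instant; no inspection of other ingestion commits."""
        if instant in self.cancel_requests:
            return False
        self.completed.add(instant)
        return True


timeline = Timeline()
timeline.schedule_cancellable("001", {"fg1", "fg2"})
timeline.schedule_cancellable("002", {"fg9"})
timeline.commit_ingestion("003", {"fg2", "fg3"})  # overlaps plan 001 on fg2
plan_001_committed = timeline.try_commit_table_service("001")
plan_002_committed = timeline.try_commit_table_service("002")
```

   With this shape the clustering job only has to look for its own cancel 
request, which it could also do periodically during execution rather than only 
at commit time. 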
   



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up
+Although cancellation (be it via an explict request or due to a write 
conflict) can ensure that a table service write is never committed, there still 
needs to be a mechanism to have its data and instant files cleaned up 
permenantly. At minumum the table service writer itself should be able to do 
this cleanup, but this is not sufficient as orchestration/transient 
failrures/resource allocation can prevent table service writers from 
re-attempting their write. Clean can be used to guarantee that an incomplete 
cancellable plan is eventually cleaned up, since datasets that undergo 
clustering are anyway expected to undergo regular clean operations. Because an 
inflight plan remaining on the timeline can degrade performance of reads/writes 
(as mentioned earlier), a cancellable table service plan should be elligible to 
be targeted for cleanup if HUDI clean deems that it has remained inflight for 
too long (or some other critera).
+Note that a failed table service should still be able to be safely cleaned up 
immediately  - the goal here is just to make sure an inflight plan won't stay 
on the timeline for an unbounded amount of time but also won't be likely to be 
prematurely cleaned up by clean before it has a chance to be executed.
+

Review Comment:
   I know it's implicit, but should we call out the following as the next 
bullet? 
   - A cancelled table service (or even one whose cancellation request has 
succeeded) should not cause any future ingestion writer to abort. 



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+## Design
+### Enabling a plan to be cancellable
+To satisfy goal (A), a new config flag "cancellable" can be added to a table 
service plan. A writer that intends to schedule a cancellable table service 
plan can enable the flag in the serialized plan metadata. Any writer executing 
the plan can infer that the plan is cancellable, and when trying to commit the 
instant should abort if it detects that any ingestion write or table service 
plan (without cancellable config flag) is targeting the same file groups. As a 
future optimization, the cancellable table writer can use early conflict 
detection (instead of waiting until committing the instant) to repeatadly poll 
for any conflicting write appearing on timeline, and abort earlier if needed.

Review Comment:
   I might have answered this above myself, but anyway: 
   is there a chance that two cancellable clustering plans overlap and both 
cancel/abort themselves during conflict resolution? 
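   
   For reference, a literal reading of the Design paragraph quoted above 
(abort only on overlap with an ingestion write or a table service plan without 
the cancellable flag) could be sketched as below. `should_abort` and its 
argument shapes are hypothetical, purely to pin down the question. 

```python
def should_abort(plan_file_groups, other_writes):
    """Abort check for a cancellable plan, per a literal reading of the RFC text.

    other_writes: iterable of (file_groups, is_cancellable_table_service) pairs
    describing the other writes seen on the timeline (hypothetical shape).
    """
    mine = set(plan_file_groups)
    return any(
        (not is_cancellable) and bool(mine & set(groups))
        for groups, is_cancellable in other_writes
    )


# Two overlapping cancellable plans: neither aborts because of the other.
overlap_with_cancellable = should_abort({"fg1"}, [({"fg1"}, True)])
# Overlap with an ingestion write or non-cancellable plan: abort.
overlap_with_ingestion = should_abort({"fg1"}, [({"fg1"}, False)])
```

   Under that reading two overlapping cancellable plans would not abort each 
other at this check; whether that is the intended behaviour is what I would 
like the RFC to state explicitly. 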



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explictly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this not 
require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable tale service plan.

Review Comment:
   Typo: "tale" (third-to-last word). 
   Also, the last statement is not very clear. Can we rephrase it a bit? 



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and sucessfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implictly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table services plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement that HUDI execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Because an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will delay adding 
recent upstream records to the dataset and unnecessarily take 
resources (such as Spark executors in the case of the Spark engine) away from 
other applications in the data lake. A cancellable table service plan should 
avoid this situation by cancelling itself instead of committing if a conflicting 
ingestion job has already been committed. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to 
infer that a conflicting inflight table service plan is cancellable, and 
can therefore be ignored when attempting to commit the instant.
+
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explicitly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to the presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this 
does not require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable table service plan.
+
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up
+Although cancellation (be it via an explicit request or due to a write 
conflict) can ensure that a table service write is never committed, there still 
needs to be a mechanism to have its data and instant files cleaned up 
permanently. At minimum the table service writer itself should be able to do 
this cleanup, but this is not sufficient, as orchestration/transient 
failures/resource allocation can prevent table service writers from 
re-attempting their write. Clean can be used to guarantee that an incomplete 
cancellable plan is eventually cleaned up, since datasets that undergo 
clustering are anyway expected to undergo regular clean operations. Because an 
inflight plan remaining on the timeline can degrade performance of reads/writes 
(as mentioned earlier), a cancellable table service plan should be eligible to 
be targeted for cleanup if HUDI clean deems that it has remained inflight for 
too long (or meets some other criteria).
+Note that a failed table service should still be able to be safely cleaned up 
immediately; the goal here is just to make sure an inflight plan won't stay 
on the timeline for an unbounded amount of time, but also won't be likely to be 
prematurely cleaned up by clean before it has a chance to be executed.
+
+## Design
+### Enabling a plan to be cancellable
+To satisfy goal (A), a new config flag "cancellable" can be added to a table 
service plan. A writer that intends to schedule a cancellable table service 
plan can enable the flag in the serialized plan metadata. Any writer executing 
the plan can infer that the plan is cancellable, and when trying to commit the 
instant should abort if it detects that any ingestion write or table service 
plan (without the cancellable config flag) is targeting the same file groups. 
As a future optimization, the cancellable table service writer can use early 
conflict detection (instead of waiting until committing the instant) to 
repeatedly poll for any conflicting write appearing on the timeline, and abort 
earlier if needed.
+On the ingestion side, the commit finalization flow for 
ingestion writers can be updated to ignore any inflight table service plans if 
they are cancellable.
+For the purpose of this design proposal, consider an ingestion job as having 
three steps:
+1. Schedule itself on the timeline with a new instant time in a .requested file
+2. Process/record tag incoming records, build a workload profile, and write 
the updated/replaced file groups to an "inflight" instant file on the timeline. 
Check for conflicts and abort if needed.
+3. Perform write conflict checks and commit the instant on the timeline
+
+The aforementioned changes to ingestion and table service flow will ensure 
that in the event of a conflicting ingestion and cancellable table service 
writer, the ingestion job will take precedence (and cause the cancellable table 
service instant to eventually fail) as long as the cancellable table service 
hasn't been completed before (2). If the cancellable table service has 
already been completed before (2), the ingestion job will see that a completed 
instant (a cancellable table service action) conflicts with its ongoing 
inflight write, and therefore it would not be legal to proceed.
+
+### Adding a cancel action and abort state for cancellable plans
+This proposed design will also involve adding a new instant state and internal 
hoodie metadata directory, by making the following changes:
+* Add an ".aborted" state type for cancellable table service plans. This state 
is terminal, and an instant can only be transitioned to .*commit or .aborted 
(not both).
+* Add a new .hoodie/.cancel folder, where each file corresponds to an instant 
time that a writer requested for cancellation. As will be detailed below, a 
writer can cancel an inflight plan by adding the instant to this directory, and 
execution of the table service will not allow an instant to be committed if it 
appears in this /.cancel directory (or has already been transitioned to the 
.aborted state).
+
+The new /.cancel folder will enable goal (B) by allowing writers to 
permanently prevent an ongoing cancellable table service write from 
committing, without needing to block/wait for the table service writer (or rely 
on a conflicting ingestion write appearing and taking precedence). Once an 
instant is requested for cancellation (added to /.cancel) it cannot be 
"un-cancelled"; it must eventually be transitioned to the aborted state. As an 
optimization, ingestion and non-cancellable table service flows will be updated 
such that during write-conflict detection, they will create an entry in /.cancel 
for any cancellable plans with a detected write conflict and will ignore any 
candidate inflight plans that have an entry in /.cancel.
+
+The new ".aborted" state will allow writers to infer whether a cancelled table 
service plan still needs to have its partial data writes cleaned up from the 
dataset, which is needed for (C). The additional design change below will 
complete the remaining requirement for (C) of eventual cleanup.
+  
+### Handling cancellation of plans
+An additional config "cancellation-policy" can be added to the table service 
plan to indicate when it is eligible to be permanently cancelled by writers 
other than the one responsible for executing the table service. This policy 
can be a threshold of hours or instants on the timeline, where if that number 
of hours/instants has elapsed since the plan was scheduled, any call to clean 
will clean up the instant. This policy should be configured by the writer 
scheduling a cancellable table service, based on the amount of time they expect 
the plan to remain on the timeline before being picked up for execution. For 
example, if the plan is expected to have its execution deferred to a few hours 
later, then the cancellation-policy should be lenient in allowing the plan to 
remain many hours on the timeline before being subject to clean's cancellation. 
Note that this cancellation policy is not a replacement for determining whether 
a table service plan is currently being executed: as with ingestion writes, 
permanent cleanup of a cancellable table service plan will only start 
once it is confirmed that an ongoing writer is no longer progressing it.

Review Comment:
   case a: scheduled. never got a chance to execute. no cancellation request.
   case b: scheduled. cancellation request added. never got a chance to execute.
   case c: scheduled. execution attempted. cancellation request added. clustering job crashed and never resumed.
   case d: scheduled. execution keeps failing on multiple re-attempts. no cancellation request.
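
To make these cases concrete, here is a minimal Python sketch of one possible reading of the RFC text: clean may abort a cancellable plan only when no writer is progressing it, and either a /.cancel entry exists or the cancellation-policy threshold has elapsed. The `PendingPlan` type, the `clean_may_abort` helper, and the hour values are all hypothetical and are not part of the RFC or of Hudi.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingPlan:
    """Hypothetical view of a cancellable table service plan on the timeline."""
    name: str
    hours_since_scheduled: float
    cancel_requested: bool  # assumed: an entry for the instant exists in /.cancel
    writer_active: bool     # assumed: a writer is still progressing the plan


def clean_may_abort(plan: PendingPlan, policy_hours: float) -> bool:
    """Return True if clean could move the plan to .aborted and clean it up."""
    if plan.writer_active:
        # Permanent cleanup only starts once no writer is progressing the plan.
        return False
    # Either a cancellation was requested, or the cancellation-policy
    # threshold has elapsed since the plan was scheduled.
    return plan.cancel_requested or plan.hours_since_scheduled >= policy_hours


# Illustrative values only: a 24 hour policy, with case d left pending for 30 hours.
cases = [
    PendingPlan("a: never executed, no cancel request", 2.0, False, False),
    PendingPlan("b: cancel requested, never executed", 2.0, True, False),
    PendingPlan("c: cancel requested, job crashed", 2.0, True, False),
    PendingPlan("d: repeated failures, no cancel request", 30.0, False, False),
]

decisions = {p.name[0]: clean_may_abort(p, policy_hours=24.0) for p in cases}
print(decisions)
```

Under these assumptions, cases b and c are eligible as soon as no writer is active, while cases a and d depend only on how long the plan has been on the timeline.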
   
   
   



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+### Adding a cancel action and abort state for cancellable plans

Review Comment:
   The abort state will also be useful in other places. 
   For example, while scheduling future compactions/clustering, we generally 
try to avoid file groups that are part of pending clustering plans. But with 
the abort state, we should only avoid file groups that are part of pending 
clustering plans that are neither up for cancellation nor already aborted. 
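
A rough Python sketch of that scheduling filter, for illustration only. `PendingClusteringPlan` and `file_groups_to_avoid` are hypothetical names, and the `cancel_requested` and `aborted` flags stand in for however the implementation would detect a /.cancel entry or an .aborted instant.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class PendingClusteringPlan:
    """Hypothetical summary of a clustering plan seen by the scheduler."""
    instant_time: str
    file_groups: FrozenSet[str]
    cancel_requested: bool = False  # assumed: instant has an entry in /.cancel
    aborted: bool = False           # assumed: instant reached the .aborted state


def file_groups_to_avoid(plans: Iterable[PendingClusteringPlan]) -> Set[str]:
    """Collect file groups that a new plan should not target."""
    blocked: Set[str] = set()
    for plan in plans:
        # Plans that are up for cancellation or already aborted will never
        # commit, so their file groups do not need to be avoided.
        if plan.cancel_requested or plan.aborted:
            continue
        blocked |= plan.file_groups
    return blocked


plans = [
    PendingClusteringPlan("001", frozenset({"fg-1", "fg-2"})),
    PendingClusteringPlan("002", frozenset({"fg-3"}), cancel_requested=True),
    PendingClusteringPlan("003", frozenset({"fg-4"}), aborted=True),
]
blocked = file_groups_to_avoid(plans)
print(sorted(blocked))
```

Only the file groups of plan "001" are blocked here; the other two plans can no longer commit, so their file groups are free for new plans.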
   
   



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+### Adding a cancel action and abort state for cancellable plans

Review Comment:
   Can we split this into two paragraphs: the first focusing on .cancel and 
the second discussing the abort state?



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. And 
additonally will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) as well as can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
 other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and sucessfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implictly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table services plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement of HUDI needing to execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Becuase an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will cause delays in adding 
recent upstream records to the dataset as well as unecessairly take away 
resources (such as Spark executors in the case of the Spark engine) from other 
applications in the data lake. A cancellable table service plan should avoid 
this situation by preventing itself from being committed if a conflicting 
ingestion job has been comitted already, and cancel itself. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to
  infer that a conflicting inflight table service plan is cancellable, and 
therefore can be ignored when attempting to commit the instant. 
+
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explictly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this not 
require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable tale service plan.
+
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up
+Although cancellation (be it via an explicit request or due to a write 
conflict) can ensure that a table service write is never committed, there still 
needs to be a mechanism to have its data and instant files cleaned up 
permanently. At a minimum the table service writer itself should be able to do 
this cleanup, but this is not sufficient as orchestration/transient 
failures/resource allocation can prevent table service writers from 
re-attempting their write. Clean can be used to guarantee that an incomplete 
cancellable plan is eventually cleaned up, since datasets that undergo 
clustering are anyway expected to undergo regular clean operations. Because an 
inflight plan remaining on the timeline can degrade performance of reads/writes 
(as mentioned earlier), a cancellable table service plan should be eligible to 
be targeted for cleanup if HUDI clean deems that it has remained inflight for 
too long (or some other criteria).
+Note that a failed table service should still be able to be safely cleaned up 
immediately - the goal here is just to make sure an inflight plan won't stay 
on the timeline for an unbounded amount of time but also won't be likely to be 
prematurely cleaned up by clean before it has a chance to be executed.
+
+## Design
+### Enabling a plan to be cancellable
+To satisfy goal (A), a new config flag "cancellable" can be added to a table 
service plan. A writer that intends to schedule a cancellable table service 
plan can enable the flag in the serialized plan metadata. Any writer executing 
the plan can infer that the plan is cancellable, and when trying to commit the 
instant should abort if it detects that any ingestion write or table service 
plan (without cancellable config flag) is targeting the same file groups. As a 
future optimization, the cancellable table writer can use early conflict 
detection (instead of waiting until committing the instant) to repeatedly poll 
for any conflicting write appearing on the timeline, and abort earlier if needed.
+On the ingestion side, the commit finalization flow for 
ingestion writers can be updated to ignore any inflight table service plans if 
they are cancellable.
+For the purpose of this design proposal, consider an ingestion job as having 
three steps:
+1. Schedule itself on the timeline with a new instant time in a .requested file
+2. Process/record tag incoming records, build a workload profile, and write 
the updating/replaced file groups to an "inflight" instant file on the timeline. 
Check for conflicts and abort if needed.
+3. Perform write conflict checks and commit the instant on the timeline
+
+The aforementioned changes to ingestion and table service flow will ensure 
that in the event of a conflicting ingestion and cancellable table service 
writer, the ingestion job will take precedence (and cause the cancellable table 
service instant to eventually fail) as long as a cancellable table service 
hasn't been completed before (2). If the cancellable table service has 
already been completed before (2), the ingestion job will see that a completed 
instant (a cancellable table service action) conflicts with its ongoing 
inflight write, and therefore it would not be legal to proceed. 
+
+### Adding a cancel action and abort state for cancellable plans

Review Comment:
   we might need to compensate for some actions taken on the table though, if a 
pending clustering is aborted. 
   for e.g., the delete partition operation. 
   
   let's check the bucket index flows as well to ensure we are not missing 
anything w/ the new flow.



##########
rfc/rfc-79/rfc-79.md:
##########
@@ -0,0 +1,154 @@
+w<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+# Add support for cancellable table service plans
+
+## Proposers
+
+
+## Approvers
+
+## Status
+
+JIRA: HUDI-7946
+
+
+## Abstract
+Table service plans can delay ingestion writes from updating a dataset with 
recent data if potential write conflicts are detected. Furthermore, a table 
service plan that isn't executed to completion for a large amount of time (due 
to repeated failures, application misconfiguration, or insufficient resources) 
will degrade the read/write performance of a dataset due to delaying clean, 
archival, and metadata table compaction. This is because currently HUDI table 
service plans, upon being scheduled, must be executed to completion. 
Additionally, they will prevent any ingestion write targeting the same files from 
succeeding (due to posing as a write conflict) and can prevent new table 
service plans from targeting the same files. Enabling a user to configure a 
table service plan as "cancellable" can prevent frequent or repeatedly failing 
table service plans from delaying ingestion. Support for cancellable plans will 
provide HUDI an avenue to fully cancel a table service plan and allow 
other table service and ingestion writers to proceed.
+
+
+## Background
+### Execution of table services 
+The table service operations compact and cluster are by default "immutable" 
plans, meaning that once a plan is scheduled it will stay as a pending 
instant until a caller invokes the table service execute API on the table 
service instant and successfully completes it. Specifically, if an inflight 
execution fails after transitioning the instant to inflight, the next execution 
attempt will implicitly create and execute a rollback plan (which will delete 
all new instant/data files), but will keep the table service plan. This process 
will repeat until the instant is completed. The below visualization captures 
these transitions at a high level 
+
+![table service lifecycle 
(1)](https://github.com/user-attachments/assets/4a656bde-4046-4d37-9398-db96144207aa)
+
+## Clean and rollback of failed writes
+The clean table service, in addition to performing a clean action, is 
responsible for rolling back any failed ingestion writes 
(non-clustering/non-compaction inflight instants that are not being 
concurrently executed by a writer). This means that table service plans are 
not currently subject to clean's rollback of failed writes. As detailed below, 
this proposal for supporting cancellable table service will benefit from 
enabling clean to be capable of targeting table service plans.
+
+## Goals
+### (A) A cancellable table service plan should be capable of preventing 
itself from committing upon presence of write conflict
+The current requirement of HUDI needing to execute a table service plan to 
completion forces ingestion writers to abort a commit if a table service plan 
is conflicting. Because an ingestion writer typically determines the exact file 
groups it will be updating/replacing after building a workload profile and 
performing record tagging, the writer may have already spent a lot of time and 
resources before realizing that it needs to abort. In the face of frequent 
table service plans or an old inflight plan, this will cause delays in adding 
recent upstream records to the dataset as well as unnecessarily take away 
resources (such as Spark executors in the case of the Spark engine) from other 
applications in the data lake. A cancellable table service plan should avoid 
this situation by preventing itself from being committed if a conflicting 
ingestion job has been committed already, and cancel itself. In conjunction, any 
ingestion writer or non-cancellable table service writer should be able to 
infer that a conflicting inflight table service plan is cancellable, and 
therefore can be ignored when attempting to commit the instant. 
+
+### (B) A cancellable table service plan should be eligible for cancellation 
at any point before committing
+A writer should be able to explicitly cancel a cancellable table service plan 
that an ongoing concurrent writer is executing, as long as it has not been 
committed yet. This requirement is needed due to the presence of concurrent and 
async writers for table service execution, as another writer should not need to 
wait for a table service writer to execute further or fail before confirming 
that its cancellation request will be honored. As will be shown later, this does 
not require the writer requesting the cancellation to have the ability to 
terminate/fail the writer of the target cancellable table service plan.
+
+### (C) An incomplete cancellable plan should eventually have its partial 
writes cleaned up
+Although cancellation (be it via an explicit request or due to a write 
conflict) can ensure that a table service write is never committed, there still 
needs to be a mechanism to have its data and instant files cleaned up 
permanently. At a minimum the table service writer itself should be able to do 
this cleanup, but this is not sufficient as orchestration/transient 
failures/resource allocation can prevent table service writers from 
re-attempting their write. Clean can be used to guarantee that an incomplete 
cancellable plan is eventually cleaned up, since datasets that undergo 
clustering are anyway expected to undergo regular clean operations. Because an 
inflight plan remaining on the timeline can degrade performance of reads/writes 
(as mentioned earlier), a cancellable table service plan should be eligible to 
be targeted for cleanup if HUDI clean deems that it has remained inflight for 
too long (or some other criteria).
+Note that a failed table service should still be able to be safely cleaned up 
immediately - the goal here is just to make sure an inflight plan won't stay 
on the timeline for an unbounded amount of time but also won't be likely to be 
prematurely cleaned up by clean before it has a chance to be executed.
+
+## Design
+### Enabling a plan to be cancellable
+To satisfy goal (A), a new config flag "cancellable" can be added to a table 
service plan. A writer that intends to schedule a cancellable table service 
plan can enable the flag in the serialized plan metadata. Any writer executing 
the plan can infer that the plan is cancellable, and when trying to commit the 
instant should abort if it detects that any ingestion write or table service 
plan (without cancellable config flag) is targeting the same file groups. As a 
future optimization, the cancellable table writer can use early conflict 
detection (instead of waiting until committing the instant) to repeatedly poll 
for any conflicting write appearing on the timeline, and abort earlier if needed.

Review Comment:
   with (a), the ingestion writer cannot deterministically say whether a concurrent 
cancellable clustering instant is about to complete or will abort for sure. But 
w/ (b), we can be sure w/ the cancel request w/o any non-determinism. 
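
   To make the (a) vs (b) distinction concrete, here is a rough sketch of the two checks as described in the RFC text above. This is not Hudi code: `Instant`, `cancellable`, `cancel_requested`, `blocks_ingestion` and `may_commit_cancellable` are all hypothetical names used only for illustration, and Python is used just to keep the model short.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"
    ABORTED = "aborted"


@dataclass
class Instant:
    time: str
    action: str                      # e.g. "commit" (ingestion) or "cluster"
    state: State
    file_groups: frozenset
    cancellable: bool = False        # hypothetical flag stored in the plan
    cancel_requested: bool = False   # hypothetical marker for goal (B)


def overlaps(a, b):
    """Two instants conflict only if they target a common file group."""
    return bool(a.file_groups & b.file_groups)


def blocks_ingestion(ingestion, other):
    """Goal (A), ingestion side: should `other` make the ingestion abort?"""
    if other is ingestion or not overlaps(ingestion, other):
        return False
    if other.state is State.ABORTED:
        return False
    if other.state is State.COMPLETED:
        return True                  # a completed instant always conflicts
    return not other.cancellable     # pending cancellable plans are ignored


def may_commit_cancellable(plan, timeline):
    """Commit-time check for the writer of a cancellable plan."""
    if plan.cancel_requested:
        return False                 # goal (B): the request is always honored
    for other in timeline:
        if other is plan or not overlaps(plan, other):
            continue
        if other.state is State.ABORTED or other.cancellable:
            continue
        return False                 # goal (A): yield to ingestion or a
                                     # non-cancellable table service plan
    return True
```

   In this model, `blocks_ingestion` returns `False` for an inflight cancellable clustering instant, but flips to `True` if that instant completes before the ingestion writer runs its final check, which is the non-determinism with (a). Once `cancel_requested` is set, `may_commit_cancellable` can never return `True`, so the requester knows the outcome up front.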



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
