LinMingQiang commented on PR #7469:
URL: https://github.com/apache/hudi/pull/7469#issuecomment-1690915756

   > If I am not wrong, this is the core problem we are trying to solve: if 
there are failed commits and two concurrent writers try to roll them back at 
the same time, we don't have a lock as such.
   > 
   > These complications arise just because Hudi tries to clean up failed 
writes automatically. In other similar systems, you may have to trigger 
explicit commands to clean up partially failed commits, or coordinate the 
cleanup when multiple writers are involved.
   > 
   > Wanted to call it out. Anyway, coming back to the original issue: it's 
recommended to disable table services (like cleaner, archival) in all writers 
except one, so we won't end up in such conflicts. These services are not 
latency sensitive anyway. And with this approach all other writers will be even 
faster, since they don't trigger any of these table services and only take care 
of ingestion.
   > 
   > We do have a table level config to disable all table services 
https://hudi.apache.org/docs/configurations/#hoodietableservicesenabled
   > 
   > Having said all this, here is how I feel we could fix this issue.
   > 
   > We can leverage the heartbeats, such that rollback commits also start to 
emit heartbeats. So, a concurrent writer knows whether some other writer is 
concurrently executing the rollback, or whether it's in a failed state. That 
way, only one writer will go ahead and execute the rollback while the others 
step away.
   > 
   > I remember @suryaprasanna wanted to fix this if I am not wrong.
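
To make the heartbeat idea quoted above concrete, here is a minimal in-memory sketch. It is not Hudi code: `HeartbeatStore`, `try_claim`, `beat`, `maybe_rollback`, the writer names, and the instant string are all hypothetical, and the claim step is only safe here because everything runs in one process. A real implementation would need an atomic create or a lock around the claim.

```python
import time
from typing import Callable, Dict, Tuple


class HeartbeatStore:
    """Hypothetical in-memory stand-in for per-instant rollback heartbeats."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        # instant -> (owner, time of last heartbeat)
        self._beats: Dict[str, Tuple[str, float]] = {}

    def is_alive(self, instant: str) -> bool:
        """True if someone emitted a heartbeat for `instant` within the timeout."""
        entry = self._beats.get(instant)
        return entry is not None and (self.clock() - entry[1]) < self.timeout_s

    def try_claim(self, instant: str, owner: str) -> bool:
        """Claim the rollback of `instant` unless another live writer holds it."""
        entry = self._beats.get(instant)
        if entry is not None and entry[0] != owner and self.is_alive(instant):
            return False  # another writer is actively rolling back: step away
        self._beats[instant] = (owner, self.clock())
        return True

    def beat(self, instant: str, owner: str) -> None:
        """Refresh the heartbeat while the rollback is still running."""
        entry = self._beats.get(instant)
        if entry is None or entry[0] != owner:
            raise RuntimeError(f"{owner} does not own the rollback of {instant}")
        self._beats[instant] = (owner, self.clock())


def maybe_rollback(store: HeartbeatStore, instant: str, writer: str,
                   do_rollback: Callable[[str], None]) -> bool:
    """Run `do_rollback` only if this writer wins the claim; report whether it ran."""
    if not store.try_claim(instant, writer):
        return False
    do_rollback(instant)
    return True
```

With this shape, a second writer that sees a fresh heartbeat skips the rollback, and only takes it over once the heartbeat has expired, which is the "failed state" case from the comment.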
   
   

