nsivabalan commented on issue #17879:
URL: https://github.com/apache/hudi/issues/17879#issuecomment-3766412418

   I will focus on the problem statement before diving into the solution. Let 
me know if my understanding is right. 
   
   1. You are looking for support to roll back and nuke an existing clustering 
plan (already scheduled). 
   This should not be a big ask, assuming you can enable the config for just 
one dedicated table service writer: whenever it detects a pending clustering 
plan in the timeline, it could roll back and nuke the plan. But there is a 
chance that a concurrent ingestion writer could then hit a file-not-found 
issue, which needs to be tackled. That's why we proposed 
https://github.com/apache/hudi/pull/12856 
   
   Are you asking for the proposed RFC to be implemented? Can you help 
clarify, please? 
   
   2. Based on your requirements, the ask is not as simple as supporting (1). 
You could have multiple table service writers, where table service writer 1 
and table service writer 2 could contend to perform clustering for the same 
table, depending on how the table services are orchestrated. Say we have a 
pending clustering plan in the timeline: unless we have a heartbeat, how would 
the other table service writer know whether a given clustering instant is 
being worked on or not? Does that mean you have already incorporated 
heartbeats for table services? 
   2.b. If my hunch is right, heartbeats are enabled for both scheduling and 
execution of clustering. But typically scheduling and execution can be 
decoupled, so I am not sure how we would enable heartbeats for scheduling in 
such cases. Also, if the writer shuts down after scheduling, the heartbeat 
could be seen as expired, right? But we would not want to roll back and nuke 
the clustering plan in this case. 
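
   To make the concern in 2.b concrete, here is a minimal sketch. It is not 
Hudi code: the class, its fields, the function name, and the timestamp and 
timeout values are all hypothetical and exist only for illustration. It shows 
that if a writer looks at heartbeat age alone, a plan that was only scheduled 
(and whose scheduler has since exited) looks the same as a plan whose 
execution failed midway.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PendingClusteringInstant:
    """Hypothetical view of a pending clustering instant on the timeline."""
    instant_time: str
    last_heartbeat_ms: Optional[int]  # None if no heartbeat was ever written
    execution_started: bool           # ground truth, NOT visible to other writers


def heartbeat_state(instant: PendingClusteringInstant,
                    now_ms: int,
                    timeout_ms: int) -> str:
    """Classify an instant using only what another writer can observe."""
    if instant.last_heartbeat_ms is None:
        return "none"
    age_ms = now_ms - instant.last_heartbeat_ms
    return "active" if age_ms <= timeout_ms else "expired"


# Assumed values, chosen only to make the example concrete.
NOW_MS = 600_000
TIMEOUT_MS = 120_000

# Scheduled by a writer that then shut down; execution is meant to happen later.
scheduled_only = PendingClusteringInstant("001", 100_000, execution_started=False)
# Execution began and the writer died partway through.
failed_execution = PendingClusteringInstant("002", 100_000, execution_started=True)
# Execution is in progress on another writer.
running_execution = PendingClusteringInstant("003", 590_000, execution_started=True)

states = {
    p.instant_time: heartbeat_state(p, NOW_MS, TIMEOUT_MS)
    for p in (scheduled_only, failed_execution, running_execution)
}
```

   In this sketch, instants "001" and "002" both come out as "expired", even 
though only "002" would be a reasonable candidate to roll back and nuke. That 
is the ambiguity I would like the requirements to address.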
   
   Without going into the solution, can you help clarify the problem statement 
and requirements? 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

