nsivabalan commented on issue #17879: URL: https://github.com/apache/hudi/issues/17879#issuecomment-3766412418
I will focus on the problem statement before diving into the solution. Let me know if my understanding is right.

1. You are looking for support to roll back and nuke an existing (already scheduled) clustering plan. This should not be a big ask, assuming you can enable the config for just one dedicated table service writer: whenever it detects a pending clustering plan in the timeline, it could roll back and nuke the plan. But chances are that this could result in a file-not-found issue for another concurrent ingestion writer, which needs to be tackled. That's why we proposed https://github.com/apache/hudi/pull/12856. Are you asking for the proposed RFC to be implemented? Can you help clarify, please?

2. Based on your requirements, the ask is not as simple as supporting (1). You could have multiple table service writers, where table service writer 1 and table service writer 2 could contend to perform clustering for the same table, depending on how the table services are orchestrated. Say we have a pending clustering plan in the timeline: unless we have a heartbeat, how would the other table service writer know whether a given clustering instant is being worked on or not? Does that mean you have already incorporated heartbeats for table services?

2.b. If my hunch is right, the heartbeats are enabled for both scheduling and execution of clustering. But typically scheduling and execution can be decoupled, so I am not sure how we would enable heartbeats for scheduling in such cases. Or, after scheduling, if the writer shuts down, the heartbeat could be seen as expired, right? But we would not want to roll back and nuke the clustering plan in this case.

Without going into the solution, can you help clarify the problem statement and requirements?
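To make the concern in 2.b concrete, here is a rough sketch of a heartbeat-expiry check. It is purely illustrative: `PendingClusteringInstant`, `is_heartbeat_expired`, the interval, and the tolerable-miss count are all hypothetical names and values, not Hudi classes or configs. It only shows that a writer that exited cleanly after scheduling and a writer that crashed mid-execution look the same to a check based on heartbeat age alone.

```python
from dataclasses import dataclass


@dataclass
class PendingClusteringInstant:
    """Hypothetical view of a pending clustering instant (not a Hudi class)."""
    instant_time: str
    last_heartbeat_ms: int  # wall-clock time of the last heartbeat written


def is_heartbeat_expired(instant: PendingClusteringInstant,
                         now_ms: int,
                         interval_ms: int = 60_000,
                         tolerable_misses: int = 2) -> bool:
    """Return True when the last heartbeat is older than the allowed window.

    The window (interval * (misses + 1)) is an assumed policy for this sketch.
    """
    allowed_gap_ms = interval_ms * (tolerable_misses + 1)
    return now_ms - instant.last_heartbeat_ms > allowed_gap_ms


now_ms = 1_000_000

# Writer scheduled the plan, then shut down on purpose (execution is decoupled).
scheduled_then_exited = PendingClusteringInstant("001", now_ms - 600_000)
# Writer crashed in the middle of executing the plan.
crashed_mid_execution = PendingClusteringInstant("002", now_ms - 600_000)
# Writer is still actively executing the plan.
still_running = PendingClusteringInstant("003", now_ms - 30_000)

for instant in (scheduled_then_exited, crashed_mid_execution, still_running):
    print(instant.instant_time, is_heartbeat_expired(instant, now_ms))
```

Both "001" and "002" are reported as expired, so heartbeat age by itself cannot tell us whether rolling back the plan is safe. Some extra signal would be needed to separate the two cases, and that is the part of the requirements I would like to understand.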
