nsivabalan commented on issue #17908:
URL: https://github.com/apache/hudi/issues/17908#issuecomment-3923436575

   Thanks for the feature request.
   
   This seems like a valid one for high-scale users like Uber.
   
   Can you clarify this statement, though?
   ```
   In addition, regardless of the above async execution, we should ensure that 
it is safe for a concurrent writer to directly call performTableServices
   ...
   
   ```
   
   Not sure I get this.
   
   
   Back to the requirements:
   Essentially, you are looking for ways to offload the execution to an async 
job so that we can unblock the ingestion writer ASAP. The compaction execution 
then does not hold the table lock for an extended period of time, only while 
wrapping up the compaction commit. Hence, this improves the overall throughput 
of the system.
   
   - We have some intricacies and interdependencies between data table 
archival, metadata table compaction, metadata table archival, and rollbacks. 
Have we accounted for those scenarios with your proposed solution?
   - What happens if the async execution does not happen at all? Do you have 
any guard rails to fail ingestion if we reach some maximum number of log files 
in the metadata table (MDT)?
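   
   To make the guard-rail question concrete, here is a rough sketch of the 
kind of check I have in mind. The class name `MdtLogFileGuard`, the threshold, 
and the way the current log file count is obtained are all assumptions for 
illustration; none of them are existing Hudi APIs or configs.
   
   ```java
   // Hypothetical guard rail, not part of Hudi: fails ingestion once the
   // metadata table has accumulated more log files than a configured limit.
   final class MdtLogFileGuard {
       private final int maxLogFiles;

       MdtLogFileGuard(int maxLogFiles) {
           if (maxLogFiles <= 0) {
               throw new IllegalArgumentException("maxLogFiles must be positive");
           }
           this.maxLogFiles = maxLogFiles;
       }

       // Returns true while the log file count is within the limit.
       boolean allows(int currentLogFiles) {
           return currentLogFiles <= maxLogFiles;
       }

       // Throws if the limit is exceeded, so the ingestion writer fails fast
       // instead of silently piling up uncompacted log files.
       void check(int currentLogFiles) {
           if (!allows(currentLogFiles)) {
               throw new IllegalStateException(
                   "Metadata table has " + currentLogFiles
                       + " log files, above the limit of " + maxLogFiles
                       + "; async compaction may not be running");
           }
       }
   }
   ```
   
   The ingestion writer would call `check(...)` before each commit, with the 
count supplied by whatever mechanism the proposal uses to inspect the MDT file 
slices.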
   
   
   

