Re: [PR] [HUDI-7975] Provide an API to create empty commit [hudi]

via GitHub Wed, 10 Jul 2024 17:37:47 -0700


suryaprasanna commented on PR #11606:
URL: https://github.com/apache/hudi/pull/11606#issuecomment-2221771756


   > should we think through a way to improve the incremental query performance 
on the timeline instead of these tricky changes?
   The reason for creating an empty commit is to trigger table service 
operations like rollback, clean and archival. We have noticed that users do 
not call any Hudi APIs when there is no data to ingest, but our internal 
table services cannot handle all the cases. Another reason is that we also 
need this when data is ingested from multiple sources and each writer tracks 
its own checkpoint in the commit. In that setup, if one writer writes 
frequently and the other does not, the checkpoints stored by the second 
writer can be archived. So, we need a better way to store that checkpoint 
information as well.
   This change helps us both trigger table services and copy checkpoints.
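
   To illustrate the checkpoint case, here is a toy Python sketch. It is not 
Hudi code and does not use any Hudi API: `ToyTimeline`, `max_active`, the 
writer names and the offsets are all hypothetical, and archival is reduced to 
"keep only the newest N commits". It only shows why a quiet writer's 
checkpoint can fall off the active timeline, and how an empty commit that 
copies the checkpoint forward keeps it reachable.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Commit:
    instant: int
    writer: str
    checkpoint: Optional[str]
    records: int  # 0 means an empty commit


@dataclass
class ToyTimeline:
    """Hypothetical stand-in for an active timeline with simple archival."""
    max_active: int = 3
    active: List[Commit] = field(default_factory=list)
    archived: List[Commit] = field(default_factory=list)

    def commit(self, writer: str, checkpoint: Optional[str], records: int = 0) -> Commit:
        instant = len(self.active) + len(self.archived) + 1
        c = Commit(instant, writer, checkpoint, records)
        self.active.append(c)
        # Assumed archival rule: move the oldest commits out of the active list.
        while len(self.active) > self.max_active:
            self.archived.append(self.active.pop(0))
        return c

    def latest_checkpoint(self, writer: str) -> Optional[str]:
        # Only the active timeline is searched, newest commit first.
        for c in reversed(self.active):
            if c.writer == writer:
                return c.checkpoint
        return None


def lost_without_empty_commit() -> Optional[str]:
    t = ToyTimeline(max_active=3)
    t.commit("slow", "offset-10", records=5)
    for i in range(3):
        t.commit("fast", f"offset-{i}", records=100)
    # The slow writer's only commit has been archived by now.
    return t.latest_checkpoint("slow")


def kept_with_empty_commit() -> Optional[str]:
    t = ToyTimeline(max_active=3)
    t.commit("slow", "offset-10", records=5)
    t.commit("fast", "offset-0", records=100)
    t.commit("fast", "offset-1", records=100)
    # No new data for the slow writer: an empty commit copies its checkpoint.
    t.commit("slow", t.latest_checkpoint("slow"), records=0)
    t.commit("fast", "offset-2", records=100)
    return t.latest_checkpoint("slow")
```

   In this toy model `lost_without_empty_commit()` returns `None`, while 
`kept_with_empty_commit()` still returns `"offset-10"`.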


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
