lokesh-lingarajan-0310 opened a new pull request, #9336:
URL: https://github.com/apache/hudi/pull/9336

   ### Change Logs
   
   - Change the S3 and GCS incremental jobs to batch within a commit based on
the source limit
   - Refactor the S3 incremental job to make it easier to test
   - Add test cases for both the S3 and GCS incremental jobs
   - Change the checkpoint format to "commitTime#Key"; sorting by these two
columns helps resume ingestion
   - Add a few timeline APIs to support fetching commit data from the current
commit
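
   The batching and checkpoint bullets above can be illustrated with a small
sketch. This is not the code in this PR: `ObjectEvent`, `parse_checkpoint`, and
`next_batch` are hypothetical names, the source limit is assumed to be a byte
budget, and the sketch assumes a batch always takes at least one object so that
ingestion keeps making progress.

```python
from typing import List, NamedTuple, Optional, Tuple


class ObjectEvent(NamedTuple):
    """Hypothetical record for one cloud object listed in a commit."""
    commit_time: str  # commit instant that recorded the object
    key: str          # object key within that commit
    size: int         # object size in bytes


def parse_checkpoint(checkpoint: Optional[str]) -> Tuple[str, str]:
    """Split "commitTime#Key" into (commit_time, key).

    An empty or missing checkpoint maps to ("", ""), which sorts before
    every real (commit_time, key) pair.
    """
    if not checkpoint:
        return ("", "")
    # Split on the first '#', so '#' characters inside the key are preserved.
    commit_time, _, key = checkpoint.partition("#")
    return (commit_time, key)


def next_batch(
    events: List[ObjectEvent],
    checkpoint: Optional[str],
    source_limit: int,
) -> Tuple[List[ObjectEvent], Optional[str]]:
    """Return the next batch after `checkpoint` and the new checkpoint."""
    start = parse_checkpoint(checkpoint)
    # Sorting by (commit_time, key) gives a stable order to resume from.
    pending = sorted(e for e in events if (e.commit_time, e.key) > start)

    batch: List[ObjectEvent] = []
    total = 0
    for event in pending:
        # Stop once the budget would be exceeded, but never return an
        # empty batch while work remains (assumption of this sketch).
        if batch and total + event.size > source_limit:
            break
        batch.append(event)
        total += event.size

    if not batch:
        return [], checkpoint  # nothing new; keep the old checkpoint
    last = batch[-1]
    return batch, f"{last.commit_time}#{last.key}"
```

   Calling `next_batch` repeatedly, each time passing the checkpoint returned by
the previous call, walks every object exactly once in `(commit_time, key)`
order, including when one commit is split across several batches.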
   
   ### Impact
   
   No external-facing impact; we are only changing the way we manage checkpoints.
   
   ### Risk level (write none, low medium or high below)
   
   medium: because we are introducing batching within commits, there is a risk
of data loss if we do not advance to the next batch accurately
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable
   - [x] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
