lokeshj1703 opened a new pull request, #18076: URL: https://github.com/apache/hudi/pull/18076
### Describe the issue this Pull Request addresses

Issue: https://github.com/apache/hudi/issues/18075

Currently, the cloud incremental source limits only the number of bytes read from the source. When the files are very small, the number of files read under that byte limit can grow drastically, and holding the metadata for all of those files in the driver can lead to an OOM. The issue proposes also limiting the number of rows read by the source, to reduce memory overhead on the driver.

### Summary and Changelog

Adds a new configuration for limiting the number of rows read by the Hoodie incremental source.

### Impact

Helps reduce memory overhead on the driver for the cloud incremental source by limiting the number of files read.

### Risk Level

Low

### Documentation Update

NA

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable
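
### Illustrative sketch

To show why a byte limit alone does not bound driver memory, here is a minimal sketch of selecting a batch under both a byte limit and a row limit. It is not the code in this pull request. `FileEntry`, `select_batch`, `max_bytes`, and `max_rows` are hypothetical names, and the sketch assumes each row read by the source describes exactly one file.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical metadata row: one row per file (an assumption)."""
    path: str
    size_bytes: int


def select_batch(
    entries: Iterable[FileEntry],
    max_bytes: int,
    max_rows: Optional[int] = None,
) -> List[FileEntry]:
    """Take entries in order until either limit would be exceeded.

    The first entry is always accepted, even if it is larger than
    max_bytes, so that a single oversized file cannot stall progress
    (a design choice of this sketch, not a documented behavior).
    """
    batch: List[FileEntry] = []
    total_bytes = 0
    for entry in entries:
        # Row limit: caps how many metadata rows the driver must hold.
        if max_rows is not None and len(batch) >= max_rows:
            break
        # Byte limit: caps the volume of data read from the source.
        if batch and total_bytes + entry.size_bytes > max_bytes:
            break
        batch.append(entry)
        total_bytes += entry.size_bytes
    return batch
```

With many tiny files, the byte limit is reached only after a very large number of entries have been collected, so the row limit is what keeps the batch small in that case.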
