vamshikrishnakyatham opened a new pull request, #14037: URL: https://github.com/apache/hudi/pull/14037
### Describe the issue this Pull Request addresses

Incremental queries on Hudi tables with version >= 6 filter on commit completion time, but that time is not present in the query output. Only requested times are shown, so filters built from the visible timestamps give inconsistent and incorrect results. Users cannot supply proper filter times because the completion times actually used for filtering are not visible in the results.

Fixes: https://github.com/apache/hudi/issues/14036

### Summary and Changelog

Summary:

1. Users can now filter based on actual commit completion times instead of inconsistent requested times.
2. The new `_hoodie_commit_completion_time` virtual column provides visibility into the actual times used for filtering.
3. This eliminates inconsistencies between the internal filtering logic and the user-visible timestamps.

Changelog:

1. Added the `_hoodie_commit_completion_time` virtual column for Hudi tables with version >= 6.
2. Extended the V2 factory classes (`HoodieCopyOnWriteIncrementalHadoopFsRelationFactoryV2`, `HoodieMergeOnReadIncrementalHadoopFsRelationFactoryV2`) to include the completion time field in the schema.
3. Supported for file group reader optimized queries (both MoR and CoW) and base CoW.

### Impact

1. The new virtual column `_hoodie_commit_completion_time` is available in incremental query results for tables with version >= 6.
2. No breaking changes: the existing `_hoodie_commit_time` column remains unchanged.
3. Backward compatible: only V2 incremental relations are affected.

### Risk Level

low

### Documentation Update

Doc and API update for the new `_hoodie_commit_completion_time` virtual column.

### Contributor's checklist

- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable

--
This is an automated message from the Apache Git Service.
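The mismatch described in this pull request can be illustrated with a toy model. The sketch below is plain Python, not Hudi code: the `Commit` class, the three-digit timestamps, and the closed range check are all assumptions made for illustration, and it assumes the visible `_hoodie_commit_time` carries the requested time, as the description implies. It shows two overlapping writers where the commit requested first completes last, so a range that is evaluated against completion times selects a different set of commits than the same range read against the visible requested times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a commit on the timeline (not a Hudi class)."""
    name: str
    requested_time: str   # assumed to be what the visible _hoodie_commit_time shows
    completion_time: str  # what _hoodie_commit_completion_time would expose


def in_range(value: str, start: str, end: str) -> bool:
    # Closed range chosen for illustration only; the real bounds may differ.
    return start <= value <= end


def filter_by_completion(commits, start, end):
    """Names of commits whose completion time falls in the range."""
    return [c.name for c in commits if in_range(c.completion_time, start, end)]


def filter_by_requested(commits, start, end):
    """Names of commits whose requested time falls in the range."""
    return [c.name for c in commits if in_range(c.requested_time, start, end)]


# Two overlapping writers: A is requested first but completes last.
COMMITS = [
    Commit("A", requested_time="001", completion_time="005"),
    Commit("B", requested_time="002", completion_time="003"),
]

# The same range gives different answers depending on which time is compared.
by_completion = filter_by_completion(COMMITS, "004", "006")
by_requested = filter_by_requested(COMMITS, "004", "006")
print("by completion time:", by_completion)
print("by requested time: ", by_requested)
```

In this model a user who sees only requested times has no way to predict that the range `004` to `006` returns commit A, which is the gap that exposing the completion time as a column is meant to close.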
