PavithranRick opened a new pull request, #14303:
URL: https://github.com/apache/hudi/pull/14303
### Describe the issue this Pull Request addresses
This PR introduces the `ShowMetadataTableHistoryProcedure`, a new Spark SQL
procedure that displays timeline information for both the data table and
metadata table side-by-side, enabling analysis of metadata table
synchronization and evolution.
### Summary and Changelog
- Unified Timeline View: Shows data table and metadata table timelines in a
single result
- Side-by-Side Format: Metadata table columns first, then data table columns
- Time Formatting: Proper `MM-dd HH:mm:ss` formatting for
requested/inflight/completed times
- Archive Support: Includes archived timeline via `showArchived` parameter
- Filtering: SQL-based filtering and time range support (`startTime` /
`endTime`)
- Error Handling: Graceful handling when metadata table doesn't exist
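
As a rough illustration of the `MM-dd HH:mm:ss` rendering listed above, here is a minimal Python sketch. It assumes the raw instant time is a digit string in the `yyyyMMddHHmmss` or `yyyyMMddHHmmssSSS` layout; the PR text does not state the input format. The helper name `format_instant` is hypothetical and is not taken from the PR, whose actual implementation may differ.

```python
from datetime import datetime


def format_instant(instant: str) -> str:
    """Render an instant-time string as MM-dd HH:mm:ss.

    Assumption: the input starts with 14 digits laid out as yyyyMMddHHmmss,
    optionally followed by 3 millisecond digits.
    """
    # Keep only the first 14 digits; milliseconds, if present, are dropped.
    parsed = datetime.strptime(instant[:14], "%Y%m%d%H%M%S")
    # The year is intentionally omitted, matching the MM-dd HH:mm:ss pattern.
    return parsed.strftime("%m-%d %H:%M:%S")


print(format_instant("20240115103045123"))  # prints 01-15 10:30:45
```

Because the pattern omits the year, instants from different years can render identically, which is worth keeping in mind when reading archived entries.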
### Impact
- New public API:
`show_metadata_table_history(table, path, limit, showArchived, filter,
startTime, endTime)`
- 11-column output schema showing metadata table and data table timeline
information
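
To make the parameter list concrete, the sketch below builds a `CALL` statement using the named-argument syntax (`name => value`) that Hudi's Spark SQL procedures accept. The parameter names come from the signature above; the argument types (integer `limit`, boolean `showArchived`, string values for the rest) are assumptions, and `build_history_call` is a hypothetical helper, not part of the PR.

```python
def build_history_call(table=None, path=None, limit=None, show_archived=None,
                       filter_expr=None, start_time=None, end_time=None):
    """Build a CALL statement for show_metadata_table_history.

    Assumed types: limit is an int, showArchived a bool, all others strings.
    Arguments left as None are omitted from the statement.
    """
    params = [
        ("table", table),
        ("path", path),
        ("limit", limit),
        ("showArchived", show_archived),
        ("filter", filter_expr),
        ("startTime", start_time),
        ("endTime", end_time),
    ]

    def literal(value):
        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return str(value)
        text = str(value)
        # No escaping is attempted in this sketch; reject quotes outright.
        if "'" in text:
            raise ValueError("single quotes are not supported in this sketch")
        return f"'{text}'"

    args = ", ".join(
        f"{name} => {literal(value)}" for name, value in params if value is not None
    )
    return f"CALL show_metadata_table_history({args})"


print(build_history_call(table="hudi_tbl", limit=10, show_archived=True))
```

The resulting string could then be passed to `spark.sql(...)` on a session configured with the Hudi Spark extensions, assuming the procedure is registered under this name.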
### Risk Level
Low
Verification performed:
- Test coverage: 3 focused test cases
- Schema validation: all output fields properly typed and validated
- Error handling: graceful handling of invalid filters, missing tables, and
timeline access failures
- Timeline consistency: proper handling of both active and archived
timelines with correct state mapping
### Documentation Update
Update Hudi Spark SQL procedures documentation to include
`show_metadata_table_history` usage examples and parameter descriptions.
### Contributor's checklist
- [ ] Read through contributor's guide
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable