PavithranRick opened a new pull request, #14303:
URL: https://github.com/apache/hudi/pull/14303

   ### Describe the issue this Pull Request addresses
   
   This PR introduces the `ShowMetadataTableHistoryProcedure`, a new Spark SQL procedure that displays timeline information for both the data table and metadata table side-by-side, enabling analysis of metadata table synchronization and evolution.
   
   ### Summary and Changelog
   
   - Unified Timeline View: Shows the data table and metadata table timelines in a single result  
   - Side-by-Side Format: Metadata table columns first, then data table columns  
   - Time Formatting: Requested/inflight/completed times are rendered as `MM-dd HH:mm:ss`  
   - Archive Support: Includes the archived timeline via the `showArchived` parameter  
   - Filtering: SQL-based filtering and time range support (`startTime` / `endTime`)  
   - Error Handling: Graceful handling when the metadata table doesn't exist  
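
To illustrate the `MM-dd HH:mm:ss` rendering mentioned above, here is a minimal Python sketch. It is not the procedure's implementation: `format_instant_time` is a hypothetical helper, and it assumes instant times are strings of the form `yyyyMMddHHmmss`, optionally followed by millisecond digits.

```python
from datetime import datetime


def format_instant_time(instant_time: str) -> str:
    """Render an instant time string as MM-dd HH:mm:ss (hypothetical helper)."""
    # Assumption: the first 14 characters are yyyyMMddHHmmss; any trailing
    # millisecond digits are ignored because the display format stops at seconds.
    if len(instant_time) < 14 or not instant_time[:14].isdigit():
        raise ValueError(f"unexpected instant time: {instant_time!r}")
    parsed = datetime.strptime(instant_time[:14], "%Y%m%d%H%M%S")
    return parsed.strftime("%m-%d %H:%M:%S")


print(format_instant_time("20240115103045123"))  # 01-15 10:30:45
```

Note that this format drops the year, so rows from different years are only distinguishable through the full instant time.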
   
   ### Impact
   
   - New public API:  
     `show_metadata_table_history(table, path, limit, showArchived, filter, startTime, endTime)`  
   - 11-column output schema showing metadata table and data table timeline information
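
As a rough usage sketch, the procedure could be invoked through a `CALL` statement with named arguments, which is the syntax Hudi's Spark SQL procedures generally accept. The helper `build_history_call` below is hypothetical, the table name is a placeholder, and the parameter types (`limit` as an integer, `showArchived` as a boolean) are assumptions rather than something this PR description states.

```python
def build_history_call(**params) -> str:
    """Build a CALL statement with named arguments (hypothetical helper)."""

    def render(value):
        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return str(value)
        # Quote strings; backslash-escape embedded single quotes, which matches
        # Spark SQL's default string literal handling.
        return "'" + str(value).replace("'", "\\'") + "'"

    args = ", ".join(f"{name} => {render(value)}" for name, value in params.items())
    return f"CALL show_metadata_table_history({args})"


stmt = build_history_call(table="hudi_table", limit=20, showArchived=True)
print(stmt)
# CALL show_metadata_table_history(table => 'hudi_table', limit => 20, showArchived => true)

# With a SparkSession that has the Hudi extensions enabled (not shown here),
# the statement would be run as: spark.sql(stmt).show(truncate=False)
```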
   
   ### Risk Level
   
   Low  
   Verification performed:  
   - Test coverage: 3 focused test cases  
   - Schema validation: all output fields are typed and validated  
   - Error handling: graceful handling of invalid filters, missing tables, and timeline access failures  
   - Timeline consistency: both active and archived timelines are handled, with correct state mapping  
   
   ### Documentation Update
   
   Update Hudi Spark SQL procedures documentation to include `show_metadata_table_history` usage examples and parameter descriptions.
   
   ### Contributor's checklist
   
   - [ ] Read through contributor's guide  
   - [ ] Enough context is provided in the sections above  
   - [ ] Adequate tests were added if applicable  
   

