manisin opened a new pull request, #3417: URL: https://github.com/apache/polaris/pull/3417
This PR adds a new blog post explaining how to use Apache Polaris's External Catalog feature to integrate legacy and heterogeneous data lakes without data migration.

Key topics covered:
- Challenges of existing data lake systems (format fragmentation, security burden)
- Push-based, stateless External Catalog architecture
- Best practices for implementing sync agents
- Notification API integration for metadata updates

Includes an architecture diagram showing data flow from producers through sync agents to Apache Polaris and downstream to data consumers.

## Files Added
- `site/content/blog/2026/01/12/external-catalog-legacy-datalakes.md` - Blog post content
- `site/static/img/blog/2026/01/12/external-catalog-architecture.png` - Architecture diagram

## Checklist
- [ ] Don't disclose security issues! (contact [email protected])
- [ ] Clearly explained why the changes are needed, or linked related issues: Fixes #
- [ ] Added/updated tests with good coverage, or manually tested (and explained how)
- [ ] Added comments for complex logic
- [ ] Updated `CHANGELOG.md` (if needed)
- [ ] Updated documentation in `site/content/in-dev/unreleased` (if needed)
