spacemonkd opened a new pull request, #10582: URL: https://github.com/apache/ozone/pull/10582
## What changes were proposed in this pull request?

HDDS-15642. Design document for Delta Sharing support on Ozone

This PR adds a design document for Delta Sharing Protocol support in Apache Ozone. It proposes a new standalone Delta Sharing Gateway service that enables secure, real-time sharing of Ozone-stored datasets via the open Delta Sharing protocol.

#### Core features:

- Implements Delta Sharing REST API for sharing Delta Lake and Parquet tables stored in Ozone
- URL-based access mode: generates pre-signed S3 URLs for file downloads
- Directory-based access mode: issues temporary S3 credentials for direct table access
- Bearer token authentication and per-recipient authorization
- Support for table versioning and change data feed (CDF)

#### Future work:

- Apache Iceberg table support
- Additional formats (ORC, Avro, CSV, JSON) with server-side Parquet conversion option
- Raw file sharing (ML models, logs, documents)
- Native Ozone protocol access (ofs://) as alternative to S3
- Multi-cluster federation and cross-cluster sharing

#### Key design decisions:

- Standalone service (not embedded in S3 Gateway) for independent lifecycle and scaling
- File-based YAML config in Phase 1, migrates to OM metadata in Phase 4
- Delta Kernel library for Delta log parsing
- Service identity + gateway-level audit log for authentication

### Why is this needed?

Organizations store large datasets in Ozone but lack a standardized way to share data with external teams (Pandas, Spark, Databricks, Tableau) without data duplication or granting direct storage credentials. Delta Sharing solves this by providing a simple REST API and pre-signed URL/credential-based access.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15642

## How was this patch tested?

N/A

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
