ivandika3 commented on code in PR #8184: URL: https://github.com/apache/ozone/pull/8184#discussion_r2146647572
########## hadoop-hdds/docs/content/concept/OzoneS3Gateway.md: ########## @@ -0,0 +1,108 @@ +--- +title: "Ozone S3 Gateway" +date: "2025-03-28" +weight: 8 +menu: + main: + parent: Architecture +summary: Apache Ozone’s S3 Gateway (often called S3G) is a component that provides an Amazon S3-compatible REST interface for Ozone’s object store. +--- +<!--- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> + +Apache Ozone’s S3 Gateway (often called **S3G**) is a component that provides an Amazon S3-compatible REST interface for Ozone’s object store. In essence, it allows any S3 client or tool to read and write data in Ozone as if it were talking to AWS S3. The S3 Gateway runs as a **separate, stateless service** on top of Ozone, translating HTTP S3 API calls into Ozone’s native storage operations. + +This design lets users leverage the rich ecosystem of S3-compatible applications with Ozone, without modifying those applications. + +## High-Level Overview of the S3 Gateway + +Ozone’s S3 Gateway acts as a **REST front-end** that speaks the S3 protocol and bridges it to the Ozone backend. It is **stateless**, meaning it does not persist data or metadata locally between requests – all state is stored in the Ozone cluster itself. + +Being stateless allows multiple S3 Gateway instances to run in parallel behind a load balancer for scaling and high availability. The gateway supports a broad set of S3 operations – create/delete buckets, PUT/GET objects, list objects, perform multipart uploads – using standard S3 SDKs or CLI tools. + +Under the hood, each operation is translated into the appropriate Ozone calls: +- A PUT object request becomes a `putKey` write operation. +- A GET object becomes a `getKey` read operation. + +### Mapping S3 to Ozone Concepts + +In Ozone, data is organized into **volumes**, **buckets**, and **keys**. By default, the S3 Gateway uses a special +volume named `/s3v` to store all S3 buckets (this is configurable via the `ozone.s3g.volume.name` property). + +Each S3 bucket created is represented as a bucket under the `/s3v` volume in Ozone. Objects (keys) are stored within those buckets. This mapping is transparent to the end user. + +## S3 Gateway in the Ozone Architecture + +The S3 Gateway does the following: +- It receives S3 REST API calls from clients. +- It translates them and forwards metadata operations to OM. +- It streams data directly to and from DataNodes. + +This stateless gateway can be scaled horizontally. + +## Internal Components and Design + +### HTTP Server and REST API Layer +Runs an embedded HTTP server exposing REST endpoints like: +- `PUT /<bucket>/<key>` +- `GET /<bucket>/<key>` + +Handles authentication headers, content length, and routing. + +### Request Handlers / Protocol Translation +Each S3 operation is mapped to Ozone APIs: +- Create Bucket → create in `/s3v` volume +- List Buckets → query all user-owned buckets under `/s3v` +- Put/Get Object → `putKey` / `getKey` + +It also assembles proper S3-compliant response formats. + +### Ozone Client Integration +Uses Ozone client libraries to: +- Call OM for metadata operations +- Stream data to/from DataNodes + +Bulk data does not flow through OM – it goes directly between gateway and DataNodes. + +### Statelessness and Caching +The gateway: +- Does not persist state +- Uses lightweight in-memory caches +- Depends on OM for coordination and user/token validation + +This simplifies scaling and failure recovery. + +### Authentication and Security +Supports AWS Signature-style authentication (if enabled): Review Comment: Note that we only support AWS Signature V4 not the old AWS Signature V2. ########## hadoop-hdds/docs/content/concept/OzoneS3Gateway.md: ########## @@ -0,0 +1,108 @@ +--- +title: "Ozone S3 Gateway" +date: "2025-03-28" +weight: 8 +menu: + main: + parent: Architecture +summary: Apache Ozone’s S3 Gateway (often called S3G) is a component that provides an Amazon S3-compatible REST interface for Ozone’s object store. +--- +<!--- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> + +Apache Ozone’s S3 Gateway (often called **S3G**) is a component that provides an Amazon S3-compatible REST interface for Ozone’s object store. In essence, it allows any S3 client or tool to read and write data in Ozone as if it were talking to AWS S3. The S3 Gateway runs as a **separate, stateless service** on top of Ozone, translating HTTP S3 API calls into Ozone’s native storage operations. + +This design lets users leverage the rich ecosystem of S3-compatible applications with Ozone, without modifying those applications. + +## High-Level Overview of the S3 Gateway + +Ozone’s S3 Gateway acts as a **REST front-end** that speaks the S3 protocol and bridges it to the Ozone backend. It is **stateless**, meaning it does not persist data or metadata locally between requests – all state is stored in the Ozone cluster itself. + +Being stateless allows multiple S3 Gateway instances to run in parallel behind a load balancer for scaling and high availability. The gateway supports a broad set of S3 operations – create/delete buckets, PUT/GET objects, list objects, perform multipart uploads – using standard S3 SDKs or CLI tools. Review Comment: The idea of stateless also allows S3G to be deployed in Kubernetes quite easily. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org For additional commands, e-mail: issues-h...@ozone.apache.org