ssulav commented on code in PR #300:
URL: https://github.com/apache/ozone-site/pull/300#discussion_r2747727592


##########
docs/04-user-guide/01-client-interfaces/04-s3a.md:
##########
@@ -4,4 +4,186 @@ sidebar_label: s3a
 
 # s3a and Ozone
 
-**TODO:** File a subtask under 
[HDDS-9858](https://issues.apache.org/jira/browse/HDDS-9858) and complete this 
page or section.
+Ozone exposes an **S3-compatible REST interface** via the S3 Gateway. Hadoop's **S3A** filesystem (`s3a://`) is a cloud connector that implements the Hadoop file system interface on top of the AWS S3 API. Data analytics tools such as Hive, Impala, and Spark can therefore access Ozone's S3 interface through the Hadoop S3A connector, so you can use Ozone buckets from existing Hadoop ecosystem tools without application changes.
+
+This page explains how to configure the Hadoop S3A client to use Ozone's S3 Gateway (s3g) and provides sample commands for accessing the gateway through `s3a://` URIs. For details about the Ozone S3 Gateway itself (supported REST APIs, URL schemes, security), see the [S3 Protocol](./03-s3/01-s3-api.md) page. For more information about S3A, see the [official Hadoop S3A documentation](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html).
+
+## Prerequisites
+
+- A running Ozone cluster with the **S3 Gateway** enabled. You can start a 
Docker-based cluster (including S3 Gateway) as described in the [S3 
Protocol](./03-s3/01-s3-api.md) documentation.
+- Ozone S3 endpoint (for example `http://localhost:9878` or a load balancer 
DNS name).
+- Hadoop distribution with the **`hadoop-aws`** module available. See the 
official Hadoop S3A documentation:
+  - [Hadoop-AWS: S3A client 
overview](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html)
+  - [Connecting via 
S3A](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/connecting.html)
+
+## Configuring S3A for Ozone
+
+### Enable the S3A client
+
+Ensure the `hadoop-aws` module is on the client classpath. In a typical Hadoop 
installation:
+
+- Set `HADOOP_OPTIONAL_TOOLS` in `hadoop-env.sh` to include `hadoop-aws`, 
**or**
+- Add a dependency on `org.apache.hadoop:hadoop-aws` with the same version as 
`hadoop-common`.
+
+See the [Hadoop S3A Getting 
Started](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Getting_Started)
 section for details.
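+To confirm the module is actually on the client classpath, a quick check such as the following can help (assumes the `hadoop` command is on your `PATH`):
+
+```bash
+# List the classpath entries and look for the hadoop-aws jar.
+hadoop classpath | tr ':' '\n' | grep -i hadoop-aws
+```
+
+If nothing is printed, revisit the `HADOOP_OPTIONAL_TOOLS` setting or the dependency declaration above.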
+
+### core-site.xml: point S3A to Ozone
+
+Add the following properties to the Hadoop configuration (for example 
`core-site.xml`) so that `s3a://` URIs use the Ozone S3 Gateway instead of AWS 
S3:
+
+```xml
+<property>
+  <name>fs.s3a.endpoint</name>
+  <value>http://ozone-s3g-host:9878</value>
+  <description>
+    Ozone S3 Gateway endpoint. Replace with your s3g hostname or load balancer.
+  </description>
+</property>
+
+<property>
+  <name>fs.s3a.endpoint.region</name>
+  <value>us-east-1</value>
+  <description>
+    Logical region name required by the S3A client. Ozone does not enforce
+    regions, but S3A still requires a syntactically valid region name such as us-east-1.
+  </description>
+</property>
+
+<property>
+  <name>fs.s3a.path.style.access</name>
+  <value>true</value>
+  <description>
+    Ozone S3 Gateway defaults to path-style URLs (http://host:9878/bucket),
+    so S3A should use path-style access.
+  </description>
+</property>
+```
+
+These properties follow the official S3A connection settings in [Connecting to 
an S3 
store](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/connecting.html#Connection_Settings).
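+Once these properties are in place, you can sanity-check the connection with a simple listing (this assumes a bucket named `bucket1` already exists in Ozone):
+
+```bash
+# List the contents of an existing Ozone bucket through the S3A connector.
+hadoop fs -ls s3a://bucket1/
+```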
+
+### Recommended settings for Ozone
+
+Ozone S3 Gateway does not implement ETags and object versioning in exactly the same way as AWS S3 (recent Ozone releases add ETag support for S3 multipart uploads, but semantics can still differ by version). To avoid compatibility issues, set these options when using S3A with Ozone:
+
+```xml
+<property>
+  <name>fs.s3a.bucket.probe</name>
+  <value>0</value>
+  <description>
+    Disable the bucket existence probe at startup. This is the default in 
recent Hadoop
+    versions and is recommended for third-party S3-compatible stores such as 
Ozone.
+  </description>
+</property>
+
+<property>
+  <name>fs.s3a.change.detection.mode</name>
+  <value>none</value>
+  <description>Disable change detection; not applicable to Ozone 
S3.</description>
+</property>
+```
+
+### Credentials
+
+Ozone uses the same AWS-style access key and secret key model for the S3 
Gateway.
+
+- If **security is disabled**, any `AWS_ACCESS_KEY_ID` / 
`AWS_SECRET_ACCESS_KEY` pair can be used.
+- If **security is enabled**, obtain a key and secret via `ozone s3 getsecret` 
(Kerberos authentication is required). See the [S3 Protocol — 
Security](./03-s3/01-s3-api.md#security) and [Securing 
S3](./03-s3/02-securing-s3.md) sections for details.
+
+Configure S3A credentials in `core-site.xml`:
+
+```xml
+<property>
+  <name>fs.s3a.access.key</name>
+  <value>your-access-key</value>
+</property>
+
+<property>
+  <name>fs.s3a.secret.key</name>
+  <value>your-secret-key</value>
+</property>
+```
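+If the same Hadoop client also needs to reach AWS S3 or another store, S3A's per-bucket configuration can scope the Ozone endpoint and credentials to specific buckets only. For example (the bucket name `bucket1` is illustrative):
+
+```xml
+<property>
+  <name>fs.s3a.bucket.bucket1.endpoint</name>
+  <value>http://ozone-s3g-host:9878</value>
+</property>
+
+<property>
+  <name>fs.s3a.bucket.bucket1.path.style.access</name>
+  <value>true</value>
+</property>
+```
+
+See the per-bucket configuration section of the [Hadoop S3A documentation](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html) for details.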
+
+Alternatively, use environment variables as documented in [Authenticating via 
AWS environment 
variables](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Authenticating_via_the_AWS_Environment_Variables):
+
+```bash
+export AWS_ACCESS_KEY_ID="your-access-key"
+export AWS_SECRET_ACCESS_KEY="your-secret-key"
+```
+
+:::note
+For generating and revoking Ozone S3 secrets, see the **Security** section of 
the [S3 Protocol](./03-s3/01-s3-api.md#security) page.
+:::
+
+:::caution
+If the Ozone S3 Gateway is exposed over **HTTPS**, the JVM must trust the 
gateway's TLS certificate. The Hadoop AWS client (`hadoop-aws`) uses the 
default Java truststore; if the gateway uses a custom or internal CA, add that 
CA to `JAVA_HOME/lib/security/jssecacerts` or configure the JVM truststore 
accordingly. Otherwise S3A connections to the HTTPS endpoint may fail with 
certificate errors.
+:::
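+For example, a custom CA certificate can be imported into the JVM-wide truststore with `keytool` (the file names and the default `changeit` store password below are illustrative):
+
+```bash
+# Import the gateway's CA certificate into the JVM-wide truststore.
+keytool -importcert -noprompt \
+  -alias ozone-s3g-ca \
+  -file ozone-ca.crt \
+  -keystore "$JAVA_HOME/lib/security/jssecacerts" \
+  -storepass changeit
+```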
+
+## Example: using `hadoop fs` with Ozone via S3A
+
+The examples below assume:
+
+- Ozone S3 Gateway is reachable at `http://localhost:9878`
+- `core-site.xml` is configured as above
+- An S3 bucket (for example `bucket1`) already exists (you can create it with 
`aws s3api --endpoint http://localhost:9878 create-bucket --bucket bucket1`)
+
+S3A URLs use the form `s3a://<bucket>/<path>`. The bucket corresponds to an Ozone bucket under the `/s3v` volume or a bucket link.
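+A few representative commands (the paths are illustrative):
+
+```bash
+# Create a directory, upload a file, and read it back through S3A.
+hadoop fs -mkdir s3a://bucket1/dir1
+hadoop fs -put README.md s3a://bucket1/dir1/
+hadoop fs -ls s3a://bucket1/dir1/
+hadoop fs -cat s3a://bucket1/dir1/README.md
+```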

Review Comment:
   The path is broken with the quoting



##########
docs/04-user-guide/01-client-interfaces/04-s3a.md:
##########
@@ -4,4 +4,186 @@ sidebar_label: s3a
 
 # s3a and Ozone
 
+### Recommended settings for Ozone
+
+Ozone S3 Gateway does not implement ETags and object versioning in exactly the same way as AWS S3 (recent Ozone releases add ETag support for S3 multipart uploads, but semantics can still differ by version). To avoid compatibility issues, set these options when using S3A with Ozone:

Review Comment:
   Etag support is added in S3 MPU. Do we still recommend below configs?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

