SaketaChalamchala commented on code in PR #206:
URL: https://github.com/apache/ozone-site/pull/206#discussion_r2748063591


##########
docs/04-user-guide/03-integrations/09-flink.md:
##########
@@ -0,0 +1,196 @@
+---
+sidebar_label: Flink
+---
+
+# Apache Flink
+
+[Apache Flink](https://flink.apache.org/) is a powerful, open-source distributed processing framework designed for stateful computations over both bounded and unbounded data streams at any scale. It enables high-throughput, low-latency, and fault-tolerant processing while offering elastic scaling capabilities to handle millions of events per second across thousands of cores.
+
+Apache Flink can use Apache Ozone for reading and writing data, and for storing essential operational components like application state checkpoints and savepoints.
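+
+For example, once the S3 filesystem plugin is enabled (see the quickstart below), Flink checkpoints and savepoints can be written to an Ozone bucket through the S3 Gateway. The following is a minimal sketch, assuming a bucket named `bucket1` in the `s3v` volume; the keys would go into the Flink configuration (for example via `FLINK_PROPERTIES`, as shown later in this guide):
+
+```yaml
+# Hypothetical checkpoint/savepoint locations in an Ozone bucket, accessed via s3a
+state.checkpoints.dir: s3a://bucket1/flink/checkpoints
+state.savepoints.dir: s3a://bucket1/flink/savepoints
+execution.checkpointing.interval: 30s
+```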
+
+## Quickstart
+
+This tutorial shows how to connect Apache Flink to Apache Ozone through the S3 Gateway, using Docker Compose.
+
+First, obtain Ozone's sample Docker Compose configuration and save it as `docker-compose.yaml`:
+
+```bash
+curl -O https://raw.githubusercontent.com/apache/ozone-docker/refs/heads/latest/docker-compose.yaml
+```
+
+Refer to the [Docker quick start page](../../02-quick-start/01-installation/01-docker.md) for details.
+
+## Assumptions
+
+- Flink accesses Ozone through the S3 Gateway (S3G) rather than `ofs`.
+- Ozone S3G listens on port 9878.
+- Ozone S3G enables path-style access.
+- Ozone S3G does not enable security, so any S3 access key and secret key is accepted (see the example after this list).
+- The Flink Docker image tag is `flink:scala_2.12-java17`.
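+
+Because security is disabled, any credential pair works. Once the cluster from the steps below is running, you can verify access from the host with the AWS CLI. This is a minimal sketch, assuming the AWS CLI is installed locally and the Ozone compose file publishes the S3 Gateway on host port 9878:
+
+```bash
+# Any access key / secret key pair is accepted while Ozone S3G security is off
+export AWS_ACCESS_KEY_ID=anykey
+export AWS_SECRET_ACCESS_KEY=anysecret
+export AWS_DEFAULT_REGION=us-east-1
+
+# Match the path-style access assumption above
+aws configure set default.s3.addressing_style path
+
+aws s3api list-buckets --endpoint-url http://localhost:9878
+```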
+
+## Step 1 — Create `docker-compose-flink.yml` for Flink
+
+```yaml
+services:
+  jobmanager:
+    image: flink:scala_2.12-java17
+    command: jobmanager
+    ports:
+      - "8081:8081"
+    environment:
+      AWS_ACCESS_KEY_ID: ozone
+      AWS_SECRET_ACCESS_KEY: ozone
+      FLINK_PROPERTIES: |
+        jobmanager.rpc.address: jobmanager
+        fs.s3a.endpoint: http://s3g:9878
+        fs.s3a.path.style.access: true
+        fs.s3a.connection.ssl.enabled: false
+        fs.s3a.access.key: ozone
+        fs.s3a.secret.key: ozone
+
+  taskmanager:
+    image: flink:scala_2.12-java17
+    command: taskmanager
+    depends_on:
+      - jobmanager
+    environment:
+      AWS_ACCESS_KEY_ID: ozone
+      AWS_SECRET_ACCESS_KEY: ozone
+      FLINK_PROPERTIES: |
+        jobmanager.rpc.address: jobmanager
+        taskmanager.numberOfTaskSlots: 4
+        fs.s3a.endpoint: http://s3g:9878
+        fs.s3a.path.style.access: true
+        fs.s3a.connection.ssl.enabled: false
+        fs.s3a.access.key: ozone
+        fs.s3a.secret.key: ozone
+```
+
+## Step 2 — Start Flink and Ozone together
+
+With both `docker-compose.yaml` (for Ozone) and `docker-compose-flink.yml` (for Flink) in the same directory,
+you can start both together on the same network:
+
+```bash
+export COMPOSE_FILE=docker-compose.yaml:docker-compose-flink.yml
+docker compose up -d
+```
+
+Verify containers are running:
+
+```bash
+docker ps
+```
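+
+You can also check that the Flink JobManager REST API responds; this assumes the `8081:8081` port mapping from the Flink compose file above:
+
+```bash
+# Returns a small JSON cluster overview once the JobManager is up
+curl -s http://localhost:8081/overview
+```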
+
+## Step 3 — Create an Ozone bucket
+
+Connect to an Ozone container (for example, `s3g`) to create an OBS (object store) bucket:
+
+```bash
+docker compose exec -it s3g ozone sh bucket create s3v/bucket1 -l obs
+```
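+
+To confirm the bucket exists, you can inspect it with the Ozone shell in the same container:
+
+```bash
+docker compose exec -it s3g ozone sh bucket info s3v/bucket1
+```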
+
+## Step 4 — Copy the Flink S3 filesystem plugin
+
+The official Flink Docker image does not enable the S3 filesystem plugin by default.
+You must copy the plugin JAR into the plugins directory of both the JobManager and the TaskManager.
+
+Copy the plugin into the JobManager:
+
+```bash
+docker compose exec -it jobmanager bash -lc \
+  "mkdir -p /opt/flink/plugins/s3-fs-hadoop && \\
+   cp /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/s3-fs-hadoop/"
+```
+
+Copy the plugin into the TaskManager:
+
+```bash
+docker compose exec -it taskmanager bash -lc \
+  "mkdir -p /opt/flink/plugins/s3-fs-hadoop && \\
+   cp /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/s3-fs-hadoop/"
+```
+
+Verify:
+
+```bash
+docker compose exec -it jobmanager ls /opt/flink/plugins/s3-fs-hadoop
+docker compose exec -it taskmanager ls /opt/flink/plugins/s3-fs-hadoop
+```
+
+## Step 5 — Restart Flink containers (required)
+
+Flink loads plugins only at startup, so both containers must be restarted to pick up the S3 filesystem plugin:
+
+```bash
+docker compose restart jobmanager taskmanager
+```
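+
+Optionally, tail the JobManager log to confirm it restarted cleanly:
+
+```bash
+docker compose logs --tail 20 jobmanager
+```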
+
+## Step 6 — Start Flink SQL client

Review Comment:
   ```suggestion
   ### Step 5 — Start Flink SQL client
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
