This is an automated email from the ASF dual-hosted git repository.
jojochuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone-site.git
The following commit(s) were added to refs/heads/master by this push:
new 7411e085e HDDS-15101. Document read operation implementation guide. (#407)
7411e085e is described below
commit 7411e085ef1ece8752d1babe7f6b2332c41eb017
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Wed May 13 08:23:26 2026 -0700
HDDS-15101. Document read operation implementation guide. (#407)
Generated-by: Cursor <[email protected]>
---
.../02-data-operations/02-read.md | 139 +++++++++++++++++----
.../02-data-operations/read_implementation.png | Bin 0 -> 7141601 bytes
2 files changed, 115 insertions(+), 24 deletions(-)
diff --git a/docs/07-system-internals/02-data-operations/02-read.md b/docs/07-system-internals/02-data-operations/02-read.md
index 267b86226..710bb35e5 100644
--- a/docs/07-system-internals/02-data-operations/02-read.md
+++ b/docs/07-system-internals/02-data-operations/02-read.md
@@ -3,27 +3,118 @@ draft: true
sidebar_label: Read
---
-# Implementation of Read Operations
-
-**TODO:** File a subtask under [HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this page or section.
-
-## Reading Metadata
-
-## Reading Data
-
-Trace every part of a read request from beginning to end. This includes:
-
-- Client getting encryption keys
-- Client calling OM to create key
-- OM validating client's Kerberos principal
-- OM checking permissions (Ranger or Native ACLs)
-- OM generating block tokens from the shared secret previously retrieved from SCM
-- OM getting block locations from SCM or from its cache.
-- OM returning container, blocks, pipeline, block tokens
-- Client sending block tokens and Datanode validating based on the shared secret from SCM
-- Client sending read chunk requests to Datanode to fetch the data.
- - For replication:
- - Include topology choices of which Datanodes to use
- - Include failover handling
- - For EC, link to the [EC feature page](../features/erasure-coding).
-- Client validating checksums
+# Apache Ozone Internals: Read Operation Implementation Guide
+
+This guide provides a comprehensive trace of a read request in Apache Ozone, including metadata resolution, security (Block Tokens), Transparent Data Encryption (TDE), and Authorization (Ranger/Native ACLs).
+
+---
+
+## 1. Phase 1: Request & Authorization (Client & OM)
+
+### 1.1 Initiating the Request
+
+The application calls `OzoneBucket.readKey(key)`. The client sends a `getKeyInfo` RPC to the Ozone Manager (OM).
+
+### 1.2 OM: Authorization Check
+
+Before returning key metadata, the OM must authorize the user.
+
+1. **Entry Point:** `OmMetadataReader.checkAcls()` is called within the `getKeyInfo` flow.
+2. **Authorizer Selection:** Based on configuration (`ozone.acl.authorizer.class`), OM uses either:
+   - **Native Authorizer:** Uses Ozone's internal ACLs stored in RocksDB.
+   - **Apache Ranger Authorizer:** Delegates the decision to the Ranger Ozone Plugin (`RangerOzoneAuthorizer`).
+3. **Authorization Logic:**
+   - OM builds an `OzoneObj` (Volume/Bucket/Key) and a `RequestContext` (User, IP, Action: READ).
+   - **Ranger Flow:** The plugin checks its local cache of policies (periodically synced from the Ranger Admin server). If a policy allows READ for the user/group on that resource, access is granted.
+   - **Fallback:** If Ranger is disabled or the Native authorizer is used, OM checks the object's ACL list for matching user/group permissions.
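
The native ACL fallback described above can be pictured with a minimal sketch. Everything here is an illustrative assumption rather than Ozone's actual code: the Python classes only borrow the names `OzoneObj` and `RequestContext` from the text, and `Acl`, `check_acls`, and the in-memory `acl_store` are hypothetical stand-ins for the ACLs that OM keeps in RocksDB. Real Ozone ACLs carry more detail (scopes, inheritance, additional rights) that is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OzoneObj:
    """Simplified stand-in for the resource being accessed."""
    volume: str
    bucket: str
    key: str


@dataclass(frozen=True)
class RequestContext:
    """Simplified stand-in for who is asking, from where, and for what."""
    user: str
    groups: frozenset
    ip: str
    action: str  # e.g. "READ"


@dataclass(frozen=True)
class Acl:
    """Hypothetical ACL entry: a USER or GROUP name plus the rights granted."""
    kind: str
    name: str
    rights: frozenset


def check_acls(acl_store, obj, ctx):
    """Return True if any ACL on obj grants ctx.action to the user or a group."""
    for acl in acl_store.get(obj, []):
        matches_user = acl.kind == "USER" and acl.name == ctx.user
        matches_group = acl.kind == "GROUP" and acl.name in ctx.groups
        if (matches_user or matches_group) and ctx.action in acl.rights:
            return True
    return False  # default deny when nothing matches


key = OzoneObj("vol1", "bucket1", "logs/app.log")
acl_store = {
    key: [
        Acl("USER", "alice", frozenset({"READ", "WRITE"})),
        Acl("GROUP", "analysts", frozenset({"READ"})),
    ]
}
alice = RequestContext("alice", frozenset(), "10.0.0.5", "READ")
bob = RequestContext("bob", frozenset({"analysts"}), "10.0.0.6", "WRITE")
```

In this sketch `alice` is allowed to read through her user entry, while `bob` is denied a write because his only match, the `analysts` group entry, grants READ alone.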
+
+### 1.3 OM: Key & Encryption Resolution
+
+Once authorized:
+
+1. **Key Lookup:** OM finds the `OmKeyInfo` in the `keyTable`.
+2. **Encryption Check:** If TDE is enabled, the `OmKeyInfo` contains the `EDEK` (Encrypted Data Encryption Key) and the EZ Key Name.
+3. **Block Retrieval:** OM retrieves the `OmKeyLocationInfo` (Block IDs and Pipelines). Container locations are resolved using OM's container location cache; pipeline and placement metadata may be refreshed from SCM when needed. For locality-aware reads, OM can sort Datanodes within each pipeline by network distance using the network topology cached in OM (synchronized from SCM).
+4. **Block Token Generation:** OM generates a signed Block Token for each block using secret keys managed by the SCM.
+
+OM returns `OmKeyInfo` (Metadata + `EDEK` + Block Tokens) to the client.
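
The locality-aware sorting mentioned in the Block Retrieval step can be sketched as follows. This is a simplified illustration, not OM's implementation: it assumes each node has a slash-separated topology path such as `/dc1/rack1/dn1` and measures distance as the number of hops to the closest common ancestor and back, in the style of a Hadoop-like topology tree. The function names and paths are hypothetical.

```python
def network_distance(a: str, b: str) -> int:
    """Hops from a up to the closest common ancestor and back down to b."""
    path_a = a.strip("/").split("/")
    path_b = b.strip("/").split("/")
    common = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


def sort_by_distance(client: str, datanodes: list) -> list:
    """Order the Datanodes of a pipeline so the closest one is tried first."""
    return sorted(datanodes, key=lambda dn: network_distance(client, dn))


pipeline = ["/dc1/rack2/dn2", "/dc1/rack1/dn1", "/dc2/rack9/dn3"]
ordered = sort_by_distance("/dc1/rack1/client", pipeline)
```

With a client on `/dc1/rack1`, the same-rack node sorts first, then the node in another rack of the same data center, then the node in the remote data center.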
+
+---
+
+## 2. Phase 2: Decryption Setup (Client & KMS)
+
+### 2.1 Decrypting the `EDEK`
+
+If the key is encrypted:
+
+1. **KMS Request:** The client sends the `EDEK` to the KMS (Key Management Server).
+2. **KMS Authorization:** The KMS also performs an authorization check (often via Ranger KMS plugin) to ensure the user can use the EZ Key for decryption.
+3. **`DEK` Retrieval:** KMS returns the raw `DEK` (Data Encryption Key) to the client.
+
+### 2.2 Initializing the Crypto Stream
+
+The client wraps the data stream in a `CryptoInputStream` initialized with the raw `DEK` and the IV from the metadata.
+
+---
+
+## 3. Phase 3: Data Retrieval (Client & Datanode)
+
+### 3.1 Fetching Encrypted Chunks
+
+By default, the client issues `ReadChunk` requests. If `ozone.client.stream.readblock.enable` is set to `true` (default is `false`), the client may issue a `ReadBlock` request instead.
+
+- **Security:** The request includes the Block Token.
+- **Datanode Validation:** The Datanode verifies the token's signature using the Secret Keys it fetched from the SCM. This is the final "at-the-edge" authorization check.
+- **Data Transfer:** The Datanode reads the encrypted data from disk and streams it back.
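
The sign-then-verify relationship between OM and the Datanode can be illustrated with a small sketch. It assumes a symmetric secret shared through SCM and uses HMAC-SHA256 over a made-up payload; the token fields, the payload layout, and the function names are all hypothetical, and Ozone's real token format and signing algorithm are not described in this guide.

```python
import hashlib
import hmac
import time


def _payload(block_id: str, owner: str, expiry: float) -> bytes:
    # Hypothetical canonical encoding of the fields covered by the signature.
    return f"{block_id}|{owner}|{expiry}".encode()


def sign_block_token(secret_key: bytes, block_id: str, owner: str, expiry: float) -> dict:
    """OM side: issue a token bound to one block, one owner, and an expiry time."""
    signature = hmac.new(secret_key, _payload(block_id, owner, expiry), hashlib.sha256).hexdigest()
    return {"block_id": block_id, "owner": owner, "expiry": expiry, "signature": signature}


def verify_block_token(secret_key: bytes, token: dict, requested_block_id: str, now=None) -> bool:
    """Datanode side: recompute the signature and check the block ID and expiry."""
    now = time.time() if now is None else now
    expected = hmac.new(
        secret_key,
        _payload(token["block_id"], token["owner"], token["expiry"]),
        hashlib.sha256,
    ).hexdigest()
    return (
        hmac.compare_digest(expected, token["signature"])  # constant-time compare
        and token["block_id"] == requested_block_id
        and now < token["expiry"]
    )


secret = b"secret-key-distributed-by-scm"
token = sign_block_token(secret, "blk-1", "alice", expiry=1000.0)
```

Because verification needs only the shared secret, the Datanode in this sketch can reject a tampered, expired, or mismatched token without contacting OM.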
+
+### 3.2 On-the-fly Decryption
+
+1. `KeyInputStream` receives encrypted bytes from the network.
+2. `CryptoInputStream` decrypts the bytes in the client's memory.
+3. The application receives the original plaintext data.
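
Why decryption can happen on the fly, chunk by chunk, follows from how counter (CTR) mode works: the keystream for any byte offset can be computed independently. The sketch below is a toy illustration of that property only. It derives the keystream from SHA-256 instead of AES so that it runs with the standard library alone, so it is not AES-CTR, not secure, and not what `CryptoInputStream` does internally; the names `keystream_block` and `ctr_xor` are hypothetical.

```python
import hashlib

BLOCK = 32  # SHA-256 digest size, standing in for the cipher block size


def keystream_block(dek: bytes, iv: bytes, counter: int) -> bytes:
    """Toy keystream block: hash of key, IV, and block counter (NOT AES)."""
    return hashlib.sha256(dek + iv + counter.to_bytes(8, "big")).digest()


def ctr_xor(dek: bytes, iv: bytes, data: bytes, offset: int = 0) -> bytes:
    """XOR data with the keystream, starting at byte `offset` of the stream.

    The same function encrypts and decrypts, as in any counter-mode cipher.
    """
    out = bytearray()
    current, block = -1, b""
    for i, byte in enumerate(data):
        counter, within = divmod(offset + i, BLOCK)
        if counter != current:  # only recompute when crossing a block boundary
            current, block = counter, keystream_block(dek, iv, counter)
        out.append(byte ^ block[within])
    return bytes(out)


dek, iv = b"raw-dek-from-kms", b"iv-from-metadata"
plaintext = b"The quick brown fox jumps over the lazy dog. " * 3
ciphertext = ctr_xor(dek, iv, plaintext)

# Decrypt only bytes 40..59, as a client would after seeking into the key.
middle = ctr_xor(dek, iv, ciphertext[40:60], offset=40)
```

The slice decrypted at offset 40 matches the corresponding plaintext bytes without touching anything before it, which is the property that lets a client decrypt each received chunk as it arrives.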
+
+---
+
+## Summary of Authorization & Security Layers
+
+| Layer            | Component | Mechanism                   | Purpose                                              |
+| ---------------- | --------- | --------------------------- | ---------------------------------------------------- |
+| Identity         | RPC Layer | Kerberos / Delegation Token | Identifies who is making the request.                |
+| Access Control   | OM        | Ranger or Native ACLs       | Determines if the user can see/access the Key.       |
+| Data Security    | KMS       | EZ Master Keys              | Protects the Data Encryption Key (`DEK`).            |
+| Edge Security    | Datanode  | Block Tokens (SCM-signed)   | Ensures only authorized clients can read raw blocks. |
+| At-Rest Security | Client/DN | AES-CTR (TDE)               | Ensures data is encrypted on physical disks.         |
+
+---
+
+## Read Path Logic Flow
+
+```text
+Application -> [OzoneClient]
+      |
+(1) getKeyInfo(Volume, Bucket, Key)
+      |
+[Ozone Manager] -> (2) [Ranger/Native Authorizer] (Access READ?)
+      |                  |-- Yes: Continue
+      |                  |-- No: Throw PERMISSION_DENIED
+      |
+(3) Fetch OmKeyInfo (Metadata + EDEK + Block IDs)
+(4) Sign Block Tokens using SCM Secret Keys
+      |
+<-- Returns OmKeyInfo --
+      |
+(5) [Client] -> KMS: decrypt(EDEK) -> returns DEK
+(6) [Client] -> Datanode: ReadChunk (BlockToken), or ReadBlock (BlockToken) if ozone.client.stream.readblock.enable=true
+      |
+[Datanode] -> (7) Verify Block Token Signature
+      |            |-- Valid: Stream bytes
+      |            |-- Invalid: Reject
+      |
+(8) [Client] decrypts bytes using DEK -> returns Plaintext
+```
+
+---
+
+## System diagram
+
+
diff --git a/docs/07-system-internals/02-data-operations/read_implementation.png b/docs/07-system-internals/02-data-operations/read_implementation.png
new file mode 100644
index 000000000..8f928591c
Binary files /dev/null and b/docs/07-system-internals/02-data-operations/read_implementation.png differ