This is an automated email from the ASF dual-hosted git repository.

jojochuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ozone-site.git


The following commit(s) were added to refs/heads/master by this push:
     new 7411e085e HDDS-15101. Document read operation implementation guide. (#407)
7411e085e is described below

commit 7411e085ef1ece8752d1babe7f6b2332c41eb017
Author: Wei-Chiu Chuang <[email protected]>
AuthorDate: Wed May 13 08:23:26 2026 -0700

    HDDS-15101. Document read operation implementation guide. (#407)
    
    Generated-by: Cursor <[email protected]>
---
 .../02-data-operations/02-read.md                  | 139 +++++++++++++++++----
 .../02-data-operations/read_implementation.png     | Bin 0 -> 7141601 bytes
 2 files changed, 115 insertions(+), 24 deletions(-)

diff --git a/docs/07-system-internals/02-data-operations/02-read.md b/docs/07-system-internals/02-data-operations/02-read.md
index 267b86226..710bb35e5 100644
--- a/docs/07-system-internals/02-data-operations/02-read.md
+++ b/docs/07-system-internals/02-data-operations/02-read.md
@@ -3,27 +3,118 @@ draft: true
 sidebar_label: Read
 ---
 
-# Implementation of Read Operations
-
-**TODO:** File a subtask under [HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this page or section.
-
-## Reading Metadata
-
-## Reading Data
-
-Trace every part of a read request from beginning to end. This includes:
-
-- Client getting encryption keys
-- Client calling OM to create key
-- OM validating client's Kerberos principal
-- OM checking permissions (Ranger or Native ACLs)
-- OM generating block tokens from the shared secret previously retrieved from SCM
-- OM getting block locations from SCM or from its cache.
-- OM returning container, blocks, pipeline, block tokens
-- Client sending block tokens and Datanode validating based on the shared secret from SCM
-- Client sending read chunk requests to Datanode to fetch the data.
-  - For replication:
-    - Include topology choices of which Datanodes to use
-    - Include failover handling
-  - For EC, link to the [EC feature page](../features/erasure-coding).
-- Client validating checksums
+# Apache Ozone Internals: Read Operation Implementation Guide
+
+This guide provides a comprehensive trace of a read request in Apache Ozone, including metadata resolution, security (Block Tokens), Transparent Data Encryption (TDE), and Authorization (Ranger/Native ACLs).
+
+---
+
+## 1. Phase 1: Request & Authorization (Client & OM)
+
+### 1.1 Initiating the Request
+
+The application calls `OzoneBucket.readKey(key)`. The client sends a `getKeyInfo` RPC to the Ozone Manager (OM).
+
+### 1.2 OM: Authorization Check
+
+Before returning key metadata, the OM must authorize the user.
+
+1. **Entry Point:** `OmMetadataReader.checkAcls()` is called within the `getKeyInfo` flow.
+2. **Authorizer Selection:** Based on configuration (`ozone.acl.authorizer.class`), OM uses either:
+   - **Native Authorizer:** Uses Ozone's internal ACLs stored in RocksDB.
+   - **Apache Ranger Authorizer:** Delegates the decision to the Ranger Ozone Plugin (`RangerOzoneAuthorizer`).
+3. **Authorization Logic:**
+   - OM builds an `OzoneObj` (Volume/Bucket/Key) and a `RequestContext` (User, IP, Action: READ).
+   - **Ranger Flow:** The plugin checks its local cache of policies (periodically synced from the Ranger Admin server). If a policy allows READ for the user/group on that resource, access is granted.
+   - **Fallback:** If Ranger is disabled or the Native authorizer is used, OM checks the object's ACL list for matching user/group permissions.
+
+### 1.3 OM: Key & Encryption Resolution
+
+Once authorized:
+
+1. **Key Lookup:** OM finds the `OmKeyInfo` in the `keyTable`.
+2. **Encryption Check:** If TDE is enabled, the `OmKeyInfo` contains the `EDEK` (Encrypted Data Encryption Key) and the EZ Key Name.
+3. **Block Retrieval:** OM retrieves the `OmKeyLocationInfo` (Block IDs and Pipelines). Container locations are resolved using OM's container location cache; pipeline and placement metadata may be refreshed from SCM when needed. For locality-aware reads, OM can sort Datanodes within each pipeline by network distance using the network topology cached in OM (synchronized from SCM).
+4. **Block Token Generation:** OM generates a signed Block Token for each block using secret keys managed by the SCM.
+
+OM returns `OmKeyInfo` (Metadata + `EDEK` + Block Tokens) to the client.
+
+---
+
+## 2. Phase 2: Decryption Setup (Client & KMS)
+
+### 2.1 Decrypting the `EDEK`
+
+If the key is encrypted:
+
+1. **KMS Request:** The client sends the `EDEK` to the KMS (Key Management Server).
+2. **KMS Authorization:** The KMS also performs an authorization check (often via Ranger KMS plugin) to ensure the user can use the EZ Key for decryption.
+3. **`DEK` Retrieval:** KMS returns the raw `DEK` (Data Encryption Key) to the client.
+
+### 2.2 Initializing the Crypto Stream
+
+The client wraps the data stream in a `CryptoInputStream` initialized with the raw `DEK` and the IV from the metadata.
+
+---
+
+## 3. Phase 3: Data Retrieval (Client & Datanode)
+
+### 3.1 Fetching Encrypted Chunks
+
+By default, the client issues `ReadChunk` requests. If `ozone.client.stream.readblock.enable` is `true` (default is `false`), the client may issue a `ReadBlock` request instead.
+
+- **Security:** The request includes the Block Token.
+- **Datanode Validation:** The Datanode verifies the token's signature using the Secret Keys it fetched from the SCM. This is the final "at-the-edge" authorization check.
+- **Data Transfer:** The Datanode reads the encrypted data from disk and streams it back.
+
+### 3.2 On-the-fly Decryption
+
+1. `KeyInputStream` receives encrypted bytes from the network.
+2. `CryptoInputStream` decrypts the bytes in the client's memory.
+3. The application receives the original plaintext data.
+
+---
+
+## Summary of Authorization & Security Layers
+
+| Layer            | Component | Mechanism                   | Purpose                                              |
+| ---------------- | --------- | --------------------------- | ---------------------------------------------------- |
+| Identity         | RPC Layer | Kerberos / Delegation Token | Identifies who is making the request.                |
+| Access Control   | OM        | Ranger or Native ACLs       | Determines if the user can see/access the Key.       |
+| Data Security    | KMS       | EZ Master Keys              | Protects the Data Encryption Key (`DEK`).            |
+| Edge Security    | Datanode  | Block Tokens (SCM-signed)   | Ensures only authorized clients can read raw blocks. |
+| At-Rest Security | Client/DN | AES-CTR (TDE)               | Ensures data is encrypted on physical disks.         |
+
+---
+
+## Read Path Logic Flow
+
+```text
+Application -> [OzoneClient]
+                  |
+        (1) getKeyInfo(Volume, Bucket, Key)
+                  |
+        [Ozone Manager] -> (2) [Ranger/Native Authorizer] (Access READ?)
+                  |                 |-- Yes: Continue
+                  |                 |-- No:  Throw PERMISSION_DENIED
+                  |
+        (3) Fetch OmKeyInfo (Metadata + `EDEK` + Block IDs)
+        (4) Sign Block Tokens using SCM Secret Keys
+                  |
+        <-- Returns OmKeyInfo --
+                  |
+        (5) [Client] -> KMS: decrypt(`EDEK`) -> returns `DEK`
+        (6) [Client] -> Datanode: ReadChunk (BlockToken), or ReadBlock (BlockToken) if ozone.client.stream.readblock.enable=true
+                  |
+        [Datanode] -> (7) Verify Block Token Signature
+                  |                 |-- Valid: Stream bytes
+                  |                 |-- Invalid: Reject
+                  |
+        (8) [Client] decrypts bytes using `DEK` -> returns Plaintext
+```
+
+---
+
+## System diagram
+
+![Apache Ozone Internals: Read Operation Implementation Guide - read path across Application, OM, KMS, Client, and Datanode](./read_implementation.png)
diff --git a/docs/07-system-internals/02-data-operations/read_implementation.png b/docs/07-system-internals/02-data-operations/read_implementation.png
new file mode 100644
index 000000000..8f928591c
Binary files /dev/null and b/docs/07-system-internals/02-data-operations/read_implementation.png differ


