Hi,

Here is v3 of the Storage I/O Transform Hooks patch.

Changes from v2:
- Fix -Wincompatible-pointer-types error in bufmgr.c by casting &bufdata
  to (void **) for the mdread_post_hook call

v2 changes were:
- Add meson.build test configuration for test_tde extension

--
Best regards,
Sungkyun Park

On Sun, Dec 28, 2025 at 7:44 PM, Henson Choi <[email protected]> wrote:

> Updated patches with meson build support:
>
> v2:
> - Added meson.build for test_tde extension
> - Added test_tde to contrib/meson.build
>
> Regards,
> Henson Choi
>
> On Sun, Dec 28, 2025 at 6:47 PM, Henson Choi <[email protected]> wrote:
>
>> Hello,
>>
>> Following up on the RFC, I am submitting the initial patch set for the
>> proposed infrastructure. These patches introduce a minimal hook-based
>> protocol to allow extensions to handle data transformation, such as TDE,
>> while keeping the PostgreSQL core independent of specific cryptographic
>> implementations.
>>
>> Implementation Details:
>>
>> Hook Points in Storage I/O Path
>> The patch introduces five strategic hook points:
>>
>> mdread_post_hook: Called after blocks are read from disk. The extension
>> can reverse-transform data in place.
>>
>> mdwrite_pre_hook & mdextend_pre_hook: Called before writing or extending
>> blocks. These hooks return a pointer to transformed buffers.
>>
>> xlog_insert_pre_hook & xlog_decode_pre_hook: Handle transformation for
>> WAL records during insertion and replay.
>>
>> Data Integrity and Checksum Protocol
>> To ensure robust error detection, the hooks follow a specific
>> verification protocol:
>>
>> On Write: The extension transforms the page, sets the Transform ID, then
>> recalculates the checksum on the transformed data.
>>
>> On Read: The extension verifies the on-disk checksum of the transformed
>> data first. After reverse-transformation, it clears the Transform ID and
>> recalculates the checksum for the plaintext data. This ensures corruption
>> is detected regardless of the transformation state.
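
For readers following the checksum protocol quoted above, here is a minimal
sketch of the write and read ordering. Everything in it is an assumption made
for illustration: ToyPage, toy_checksum, toy_xor, toy_write_pre and
toy_read_post are hypothetical names, the XOR stands in for a real cipher such
as the AES-256-CTR used by test_tde, and the checksum is a toy stand-in for
PostgreSQL's page checksum. It also transforms the page in place for brevity,
whereas the patch's write hooks are described as returning a pointer to
transformed buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAYLOAD_SIZE 16

/* Hypothetical miniature page: not PostgreSQL's PageHeaderData. */
typedef struct ToyPage
{
    uint16_t checksum;      /* stand-in for pd_checksum */
    uint16_t transform_id;  /* stand-in for the Transform ID; 0 = plain */
    uint8_t  payload[TOY_PAYLOAD_SIZE];
} ToyPage;

/* Toy checksum over everything except the checksum field itself. */
static uint16_t
toy_checksum(const ToyPage *page)
{
    uint32_t sum = page->transform_id;

    for (size_t i = 0; i < TOY_PAYLOAD_SIZE; i++)
        sum = sum * 31 + page->payload[i];
    return (uint16_t) (sum & 0xFFFF);
}

/* Toy reversible transform; a real extension would call a cipher here. */
static void
toy_xor(ToyPage *page, uint8_t key)
{
    for (size_t i = 0; i < TOY_PAYLOAD_SIZE; i++)
        page->payload[i] ^= key;
}

/* Write path: transform, set the ID, then checksum the transformed image. */
static void
toy_write_pre(ToyPage *page, uint16_t transform_id, uint8_t key)
{
    assert(transform_id != 0);  /* 0 is reserved for untransformed pages */
    toy_xor(page, key);
    page->transform_id = transform_id;
    page->checksum = toy_checksum(page);
}

/*
 * Read path: verify the on-disk image first, then reverse the transform,
 * clear the ID, and checksum the plaintext. Returns false on a mismatch.
 */
static bool
toy_read_post(ToyPage *page, uint8_t key)
{
    if (page->checksum != toy_checksum(page))
        return false;           /* corruption detected before decrypting */
    if (page->transform_id == 0)
        return true;            /* untransformed page, nothing to undo */
    toy_xor(page, key);
    page->transform_id = 0;
    page->checksum = toy_checksum(page);
    return true;
}
```

The point of the ordering is that the checksum always describes the bytes as
they currently exist, so a damaged on-disk page is rejected before any
reverse-transformation is attempted.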
>>
>> WAL Safety via XLR_BLOCK_ID_TRANSFORMED (251)
>> For WAL records, I have introduced a specific block ID (251) to mark
>> transformed data. If the decryption extension is not loaded, the WAL reader
>> will encounter this unknown block ID and fail-fast, preventing the system
>> from incorrectly interpreting encrypted data as valid WAL records.
>>
>> PageHeader Transform ID (5-bit)
>> I have allocated bits 3-7 of pd_flags in the PageHeader for a Transform
>> ID. This allows the engine and extensions to identify the transformation
>> state of a page (e.g., key versioning or algorithm type) without attempting
>> decryption. It ensures backward compatibility: pages with Transform ID 0
>> are treated as standard untransformed pages.
>>
>> Memory and Critical Section Safety
>> As demonstrated in the contrib/test_tde reference implementation, cipher
>> contexts are pre-allocated in _PG_init to avoid memory allocation during
>> critical sections. For WAL transformation,
>> MemoryContextAllowInCriticalSection() is used to allow buffer reallocation
>> within critical sections; if OOM occurs during buffer growth, it results in
>> a controlled PANIC.
>>
>> Performance Considerations
>> When hooks are not set (default), the overhead is limited to a single
>> NULL pointer comparison per I/O operation. This is architecturally
>> consistent with existing PostgreSQL hooks and is designed to have a
>> negligible impact on performance.
>>
>> Attached Patches:
>>
>> v20251228-0001-Add-Storage-I-O-Transform-Hooks-for-PostgreSQL.patch: Core
>> infrastructure.
>> v20251228-0002-Add-test_tde-extension-for-TDE-testing.patch: Reference
>> implementation using AES-256-CTR.
>>
>> I look forward to your comments and feedback.
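
The 5-bit Transform ID layout quoted above (bits 3-7 of pd_flags) could be
expressed roughly as below. The three existing flag values match
PostgreSQL's bufpage.h; the PD_TRANSFORM_* macros and the two helper
functions are hypothetical names chosen for this sketch and may not match
what the patch actually defines.

```c
#include <assert.h>
#include <stdint.h>

/* Existing pd_flags bits in PostgreSQL's bufpage.h occupy bits 0-2. */
#define PD_HAS_FREE_LINES   0x0001
#define PD_PAGE_FULL        0x0002
#define PD_ALL_VISIBLE      0x0004

/* Hypothetical names: Transform ID in bits 3-7 (5 bits, values 0-31). */
#define PD_TRANSFORM_SHIFT  3
#define PD_TRANSFORM_MASK   0x00F8
#define PD_TRANSFORM_MAX    31

/* Extract the Transform ID; 0 means a standard untransformed page. */
static inline uint16_t
pd_flags_get_transform_id(uint16_t pd_flags)
{
    return (uint16_t) ((pd_flags & PD_TRANSFORM_MASK) >> PD_TRANSFORM_SHIFT);
}

/* Return pd_flags with the Transform ID replaced, other bits untouched. */
static inline uint16_t
pd_flags_set_transform_id(uint16_t pd_flags, uint16_t id)
{
    assert(id <= PD_TRANSFORM_MAX);
    return (uint16_t) ((pd_flags & ~PD_TRANSFORM_MASK) |
                       (id << PD_TRANSFORM_SHIFT));
}
```

Because pages written by an unpatched server only ever set bits 0-2, such
pages read back as Transform ID 0, which is the backward-compatibility
property the quoted text relies on.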
>>
>> Regards,
>>
>> Henson Choi
>>
>> On Sun, Dec 28, 2025 at 4:49 PM, Henson Choi <[email protected]> wrote:
>>
>>> RFC: PostgreSQL Storage I/O Transformation Hooks Infrastructure for a
>>> Technical Protocol Between RDBMS Core and Data Security Experts
>>>
>>> *Author:* Henson Choi [email protected]
>>>
>>> *Date:* 2025-12-28
>>>
>>> *PostgreSQL Version:* master (Development)
>>> ------------------------------
>>> 1. Summary & Motivation
>>>
>>> This RFC proposes the introduction of minimal hooks into the PostgreSQL
>>> storage layer and the addition of a *Transformation ID* field to the
>>> PageHeader.
>>>
>>> A Diplomatic Protocol Between Expert Groups
>>>
>>> The core motivation of this proposal is *“Separation of Concerns and
>>> Mutual Respect.”*
>>>
>>> Historically, discussions around Transparent Data Encryption (TDE) have
>>> often felt like putting security experts on trial in a foreign court,
>>> specifically the “Court of RDBMS.” It is time to treat them not as
>>> defendants to be judged by database-specific rules, but as an *equal
>>> neighboring community* with their own specialized sovereignty.
>>>
>>> *The issue has never been a failure of technology, but rather a
>>> misplacement of the focal point.* While previous discussions were mired
>>> in the technicalities of “how to hardcode encryption into the core,” this
>>> proposal shifts the debate toward an architectural solution: “what
>>> interface the core should provide to external experts.”
>>>
>>> - *RDBMS Experts* provide a trusted pipeline responsible for data
>>>   I/O paths and consistency.
>>> - *Security Experts* take responsibility for the specialized domain
>>>   of encryption algorithms and key management.
>>>
>>> This hook system functions as a *Technical Protocol*: a high-level
>>> agreement that allows these two expert groups to exchange data securely
>>> without encroaching on each other’s territory.
>>> ------------------------------
>>> 2. Design Principles
>>>
>>> 1. *Delegation of Authority:* The core remains independent of
>>>    specific encryption standards, providing a “free territory” where
>>>    security experts can respond to an ever-changing security landscape.
>>> 2. *Diplomatic Convention:* The Transformation ID acts as a
>>>    communication protocol between the engine and the extension. The
>>>    engine uses this ID to identify the state of the data and hands over
>>>    control to the appropriate expert (the extension).
>>> 3. *Minimal Interference:* Overhead is kept near zero when hooks are
>>>    not in use, ensuring the native performance of the PostgreSQL engine.
>>>
>>> ------------------------------
>>> 3. Proposal Specifications
>>>
>>> 3.1 The Interface (Hook Points)
>>>
>>> We allow intervention by security experts through five contact points
>>> along the I/O path:
>>>
>>> - *Read/Write Hooks:* mdread_post, mdwrite_pre, mdextend_pre
>>>   (Transformation of the data area)
>>> - *WAL Hooks:* xlog_insert_pre, xlog_decode_pre (Transformation of
>>>   transaction logs)
>>>
>>> 3.2 The Protocol Identifier (PageHeader Transformation ID)
>>>
>>> We allocate 5 bits of pd_flags to define the “Security State” of a
>>> page. This serves as a *Status Message* sent by the security expert to
>>> the engine, utilized for key versioning and as a migration marker.
>>> ------------------------------
>>> 4. Reference Implementation: contrib/test_tde
>>>
>>> A Standard Code of Conduct for Security Experts
>>>
>>> This reference implementation exists not as a commercial product, but to
>>> define the *Standards of the Diplomatic Protocol* that
>>> encryption/decryption experts must follow when entering the PostgreSQL
>>> domain.
>>>
>>> 1. *Deterministic IV Derivation:* Demonstrates how to achieve
>>>    cryptographic safety by trusting unique values provided by the engine
>>>    (e.g., LSN).
>>> 2. *Critical Section Safety:* Defines memory management regulations
>>>    that security logic must follow within “Critical Sections” to maintain
>>>    system stability.
>>> 3. *Hook Chaining:* Demonstrates a cooperative structure that allows
>>>    peaceful coexistence with other expert tools (e.g., compression,
>>>    auditing).
>>>
>>> ------------------------------
>>> 5. Scope
>>>
>>> - *In-Scope:* Backend hook infrastructure, Transformation ID field,
>>>   and reference code demonstrating diplomatic protocol compliance.
>>> - *Out-of-Scope:* Specific Key Management Systems (KMS), selection
>>>   of specific cryptographic algorithms, and integration with external
>>>   tools.
>>>
>>> This proposal represents a strategic diplomatic choice: rather than the
>>> PostgreSQL core assuming all security responsibilities, it grants security
>>> experts a *sovereign territory through extensions* where they can
>>> perform at their best.
>>>
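
To make the hook chaining and the "single NULL pointer comparison" points
from the quoted RFC concrete, here is a small standalone sketch of the usual
PostgreSQL hook pattern. The hook signature is a simplified assumption (the
patch's real mdread_post_hook almost certainly takes more arguments), and
core_after_read, the audit_* and xor_* functions, and DEMO_BLCKSZ are all
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BLCKSZ 8           /* toy block size for this sketch */

/* Simplified, assumed signature; not taken from the patch. */
typedef void (*mdread_post_hook_type) (void **buffers, int nblocks);

/* Core side: NULL by default, so an unused hook costs one comparison. */
mdread_post_hook_type mdread_post_hook = NULL;

static void
core_after_read(void **buffers, int nblocks)
{
    assert(buffers != NULL);
    if (mdread_post_hook != NULL)
        mdread_post_hook(buffers, nblocks);
}

/* Extension A (hypothetical): counts blocks that pass through. */
static mdread_post_hook_type audit_prev_hook = NULL;
static int audit_blocks_seen = 0;

static void
audit_mdread_post(void **buffers, int nblocks)
{
    if (audit_prev_hook != NULL)
        audit_prev_hook(buffers, nblocks);  /* keep earlier hooks working */
    audit_blocks_seen += nblocks;
}

static void
audit_install(void)
{
    audit_prev_hook = mdread_post_hook;     /* remember the previous hook */
    mdread_post_hook = audit_mdread_post;
}

/* Extension B (hypothetical): toy XOR standing in for decryption. */
static mdread_post_hook_type xor_prev_hook = NULL;

static void
xor_mdread_post(void **buffers, int nblocks)
{
    if (xor_prev_hook != NULL)
        xor_prev_hook(buffers, nblocks);
    for (int i = 0; i < nblocks; i++)
    {
        unsigned char *p = (unsigned char *) buffers[i];

        for (int j = 0; j < DEMO_BLCKSZ; j++)
            p[j] ^= 0x5A;
    }
}

static void
xor_install(void)
{
    xor_prev_hook = mdread_post_hook;
    mdread_post_hook = xor_mdread_post;
}
```

In a real extension the install step would live in _PG_init. The order in
which chained transforms must run on read versus write is a design question
this sketch does not settle; it only shows that each extension saves and
calls the previously installed hook.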
v20251228-v3-0001-Add-Storage-I-O-Transform-Hooks-for-PostgreSQL.patch
Description: Binary data
v20251228-v3-0002-Add-test_tde-extension-for-TDE-testing.patch
Description: Binary data
