Re: RFC: PostgreSQL Storage I/O Transformation Hooks

Henson Choi Sun, 28 Dec 2025 07:51:49 -0800

Hi Konstantin,

I understand the decorator pattern, and yes, it can work for some cases.
But decorators can only intercept at the beginning and end of functions.
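
To make sure we mean the same thing by "decorator", here is a rough,
self-contained sketch. It is not the real SMGR API: ToySmgr, base_smgr,
enc_smgr and the XOR "cipher" are hypothetical stand-ins. The wrapper can
only act before or after the wrapped call as a whole.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified storage-manager interface. */
typedef struct ToySmgr
{
    void (*write)(unsigned char *store, const unsigned char *buf, size_t len);
    void (*read)(const unsigned char *store, unsigned char *buf, size_t len);
} ToySmgr;

/* "Standard" manager: plain copies stand in for real file I/O. */
static void base_write(unsigned char *store, const unsigned char *buf, size_t len)
{
    memcpy(store, buf, len);
}

static void base_read(const unsigned char *store, unsigned char *buf, size_t len)
{
    memcpy(buf, store, len);
}

static const ToySmgr base_smgr = { base_write, base_read };

/* Toy transform; XOR is its own inverse. Not real cryptography. */
static void xor_buf(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= 0x5A;
}

/* Decorator: transform a private copy, then delegate the write. */
static void enc_write(unsigned char *store, const unsigned char *buf, size_t len)
{
    unsigned char tmp[64];

    assert(len <= sizeof tmp);
    memcpy(tmp, buf, len);
    xor_buf(tmp, len);                 /* "pre" step */
    base_smgr.write(store, tmp, len);
}

/* Decorator: delegate the read, then transform the whole result. */
static void enc_read(const unsigned char *store, unsigned char *buf, size_t len)
{
    base_smgr.read(store, buf, len);
    xor_buf(buf, len);                 /* "post" step */
}

static const ToySmgr enc_smgr = { enc_write, enc_read };

/* Returns 1 if stored bytes differ from the input and the read restores it. */
static int decorator_roundtrip_ok(void)
{
    const unsigned char in[8] = "pgpage!";
    unsigned char store[8], out[8];

    enc_smgr.write(store, in, sizeof in);
    enc_smgr.read(store, out, sizeof out);
    return memcmp(store, in, sizeof in) != 0 && memcmp(out, in, sizeof in) == 0;
}
```

Calling decorator_roundtrip_ok() from a small main() exercises the wrapper;
note that enc_read() only sees the buffer once base_smgr.read() has returned.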


Looking at the actual hook locations in md.c:

- mdextend_pre_hook: after error checks, before file open → Decorator possible
- mdwrite_pre_hook: after assertions, before I/O loop → Decorator possible
- mdread_post_hook: inside the segment loop → Decorator NOT possible

The mdreadv() function, introduced in PostgreSQL 17 as part of the
vectored I/O API, processes multiple blocks in a loop that respects
segment boundaries. The decryption hook must be called inside this loop,
after each segment's FileReadV() completes. A decorator wrapping mdreadv()
from the outside cannot access this internal loop timing.
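
The timing can be illustrated with a small simulation. This is not md.c
code: read_blocks(), post_read_hook, the toy sizes and the XOR "cipher" are
hypothetical, and the sketch assumes the hook fires once per segment-level
read, with a memcpy() standing in for FileReadV().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE   8    /* toy value, not PostgreSQL's BLCKSZ */
#define TOY_SEG_BLOCKS   4    /* toy blocks per segment, not RELSEG_SIZE */
#define TOY_TOTAL_BLOCKS 12

/* Hypothetical hook type: called after each segment-level read. */
typedef void (*post_read_hook_type)(unsigned first_block, unsigned nblocks,
                                    unsigned char *buf);

static post_read_hook_type post_read_hook = NULL;
static int hook_calls = 0;
static unsigned char disk[TOY_TOTAL_BLOCKS * TOY_BLOCK_SIZE];

/* Toy stand-in for a vectored read that splits at segment boundaries. */
static void read_blocks(unsigned blocknum, unsigned nblocks, unsigned char *buf)
{
    assert(blocknum + nblocks <= TOY_TOTAL_BLOCKS);

    while (nblocks > 0)
    {
        /* Blocks remaining in the current segment. */
        unsigned in_seg = TOY_SEG_BLOCKS - (blocknum % TOY_SEG_BLOCKS);
        unsigned n = nblocks < in_seg ? nblocks : in_seg;

        /* Stands in for the per-segment read. */
        memcpy(buf, disk + (size_t) blocknum * TOY_BLOCK_SIZE,
               (size_t) n * TOY_BLOCK_SIZE);

        /* The hook runs here, inside the loop, once per segment. */
        if (post_read_hook)
            post_read_hook(blocknum, n, buf);

        blocknum += n;
        nblocks -= n;
        buf += (size_t) n * TOY_BLOCK_SIZE;
    }
}

/* Toy "decryption": XOR with a fixed byte. Not real cryptography. */
static void xor_decrypt(unsigned first_block, unsigned nblocks, unsigned char *buf)
{
    (void) first_block;
    hook_calls++;
    for (size_t i = 0; i < (size_t) nblocks * TOY_BLOCK_SIZE; i++)
        buf[i] ^= 0x5A;
}

/* Reads blocks 2..9, which touch three toy segments; returns hook call count. */
static int demo_hook_calls(void)
{
    unsigned char buf[8 * TOY_BLOCK_SIZE];

    memset(disk, 'A' ^ 0x5A, sizeof disk);   /* "encrypted" content */
    post_read_hook = xor_decrypt;
    hook_calls = 0;
    read_blocks(2, 8, buf);
    assert(buf[0] == 'A' && buf[sizeof buf - 1] == 'A');
    return hook_calls;
}
```

In this toy setup a single read_blocks() call over blocks 2..9 invokes the
hook three times, once per segment touched, before the function returns.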

With the SMGR decorator approach, the extension developer must:
- Track upstream md.c changes
- Replicate the internal loop logic to find the right decryption point

With hooks, the extension developer only needs to:
- Implement encrypt() and decrypt()
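
As a rough sketch of what that could look like from the extension side:
the hook names below follow this thread, but the signature, the
install_hooks() helper and the XOR keystream are assumptions made for
illustration only, and the transform is not real cryptography.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16   /* toy value, not PostgreSQL's BLCKSZ */

/* Assumed signature: transform one page in place. */
typedef void (*page_transform_hook)(unsigned blocknum, unsigned char *page,
                                    size_t len);

/* Hook variables named after the thread; a real patch may differ. */
static page_transform_hook mdwrite_pre_hook = NULL;
static page_transform_hook mdread_post_hook = NULL;

/* Toy keystream derived from the block number and byte offset. */
static void apply_keystream(unsigned blocknum, unsigned char *page, size_t len)
{
    assert(page != NULL);
    for (size_t i = 0; i < len; i++)
        page[i] ^= (unsigned char) (0xA5 ^ blocknum ^ i);
}

/* The two functions an extension author would write. */
static void toy_encrypt(unsigned blocknum, unsigned char *page, size_t len)
{
    apply_keystream(blocknum, page, len);
}

static void toy_decrypt(unsigned blocknum, unsigned char *page, size_t len)
{
    apply_keystream(blocknum, page, len);   /* XOR is its own inverse */
}

/* Hypothetical registration step, e.g. at extension load time. */
static void install_hooks(void)
{
    mdwrite_pre_hook = toy_encrypt;
    mdread_post_hook = toy_decrypt;
}

/* Returns 1 if encrypting changes the page and decrypting restores it. */
static int roundtrip_ok(void)
{
    unsigned char orig[TOY_PAGE_SIZE], page[TOY_PAGE_SIZE];
    int changed;

    memset(orig, 0x42, sizeof orig);
    memcpy(page, orig, sizeof page);
    install_hooks();

    mdwrite_pre_hook(7, page, sizeof page);
    changed = memcmp(page, orig, sizeof page) != 0;
    mdread_post_hook(7, page, sizeof page);

    return changed && memcmp(page, orig, sizeof page) == 0;
}
```

The point of the sketch is the division of labor: the extension supplies
the two transform functions, and everything about segments and file I/O
stays on the core side of the hook.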

Regarding encryption+compression: that's a valid use case for SMGR,
but our primary concern is different. In South Korea, government
regulations require the use of nationally-approved cryptographic
algorithms (such as ARIA, SEED). This means organizations often cannot
adopt foreign TDE solutions, regardless of their technical merit.

We need a simple, stable hook interface that allows local security
experts to integrate these required algorithms - experts who understand
cryptography but not PostgreSQL storage internals.

If both approaches can coexist, why not provide hooks for the simple
case and SMGR for the complex case?

Best regards,
Henson Choi

On Mon, Dec 29, 2025 at 12:27 AM, Konstantin Knizhnik <[email protected]> wrote:

>
> On 28/12/2025 4:53 PM, Henson Choi wrote:
> > Subject: Re: RFC: PostgreSQL Storage I/O Transformation Hooks
> >
> > Hi Konstantin,
> >
> > I have great respect for the work being done on the extensible SMGR API.
> > It is a powerful tool for use cases that require replacing the entire
> > storage layer (like Neon's architecture).
> >
> > However, I believe we should distinguish between Storage Management
> > (where/how data is stored) and Data Transformation (what the data looks
> > like). I see a strong case for both approaches to coexist for the
> > following practical reasons:
> >
> > 1. Separation of Concerns and Safety
> >
> > Is it reasonable to ask cryptography experts to clone the entire SMGR
> > implementation and maintain code they don't fully understand just to
> > insert encryption logic? If an extension developer clones md.c to add
> > encryption, they become responsible for the fundamental integrity of
> > PostgreSQL's file I/O. Any bug in their cloned storage logic could lead
> > to data loss unrelated to encryption itself.
> >
> > 2. The Maintenance Debt of "Cloning"
> >
> > When md.c receives critical security patches or bug fixes in the core,
> > every TDE extension maintainer would need to manually backport those
> > changes to their specific SMGR implementation. This creates a fragmented
> > ecosystem where security extensions might actually introduce storage
> > vulnerabilities by running outdated cloned logic.
> >
> > 3. Minimalist Integration
> >
> > The hook approach allows crypto experts to focus strictly on transform()
> > and reverse_transform(). The complex storage orchestration remains with
> > the PostgreSQL core where it is most rigorously tested. This is a cleaner
> > separation of responsibilities: the core provides the trusted pipeline,
> > and the extension provides the specialized transformation.
> >
> > Conclusion:
> >
> > I believe these hooks provide a "low-barrier, high-safety" path for data
> > transformation that the SMGR API—by its very nature of being a full
> > replacement—cannot easily provide. Let's provide the SMGR for those who
> > want to reinvent the storage, and hooks for those who simply want to
> > secure the data.
> >
> > Best regards,
> > Henson Choi
>
>
> I do not think that a custom SMGR API contradicts the idea of Data
> Transformation.
> Do you know about the decorator pattern?
> If you want to implement, e.g., data encryption, you definitely do not
> need to write your storage manager from scratch.
> Obviously you can (and should) use the standard storage manager (md.c) for
> actually performing I/O.
> But your storage manager can perform some extra action before or after
> I/O, for example encrypt data before write and decrypt it after read.
> So any pre/post/instead hooks can be easily implemented using a custom SMGR.
>
>
> The opposite, unfortunately, is not possible. You cannot, for example,
> implement encryption+compression using hooks.
> But you can easily do it using a custom SMGR: this is how the compressed
> file system (CFS) was implemented in PgPro.
>
>
