> The main difference is timing and current availability:
>
> - The hook approach is working today and can be used immediately.
> - Your SMGR extensibility work provides a more comprehensive long-term
>   solution

I disagree with this. The SMGR patch has been available as a patch since 2023 (PG16), and it is already used by at least three companies I know of (Neon, Nile, Percona), and probably by others I don't know of. It is available immediately. Compared to that, this proposal is something new and more limited. The actual advantage of this proposal is that it includes WAL, but I still think the two should be separate discussions.

> Regarding what to protect (WAL vs heap vs both), there's flexibility
> depending on the organization and jurisdiction. The hook approach allows
> extensions to choose - you can implement only the buffer hooks if that
> satisfies your requirements, or add WAL hooks if needed.

My concern is that these are two separate discussions about two extensibility points, with different concerns raised by different people. One part shouldn't stall the other: for some, even getting half of it into core for PG19 would be useful.

> You're absolutely right that extension developers need to understand
> multiprocess architecture, memory management, critical sections, and so on.
> This is precisely why test_tde exists as a reference implementation.

The reference implementation ignores the tricky steps, like key rotation, caching, configuration, providing a user interface, etc., all of which require knowledge of postgres internals.

> ARIA and SEED are already implemented in OpenSSL. However, Korean law
> requires certified implementations. Specifically, companies must use
> nationally-certified builds and provide the hash codes of those specific
> library binaries to regulators. You cannot simply use the OpenSSL version,
> even if the algorithm is identical.

That could still be solved by introducing an abstraction layer in the encryption code of a TDE extension :) Encryption is only a small part of an extension; the other parts (user interface, rotation, key storage integrations, etc.) are a much bigger part.
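
To illustrate what such an abstraction layer could look like, here is a minimal sketch. All names in it (CipherBackend, register_cipher_backend, find_cipher_backend) are hypothetical and do not come from PostgreSQL, from the hook patch, or from any existing extension. The "toy-xor" backend is a placeholder so the sketch runs; it is not encryption. A certified ARIA or SEED library would be wrapped behind the same two function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface; none of these names exist in PostgreSQL. */
typedef int (*cipher_fn) (const uint8_t *key, size_t keylen,
                          const uint8_t *iv, size_t ivlen,
                          const uint8_t *in, uint8_t *out, size_t len);

typedef struct CipherBackend
{
    const char *name;       /* e.g. "openssl-aes", "certified-aria" */
    cipher_fn   encrypt;    /* returns 0 on success */
    cipher_fn   decrypt;    /* returns 0 on success */
} CipherBackend;

/* Placeholder backend: XOR with key and IV bytes. NOT real encryption. */
static int
toy_xor(const uint8_t *key, size_t keylen,
        const uint8_t *iv, size_t ivlen,
        const uint8_t *in, uint8_t *out, size_t len)
{
    assert(keylen > 0 && ivlen > 0);
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ key[i % keylen] ^ iv[i % ivlen];
    return 0;
}

static const CipherBackend toy_backend = {"toy-xor", toy_xor, toy_xor};

/* Small fixed-size registry; the rest of the extension only sees names. */
#define MAX_BACKENDS 8
static const CipherBackend *backends[MAX_BACKENDS];
static int  nbackends = 0;

int
register_cipher_backend(const CipherBackend *b)
{
    assert(b != NULL && b->name != NULL);
    assert(b->encrypt != NULL && b->decrypt != NULL);
    if (nbackends >= MAX_BACKENDS)
        return -1;              /* registry full */
    backends[nbackends++] = b;
    return 0;
}

const CipherBackend *
find_cipher_backend(const char *name)
{
    for (int i = 0; i < nbackends; i++)
        if (strcmp(backends[i]->name, name) == 0)
            return backends[i];
    return NULL;                /* unknown backend */
}
```

With something like this, key management, rotation and the user interface would call find_cipher_backend() with a configured name and never link against a specific crypto library directly.
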
It is still questionable to reimplement everything because of an encryption library difference. But I see your point, that is a bit more difficult.

> That's a reasonable approach for SMGR-based solutions where you control the
> storage layer. However, with the hook approach, we don't have the ability to
> inject custom WAL records for encryption events.
> Currently, in a replication environment, the reference implementation
> requires the same key to be configured in the settings on both primary and
> replicas (shared key model). For future KMS integration, I'm considering
> mechanisms to propagate keys to replicas through external channels rather
> than WAL.

I originally wrote a long answer about how I don't think this is related to where the hooks are, and then I realized that the problem is probably completely different - and this also shows why adding a few bits to the pages is not a good generic solution for all extensions.

Our extension uses a two-level key architecture, as used by most database servers: there's a master key, which encrypts separate internal keys, one for each database file. The proposed sample code in your patch uses a single key, with the IV encoding the database file. That means you want to encode which key is used for each page instead of for each file.

So we approach the mapping of data/pages to keys completely differently. But I don't think the page header addition is a good solution, because it is specific to your implementation, not to encryption solutions in general.

(Also, I just noticed that you forgot about timelineid in derive_iv; you probably want to include that somehow.)
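
To make the timeline remark concrete: after a promotion or point-in-time recovery, a new timeline can produce different data at an LSN that was already used on the old timeline, so an IV derived from the LSN alone could repeat under the same key. The sketch below is only an illustration of mixing the timeline ID into the IV. It does not reproduce the patch's derive_iv, whose real inputs and layout may differ; the function name, the 16-byte layout and the choice of inputs are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_IV_LEN 16

/*
 * Hypothetical IV layout (an assumption, not the patch's format):
 *   bytes 0..3   timeline ID, big-endian
 *   bytes 4..11  LSN, big-endian
 *   bytes 12..15 zero, left free for a per-block counter
 */
void
sketch_derive_iv(uint32_t timeline_id, uint64_t lsn,
                 uint8_t iv[SKETCH_IV_LEN])
{
    assert(iv != NULL);
    memset(iv, 0, SKETCH_IV_LEN);

    /* Timeline first, so equal LSNs on different timelines never collide. */
    for (int i = 0; i < 4; i++)
        iv[i] = (uint8_t) (timeline_id >> (24 - 8 * i));

    for (int i = 0; i < 8; i++)
        iv[4 + i] = (uint8_t) (lsn >> (56 - 8 * i));
}
```

Whether the timeline ID is packed into the IV like this, or handled by using a different key per timeline, is a design choice for the extension; the only point is that the same LSN on two timelines should not yield the same key and IV pair.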
