On Tue, May 25, 2021 at 1:37 PM Andres Freund <and...@anarazel.de> wrote: > The obvious concerns are issues around binary upgrades for cases that > already use the special space? Are you planning to address that by not > having that path? Or by storing the nonce at the "start" of the special > space (i.e. [normal data][nonce][existing special])?
Well, there aren't any existing encrypted clusters, so what is the scenario exactly? Perhaps you are thinking that we'd have a pg_upgrade option that would take an unencrypted cluster and encrypt all the pages, without any other page format changes. If so, this design would preclude that choice, because there might be no free space available. > Is there an argument for generalizing the nonce approach for to replace > fake LSNs for unlogged relations? I hadn't thought about that. Maybe. But that would require including the nonce always, rather than only when TDE is selected, or including it always in some kinds of pages and only conditionally in others, which seems more complex. > Why is using pd_special better than finding space for a flag bit in the > header indicating whether it has a nonce? Using pd_special will burden > all code using special space, and maybe even some that does not (think > empty pages now occasionally having a non-zero pd_special), whereas > implementing it on the page level wouldn't quite have the same concerns. Well, I think there's a lot of code that knows where the line pointer array starts, and all those calculations will have to become more complex at runtime if we put the nonce anywhere near the start of the page. I think there are way fewer things that care about the end of the page. I dislike the idea that every call to PageGetItem() would need to know the nonce size - there are hundreds of those calls and making them more expensive seems a lot worse than the stuff this patch changes. It's always possible that I'm confused here, either about what you are proposing or how impactful it would actually be... > I do suspect having only the "no nonce" or "nonce is a compile time > constant" cases would be good performance wise. Stuff like > > > +#define MaxHeapTupleSizeLimit (BLCKSZ - MAXALIGN(SizeOfPageHeaderData + \ > > + sizeof(ItemIdData))) > > +#define MaxHeapTupleSize(tdeNonceSize) (BLCKSZ - > > MAXALIGN(SizeOfPageHeaderData + \ > > + sizeof(ItemIdData)) - > > MAXALIGN(tdeNonceSize)) > > won't be free. One question here is whether we're comfortable saying that the nonce is entirely constant. I wasn't sure about that. It seems possible to me that different encryption algorithms might want nonces of different sizes, either now or in the future. I am not a cryptographer, but that seemed like a bit of a limiting assumption. So Bharath and I decided to make the POC cater to a fully variable-size nonce rather than zero-or-some-constant. However, if the consensus is that zero-or-some-constant is better, fair enough! The patch can certainly be adjusted to cater to work that way. -- Robert Haas EDB: http://www.enterprisedb.com