On Wed, Jul 31, 2019 at 09:25:01AM -0400, Sehrope Sarkuni wrote: > On Tue, Jul 30, 2019 at 4:48 PM Bruce Momjian <br...@momjian.us> wrote: > > I had more time to think about the complexity of adding relfilenode to > the IV. Since relfilenode is only unique within a database/tablespace, > we would need to have pg_upgrade preserve database/tablespace oids > (which I assume are the same as the directory and tablespace symlinks). > Then, to decode a page, you would need to look up those values. This is > in addition to the new complexity of CREATE DATABASE and moving files > between tablespaces. I am also concerned that crash recovery operations > and cluster forensics and repair would need to also deal with this. > > I am not even clear if pg_upgrade preserving relfilenode is possible --- > when we wrap the relfilenode counter, does it start at 1 or at the > first-user-relation-oid? If the former, it could conflict with oids > assigned to new system tables in later major releases. Tying the > preservation of relations to two restrictions seems risky. > > > Agreed. Unless you know for sure the input is going to immutable across copies > or upgrades, including anything in either the IV or key derivation gets risky > and could tie you down for the future. That's partly why I like the idea > separate salt (basically you directly pay for the complexity by tracking > that).
Yes, fragility is something to be concerned about. The system is already very complex, and we occasionally have to do forensic work or repairs. > Even if we do not include a separate per-relation salt or things like > relfilenode when generating a derived key, we can still include other types of > immutable attributes. For example the fork type could be included to > eventually > allow multiple forks for the same relation to be encrypted with the same IV = > LSN + Page Number as the derived key per-fork would be distinct. Yes, the fork number could be useful in this case. I was thinking we would just leave the extra bits as zeros and we can then set it to '1' or something else for a different fork. > > Using just the page LSN and page number allows a page to be be > decrypted/encrypted independently of its file name, tablespace, and > database, and I think that is a win for simplicity. Of course, if it is > insecure we will not do it. > > > As LSN + Page Number combo is unique for all relations (not just one relation) > I think we're good for pages. Yes, since a single LSN can only be used for a single relation, and I added an Assert to check that. Good. > I am thinking for the heap/index IV, it would be: > > uint64 lsn; > unint32 page number; > /* only uses 11 bits for a zero-based CTR counter for 32k pages */ > uint32 counter; > > > Looks good. > > > and for WAL it would be: > > uint64 segment_number; > uint32 counter; > /* guarantees this IV doesn't match any relation IV */ > uint32 2^32-1 /* all 1's */ > > > I need to read up more on the structure of the WAL records but here's some > high > level thoughts: > > WAL encryption should not use the same key as page encryption so there's no > need to design the IV to try to avoid matching the page IVs. Even a basic > derivation with a single fixed WDEK = HKDF(MDEK, "WAL") and TDEK = HKDF(MDEK, > "PAGE") would ensure separate keys. That's the the literal string "WAL" or > "PAGE" being added as a salt to generate the respective keys, all that matters > is they're different. I was thinking the WAL would use the same key since the nonce is unique between the two. What value is there in using a different key? > Ideally WAL encryption would generating new derived keys as part of the WAL > stream. The WAL stream is not fixed so you have the luxury of being able to > add > a "Use new random salt XZY going forward" records. Forcing generation of a new > salt/key upon promotion of a replica would ensure that at least the WAL is > unique going forward. Could also generate a new upon server startup, after Ah, yes, good point, and using a derived key would make that easier. The tricky part is what to use to create the new derived key, unless we generate a random number and store that somewhere in the data directory, but that might lead to fragility, so I am worried. We have pg_rewind, which allows to make the WAL go backwards. What is the value in doing this? > every N bytes, or a new one for each new WAL file. There's much more > flexibility compared to page encryption. > > As WAL is a single continuous stream, we can start the IV for each derived WAL > key from zero. There's no need to complicate it further as Key + IV will never > be reused. Uh, you want a new random key for each WAL file? I was going to use the WAL segment number as the nonce, which is always increasing, and easily determined. The file is 16MB. > If WAL is always written as full pages we need to ensure that the empty parts > of the page are actual zeros and not "encrypted zeroes". Otherwise an XOR of > the empty section of the first write of a page against a subsequent one would > give you the plain text. Right, I think we need the segment number as part of the nonce for WAL. > The non-fixed size of the WAL allows for the addition of a MAC though I'm not > sure yet the best way to incorporate it. It could be part of each encrypted > record or its own summary record (providing a MAC for a series of WAL > records). > After I've gone through this a bit more I'm looking to put together a write up > with this and some other thoughts in one place. I don't think we want to add a MAC at this point since the MAC for 8k pages seems unattainable. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +