On Wed, Jul 31, 2019 at 09:25:01AM -0400, Sehrope Sarkuni wrote:
> On Tue, Jul 30, 2019 at 4:48 PM Bruce Momjian <br...@momjian.us> wrote:
> 
>     I had more time to think about the complexity of adding relfilenode to
>     the IV.  Since relfilenode is only unique within a database/tablespace,
>     we would need to have pg_upgrade preserve database/tablespace oids
>     (which I assume are the same as the directory and tablespace symlinks).
>     Then, to decode a page, you would need to look up those values.  This is
>     in addition to the new complexity of CREATE DATABASE and moving files
>     between tablespaces.  I am also concerned that crash recovery operations
>     and cluster forensics and repair would need to also deal with this.
> 
>     I am not even clear if pg_upgrade preserving relfilenode is possible ---
>     when we wrap the relfilenode counter, does it start at 1 or at the
>     first-user-relation-oid?  If the former, it could conflict with oids
>     assigned to new system tables in later major releases.  Tying the
>     preservation of relations to two restrictions seems risky.
> 
> 
> Agreed. Unless you know for sure the input is going to immutable across copies
> or upgrades, including anything in either the IV or key derivation gets risky
> and could tie you down for the future. That's partly why I like the idea
> separate salt (basically you directly pay for the complexity by tracking 
> that).

Yes, fragility is something to be concerned about.  The system is
already very complex, and we occasionally have to do forensic work or
repairs.

> Even if we do not include a separate per-relation salt or things like
> relfilenode when generating a derived key, we can still include other types of
> immutable attributes. For example the fork type could be included to 
> eventually
> allow multiple forks for the same relation to be encrypted with the same IV =
> LSN + Page Number as the derived key per-fork would be distinct.

Yes, the fork number could be useful in this case.  I was thinking we
would just leave the extra bits as zeros and we can then set it to '1'
or something else for a different fork.

> 
>     Using just the page LSN and page number allows a page to be be
>     decrypted/encrypted independently of its file name, tablespace, and
>     database, and I think that is a win for simplicity.  Of course, if it is
>     insecure we will not do it.
> 
> 
> As LSN + Page Number combo is unique for all relations (not just one relation)
> I think we're good for pages.

Yes, since a single LSN can only be used for a single relation, and I
added an Assert to check that.  Good.

>     I am thinking for the heap/index IV, it would be:
> 
>             uint64 lsn;
>             unint32 page number;
>             /* only uses 11 bits for a zero-based CTR counter for 32k pages */
>             uint32 counter;
> 
> 
> Looks good. 
>  
> 
>     and for WAL it would be:
> 
>             uint64 segment_number;
>             uint32    counter;
>             /* guarantees this IV doesn't match any relation IV */
>             uint32   2^32-1 /* all 1's */   
> 
> 
> I need to read up more on the structure of the WAL records but here's some 
> high
> level thoughts:
> 
> WAL encryption should not use the same key as page encryption so there's no
> need to design the IV to try to avoid matching the page IVs. Even a basic
> derivation with a single fixed WDEK = HKDF(MDEK, "WAL") and TDEK = HKDF(MDEK,
> "PAGE") would ensure separate keys. That's the the literal string "WAL" or
> "PAGE" being added as a salt to generate the respective keys, all that matters
> is they're different.

I was thinking the WAL would use the same key since the nonce is unique
between the two.  What value is there in using a different key?

> Ideally WAL encryption would generating new derived keys as part of the WAL
> stream. The WAL stream is not fixed so you have the luxury of being able to 
> add
> a "Use new random salt XZY going forward" records. Forcing generation of a new
> salt/key upon promotion of a replica would ensure that at least the WAL is
> unique going forward. Could also generate a new upon server startup, after

Ah, yes, good point, and using a derived key would make that easier. 
The tricky part is what to use to create the new derived key, unless we
generate a random number and store that somewhere in the data directory,
but that might lead to fragility, so I am worried.  We have pg_rewind,
which allows to make the WAL go backwards.  What is the value in doing
this?

> every N bytes, or a new one for each new WAL file. There's much more
> flexibility compared to page encryption.
> 
> As WAL is a single continuous stream, we can start the IV for each derived WAL
> key from zero. There's no need to complicate it further as Key + IV will never
> be reused.

Uh, you want a new random key for each WAL file?  I was going to use the
WAL segment number as the nonce, which is always increasing, and easily
determined.  The file is 16MB.

> If WAL is always written as full pages we need to ensure that the empty parts
> of the page are actual zeros and not "encrypted zeroes". Otherwise an XOR of
> the empty section of the first write of a page against a subsequent one would
> give you the plain text.

Right, I think we need the segment number as part of the nonce for WAL.

> The non-fixed size of the WAL allows for the addition of a MAC though I'm not
> sure yet the best way to incorporate it. It could be part of each encrypted
> record or its own summary record (providing a MAC for a series of WAL 
> records).
> After I've gone through this a bit more I'm looking to put together a write up
> with this and some other thoughts in one place.

I don't think we want to add a MAC at this point since the MAC for 8k
pages seems unattainable.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Reply via email to