On Mon, Jan 18, 2021 at 04:38:47PM -0500, Robert Haas wrote: > To me, it wouldn't make sense to commit a full README for a TDE > feature that we don't have yet with a key management patch, but the > way that they'll interact with each other has to be clear. The > doc/database-encryption.sgml file that Bruce included in the patch is > a decent start on explaining the design, though I think it needs more > work and more details, perhaps including some of the things Andres > mentioned.
Sure. > To be honest, after reading over that SGML documentation a bit, I'm > somewhat skeptical about whether it really makes sense to think about > committing the key management part separately. It seems to have no use > independent of the main feature, and it in fact embeds very specific For usefulness, it does enable passphrase prompting for the TLS private key. > details of how the main feature is expected to work. For example, the > documentation says that key #0 will be used for data files, and key #1 > for WAL. There seems to be no suggestion that the key management > portion of this can be used to manage encryption keys generally for > whatever purposes someone might have; it's all about the needs of a > particular TDE implementation. Typically, we would not commit We originally were going to have SQL-level keys, but many felt they weren't useful. > something like that separately, or only once the main patch was done, > with the two commits occurring in a relatively short time period. > Otherwise, as Bruce already noted, we can end up with something that > is documented and visible to users but doesn't actually work yet. Yep, that is the risk. > Some more specific comments on data-encryption.sgml: > > * The documentation explains that the purpose of having a WAL key > separate from the data file key is so that the data file keys can > "eventually" be rotated. It's not clear whether this means that we > might eventually have that feature or that we might eventually be able > to rotate, after failing over. If this kind of thing is possible, > we'll eventually need documentation on how to do it. I have clarified that saying "future release". > * The reasons for use a DEK and a KEK are not explained. I realize > it's not an uncommon practice and that other systems do it, but I > think a few sentences of explanation wouldn't be a bad idea. Even if > we are supposing that hackers who want to have input into this feature > have to be knowledgeable about cryptography, I don't think we can > reasonably suppose that for users. I added a little about that in the docs. > * "For example" is at one point followed by a period rather than a > colon or comma. Fixed. > * In the "Internals" subsection, the last sentence doesn't seem to be > grammatical. I wonder if it's missing the word "or"'. Fixed. > * The part about integrity-checking keys on startup isn't clear. It > makes it sound like we still have a copy of the KEK lying around > someplace against which we can compare, which I assume is not the case > since it would be really insecure. I rewored that entire section. See if it is better now. > * I think it's going to be pretty important that we can easily switch > to other cryptographic algorithms as they are discovered, so I don't > like the fact that this is tied specifically to AES. (In fact, > kmgr_utils.h makes it sound like we're specifically locked into > AES256, but that contradicts the documentation, so I guess there's > some clarification needed here about what exactly KMGR_CLUSTER_KEY_LEN > is doing.) As far as possible we should try to make this generic, like > supporting any cipher that SSL has which has property X. It seems > relatively inevitable that every currently popular cryptographic > algorithm will at some point in the future be judged weak and > worthless, just as has already happened with MD5 and some variants of > SHA, both of which used to be considered state of the art. It seems > equally inevitable that new and stronger algorithms will continued to > be devised, and we'll want to adopt those easily. That is a nifty idea. Right now I just pass the integer length around, and store it in pg_control, but if we define macros, we can easily abstract this and easily allow for new methods. If others like that, I will start on it now. > I'm not sure to what extent this a serious flaw in the patch and to > what extent it's just a matter of tweaking the wording of some things, > but I think this is actually an extremely critical design point where > we had better be certain we've got it right. Few things would be > sadder than to get a TDE feature and then have to rip it out again > because it couldn't be upgraded to work with newer crypto algorithms > with reasonable effort. Yep. > Notes on other parts of the documentation: > > * The documentation for initdb -K doesn't list the valid values of the > parameter, only the default. Probably we should be specifying an Fixed. > algorithm here and not just a bit count. Otherwise, like I say above, > what happens when AES gives way to something else? It'd be easy to say > -K BFT256 instead of -K AES256, but if AES is assumed and it's no > longer what we want them we have problems. This kind of thing probably > needs to be cleaned up in a bunch of places. Again, I can do that if people like it. > * I don't see the point of saying "a passphrase or PIN." We don't need > to document that your passphrase might happen to only contain digits. Well, PIN is what the Yubikey and PIV devices call it, so I thought I should give specific examples of inputs. > * pg_alterckey's description of "repair" is hard to understand. It > doesn't really explain why or how this would be necessary, and it begs > the question of why we'd ever leave things in a state that requires > repair. This is sketched out in code comments elsewhere, but I think > at least some of it needs to be explained in the documentation as > well. (Incidentally, I don't think the comments at the top of > recover_failure will survive a visit from pgindent, though I might be > wrong about that.) Fixed with rewording. Better? > * The changes to config.sgml say "Sample script" instead of "Sample scripts". Fixed. > * I don't think that the documentation of %R is very clear, or > adequate for someone to make effective use of it. If I wanted to use > %R, how would I ensure that a value is available? Fixed, use -R on server start. > * The changes to allfiles.sgml add pg_alterckey.sgml in the wrong > place and include an incorrect whitespace change. Uh, the whitespace change was to align the column. I will review and push that separately. > * It's odd that "pg_alterckey" describes itself as "technically" > changing the KEK. Isn't that just what it does, not a technicality? I > imagine we'll ultimately need a way to change a DEK as well, because > otherwise the use of a separate key for the WAL wouldn't accomplish > the intended goal. "technically" removed. I kind of wanted to say "in detail" or something like that, but removing the word is fine. Change-only patch attached so you can see the changes more easily. -- Bruce Momjian <br...@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 8f62e884b1..e6e4f346f1 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1460,20 +1460,21 @@ include_dir 'conf.d' <para> The command must print the passphrase to the standard output and exit with code 0. It can prompt from the terminal if - <option>--authprompt</option> is used. In the parameter value, - <literal>%R</literal> represents the file descriptor number opened - to the terminal that started the server. A file descriptor is only - available if enabled at server start. If <literal>%R</literal> - is used and no file descriptor is available, the server will not - start. Value <literal>%p</literal> is replaced by a pre-defined - prompt string. (Write <literal>%%</literal> for a literal - <literal>%</literal>.) Note that the prompt string will probably - contain whitespace, so be sure to quote its use adequately. - Newlines are stripped from the end of the output if present. + <option>--authprompt</option> is used. In the parameter + value, <literal>%R</literal> is replaced by a file descriptor + number opened to the terminal that started the server. A file + descriptor is only available if enabled at server start via + <option>-R</option>. If <literal>%R</literal> is specified and + no file descriptor is available, the server will not start. Value + <literal>%p</literal> is replaced by a pre-defined prompt string. + (Write <literal>%%</literal> for a literal <literal>%</literal>.) + Note that the prompt string will probably contain whitespace, + so be sure to quote its use adequately. Newlines are stripped + from the end of the output if present. </para> <para> - Sample script can be found in + Sample scripts can be found in <filename>$SHAREDIR/auth_commands</filename>, where <literal>$SHAREDIR</literal> means the <productname>PostgreSQL</productname> installation's shared-data @@ -7874,15 +7875,16 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; server start. </para> <para> - The command must print the cluster key to the standard output as - 64 hexadecimal characters, and exit with code 0. The command - can prompt for the passphrase or PIN from the terminal if - <option>--authprompt</option> is used. In the parameter value, - <literal>%R</literal> represents the file descriptor number opened - to the terminal that started the server. A file descriptor is only - available if enabled at server start. If <literal>%R</literal> - is used and no file descriptor is available, the server will not - start. Value <literal>%p</literal> is replaced by a pre-defined + The command must print the cluster key to the standard + output as 64 hexadecimal characters, and exit with code 0. + The command can prompt for the passphrase or PIN from the + terminal if <option>--authprompt</option> is used. In the + parameter value, <literal>%R</literal> is replaced by a file + descriptor number opened to the terminal that started the server. + A file descriptor is only available if enabled at server start + via <option>-R</option>. If <literal>%R</literal> is specified + and no file descriptor is available, the server will not start. + Value <literal>%p</literal> is replaced by a pre-defined prompt string. Value <literal>%d</literal> is replaced by the directory containing the keys; this is useful if the command must create files with the keys, e.g., to store a cluster-level diff --git a/doc/src/sgml/database-encryption.sgml b/doc/src/sgml/database-encryption.sgml index cb62f1d975..8eece94322 100644 --- a/doc/src/sgml/database-encryption.sgml +++ b/doc/src/sgml/database-encryption.sgml @@ -24,8 +24,8 @@ Key one is used to encrypt write-ahead log (WAL) files. Two different keys are used so that primary and standby servers can use different zero (heap/index/temp) keys, but the same one (WAL) key, so that these keys - can eventually be rotated by switching the primary to the standby - and then changing the WAL key. + can (in a future release) be rotated by switching the primary to the + standby and then changing the WAL key. </para> <para> @@ -36,7 +36,10 @@ each time the server is started. This key prevents anyone with access to the database directories from decrypting the data because they do not know the second-level key which encrypted the first-level keys - which encrypted the database cluster files. + which encrypted the database cluster files. This key can be easily + changed via <command>pg_alterckey</command> without requiring any + changes to the the data files or <command>WAL</command> files, which + are encrypted with the data keys. </para> <sect1 id="encryption-file-encryption"> @@ -54,7 +57,7 @@ <filename>postgresql.conf</filename> must match for the database cluster to start. Note that the cluster key command passed to <command>initdb</command> must return a key of - 64 hexadecimal characters. For example. + 64 hexadecimal characters. For example: <programlisting> initdb -D dbname --cluster-key-command='ckey_passphrase.sh' </programlisting> @@ -71,7 +74,8 @@ initdb -D dbname --cluster-key-command='ckey_passphrase.sh' the key encryption key (KEK) supplied by the cluster key command before being stored in the database directory. The key or passphrase that derives the key must be supplied from the terminal or stored in a - trusted key store, such as key vault software, hardware security module. + trusted key store, such as key vault software or a hardware security + module. </para> <para> @@ -81,17 +85,22 @@ initdb -D dbname --cluster-key-command='ckey_passphrase.sh' <varname>cluster_key_command</varname> command will be executed and the cluster key retrieved. The data encryption keys in the <filename>pg_cryptokeys</filename> directory will then be decrypted - using the supplied key and integrity-checked to ensure it - matches the initdb-supplied key. If this check fails, the - server will refuse to start. + using the supplied key and integrity-checked to ensure it matches the + initdb-supplied key. (If this check fails, the server will refuse + to start.) The cluster encryption key will then be removed from + system memory. The decrypted data encryption keys will remain in + shared memory until the server is stopped. </para> <para> - The data encryption keys are randomly generated and can be 128, 192, - or 256-bits in length. They are encrypted by the key encryption key - (KEK) using Advanced Encryption Standard (<acronym>AES256</acronym>) - encryption in Key Wrap Padded Mode, which also provides KEK - authentication. + The data encryption keys are randomly generated and can be + 128, 192, or 256-bits in length. They are encrypted by the + key encryption key (KEK) using Advanced Encryption Standard + (<acronym>AES256</acronym>) encryption in Key Wrap Padded + Mode, which also provides KEK authentication; see <ulink + url="https://tools.ietf.org/html/rfc5649">RFC 5649</ulink>. While + 128-bit encryption is sufficient for most sites, 256-bit encryption + is thought to be more immune to future quantum cryptographic attacks. </para> </sect1> </chapter> diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 4e9504e11b..92bad3150c 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -189,7 +189,6 @@ Complete list of usable sgml source files in this directory. <!ENTITY values SYSTEM "values.sgml"> <!-- applications and utilities --> -<!ENTITY pgalterckey SYSTEM "pg_alterckey.sgml"> <!ENTITY clusterdb SYSTEM "clusterdb.sgml"> <!ENTITY createdb SYSTEM "createdb.sgml"> <!ENTITY createuser SYSTEM "createuser.sgml"> @@ -197,6 +196,7 @@ Complete list of usable sgml source files in this directory. <!ENTITY dropuser SYSTEM "dropuser.sgml"> <!ENTITY ecpgRef SYSTEM "ecpg-ref.sgml"> <!ENTITY initdb SYSTEM "initdb.sgml"> +<!ENTITY pgalterckey SYSTEM "pg_alterckey.sgml"> <!ENTITY pgarchivecleanup SYSTEM "pgarchivecleanup.sgml"> <!ENTITY pgBasebackup SYSTEM "pg_basebackup.sgml"> <!ENTITY pgbench SYSTEM "pgbench.sgml"> diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml index 25342d1c19..4596489146 100644 --- a/doc/src/sgml/ref/initdb.sgml +++ b/doc/src/sgml/ref/initdb.sgml @@ -242,7 +242,7 @@ PostgreSQL documentation <listitem> <para> Specifies the number of bits for the file encryption keys. The - default is 128 bits. + value values are 128 (the default), 192, and 256. </para> </listitem> </varlistentry> diff --git a/doc/src/sgml/ref/pg_alterckey.sgml b/doc/src/sgml/ref/pg_alterckey.sgml index f80946dcc6..867c439b8e 100644 --- a/doc/src/sgml/ref/pg_alterckey.sgml +++ b/doc/src/sgml/ref/pg_alterckey.sgml @@ -78,14 +78,14 @@ PostgreSQL documentation </para> <para> - Technically, <command>pg_alterckey</command> changes the key - encryption key (<acronym>KEK</acronym>) which encrypts the data - encryption keys; it does not change the data encryption keys. It does - this by decrypting each data encryption key using the <replaceable + <command>pg_alterckey</command> changes the key encryption key + (<acronym>KEK</acronym>) which encrypts the data encryption keys; + it does not change the data encryption keys. It does this by + decrypting each data encryption key using the <replaceable class="parameter">old_cluster_key_command</replaceable>, re-encrypting it using the <replaceable - class="parameter">new_cluster_key_command</replaceable>, and - then writes the result back to the cluster directory. + class="parameter">new_cluster_key_command</replaceable>, and then + writes the result back to the cluster directory. </para> <para> @@ -95,13 +95,26 @@ PostgreSQL documentation arguments to specify retrieval of the old or new key. </para> + <para> + <command>pg_alterckey</command> manages data encryption keys, + which are critical to allowing Postgres to access its decrypted + data. For this reason, it is very careful to preserve these + keys in most possible failure conditions, e.g., operating system + failure during cluster encryption key rotation. + </para> + <para> When started, <command>pg_alterckey</command> repairs any files that - remain from previous <command>pg_alterckey</command> failures before - altering the cluster key. To perform only the repair task, - use the <option>--repair</option> option. The server will not start - if repair is needed, though a running server is unaffected by an - unrepaired cluster key configuration. + remain from previous failures before altering the cluster encryption + key. During this repair phase, <command>pg_alterckey</command> will + either roll back the cluster key or roll forward the changes that + were previously requested. The server will not start if repair is + needed, though a running server is unaffected by an unrepaired cluster + key configuration. Therefore, if <command>pg_alterckey</command> + fails for any reason, it is recommended you run the command with + <option>--repair</option> to simply roll back or forward any previous + changes. This will report if it rolled the cluster key back or forward, + and then run the command again to change the cluster key if needed. </para> <para> diff --git a/src/bin/pg_alterckey/pg_alterckey.c b/src/bin/pg_alterckey/pg_alterckey.c index 6a809037b8..50e168306a 100644 --- a/src/bin/pg_alterckey/pg_alterckey.c +++ b/src/bin/pg_alterckey/pg_alterckey.c @@ -325,7 +325,7 @@ create_lockfile(void) if (repair_mode) printf("old lock file removed\n"); - /* + /* ---------- * pid is no longer running, so remove the lock file. * This is not 100% safe from concurrent access, e.g.: * @@ -343,6 +343,7 @@ create_lockfile(void) * even more error-prone, especially since this might happen * on server start. Many PG tools seem to have problems with * concurrent access. + * ---------- */ unlink(pid_path); @@ -380,6 +381,7 @@ create_lockfile(void) } /* + * ---------- * recover_failure * * A previous pg_alterckey might have failed, so it might need recovery. @@ -406,6 +408,7 @@ create_lockfile(void) * remove old and new X X X * * We don't handle the abnormal cases, just report an error. + * ---------- */ static void recover_failure(void)