subject:"Internal key management system"

On Wed, Oct 28, 2020 at 05:16:32PM +0800, Craig Ringer wrote:
> On Wed, Oct 28, 2020 at 12:02 PM Craig Ringer 
> wrote:
> 
> On Wed, Oct 28, 2020 at 9:43 AM Bruce Momjian  wrote:
> >
> 
> I don't know much about how to hook into that stuff so if you have an
> idea, I am all ears.
> 
> 
> Yeah, I have a reasonable idea. The main thing will be to re-read the 
> patch
> and put it into more concrete terms, which I'll try to find time for soon.
> I need to find time to craft a proper demo that uses a virtual hsm, and 
> can
> also demonstrate how to use the host TPM or a Yubikey using the simple
> openssl engine interfaces or a URI.
> 
> 
> Do you have this in a public git tree anywhere? If not please consider using
> "git format-patch -v 1 -1" or similar to generate it, so I can "git am" the
> patch.

I have made a github branch, and will keep it updated:

https://github.com/bmomjian/postgres/tree/key

I am also attaching an updated patch.

> A few comments on the patch as I read through. Some will be addressed by the
> quick PoC I'm preparing for pluggable key derivation, some won't. In no
> particular order:
> 
> * The term KEK only appears in the docs; where it appears in the sources it's
> lower case. I suggest making "KEK" grep-able in the sources.

Fixed.

> * BootStrapKmgr() says to call once on "system install" . I suggest "initdb".

Done.

> * The jumble of #ifdef FRONTEND in src/common/kmgr_utils.c shouldn't remain in
> the final patch if possible. This may require some "static inline" wrappers or
> helpers.

I can do this if you give me more details.

> * PgKeyWrapCtx.key is unused and should probably be deleted

Removed.

> * HMAC support should go into PgCipherCtx so that we can have a single
> PgCipherCtx that supports cipher ops, cipher+HMAC ops, or just HMAC ops
> * PgKeyWrapCtx.cipherctx should then be supplemented with a hmacctx. It should
> be legal to set cipherctx and hmacctx to the same value, since in some cases
> the it won't be easy to initialize the backing implementation separately for
> key and HMAC.

Sorry, I don't know how to do the above items.

> FORK?
> 
> 
> One possible problem with this is that we should not assume we can perform KEK
> operations in postmaster children, since there's no guarantee we can use
> whatever sits behind a PgCipherCtx after fork(). But AFAICS the KEK doesn't
> live beyond the various kmgr_ operations as it is, so there's no reason it
> should ever have to be carried over a fork() anyway.

Yes, I think so.

> * the kmgr appears to need to be able to work in frontend code (?)
> * for key rotation we need to be able to change KEKs, and possibly KEK
> acquisition methods, at runtime

We might need to change the KEK using a command-line tool so we can more
easily prompt for the new KEK.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip

Re: Internal key management system

On Wed, Oct 28, 2020 at 02:29:16PM -0400, Bruce Momjian wrote:
> On Wed, Oct 28, 2020 at 12:02:46PM +0800, Craig Ringer wrote:
> > Yes, that's possible. But in that case the passphrase will be asked for by
> > openssl only when required, and we'll need to supply an openssl askpass 
> > hook.
> 
> What we _will_ need is access to a /dev/tty file descriptor, and this
> patch does that, though it closes it as soon as the internal keys are
> unlocked so the terminal can be disconnected from the database
> processes.

FYI, the file descriptor facility will eventually allow for SSL
certificate unlocking passwords to be prompted from the terminal,
instead of requiring the use of ssl_passphrase_command, but let's get
the facility fully completed first.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

On Wed, Oct 28, 2020 at 12:02:46PM +0800, Craig Ringer wrote:
> On Wed, Oct 28, 2020 at 9:43 AM Bruce Momjian  wrote:
>  I have used OpenSSL with Yubikey via pksc11.  You
> can see the use of it on slide 57 and following:
> 
>         https://momjian.us/main/writings/crypto_hw_config.pdf#page=57
> 
> Interestingly, that still needed the user to type in a key to unlock the
> Yubikey, so we might need PKCS11 and a password for the same server
> start.
> 
> Yes, that's possible. But in that case the passphrase will be asked for by
> openssl only when required, and we'll need to supply an openssl askpass hook.

What we _will_ need is access to a /dev/tty file descriptor, and this
patch does that, though it closes it as soon as the internal keys are
unlocked so the terminal can be disconnected from the database
processes.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-10-28 Thread Stephen Frost

Greetings,

* Craig Ringer (craig.rin...@enterprisedb.com) wrote:
> On Mon, Oct 26, 2020 at 11:02 PM Stephen Frost  wrote:
> 
> TL;DR:
> 
> * Important to check that key rotation is possible on a replica, i.e.
> primary and standby can have different cluster passphrase and KEK
> encrypting the same WAL and heap keys;

I agree that key rotation would certainly be good to have.

> * with a HSM we can't read the key out, so a pluggable KEK operations
> context or a configurable URI for the KEK is necessary

There's a lot of options around HSMs, the Linux crypto API, potential
different encryption libraries, et al.  One thing that I'm not sure
we're being clear enough on here is when we're talking about a KEK (key
encryption key) vs. when we're talking about actually off-loading all of
the encryption to an HSM or to an OpenSSL engine (which might in turn
use the Linux crypto API...), etc.

Agreed that, with some HSMs, we aren't able to actually pull out the
key.  Depending on the HSM, it may or may not be able to perform
encryption and decryption with any kind of speed and therefore we should
have options which don't require that.  This would be the typical case
where we'd have a KEK which encrypts a key we have stored and then that
key is what's actually used for the encryption/decryption of the data.

> * I want the SQL key and SQL wrap/unwrap part in a separate patch, I
> don't think it's fully baked and oppose its inclusion in its current
> form

I'm generally a fan of having something at the SQL level, but I agree
that it doesn't need to be part of this initial capability and could be
done later as a separate patch.

> Most importantly - I don't think the SQL key adds anything really
> crucial that we cannot do at the SQL level with an extension.  An
> extension "pg_wrap" could provide pg_wrap() and pg_unwrap() already,
> using a single master key much like the SQL key proposed in this
> patch. To store the master key it could:

Lots of things can be done in extensions but, at least for my part, I'd
much rather see us build in an SQL key capability (with things like
grammar support and being able to tie to to a role cleanly) than to try
and figure out how to make this work as an extension.

> That way we haven't baked some sort of limited wrap/unwrap into Pg's
> long term user visible API. I'd be totally happy for such a SQL key
> wrap/unwrap to become part of pgcrypto, or a separate extension that
> uses pgcrypto, if you're worried about having it available to users. I
> just don't really want it in src/backend in its current form.

There's no shortage of interfaces that exist in other database systems
for this that we can look at to help guide us in coming up with a good
API here.  All that said, we can debate that on another thread and
independently of this discussion around TDE.

> OTHER TRANSPARENT ENCRYPTION USE CASES
> 
> 
> Does this patch get in the way of supporting other kinds of
> transparent encryption that are frequently requested and are in use on
> other systems already?
> 
> I don't think so. Whole-cluster encryption is quite separate and the
> proposed patch doesn't seem to do anything that'd make table-, row- or
> column-level encryption, per-user key management, etc any harder.
> 
> Specific use cases I looked at:
> 
> * Finer grained keying than whole-cluster for transparent
> encryption-at-rest. As soon as we have relations that require user
> session supplied information to allow the backend to read the relation
> we get into a real mess with autovacuum, logical decoding, etc. So if
> anyone wants to implement that sorts of thing they're probably going
> to want to do so separately to block-level whole-cluster encryption,
> in a way that preserves the normal page and page item structure and
> encrypts the row data only.

I tend to agree with this.

> * Client-driver-assisted transparently encrypted
> at-rest-and-in-transit data, where the database engine doesn't have
> the encrypt/decrypt keys at all. Again in this case they're going to
> have to do that at the row level or column level, not the block
> (relfilenode extents and WAL) level, otherwise we can't provide
> autovacuum etc.

+100 to having client-driver-assisted encryption, this solves real
attack vectors which traditional TDE simply doesn't, compared to
filesystem or block device level encryption (even though lots of
people seem to think it does, which is bizarre to me).

> > That
> > said- I don't think we necessarily want to throw out tho command-based
> > option, as users may wish to use a vaulting solution or similar instead
> > of an HSM.
> 
> I agree. I wasn't proposing to throw out the command based approach,
> just provide a way to inform postgres that it should do operations
> with the KEK using an external engine instead of deriving its own KEK
> from a passphrase and other inputs.

I would think we'd want to enable admins to be able to control if what
is being provided is a KEK (where the key is then

Re: Internal key management system

On Wed, Oct 28, 2020 at 09:24:35PM +0900, Masahiko Sawada wrote:
> On Tue, 27 Oct 2020 at 20:34, Bruce Momjian  wrote:
> > You need to use separate keys for heap/index and WAL so you can
> > replicate to another server that uses a different heap/index key, but
> > the same WAL.  You can then fail-over to the replica and change the WAL
> > key to complete full key rotation.  The replication protocol needs to
> > decrypt, and the receiving end has to encrypt with a different
> > heap/index key.  This is the key rotation method this is planned.  This
> > is another good reason the keys should be in a separate directory so
> > they can be easily copied or replaced.
> 
> I think it's better we decrypt WAL in the xlogreader layer, instead of
> doing in replication protocol. That way, we also can support frontend
> tools that need to read WAL such as pg_waldump and pg_rewind as well
> as logical decoding.

Sure.  I was just saying the heap/index files coming from the primary
should have decrypted heap/index blocks, but I was not sure what level
it should happen.  If the data coming out the primary is encrypted, you
would need the old (to decrypt) and new (to encrypt) keys on the
standby, which seems too complex.

To clarify, the data and heap/index pages in the WAL are only encrypted
with the WAL key, but when pg_basebackup is streaming the files from
PGDATA, it shouldn't be encrypted, or encrypted only with the WAL key,
at the time of transfer since the receiver should be re-encrypting it. 
If that will not work, we should know now.

> > > I want to take a closer look at how the current implementation will
> > > play with physical replication. I assume the WAL and heap keys have to
> > > be constant for the full cluster lifetime, short of a dump and reload.
> >
> > The WAL key can change if you are willing to stop/start the server.  You
> > only read the WAL during crash recovery.
> 
> We might need to consider having multiple key generations, rather than
> in-place rotation. If we simply update the WAL key in-place in the
> primary, archive WALs restored via restore_command cannot be decrypted
> in the replica. We might need to do generation management for WAL key
> and provide the functionality to purge old WAL keys.

Since we have the keys stored in the file system, I think we will use a
command-line tool that can access both old and new keys and re-encrypted
the archived WAL.  I think old/new keys inside the server is too
complex.

> > The idea of the SQL key was to give the boot key a use, but I am now
> > seeing that the SQL key is just holding us back, and is preventing the
> > boot testing that is a requirement.  Maybe we just need to forget the
> > SQL part and focus on the TDE usage now, and come back to the SQL part.
> > I am also not 100% clear on the usefulness of the SQL key.
> 
> I agree to focus on the TDE usage now.

I admit the SQL key idea was mine, and I now see it was a bad idea since
it just adds confusion and doesn't add value.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-10-28 Thread Masahiko Sawada

On Tue, 27 Oct 2020 at 20:34, Bruce Momjian  wrote:
>
> On Tue, Oct 27, 2020 at 03:07:22PM +0800, Craig Ringer wrote:
> > On Mon, Oct 26, 2020 at 11:02 PM Stephen Frost  wrote:
> >
> >
> > TL;DR:
> >
> > * Important to check that key rotation is possible on a replica, i.e.
> > primary and standby can have different cluster passphrase and KEK
> > encrypting the same WAL and heap keys;
> > * with a HSM we can't read the key out, so a pluggable KEK operations
> > context or a configurable URI for the KEK is necessary
> > * I want the SQL key and SQL wrap/unwrap part in a separate patch, I
> > don't think it's fully baked and oppose its inclusion in its current
> > form
> > * Otherwise this looks good so far
> >
> > Explanation and argument for why below.
> >
> > > I've not been following very closely, but I definitely agree with the
> > > general feedback here (more on that below), but to this point- I do
> > > believe that was the intent, or at least I sure hope that it was.  Being
> > > able to have user/role keys would certainly be good.  Having a way for a
> > > user to log in and unlock their key would also be really nice.
> >
> > Right. AFAICS this is supposed to provide the foundation layer for
> > whole-cluster encryption, and it looks ok for that, caveat about HSMs
> > aside. I see nothing wrong with using a single key for heap (and one
> > for WAL, or even the same key). Finer grained and autovacuum etc
> > becomes seriously painful.
>
> You need to use separate keys for heap/index and WAL so you can
> replicate to another server that uses a different heap/index key, but
> the same WAL.  You can then fail-over to the replica and change the WAL
> key to complete full key rotation.  The replication protocol needs to
> decrypt, and the receiving end has to encrypt with a different
> heap/index key.  This is the key rotation method this is planned.  This
> is another good reason the keys should be in a separate directory so
> they can be easily copied or replaced.

I think it's better we decrypt WAL in the xlogreader layer, instead of
doing in replication protocol. That way, we also can support frontend
tools that need to read WAL such as pg_waldump and pg_rewind as well
as logical decoding.

>
> > I want to take a closer look at how the current implementation will
> > play with physical replication. I assume the WAL and heap keys have to
> > be constant for the full cluster lifetime, short of a dump and reload.
>
> The WAL key can change if you are willing to stop/start the server.  You
> only read the WAL during crash recovery.

We might need to consider having multiple key generations, rather than
in-place rotation. If we simply update the WAL key in-place in the
primary, archive WALs restored via restore_command cannot be decrypted
in the replica. We might need to do generation management for WAL key
and provide the functionality to purge old WAL keys.

> >
> > SQL KEY
> > 
> >
> > I'm not against the SQL key and wrap/unwrap functionality - quite the
> > contrary, I think it's really important to have something like it. But
> > is it appropriate to have a single, fixed-for-cluster-lifetime key for
> > this, one with no SQL-level access control over who can or cannot use
> > it, etc? The material encrypted with the key is user-exposed so key
> > rotation is an issue, but is not addressed here. And the interface
> > doesn't really solve the numerous potential problems with key material
> > leaks through logs and error messages.
> >
> > I just think that if we bake in the proposed user visible wrap/unwrap
> > interface now we're going to regret it later. How will it work when we
> > want to add user- or role- level access control for database-stored
> > keys? When we want to introduce a C-level API for extensions to work
> > directly with encrypted data like they can work currently with TOASTed
> > data, to prevent decrypted data from ever becoming SQL function
> > arguments subject to possible leakage and to allow implementation of
> > always-encrypted data types, etc?
> >
> > Most importantly - I don't think the SQL key adds anything really
> > crucial that we cannot do at the SQL level with an extension.  An
> > extension "pg_wrap" could provide pg_wrap() and pg_unwrap() already,
> > using a single master key much like the SQL key proposed in this
> > patch. To store the master key it could:
>
> The idea of the SQL key was to give the boot key a use, but I am now
> seeing that the SQL key is just holding us back, and is preventing the
> boot testing that is a requirement.  Maybe we just need to forget the
> SQL part and focus on the TDE usage now, and come back to the SQL part.
> I am also not 100% clear on the usefulness of the SQL key.

I agree to focus on the TDE usage now.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-10-28 Thread Craig Ringer

On Wed, Oct 28, 2020 at 12:02 PM Craig Ringer 
wrote:

> On Wed, Oct 28, 2020 at 9:43 AM Bruce Momjian  wrote:
> >
>
>> I don't know much about how to hook into that stuff so if you have an
>> idea, I am all ears.
>
>
> Yeah, I have a reasonable idea. The main thing will be to re-read the
> patch and put it into more concrete terms, which I'll try to find time for
> soon. I need to find time to craft a proper demo that uses a virtual hsm,
> and can also demonstrate how to use the host TPM or a Yubikey using the
> simple openssl engine interfaces or a URI.
>

Do you have this in a public git tree anywhere? If not please consider
using "git format-patch -v 1 -1" or similar to generate it, so I can "git
am" the patch.

A few comments on the patch as I read through. Some will be addressed by
the quick PoC I'm preparing for pluggable key derivation, some won't. In no
particular order:

* The term KEK only appears in the docs; where it appears in the sources
it's lower case. I suggest making "KEK" grep-able in the sources.
* BootStrapKmgr() says to call once on "system install" . I suggest
"initdb".
* The jumble of #ifdef FRONTEND in src/common/kmgr_utils.c shouldn't remain
in the final patch if possible. This may require some "static inline"
wrappers or helpers.
* PgKeyWrapCtx.key is unused and should probably be deleted
* HMAC support should go into PgCipherCtx so that we can have a single
PgCipherCtx that supports cipher ops, cipher+HMAC ops, or just HMAC ops
* PgKeyWrapCtx.cipherctx should then be supplemented with a hmacctx. It
should be legal to set cipherctx and hmacctx to the same value, since in
some cases the it won't be easy to initialize the backing implementation
separately for key and HMAC.

The patch I've been hacking together will look like this, though I haven't
got far along with it yet:

* Give each PgKeyWrapCtx a 'key_name' field to identify it
* Extract default passphrase based KEK creation into separate function that
returns
  a new PgKeyWrapCtx for the KEK, currently called
  kmgr_create_kek_context_from_cluster_passphrase()
* BootStrapKmgr() and pg_rotate_cluster_passphrase() call
  kmgr_create_kek_context_from_cluster_passphrase()
  instead of doing their own ctx creation
* [TODO] Replace kmgr_verify_passphrase() with kmgr_verify_ctx(...)
  which takes a PgKeyWrapCtx instead of constructing its own from a
passphrase
* [TODO] In InitializeKmgr() use kmgr_verify_ctx() instead of explicit
passphrase fetch
* [TODO] Teach PgCipherCtx about HMAC operations
* [TODO] replace PgKeyWrapCtx.mackey with another PgCipherCtx
* [TODO] add PgKeyWrapCtx.teardown_cb callback to be called before free
* [TODO] add a kmgr_create_kek_context() that checks for a hook/plugin
  or other means of loading a non-default means of getting a KEK
  PgKeyWrapContext, and calls
kmgr_create_kek_context_from_cluster_passphrase()
  by default
* [TODO] replace calls to kmgr_create_kek_context_from_cluster_passphrase()
  with calls to kmgr_create_kek_context()

That should hide the details of HMAC operations and of KEK creation from
kmgr_* .

Then via a TBD configuration mechanism we'd be able to select a method of
creating the PgKeyWrapCtx for the KEK and its contained PgCipherCtx
implementations for cipher and HMAC operations, then use that without
caring about how it works internally.

The key manager no longer has to care if the KEK was created by reading a
password from a command and deriving the KEK and HMAC. Or whether it's
actually backed by an OpenSSL engine that delegates to PKCS#11. kmgr ops
can just request the KEK context and use it.

FORK?

One possible problem with this is that we should not assume we can perform
KEK operations in postmaster children, since there's no guarantee we can
use whatever sits behind a PgCipherCtx after fork(). But AFAICS the KEK
doesn't live beyond the various kmgr_ operations as it is, so there's no
reason it should ever have to be carried over a fork() anyway.

CONFIGURING SOURCE OF KEK
---

Re the configuration mechanism: the usual way Pg does things is do provide
a global foo_hook_type foo_hook. The foo() function checks for foo_hook and
calls it if it's non-null, otherwise it calls the default implementation in
standard_foo(). A hook may choose to override standard_foo() completely, or
take its own actions before or after. The hook is installed by an extension
loaded in shared_preload_libraries.

There are a couple of issues with using this method for kmgr:

* the kmgr appears to need to be able to work in frontend code (?)
* for key rotation we need to be able to change KEKs, and possibly KEK
acquisition methods, at runtime

so I'm inclined to handle this a bit like we do for logical decoding output
plugins instead. Use a normal Pg extension library with PG_MODULE_MAGIC,
but dlsym() a different entrypoint. Have that entrypoint populate a struct
of API function pointers. The kmgr can use that struct to request KEK
loading. If no kmgr plugin is configured, use the default

Re: Internal key management system

2020-10-27 Thread Craig Ringer

On Wed, Oct 28, 2020 at 9:43 AM Bruce Momjian  wrote:
>

> I don't know much about how to hook into that stuff so if you have an
> idea, I am all ears.

Yeah, I have a reasonable idea. The main thing will be to re-read the patch
and put it into more concrete terms, which I'll try to find time for soon.
I need to find time to craft a proper demo that uses a virtual hsm, and can
also demonstrate how to use the host TPM or a Yubikey using the simple
openssl engine interfaces or a URI.

 I have used OpenSSL with Yubikey via pksc11.  You
> can see the use of it on slide 57 and following:
>
> https://momjian.us/main/writings/crypto_hw_config.pdf#page=57
>
> Interestingly, that still needed the user to type in a key to unlock the
> Yubikey, so we might need PKCS11 and a password for the same server
> start.
>

Yes, that's possible. But in that case the passphrase will be asked for by
openssl only when required, and we'll need to supply an openssl askpass
hook.

Re: Internal key management system

On Tue, Oct 27, 2020 at 10:20:35AM -0400, Bruce Momjian wrote:
> I don't know much about how to hook into that stuff so if you have an
> idea, I am all ears.  I have used OpenSSL with Yubikey via pksc11.  You
> can see the use of it on slide 57 and following:
> 
>   https://momjian.us/main/writings/crypto_hw_config.pdf#page=57
> 
> Interestingly, that still needed the user to type in a key to unlock the
> Yubikey, so we might need PKCS11 and a password for the same server
> start.
> 
> I would like to get this moving forward so I will work on the idea of
> passing an open /dev/tty file descriptor from pg_ctl to the postmaster
> on start.

Here is an updated patch that uses an argument to pass an open /dev/tty
file descriptor to the postmaster.  It uses -R for initdb/pg_ctl, -R ###
for postmaster/postgres, and %R for cluster_passphrase_command.  Here is
a sample session:

--> $ initdb -R --cluster-passphrase-command '/tmp/pass_fd.sh "%p" %R'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Key management system is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ...
--> Enter database encryption pass phrase: 
B1D7B405EDCD97B7351DD3B7AE0637775FFBC6A2C2EEADAEC009A75A58A79F50
ok
performing post-bootstrap initialization ...
--> Enter database encryption pass phrase: 
B1D7B405EDCD97B7351DD3B7AE0637775FFBC6A2C2EEADAEC009A75A58A79F50
ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start

$ pg_ctl stop
pg_ctl: PID file "/u/pgsql/data/postmaster.pid" does not exist
Is server running?
--> $ pg_ctl -l /u/pg/server.log -R start
waiting for server to start...
--> Enter database encryption pass phrase: 
B1D7B405EDCD97B7351DD3B7AE0637775FFBC6A2C2EEADAEC009A75A58A79F50
 done
server started

Attached is my updated patch, based on Masahiko Sawada's patch, and my
pass_fd.sh script.  

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index f043433..65422d5
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** COPY postgres_log FROM '/full/path/to/lo
*** 7793,7798 
--- 7793,7836 
  
 
  
+
+ Encryption Key Management
+ 
+ 
+  
+   cluster_passphrase_command (string)
+   
+cluster_passphrase_command configuration parameter
+   
+   
+   
+
+ This option specifies an external command to be invoked when a
+ passphrase for key management system needs to be obtained.
+
+
+ The command must print the passphrase to the standard
+ output and have a zero exit code.  In the parameter value,
+ %p is replaced by a prompt string, and
+ %R represents the file descriptor for
+ reading/writing from the terminal; if specified at server start,
+ it is blank.  Since it can be blank, it is wise for it to be the
+ last command-line argument specified.  (Write %%
+ for a literal %.)  Note that the prompt string
+ will probably contain whitespace, so be sure to quote its use
+ adequately.  A single newline is stripped from the end of the
+ output if present.  The passphrase must be at least 64 bytes.
+
+
+ This parameter can only be set in the
+ postgresql.conf file or on the server
+ command line.
+
+   
+  
+ 
+
+ 
 
  Client Connection Defaults
  
*** dynamic_library_path = 'C:\tools\postgre
*** 9636,9641 
--- 9674,9695 
 

   
+ 
+   
+   key_management_enabled (boolean)
+   
+Key management configuration

Re: Internal key management system

On Tue, Oct 27, 2020 at 10:02:53PM +0800, Craig Ringer wrote:
> On Tue, 27 Oct 2020, 19:15 Bruce Momjian,  wrote:
> We don't need to do anything except provide a way to tell OpenSSL where to get
> the KEK from, for situations where having Pg generate it internally
> undesirable. 
> 
> I proposed a simple GUC that we could supply to OpenSSL as a key path because
> it's simple. It's definitely not best.
> 
> In my prior mail I outlined what I think is a better way. Abstract key key
> initialisation -  passphrase fetching KEK/HMAC loading and all of it - behind 
> a
> pluggable interface. Looking at the patch, it's mostly there already. We just
> need a way to hook the key loading and setup so it can be overridden to use
> whatever method is required. Then KEK operations to encrypt and decrypt the
> heap and WAL keys happen via that abstraction.
> 
> That way Pg does not have to care about the details of hardware key 
> management,
> PKCS#11 or OpenSSL engines, etc.
> 
> A little thought is needed to make key rotation work well. Especially when you
> want to switch from cluster passphrase to a plugin that supports use of a HVM
> escrowed key, or vice versa.
> 
> But most of what's needed looks like it's there already. It's just down to
> making sure the key loading and initialisation is overrideable.

I don't know much about how to hook into that stuff so if you have an
idea, I am all ears.  I have used OpenSSL with Yubikey via pksc11.  You
can see the use of it on slide 57 and following:

https://momjian.us/main/writings/crypto_hw_config.pdf#page=57

Interestingly, that still needed the user to type in a key to unlock the
Yubikey, so we might need PKCS11 and a password for the same server
start.

I would like to get this moving forward so I will work on the idea of
passing an open /dev/tty file descriptor from pg_ctl to the postmaster
on start.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-10-27 Thread Craig Ringer

On Tue, 27 Oct 2020, 19:15 Bruce Momjian,  wrote:

> We could implement a 'command:' prefix now, and maybe
> a 'pass:' one, and allow other methods like 'pkcs11' later.
>

We don't need to do anything except provide a way to tell OpenSSL where to
get the KEK from, for situations where having Pg generate it internally
undesirable.

I proposed a simple GUC that we could supply to OpenSSL as a key path
because it's simple. It's definitely not best.

In my prior mail I outlined what I think is a better way. Abstract key key
initialisation -  passphrase fetching KEK/HMAC loading and all of it -
behind a pluggable interface. Looking at the patch, it's mostly there
already. We just need a way to hook the key loading and setup so it can be
overridden to use whatever method is required. Then KEK operations to
encrypt and decrypt the heap and WAL keys happen via that abstraction.

That way Pg does not have to care about the details of hardware key
management, PKCS#11 or OpenSSL engines, etc.

A little thought is needed to make key rotation work well. Especially when
you want to switch from cluster passphrase to a plugin that supports use of
a HVM escrowed key, or vice versa.

But most of what's needed looks like it's there already. It's just down to
making sure the key loading and initialisation is overrideable.

Re: Internal key management system

On Tue, Oct 27, 2020 at 03:07:22PM +0800, Craig Ringer wrote:
> On Mon, Oct 26, 2020 at 11:02 PM Stephen Frost  wrote:
> 
> 
> TL;DR:
> 
> * Important to check that key rotation is possible on a replica, i.e.
> primary and standby can have different cluster passphrase and KEK
> encrypting the same WAL and heap keys;
> * with a HSM we can't read the key out, so a pluggable KEK operations
> context or a configurable URI for the KEK is necessary
> * I want the SQL key and SQL wrap/unwrap part in a separate patch, I
> don't think it's fully baked and oppose its inclusion in its current
> form
> * Otherwise this looks good so far
> 
> Explanation and argument for why below.
> 
> > I've not been following very closely, but I definitely agree with the
> > general feedback here (more on that below), but to this point- I do
> > believe that was the intent, or at least I sure hope that it was.  Being
> > able to have user/role keys would certainly be good.  Having a way for a
> > user to log in and unlock their key would also be really nice.
> 
> Right. AFAICS this is supposed to provide the foundation layer for
> whole-cluster encryption, and it looks ok for that, caveat about HSMs
> aside. I see nothing wrong with using a single key for heap (and one
> for WAL, or even the same key). Finer grained and autovacuum etc
> becomes seriously painful.

You need to use separate keys for heap/index and WAL so you can
replicate to another server that uses a different heap/index key, but
the same WAL.  You can then fail-over to the replica and change the WAL
key to complete full key rotation.  The replication protocol needs to
decrypt, and the receiving end has to encrypt with a different
heap/index key.  This is the key rotation method this is planned.  This
is another good reason the keys should be in a separate directory so
they can be easily copied or replaced.

> I want to take a closer look at how the current implementation will
> play with physical replication. I assume the WAL and heap keys have to
> be constant for the full cluster lifetime, short of a dump and reload.

The WAL key can change if you are willing to stop/start the server.  You
only read the WAL during crash recovery.

> The main issue I have so far is that I don't think the SQL key
> actually fits well with the current proposal. Its proposed interface
> and use cases are incomplete, it doesn't fully address key leak risks,
> there's no user access control, etc. Also the SQL key part could be
> implemented on top of the base cluster encryption part, I don't see
> why it actually has to integrate with the whole-cluster key management
> directly.

Agreed.  Maybe we should just focus on the TDE use now.  I do think the
current patch is not commitable since, because there are no defined
keys, there is no way to validate the boot-time password.  The no-key
case should be an unsupported configuration.  Maybe we need to just
create one key just to verify the boot password.

> 
> SQL KEY
> 
> 
> I'm not against the SQL key and wrap/unwrap functionality - quite the
> contrary, I think it's really important to have something like it. But
> is it appropriate to have a single, fixed-for-cluster-lifetime key for
> this, one with no SQL-level access control over who can or cannot use
> it, etc? The material encrypted with the key is user-exposed so key
> rotation is an issue, but is not addressed here. And the interface
> doesn't really solve the numerous potential problems with key material
> leaks through logs and error messages.
> 
> I just think that if we bake in the proposed user visible wrap/unwrap
> interface now we're going to regret it later. How will it work when we
> want to add user- or role- level access control for database-stored
> keys? When we want to introduce a C-level API for extensions to work
> directly with encrypted data like they can work currently with TOASTed
> data, to prevent decrypted data from ever becoming SQL function
> arguments subject to possible leakage and to allow implementation of
> always-encrypted data types, etc?
> 
> Most importantly - I don't think the SQL key adds anything really
> crucial that we cannot do at the SQL level with an extension.  An
> extension "pg_wrap" could provide pg_wrap() and pg_unwrap() already,
> using a single master key much like the SQL key proposed in this
> patch. To store the master key it could:

The idea of the SQL key was to give the boot key a use, but I am now
seeing that the SQL key is just holding us back, and is preventing the
boot testing that is a requirement.  Maybe we just need to forget the
SQL part and focus on the TDE usage now, and come back to the SQL part. 
I am also not 100% clear on the usefulness of the SQL key.

> OTHER TRANSPARENT ENCRYPTION USE CASES
> 
> 
> Does this patch get in the way of supporting other kinds of
> transparent encryption that are frequently requested and are in use on
> other systems already?
> 
> I don't think so. Whole-cluster encryption

Re: Internal key management system

On Mon, Oct 26, 2020 at 10:05:10PM +0800, Craig Ringer wrote:
> For example if I want to lock my database with a YubiHSM I would configure
> something like:
> 
>     cluster_encryption_key = 'pkcs11:token=YubiHSM;id=0:0001;type=private'

Well, openssl uses a prefix before the password string, e.g.:

*  pass:password
*  env:var
*  file:pathname
*  fd:number
*  stdin

See 'man openssl'.  I always thought that API was ugly, but I now see
the value in it.  We could implement a 'command:' prefix now, and maybe
a 'pass:' one, and allow other methods like 'pkcs11' later.

I can also imagine using the 'file' one to allow the key to be placed on
an encrypted file system that has to be mounted for Postgres to start. 
You could also have the key on a USB device that has to be inserted to
be used, and the 'file' is on the USB key --- seems clearer than having
to create a script to 'cat' the file.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-10-27 Thread Craig Ringer

On Mon, Oct 26, 2020 at 11:02 PM Stephen Frost  wrote:

TL;DR:

* Important to check that key rotation is possible on a replica, i.e.
primary and standby can have different cluster passphrase and KEK
encrypting the same WAL and heap keys;
* with a HSM we can't read the key out, so a pluggable KEK operations
context or a configurable URI for the KEK is necessary
* I want the SQL key and SQL wrap/unwrap part in a separate patch, I
don't think it's fully baked and oppose its inclusion in its current
form
* Otherwise this looks good so far

Explanation and argument for why below.

> I've not been following very closely, but I definitely agree with the
> general feedback here (more on that below), but to this point- I do
> believe that was the intent, or at least I sure hope that it was.  Being
> able to have user/role keys would certainly be good.  Having a way for a
> user to log in and unlock their key would also be really nice.

Right. AFAICS this is supposed to provide the foundation layer for
whole-cluster encryption, and it looks ok for that, caveat about HSMs
aside. I see nothing wrong with using a single key for heap (and one
for WAL, or even the same key). Finer grained and autovacuum etc
becomes seriously painful.

I want to take a closer look at how the current implementation will
play with physical replication. I assume the WAL and heap keys have to
be constant for the full cluster lifetime, short of a dump and reload.
But I want to make sure that the KEK+HMAC can differ from one node to
another, i.e. that we can perform KEK rotation on a replica to
re-encrypt the WAL and heap keys against a new KEK. This is important
for backups, and also for effective use of a HSM where we may not want
or be able to have the same key on a primary and its replicas. If it
isn't already supported it looks like it should be simple, but it's
IMO important not to miss.

The main issue I have so far is that I don't think the SQL key
actually fits well with the current proposal. Its proposed interface
and use cases are incomplete, it doesn't fully address key leak risks,
there's no user access control, etc. Also the SQL key part could be
implemented on top of the base cluster encryption part, I don't see
why it actually has to integrate with the whole-cluster key management
directly.

SQL KEY

I'm not against the SQL key and wrap/unwrap functionality - quite the
contrary, I think it's really important to have something like it. But
is it appropriate to have a single, fixed-for-cluster-lifetime key for
this, one with no SQL-level access control over who can or cannot use
it, etc? The material encrypted with the key is user-exposed so key
rotation is an issue, but is not addressed here. And the interface
doesn't really solve the numerous potential problems with key material
leaks through logs and error messages.

I just think that if we bake in the proposed user visible wrap/unwrap
interface now we're going to regret it later. How will it work when we
want to add user- or role- level access control for database-stored
keys? When we want to introduce a C-level API for extensions to work
directly with encrypted data like they can work currently with TOASTed
data, to prevent decrypted data from ever becoming SQL function
arguments subject to possible leakage and to allow implementation of
always-encrypted data types, etc?

Most importantly - I don't think the SQL key adds anything really
crucial that we cannot do at the SQL level with an extension.  An
extension "pg_wrap" could provide pg_wrap() and pg_unwrap() already,
using a single master key much like the SQL key proposed in this
patch. To store the master key it could:

1. Derive the key at shared_preload_libraries time in the same way
this key management system proposes to do so, using a
pg_wrap.pg_wrap_passphrase_command ; or
2. Read the key from a PEM file on disk, accepting a passphrase  to
decrypt it via a password command GUC or an SQL-level function call;
or
3. Read the key from a pg_wrap.pg_wrap_keys extension catalog, which
is superuser-only and which is protected by whole-db encryption if in
use;
4. Like (3) but use generic WAL and a custom relation to make the
in-db key store opaque without use of extensions like pageinspect.

That way we haven't baked some sort of limited wrap/unwrap into Pg's
long term user visible API. I'd be totally happy for such a SQL key
wrap/unwrap to become part of pgcrypto, or a separate extension that
uses pgcrypto, if you're worried about having it available to users. I
just don't really want it in src/backend in its current form.

To simplify (1) we could make the implementation of the KEK/HMAC
derivation accessible from extensions and allow them to provide their
own password callback, though that might make life harder if we want
to change things later, and it'd mean that you couldn't use the
extension on a db that was not configured for whole db encryption. So
I'd actually rather make key derivation and storage the

Re: Internal key management system

2020-10-26 Thread Stephen Frost

Greetings,

* Craig Ringer (craig.rin...@enterprisedb.com) wrote:
> On Mon, Oct 19, 2020 at 11:16 AM Masahiko Sawada 
>  wrote:
> > The patch introduces only key management infrastructure but with no
> > key. Currently, there is no interface to dynamically add a new
> > encryption key.
> 
> I'm a bit confused by the exact intent and use cases behind this patch.
> https://www.postgresql.org/message-id/17156d2e419.12a27f6df87825.436300492203108132%40highgo.ca
> that was somewhat helpful but not entirely clear.
> 
> The main intent of this proposal seems to be to power TDE-style encryption
> of data at rest, with a single master key for the entire cluster. Has any
> consideration been given to user- or role-level key management as part of
> this, or is that expected to be done separately and protected by the master
> key supplied by this patch?

I've not been following very closely, but I definitely agree with the
general feedback here (more on that below), but to this point- I do
believe that was the intent, or at least I sure hope that it was.  Being
able to have user/role keys would certainly be good.  Having a way for a
user to log in and unlock their key would also be really nice.

> If so, what if I have a HSM (or virtualised or paravirt or network proxied
> HSM) that I want to use to manage my database keys, such that the database
> master key is protected by the HSM? Say I want to put my database key in a
> smartcard, my machine's TPM, a usb HSM, a virtual HSM provided by my
> VM/cloud platform, etc?
> 
> As far as I can tell with the current design I'd have to encrypt my unlock
> passphrase and put it in the cluster_passphrase_command script or its
> arguments. The script would ask the HSM to decrypt the key passphrase and
> write that over stdio where Pg would read it and use it to decrypt the
> master key(s). That would work - but it should not be necessary and it
> weakens the protection offered by the HSM considerably.

Yeah, I do think this is how you'd need to do it and I agree that it'd
be better to offer an option that can go to the HSM directly.  That
said- I don't think we necessarily want to throw out tho command-based
option, as users may wish to use a vaulting solution or similar instead
of an HSM.  What I am curious about though- what are the thoughts around
using a vaulting solution's command-line tool vs. writing code to work
with an API?  Between these various options, what are the risks of
having a script vs. using an API and would one or the other weaken the
overall solution?  Or is what's really needed here is a way to tell us
if it's a passphrase we're getting or a proper key, regardless of the
method being used to fetch it?

> I suggest we allow the user to supply their own KEK via a
> cluster_encryption_key GUC. If set, Pg would create an SSLContext with the
> supplied key and use that SSLContext to decrypt the application keys - with
> no intermediate KEK-derivation based on cluster_passphrase_command
> performed. cluster_encryption_key could be set to an openssl engine URI, in
> which case OpenSSL would transparently use the supplied engine (usually a
> HSM) to decrypt the application keys. We'd install the
> cluster_passphrase_command as an openssl askpass callback so that if the
> HSM requires an unlock password it can be provided - like how it's done for
> libpq in Pg 13. Some thought is required for how to do key rotation here,
> though it matters a great deal less when a HSM is managing key escrow.

This really locks us into OpenSSL for this, which I don't particularly
like.  If we do go down this route, we should definitely make it clear
that this is for use when PG has been built with OpenSSL, ie:
openssl_cluster_encryption_key as the parameter name, or such.

> For example if I want to lock my database with a YubiHSM I would configure
> something like:
> 
> cluster_encryption_key = 'pkcs11:token=YubiHSM;id=0:0001;type=private'
> 
> The DB would be encrypted and decrypted using application keys unlocked by
> the HSM. Backups of the database, stolen disk images, etc, would be
> unreadable unless you have access to another HSM with the same key loaded.

Well, you would surely just need the key, since you could change the PG
config to fetch the key from whereever you have it, you wouldn't need an
actual HSM..

> If cluster_encryption_key is unset, Pg would perform its own KEK derivation
> based on cluster_passphrase_command as currently implemented.

To what I was suggesting above- what if we just had a GUC that's
"kek_method" with options 'passphrase' and 'direct', where passphrase
goes through KEK and 'direct' doesn't, which just changes how we treat
the results of called cluster_passphrase_command?

> I really don't think we should be adopting something that doesn't consider
> platform based hardware key escrow and protection.

I agree that we should consider platform based hardware key escrow and
protection.  I'm generally supportive of trying to do so in a way

Re: Internal key management system

2020-10-26 Thread Craig Ringer

On Mon, Oct 19, 2020 at 11:16 AM Masahiko Sawada <
masahiko.saw...@2ndquadrant.com> wrote:

The patch introduces only key management infrastructure but with no
> key. Currently, there is no interface to dynamically add a new
> encryption key.

I'm a bit confused by the exact intent and use cases behind this patch.
https://www.postgresql.org/message-id/17156d2e419.12a27f6df87825.436300492203108132%40highgo.ca
that was somewhat helpful but not entirely clear.

The main intent of this proposal seems to be to power TDE-style encryption
of data at rest, with a single master key for the entire cluster. Has any
consideration been given to user- or role-level key management as part of
this, or is that expected to be done separately and protected by the master
key supplied by this patch?

If so, what if I have a HSM (or virtualised or paravirt or network proxied
HSM) that I want to use to manage my database keys, such that the database
master key is protected by the HSM? Say I want to put my database key in a
smartcard, my machine's TPM, a usb HSM, a virtual HSM provided by my
VM/cloud platform, etc?

As far as I can tell with the current design I'd have to encrypt my unlock
passphrase and put it in the cluster_passphrase_command script or its
arguments. The script would ask the HSM to decrypt the key passphrase and
write that over stdio where Pg would read it and use it to decrypt the
master key(s). That would work - but it should not be necessary and it
weakens the protection offered by the HSM considerably.

I suggest we allow the user to supply their own KEK via a
cluster_encryption_key GUC. If set, Pg would create an SSLContext with the
supplied key and use that SSLContext to decrypt the application keys - with
no intermediate KEK-derivation based on cluster_passphrase_command
performed. cluster_encryption_key could be set to an openssl engine URI, in
which case OpenSSL would transparently use the supplied engine (usually a
HSM) to decrypt the application keys. We'd install the
cluster_passphrase_command as an openssl askpass callback so that if the
HSM requires an unlock password it can be provided - like how it's done for
libpq in Pg 13. Some thought is required for how to do key rotation here,
though it matters a great deal less when a HSM is managing key escrow.

For example if I want to lock my database with a YubiHSM I would configure
something like:

cluster_encryption_key = 'pkcs11:token=YubiHSM;id=0:0001;type=private'

The DB would be encrypted and decrypted using application keys unlocked by
the HSM. Backups of the database, stolen disk images, etc, would be
unreadable unless you have access to another HSM with the same key loaded.

If cluster_encryption_key is unset, Pg would perform its own KEK derivation
based on cluster_passphrase_command as currently implemented.

I really don't think we should be adopting something that doesn't consider
platform based hardware key escrow and protection.

Re: Internal key management system

2020-10-18 Thread Masahiko Sawada

On Sat, 17 Oct 2020 at 05:24, Bruce Momjian  wrote:
>
> On Fri, Jul 31, 2020 at 04:06:38PM +0900, Masahiko Sawada wrote:
> > > Given that the purpose of the key manager is to help TDE, discussing
> > > the SQL interface part (i.g., the second patch) deviates from the
> > > original purpose. I think we should discuss the design and
> > > implementation of the key manager first and then other additional
> > > interfaces. So I’ve attached a new version patch and removed the
> > > second patch part so that we can focus on only the key manager part.
> > >
> >
> > Since the previous patch sets conflicts with the current HEAD, I've
> > attached the rebased patch set.
>
> I have updated the attached patch and am hoping to move this feature
> forward.  The changes I made are:
>
> *  handle merge conflicts
> *  changed ssl initialization to match other places in our code
> *  changed StrNCpy() to strlcpy
> *  update the docs

Thank you for updating the patch!

>
> The first three were needed to get it to compile.  I then ran some tests
> using the attached shell script as my password script.  First, I found
> that initdb called the script twice.  The first call worked fine, but
> the second call would accept a password that didn't match the first
> call.   This is because there are no keys defined, so there is nothing
> for kmgr_verify_passphrase() to check for passkey verification, so it
> just succeeds.   In fact, I can't figure out how to create any keys with
> the patch,

The patch introduces only key management infrastructure but with no
key. Currently, there is no interface to dynamically add a new
encryption key. We need to add the new key ID to internalKeyLengths
array and increase KMGR_MAX_INTERNAL_KEY. The plan was to add a
subsequent patch that adds functionality using encryption keys managed
by the key manager. The patch was to add two SQL functions: pg_wrap()
and pg_unwrap(), along with the internal key wrap key.

IIUC, what to integrate with the key manager is still an open
question. The idea of pg_wrap() and pg_unwrap() seems good but it
still has the problem that the password could be logged to the server
log when the user wraps it. Of course, since the key manager is
originally designed for cluster-wide transparent encryption, TDE will
be one of the users of the key manager. But there was a discussion
it’s better to introduce the key manager first and to make it have
small functions or integrate with existing features such as pgcrypto
because TDE development might take time over the releases. So I'm
thinking to deal with the problem the pg_wrap idea has or to make
pgcrypto use the key manager so that it doesn't require the user to
pass the password as a function argument.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-10-16 Thread Bruce Momjian

On Fri, Oct 16, 2020 at 04:56:47PM -0400, Tom Lane wrote:
> Bruce Momjian  writes:
> > Second, in testing starting/stopping the server, pg_ctl doesn't allow
> > the cluster_passphrase_command to read from /dev/tty, which I think is a
> > requirement because the command could likely require a user-supplied
> > unlock key, even if that is not the actual passphrase, just like ssl
> > keys.  This is because pg_ctl calls setsid() just before calling execl()
> > to start the server, and setsid() disassociates itself from the
> > controlling terminal.  I think the fix is to remove setsid() from pg_ctl
> > and add a postmaster flag to call setsid() after it has potentially
> > called cluster_passphrase_command, and pg_ctl would use that flag.
> 
> We discussed that and rejected it in the thread leading up to
> bb24439ce [1].  The primary problem being that it's not very clear
> when the postmaster should daemonize itself, and later generally
> isn't better.  I doubt that this proposal is doing anything to
> clarify that situation.

Agreed.  No reason to destablize the postmaster for this.  What about
having pg_ctl open /dev/tty, and then pass in an open file descriptor
that is a copy of /dev/tty, that can be closed by the postmaster after
the cluster_passphrase_command?  I just tested this and it worked.

I am thinking we would pass the file descriptor number to the postmaster
via a command-line argument.  Ideally we would pass in the device name
of /dev/tty, but I don't know of a good way to do that.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-10-16 Thread Tom Lane

Bruce Momjian  writes:
> Second, in testing starting/stopping the server, pg_ctl doesn't allow
> the cluster_passphrase_command to read from /dev/tty, which I think is a
> requirement because the command could likely require a user-supplied
> unlock key, even if that is not the actual passphrase, just like ssl
> keys.  This is because pg_ctl calls setsid() just before calling execl()
> to start the server, and setsid() disassociates itself from the
> controlling terminal.  I think the fix is to remove setsid() from pg_ctl
> and add a postmaster flag to call setsid() after it has potentially
> called cluster_passphrase_command, and pg_ctl would use that flag.

We discussed that and rejected it in the thread leading up to
bb24439ce [1].  The primary problem being that it's not very clear
when the postmaster should daemonize itself, and later generally
isn't better.  I doubt that this proposal is doing anything to
clarify that situation.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/CAEET0ZH5Bf7dhZB3mYy8zZQttJrdZg_0Wwaj0o1PuuBny1JkEw%40mail.gmail.com

Re: Internal key management system

2020-10-16 Thread Bruce Momjian

On Fri, Jul 31, 2020 at 04:06:38PM +0900, Masahiko Sawada wrote:
> > Given that the purpose of the key manager is to help TDE, discussing
> > the SQL interface part (i.g., the second patch) deviates from the
> > original purpose. I think we should discuss the design and
> > implementation of the key manager first and then other additional
> > interfaces. So I’ve attached a new version patch and removed the
> > second patch part so that we can focus on only the key manager part.
> >
> 
> Since the previous patch sets conflicts with the current HEAD, I've
> attached the rebased patch set.

I have updated the attached patch and am hoping to move this feature
forward.  The changes I made are:

*  handle merge conflicts
*  changed ssl initialization to match other places in our code
*  changed StrNCpy() to strlcpy
*  update the docs

The first three were needed to get it to compile.  I then ran some tests
using the attached shell script as my password script.  First, I found
that initdb called the script twice.  The first call worked fine, but
the second call would accept a password that didn't match the first
call.   This is because there are no keys defined, so there is nothing
for kmgr_verify_passphrase() to check for passkey verification, so it
just succeeds.   In fact, I can't figure out how to create any keys with
the patch, and pg_encrypt() is documented, but not defined anywhere.

Second, in testing starting/stopping the server, pg_ctl doesn't allow
the cluster_passphrase_command to read from /dev/tty, which I think is a
requirement because the command could likely require a user-supplied
unlock key, even if that is not the actual passphrase, just like ssl
keys.  This is because pg_ctl calls setsid() just before calling execl()
to start the server, and setsid() disassociates itself from the
controlling terminal.  I think the fix is to remove setsid() from pg_ctl
and add a postmaster flag to call setsid() after it has potentially
called cluster_passphrase_command, and pg_ctl would use that flag.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 2768c85..44e0c1e
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** COPY postgres_log FROM '/full/path/to/lo
*** 7793,7798 
--- 7793,7833 
  
 
  
+
+ Encryption Key Management
+ 
+ 
+  
+   cluster_passphrase_command (string)
+   
+cluster_passphrase_command configuration parameter
+   
+   
+   
+
+ This option specifies an external command to be invoked when a
+ passphrase for key management system needs to be obtained.
+
+
+ The command must print the passphrase to the standard
+ output and have a zero exit code.  In the parameter value,
+ %p is replaced by a prompt string.  (Write
+ %% for a literal %.)
+ Note that the prompt string will probably contain whitespace,
+ so be sure to quote its use adequately.  A single newline is
+ stripped from the end of the output if present.  The passphrase
+ must be at least 64 bytes.
+
+
+ This parameter can only be set in the
+ postgresql.conf file or on the server
+ command line.
+
+   
+  
+ 
+
+ 
 
  Client Connection Defaults
  
*** dynamic_library_path = 'C:\tools\postgre
*** 9636,9641 
--- 9671,9692 
 

   
+ 
+   
+   key_management_enabled (boolean)
+   
+Key management configuration parameter parameter
+   
+   
+   
+
+ Reports whether encryption key management
+ is enabled for this cluster.  See  for more
+ information.
+
+   
+  
  
   
data_directory_mode (integer)
diff --git a/doc/src/sgml/database-encryption.sgml b/doc/src/sgml/database-encryption.sgml
new file mode 100644
index ...db84b0d
*** a/doc/src/sgml/database-encryption.sgml
--- b/doc/src/sgml/database-encryption.sgml
***
*** 0 
--- 1,292 
+ 
+ 
+ 
+  Database Encryption
+ 
+  
+   Server Side Encryption
+  
+ 
+  
+   The purpose of database encryption is to protect the confidential data
+   stored in a database from being revealed.
+  
+ 
+  
+   Encryption Key Management
+ 
+   
+PostgreSQL supports internal
+Encryption Key Management System, which is designed
+to manage the life cycles of cryptographic keys within the
+PostgreSQL.  This includes dealing with their
+generation, storage, usage and rotation.
+   
+ 
+   
+Encryption key management system is enabled when
+PostgreSQL is built with
+--with-openssl and
+ is specified during
+initdb.  The cluster passphrase provided by the
+

Re: Internal key management system

2020-09-08 Thread Michael Paquier

On Fri, Jul 31, 2020 at 04:06:38PM +0900, Masahiko Sawada wrote:
> Since the previous patch sets conflicts with the current HEAD, I've
> attached the rebased patch set.

Patch 0002 fails to apply, so a rebase is needed.
--
Michael


signature.asc
Description: PGP signature

Re: Internal key management system

2020-07-31 Thread Masahiko Sawada

On Tue, 23 Jun 2020 at 22:46, Masahiko Sawada
 wrote:
>
> On Fri, 19 Jun 2020 at 15:44, Fabien COELHO  wrote:
> >
> >
> > Hello Masahiko-san,
> >
> > > What I referred to "only one key" is KEK.
> >
> > Ok, sorry, I misunderstood.
> >
> > >>> Yeah, it depends on KMS, meaning we need different extensions for
> > >>> different KMS. A KMS might support an interface that creates key if not
> > >>> exist during GET but some KMS might support CREATE and GET separately.
> > >>
> > >> I disagree that it is necessary, but this is debatable. The KMS-side
> > >> interface code could take care of that, eg:
> > >>
> > >>if command is "get X"
> > >>  if (X does not exist in KMS)
> > >>create a new key stored in KMS, return it;
> > >>  else
> > >>return KMS-stored key;
> > >>...
> > >>
> > >> So you can still have a "GET" only interface which adapts to the "final"
> > >> KMS. Basically, the glue code which implements the interface for the KMS
> > >> can include some logic to adapt to the KMS point of view.
> > >
> > > Is the above code is for the extension side, right?
> >
> > Such a code could be in the command with which pg communicates (eg through
> > its stdin/stdout, or whatever) to get keys.
> >
> > pg talks to the command, the command can do anything, such as storing keys
> > or communicating with an external service to retrieve them, anything
> > really, that is the point.
> >
> > I'm advocating defining the pg/command protocol, something along "GET xxx"
> > as you wrote, and possibly provide a possible/reasonable command
> > implementation, which would be part of the code you put in your patch,
> > only it would be in the command instead of postgres.
> >
> > > For example, if users want to use a cloud KMS, say AWS KMS, to store
> > > DEKs and KEK they need an extension that is loaded to postgres and can
> > > communicate with AWS KMS. I imagine that such extension needs to be
> > > written in C,
> >
> > Why? I could write it in bash, probably. Ok, maybe not so good for suid,
> > but in principle it could be anything. I'd probably write it in C, though.
> >
> > > the communication between the extension uses AWS KMS API, and the
> > > communication between postgres core and the extension uses text
> > > protocol.
> >
> > I'm not sure of the word "extension" above. For me the postgres side could
> > be an extension as in "CREATE EXTENSION". The command itself could be
> > provided in the extension code, but would not be in the "CREATE
> > EXTENSION", it would be something run independently.
>
> Oh, I imagined extensions that can be installed by CREATE EXTENSION or
> specifying it to shared_preload_libraries.
>
> If the command runs in the background to talk with postgres there are
> some problems that we need to deal with. For instance, what if the
> process downs? Does the postmaster re-execute it? How does it work in
> single-user mode? etc. It seems to me it will bring another
> complexity.
>
> >
> > > When postgres core needs a DEK identified by KEY-A, it asks
> > > for the extension to get the DEK by passing something like “GET KEY-A”
> > > message, and then the extension asks the existence of that key to AWK
> > > KMS, creates if not exist and returns it to the postgres core. Is my
> > > understanding right?
> >
> > Yes. The command in the use-case you outline would just be an
> > intermediary, but for another use-case it would do the storing. The point
> > of aiming at extensibility if that from pg point of view the external
> > commands provide keys, but what these commands really do to do this can be
> > anything.
> >
> > > When we have TDE feature in the future, we would also need to change
> > > frontend tools such as pg_waldump and pg_rewind that read database
> > > files so that they can read encrypted files. It means that these
> > > frond-end tools also somehow need to communicate with the external
> > > place to get DEKs in order to decrypt encrypted database files. In
> > > your idea, what do you think about how we can support it?
> >
> > Hmmm. My idea was that the natural interface would be to get things
> > through postgres. For a debug tool such as pg_waldump, probably it needs
> > to be adapted if it needs to decrypt data to operate.
> >
> > Now I'm not sure I understood, because of the DEK are managed by postgres
> > in your patch, waldump and other external commands would have no access to
> > the decrypted data anyway, so the issue would be the same?
>
> With the current patch, we will be able to add
> --cluster-passphase-command command-line option to front-end tools
> that want to read encrypted database files. The front-end tools
> execute the specified command to get KEK and unwrap DEKs. The
> functions such as running passphrase command, verifying the passphrase
> is correct, and getting wrapped DEKs from the database cluster are
> implemented in src/common so both can use these functions.
>
> >
> > With file-level encryption, obviously all commands which

Re: Internal key management system

2020-06-23 Thread Masahiko Sawada

On Fri, 19 Jun 2020 at 15:44, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> > What I referred to "only one key" is KEK.
>
> Ok, sorry, I misunderstood.
>
> >>> Yeah, it depends on KMS, meaning we need different extensions for
> >>> different KMS. A KMS might support an interface that creates key if not
> >>> exist during GET but some KMS might support CREATE and GET separately.
> >>
> >> I disagree that it is necessary, but this is debatable. The KMS-side
> >> interface code could take care of that, eg:
> >>
> >>if command is "get X"
> >>  if (X does not exist in KMS)
> >>create a new key stored in KMS, return it;
> >>  else
> >>return KMS-stored key;
> >>...
> >>
> >> So you can still have a "GET" only interface which adapts to the "final"
> >> KMS. Basically, the glue code which implements the interface for the KMS
> >> can include some logic to adapt to the KMS point of view.
> >
> > Is the above code is for the extension side, right?
>
> Such a code could be in the command with which pg communicates (eg through
> its stdin/stdout, or whatever) to get keys.
>
> pg talks to the command, the command can do anything, such as storing keys
> or communicating with an external service to retrieve them, anything
> really, that is the point.
>
> I'm advocating defining the pg/command protocol, something along "GET xxx"
> as you wrote, and possibly provide a possible/reasonable command
> implementation, which would be part of the code you put in your patch,
> only it would be in the command instead of postgres.
>
> > For example, if users want to use a cloud KMS, say AWS KMS, to store
> > DEKs and KEK they need an extension that is loaded to postgres and can
> > communicate with AWS KMS. I imagine that such extension needs to be
> > written in C,
>
> Why? I could write it in bash, probably. Ok, maybe not so good for suid,
> but in principle it could be anything. I'd probably write it in C, though.
>
> > the communication between the extension uses AWS KMS API, and the
> > communication between postgres core and the extension uses text
> > protocol.
>
> I'm not sure of the word "extension" above. For me the postgres side could
> be an extension as in "CREATE EXTENSION". The command itself could be
> provided in the extension code, but would not be in the "CREATE
> EXTENSION", it would be something run independently.

Oh, I imagined extensions that can be installed by CREATE EXTENSION or
specifying it to shared_preload_libraries.

If the command runs in the background to talk with postgres there are
some problems that we need to deal with. For instance, what if the
process downs? Does the postmaster re-execute it? How does it work in
single-user mode? etc. It seems to me it will bring another
complexity.

>
> > When postgres core needs a DEK identified by KEY-A, it asks
> > for the extension to get the DEK by passing something like “GET KEY-A”
> > message, and then the extension asks the existence of that key to AWK
> > KMS, creates if not exist and returns it to the postgres core. Is my
> > understanding right?
>
> Yes. The command in the use-case you outline would just be an
> intermediary, but for another use-case it would do the storing. The point
> of aiming at extensibility if that from pg point of view the external
> commands provide keys, but what these commands really do to do this can be
> anything.
>
> > When we have TDE feature in the future, we would also need to change
> > frontend tools such as pg_waldump and pg_rewind that read database
> > files so that they can read encrypted files. It means that these
> > frond-end tools also somehow need to communicate with the external
> > place to get DEKs in order to decrypt encrypted database files. In
> > your idea, what do you think about how we can support it?
>
> Hmmm. My idea was that the natural interface would be to get things
> through postgres. For a debug tool such as pg_waldump, probably it needs
> to be adapted if it needs to decrypt data to operate.
>
> Now I'm not sure I understood, because of the DEK are managed by postgres
> in your patch, waldump and other external commands would have no access to
> the decrypted data anyway, so the issue would be the same?

With the current patch, we will be able to add
--cluster-passphase-command command-line option to front-end tools
that want to read encrypted database files. The front-end tools
execute the specified command to get KEK and unwrap DEKs. The
functions such as running passphrase command, verifying the passphrase
is correct, and getting wrapped DEKs from the database cluster are
implemented in src/common so both can use these functions.

>
> With file-level encryption, obviously all commands which have to read and
> understand the files have to be adapted if they are to still work, which
> is another argument to have some interface rather than integrated
> server-side stuff, because these external commands would need to be able
> to get keys and use

Re: Internal key management system

2020-06-19 Thread Fabien COELHO



Hello Masahiko-san,


What I referred to "only one key" is KEK.


Ok, sorry, I misunderstood.


Yeah, it depends on KMS, meaning we need different extensions for
different KMS. A KMS might support an interface that creates key if not
exist during GET but some KMS might support CREATE and GET separately.


I disagree that it is necessary, but this is debatable. The KMS-side
interface code could take care of that, eg:

   if command is "get X"
 if (X does not exist in KMS)
   create a new key stored in KMS, return it;
 else
   return KMS-stored key;
   ...

So you can still have a "GET" only interface which adapts to the "final"
KMS. Basically, the glue code which implements the interface for the KMS
can include some logic to adapt to the KMS point of view.


Is the above code is for the extension side, right?


Such a code could be in the command with which pg communicates (eg through 
its stdin/stdout, or whatever) to get keys.


pg talks to the command, the command can do anything, such as storing keys 
or communicating with an external service to retrieve them, anything 
really, that is the point.


I'm advocating defining the pg/command protocol, something along "GET xxx" 
as you wrote, and possibly provide a possible/reasonable command 
implementation, which would be part of the code you put in your patch, 
only it would be in the command instead of postgres.


For example, if users want to use a cloud KMS, say AWS KMS, to store 
DEKs and KEK they need an extension that is loaded to postgres and can 
communicate with AWS KMS. I imagine that such extension needs to be 
written in C,


Why? I could write it in bash, probably. Ok, maybe not so good for suid, 
but in principle it could be anything. I'd probably write it in C, though.


the communication between the extension uses AWS KMS API, and the 
communication between postgres core and the extension uses text 
protocol.


I'm not sure of the word "extension" above. For me the postgres side could 
be an extension as in "CREATE EXTENSION". The command itself could be 
provided in the extension code, but would not be in the "CREATE 
EXTENSION", it would be something run independently.


When postgres core needs a DEK identified by KEY-A, it asks 
for the extension to get the DEK by passing something like “GET KEY-A” 
message, and then the extension asks the existence of that key to AWK 
KMS, creates if not exist and returns it to the postgres core. Is my 
understanding right?


Yes. The command in the use-case you outline would just be an 
intermediary, but for another use-case it would do the storing. The point 
of aiming at extensibility if that from pg point of view the external 
commands provide keys, but what these commands really do to do this can be 
anything.



When we have TDE feature in the future, we would also need to change
frontend tools such as pg_waldump and pg_rewind that read database
files so that they can read encrypted files. It means that these
frond-end tools also somehow need to communicate with the external
place to get DEKs in order to decrypt encrypted database files. In
your idea, what do you think about how we can support it?


Hmmm. My idea was that the natural interface would be to get things 
through postgres. For a debug tool such as pg_waldump, probably it needs 
to be adapted if it needs to decrypt data to operate.


Now I'm not sure I understood, because of the DEK are managed by postgres 
in your patch, waldump and other external commands would have no access to 
the decrypted data anyway, so the issue would be the same?


With file-level encryption, obviously all commands which have to read and 
understand the files have to be adapted if they are to still work, which 
is another argument to have some interface rather than integrated 
server-side stuff, because these external commands would need to be able 
to get keys and use them as well.


Or I misunderstood something.


I'd like an extensible design to have anything in core. As I said in an
other mail, if you want to handle a somehow restricted use case, just
provide an external extension and do nothing in core, please. Put in core
something that people with a slightly different use case or auditor can
build on as well. The current patch makes a dozen hard-coded decisions
which it should not, IMHO.


It might have confused you that I included key manager and encryption
SQL functions to the patches but this key manager has been designed
dedicated to only TDE.


Hmmm. This is NOT AT ALL what the patch does. The documentation in your 
patch talks about "column level encryption", which is an application 
thing. Now you seem to say that it does not matter and can be removed 
because the use case is elsewhere.



It might be better to remove both SQL interface
and SQL key we discussed from the patch set as they are actually not
necessary for TDE purposes.


The documentation part of the patch, at no point, talks about TDE 
(transparent data

Re: Internal key management system

2020-06-18 Thread Jose Luis Tallon


On 18/6/20 19:41, Cary Huang wrote:

Hi all

Having read through the discussion, I have some comments and 
suggestions that I would like to share.


I think it is still quite early to even talk about external key 
management system even if it is running on the same host as PG. This 
is most likely achieved as an extension that can provide communication 
to external key server and it would be a separate project/discussion. 
I think the focus now is to finalize on the internal KMS design, and 
we can discuss about how to migrate internally managed keys to the 
external when the time is right.


As long as there exists a clean interface, and the "default" (internal) 
backend is a provider of said functionality, it'll be fine.


Given that having different KMS within a single instance (e.g. per 
database) is quite unlikely, I suggest just exposing hook-like 
function-pointer variables and be done with it. Requiring a preloaded 
library for this purpose doesn't seem too restrictive ---at least at 
this stage--- and can be very easily evolved in the future --- 
super-simple API which receives a struct made of function pointers, plus 
another function to reset it to "internal defaults" and that's it.




Key management system is generally built to manage the life cycle of 
cryptographic keys, so our KMS in my opinion needs to be built with 
key life cycle in mind such as:


* Key generation
* key protection
* key storage
* key rotation
* key rewrap
* key disable/enable
* key destroy


Add the support functions for your suggested "key information" 
functionality, and that's a very rough first draft of the API ...


KMS should not perform the above life cycle management by itself 
automatically or hardcoded, instead it should expose some interfaces 
to the end user or even a backend process like TDE to trigger the above.
The only key KMS should manage by itself is the KEK, which is derived 
from cluster passphrase value. This is fine in my opinion. This KEK 
should exist only within KMS to perform key protection (by wrapping) 
and key storage (save as file).


Asking for the "cluster password" is something better left optional / 
made easily overrideable ... or we risk thousands of clusters suddenly 
not working after a reboot :S



Just my .02€


Thanks,

    J.L.

Re: Internal key management system

2020-06-18 Thread Cary Huang

Hi all

Having read through the discussion, I have some comments and suggestions that I 
would like to share. 

I think it is still quite early to even talk about external key management 
system even if it is running on the same host as PG. This is most likely 
achieved as an extension that can provide communication to external key server 
and it would be a separate project/discussion. I think the focus now is to 
finalize on the internal KMS design, and we can discuss about how to migrate 
internally managed keys to the external when the time is right.

Key management system is generally built to manage the life cycle of 
cryptographic keys, so our KMS in my opinion needs to be built with key life 
cycle in mind such as:

* Key generation
* key protection
* key storage
* key rotation

* key rewrap
* key disable/enable
* key destroy

KMS should not perform the above life cycle management by itself automatically 
or hardcoded, instead it should expose some interfaces to the end user or even 
a backend process like TDE to trigger the above. 

The only key KMS should manage by itself is the KEK, which is derived from 
cluster passphrase value. This is fine in my opinion. This KEK should exist 
only within KMS to perform key protection (by wrapping) and key storage (save 
as file).

The other life cycle stages should be just left as interfaces, waiting for 
somebody to request KMS to perform. Somebody could be end user or back end 
process like TDE.

Using TDE as example, when TDE initializes, it calls KMS's key_generation 
interface to get however many keys that it needs, KMS should not return the 
keys in clear text of hex, it can return something like a key ID.

Using regular user as example, each user can also call KMS's key_generation 
interface to create a cryptographic key for their own purpose, KMS should also 
return just an key ID and this key should be bound to the user and we can limit 
that each user can have only one key managed, and regular user can only manage 
his/her own key with KMS, rotate, destroy, disable..etc; he/she cannot manage 
others' keys

super user (or key admin), however, can do all kinds of management to all keys, 
(generated by TDE or by other users). He or she can do key rotation, key 
rewrap, disable or destroy. Here we will need to think about how to prevent 
this user from misusing the key management functions. 

Super user should also be able to view the status of all the keys managed, 
information such as: 

* date of generation

* key ID

* owner

* status

* key length

* suggested date of rotation..etc etc

* expiry date??

to actually perform the encryption with keys managed by internal KMS, I suggest 
adding a set of wrapper functions within KMS using the Key ID as input 
argument. So for example, TDE wants to encrypt some data, it will call KMS's 
wrapper encryption function with Key ID supplied, KMS looked up the key with  
key ID ,verify caller's permission and translate these parameters and feed to 
pgcrypto for example. This may be a little slower due to the lookup, but we can 
have a variation of the function where KMS can look up the key with supplied 
key ID and convert it to encryption context and return it back to TDE. Then TDE 
can use this context to call another wrapper function for encryption without 
lookup all the time. If an end user wants to encrypt something, it will also 
call KMS's wrapper function and supply the key ID in the same way. 

I know that there is a discussion on moving the cryptographic functions as 
extension. In an already running PG, it is fine, but when TDE and XLOG 
bootstraps during initdb, it requires cryptographic function to encrypt the 
initial WAL file and during initdb i dont think it has access to any 
cryptographic function provided by the extension. we may need to include 
limited cryptographic function within KMS and TDE so it is enough to finish 
initdb with intial WAl encrypted.

This is just my thought how this KMS and TDE should look like. 

Best

Cary Huang

-

HighGo Software Inc. (Canada)

mailto:cary.hu...@highgo.ca

http://www.highgo.ca

 On Tue, 16 Jun 2020 23:52:03 -0700 Masahiko Sawada 
 wrote 

On Sun, 14 Jun 2020 at 19:39, Fabien COELHO  wrote: 
> 
> 
> Hello Masahiko-san, 
> 
> >>> * The external place needs to manage more encryption keys than the 
> >>> current patch does. 
> >> 
> >> Why? If the external place is just a separate process on the same host, 
> >> probably it would manage the very same amount as what your patch. 
> > 
> > In the current patch, the external place needs to manage only one key 
> > whereas postgres needs to manages multiple DEKs. But with your idea, 
> > the external place needs to manage both KEK and DEKs. 
> 
> Hmmm. I do not see a good use case for a "management system" which would 
> only have to manage a singleton. ISTM that one point of using one KEK is 
> to allows several DEKs under it.

Re: Internal key management system

2020-06-17 Thread Masahiko Sawada

On Sun, 14 Jun 2020 at 19:39, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> >>> * The external place needs to manage more encryption keys than the
> >>> current patch does.
> >>
> >> Why? If the external place is just a separate process on the same host,
> >> probably it would manage the very same amount as what your patch.
> >
> > In the current patch, the external place needs to manage only one key
> > whereas postgres needs to manages multiple DEKs. But with your idea,
> > the external place needs to manage both KEK and DEKs.
>
> Hmmm. I do not see a good use case for a "management system" which would
> only have to manage a singleton. ISTM that one point of using one KEK is
> to allows several DEKs under it. Maybe I have missed something in your
> patch, but only one key is a very restricted use case.
>
> >>> Some cloud key management services are charged by the number of active
> >>> keys and key operations. So the number of keys postgres requires affects
> >>> the charges. It'd be worse if we were to have keys per table.
> >>
> >> Possibly. Note that you do not have to use a cloud storage paid as a
> >> service. However, you could do it if there is an interface, because it
> >> would allow postgres to do so if the user wishes that. That is the point
> >> of having an interface that can be implemented differently for different
> >> use cases.
> >
> > The same is true for the current patch. The user can get KEK from
> > anywhere they want using cluster_passphrase_command.
>
> Yep. Somehow I'm proposing to have a command to get DEKs instead of just
> the KEK, otherwise it is not that far.
>
> > But as I mentioned above the number of keys that the user manages
> > outside postgres is different.
>
> Yep, and I do not think that "only one key" approach is good. I really
> missed something in the patch. From a use case point of view, I thought
> that the user could have has many keys has they see fit. Maybe one per
> cluser, or database, or table, or a row if for some reason the application
> would demand it. I do not think that the pg should decide that, among
> other things. That is why I'm constantly refering to a "key identifier",
> and in the pseudo code I added a "local id" (typically an int).

What I referred to "only one key" is KEK. In the current patch,
postgres needs to manage multiple DEKs and fetches one KEK from
somewhere. According to the recent TDE discussion, we would have one
DEK for all tables/indexes encryption and one DEK for WAL encryption
as the first step.

>
> >>> * If this approach supports only GET protocol, the user needs to
> >>> create encryption keys with appropriate ids in advance so that
> >>> postgres can get keys by id. If postgres TDE creates keys as needed,
> >>> CREATE protocol would also be required.
> >>
> >> I'm not sure. ISTM that if there is a KMS to manage keys, it could be its
> >> responsability to actually create a key, however the client (pg) would
> >> have to request it, basically say "given me a new key for this id".
> >>
> >> This could even work with a "get" command only, if the KMS is expected to
> >> create a new key when asked for a key which does not exists yet. ISTM that
> >> the client could (should?) only have to create identifiers for its keys.
> >
> > Yeah, it depends on KMS, meaning we need different extensions for
> > different KMS. A KMS might support an interface that creates key if not
> > exist during GET but some KMS might support CREATE and GET separately.
>
> I disagree that it is necessary, but this is debatable. The KMS-side
> interface code could take care of that, eg:
>
>if command is "get X"
>  if (X does not exist in KMS)
>create a new key stored in KMS, return it;
>  else
>return KMS-stored key;
>...
>
> So you can still have a "GET" only interface which adapts to the "final"
> KMS. Basically, the glue code which implements the interface for the KMS
> can include some logic to adapt to the KMS point of view.

Is the above code is for the extension side, right? For example, if
users want to use a cloud KMS, say AWS KMS, to store DEKs and KEK they
need an extension that is loaded to postgres and can communicate with
AWS KMS. I imagine that such extension needs to be written in C, the
communication between the extension uses AWS KMS API, and the
communication between postgres core and the extension uses text
protocol. When postgres core needs a DEK identified by KEY-A, it asks
for the extension to get the DEK by passing something like “GET KEY-A”
message, and then the extension asks the existence of that key to AWK
KMS, creates if not exist and returns it to the postgres core. Is my
understanding right?

When we have TDE feature in the future, we would also need to change
frontend tools such as pg_waldump and pg_rewind that read database
files so that they can read encrypted files. It means that these
frond-end tools also somehow need to communicate with the external
place to get DEKs in order

Re: Internal key management system

2020-06-14 Thread Fabien COELHO




Hello Masahiko-san,


* The external place needs to manage more encryption keys than the
current patch does.


Why? If the external place is just a separate process on the same host,
probably it would manage the very same amount as what your patch.


In the current patch, the external place needs to manage only one key
whereas postgres needs to manages multiple DEKs. But with your idea,
the external place needs to manage both KEK and DEKs.


Hmmm. I do not see a good use case for a "management system" which would 
only have to manage a singleton. ISTM that one point of using one KEK is 
to allows several DEKs under it. Maybe I have missed something in your 
patch, but only one key is a very restricted use case.



Some cloud key management services are charged by the number of active
keys and key operations. So the number of keys postgres requires affects
the charges. It'd be worse if we were to have keys per table.


Possibly. Note that you do not have to use a cloud storage paid as a
service. However, you could do it if there is an interface, because it
would allow postgres to do so if the user wishes that. That is the point
of having an interface that can be implemented differently for different
use cases.


The same is true for the current patch. The user can get KEK from
anywhere they want using cluster_passphrase_command.


Yep. Somehow I'm proposing to have a command to get DEKs instead of just 
the KEK, otherwise it is not that far.


But as I mentioned above the number of keys that the user manages 
outside postgres is different.


Yep, and I do not think that "only one key" approach is good. I really 
missed something in the patch. From a use case point of view, I thought 
that the user could have has many keys has they see fit. Maybe one per 
cluser, or database, or table, or a row if for some reason the application 
would demand it. I do not think that the pg should decide that, among 
other things. That is why I'm constantly refering to a "key identifier", 
and in the pseudo code I added a "local id" (typically an int).



* If this approach supports only GET protocol, the user needs to
create encryption keys with appropriate ids in advance so that
postgres can get keys by id. If postgres TDE creates keys as needed,
CREATE protocol would also be required.


I'm not sure. ISTM that if there is a KMS to manage keys, it could be its
responsability to actually create a key, however the client (pg) would
have to request it, basically say "given me a new key for this id".

This could even work with a "get" command only, if the KMS is expected to
create a new key when asked for a key which does not exists yet. ISTM that
the client could (should?) only have to create identifiers for its keys.


Yeah, it depends on KMS, meaning we need different extensions for 
different KMS. A KMS might support an interface that creates key if not 
exist during GET but some KMS might support CREATE and GET separately.


I disagree that it is necessary, but this is debatable. The KMS-side 
interface code could take care of that, eg:


  if command is "get X"
if (X does not exist in KMS)
  create a new key stored in KMS, return it;
else
  return KMS-stored key;
  ...

So you can still have a "GET" only interface which adapts to the "final"
KMS. Basically, the glue code which implements the interface for the KMS 
can include some logic to adapt to the KMS point of view.



* If we need only GET protocol, the current approach (i.g.



The point of a discussion is basically to present arguments.


My point is the same as Bruce. I'm concerned about the fact that even
if we introduce this approach the present data could still be stolen
when a postgres process is compromised.


Yes, sure.

It seems to me that your approach is extensible and can protect data 
from threats in addition to threats that the current patch can protect 
but it would bring some costs and complexity instead comparing to the 
current patch. I'd like to hear opinions from other hackers in the 
community.


I'd like an extensible design to have anything in core. As I said in an 
other mail, if you want to handle a somehow restricted use case, just 
provide an external extension and do nothing in core, please. Put in core 
something that people with a slightly different use case or auditor can 
build on as well. The current patch makes a dozen hard-coded decisions 
which it should not, IMHO.



I think the actual code would help to explain how your approach is not
complexed.


I provided quite some pseudo code, including some python. I'm not planning 
to redevelop the whole thing: my contribution is a review, currently about 
the overall design, then if somehow I agree on the design, I would look at 
the code more precisely.


--
Fabien.

Re: Internal key management system

2020-06-14 Thread Masahiko Sawada

On Sat, 13 Jun 2020 at 05:59, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> > Summarizing the discussed points so far, I think that the major
> > advantage points of your idea comparing to the current patch's
> > architecture are:
> >
> > * More secure. Because it never loads KEK in postgres processes we can
> > lower the likelihood of KEK leakage.
>
> Yes.
>
> > * More extensible. We will be able to implement more protocols to
> > outsource other operations to the external place.
>
> Yes.
>
> > On the other hand, here are some downsides and issues:
> >
> > * The external place needs to manage more encryption keys than the
> > current patch does.
>
> Why? If the external place is just a separate process on the same host,
> probably it would manage the very same amount as what your patch.

In the current patch, the external place needs to manage only one key
whereas postgres needs to manages multiple DEKs. But with your idea,
the external place needs to manage both KEK and DEKs.

>
> > Some cloud key management services are charged by the number of active
> > keys and key operations. So the number of keys postgres requires affects
> > the charges. It'd be worse if we were to have keys per table.
>
> Possibly. Note that you do not have to use a cloud storage paid as a
> service. However, you could do it if there is an interface, because it
> would allow postgres to do so if the user wishes that. That is the point
> of having an interface that can be implemented differently for different
> use cases.

The same is true for the current patch. The user can get KEK from
anywhere they want using cluster_passphrase_command. But as I
mentioned above the number of keys that the user manages outside
postgres is different.

>
> > * If this approach supports only GET protocol, the user needs to
> > create encryption keys with appropriate ids in advance so that
> > postgres can get keys by id. If postgres TDE creates keys as needed,
> > CREATE protocol would also be required.
>
> I'm not sure. ISTM that if there is a KMS to manage keys, it could be its
> responsability to actually create a key, however the client (pg) would
> have to request it, basically say "given me a new key for this id".
>
> This could even work with a "get" command only, if the KMS is expected to
> create a new key when asked for a key which does not exists yet. ISTM that
> the client could (should?) only have to create identifiers for its keys.

Yeah, it depends on KMS, meaning we need different extensions for
different KMS. A KMS might support an interface that creates key if
not exist during GET but some KMS might support CREATE and GET
separately.

>
> > * If we need only GET protocol, the current approach (i.g.
> > cluser_passphase_command) would be more simple. I imagine the
> > interface between postgres and the extension is C function.
>
> Yes. ISTM that can be pretty simple, something like:
>
> A guc to define the process to start the interface (having a process means
> that its uid can be changed), which would communicate on its stdin/stdout.
>
> A guc to define how to interact with the interface (eg whether DEK are
> retrieved, or whether the interface is to ask for encryption/decryption,
> or possibly some other modes).
>
> A few function:
>
>   - set_key(, );
> # may retrieve the DEK, or only note that local id of some key.
>
>   - encode(, ) -> 
> # may fail if no key is associated to local-id
> # or if the service is down somehow
>
>   - decode(, ) -> 
> # could also fail if there is some integrity check associated
>
> > This approach is more extensible
>
> Yep.
>
> > but it also means extensions need to support multiple protocols, leading
> > to increase complexity and development cost.
>
> I do not understand what you mean by "multiple protocols". For me there is
> one protocol, possibly a few commands in this protocol between client
> (postgres) and DMS. Anyway, sending "GET " to retreive a DEK, for
> instance, does not sound "complex". Here is some pseudo code:
>
> For get_key:
>
>if (mode of operation is to have DEKS locally)
>  try
>send to KMS "get "
>keys[local-id] = answer
>  catch & rethrow possible errors
>elif (mode is to keep DEKs remote)
>  key_id[local-id] = key-id;
>else ...
>
> For encode, the code is basically:
>
>if (has_key(local-id))
>  if (mode of operation is to have DEKs locally)
> return some_encode(key[local-id], data);
>  elif (mode is to keep DEKs remote)
> send to KMS "encode key_id[local-id] data"
> return answer; # or error
>  else ...
>else
>   throw error local-id has no associated key;
>
> decode is more or less the same as encode.
>
> There is another thing to consider is how the client "proves" its identity
> to the KMS interface, which might suggest some provisions when starting a
> process, but you already have things in your patch to deal with the KEK,
> which could be turned into some

Re: Internal key management system

2020-06-12 Thread Fabien COELHO




Hello Bruce.


The question is what should be put in the protocol, and I would tend to
think that some careful design time should be put in it.


I still don't see the value of this vs. its complexity.


Dunno. I'm looking for the value of having such a thing in core.

ISTM that there are no clear design goals of the system, no clear 
description of the target use case(s), no clear explanations of the 
underlying choices (in something like a README), no saying what it 
achieves and what it does not achieve. It is only code.


If the proposed thing is very specific to one use case, which may be more 
or less particular, then I'd say the stuff should really be an external 
extension, and you do not need to ask for a review. Call it pgcryptoXYZ 
and it is done.


However, if the stuff is amenable to many/more use cases, then it may 
still be an extension because it is specialized somehow and not everyone 
would like to have it if they do not use it, but having it in core would 
be much more justified. Also, it would have to be a little more "complex" 
to be extensible, sure. I do not think that it needs to be very complex in 
the end, but it needs to be carefully designed to be extensible.


Note I still do not see why it should be in core directly, i.e. not an 
extension. I'm yet to see a convincing argument about that.


--
Fabien.

Re: Internal key management system

2020-06-12 Thread Fabien COELHO



Hello Masahiko-san,


Summarizing the discussed points so far, I think that the major
advantage points of your idea comparing to the current patch's
architecture are:

* More secure. Because it never loads KEK in postgres processes we can
lower the likelihood of KEK leakage.


Yes.


* More extensible. We will be able to implement more protocols to
outsource other operations to the external place.


Yes.


On the other hand, here are some downsides and issues:

* The external place needs to manage more encryption keys than the
current patch does.


Why? If the external place is just a separate process on the same host, 
probably it would manage the very same amount as what your patch.


Some cloud key management services are charged by the number of active 
keys and key operations. So the number of keys postgres requires affects 
the charges. It'd be worse if we were to have keys per table.


Possibly. Note that you do not have to use a cloud storage paid as a 
service. However, you could do it if there is an interface, because it 
would allow postgres to do so if the user wishes that. That is the point 
of having an interface that can be implemented differently for different 
use cases.



* If this approach supports only GET protocol, the user needs to
create encryption keys with appropriate ids in advance so that
postgres can get keys by id. If postgres TDE creates keys as needed,
CREATE protocol would also be required.


I'm not sure. ISTM that if there is a KMS to manage keys, it could be its 
responsability to actually create a key, however the client (pg) would 
have to request it, basically say "given me a new key for this id".


This could even work with a "get" command only, if the KMS is expected to 
create a new key when asked for a key which does not exists yet. ISTM that 
the client could (should?) only have to create identifiers for its keys.



* If we need only GET protocol, the current approach (i.g.
cluser_passphase_command) would be more simple. I imagine the
interface between postgres and the extension is C function.


Yes. ISTM that can be pretty simple, something like:

A guc to define the process to start the interface (having a process means 
that its uid can be changed), which would communicate on its stdin/stdout.


A guc to define how to interact with the interface (eg whether DEK are 
retrieved, or whether the interface is to ask for encryption/decryption, 
or possibly some other modes).


A few function:

 - set_key(, );
   # may retrieve the DEK, or only note that local id of some key.

 - encode(, ) -> 
   # may fail if no key is associated to local-id
   # or if the service is down somehow

 - decode(, ) -> 
   # could also fail if there is some integrity check associated


This approach is more extensible


Yep.

but it also means extensions need to support multiple protocols, leading 
to increase complexity and development cost.


I do not understand what you mean by "multiple protocols". For me there is 
one protocol, possibly a few commands in this protocol between client 
(postgres) and DMS. Anyway, sending "GET " to retreive a DEK, for 
instance, does not sound "complex". Here is some pseudo code:


For get_key:

  if (mode of operation is to have DEKS locally)
try
  send to KMS "get "
  keys[local-id] = answer
catch & rethrow possible errors
  elif (mode is to keep DEKs remote)
key_id[local-id] = key-id;
  else ...

For encode, the code is basically:

  if (has_key(local-id))
if (mode of operation is to have DEKs locally)
   return some_encode(key[local-id], data);
elif (mode is to keep DEKs remote)
   send to KMS "encode key_id[local-id] data"
   return answer; # or error
else ...
  else
 throw error local-id has no associated key;

decode is more or less the same as encode.

There is another thing to consider is how the client "proves" its identity 
to the KMS interface, which might suggest some provisions when starting a 
process, but you already have things in your patch to deal with the KEK, 
which could be turned into some generic auth.



* This approach necessarily doesn’t eliminate the data leakage threat
completely caused by process compromisation.


Sure, if the process as decrypted data or DEK or whatever, then the 
process compromission leaks these data. My point is to try to limit the 
leakage potential of a process compromission.



Since DEK is placed in postgres process memory,


May be placed, depending on the mode of operation.

it’s still possible that if a postgres process is compromised the 
attacker can steal database data.


Obviously. This cannot be helped if pg is to hold unencrypted data.

The benefit of lowering the possibility of KEK leakage is to deal with 
the threat that the attacker sees database data encrypted by past or 
future DEK protected by the stolen KEK.


Yes.


* An open question is, as you previously mentioned, how to verify the
key obtained from the external place is the

Re: Internal key management system

2020-06-12 Thread Bruce Momjian

On Fri, Jun 12, 2020 at 09:17:37AM +0200, Fabien COELHO wrote:
> The question is what should be put in the protocol, and I would tend to
> think that some careful design time should be put in it.

I still don't see the value of this vs. its complexity.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-06-12 Thread Masahiko Sawada

On Fri, 12 Jun 2020 at 16:17, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> I'm not sure I understood your concern. I try to answer below.
>
> > If I understand your idea correctly we put both DEK and KEK
> > "elsewhere", and a postgres process gets only DEK from it.
>
> Yes, that is one of the option.
>
> > It seems to me this idea assumes that the place storing encryption keys
> > employees 2-tire key hierarchy or similar thing.
>
> ISTM that there is no such assumption. There is the assumption that there
> is an interface to retrieve DEK. What is done being the interface to
> retrieve this DEK should be irrelevant to pg. Having them secure by a
> KEK looks like an reasonable design, though. Maybe keys are actually
> stored. Maybe thay are computed based on something, eg key identifier and
> some secret. Maybe there is indeed a 2-tier something. Maybe whatever.
>
> > What if the user wants to or has to manage a single encryption key?
>
> Then it has one key identifier and it retrieve one key from the DMS.
> Having a "management system" for a singleton looks like overkill though,
> but it should work.
>
> > For example, storing an encryption key for PostgreSQL TDE into a file in
> > a safe server instead of KMS using DEK and KEK because of budgets or
> > requirements whatever.
>
> Good. If you have an interface to retrieve a key, then it can probably
> contact said server to get it when needed?
>
> > In this case, if the user does key rotation, that encryption key would
> > need to be rotated, resulting in the user would need to re-encrypt all
> > database data encrypted with old key.
>
> Sure, by definition actually changing the key requires a
> decryption/encryption cycle on all data.
>
> > It should work but what do you think about how postgres does key
> > rotation and re-encryption?
>
> If pg actually has the DEK, then it means that while the re-encryption is
> performed it has to manage two keys simultenaously, this is a question for
> what is done on pg server with the keys, not really about the DMS ?

Yes. Your explanation made my concern clear. Thanks!

>
> If the "elsewhere" service does the encryption, maybe the protocol could
> include it, eg something like:
>
> REC key1-id key2-id data-encrypted-with-key1
>   -> data-encrypted-with-key2
>
> But it could also achieve the same thing with two commands, eg:
>
> DEC key1-id data-encrypted-with-key1
>   -> clear-text-data
>
> ENC key2-id clear-text-data
>   -> data-encrypted-with-key2
>
> The question is what should be put in the protocol, and I would tend to
> think that some careful design time should be put in it.
>

Summarizing the discussed points so far, I think that the major
advantage points of your idea comparing to the current patch's
architecture are:

* More secure. Because it never loads KEK in postgres processes we can
lower the likelihood of KEK leakage.
* More extensible. We will be able to implement more protocols to
outsource other operations to the external place.

On the other hand, here are some downsides and issues:

* The external place needs to manage more encryption keys than the
current patch does. Some cloud key management services are charged by
the number of active keys and key operations. So the number of keys
postgres requires affects the charges. It'd be worse if we were to
have keys per table.

* If this approach supports only GET protocol, the user needs to
create encryption keys with appropriate ids in advance so that
postgres can get keys by id. If postgres TDE creates keys as needed,
CREATE protocol would also be required.

* If we need only GET protocol, the current approach (i.g.
cluser_passphase_command) would be more simple. I imagine the
interface between postgres and the extension is C function. This
approach is more extensible but it also means extensions need to
support multiple protocols, leading to increase complexity and
development cost.

* This approach necessarily doesn’t eliminate the data leakage threat
completely caused by process compromisation. Since DEK is placed in
postgres process memory, it’s still possible that if a postgres
process is compromised the attacker can steal database data. The
benefit of lowering the possibility of KEK leakage is to deal with the
threat that the attacker sees database data encrypted by past or
future DEK protected by the stolen KEK.

* An open question is, as you previously mentioned, how to verify the
key obtained from the external place is the right key.

Anything else we need to note?

Finally, please understand I’m not controverting your idea but just
trying to understand which approach is better from multiple
perspectives.

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-06-12 Thread Fabien COELHO




Hello Masahiko-san,

I'm not sure I understood your concern. I try to answer below.


If I understand your idea correctly we put both DEK and KEK
"elsewhere", and a postgres process gets only DEK from it.


Yes, that is one of the option.

It seems to me this idea assumes that the place storing encryption keys 
employees 2-tire key hierarchy or similar thing.


ISTM that there is no such assumption. There is the assumption that there 
is an interface to retrieve DEK. What is done being the interface to 
retrieve this DEK should be irrelevant to pg. Having them secure by a 
KEK looks like an reasonable design, though. Maybe keys are actually 
stored. Maybe thay are computed based on something, eg key identifier and 
some secret. Maybe there is indeed a 2-tier something. Maybe whatever.



What if the user wants to or has to manage a single encryption key?


Then it has one key identifier and it retrieve one key from the DMS. 
Having a "management system" for a singleton looks like overkill though, 
but it should work.


For example, storing an encryption key for PostgreSQL TDE into a file in 
a safe server instead of KMS using DEK and KEK because of budgets or 
requirements whatever.


Good. If you have an interface to retrieve a key, then it can probably 
contact said server to get it when needed?



In this case, if the user does key rotation, that encryption key would
need to be rotated, resulting in the user would need to re-encrypt all
database data encrypted with old key.


Sure, by definition actually changing the key requires a 
decryption/encryption cycle on all data.


It should work but what do you think about how postgres does key 
rotation and re-encryption?


If pg actually has the DEK, then it means that while the re-encryption is 
performed it has to manage two keys simultenaously, this is a question for 
what is done on pg server with the keys, not really about the DMS ?


If the "elsewhere" service does the encryption, maybe the protocol could 
include it, eg something like:


REC key1-id key2-id data-encrypted-with-key1
 -> data-encrypted-with-key2

But it could also achieve the same thing with two commands, eg:

DEC key1-id data-encrypted-with-key1
 -> clear-text-data

ENC key2-id clear-text-data
 -> data-encrypted-with-key2

The question is what should be put in the protocol, and I would tend to 
think that some careful design time should be put in it.


--
Fabien.

Re: Internal key management system

2020-06-12 Thread Masahiko Sawada

On Thu, 11 Jun 2020 at 20:03, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> >> If the KEK is ever present in pg process, it presumes that the threat
> >> model being addressed allows its loss if the process is compromised, i.e.
> >> all (past, present, future) security properties are void once the process
> >> is compromised.
> >
> > Why we should not put KEK in pg process but it's okay for other
> > processes?
>
> My point is "elsewhere".
>
> Indeed, it could be on another process on the same host, in which case I'd
> rather have the process run under a different uid, which means another
> compromission would be required if pg is compromissed locally ; it could
> also be in a process on another host ; it could be on some special
> hardware. Your imagination is the limit.
>
> > I guess you're talking about a threat when a malicious user logged in OS
> > (or at least accessible) but I thought there is no difference between pg
> > process and other processes in terms of the process being compromised.
>
> Processes are isolated based on uid, unless root is compromised. Once a id
> is compromised (eg "postgres"), the hacker has basically access to all
> files and processes accessible to that id.
>
> > So the solution, in that case, would be to outsource
> > encryption/decryption to external servers as Bruce mentioned.
>
> Hosting stuff (keys, encryption) on another server is indeed an option if
> "elsewhere" is implemented.

If I understand your idea correctly we put both DEK and KEK
"elsewhere", and a postgres process gets only DEK from it. It seems to
me this idea assumes that the place storing encryption keys employees
2-tire key hierarchy or similar thing. What if the user wants to or
has to manage a single encryption key? For example, storing an
encryption key for PostgreSQL TDE into a file in a safe server instead
of KMS using DEK and KEK because of budgets or requirements whatever.
In this case, if the user does key rotation, that encryption key would
need to be rotated, resulting in the user would need to re-encrypt all
database data encrypted with old key. It should work but what do you
think about how postgres does key rotation and re-encryption?

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-06-11 Thread Fabien COELHO




Hello Masahiko-san,


If the KEK is ever present in pg process, it presumes that the threat
model being addressed allows its loss if the process is compromised, i.e.
all (past, present, future) security properties are void once the process
is compromised.


Why we should not put KEK in pg process but it's okay for other
processes?


My point is "elsewhere".

Indeed, it could be on another process on the same host, in which case I'd 
rather have the process run under a different uid, which means another 
compromission would be required if pg is compromissed locally ; it could 
also be in a process on another host ; it could be on some special 
hardware. Your imagination is the limit.


I guess you're talking about a threat when a malicious user logged in OS 
(or at least accessible) but I thought there is no difference between pg 
process and other processes in terms of the process being compromised.


Processes are isolated based on uid, unless root is compromised. Once a id 
is compromised (eg "postgres"), the hacker has basically access to all 
files and processes accessible to that id.


So the solution, in that case, would be to outsource 
encryption/decryption to external servers as Bruce mentioned.


Hosting stuff (keys, encryption) on another server is indeed an option if 
"elsewhere" is implemented.



From a design point of view:


 0. KEK, DEK & crypto are managed by pg

 1. DEK & crypto are managed by pg,
but KEK is outside pg.

 2. eveything is managed out of pg.

I think that both 1 & 2 are valid options, which do not require the same 
interface. If you have 1, you can do 0 by giving KEK to a pg process.


How DEK are identified and created with the KEK should also be something 
open, left to the implementation, the interface should not need to know.


--
Fabien.

Re: Internal key management system

2020-06-11 Thread Masahiko Sawada

On Thu, 11 Jun 2020 at 16:07, Fabien COELHO  wrote:
>
>
> Hello Bruce,
>
> Sorry for the length (yet again) of this answer, I'm trying to make my
> point as clear as possible.

Thank you for your explanation!

>
> >> Obviously it requires some more thinking and design, but my point is that
> >> postgres should not hold a KEK, ever, nor presume how DEK are to be managed
> >> by a DMS, and that is not very difficult to achieve by putting it outside 
> >> of
> >> pg and defining how interactions take place. Providing a reference/example
> >> implementation would be nice as well, and Masahiko-san code can be 
> >> rewrapped
> >> quite easily.
> >
> > Well, the decrypted keys are already stored in backend memory,
>
> The fact that if the pg process is compromised then the DEK and data
> encrypted are compromised is more or less unavoidable (maybe only the data
> could be compromised though, but not the DEK, depending on how the
> encryption/decryption operates).
>
> > so what risk does haveing the KEK in memory for a brief period avoid?
>
> My understanding is that the KEK does not protect one key, but all keys,
> thus all data, possibly even past or future, so it loss impacts more than
> the here and now local process.
>
> If the KEK is ever present in pg process, it presumes that the threat
> model being addressed allows its loss if the process is compromised, i.e.
> all (past, present, future) security properties are void once the process
> is compromised.

Why we should not put KEK in pg process but it's okay for other
processes? I guess you're talking about a threat when a malicious user
logged in OS (or at least accessible) but I thought there is no
difference between pg process and other processes in terms of the
process being compromised. So the solution, in that case, would be to
outsource encryption/decryption to external servers as Bruce
mentioned.

Regards,


--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-06-11 Thread Fabien COELHO




Hello Bruce,

Sorry for the length (yet again) of this answer, I'm trying to make my 
point as clear as possible.



Obviously it requires some more thinking and design, but my point is that
postgres should not hold a KEK, ever, nor presume how DEK are to be managed
by a DMS, and that is not very difficult to achieve by putting it outside of
pg and defining how interactions take place. Providing a reference/example
implementation would be nice as well, and Masahiko-san code can be rewrapped
quite easily.


Well, the decrypted keys are already stored in backend memory,


The fact that if the pg process is compromised then the DEK and data 
encrypted are compromised is more or less unavoidable (maybe only the data 
could be compromised though, but not the DEK, depending on how the 
encryption/decryption operates).



so what risk does haveing the KEK in memory for a brief period avoid?


My understanding is that the KEK does not protect one key, but all keys, 
thus all data, possibly even past or future, so it loss impacts more than 
the here and now local process.


If the KEK is ever present in pg process, it presumes that the threat 
model being addressed allows its loss if the process is compromised, i.e. 
all (past, present, future) security properties are void once the process 
is compromised.


This may be okay for some use cases, but I can easily foresee that it 
would not be for all. I can think of use cases where the user/auditor says 
that the KEK should be elsewhere, and I would tend to agree.


So my point is that the implementation should allow it, i.e. define a 
simple interface, and possibly a reference implementation with good 
properties which might fit some/typical security requirements, and the 
patch mostly fits that need, but for the possible isolation of the KEK.


ISTM that a reasonable point of comparision is the design of kerberos, 
with a central authority/server which authenticate parties and allow them 
to authenticate one another and communicate securely.


The design means that any compromised host/service would compromise all 
its interaction with other parties, but not communications between third 
parties. The compromission stays local, with the exception is the kerberos 
server itself, which somehow holds all the keys.


For me the KEK is basically the kerberos server, you should provide means 
to allow the user to isolate that where they think it should be, and not 
enforce that it is within postgres process.


Another point is that what I suggest does not seem very hard from an 
implementation point of view, and allows extensibility, which is also a 
win.


Lastly, I still think that all this, whatever the design, should be 
packaged as an extension, unless it is really impossible to do so. I would 
also try to separate the extension for KMS interaction with the extension 
for actually using the keys, so that the user could change the underlying 
cryptographic primitives as they see fit.


--
Fabien.

Re: Internal key management system

2020-06-10 Thread Bruce Momjian

On Fri, Jun  5, 2020 at 03:34:54PM +0200, Fabien COELHO wrote:
> Obviously it requires some more thinking and design, but my point is that
> postgres should not hold a KEK, ever, nor presume how DEK are to be managed
> by a DMS, and that is not very difficult to achieve by putting it outside of
> pg and defining how interactions take place. Providing a reference/example
> implementation would be nice as well, and Masahiko-san code can be rewrapped
> quite easily.

Well, the decrypted keys are already stored in backend memory, so what
risk does haveing the KEK in memory for a brief period avoid?

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-06-05 Thread Fabien COELHO



Hello Bruce,

Hmmm. This seels to suggest that interacting with something outside 
should be an option.


Our goal is not to implement every possible security idea someone has,
because we will never finish, and the final result would be too complex
to be unable.


Sure. I'm trying to propose something both simple and extensible, so that 
other people could plug their own KMS if they are not fully satisfied with 
the way the internal pg KMS works, which IMHO should be the case if 
someone is motivated and paranoid enough to setup a KMS in the first 
place.


You will need to explain exactly why having a separate process has value 
over coding/user complexity, and you will need to get agreement from a 
sufficient number of people to move that idea forward.


ISTM that the value is simple: The whole KMS idea turns around a "KEK", 
which is a secret key which allows to unlock/retrieve/recompute many data 
keys, aka DEKs. Loosing the KEK basically means loosing all data keys, 
past, present and possibly future, depending on how the KEK/DEK mechanism 
operates internally.


So the thing you should not want is to lose your KEK.

Keeping it inside pg process means that any pg process compromision would 
result in the KEK to be compromised as well, while the whole point of 
doing this KMS business was to provide security by isolating realms of 
data encryption.


If you provide an interface instead, which I'm advocating, then where the 
KEK is does not concern pg, which has just to ask for DEKs. A compromise 
of pg would compromise local DEKs, but not the KEK "master key". The KEK 
may be somewhere on the same host, in another process, or possibly on 
another host, on some attached specialized quantum hardware inacessible to 
human beings. Postgres should not decide where the user should put its 
KEK, because it would depend on the threat model being addressed that you 
do not know.


From an implementation point of view, what I suggest is reasonably simple 
and allows people to interface with the KMS of their choice, including one 
based on the patch, which would be a demos about what can be done, but 
other systems would be accessible just as well. The other software 
engineering aspect is that a KMS is a complex/sensitive thing, so 
reinventing our own and forcing it on users seems like a bad idea.


From what I understood from the code, the KEK is loaded into postgres 
process. That is what I'm disagreeing with, only needed DEK should be 
there.


One option would be to send the data needing to be encrypted to an
external command, and get the decrypted data back.  In that way, the KEK
is never on the Postgres server.  However, the API for doing such an
interface seems very complex and could lead to failures.


I was more thinking of an interface to retrieve DEKs, but to still keep 
encryption/decryption inside postgres, to limit traffic, but what you 
suggest could be a valid option, so maybe should be allowed.


I disagree with the implementation complexity, though.

Basically the protocol only function is sending "GET 
" and retrieving a response which is either the DEK 
or an error, which looks like a manageable complexity. Attached a 
simplistic server-side implementation of that for illustration.


If you want externalized DEK as well, it would be sending "ENC/DEC 
 " and the response is an error, or the translated 
data. Looks manageable as well. Allowing both approaches looks ok.


Obviously it requires some more thinking and design, but my point is that 
postgres should not hold a KEK, ever, nor presume how DEK are to be 
managed by a DMS, and that is not very difficult to achieve by putting it 
outside of pg and defining how interactions take place. Providing a 
reference/example implementation would be nice as well, and Masahiko-san 
code can be rewrapped quite easily.


--
Fabien.#! /usr/bin/env python3
#
# This (powerful) code is public domain.

import sys
from hashlib import sha256 as H

# obviously the KEK should be retrieved from somewhere else
KEK = b"super secret default KEK"

while True:
try: # parse GET 0xdeadbeef
req = sys.stdin.readline())
assert req[0:6] == "GET 0x" and req[-1] == "\n"
hdata = bytes.fromhex(req[6:-1])
print(f"DEK 0x{H(hdata + KEK).hexdigest()}")
except Exception as e:
print(f"ERR {e}")

Re: Internal key management system

2020-06-03 Thread Bruce Momjian

On Wed, Jun  3, 2020 at 09:16:03AM +0200, Fabien COELHO wrote:
> > > Also, I'm not at fully at ease with some of the underlying principles
> > > behind this proposal. Are we re-inventing/re-implementing kerberos or
> > > whatever? Are we re-implementing a brand new KMS inside pg? Why having
> > > our own?
> > 
> > As I explained above, this key manager is for managing internal keys
> > used by TDE. It's not an alternative to existing key management
> > solutions/services.
> 
> Hmmm. This seels to suggest that interacting with something outside should
> be an option.

Our goal is not to implement every possible security idea someone has,
because we will never finish, and the final result would be too complex
to be unable.  You will need to explain exactly why having a separate
process has value over coding/user complexity, and you will need to get
agreement from a sufficient number of people to move that idea forward.

> > The key rotation this key manager has is KEK rotation, which is very
> > important. Without KEK rotation, when KEK is leaked an attacker can
> > get database data by disk theft. Since KEK is responsible for
> > encrypting all internal keys it's necessary to re-encrypt the internal
> > keys when KEK is rotated. I think PG is the only role that can do that
> > job.
> 
> I'm not claiming that KEK rotation is a bad thing, I'm saying that it should
> not be postgres problem. My issue is where you put the thing, not about the
> thing itself.
> 
> > I think this key manager satisfies the fist point by
> > cluster_passphrase_command. For the second point, the key manager
> > stores local keys inside PG while protecting them by KEK managed
> > outside of PG.
> 
> I do not understand. From what I understood from the code, the KEK is loaded
> into postgres process. That is what I'm disagreeing with, only needed DEK
> should be there.

One option would be to send the data needing to be encrypted to an
external command, and get the decrypted data back.  In that way, the KEK
is never on the Postgres server.  However, the API for doing such an
interface seems very complex and could lead to failures.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

Re: Internal key management system

2020-06-03 Thread Masahiko Sawada

On Wed, 3 Jun 2020 at 16:16, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> > This key manager is aimed to manage cryptographic keys used for
> > transparent data encryption. As a result of the discussion, we
> > concluded it's safer to use multiple keys to encrypt database data
> > rather than using one key to encrypt the whole thing, for example, in
> > order to make sure different data is not encrypted with the same key
> > and IV. Therefore, in terms of TDE, the minimum requirement is that
> > PostgreSQL can use multiple keys.
> >
> > Using multiple keys in PG, there are roughly two possible designs:
> >
> > 1. Store all keys for TDE into the external KMS, and PG gets them from
> > it as needed.
>
> +1

In this approach, encryption keys obtained from the external KMS are
directly used to encrypt/decrypt data. What KEK and DEK are you
referring to in this approach?

>
> > 2. PG manages all keys for TDE inside and protect these keys on disk
> > by the key (i.g. KEK) stored in the external KMS.
>
> -1, this is the one where you would need arguing.
>
> > There are pros and cons to each design. If I take one cons of #1 as an
> > example, the operation between PG and the external KMS could be
> > complex. The operations could be creating, removing and rotate key and
> > so on.
>
> ISTM that only create (delete?) are really needed. Rotating is the problem
> of the KMS itself, thus does not need to be managed by pg under #1.

With your idea how is the key rotation going to be performed? After
invoking key rotation on the external KMS we need to re-encrypt all
data encrypted with the old keys? Or you assume that the external KMS
employes something like 2-tier key hierarchy?

>
> > We can implement these operations in an extension to interact
> > with different kinds of external KMS, and perhaps we can use KMIP.
>
> I would even put that (KMIP protocol stuff) outside pg core.
>
> Even under #2, if some KMS is implemented and managed by pg, I would put
> the stuff in a separate process which I would probably run with a
> different uid, so that the KEK is not accessible directly by pg, ever.
>
> Once KMS interactions are managed with an outside process, then what this
> process does becomes an interface, and whether this process actually
> manages the keys or discuss with some external KMS with some KMIP or
> whatever is irrelevant to pg. Providing an interface means that anyone
> could implement their KMS fitting their requirements if they comply with
> the interface/protocol.

Just to be clear we don't keep KEK on neither shared memory nor disk.
Postmaster and a backend who executes pg_rotate_cluster_passphrase()
get KEK and use it to (re-)encrypt internal keys. But after that they
immediately free it. The encryption keys we need to store inside
PostgreSQL are DEK.

>
> Note that I'd be fine with having the current implementation somehow
> wrapped up as an example KMS.
>
> > But the development cost could become high because we might need
> > different extensions for each key management solutions/services.
>
> Yes and no. What I suggest is, I think, pretty simple, and I think I can
> implement it in a few line of script, so the cost is not high, and having
> a separate process looks, to me, like a security win and an extensibility
> win (i.e. another implementation can be provided).

How can we get multiple keys from the external KMS? I think we will
need to save something like identifiers for each encryption key
Postgres needs in the core and ask the external KMS for the key by the
identifier via an extension. Is that right?

>
> > #2 is better at that point; the interaction between PG and KMS is only
> > GET.
>
> I think that it could be the same with #1. I think that having a separate
> process is a reasonable security requirement, and if you do that #1 and #2
> are more or less the same.
>
> > Other databases employes a similar approach are SQL Server and DB2.
>
> Too bad for them:-) I'd still disagree with having the master key inside
> the database process, even if Microsoft, IBM and Oracle think it is a good
> idea.
>
> > In terms of the necessity of introducing the key manager into PG core,
> > I think at least TDE needs to be implemented in PG core. And as this
> > key manager is for managing keys for TDE, I think the key manager also
> > needs to be introduced into the core so that TDE functionality doesn't
> > depend on external modules.
>
> Hmmm.
>
> My point is that only interactions should be in core.
>
> The implementation could be in core, but as a separate process.
>
> I agree that pg needs to be able to manage the DEK, so it needs to store
> data keys.
>
> I still do not understand why an extension, possibly distributed with pg,
> would not be ok. There may be good arguments for that, but I do not think
> you provided any yet.

Hmm I think I don't fully understand your idea yet. With the current
patch, KEK could be obtained by either postmaster or backend processs
who execute

Re: Internal key management system

2020-06-03 Thread Fabien COELHO



Hello Masahiko-san,


This key manager is aimed to manage cryptographic keys used for
transparent data encryption. As a result of the discussion, we
concluded it's safer to use multiple keys to encrypt database data
rather than using one key to encrypt the whole thing, for example, in
order to make sure different data is not encrypted with the same key
and IV. Therefore, in terms of TDE, the minimum requirement is that
PostgreSQL can use multiple keys.

Using multiple keys in PG, there are roughly two possible designs:

1. Store all keys for TDE into the external KMS, and PG gets them from
it as needed.


+1


2. PG manages all keys for TDE inside and protect these keys on disk
by the key (i.g. KEK) stored in the external KMS.


-1, this is the one where you would need arguing.


There are pros and cons to each design. If I take one cons of #1 as an
example, the operation between PG and the external KMS could be
complex. The operations could be creating, removing and rotate key and
so on.


ISTM that only create (delete?) are really needed. Rotating is the problem 
of the KMS itself, thus does not need to be managed by pg under #1.



We can implement these operations in an extension to interact
with different kinds of external KMS, and perhaps we can use KMIP.


I would even put that (KMIP protocol stuff) outside pg core.

Even under #2, if some KMS is implemented and managed by pg, I would put 
the stuff in a separate process which I would probably run with a 
different uid, so that the KEK is not accessible directly by pg, ever.


Once KMS interactions are managed with an outside process, then what this 
process does becomes an interface, and whether this process actually 
manages the keys or discuss with some external KMS with some KMIP or 
whatever is irrelevant to pg. Providing an interface means that anyone 
could implement their KMS fitting their requirements if they comply with 
the interface/protocol.


Note that I'd be fine with having the current implementation somehow 
wrapped up as an example KMS.


But the development cost could become high because we might need 
different extensions for each key management solutions/services.


Yes and no. What I suggest is, I think, pretty simple, and I think I can 
implement it in a few line of script, so the cost is not high, and having 
a separate process looks, to me, like a security win and an extensibility 
win (i.e. another implementation can be provided).



#2 is better at that point; the interaction between PG and KMS is only
GET.


I think that it could be the same with #1. I think that having a separate 
process is a reasonable security requirement, and if you do that #1 and #2 
are more or less the same.



Other databases employes a similar approach are SQL Server and DB2.


Too bad for them:-) I'd still disagree with having the master key inside 
the database process, even if Microsoft, IBM and Oracle think it is a good 
idea.



In terms of the necessity of introducing the key manager into PG core,
I think at least TDE needs to be implemented in PG core. And as this
key manager is for managing keys for TDE, I think the key manager also
needs to be introduced into the core so that TDE functionality doesn't
depend on external modules.


Hmmm.

My point is that only interactions should be in core.

The implementation could be in core, but as a separate process.

I agree that pg needs to be able to manage the DEK, so it needs to store 
data keys.


I still do not understand why an extension, possibly distributed with pg, 
would not be ok. There may be good arguments for that, but I do not think 
you provided any yet.



Also, I'm not at fully at ease with some of the underlying principles
behind this proposal. Are we re-inventing/re-implementing kerberos or
whatever? Are we re-implementing a brand new KMS inside pg? Why having
our own?


As I explained above, this key manager is for managing internal keys
used by TDE. It's not an alternative to existing key management
solutions/services.


Hmmm. This seels to suggest that interacting with something outside should 
be an option.



The requirements of this key manager are generating internal keys,
letting other PG components use them, protecting them by KEK when
persisting,


If you want that, I'd still argue that you should have a separate process.


and support KEK rotation. It doesn’t have a feature like
allowing users to store arbitrary keys into this key manager, like
other key management solutions/services have.


Hmmm.


I agree that the key used to encrypt data must not be placed in the
same host. But it's true only when the key is not protected, right?


The DEK is needed when encrypting and decrypting, obviously, so it would 
be there once obtained, it cannot be helped. My concern is about the KEK, 
which AFAICS in your code is somewhere in memory accessible by the 
postgres process, which is a no go for me.


The definition of "protected" is fuzzy, it would depend on what the user

Re: Internal key management system

2020-06-03 Thread Masahiko Sawada

On Wed, 3 Jun 2020 at 08:30, Cary Huang  wrote:
>
> Hi
>
>
>
> I took a step back today and started to think about the purpose of internal 
> KMS and what it is supposed to do, and how it compares to external KMS. Both 
> are intended to manage the life cycle of encryptions keys including their 
> generation, protection, storage and rotation. External KMS, on the other 
> hand, is a more centralized but expensive way to manage encryption life 
> cycles and many deployment actually starts with internal KMS and later 
> migrate to external one.
>
>
>
> Anyhow, the design and implementation of internal KMS should circle around 
> these stages of key lifecycle.
>
>
>
> 1. Key Generation -  Yes, internal KMS module generates keys using pseudo 
> random function, but only the keys for TDE and  SQL level keys. Users cannot 
> request new key generation
> 2. Key Protection - Yes, internal KMS wrap all keys with KEK and HMAC hash 
> derived from a cluster passphrase
> 3. Key Storage - Yes, the wrapped keys are stored in the cluster
> 4. Key Rotation - Yes, internal KMS has a SQL method to swap out cluster 
> passphrase, which rotates the KEK and HMAC key
>
>
>
> I am saying this, because I want to make sure we can all agree on the scope 
> of internal KMS. Without clear scope, this KMS development will seem to go on 
> forever.
>
>

Yes, the internal KMS is not an alternative to external KMS such as
AWS KMS, SafeNet Key Secure, and Vault, but a PostgreSQL internal
component that can work with these external solutions (via
cluster_passphrase_command). It's the same position as our other
management components such as bufmgr, smgr, and lmgr.

I agree with this scope. It manages only encryption keys used by PostgreSQL.

>
> In this patch, the internal KMS exposes pg_encrypt() and pg_decrypt() (was 
> pg_wrap() and pg_unwrap() before) to the user to turn a clear text password 
> into some sort of key material based on the SQL level key generated at 
> initdb. This is used so the user does not have to provide clear text password 
> to pgp_sym_encrypt() provided by pgcrypto extension. The intention is good, I 
> understand, but I don't think it is within the scope of KMS and it is 
> definitely not within the scope of TDE either.
>

I agree that neither pg_encrypt() nor pg_decrypt() is within the scope
of KMS and TDE. That's why I've split the patch, and that's why I
renamed to pg_encrypt() and pg_decrypt() to clarify the purpose of
these functions is not key management. Key wrapping and unwrapping is
one of the usages of these functions.

I think we can use the internal KMS for several purposes. It can
manage encryption keys not only for cluster-wide TDE but also, for
example, for column-level TDE and encryption SQL functions.
pg_encrypt() and pg_decrypt() are one example of the usage of the
internal KMS. Originally since we thought KMS and TDE are not
introduced at the same release, the idea is come up with so that users
can use KMS functionality with some interface. Therefore these SQL
functions are not within the scope of KMS and it should be fine with
introducing the internal KMS having 0 keys.

> Even if the password can be passed into pgp_sym_encrypt() securely by using 
> pg_decrypt() function, the pgp_sym_encrypt() still will have to take this 
> password and derive into an encryption key using algorithms that internal KMS 
> does not manage currently. This kind of defeats the purpose of internal KMS. 
> So simply using pg_encrypt() and pg_decrypt() is not really a solution to 
> pgcrypto's limitation.

Yeah, when using pgcrypto, user must manage their encryption keys. The
internal KMS doesn't help that because it manages only keys internally
used. What pg_encrypt() and pg_decrypt() can help is only to hide the
password from server logs.

> This should be in another topic/project that is aimed to improve pgcrypto by 
> integrating it with the internal KMS, similar to TDE where it also has to 
> integrate with the internal KMS later.
>

Agreed.

> So for internal KMS, the only cryptographic functions needed for now is 
> kmgr_wrap_key() and kmgr_unwrap_key() to encapsulate and restore the 
> encryptions keys to satisfy the "key protection" life cycle stage. I don't 
> think pg_encrypt() and pg_decrypt() should be part of internal KMS.
>

Agreed.

>
> Anyways, I have also reviewed the patch and have a few comments below:
>
>
>
> (1)
>
> The ciphering algorithm in my opinion should contain the algorithm name, key 
> length and block cipher mode, which is similar to openssl's definition.
>
>
>
> Instead of defining a cipher as  PG_CIPHER_AES_CBC, and have key length as 
> separate parameter, I would define them as
>
> #define PG_CIPHER_AES128_CBC 0
>
> #define PG_CIPHER_AES256_CBC 1
>
> #define PG_CIPHER_AES128_CTR 2
>
> #define PG_CIPHER_AES256_CTR 3
>

Agreed. I was concerned that we will end up having many IDs in the
future for example when porting pgcrypto functions into core but I'm
okay with that change.

>
>

Re: Internal key management system

2020-06-02 Thread Cary Huang

Hi

 

I took a step back
today and started to think about the purpose of internal KMS and what it is
supposed to do, and how it compares to external KMS. Both are intended to
manage the life cycle of encryptions keys including their generation,
protection, storage and rotation. External KMS, on the other hand, is a more
centralized but expensive way to manage encryption life cycles and many
deployment actually starts with internal KMS and later migrate to external one.

 

Anyhow, the design
and implementation of internal KMS should circle around these stages of key
lifecycle.

 

1. Key Generation -  Yes, internal KMS module generates keys
 using pseudo random function, but only the keys for TDE and  SQL level 
keys. Users cannot request new
 key generation

2. Key Protection - Yes,
 internal KMS wrap all keys with KEK and HMAC hash derived from a cluster
 passphrase

3. Key Storage - Yes, the
 wrapped keys are stored in the cluster

4. Key Rotation - Yes, internal
 KMS has a SQL method to swap out cluster passphrase, which rotates the KEK
 and HMAC key

 

I am
saying this, because I want to make sure we can all agree on the scope of
internal KMS. Without clear scope, this KMS development will seem to go on
forever.

 

In this
patch, the internal KMS exposes pg_encrypt() and pg_decrypt() (was pg_wrap()
and pg_unwrap() before) to the user to turn a clear text password into some
sort of key material based on the SQL level key generated at initdb. This is
used so the user does not have to provide clear text password to
pgp_sym_encrypt() provided by pgcrypto extension. The intention is good, I
understand, but I don't think it is within the scope of KMS and it is
definitely not within the scope of TDE either.

 

Even
if the password can be passed into pgp_sym_encrypt() securely by using 
pg_decrypt() function, the pgp_sym_encrypt() still will have to take
this password and derive into an encryption key using algorithms that internal
KMS does not manage currently. This kind of defeats the purpose of internal
KMS. So simply using pg_encrypt() and pg_decrypt() is not really a solution to
pgcrypto's limitation. This should be in another topic/project that is aimed to
improve pgcrypto by integrating it with the internal KMS, similar to TDE where
it also has to integrate with the internal KMS later.

 

So
for internal KMS, the only cryptographic functions needed for now is
kmgr_wrap_key() and kmgr_unwrap_key() to encapsulate and restore the
encryptions keys to satisfy the "key protection" life cycle stage. I
don't think pg_encrypt() and pg_decrypt() should be part of internal KMS.

 

Anyways, I have also
reviewed the patch and have a few comments below:

 

(1)

The ciphering
algorithm in my opinion should contain the algorithm name, key length and block
cipher mode, which is similar to openssl's definition.

 

Instead of defining
a cipher as  PG_CIPHER_AES_CBC, and have
key length as separate parameter, I would define them as

#define
PG_CIPHER_AES128_CBC 0

#define
PG_CIPHER_AES256_CBC 1

#define
PG_CIPHER_AES128_CTR 2

#define
PG_CIPHER_AES256_CTR 3

 

I know
PG_CIPHER_AES128_CTR and PG_CIPHER_AES256_CTR are not being used now as these
are for the TDE in future, but might as well list them here because this KMS is
made to work specifically for TDE as I understand.

---

/*

 * Supported symmetric encryption algorithm.
These identifiers are passed

 * to pg_cipher_ctx_create() function, and then
actual encryption

 * implementations need to initialize their
context of the given encryption

 * algorithm.

 */

#define
PG_CIPHER_AES_CBC0

#define
PG_MAX_CIPHER_ID1

---

 

(2)

If the cipher
algorithms are defined like (1), then there is no need to pass key length as
argument to ossl_cipher_ctx_create() function because it already knows the key
length based on the cipher definition. Less argument the better.

 

---

PgCipherCtx *

pg_cipher_ctx_create(int
cipher, uint8 *key, int klen)

{

PgCipherCtx
*ctx = NULL;

 

if
(cipher >= PG_MAX_CIPHER_ID)

return
NULL;

 

#ifdef USE_OPENSSL

ctx
= (PgCipherCtx *) palloc0(sizeof(PgCipherCtx));

 

ctx->encctx
= ossl_cipher_ctx_create(cipher, key, klen, true);

ctx->decctx
= ossl_cipher_ctx_create(cipher, key, klen, false);

#endif

 

return
ctx;

}

---



Cary Huang

-

HighGo Software Inc. (Canada)

mailto:cary.hu...@highgo.ca

http://www.highgo.ca



 On Sun, 31 May 2020 23:58:31 -0700 Masahiko Sawada 
 wrote 


On Sat, 30 May 2020 at 04:20, Robert Haas

Re: Internal key management system

2020-06-01 Thread Masahiko Sawada

On Sat, 30 May 2020 at 04:20, Robert Haas  wrote:
>
> On Fri, May 29, 2020 at 1:50 AM Masahiko Sawada
>  wrote:
> > However, this usage has a downside that user secret can be logged to
> > server logs when log_statement = 'all' or an error happens. To deal
> > with this issue I've created a PoC patch on top of the key manager
> > patch which adds a libpq function PQencrypt() to encrypt data and new
> > psql meta-command named \encrypt in order to  encrypt data while
> > eliminating the possibility of the user data being logged.
> > PQencrypt() just calls pg_encrypt() via PQfn(). Using this command the
> > above example can become as follows:
>
> If PQfn() calls aren't currently logged, that's probably more of an
> oversight due to the feature being almost dead than something upon
> which we want to rely.

Agreed.

The patch includes pg_encrypt() and pg_decrypt() SQL functions
inspired by Always Encryption but these functions are interfaces of
the key manager to make it work independently from TDE and are
actually not necessary in terms of TDE. Perhaps it's better to
consider whether it's worth having them after introducing TDE.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-06-01 Thread Masahiko Sawada

On Sun, 31 May 2020 at 17:13, Fabien COELHO  wrote:
>
>
> Hello Masahiko-san,
>
> >> I am sharing here a document patch based on top of kms_v10 that was
> >> shared awhile back. This document patch aims to cover more design
> >> details of the current KMS design and to help people understand KMS
> >> better. Please let me know if you have any more comments.
>
> A few questions and comments, mostly about the design. If I'm off topic,
> or these concerns have been clearly addressed in the thread, please accept
> my apology.

Thank you for your comments! Please correct me if I'm misunderstanding
your questions and comments.

>
> A lot of what I write is based on guessing from a look at the doc & code
> provided in the patch. The patch should provide some explanatory README
> about the overall design.

Agreed.

>
> It is a lot of code, which for me should not be there, inside the backend.
> Could this whole thing be an extension? I cannot see why not. If it could,
> then ISTM that it should. If not, what set of features is needed to allow
> that as an extension? How could pg be improved so that it could be an
> extension?

Let me explain some information about TDE behind this key manager patch.

This key manager is aimed to manage cryptographic keys used for
transparent data encryption. As a result of the discussion, we
concluded it's safer to use multiple keys to encrypt database data
rather than using one key to encrypt the whole thing, for example, in
order to make sure different data is not encrypted with the same key
and IV. Therefore, in terms of TDE, the minimum requirement is that
PostgreSQL can use multiple keys.

Using multiple keys in PG, there are roughly two possible designs:

1. Store all keys for TDE into the external KMS, and PG gets them from
it as needed.
2. PG manages all keys for TDE inside and protect these keys on disk
by the key (i.g. KEK) stored in the external KMS.

There are pros and cons to each design. If I take one cons of #1 as an
example, the operation between PG and the external KMS could be
complex. The operations could be creating, removing and rotate key and
so on. We can implement these operations in an extension to interact
with different kinds of external KMS, and perhaps we can use KMIP. But
the development cost could become high because we might need different
extensions for each key management solutions/services.

#2 is better at that point; the interaction between PG and KMS is only
GET. Other databases employes a similar approach are SQL Server and
DB2.

In terms of the necessity of introducing the key manager into PG core,
I think at least TDE needs to be implemented in PG core. And as this
key manager is for managing keys for TDE, I think the key manager also
needs to be introduced into the core so that TDE functionality doesn't
depend on external modules.

>
> Also, I'm not at fully at ease with some of the underlying principles
> behind this proposal. Are we re-inventing/re-implementing kerberos or
> whatever? Are we re-implementing a brand new KMS inside pg? Why having
> our own?

As I explained above, this key manager is for managing internal keys
used by TDE. It's not an alternative to existing key management
solutions/services.

The requirements of this key manager are generating internal keys,
letting other PG components use them, protecting them by KEK when
persisting, and support KEK rotation. It doesn’t have a feature like
allowing users to store arbitrary keys into this key manager, like
other key management solutions/services have.

>
> I think that key management should *not* belong to pg itself, but to some
> external facility/process with which pg would interact, so that no master
> key would ever be inside pg process, and possibly not on the same host, if
> it was me doing it.
>
> If some extension could provide it inside the process and stores thing
> inside some pg_cryptokeys directory, then fine if it fits the threat model
> being addressed, but the paranoïd user wanting that should have other
> options which could be summarized as "outside".
>
> Another benefit of "outside" is that if there is a security issue attached
> to the kms, then it would not be a pg security issue, and it would not
> affect normal pg users which do not use the feature.

I agree that the key used to encrypt data must not be placed in the
same host. But it's true only when the key is not protected, right? In
this key manager, since we protect all internal keys by KEK it's no
problem unless KEK is leaked. KEK can be obtained from outside key
management solutions/services through cluster_passphrase_command.

>
> Also, implementing a crash-safe key rotation algorithm does not look like
> inside pg backend, that is not its job.

The key rotation this key manager has is KEK rotation, which is very
important. Without KEK rotation, when KEK is leaked an attacker can
get database data by disk theft. Since KEK is responsible for
encrypting all internal keys it's necessary to

Re: Internal key management system

2020-05-31 Thread Fabien COELHO



Hello Masahiko-san,

I am sharing here a document patch based on top of kms_v10 that was 
shared awhile back. This document patch aims to cover more design 
details of the current KMS design and to help people understand KMS 
better. Please let me know if you have any more comments.


A few questions and comments, mostly about the design. If I'm off topic, 
or these concerns have been clearly addressed in the thread, please accept 
my apology.


A lot of what I write is based on guessing from a look at the doc & code 
provided in the patch. The patch should provide some explanatory README 
about the overall design.


It is a lot of code, which for me should not be there, inside the backend. 
Could this whole thing be an extension? I cannot see why not. If it could, 
then ISTM that it should. If not, what set of features is needed to allow 
that as an extension? How could pg be improved so that it could be an 
extension?


Also, I'm not at fully at ease with some of the underlying principles 
behind this proposal. Are we re-inventing/re-implementing kerberos or 
whatever? Are we re-implementing a brand new KMS inside pg? Why having 
our own?


I think that key management should *not* belong to pg itself, but to some 
external facility/process with which pg would interact, so that no master 
key would ever be inside pg process, and possibly not on the same host, if 
it was me doing it.


If some extension could provide it inside the process and stores thing 
inside some pg_cryptokeys directory, then fine if it fits the threat model 
being addressed, but the paranoïd user wanting that should have other 
options which could be summarized as "outside".


Another benefit of "outside" is that if there is a security issue attached 
to the kms, then it would not be a pg security issue, and it would not 
affect normal pg users which do not use the feature.


Also, implementing a crash-safe key rotation algorithm does not look like 
inside pg backend, that is not its job. Likewise, the AEAD AES-CBC 
HMAC-SHA512 does definitely not belong to postgres core backend 
implementation. Why should I use the OpenSSL library and not some other 
facility?


Basically, I'm -1 on having such a feature right inside pg, and +1 on 
allowing pg to have it outside and interact with it appropriately, 
preferably through an extension which could be in core.


So my take is that pg should allow an extension to:

 - provide a *generic* way to interact with an *external* kms
   eg by running a command (possibly setuid something) and interacting
   with its stdin/stderr what the command does should be of no concern
   to pg and use some trivial text protocol, and the existing code
   can be wrapped as an example working implementation.

 - store some local keys somewhere and provide functions to use these
   keys to encrypt/decrypt stuff, obviously, as generic as possible.

   ISTM that what crypto algorithms are actually used should not be
   hardcoded, but I'm not sure how to achieve that. Maybe simply by
   redefining the relevant function, maybe at the SQL level.

There is an open question on how the "command" validates that it is indeed 
the right pg which is interacting with it. This means some authentication, 
probably some passphrase to provide somehow, probably close to what is 
being implemented, so from an interface point of view, it could look quite 
the same, but the key point is that the whole thing would be out of 
postgres process, only encryption keys being used would be in postgres,

and probably only in the process which actually needs it.

Random comments about details I saw in passing:

* key_management_enabled

key_management (on|off) ?

* initdb -D dbname --cluster-passphrase-command="cat /path/to/passphrase-file"

Putting example in the documentation looks like a recommendation. It would 
put a caveat that doing the above is probably a bad idea.


--
Fabien.

Re: Internal key management system

2020-05-29 Thread Robert Haas

On Fri, May 29, 2020 at 1:50 AM Masahiko Sawada
 wrote:
> However, this usage has a downside that user secret can be logged to
> server logs when log_statement = 'all' or an error happens. To deal
> with this issue I've created a PoC patch on top of the key manager
> patch which adds a libpq function PQencrypt() to encrypt data and new
> psql meta-command named \encrypt in order to  encrypt data while
> eliminating the possibility of the user data being logged.
> PQencrypt() just calls pg_encrypt() via PQfn(). Using this command the
> above example can become as follows:

If PQfn() calls aren't currently logged, that's probably more of an
oversight due to the feature being almost dead than something upon
which we want to rely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Internal key management system

2020-05-01 Thread Cary Huang

Hi all



I am sharing here a document patch based on top of kms_v10 that was shared 
awhile back. This document patch aims to cover more design details of the 
current KMS design and to help people understand KMS better. Please let me know 
if you have any more comments.



thank you



Best regards



Cary Huang

-

HighGo Software Inc. (Canada)

mailto:cary.hu...@highgo.ca

http://www.highgo.ca




 On Tue, 07 Apr 2020 20:56:12 -0700 Ahsan Hadi 
 wrote 



Hi Bruce/Joe,



In the last meeting we discussed the need for improving the documentation for 
KMS so it is easier to understand the interface. Cary from highgo had a go at 
doing that, please see the previous email on this thread from Cary and let us 
know if it looks good...?



-- Ahsan 




On Wed, Apr 8, 2020 at 3:46 AM Cary Huang  wrote:








-- 

Highgo Software (Canada/China/Pakistan)
URL : http://www.highgo.ca/
ADDR: 10318 WHALLEY BLVD, Surrey, BC
EMAIL: mailto: 





Hello



Thanks a lot for the patch, I think in terms of functionality, the patch 
provides very straightforward functionalities regarding key management. In 
terms of documentation, I think the patch is still lacking some pieces of 
information that kind of prevent people from fully understanding how KMS works 
and how it can be used and why, (at least that is the impression I got from the 
zoom meeting recordings :p). I spent some time today revisiting the 
key-management documentation in the patch and rephrase and restructure it  
based on my current understanding of latest KMS design. I mentioned all 3 
application level keys that we have agreed and emphasize on explaining the SQL 
level encryption key because that is the key that can be used right now. Block 
and WAL levels keys we can add here more information once they are actually 
used in the TDE development. 



Please see below the KMS documentation that I have revised and I hope it will 
be more clear and easier for people to understand KMS. Feel free to make 
adjustments. Please note that we use the term "wrap" and "unwrap" a lot in our 
past discussions. Originally we used the terms within a context involving Key 
encryption keys (KEK). For example, "KMS wraps a master key with KEK". Later, 
we used the same term in a context involving encrypting user secret /password. 
For example, "KMS wraps a user secret with SQL key". In my opinion, both make 
sense but it may be confusing to people having the same term used differently. 
So in my revision below, the terms "wrap" and "unwrap" refer to encrypting or 
decrypting user secret / password as they are used in "pg_wrap() and 
pg_unwrap()". I use the terms "encapsulate" and "restore" when KEK is used to 
encrypt or decrypt a key.







Chapter 32: Encryption Key Management 

--



PostgreSQL supports internal Encryption Key Management System, which is 
designed to manage the life cycles of cryptographic keys within the PostgreSQL 
system. This includes dealing with their generation, storage, usage and 
rotation.



Encryption Key Management is enabled when PostgreSQL is build with 
--with-openssl and cluster passphrase command is specified during initdb. The 
cluster passphrase provided by --cluster-passphrase-command option during 
initdb and the one generated by cluster_passphrase_command in the 
postgresql.conf must match, otherwise, the database cluster will not start up.



32.1 Key Generations and Derivations

--



When cluster_passphrase_command option is specified to the initdb, the process 
will derive the cluster passphrase into a Key Encryption Key (KEK) and a HMAC 
Key using key derivation protocol before the actual generation of application 
level cryptographic level keys.



-Key Encryption Key (KEK)

KEK is primarily used to encapsulate or restore a given application level 
cryptographic key



-HMAC Key

HMAC key is used to compute the HASH of a given application level cryptographic 
key for integrity check purposes



These 2 keys are not stored physically within the PostgreSQL cluster as they 
are designed to be derived from the correctly configured cluster passphrase.



Encryption Key Management System currently manages 3 application level 
cryptographic keys that have different purposes and usages within the 
PostgreSQL system and these are generated using pg_strong_random() after KEK 
and HMAC key derivation during initdb process.



The 3 keys are:



-SQL Level Key

SQL Level Key is used to wrap and unwrap a user secret / passphrase via 
pg_wrap() and pg_unwrap() SQL functions. These 2 functions are designed to be 
used in conjunction with the cryptographic functions provided by pgcrypto 
extension to perform column level encryption/decryption without having to 
supply a clear text user secret or passphrase that is required by many pgcrypto 
functions as input. Please refer to

Re: Internal key management system

2020-04-07 Thread Ahsan Hadi

Hi Bruce/Joe,

In the last meeting we discussed the need for improving the documentation
for KMS so it is easier to understand the interface. Cary from highgo had a
go at doing that, please see the previous email on this thread from Cary
and let us know if it looks good...?

-- Ahsan

On Wed, Apr 8, 2020 at 3:46 AM Cary Huang  wrote:

> Hello
>
> Thanks a lot for the patch, I think in terms of functionality, the patch
> provides very straightforward functionalities regarding key management. In
> terms of documentation, I think the patch is still lacking some pieces of
> information that kind of prevent people from fully understanding how KMS
> works and how it can be used and why, (at least that is the impression I
> got from the zoom meeting recordings :p). I spent some time today
> revisiting the key-management documentation in the patch and rephrase and
> restructure it based on my current understanding of latest KMS design. I
> mentioned all 3 application level keys that we have agreed and emphasize on
> explaining the SQL level encryption key because that is the key that can be
> used right now. Block and WAL levels keys we can add here more information
> once they are actually used in the TDE development.
>
> Please see below the KMS documentation that I have revised and I hope it
> will be more clear and easier for people to understand KMS. Feel free to
> make adjustments. Please note that we use the term "wrap" and "unwrap" a
> lot in our past discussions. Originally we used the terms within a context
> involving Key encryption keys (KEK). For example, "KMS wraps a master key
> with KEK". Later, we used the same term in a context involving encrypting
> user secret /password. For example, "KMS wraps a user secret with SQL key".
> In my opinion, both make sense but it may be confusing to people having the
> same term used differently. So in my revision below, the terms "wrap" and
> "unwrap" refer to encrypting or decrypting user secret / password as they
> are used in "pg_wrap() and pg_unwrap()". I use the terms "encapsulate" and
> "restore" when KEK is used to encrypt or decrypt a key.
>
>
>
> Chapter 32: Encryption Key Management
> --
>
> PostgreSQL supports internal Encryption Key Management System, which is
> designed to manage the life cycles of cryptographic keys within the
> PostgreSQL system. This includes dealing with their generation, storage,
> usage and rotation.
>
> Encryption Key Management is enabled when PostgreSQL is build
> with --with-openssl and cluster passphrase command is specified
> during initdb. The cluster passphrase provided
> by --cluster-passphrase-command option during initdb and the one generated
> by cluster_passphrase_command in the postgresql.conf must match, otherwise,
> the database cluster will not start up.
>
> 32.1 Key Generations and Derivations
> --
>
> When cluster_passphrase_command option is specified to the initdb, the
> process will derive the cluster passphrase into a Key Encryption Key (KEK)
> and a HMAC Key using key derivation protocol before the actual generation
> of application level cryptographic level keys.
>
> -Key Encryption Key (KEK)
> KEK is primarily used to encapsulate or restore a given application level
> cryptographic key
>
> -HMAC Key
> HMAC key is used to compute the HASH of a given application level
> cryptographic key for integrity check purposes
>
> These 2 keys are not stored physically within the PostgreSQL cluster as
> they are designed to be derived from the correctly configured cluster
> passphrase.
>
> Encryption Key Management System currently manages 3 application level
> cryptographic keys that have different purposes and usages within the
> PostgreSQL system and these are generated using pg_strong_random() after
> KEK and HMAC key derivation during initdb process.
>
> The 3 keys are:
>
> -SQL Level Key
> SQL Level Key is used to wrap and unwrap a user secret / passphrase via
> pg_wrap() and pg_unwrap() SQL functions. These 2 functions are designed to
> be used in conjunction with the cryptographic functions provided by
> pgcrypto extension to perform column level encryption/decryption without
> having to supply a clear text user secret or passphrase that is required by
> many pgcrypto functions as input. Please refer to [Wrap and Unwrap User
> Secret section] for usage examples.
>
> -Block Level Key
> Block Level Key is primarily used to encrypt / decrypt buffers as part of
> the Transparent Data Encryption (TDE) feature
>
> -WAL Level Key
> WAL Level Key is primarily used to encrypt / decrypt WAL files as part of
> the Transparent Data Encryption (TDE) feature
>
> The 3 application level keys above will be encapsulated and hashed using
> KEK and HMAC key mentioned above before they are physically stored to
> pg_cryptokeys directory within the cluster.
>
> 32.1. Key Initialization
> -
>
> When a PostgreSQL cluster

Re: Internal key management system

2020-04-07 Thread Cary Huang

Hello



Thanks a lot for the patch, I think in terms of functionality, the patch 
provides very straightforward functionalities regarding key management. In 
terms of documentation, I think the patch is still lacking some pieces of 
information that kind of prevent people from fully understanding how KMS works 
and how it can be used and why, (at least that is the impression I got from the 
zoom meeting recordings :p). I spent some time today revisiting the 
key-management documentation in the patch and rephrase and restructure it  
based on my current understanding of latest KMS design. I mentioned all 3 
application level keys that we have agreed and emphasize on explaining the SQL 
level encryption key because that is the key that can be used right now. Block 
and WAL levels keys we can add here more information once they are actually 
used in the TDE development. 



Please see below the KMS documentation that I have revised and I hope it will 
be more clear and easier for people to understand KMS. Feel free to make 
adjustments. Please note that we use the term "wrap" and "unwrap" a lot in our 
past discussions. Originally we used the terms within a context involving Key 
encryption keys (KEK). For example, "KMS wraps a master key with KEK". Later, 
we used the same term in a context involving encrypting user secret /password. 
For example, "KMS wraps a user secret with SQL key". In my opinion, both make 
sense but it may be confusing to people having the same term used differently. 
So in my revision below, the terms "wrap" and "unwrap" refer to encrypting or 
decrypting user secret / password as they are used in "pg_wrap() and 
pg_unwrap()". I use the terms "encapsulate" and "restore" when KEK is used to 
encrypt or decrypt a key.







Chapter 32: Encryption Key Management 

--


PostgreSQL supports internal Encryption Key Management System, which is 
designed to manage the life cycles of cryptographic keys within the PostgreSQL 
system. This includes dealing with their generation, storage, usage and 
rotation.



Encryption Key Management is enabled when PostgreSQL is build with 
--with-openssl and cluster passphrase command is specified during initdb. The 
cluster passphrase provided by --cluster-passphrase-command option during 
initdb and the one generated by cluster_passphrase_command in the 
postgresql.conf must match, otherwise, the database cluster will not start up.



32.1 Key Generations and Derivations

--



When cluster_passphrase_command option is specified to the initdb, the process 
will derive the cluster passphrase into a Key Encryption Key (KEK) and a HMAC 
Key using key derivation protocol before the actual generation of application 
level cryptographic level keys.



-Key Encryption Key (KEK)

KEK is primarily used to encapsulate or restore a given application level 
cryptographic key



-HMAC Key

HMAC key is used to compute the HASH of a given application level cryptographic 
key for integrity check purposes



These 2 keys are not stored physically within the PostgreSQL cluster as they 
are designed to be derived from the correctly configured cluster passphrase.



Encryption Key Management System currently manages 3 application level 
cryptographic keys that have different purposes and usages within the 
PostgreSQL system and these are generated using pg_strong_random() after KEK 
and HMAC key derivation during initdb process.



The 3 keys are:



-SQL Level Key

SQL Level Key is used to wrap and unwrap a user secret / passphrase via 
pg_wrap() and pg_unwrap() SQL functions. These 2 functions are designed to be 
used in conjunction with the cryptographic functions provided by pgcrypto 
extension to perform column level encryption/decryption without having to 
supply a clear text user secret or passphrase that is required by many pgcrypto 
functions as input. Please refer to [Wrap and Unwrap User Secret section] for 
usage examples.



-Block Level Key

Block Level Key is primarily used to encrypt / decrypt buffers as part of the 
Transparent Data Encryption (TDE) feature



-WAL Level Key

WAL Level Key is primarily used to encrypt / decrypt WAL files as part of the 
Transparent Data Encryption (TDE) feature



The 3 application level keys above will be encapsulated and hashed using KEK 
and HMAC key mentioned above before they are physically stored to pg_cryptokeys 
directory within the cluster.



32.1. Key Initialization

-



When a PostgreSQL cluster with encryption key management enabled is started, 
the cluster_passphrase_command parameter in postgresql.conf will be evaluated 
and the cluster passphrase will be derived into KEK and HMAC Key in similar 
ways as initdb.



After that, the 3 encapsulated application level cryptographic keys will be 
retrieved from pg_cryptokeys directory to be restored and integrity-checked by 
the key management system using

Re: Internal key management system

2020-03-30 Thread Masahiko Sawada

On Tue, 31 Mar 2020 at 09:36, Cary Huang  wrote:
>
> Hi
> I had a look on kms_v9 patch and have some comments
>
> --> pg_upgrade.c
> keys are copied correctly, but as pg_upgrade progresses further, it will try 
> to start the new_cluster from "issue_warnings_and_set_wal_level()" function, 
> which is called after key copy. The new cluster will fail to start due to the 
> mismatch between cluster_passphrase_command and the newly copied keys. This 
> causes pg_upgrade to always finish with failure. We could move 
> "copy_master_encryption_key()" to be called after 
> "issue_warnings_and_set_wal_level()" and this will make pg_upgrade to finish 
> with success, but user will still have to manually correct the 
> "cluster_passphrase_command" param on the new cluster in order for it to 
> start up correctly. Should pg_upgrade also take care of copying 
> "cluster_passphrase_command" param from old to new cluster after it has 
> copied the encryption keys so users don't have to do this step? If the 
> expectation is for users to manually correct "cluster_passphrase_command" 
> param after successful pg_upgrade and key copy, then there should be a 
> message to remind the users to do so.

I think both the old cluster and the new cluster must be initialized
with the same passphrase at initdb. Specifying the different
passphrase command to the new cluster at initdb and changing it after
pg_upgrade doesn't make sense. Also I don't think we need to copy
cluster_passphrase_command same as other GUC parameters.

I've changed the patch so that pg_upgrade copies the crypto keys only
if both new and old cluster enable the key management. User must
specify the same passphrase command to both old and new cluster, which
is not cumbersome, I think. I also added the description about this to
the doc.

>
> -->Kmgr.c
> + /*
> + * If there is only temporary directory, it means that the previous
> + * rotation failed after wrapping the all internal keys by the new
> + * passphrase.  Therefore we use the new cluster passphrase.
> + */
> + if (stat(KMGR_DIR, ) != 0)
> + {
> + ereport(DEBUG1,
> + (errmsg("both directories %s and %s exist, use the newly wrapped keys",
> + KMGR_DIR, KMGR_TMP_DIR)));
>
> I think the error message should say "there is only temporary directory 
> exist" instead of "both directories exist"

You're right. Fixed.

I've attached the new version patch.

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


kms_v10.patch
Description: Binary data

Re: Internal key management system

2020-03-30 Thread Cary Huang

Hi

I had a look on kms_v9 patch and have some comments

--> pg_upgrade.c

keys are copied correctly, but as pg_upgrade progresses further, it will try to 
start the new_cluster from "issue_warnings_and_set_wal_level()" function, which 
is called after key copy. The new cluster will fail to start due to the 
mismatch between cluster_passphrase_command and the newly copied keys. This 
causes pg_upgrade to always finish with failure. We could move 
"copy_master_encryption_key()" to be called after 
"issue_warnings_and_set_wal_level()" and this will make pg_upgrade to finish 
with success, but user will still have to manually correct the 
"cluster_passphrase_command" param on the new cluster in order for it to start 
up correctly. Should pg_upgrade also take care of copying 
"cluster_passphrase_command" param from old to new cluster after it has copied 
the encryption keys so users don't have to do this step? If the expectation is 
for users to manually correct "cluster_passphrase_command" param after 
successful pg_upgrade and key copy, then there should be a message to remind 
the users to do so. 

-->Kmgr.c 

+   /*

+* If there is only temporary directory, it means that the previous

+* rotation failed after wrapping the all internal keys by the new

+* passphrase.  Therefore we use the new cluster passphrase.

+*/

+   if (stat(KMGR_DIR, ) != 0)

+   {

+   ereport(DEBUG1,

+   (errmsg("both directories %s and %s exist, use 
the newly wrapped keys",

+   KMGR_DIR, KMGR_TMP_DIR)));

I think the error message should say "there is only temporary directory exist" 
instead of "both directories exist"

thanks!

Cary Huang

-

HighGo Software Inc. (Canada)

mailto:cary.hu...@highgo.ca

http://www.highgo.ca

 On Wed, 25 Mar 2020 01:51:08 -0700 Masahiko Sawada 
 wrote 

On Tue, 24 Mar 2020 at 23:15, Bruce Momjian  wrote: 
> 
> On Tue, Mar 24, 2020 at 02:29:57PM +0900, Masahiko Sawada wrote: 
> > That seems to work fine. 
> > 
> > So we will have pg_cryptokeys within PGDATA and each key is stored 
> > into separate file named the key id such as "sql", "tde-wal" and 
> > "tde-block". I'll update the patch and post. 
> 
> Yes, that makes sense to me. 
> 

I've attached the updated patch. With the patch, we have three 
internal keys: SQL key, TDE-block key and TDE-wal key. Only SQL key 
can be used so far to wrap and unwrap user secret via pg_wrap and 
pg_unwrap SQL functions. Each keys is saved to the single file located 
at pg_cryptokeys. After initdb with enabling key manager, the 
pg_cryptokeys directory has the following files: 

$ ll data/pg_cryptokeys 
total 12K 
-rw--- 1 masahiko staff 132 Mar 25 15:45  
-rw--- 1 masahiko staff 132 Mar 25 15:45 0001 
-rw--- 1 masahiko staff 132 Mar 25 15:45 0002 

I used the integer id rather than string id to make the code simple. 

When cluster passphrase rotation, we update all keys atomically using 
temporary directory as follows: 

1. Derive the new passphrase 
2. Wrap all internal keys with the new passphrase 
3. Save all internal keys to the temp directory 
4. Remove the original directory, pg_cryptokeys 
5. Rename the temp directory to pg_cryptokeys 

In case of failure during rotation, pg_cyrptokeys and 
pg_cyrptokeys_tmp can be left in an incomplete state. We recover it by 
checking if the temporary directory exists and the wrapped keys in the 
temporary directory are valid. 

Regards, 

-- 
Masahiko Sawada http://www.2ndQuadrant.com/ 
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-03-27 Thread Bruce Momjian

On Wed, Mar 25, 2020 at 05:51:08PM +0900, Masahiko Sawada wrote:
> On Tue, 24 Mar 2020 at 23:15, Bruce Momjian  wrote:
> >
> > On Tue, Mar 24, 2020 at 02:29:57PM +0900, Masahiko Sawada wrote:
> > > That seems to work fine.
> > >
> > > So we will have pg_cryptokeys within PGDATA and each key is stored
> > > into separate file named the key id such as "sql", "tde-wal" and
> > > "tde-block". I'll update the patch and post.
> >
> > Yes, that makes sense to me.
> >
> 
> I've attached the updated patch. With the patch, we have three
> internal keys: SQL key, TDE-block key and TDE-wal key. Only SQL key
> can be used so far to wrap and unwrap user secret via pg_wrap and
> pg_unwrap SQL functions. Each keys is saved to the single file located
> at pg_cryptokeys. After initdb with enabling key manager, the
> pg_cryptokeys directory has the following files:
> 
> $ ll data/pg_cryptokeys
> total 12K
> -rw--- 1 masahiko staff 132 Mar 25 15:45 
> -rw--- 1 masahiko staff 132 Mar 25 15:45 0001
> -rw--- 1 masahiko staff 132 Mar 25 15:45 0002
> 
> I used the integer id rather than string id to make the code simple.

Great, thanks.  I assume the final version will use file names.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

2020-03-25 Thread Masahiko Sawada

On Tue, 24 Mar 2020 at 23:15, Bruce Momjian  wrote:
>
> On Tue, Mar 24, 2020 at 02:29:57PM +0900, Masahiko Sawada wrote:
> > That seems to work fine.
> >
> > So we will have pg_cryptokeys within PGDATA and each key is stored
> > into separate file named the key id such as "sql", "tde-wal" and
> > "tde-block". I'll update the patch and post.
>
> Yes, that makes sense to me.
>

I've attached the updated patch. With the patch, we have three
internal keys: SQL key, TDE-block key and TDE-wal key. Only SQL key
can be used so far to wrap and unwrap user secret via pg_wrap and
pg_unwrap SQL functions. Each keys is saved to the single file located
at pg_cryptokeys. After initdb with enabling key manager, the
pg_cryptokeys directory has the following files:

$ ll data/pg_cryptokeys
total 12K
-rw--- 1 masahiko staff 132 Mar 25 15:45 
-rw--- 1 masahiko staff 132 Mar 25 15:45 0001
-rw--- 1 masahiko staff 132 Mar 25 15:45 0002

I used the integer id rather than string id to make the code simple.

When cluster passphrase rotation, we update all keys atomically using
temporary directory as follows:

1. Derive the new passphrase
2. Wrap all internal keys with the new passphrase
3. Save all internal keys to the temp directory
4. Remove the original directory, pg_cryptokeys
5. Rename the temp directory to pg_cryptokeys

In case of failure during rotation, pg_cyrptokeys and
pg_cyrptokeys_tmp can be left in an incomplete state. We recover it by
checking if the temporary directory exists and the wrapped keys in the
temporary directory are valid.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

kms_v9.patch
Description: Binary data

Re: Internal key management system

2020-03-24 Thread Bruce Momjian

On Tue, Mar 24, 2020 at 02:29:57PM +0900, Masahiko Sawada wrote:
> That seems to work fine.
> 
> So we will have pg_cryptokeys within PGDATA and each key is stored
> into separate file named the key id such as "sql", "tde-wal" and
> "tde-block". I'll update the patch and post.

Yes, that makes sense to me.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

2020-03-23 Thread Masahiko Sawada

On Tue, 24 Mar 2020 at 07:15, Bruce Momjian  wrote:
>
> On Mon, Mar 23, 2020 at 03:55:34PM +0900, Masahiko Sawada wrote:
> > On Sat, 21 Mar 2020 at 23:50, Bruce Momjian  wrote:
> > > Actually, I think we need three files:
> > >
> > > *  TDE WAL key file
> > > *  TDE block key file
> > > *  SQL-level file
> > >
> > > Primaries and standbys have to use the same TDE WAL key file, but can
> > > use different TDE block key files to allow for key rotation, so having
> > > separate files makes sense --- maybe they need to be in their own
> > > directory.
> >
> > I've considered to have separate key files once but it would make
> > things complex to update multiple files atomically. Postgres server
> > will never start if it crashes in the middle of cluster passphrase
> > rotation. Can we consider to have keys related to TDE after we
> > introduce the basic key management system? Probably having keys in a
> > separate file rather than in pg_control file would be better but we
> > don't need these keys so far.
>
> Well, we need to be able to upgrade this so we have to set it up now in
> a way that allows that.
>
> I am not sure we have ever had a case where we needed to update multiple
> files atomically at the same time, without the help of WAL.
>
> Perhaps we should put the three keys in separate files in a directory
> called 'cryptokeys', and when we change the pass phrase, we create a new
> directory called 'cryptokeys.new'.  Then once we have created the files
> in there with the new pass phrase, we remove cryptokeys and rename
> directory cryptokeys.new to cryptokeys.  On boot, if cryptokeys exists
> and cryptokeys.new does too, remove cryptokeys.new because we crashed
> during key rotation,  If cryptokeys.new exists and cryptokeys doesn't,
> we rename cryptokeys.new to cryptokeys because we crashed before the
> rename.

That seems to work fine.

So we will have pg_cryptokeys within PGDATA and each key is stored
into separate file named the key id such as "sql", "tde-wal" and
"tde-block". I'll update the patch and post.

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-03-23 Thread Bruce Momjian

On Mon, Mar 23, 2020 at 03:55:34PM +0900, Masahiko Sawada wrote:
> On Sat, 21 Mar 2020 at 23:50, Bruce Momjian  wrote:
> > Actually, I think we need three files:
> >
> > *  TDE WAL key file
> > *  TDE block key file
> > *  SQL-level file
> >
> > Primaries and standbys have to use the same TDE WAL key file, but can
> > use different TDE block key files to allow for key rotation, so having
> > separate files makes sense --- maybe they need to be in their own
> > directory.
> 
> I've considered to have separate key files once but it would make
> things complex to update multiple files atomically. Postgres server
> will never start if it crashes in the middle of cluster passphrase
> rotation. Can we consider to have keys related to TDE after we
> introduce the basic key management system? Probably having keys in a
> separate file rather than in pg_control file would be better but we
> don't need these keys so far.

Well, we need to be able to upgrade this so we have to set it up now in
a way that allows that.

I am not sure we have ever had a case where we needed to update multiple
files atomically at the same time, without the help of WAL.

Perhaps we should put the three keys in separate files in a directory
called 'cryptokeys', and when we change the pass phrase, we create a new
directory called 'cryptokeys.new'.  Then once we have created the files
in there with the new pass phrase, we remove cryptokeys and rename
directory cryptokeys.new to cryptokeys.  On boot, if cryptokeys exists
and cryptokeys.new does too, remove cryptokeys.new because we crashed
during key rotation,  If cryptokeys.new exists and cryptokeys doesn't,
we rename cryptokeys.new to cryptokeys because we crashed before the
rename.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

2020-03-23 Thread Masahiko Sawada

On Sat, 21 Mar 2020 at 23:50, Bruce Momjian  wrote:
>
> On Sat, Mar 21, 2020 at 10:01:02AM -0400, Bruce Momjian wrote:
> > On Sat, Mar 21, 2020 at 02:12:46PM +0900, Masahiko Sawada wrote:
> > > On Sat, 21 Mar 2020 at 05:30, Bruce Momjian  wrote:
> > > > We should create an SQL-level master key that is different from the
> > > > block-level master key.  By using separate keys, and not deriving them
> > > > from a single key, they keys can be rotated and migrated to a different
> > > > cluster independently.  For example, users might want to create a new
> > > > cluster with a new block-level key, but might want to copy the SQL-level
> > > > key from the old cluster to the new cluster.  Both keys would be
> > > > unlocked with the same passphrase.
> > >
> > > I've updated the patch according to yesterday's meeting. As the above
> > > description by Bruce, the current patch have two encryption keys.
> > > Previously we have the master key in pg_control but due to exceeding
> > > the safe size limit of pg_control I moved two keys to the dedicated
> > > file located at global/pg_key. A wrapped key is 128 bytes and the
> > > total size including two wrapped key became 552 bytes while safe limit
> > > is 512 bytes.
> > >
> > > During pg_upgrade we copy the key file from the old cluster to the new
> > > cluster. Therefore we can unwrap the data that is wrapped on the old
> > > cluster on the new cluster.
> >
> > I wonder if we should just use two files, one for each key.
>
> Actually, I think we need three files:
>
> *  TDE WAL key file
> *  TDE block key file
> *  SQL-level file
>
> Primaries and standbys have to use the same TDE WAL key file, but can
> use different TDE block key files to allow for key rotation, so having
> separate files makes sense --- maybe they need to be in their own
> directory.

I've considered to have separate key files once but it would make
things complex to update multiple files atomically. Postgres server
will never start if it crashes in the middle of cluster passphrase
rotation. Can we consider to have keys related to TDE after we
introduce the basic key management system? Probably having keys in a
separate file rather than in pg_control file would be better but we
don't need these keys so far.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-03-21 Thread Bruce Momjian

On Sat, Mar 21, 2020 at 10:01:02AM -0400, Bruce Momjian wrote:
> On Sat, Mar 21, 2020 at 02:12:46PM +0900, Masahiko Sawada wrote:
> > On Sat, 21 Mar 2020 at 05:30, Bruce Momjian  wrote:
> > > We should create an SQL-level master key that is different from the
> > > block-level master key.  By using separate keys, and not deriving them
> > > from a single key, they keys can be rotated and migrated to a different
> > > cluster independently.  For example, users might want to create a new
> > > cluster with a new block-level key, but might want to copy the SQL-level
> > > key from the old cluster to the new cluster.  Both keys would be
> > > unlocked with the same passphrase.
> > 
> > I've updated the patch according to yesterday's meeting. As the above
> > description by Bruce, the current patch have two encryption keys.
> > Previously we have the master key in pg_control but due to exceeding
> > the safe size limit of pg_control I moved two keys to the dedicated
> > file located at global/pg_key. A wrapped key is 128 bytes and the
> > total size including two wrapped key became 552 bytes while safe limit
> > is 512 bytes.
> > 
> > During pg_upgrade we copy the key file from the old cluster to the new
> > cluster. Therefore we can unwrap the data that is wrapped on the old
> > cluster on the new cluster.
> 
> I wonder if we should just use two files, one for each key.

Actually, I think we need three files:

*  TDE WAL key file
*  TDE block key file
*  SQL-level file

Primaries and standbys have to use the same TDE WAL key file, but can
use different TDE block key files to allow for key rotation, so having
separate files makes sense --- maybe they need to be in their own
directory.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

2020-03-21 Thread Bruce Momjian

On Sat, Mar 21, 2020 at 02:12:46PM +0900, Masahiko Sawada wrote:
> On Sat, 21 Mar 2020 at 05:30, Bruce Momjian  wrote:
> > We should create an SQL-level master key that is different from the
> > block-level master key.  By using separate keys, and not deriving them
> > from a single key, they keys can be rotated and migrated to a different
> > cluster independently.  For example, users might want to create a new
> > cluster with a new block-level key, but might want to copy the SQL-level
> > key from the old cluster to the new cluster.  Both keys would be
> > unlocked with the same passphrase.
> 
> I've updated the patch according to yesterday's meeting. As the above
> description by Bruce, the current patch have two encryption keys.
> Previously we have the master key in pg_control but due to exceeding
> the safe size limit of pg_control I moved two keys to the dedicated
> file located at global/pg_key. A wrapped key is 128 bytes and the
> total size including two wrapped key became 552 bytes while safe limit
> is 512 bytes.
> 
> During pg_upgrade we copy the key file from the old cluster to the new
> cluster. Therefore we can unwrap the data that is wrapped on the old
> cluster on the new cluster.

I wonder if we should just use two files, one for each key.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

2020-03-20 Thread Masahiko Sawada

On Sat, 21 Mar 2020 at 05:30, Bruce Momjian  wrote:
>
> On Thu, Mar 19, 2020 at 09:33:09PM +0900, Masahiko Sawada wrote:
> > Attached updated version patch. This patch incorporated the comments
> > and changed pg_upgrade so that we take over the master encryption key
> > from the old cluster to the new one if both enable key management.
>
> We had a crypto team meeting today, and came away with a few ideas:
>
> We should create an SQL-level master key that is different from the
> block-level master key.  By using separate keys, and not deriving them
> from a single key, they keys can be rotated and migrated to a different
> cluster independently.  For example, users might want to create a new
> cluster with a new block-level key, but might want to copy the SQL-level
> key from the old cluster to the new cluster.  Both keys would be
> unlocked with the same passphrase.

I've updated the patch according to yesterday's meeting. As the above
description by Bruce, the current patch have two encryption keys.
Previously we have the master key in pg_control but due to exceeding
the safe size limit of pg_control I moved two keys to the dedicated
file located at global/pg_key. A wrapped key is 128 bytes and the
total size including two wrapped key became 552 bytes while safe limit
is 512 bytes.

During pg_upgrade we copy the key file from the old cluster to the new
cluster. Therefore we can unwrap the data that is wrapped on the old
cluster on the new cluster.

Regards,

--
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

kms_v8.patch
Description: Binary data

Re: Internal key management system

2020-03-20 Thread Bruce Momjian

On Thu, Mar 19, 2020 at 09:33:09PM +0900, Masahiko Sawada wrote:
> Attached updated version patch. This patch incorporated the comments
> and changed pg_upgrade so that we take over the master encryption key
> from the old cluster to the new one if both enable key management.

We had a crypto team meeting today, and came away with a few ideas:

We should create an SQL-level master key that is different from the
block-level master key.  By using separate keys, and not deriving them
from a single key, they keys can be rotated and migrated to a different
cluster independently.  For example, users might want to create a new
cluster with a new block-level key, but might want to copy the SQL-level
key from the old cluster to the new cluster.  Both keys would be
unlocked with the same passphrase.

I was confused by how the wrap/unwrap work.  Here is an example from
the proposed doc patch:

+
+=# SELECT pg_wrap('user sercret key');
+   
   pg_wrap

+
+ 
\xb2c89f76f04f95d029f179e0fc3df4ed7254127b5562a9e27d42d1cd037c942dea65ce7c0750c520fa4f4e90481c9eb7e1e42a068248c262c1a6f25c6eab64303b1154ccc9a14361223641aab4a7aabe
+(1 row)
+
+
+  
+   Once wrapping the user key, user can encrypt and decrypt user data 
using the
+   wrapped user key togehter with the key unwrap functions:
+  
+
+
+ =# INSERT INTO tbl
+VALUES (pgp_sym_encrypt('secret data',
+ 
pg_unwrap('\xb2c89f76f04f95d029f179e0fc3df4ed7254127b5562a9e27d42d1cd037c942dea65ce7c0750c520fa4f4e90481c9eb7e1e42a068248c262c1a6f25c6eab64303b1154ccc9a14361223641aab4a7aabe')));
+ INSERT 1
+
+ =# SELECT * FROM tbl;
+   
  col

+--
+ 
\xc30d04070302a199ee38bea0320b75d23c01577bb3ffb315d67eecbeca3e40e869cea65efbf0b470f805549af905f94d94c447fbfb8113f585fc86b30c0bd784b10c9857322dc00d556aa8de14
+(1 row)
+
+ =# SELECT pgp_sym_decrypt(col,
+   
pg_unwrap('\xb2c89f76f04f95d029f179e0fc3df4ed7254127b5562a9e27d42d1cd037c942dea65ce7c0750c520fa4f4e90481c9eb7e1e42a068248c262c1a6f25c6eab64303b1154ccc9a14361223641aab4a7aabe'))
 as col
+FROM tbl;
+col
+--
+ user secret data

All pg_wrap() does is to take the user string, in this case 'user
sercret key' and encrypt it with the SQL-level master key.  It doesn't
mix the SQL-level master key into the output, which is what I originally
thought.  This means that the pg_unwrap() call above just returns 'user
sercret key'.

How would this be used? Users would call pg_wrap() once, and store the
result on the client.  The client could then use the output of pg_wrap()
in all future sessions, without exposing 'user sercret key', which is
the key used to encrypt user data.

The passing of the parameter to pg_wrap() has to be done in a way that
doesn't permanently record the parameter anywhere, like in the logs. 
pgcryptokey (https://momjian.us/download/pgcryptokey/) has a method of
doing this.  This is how it passes the data encryption key without
making it visible in the logs, using psql:

SELECT get_shared_key()
\gset
\set enc_access_password `echo 'my secret' | tr -d '\n' | openssl dgst 
-sha256 -binary | gpg2 --symmetric --batch --cipher-algo AES128 --passphrase 
:'get_shared_key' | xxd -plain | tr -d '\n'`
SELECT set_session_access_password(:'enc_access_password');

Removing the sanity checks and user-interface simplicity, it is
internally doing this:

SELECT set_config('pgcryptokey.shared_key',
  encode(gen_random_bytes(32), 'hex'),
  FALSE) AS get_shared_key
\gset
\set enc_access_password `echo 'my secret' | tr -d '\n' | openssl dgst 
-sha256 -binary | gpg2 --symmetric --batch --cipher-algo AES128 --passphrase 
:'get_shared_key' | xxd -plain | tr -d '\n'`
SELECT set_config('pgcryptokey.access_password',

encode(pgp_sym_decrypt_bytea(decode(:'enc_access_password', 'hex'),
   :'get_shared_key'),
'hex'),
  FALSE) || NULL;

In English, what it does is the server generates a random key, stores it
in a server-side veraible, and sends it to the client.  The client
hashes a user-supplied key and encrypts it with the random key it

Re: Internal key management system

2020-03-20 Thread Masahiko Sawada

On Fri, 20 Mar 2020 at 01:38, Bruce Momjian  wrote:
>
> On Fri, Mar 20, 2020 at 12:50:27AM +0900, Masahiko Sawada wrote:
> > On Fri, Mar 20, 2020 at 0:35 Bruce Momjian  wrote:
> > Well, the issue is if the user can control the user key, there is might 
> > be
> > a way to make the user key do nothing.
> >
> > Well I meant ‘USER_KEY:’ is a fixed length string for the key used for wrap 
> > and
> > unwrap SQL interface functions. So user cannot control it. We will have 
> > another
> > key derived by, for example, HKDF(MK, ‘TDE_KEY:’ || system_identifier) for
> > block encryption.
>
> OK, yes, something liek that might make sense.
>

Attached the updated version patch. The patch introduces KDF to derive
a new key from the master encryption key. We use the derived key for
pg_wrap and pg_unwrap SQL functions, instead of directly using the
master encryption key. In the future, we will be able to have a
separate derived keys for block encryption. As a result of using KDF,
the minimum version of OpenSSL when enabling key management is 1.1.0.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

kms_v7.patch
Description: Binary data

Re: Internal key management system

2020-03-19 Thread Bruce Momjian

On Fri, Mar 20, 2020 at 12:50:27AM +0900, Masahiko Sawada wrote:
> On Fri, Mar 20, 2020 at 0:35 Bruce Momjian  wrote:
> Well, the issue is if the user can control the user key, there is might be
> a way to make the user key do nothing.
> 
> Well I meant ‘USER_KEY:’ is a fixed length string for the key used for wrap 
> and
> unwrap SQL interface functions. So user cannot control it. We will have 
> another
> key derived by, for example, HKDF(MK, ‘TDE_KEY:’ || system_identifier) for
> block encryption.

OK, yes, something liek that might make sense.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

On Fri, Mar 20, 2020 at 0:35 Bruce Momjian  wrote:

> On Thu, Mar 19, 2020 at 11:42:36PM +0900, Masahiko Sawada wrote:
> > On Thu, 19 Mar 2020 at 22:00, Bruce Momjian  wrote:
> > >
> > > On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> > > > On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > > > > I understand that your idea is to include fixed length string to
> the
> > > > > 256 bit key in order to separate key space. But if we do that, I
> think
> > > > > that the key strength would actually be the same as the strength of
> > > > > weaker key length, depending on how we have the fixed string. I
> think
> > > > > if we want to have multiple key spaces, we need to derive keys
> from the
> > > > > master key using KDF.
> > > >
> > > > Or we can simply generate a different encryption key for block
> > > > encryption. Therefore we will end up with having two encryption keys
> > > > inside database. Maybe we can discuss this after the key manager has
> > > > been introduced.
> > >
> > > I know Sehrope liked derived keys so let's get his feedback on this.
> We
> > > might want to have two keys anyway for key rotation purposes.
> > >
> >
> > Agreed. Maybe we can derive a key for the use of wrap and unwrap SQL
> > interface by like HKDF(MK, 'USER_KEY:') or HKDF(KM, 'USER_KEY:' ||
> > system_identifier).
>
> Well, the issue is if the user can control the user key, there is might be
> a way to make the user key do nothing.


Well I meant ‘USER_KEY:’ is a fixed length string for the key used for wrap
and unwrap SQL interface functions. So user cannot control it. We will have
another key derived by, for example, HKDF(MK, ‘TDE_KEY:’ ||
system_identifier) for block encryption.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-03-19 Thread Bruce Momjian

On Thu, Mar 19, 2020 at 11:42:36PM +0900, Masahiko Sawada wrote:
> On Thu, 19 Mar 2020 at 22:00, Bruce Momjian  wrote:
> >
> > On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> > > On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > > > I understand that your idea is to include fixed length string to the
> > > > 256 bit key in order to separate key space. But if we do that, I think
> > > > that the key strength would actually be the same as the strength of
> > > > weaker key length, depending on how we have the fixed string. I think
> > > > if we want to have multiple key spaces, we need to derive keys from the
> > > > master key using KDF.
> > >
> > > Or we can simply generate a different encryption key for block
> > > encryption. Therefore we will end up with having two encryption keys
> > > inside database. Maybe we can discuss this after the key manager has
> > > been introduced.
> >
> > I know Sehrope liked derived keys so let's get his feedback on this.  We
> > might want to have two keys anyway for key rotation purposes.
> >
> 
> Agreed. Maybe we can derive a key for the use of wrap and unwrap SQL
> interface by like HKDF(MK, 'USER_KEY:') or HKDF(KM, 'USER_KEY:' ||
> system_identifier).

Well, the issue is if the user can control the user key, there might be
a way to make the user key do nothing.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

On Thu, 19 Mar 2020 at 22:00, Bruce Momjian  wrote:
>
> On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> > On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > > I understand that your idea is to include fixed length string to the
> > > 256 bit key in order to separate key space. But if we do that, I think
> > > that the key strength would actually be the same as the strength of
> > > weaker key length, depending on how we have the fixed string. I think
> > > if we want to have multiple key spaces, we need to derive keys from the
> > > master key using KDF.
> >
> > Or we can simply generate a different encryption key for block
> > encryption. Therefore we will end up with having two encryption keys
> > inside database. Maybe we can discuss this after the key manager has
> > been introduced.
>
> I know Sehrope liked derived keys so let's get his feedback on this.  We
> might want to have two keys anyway for key rotation purposes.
>

Agreed. Maybe we can derive a key for the use of wrap and unwrap SQL
interface by like HKDF(MK, 'USER_KEY:') or HKDF(KM, 'USER_KEY:' ||
system_identifier).

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system

2020-03-19 Thread Bruce Momjian

On Thu, Mar 19, 2020 at 06:32:57PM +0900, Masahiko Sawada wrote:
> On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
> > I understand that your idea is to include fixed length string to the
> > 256 bit key in order to separate key space. But if we do that, I think
> > that the key strength would actually be the same as the strength of
> > weaker key length, depending on how we have the fixed string. I think
> > if we want to have multiple key spaces, we need to derive keys from the
> > master key using KDF.
> 
> Or we can simply generate a different encryption key for block
> encryption. Therefore we will end up with having two encryption keys
> inside database. Maybe we can discuss this after the key manager has
> been introduced.

I know Sehrope liked derived keys so let's get his feedback on this.  We
might want to have two keys anyway for key rotation purposes.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +

Re: Internal key management system

On Thu, 19 Mar 2020 at 18:32, Masahiko Sawada
 wrote:
>
> On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
>  wrote:
> >
> > Sending to pgsql-hackers again.
> >
> > On Tue, 17 Mar 2020 at 03:18, Bruce Momjian
> >  wrote:
> > >
> > > On Mon, Mar 16, 2020 at 04:13:21PM +0900, Masahiko Sawada wrote:
> > > > On Thu, 12 Mar 2020 at 08:13, Bruce Momjian
> > > >  wrote:
> > > > >
> > > > > On Fri, Mar  6, 2020 at 03:31:00PM +0900, Masahiko Sawada wrote:
> > > > > > On Fri, 6 Mar 2020 at 15:25, Moon, Insung 
> > > > > >  wrote:
> > > > > > >
> > > > > > > Dear Sawada-san
> > > > > > >
> > > > > > > I don't know if my environment or email system is weird, but the 
> > > > > > > V5
> > > > > > > patch file is only include simply a changed list.
> > > > > > > and previous V4 patch file size was 64kb, but the v5 patch file 
> > > > > > > size was 2kb.
> > > > > > > Can you check it?
> > > > > > >
> > > > > >
> > > > > > Thank you! I'd attached wrong file.
> > > > >
> > > > > Looking at this thread, I wanted to make a few comments:
> > > > >
> > > > > Everyone seems to think pgcrypto need some maintenance.  Who would 
> > > > > like
> > > > > to take on that task?
> > > > >
> > > > > This feature does require openssl since all the encryption/decryption
> > > > > happen via openssl function calls.
> > > > >
> > > > > Three are three levels of encrypt here:
> > > > >
> > > > > 1.  The master key generated during initdb
> > > > >
> > > > > 2.  The passphrase to unlock the master key at boot time.  Is that
> > > > > optional or required?
> > > >
> > > > The passphrase is required if the internal kms is enabled during
> > > > initdb. Currently hashing the passphrase is also required but it could
> > > > be optional. Even if we make hashing optional, we still require
> > > > openssl to wrap and unwrap.
> > >
> > > I think openssl should be required for any of this --- that is what I
> > > was asking.
> > >
> > > > > Could the wrap functions expose the master encryption key by passing 
> > > > > in
> > > > > empty string or null?
> > > >
> > > > Currently the wrap function returns NULL if null is passed, and
> > > > doesn't expose the master encryption key even if empty string is
> > > > passed because we add random IV for each wrapping.
> > >
> > > OK, good, makes sense, but you see why I am asking?  We never want the
> > > master key to be visible.
> >
> > Understood.
> >
> > >
> > > > >  I wonder if we should create a derived key from
> > > > > the master key to use for pg_wrap/pg_unwrap, maybe by appending a 
> > > > > fixed
> > > > > string to all strings supplied to these functions.  We could create
> > > > > another derived key for use in block-level encryption, so we are sure
> > > > > the two key spaces would never overlap.
> > > >
> > > > Currently the master key is 32 bytes but you mean to add fixed string
> > > > to the master key to derive a new key?
> > >
> > > Yes, that was my idea --- make a separate keyspace for wrap/unwrap and
> > > block-level encryption.
> >
> > I understand that your idea is to include fixed length string to the
> > 256 bit key in order to separate key space. But if we do that, I think
> > that the key strength would actually be the same as the strength of
> > weaker key length, depending on how we have the fixed string. I think
> > if we want to have multiple key spaces, we need to derive keys from the
> > master key using KDF.
>
> Or we can simply generate a different encryption key for block
> encryption. Therefore we will end up with having two encryption keys
> inside database. Maybe we can discuss this after the key manager has
> been introduced.
>

Attached updated version patch. This patch incorporated the comments
and changed pg_upgrade so that we take over the master encryption key
from the old cluster to the new one if both enable key management.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


kms_v6.patch
Description: Binary data

Re: Internal key management system

On Thu, 19 Mar 2020 at 15:59, Masahiko Sawada
 wrote:
>
> Sending to pgsql-hackers again.
>
> On Tue, 17 Mar 2020 at 03:18, Bruce Momjian
>  wrote:
> >
> > On Mon, Mar 16, 2020 at 04:13:21PM +0900, Masahiko Sawada wrote:
> > > On Thu, 12 Mar 2020 at 08:13, Bruce Momjian
> > >  wrote:
> > > >
> > > > On Fri, Mar  6, 2020 at 03:31:00PM +0900, Masahiko Sawada wrote:
> > > > > On Fri, 6 Mar 2020 at 15:25, Moon, Insung 
> > > > >  wrote:
> > > > > >
> > > > > > Dear Sawada-san
> > > > > >
> > > > > > I don't know if my environment or email system is weird, but the V5
> > > > > > patch file is only include simply a changed list.
> > > > > > and previous V4 patch file size was 64kb, but the v5 patch file 
> > > > > > size was 2kb.
> > > > > > Can you check it?
> > > > > >
> > > > >
> > > > > Thank you! I'd attached wrong file.
> > > >
> > > > Looking at this thread, I wanted to make a few comments:
> > > >
> > > > Everyone seems to think pgcrypto need some maintenance.  Who would like
> > > > to take on that task?
> > > >
> > > > This feature does require openssl since all the encryption/decryption
> > > > happen via openssl function calls.
> > > >
> > > > Three are three levels of encrypt here:
> > > >
> > > > 1.  The master key generated during initdb
> > > >
> > > > 2.  The passphrase to unlock the master key at boot time.  Is that
> > > > optional or required?
> > >
> > > The passphrase is required if the internal kms is enabled during
> > > initdb. Currently hashing the passphrase is also required but it could
> > > be optional. Even if we make hashing optional, we still require
> > > openssl to wrap and unwrap.
> >
> > I think openssl should be required for any of this --- that is what I
> > was asking.
> >
> > > > Could the wrap functions expose the master encryption key by passing in
> > > > empty string or null?
> > >
> > > Currently the wrap function returns NULL if null is passed, and
> > > doesn't expose the master encryption key even if empty string is
> > > passed because we add random IV for each wrapping.
> >
> > OK, good, makes sense, but you see why I am asking?  We never want the
> > master key to be visible.
>
> Understood.
>
> >
> > > >  I wonder if we should create a derived key from
> > > > the master key to use for pg_wrap/pg_unwrap, maybe by appending a fixed
> > > > string to all strings supplied to these functions.  We could create
> > > > another derived key for use in block-level encryption, so we are sure
> > > > the two key spaces would never overlap.
> > >
> > > Currently the master key is 32 bytes but you mean to add fixed string
> > > to the master key to derive a new key?
> >
> > Yes, that was my idea --- make a separate keyspace for wrap/unwrap and
> > block-level encryption.
>
> I understand that your idea is to include fixed length string to the
> 256 bit key in order to separate key space. But if we do that, I think
> that the key strength would actually be the same as the strength of
> weaker key length, depending on how we have the fixed string. I think
> if we want to have multiple key spaces, we need to derive keys from the
> master key using KDF.

Or we can simply generate a different encryption key for block
encryption. Therefore we will end up with having two encryption keys
inside database. Maybe we can discuss this after the key manager has
been introduced.

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Internal key management system