On Sun, 31 May 2020 at 17:13, Fabien COELHO <coe...@cri.ensmp.fr> wrote:
>
>
> Hello Masahiko-san,
>
> >> I am sharing here a document patch based on top of kms_v10 that was
> >> shared awhile back. This document patch aims to cover more design
> >> details of the current KMS design and to help people understand KMS
> >> better. Please let me know if you have any more comments.
>
> A few questions and comments, mostly about the design. If I'm off topic,
> or these concerns have been clearly addressed in the thread, please accept
> my apology.

Thank you for your comments! Please correct me if I'm misunderstanding
your questions and comments.

>
> A lot of what I write is based on guessing from a look at the doc & code
> provided in the patch. The patch should provide some explanatory README
> about the overall design.

Agreed.

>
> It is a lot of code, which for me should not be there, inside the backend.
> Could this whole thing be an extension? I cannot see why not. If it could,
> then ISTM that it should. If not, what set of features is needed to allow
> that as an extension? How could pg be improved so that it could be an
> extension?

Let me explain some background on TDE, which is what is behind this
key manager patch.

This key manager is aimed at managing the cryptographic keys used for
transparent data encryption. As a result of the discussion, we
concluded that it's safer to use multiple keys to encrypt database
data rather than one key for the whole cluster, for example to make
sure different data is never encrypted with the same key and IV.
Therefore, in terms of TDE, the minimum requirement is that PostgreSQL
can use multiple keys.

To use multiple keys in PG, there are roughly two possible designs:

1. Store all keys for TDE in an external KMS, and have PG fetch them
from it as needed.
2. Have PG manage all keys for TDE internally and protect those keys
on disk with a key (i.e. the KEK) stored in the external KMS.

There are pros and cons to each design. To take one drawback of #1 as
an example, the interaction between PG and the external KMS could be
complex: creating, removing, and rotating keys, and so on. We could
implement these operations in an extension that interacts with
different kinds of external KMS, and perhaps we could use KMIP, but
the development cost could become high because we might need a
different extension for each key management solution/service.

#2 is better on that point; the only interaction between PG and the
KMS is GET. Other databases that employ a similar approach are SQL
Server and DB2.
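
To illustrate the difference in interaction surface, here is a rough
sketch (the class and method names are made up for illustration; they
are not from the patch):

from abc import ABC, abstractmethod


class ExternalKmsDesign1(ABC):
    """Design #1: PG asks the external KMS for every key operation."""

    @abstractmethod
    def create_key(self, key_id: str) -> bytes: ...

    @abstractmethod
    def get_key(self, key_id: str) -> bytes: ...

    @abstractmethod
    def rotate_key(self, key_id: str) -> bytes: ...

    @abstractmethod
    def remove_key(self, key_id: str) -> None: ...


class ExternalKmsDesign2(ABC):
    """Design #2: the only call PG ever makes is GET of the KEK;
    creating, rotating and removing internal keys happens inside PG."""

    @abstractmethod
    def get_kek(self) -> bytes: ...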

As for the necessity of introducing the key manager into PG core: I
think at least TDE needs to be implemented in PG core, and since this
key manager exists to manage the keys for TDE, it also needs to be in
core so that the TDE functionality doesn't depend on external modules.

>
> Also, I'm not at fully at ease with some of the underlying principles
> behind this proposal. Are we re-inventing/re-implementing kerberos or
> whatever? Are we re-implementing a brand new KMS inside pg? Why having
> our own?

As I explained above, this key manager is for managing internal keys
used by TDE. It's not an alternative to existing key management
solutions/services.

The requirements for this key manager are generating internal keys,
letting other PG components use them, protecting them with the KEK
when they are persisted, and supporting KEK rotation. It doesn't have
a feature for letting users store arbitrary keys in it, as other key
management solutions/services do.
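
For example, the "protect when persisting" part could look roughly
like the sketch below, which wraps each key with RFC 3394 AES key
wrapping via the Python cryptography package (the directory layout,
key IDs and key length are only illustrative, not the patch's actual
format):

import os
from pathlib import Path

from cryptography.hazmat.primitives.keywrap import aes_key_wrap

KEY_LEN = 32  # AES-256 internal keys (illustrative)


def bootstrap_internal_keys(kek: bytes, keydir: Path,
                            key_ids: list[str]) -> None:
    """Generate the internal keys and persist them wrapped by the KEK,
    so only wrapped key material ever touches the disk."""
    keydir.mkdir(parents=True, exist_ok=True)
    for key_id in key_ids:
        internal_key = os.urandom(KEY_LEN)         # generated inside PG
        wrapped = aes_key_wrap(kek, internal_key)  # protected by the KEK
        (keydir / key_id).write_bytes(wrapped)


# e.g. bootstrap_internal_keys(kek, Path("pg_cryptokeys"),
#                              ["relation", "wal"])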

>
> I think that key management should *not* belong to pg itself, but to some
> external facility/process with which pg would interact, so that no master
> key would ever be inside pg process, and possibly not on the same host, if
> it was me doing it.
>
> If some extension could provide it inside the process and stores thing
> inside some pg_cryptokeys directory, then fine if it fits the threat model
> being addressed, but the paranoïd user wanting that should have other
> options which could be summarized as "outside".
>
> Another benefit of "outside" is that if there is a security issue attached
> to the kms, then it would not be a pg security issue, and it would not
> affect normal pg users which do not use the feature.

I agree that the key used to encrypt data must not be placed on the
same host, but that's true only when the key is not protected, right?
In this key manager, since we protect all internal keys with the KEK,
there is no problem unless the KEK is leaked. The KEK can be obtained
from an outside key management solution/service through
cluster_passphrase_command.
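
To be concrete, the startup-side handling could look roughly like this
sketch; it assumes the KEK is derived with a KDF from the passphrase
printed by the command (the derivation parameters here are
illustrative, not necessarily what the patch does):

import shlex
import subprocess

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC


def get_kek(cluster_passphrase_command: str, salt: bytes) -> bytes:
    """Run the configured command and derive the KEK from the
    passphrase it prints.  Only the passphrase crosses the process
    boundary; the unwrapped internal keys never leave PG."""
    passphrase = subprocess.run(
        shlex.split(cluster_passphrase_command),
        check=True, capture_output=True,
    ).stdout.strip()
    kdf = PBKDF2HMAC(algorithm=hashes.SHA512(), length=32,
                     salt=salt, iterations=100_000)
    return kdf.derive(passphrase)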

>
> Also, implementing a crash-safe key rotation algorithm does not look like
> inside pg backend, that is not its job.

The key rotation this key manager supports is KEK rotation, which is
very important. Without KEK rotation, once the KEK is leaked an
attacker can get the database data by disk theft. Since the KEK is
responsible for encrypting all internal keys, the internal keys must
be re-encrypted when the KEK is rotated, and I think PG is the only
component that can do that job.
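
The re-encryption step itself is small; roughly something like the
sketch below, where crash safety comes from writing each re-wrapped
key to a temporary file and renaming it into place (the file naming
and layout are again only illustrative):

import os
from pathlib import Path

from cryptography.hazmat.primitives.keywrap import aes_key_unwrap, aes_key_wrap


def rotate_kek(old_kek: bytes, new_kek: bytes, keydir: Path) -> None:
    """Re-wrap every internal key under the new KEK.  Writing a
    temporary file and renaming it keeps either the old or the new
    wrapped copy fully intact if we crash in the middle."""
    for path in [p for p in keydir.iterdir() if p.suffix != ".tmp"]:
        internal_key = aes_key_unwrap(old_kek, path.read_bytes())
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(aes_key_wrap(new_kek, internal_key))
        with open(tmp, "rb") as f:
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic on POSIX
        # (a real implementation would also fsync the directory)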

In terms of rotation of the internal keys, an idea proposed during the
discussion is to change the internal keys during pg_basebackup: the
sender transfers database data after decryption, and the receiver
encrypts the received data with internal keys different from the
sender's.

>  Likewise, the AEAD AES-CBC
> HMAC-SHA512 does definitely not belong to postgres core backend
> implementation. Why should I use the OpenSSL library and not some other
> facility?

The purpose of AEAD is to do both things: encrypt the internal keys
and check their integrity. We cannot do an integrity check using AES
alone. Another option could be to use AES key wrapping[1].
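
To show what the construction buys us, here is a minimal
encrypt-then-MAC sketch (AES-256-CBC followed by HMAC-SHA512). It
illustrates the scheme, not the patch's exact on-disk format, and the
split into separate encryption and MAC keys is my assumption:

import os

from cryptography.hazmat.primitives import hashes, hmac, padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes


def wrap_key(enc_key: bytes, mac_key: bytes, internal_key: bytes) -> bytes:
    """Encrypt with AES-256-CBC, then HMAC-SHA512 over IV||ciphertext,
    so tampering with the stored key is detected on unwrap."""
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(internal_key) + padder.finalize()
    enc = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
    ct = enc.update(padded) + enc.finalize()
    h = hmac.HMAC(mac_key, hashes.SHA512())
    h.update(iv + ct)
    return iv + ct + h.finalize()


def unwrap_key(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    iv, ct, tag = blob[:16], blob[16:-64], blob[-64:]
    h = hmac.HMAC(mac_key, hashes.SHA512())
    h.update(iv + ct)
    h.verify(tag)  # raises InvalidSignature on corruption or tampering
    dec = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).decryptor()
    padded = dec.update(ct) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()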

>
> Basically, I'm -1 on having such a feature right inside pg, and +1 on
> allowing pg to have it outside and interact with it appropriately,
> preferably through an extension which could be in core.
>
> So my take is that pg should allow an extension to:
>
>   - provide a *generic* way to interact with an *external* kms
>     eg by running a command (possibly setuid something) and interacting
>     with its stdin/stderr what the command does should be of no concern
>     to pg and use some trivial text protocol, and the existing code
>     can be wrapped as an example working implementation.
>
>   - store some local keys somewhere and provide functions to use these
>     keys to encrypt/decrypt stuff, obviously, as generic as possible.
>
>     ISTM that what crypto algorithms are actually used should not be
>     hardcoded, but I'm not sure how to achieve that. Maybe simply by
>     redefining the relevant function, maybe at the SQL level.
>

I think this key manager satisfies the first point via
cluster_passphrase_command. For the second point, the key manager
stores the local keys inside PG while protecting them with a KEK
managed outside of PG.
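
As an illustration of the "outside" part, the command can be any
program the DBA controls; something like the following hypothetical
script (the my-kms-client tool and the secret name are made up) could
be pointed to by --cluster-passphrase-command:

#!/usr/bin/env python3
"""Hypothetical cluster_passphrase_command: fetch the passphrase from
an external secret store and print it on stdout for the server."""
import subprocess
import sys


def main() -> int:
    # "my-kms-client" stands in for whatever external KMS tooling is
    # in use (a vault, a cloud KMS CLI, an HSM wrapper, ...).
    result = subprocess.run(
        ["my-kms-client", "get-secret", "pg-cluster-passphrase"],
        check=True, capture_output=True,
    )
    sys.stdout.buffer.write(result.stdout.strip())
    return 0


if __name__ == "__main__":
    raise SystemExit(main())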

Inspired by SQL Server's Always Encrypted, I implemented pg_encrypt()
and pg_decrypt(), but these are not strictly necessary for TDE. We
could introduce the key manager with no internal keys and then add the
necessary keys when introducing TDE.

I agree with the point that crypto algorithms should not be hardcoded.

> There is an open question on how the "command" validates that it is indeed
> the right pg which is interacting with it. This means some authentication,
> probably some passphrase to provide somehow, probably close to what is
> being implemented, so from an interface point of view, it could look quite
> the same, but the key point is that the whole thing would be out of
> postgres process, only encryption keys being used would be in postgres,
> and probably only in the process which actually needs it.
>

I might be missing your point, but is the question how to verify that
the passphrase given by cluster_passphrase_command is correct?
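
If so, one possible way (a sketch of the general approach, not
necessarily the patch's exact mechanism) is to also derive an HMAC key
from the passphrase, store an HMAC computed over the wrapped keys, and
recompute and compare it whenever a passphrase is supplied:

import hashlib
import hmac


def make_verifier(mac_key: bytes, wrapped_keys: bytes) -> bytes:
    """Computed once when the passphrase is (re)set and stored on
    disk next to the wrapped keys."""
    return hmac.new(mac_key, wrapped_keys, hashlib.sha512).digest()


def passphrase_is_valid(mac_key: bytes, wrapped_keys: bytes,
                        stored_verifier: bytes) -> bool:
    """Recompute the HMAC with the key derived from the supplied
    passphrase; a mismatch means a wrong passphrase (or corruption)."""
    return hmac.compare_digest(
        hmac.new(mac_key, wrapped_keys, hashlib.sha512).digest(),
        stored_verifier,
    )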

> Random comments about details I saw in passing:
>
> * key_management_enabled
>
> key_management (on|off) ?
>
> * initdb -D dbname --cluster-passphrase-command="cat /path/to/passphrase-file"
>
> Putting example in the documentation looks like a recommendation. It would
> put a caveat that doing the above is probably a bad idea.

Agreed on the above two points.

Regards,

[1] https://tools.ietf.org/html/rfc3394

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

