Re: Proposed patch for key managment

2020-12-31 Thread Stephen Frost
Greetings,

* Fabien COELHO (coe...@cri.ensmp.fr) wrote:
> >>The implementations should not have to be in any particular language: Shell,
> >>Perl, Python, C should be possible.
> >
> >I disagree that it makes any sense to pass actual encryption out to a
> >shell script.
> 
> Yes, sure. I'm talking about key management. For actual crypto functions,
> ISTM that they should be "replaceable", i.e. if some institution does not
> believe in AES then they could switch to something else.

The proposed implementation makes it pretty straight-forward to add in
other implementations and other crypto algorithms if people wish to do
so.

> >>After giving it more thought during the day, I think that only one
> >>command and a basic protocol is needed. Maybe something as simple as
> >>
> >>  /path/to/command --options arguments…
> >
> >This is what's proposed- a command is run to acquire the KEK (as a hex
> >encoded set of bytes, making it possible to use a shell script, which is
> >what the patch does too).
> 
> Yep, but that is not what I'm trying to advocate. The "KEK" (if any), would
> stay in the process, not be given back to the database or command using it.
> Then the database/tool would interact with the command to get the actual
> per-file keys when needed.

"When needed" is every single write that we do of any file in the entire
backend.  Reaching out to something external every time we go to use the
key really isn't sensible- except, perhaps, in the case where we are
doing full crypto off-loading and we don't actually have to deal with
the KEK or the DEK at all, but that's all on the cluster encryption
side, really, and isn't the focus of this patch and what it's bringing-
all of which is needed anyway.

> >[...] The API to fetch the KEK doesn't care at all about where it's stored
> >or how it's derived or anything like that.
> 
> >There's a relatively small change which could be made to have PG request
> >all of the keys that it'll need on startup, if we want to go there as has
> >been suggested elsewhere, but even if we do that, PG needs to be able to
> >do that itself too, otherwise it's not a complete capability and there
> >seems little point in doing something that's just a pass-thru to something
> >else and isn't able to really be used.
> 
> I do not understand. Postgres should provide a working implementation,
> whether the functions are directly inside pg or though an external process,
> they need to be there anyway. I'm trying to fix a clean, possibly external
> API so that it is possible to have something different from the choices made
> in the patch.

Right, we need a working implementation that's part of core, that's what
this effort is trying to bring.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-31 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Wed, Dec 30, 2020 at 06:49:34PM -0500, Stephen Frost wrote:
> > The API to fetch the KEK doesn't care at all about where it's stored or
> > how it's derived or anything like that.  There's a relatively small
> > change which could be made to have PG request all of the keys that it'll
> > need on startup, if we want to go there as has been suggested elsewhere,
> > but even if we do that, PG needs to be able to do that itself too,
> > otherwise it's not a complete capability and there seems little point in
> > doing something that's just a pass-thru to something else and isn't able
> > to really be used.
> 
> Right now, the script returns a cluster key (KEK), and during initdb the
> server generates data encryption keys and wraps them with the KEK. 
> During server start, the server validates the KEK and decrypts the data
> keys.  pg_alterckey allows changing the KEK.

Right- all of those are really necessary things for PG to have a useful
implementation.

> I think Fabien is saying this all should _only_ be done using external
> tools --- that's what I don't agree with.

I also don't agree that everything should be external.  We effectively
have that today and it certainly doesn't get us any closer to having a
solution in PG, a point which we are frequently pointed to as a reason
why certain, entirely reasonable, use-cases can't be satisfied with PG.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-31 Thread Fabien COELHO



Hello,


The API to fetch the KEK doesn't care at all about where it's stored or
how it's derived or anything like that.  There's a relatively small
change which could be made to have PG request all of the keys that it'll
need on startup, if we want to go there as has been suggested elsewhere,
but even if we do that, PG needs to be able to do that itself too,
otherwise it's not a complete capability and there seems little point in
doing something that's just a pass-thru to something else and isn't able
to really be used.


Right now, the script returns a cluster key (KEK), and during initdb the
server generates data encryption keys and wraps them with the KEK.
During server start, the server validates the KEK and decrypts the data
keys.  pg_alterckey allows changing the KEK.

I think Fabien is saying this all should _only_ be done using external
tools --- that's what I don't agree with.


Yep.

I could compromise on "could be done using an external tool", but that 
requires designing the API and thinking about where and how things are 
done before everything is hardwired. Designing afterwards is too late. 
ISTM that the current patch does not separate API design and cryptographic 
design, so both are deeply entwined, and would be difficult to 
disentangle.


--
Fabien.




Re: Proposed patch for key managment

2020-12-31 Thread Fabien COELHO


Hello Stephen,


The implementations should not have to be in any particular language: Shell,
Perl, Python, C should be possible.


I disagree that it makes any sense to pass actual encryption out to a
shell script.


Yes, sure. I'm talking about key management. For actual crypto functions, 
ISTM that they should be "replaceable", i.e. if some institution does not 
believe in AES then they could switch to something else.



After giving it more thought during the day, I think that only one
command and a basic protocol is needed. Maybe something as simple as

  /path/to/command --options arguments…


This is what's proposed- a command is run to acquire the KEK (as a hex
encoded set of bytes, making it possible to use a shell script, which is
what the patch does too).


Yep, but that is not what I'm trying to advocate. The "KEK" (if any), 
would stay in the process, not be given back to the database or command 
using it. Then the database/tool would interact with the command to get 
the actual per-file keys when needed.


[...] The API to fetch the KEK doesn't care at all about where it's 
stored or how it's derived or anything like that.


There's a relatively small change which could be made to have PG request 
all of the keys that it'll need on startup, if we want to go there as 
has been suggested elsewhere, but even if we do that, PG needs to be 
able to do that itself too, otherwise it's not a complete capability and 
there seems little point in doing something that's just a pass-thru to 
something else and isn't able to really be used.


I do not understand. Postgres should provide a working implementation, 
whether the functions are directly inside pg or though an external 
process, they need to be there anyway. I'm trying to fix a clean, possibly 
external API so that it is possible to have something different from the 
choices made in the patch.


--
Fabien.

Re: Proposed patch for key managment

2020-12-30 Thread Bruce Momjian
On Wed, Dec 30, 2020 at 06:49:34PM -0500, Stephen Frost wrote:
> The API to fetch the KEK doesn't care at all about where it's stored or
> how it's derived or anything like that.  There's a relatively small
> change which could be made to have PG request all of the keys that it'll
> need on startup, if we want to go there as has been suggested elsewhere,
> but even if we do that, PG needs to be able to do that itself too,
> otherwise it's not a complete capability and there seems little point in
> doing something that's just a pass-thru to something else and isn't able
> to really be used.

Right now, the script returns a cluster key (KEK), and during initdb the
server generates data encryption keys and wraps them with the KEK. 
During server start, the server validates the KEK and decrypts the data
keys.  pg_alterckey allows changing the KEK.

I think Fabien is saying this all should _only_ be done using external
tools --- that's what I don't agree with.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-30 Thread Stephen Frost
Greetings,

* Fabien COELHO (coe...@cri.ensmp.fr) wrote:
> I think that an API should be carefully thought about, without assumption
> about the underlying cryptography (algorithm, key lengths, modes, how keys
> are derived and stored, and so on), and its usefulness be demonstrated by
> actually being used for one implementation which would be what is currently
> being proposed in the patch, and possibly others thrown in for free.

Perhaps I'm misunderstanding, but this is basically what we have in the
currently proposed patch as 'cipher.h', with cipher.c as a stub that
fails if called directly, and cipher_openssl.c with the actual OpenSSL
based implementation.

> The implementations should not have to be in any particular language: Shell,
> Perl, Python, C should be possible.

I disagree that it makes any sense to pass actual encryption out to a
shell script.

> After giving it more thought during the day, I think that only one
> command and a basic protocol is needed. Maybe something as simple as
> 
>   /path/to/command --options arguments…

This is what's proposed- a command is run to acquire the KEK (as a hex
encoded set of bytes, making it possible to use a shell script, which is
what the patch does too).

> With a basic (text? binary?) protocol on stdin/stdout (?) for the different
> functions. What the command actually does (connect to a remote server, ask
> for a master password, open some other database, whatever) should be
> irrelevant to pg, which would just get and pass bunch of bytes to functions,
> which could use them for keys, secrets, whatever, and be easily replaceable.

Right- we just get bytes back from the command and then we use that as
the KEK.  How that scripts gets those bytes is entirely up to the script
and is not an issue for PG to worry about.

> The API should NOT make assumptions about the cryptographic design, what
> depends about what, where things are stored… ISTM that Pg should only care
> about naming keys, holding them when created/retrieved (but not create
> them), basically interacting with the key manager, passing the stuff to
> functions for encryption/decryption seen as black boxes.

The API to fetch the KEK doesn't care at all about where it's stored or
how it's derived or anything like that.  There's a relatively small
change which could be made to have PG request all of the keys that it'll
need on startup, if we want to go there as has been suggested elsewhere,
but even if we do that, PG needs to be able to do that itself too,
otherwise it's not a complete capability and there seems little point in
doing something that's just a pass-thru to something else and isn't able
to really be used.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-28 Thread Fabien COELHO



I want to repeat here what I said in another thread:


I think ultimately we will need three commands to control the keys.
First, there is the cluster_key_command, which we have now.  Second, I
think we will need an optional command which returns random bytes ---
this would allow users to get random bytes from a different source than
that used by the server code.

Third, we will probably need a command that returns the data encryption
keys directly, either heap/index or WAL keys, probably based on key
number --- you pass the key number you want, and the command returns the
data key.  There would not be a cluster key in this case, but the
command could still prompt the user for perhaps a password to the KMS
server. It could not be used if any of the previous two commands are
used. I assume an HMAC would still be stored in the pg_cryptokeys
directory to check that the right key has been returned.

I thought we should implement the first command, because it will
probably be the most common and easiest to use, and then see what people
want added.


There is also a fourth option where the command returns multiple keys,
one per line of hex digits.  That could be written in shell script, but
it would be fragile and complex.  It could be written in Perl, but that
would add a new language requirement for this feature.  It could be
written in C, but that would limits its flexibility because changes
would require a recompile, and you would probably need that C file to
call external scripts to tailor input like we do now from the server.

You could actually write a full implemention of what we do on the server
side in client code, but pg_alterckey would not work, since it would not
know the data format, so we would need another cluster key alter for that.

It could be written as a C extension, but that would be also have C's
limitations.  In summary, having the server do most of the complex work
for the default case seems best, and eventually allowing the ability for
the client to do everything seems ideal.  I think we need more input
before we go beyond what we do now.


As I said in the commit thread, I disagree with this approach because it 
pushes for no or partial or maybe bad design.


I think that an API should be carefully thought about, without assumption 
about the underlying cryptography (algorithm, key lengths, modes, how keys 
are derived and stored, and so on), and its usefulness be demonstrated by 
actually being used for one implementation which would be what is 
currently being proposed in the patch, and possibly others thrown in for 
free.


The implementations should not have to be in any particular language: 
Shell, Perl, Python, C should be possible.


After giving it more thought during the day, I think that only one
command and a basic protocol is needed. Maybe something as simple as

  /path/to/command --options arguments…

With a basic (text? binary?) protocol on stdin/stdout (?) for the 
different functions. What the command actually does (connect to a remote 
server, ask for a master password, open some other database, whatever) 
should be irrelevant to pg, which would just get and pass bunch of bytes 
to functions, which could use them for keys, secrets, whatever, and be 
easily replaceable.


The API should NOT make assumptions about the cryptographic design, what 
depends about what, where things are stored… ISTM that Pg should only care 
about naming keys, holding them when created/retrieved (but not create 
them), basically interacting with the key manager, passing the stuff to 
functions for encryption/decryption seen as black boxes.


I may have suggested something along these lines at the beginning of the 
key management thread, probably. Not going this way implicitely implies 
making some assumptions which may or may not suit other use cases, so 
makes them specific not generic. I do not think pg should do that.


--
Fabien.

Re: Proposed patch for key managment

2020-12-27 Thread Bruce Momjian
On Sat, Dec 19, 2020 at 11:45:08AM +, Alastair Turner wrote:
> On Fri, 18 Dec 2020 at 21:36, Stephen Frost  wrote:
> > These ideas don't seem too bad but I'd think we'd pass which key we want
> > on the command-line using a %i or something like that, rather than using
> > stdin, unless there's some reason that would be an issue..?  Certainly
> > the CLI utilities I've seen tend to expect the name of the secret that
> > you're asking for to be passed on the command line.
> 
> Agreed. I was working on the assumption that the calling process
> (initdb or pg_ctl) would have access to the challenge material and be
> passing it to the utility, so putting it in a command line would allow
> it to leak. If the utility is accessing the challenge material
> directly, then just an indicator of which key to work with would be a
> lot simpler, true.

I want to repeat here what I said in another thread:

> I think ultimately we will need three commands to control the keys.
> First, there is the cluster_key_command, which we have now.  Second, I
> think we will need an optional command which returns random bytes ---
> this would allow users to get random bytes from a different source than
> that used by the server code.
> 
> Third, we will probably need a command that returns the data encryption
> keys directly, either heap/index or WAL keys, probably based on key
> number --- you pass the key number you want, and the command returns the
> data key.  There would not be a cluster key in this case, but the
> command could still prompt the user for perhaps a password to the KMS
> server. It could not be used if any of the previous two commands are
> used. I assume an HMAC would still be stored in the pg_cryptokeys
> directory to check that the right key has been returned.
> 
> I thought we should implement the first command, because it will
> probably be the most common and easiest to use, and then see what people
> want added.

There is also a fourth option where the command returns multiple keys,
one per line of hex digits.  That could be written in shell script, but
it would be fragile and complex.  It could be written in Perl, but that
would add a new language requirement for this feature.  It could be
written in C, but that would limits its flexibility because changes
would require a recompile, and you would probably need that C file to
call external scripts to tailor input like we do now from the server.

You could actually write a full implemention of what we do on the server
side in client code, but pg_alterckey would not work, since it would not
know the data format, so we would need another cluster key alter for that.

It could be written as a C extension, but that would be also have C's
limitations.  In summary, having the server do most of the complex work
for the default case seems best, and eventually allowing the ability for
the client to do everything seems ideal.  I think we need more input
before we go beyond what we do now.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-22 Thread Bruce Momjian
On Tue, Dec 22, 2020 at 10:40:17AM -0500, Bruce Momjian wrote:
> On Mon, Dec 21, 2020 at 10:07:48PM -0500, Bruce Momjian wrote:
> > Attached is the script patch.  It is also at:
> > 
> > 
> > https://github.com/postgres/postgres/compare/master...bmomjian:cfe-sh.diff
> > 
> > I think it still needs docs but those will have to be done after the
> > code doc patch is added.
> 
> Here is an updated patch.  Are people happy with the Makefile, its
> location in the source tree, and the install directory name?  I used the
> directory name 'auth_commands' because I thought 'auth' was too easily
> misinterpreted. I put the scripts in /src/backend/utils/auth_commands. 
> It also contains a script that can be used for SSL passphrase prompting,
> but I haven't written the C code for that yet.

Here is a new patch, build on previous patches, which allows for the SSL
passphrase to be prompted from the terminal.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 639c623..850813e
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** include_dir 'conf.d'
*** 1452,1469 
  mechanism is used.
 
 
! The command must print the passphrase to the standard output and exit
! with code 0.  In the parameter value, %p is
! replaced by a prompt string.  (Write %% for a
! literal %.)  Note that the prompt string will
! probably contain whitespace, so be sure to quote adequately.  A single
! newlines is stripped from the end of the output if present.
!
!
! The command does not actually have to prompt the user for a
! passphrase.  It can read it from a file, obtain it from a keychain
! facility, or similar.  It is up to the user to make sure the chosen
! mechanism is adequately secure.
 
 
  This parameter can only be set in the postgresql.conf
--- 1452,1469 
  mechanism is used.
 
 
! The command must print the passphrase to the standard output
! and exit with code 0.  It can prompt from the terminal if
! --authprompt is used.  In the parameter value,
! %R represents the file descriptor number opened
! to the terminal that started the server.  A file descriptor is only
! available if enabled at server start.  If %R
! is used and no file descriptor is available, the server will not
! start.  Value %p is replaced by a pre-defined
! prompt string.  (Write %% for a literal
! %.)  Note that the prompt string will probably
! contain whitespace, so be sure to quote its use adequately.
! Newlines are stripped from the end of the output if present.
 
 
  This parameter can only be set in the postgresql.conf
*** include_dir 'conf.d'
*** 1486,1495 
  parameter is off (the default), then
  ssl_passphrase_command will be ignored during a
  reload and the SSL configuration will not be reloaded if a passphrase
! is needed.  That setting is appropriate for a command that requires a
! TTY for prompting, which might not be available when the server is
! running.  Setting this parameter to on might be appropriate if the
! passphrase is obtained from a file, for example.
 
 
  This parameter can only be set in the postgresql.conf
--- 1486,1495 
  parameter is off (the default), then
  ssl_passphrase_command will be ignored during a
  reload and the SSL configuration will not be reloaded if a passphrase
! is needed.  This setting is appropriate for a command that requires a
! terminal for prompting, which might not be available when the server is
! running.  Setting this parameter on might be appropriate, for
! example, if the passphrase is obtained from a file.
 
 
  This parameter can only be set in the postgresql.conf
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
new file mode 100644
index f04e417..0662ae0
*** a/doc/src/sgml/ref/pg_ctl-ref.sgml
--- b/doc/src/sgml/ref/pg_ctl-ref.sgml
*** PostgreSQL documentation
*** 380,387 
--authprompt

 
! Allows the --cluster-key-command command
! to prompt for a passphrase or PIN.
 

   
--- 380,388 
--authprompt

 
! Allows ssl_passphrase_command or
! cluster_key_command to prompt for a passphrase
! or PIN.
 

   
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
new file mode 100644
index 

Re: Proposed patch for key managment

2020-12-22 Thread Bruce Momjian
On Tue, Dec 22, 2020 at 04:13:06PM -0500, Bruce Momjian wrote:
> On Tue, Dec 22, 2020 at 08:15:27PM +, Alastair Turner wrote:
> > Hi Bruce
> > 
> > In ckey_passphrase.sh.sample
> > 
> > +
> > +echo "$PASS" | sha256sum | cut -d' ' -f1
> > +
> > 
> > Under the threat model discussed, a copy of the keyfile could be
> > attacked offline. So getting from passphrase to DEKs should be as
> > resource intensive as possible to slow down brute-force attempts.
> > Instead of just a SHA hash, this should be at least a PBKDF2 (PKCS#5)
> 
> I am satisfied with the security of SHA256.

Sorry, I should have said I am happy with a SHA512 HMAC in a 256-bit
keyspace.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-22 Thread Bruce Momjian
On Tue, Dec 22, 2020 at 08:15:27PM +, Alastair Turner wrote:
> Hi Bruce
> 
> In ckey_passphrase.sh.sample
> 
> +
> +echo "$PASS" | sha256sum | cut -d' ' -f1
> +
> 
> Under the threat model discussed, a copy of the keyfile could be
> attacked offline. So getting from passphrase to DEKs should be as
> resource intensive as possible to slow down brute-force attempts.
> Instead of just a SHA hash, this should be at least a PBKDF2 (PKCS#5)

I am satisfied with the security of SHA256.

> On Tue, 22 Dec 2020 at 15:40, Bruce Momjian  wrote:
> >
> > Here is an updated patch.  Are people happy with the Makefile, its
> > location in the source tree, and the install directory name?  I used the
> > directory name 'auth_commands' because I thought 'auth' was too easily
> > misinterpreted. I put the scripts in /src/backend/utils/auth_commands.
> >
> 
> What's implemented in these patches is an internal keystore, wrapped
> with a key derived from a passphrase. I'd think that the scripts
> directory should reflect what they interact with, so
> 'keystore_commands' or 'local_keystore_command' sounds more specific
> and therefore better than 'auth_commands'.

The point is that some commands are used for keystore and some for SSL
certificate passphrase entry, hence "auth".

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-22 Thread Alastair Turner
Hi Bruce

In ckey_passphrase.sh.sample

+
+echo "$PASS" | sha256sum | cut -d' ' -f1
+

Under the threat model discussed, a copy of the keyfile could be
attacked offline. So getting from passphrase to DEKs should be as
resource intensive as possible to slow down brute-force attempts.
Instead of just a SHA hash, this should be at least a PBKDF2 (PKCS#5)
operation or preferably scrypt (which is memory intensive as well as
computationally intensive). While OpenSSL provides these key
derivation options for it's encoding operations, it does not offer a
command line option for key derivation using them. What's your view on
including third party utilities like nettlepbkdf in the sample or
providing this utility as a compiled program?

In the same theme - in ossl_cipher_ctx_create and the functions using
that context should rather be using the key-wrapping variants of the
cipher. As well as lower throughput, with the resulting effects on
brute force attempts, the key wrap variants provide more robust
ciphertext output
"
KW, KWP, and TKW are each robust in the sense that each bit of output
can be expected to depend in a nontrivial fashion on each bit of
input, even when the length of the input data is greater than one
block. This property is achieved at the cost of a considerably lower
throughput rate, compared to other authenticated-encryption modes, but
the tradeoff may be appropriate for some key management applications.
For example, a robust method may be desired when the length of the
keys to be protected is greater than the block size of the underlying
block cipher, or when the value of the protected data is very high.
" - NIST SP 800-38F

and aim to manage the risk of short or predictable IVs ([1] final para
of page 1)

On Tue, 22 Dec 2020 at 15:40, Bruce Momjian  wrote:
>
> Here is an updated patch.  Are people happy with the Makefile, its
> location in the source tree, and the install directory name?  I used the
> directory name 'auth_commands' because I thought 'auth' was too easily
> misinterpreted. I put the scripts in /src/backend/utils/auth_commands.
>

What's implemented in these patches is an internal keystore, wrapped
with a key derived from a passphrase. I'd think that the scripts
directory should reflect what they interact with, so
'keystore_commands' or 'local_keystore_command' sounds more specific
and therefore better than 'auth_commands'.

Regards
Alastair

[1] https://web.cs.ucdavis.edu/~rogaway/papers/keywrap.pdf




Re: Proposed patch for key managment

2020-12-22 Thread Bruce Momjian
On Mon, Dec 21, 2020 at 10:07:48PM -0500, Bruce Momjian wrote:
> Attached is the script patch.  It is also at:
> 
>   
> https://github.com/postgres/postgres/compare/master...bmomjian:cfe-sh.diff
> 
> I think it still needs docs but those will have to be done after the
> code doc patch is added.

Here is an updated patch.  Are people happy with the Makefile, its
location in the source tree, and the install directory name?  I used the
directory name 'auth_commands' because I thought 'auth' was too easily
misinterpreted. I put the scripts in /src/backend/utils/auth_commands. 
It also contains a script that can be used for SSL passphrase prompting,
but I haven't written the C code for that yet.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



cfe-sh.diff.gz
Description: application/gzip


Re: Proposed patch for key managment

2020-12-21 Thread Bruce Momjian
On Mon, Dec 21, 2020 at 04:35:14PM -0500, Bruce Momjian wrote:
> On Sun, Dec 20, 2020 at 05:59:09PM -0500, Stephen Frost wrote:
> OK, here it the updated patch.  The major change, as Stephen pointed
> out, is that the cluser_key_command (was cluster_passphrase_command) now
> returns a hex-string representing the 32-byte KEK, rather than having
> the server hash whatever the command used to return.  This avoids
> double-hashing in cases where you are _not_ using a passphrase, but are
> computing a random 32-byte value in the script.  This does mean the
> script has to run sha256 for passphrases, but it seems like a win. 
> Updated script are attached too.

Attached is the script patch.  It is also at:


https://github.com/postgres/postgres/compare/master...bmomjian:cfe-sh.diff

I think it still needs docs but those will have to be done after the
code doc patch is added.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/src/backend/utils/auth_key_cmds/Makefile b/src/backend/utils/auth_key_cmds/Makefile
new file mode 100644
index ...e0bee97
*** a/src/backend/utils/auth_key_cmds/Makefile
--- b/src/backend/utils/auth_key_cmds/Makefile
***
*** 0 
--- 1,41 
+ #-
+ #
+ # Makefile for src/backend/utils/auth_key_cmds
+ #
+ # src/backend/utils/auth_key_cmds/Makefile
+ #
+ #-
+ 
+ PGFILEDESC = "auth_key_cmds - authentication key commands"
+ PGAPPICON = win32
+ 
+ subdir = src/backend/utils/auth_key_cmds
+ top_builddir = ../../../..
+ include $(top_builddir)/src/Makefile.global
+ 
+ SCRIPTS = ckey_aws.sh.sample \
+ 	ckey_direct.sh.sample \
+ 	ckey_passphrase.sh.sample \
+ 	ckey_piv_nopin.sh.sample  \
+ 	ckey_piv_pin.sh.sample \
+ 	ssl_paasphrase.sh.sample
+ 
+ SCRIPTDIR=auth_key_cmds
+ 
+ install: all installdirs
+ 	$(INSTALL_DATA) $(SCRIPTS) '$(DESTDIR)$(datadir)/$(SCRIPTDIR)'
+ 
+ installdirs:
+ 	$(MKDIR_P) '$(DESTDIR)$(datadir)' '$(DESTDIR)$(datadir)/$(SCRIPTDIR)'
+ 
+ uninstall:
+ 	@set -e; \
+ 	set $(SCRIPT) ; \
+ 	while [ "$$#" -gt 0 ] ; \
+ 	do \
+ 		script=$$1; shift;  \
+ 		rm -f '$(DESTDIR)$(datadir)/$(SCRIPTDIR)/'$${lang}.stop ; \
+ 	done
+ 
+ clean distclean maintainer-clean: clean-lib
+ 	rm -f $(OBJS) $(SQLSCRIPT)
diff --git a/src/backend/utils/auth_key_cmds/ckey_aws.sh.sample b/src/backend/utils/auth_key_cmds/ckey_aws.sh.sample
new file mode 100755
index ...0341621
*** a/src/backend/utils/auth_key_cmds/ckey_aws.sh.sample
--- b/src/backend/utils/auth_key_cmds/ckey_aws.sh.sample
***
*** 0 
--- 1,50 
+ #!/bin/sh
+ 
+ # This uses the AWS Secrets Manager using the AWS CLI and OpenSSL.
+ 
+ [ "$#" -ne 1 ] && echo "cluster_key_command usage: $0 \"%d\"" 1>&2 && exit 1
+ # No need for %R or -R since we are not prompting
+ 
+ DIR="$1"
+ [ ! -e "$DIR" ] && echo "$DIR does not exist" 1>&2 && exit 1
+ [ ! -d "$DIR" ] && echo "$DIR is not a directory" 1>&2 && exit 1
+ 
+ # File containing the id of the AWS secret
+ AWS_ID_FILE="$DIR/aws-secret.id"
+ 
+ 
+ # --
+ 
+ 
+ # Create an AWS Secrets Manager secret?
+ if [ ! -e "$AWS_ID_FILE" ]
+ then	# The 'postgres' operating system user must have permission to
+ 	# access the AWS CLI
+ 
+ 	# The epoch-time/directory/hostname combination is unique
+ 	HASH=$(echo -n "$(date '+%s')$DIR$(hostname)" | sha1sum | cut -d' ' -f1)
+ 	AWS_SECRET_ID="Postgres-cluster-key-$HASH"
+ 
+ 	# Use stdin to avoid passing the secret on the command line
+ 	openssl rand -hex 32 |
+ 	aws secretsmanager create-secret \
+ 		--name "$AWS_SECRET_ID" \
+ 		--description 'Used for Postgres cluster file encryption' \
+ 		--secret-string 'file:///dev/stdin' \
+ 		--output text > /dev/null
+ 	if [ "$?" -ne 0 ]
+ 	then	echo 'cluster key generation failed' 1>&2
+ 		exit 1
+ 	fi
+ 
+ 	echo "$AWS_SECRET_ID" > "$AWS_ID_FILE"
+ fi
+ 
+ if ! aws secretsmanager get-secret-value \
+ 	--secret-id "$(cat "$AWS_ID_FILE")" \
+ 	--output text
+ then	echo 'cluster key retrieval failed' 1>&2
+ 	exit 1
+ fi | awk -F'\t' 'NR == 1 {print $4}'
+ 
+ exit 0
diff --git a/src/backend/utils/auth_key_cmds/ckey_direct.sh.sample b/src/backend/utils/auth_key_cmds/ckey_direct.sh.sample
new file mode 100755
index ...4a56cc3
*** a/src/backend/utils/auth_key_cmds/ckey_direct.sh.sample
--- b/src/backend/utils/auth_key_cmds/ckey_direct.sh.sample
***
*** 0 
--- 1,37 
+ #!/bin/sh
+ 
+ # This uses a key supplied by the user
+ # If OpenSSL is installed, you can generate a pseudo-random key by running:
+ #	openssl rand -hex 32
+ # To get a true random key, run:
+ #	wget -q -O - 'https://www.random.org/cgi-bin/randbyte?nbytes=32=h' | tr -d ' \n'; echo
+ 
+ [ "$#" -lt 1 ] && echo "cluster_key_command usage: $0 %R [\"%P\"]" 1>&2 && exit 1

Re: Proposed patch for key managment

2020-12-21 Thread Bruce Momjian
On Mon, Dec 21, 2020 at 03:00:32PM -0700, David G. Johnston wrote:
> I agree with Stephen; so maybe that part of the Wiki needs to be updated
> instead of having it sit there for use as a hammer when someone disagrees with
> a proffered patch.
> 
> Or maybe consider that "a patch" doesn't only mean "implement" - it is simply
> another language through which desirability and design can also be
> communicated.  The author gets to decide how wordy their prose is and choosing
> a wordy but concise language that is understandable by the vast majority of 
> the
> target audience seems like it should be considered acceptable.

If you want to debate that TODO section, you will need to start a new
thread.  I suggest Alastair's patch also appear in a new thread.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-21 Thread David G. Johnston
On Mon, Dec 21, 2020 at 2:44 PM Bruce Momjian  wrote:

> On Sun, Dec 20, 2020 at 09:38:55PM -0500, Stephen Frost wrote:
> > > I'll have a go at adding another keyring and/or abstracting the key
> > > wrap and see how well a true peer to the passphrase approach fits in.
> >
> > Having patches to review and consider definitely helps, imv.
>
> I disagree.  Our order is:
>
> Desirability -> Design -> Implement -> Test -> Review -> Commit
> https://wiki.postgresql.org/wiki/Todo#Development_Process
>
> so, by posting a patch, you are decided to skip the first _two_
> requirements.  I think it might be time for me to create a new thread
> so it is not confused by whatever is posted here.
>
>
I agree with Stephen; so maybe that part of the Wiki needs to be updated
instead of having it sit there for use as a hammer when someone disagrees
with a proffered patch.

Or maybe consider that "a patch" doesn't only mean "implement" - it is
simply another language through which desirability and design can also be
communicated.  The author gets to decide how wordy their prose is and
choosing a wordy but concise language that is understandable by the vast
majority of the target audience seems like it should be considered
acceptable.

David J.


Re: Proposed patch for key managment

2020-12-21 Thread Bruce Momjian
On Sun, Dec 20, 2020 at 05:59:09PM -0500, Stephen Frost wrote:
> I do generally like the idea of having the single command able to be
> used to either provide the KEK (where PG manages the keyring
> internally), or called multiple times to retrieve the DEKs (in which
> case PG wouldn't be managing the keyring).
> 
> However, after chatting with Bruce about it for a bit this weekend, I'm
> not sure that we need to tackle the second case today.  I don't think
> there's any doubt that there will be users who will want PG to manage
> the keyring and to be able to work with just a passphrase, as Bruce has
> worked to make happen here and which we have a patch for.  I'm hoping
> Bruce will post a new patch soon based on the work that he and I
> discussed today (mostly just clarifications around keys vs. passphrases
> and having the PG side accept a key which the command that's run will
> produce).  With that, I'm feeling pretty comfortable that we can move
> forward and at least get this initial work committed.

I agree with very little of this sub-discussion, but I am still reading
it to see if I can get useful ideas from it.  Looking at what we used to
do with 'passphrase', we did the prompting in the script, and did the
hash, HMAC validation, data key generation and its wrap in the server. 
With the 'cluster key' patch I just posted, the hashing of the
passphrase is removed from the server and happens only in the script,
because in many cases the hashing is not needed, and double-hashing is
less secure, so that seems like a win.

To go any further, you are starting to look at possible data key
generation in the script, either at boot time, and then wrapped with a
passphrase, or just retrieved on every boot.  That would create a larger
burden for any script, meaning a passphrase usage would have to not only
hash, which it does now, but also verify its MAC, then decrypt keys
stored in the file system, then echo those to its output, to be read by
the server.  I think this is the big point --- I have five scripts, and
only one needs to hash its input (passphrase).  If you go any farther in
pushing code into the scripts, these scripts become much harder to
write, and much harder to do error checks.

I think the common case is passphrase, so I want to optimize for that. 
Pushing anymore features into the scripts is going to make that common
case less reliable, which I am opposed to.

Also, as far as Desirability, we only have _one_ person saying people
will want more options.  I need to hear from a lot more people before
this gets added to Postgres.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-21 Thread Bruce Momjian
On Sun, Dec 20, 2020 at 09:38:55PM -0500, Stephen Frost wrote:
> > I'll have a go at adding another keyring and/or abstracting the key
> > wrap and see how well a true peer to the passphrase approach fits in.
> 
> Having patches to review and consider definitely helps, imv.

I disagree.  Our order is:

Desirability -> Design -> Implement -> Test -> Review -> Commit
https://wiki.postgresql.org/wiki/Todo#Development_Process

so, by posting a patch, you are decided to skip the first _two_
requirements.  I think it might be time for me to create a new thread
so it is not confused by whatever is posted here.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-21 Thread Bruce Momjian
On Sun, Dec 20, 2020 at 05:59:09PM -0500, Stephen Frost wrote:
> However, after chatting with Bruce about it for a bit this weekend, I'm
> not sure that we need to tackle the second case today.  I don't think
> there's any doubt that there will be users who will want PG to manage
> the keyring and to be able to work with just a passphrase, as Bruce has
> worked to make happen here and which we have a patch for.  I'm hoping
> Bruce will post a new patch soon based on the work that he and I
> discussed today (mostly just clarifications around keys vs. passphrases
> and having the PG side accept a key which the command that's run will
> produce).  With that, I'm feeling pretty comfortable that we can move
> forward and at least get this initial work committed.

OK, here it the updated patch.  The major change, as Stephen pointed
out, is that the cluser_key_command (was cluster_passphrase_command) now
returns a hex-string representing the 32-byte KEK, rather than having
the server hash whatever the command used to return.  This avoids
double-hashing in cases where you are _not_ using a passphrase, but are
computing a random 32-byte value in the script.  This does mean the
script has to run sha256 for passphrases, but it seems like a win. 
Updated script are attached too.

The URLs are still accurate:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip


key-alter.diff.gz
Description: application/gzip


ckey_aws.sh
Description: Bourne shell script


ckey_direct.sh
Description: Bourne shell script


ckey_passphrase.sh
Description: Bourne shell script


ckey_piv_nopin.sh
Description: Bourne shell script


ckey_piv_pin.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-20 Thread Stephen Frost
Greetings,

* Alastair Turner (min...@decodable.me) wrote:
> On Mon, 21 Dec 2020 at 00:33, Stephen Frost  wrote:
> > * Alastair Turner (min...@decodable.me) wrote:
> > > What I'd like specifically is to have the option of an external
> > > keyring as a first class key store, where the keys stored in the
> > > external keyring are used directly to encrypt the database contents.
> >
> > Right- understood, but that's not what the patch does today and there
> > isn't anyone who has proposed such a patch, nor explained why we
> > couldn't add that capability later.
> 
> I'm keen to propose a patch along those lines, even if just as a
> sample. Minimising the amount of code which would have to be unpicked
> in that effort would be great. Particularly avoiding any changes to
> data structures, since that may effectively block adding new
> capabilities.

Ok, sure, but are there such changes that would need to be made for this
case...?  Seems to only be speculation at this point.

> > > Regarding the on-disk format, separate storage of the key HMACs and
> > > the wrapped keys sounds like a requirement for implementing a fully
> > > external keyring or doing key wrapping externally via an OpenSSL or
> > > nss tls pluggable engine. I'm still looking through the code.
> >
> > This seems a bit confusing as the question at hand is the on-disk format
> > of the internal keyring, not anything to do with an external keyring
> > (which we wouldn't have any storage of and so I don't understand how it
> > makes sense to even discuss the idea of storage for the external
> > keyring..?).
> 
> A requirement for validating the keys, no matter which keyring they
> came from, was mentioned along the way. Since there would be no point
> in validating keys from the internal ring twice, storing the
> validation data (HMACs in the current design) separately from the
> internal keyring's keys seems like it would avoid changing the data
> structures for the internal keyring when adding an external option.

This doesn't strike me as very clear- specifically which HMACs are you
referring to which would need to be stored separately from the internal
keyring to make it possible for an external keyring to be used?

> > Again, we will need the ability to perform the encryption using a simple
> > passphrase provided by the user, while using multiple randomly generated
> > data encryption keys, and that's what the latest set of patches are
> > geared towards.  I'm generally in support of adding additional
> > capabilities in this area in the future, but I don't think we need to
> > bog down the current effort by demanding that be implemented today.
> 
> I'm really not looking to bog down the current effort.

Great, glad to hear that.

> I'll have a go at adding another keyring and/or abstracting the key
> wrap and see how well a true peer to the passphrase approach fits in.

Having patches to review and consider definitely helps, imv.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-20 Thread Alastair Turner
Thanks Stephen,

On Mon, 21 Dec 2020 at 00:33, Stephen Frost  wrote:
>
> Greetings,
>
> * Alastair Turner (min...@decodable.me) wrote:
...
> >
> > What I'd like specifically is to have the option of an external
> > keyring as a first class key store, where the keys stored in the
> > external keyring are used directly to encrypt the database contents.
>
> Right- understood, but that's not what the patch does today and there
> isn't anyone who has proposed such a patch, nor explained why we
> couldn't add that capability later.
>

I'm keen to propose a patch along those lines, even if just as a
sample. Minimising the amount of code which would have to be unpicked
in that effort would be great. Particularly avoiding any changes to
data structures, since that may effectively block adding new
capabilities.

>
> >...Your description above of changes to pass keys out
> > of the command sound like they may have addressed this.
>
> The updates are intended to make it so that the KEK which wraps the
> internal keyring will be obtained directly from the cluster key command,
> pushing the transformation of a passphrase (should one be provided) into
> a proper key to the script, but otherwise allowing the result of things
> like 'openssl rand -hex 32' to be used directly as the KEK.
>

Sounds good.

>
> > Regarding the on-disk format, separate storage of the key HMACs and
> > the wrapped keys sounds like a requirement for implementing a fully
> > external keyring or doing key wrapping externally via an OpenSSL or
> > nss tls pluggable engine. I'm still looking through the code.
>
> This seems a bit confusing as the question at hand is the on-disk format
> of the internal keyring, not anything to do with an external keyring
> (which we wouldn't have any storage of and so I don't understand how it
> makes sense to even discuss the idea of storage for the external
> keyring..?).
>

A requirement for validating the keys, no matter which keyring they
came from, was mentioned along the way. Since there would be no point
in validating keys from the internal ring twice, storing the
validation data (HMACs in the current design) separately from the
internal keyring's keys seems like it would avoid changing the data
structures for the internal keyring when adding an external option.

>
> Again, we will need the ability to perform the encryption using a simple
> passphrase provided by the user, while using multiple randomly generated
> data encryption keys, and that's what the latest set of patches are
> geared towards.  I'm generally in support of adding additional
> capabilities in this area in the future, but I don't think we need to
> bog down the current effort by demanding that be implemented today.
>

I'm really not looking to bog down the current effort.

I'll have a go at adding another keyring and/or abstracting the key
wrap and see how well a true peer to the passphrase approach fits in.

Regards
Alastair




Re: Proposed patch for key managment

2020-12-20 Thread Stephen Frost
Greetings,

* Alastair Turner (min...@decodable.me) wrote:
> On Sun, 20 Dec 2020 at 22:59, Stephen Frost  wrote:
> > Yes, it's true that after things are implemented it can be more
> > difficult to change them- but if you're concerned about the specific
> > on-disk format of the keyring then please help us understand what your
> > concern is there.  The concern you've raised so far is just that you'd
> > like an option to have an external keyring, and that's fine, but we are
> > also going to need to have an internal one and which comes first doesn't
> > seem particularly material to me.  I don't think having one or the other
> > in place first will really detract or make more difficult the adding of
> > the other later.
>
> What I'd like specifically is to have the option of an external
> keyring as a first class key store, where the keys stored in the
> external keyring are used directly to encrypt the database contents.

Right- understood, but that's not what the patch does today and there
isn't anyone who has proposed such a patch, nor explained why we
couldn't add that capability later.

> The examples in this discussion so far have all put the internal
> keyring between any other crypto infrastructure and the file
> encryption process. Your description above of changes to pass keys out
> of the command sound like they may have addressed this.

The updates are intended to make it so that the KEK which wraps the
internal keyring will be obtained directly from the cluster key command,
pushing the transformation of a passphrase (should one be provided) into
a proper key to the script, but otherwise allowing the result of things
like 'openssl rand -hex 32' to be used directly as the KEK.

> Regarding the on-disk format, separate storage of the key HMACs and
> the wrapped keys sounds like a requirement for implementing a fully
> external keyring or doing key wrapping externally via an OpenSSL or
> nss tls pluggable engine. I'm still looking through the code.

This seems a bit confusing as the question at hand is the on-disk format
of the internal keyring, not anything to do with an external keyring
(which we wouldn't have any storage of and so I don't understand how it
makes sense to even discuss the idea of storage for the external
keyring..?).

Again, we will need the ability to perform the encryption using a simple
passphrase provided by the user, while using multiple randomly generated
data encryption keys, and that's what the latest set of patches are
geared towards.  I'm generally in support of adding additional
capabilities in this area in the future, but I don't think we need to
bog down the current effort by demanding that be implemented today.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-20 Thread Stephen Frost
Greetings Alastair,

* Alastair Turner (min...@decodable.me) wrote:
> On Sat, 19 Dec 2020 at 16:58, Bruce Momjian  wrote:
> > To enable the direct injection of keys into the server, we would need a
> > new command for this, since trying to make the passphrase command do
> > this will lead to unnecessary complexity.  The passphrase command should
> > do one thing, and you can't have it changing its behavior based on
> > parameters, since the parameters are for the script to process, not to
> > change behavior of the server.
> >
> > If we wanted to do this, we would need a new command, and one of the
> > parameters would be %k or something that would identify the key number
> > we want to retrieve from the KMS.  Stephen pointed out that we could
> > still validate that key;  the key would not be stored wrapped in the
> > file system, but we could store an HMAC in the file system, and use that
> > for validation.
> 
> I think that this is where we have ended up talking at cross-purposes.
> My idea (after some refining based on Stephen's feedback) is that
> there should be only this new command (the cluster key command) and
> that the program for unwrapping stored keys should be called this way.
> As could a program to contact an HSM, etc. This moves the generating
> and wrapping functionality out of the core Postgres processes, making
> it far easier to add alternatives. I see this approach was discussed
> on the email thread you linked to, but I can't see where or how a
> conclusion was reached about it...

I do generally like the idea of having the single command able to be
used to either provide the KEK (where PG manages the keyring
internally), or called multiple times to retrieve the DEKs (in which
case PG wouldn't be managing the keyring).

However, after chatting with Bruce about it for a bit this weekend, I'm
not sure that we need to tackle the second case today.  I don't think
there's any doubt that there will be users who will want PG to manage
the keyring and to be able to work with just a passphrase, as Bruce has
worked to make happen here and which we have a patch for.  I'm hoping
Bruce will post a new patch soon based on the work that he and I
discussed today (mostly just clarifications around keys vs. passphrases
and having the PG side accept a key which the command that's run will
produce).  With that, I'm feeling pretty comfortable that we can move
forward and at least get this initial work committed.

> The pg_strong_random() function uses OpesnSSL's RAND_bytes() where
> available. With appropriate configuration of an OpenSSL engine
> supplying an alternative RAND, this could be handed off to a TRNG.

Sure.

> This is an example of the other option for providing flexibility to
> support specific key generation, wrapping, ... requirements - handing
> the operations off to a library like OpenSSL or nss tls which, in
> turn, can use pluggable providers for these functions. FWIW, the
> proprietary DBMSs use a pluggable engine approach to meet requirements
> in this category.

OpenSSL provides for this configuration and gives us a pluggable
architecture to make this happen, so I'm not sure what the concern here
is..?  I agree that eventually it'd be nice to allow the administrator
to, more directly, control the DEKs but we're still going to need to
have a solution for the user who wishes to just provide a passphrase or
a KEK and I don't see any particular reason to hold off on the
implementation of that.

> > My final point is that we can find ways to do what you are suggesting as
> > an addition to what we are adding now.  What we need is clear
> > justification of why these additional features are needed.  Requiring
> > the use of a true random number generator is a valid argument, but we
> > need to figure out, once this is implemented, who really wants that, and
> > how to implement it cleanly.
> 
> Clean interfaces would be either "above" the database calling out to
> commands in user-land or "below" the database in the abstractions
> which OpenSSL, nss tls and friends provide. Since the conclusion seems
> already to have been reached that the keyring should be inside the
> core process and only the passphrase should pop out above, I'll review
> the patch in this vein and see if I can suggest any routes to carving
> out more manageable subsets of the patch.

There's no doubt that there needs to be a solution where the keyring is
managed by PG.  Perhaps there are options to allow an external keyring
as well, but we can add that in the future.

> "...once this is implemented..." changes become a lot more difficult.
> Particularly when the changes would affect what are regarded as the
> database's on-disk files. Which I suspect is a contributing factor to
> the level of engagement surrounding this feature.

Yes, it's true that after things are implemented it can be more
difficult to change them- but if you're concerned about the specific
on-disk format of the keyring then please help us 

Re: Proposed patch for key managment

2020-12-20 Thread Alastair Turner
Thanks, Bruce

On Sat, 19 Dec 2020 at 16:58, Bruce Momjian  wrote:
>
...
>
> To enable the direct injection of keys into the server, we would need a
> new command for this, since trying to make the passphrase command do
> this will lead to unnecessary complexity.  The passphrase command should
> do one thing, and you can't have it changing its behavior based on
> parameters, since the parameters are for the script to process, not to
> change behavior of the server.
>
> If we wanted to do this, we would need a new command, and one of the
> parameters would be %k or something that would identify the key number
> we want to retrieve from the KMS.  Stephen pointed out that we could
> still validate that key;  the key would not be stored wrapped in the
> file system, but we could store an HMAC in the file system, and use that
> for validation.
>

I think that this is where we have ended up talking at cross-purposes.
My idea (after some refining based on Stephen's feedback) is that
there should be only this new command (the cluster key command) and
that the program for unwrapping stored keys should be called this way.
As could a program to contact an HSM, etc. This moves the generating
and wrapping functionality out of the core Postgres processes, making
it far easier to add alternatives. I see this approach was discussed
on the email thread you linked to, but I can't see where or how a
conclusion was reached about it...

>
> On other interesting approach would be to allow the server to call out
> for a KMS when it needs to generate the initial data keys that are
> wrapped by the passphrase;  this is in contrast to calling the KMS
> everytime it needs the data keys.
>
...
> ...The data keys are generated using this random value
> code:
>
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/port/pg_strong_random.c;hb=HEAD
>
> so someone would have to explain why generating this remotely in a KMS
> is superior, not just based on company policy.
>

The pg_strong_random() function uses OpesnSSL's RAND_bytes() where
available. With appropriate configuration of an OpenSSL engine
supplying an alternative RAND, this could be handed off to a TRNG.

This is an example of the other option for providing flexibility to
support specific key generation, wrapping, ... requirements - handing
the operations off to a library like OpenSSL or nss tls which, in
turn, can use pluggable providers for these functions. FWIW, the
proprietary DBMSs use a pluggable engine approach to meet requirements
in this category.

>
> We should create code for the general use-case, which is currently
> passphrase-wrapped system-generated data keys, and then get feedback on
> what else we need.  However, I should point out that the community is
> somewhat immune to the "my company needs this" kind of argument without
> logic to back it up, though sometimes I think this entire feature is in
> that category. ...

Yeah. Security in general, and crypto in particular, are often
"because someone told me to" requirements.

>
> My final point is that we can find ways to do what you are suggesting as
> an addition to what we are adding now.  What we need is clear
> justification of why these additional features are needed.  Requiring
> the use of a true random number generator is a valid argument, but we
> need to figure out, once this is implemented, who really wants that, and
> how to implement it cleanly.
>

Clean interfaces would be either "above" the database calling out to
commands in user-land or "below" the database in the abstractions
which OpenSSL, nss tls and friends provide. Since the conclusion seems
already to have been reached that the keyring should be inside the
core process and only the passphrase should pop out above, I'll review
the patch in this vein and see if I can suggest any routes to carving
out more manageable subsets of the patch.

"...once this is implemented..." changes become a lot more difficult.
Particularly when the changes would affect what are regarded as the
database's on-disk files. Which I suspect is a contributing factor to
the level of engagement surrounding this feature.

Regards
Alastair




Re: Proposed patch for key managment

2020-12-19 Thread Bruce Momjian
On Sat, Dec 19, 2020 at 11:58:37AM -0500, Bruce Momjian wrote:
> My final point is that we can find ways to do what you are suggesting as
> an addition to what we are adding now.  What we need is clear
> justification of why these additional features are needed.  Requiring
> the use of a true random number generator is a valid argument, but we
> need to figure out, once this is implemented, who really wants that, and
> how to implement it cleanly.

I added a comment to this script to explain how to generate a true
random passphrase, attached.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_fd.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-19 Thread Bruce Momjian
On Sat, Dec 19, 2020 at 11:45:15AM +, Alastair Turner wrote:
> Sorry, I have waved Desirability through under the headings of ease of
> adoption or not raising barriers to adoption, without detailing what
> barriers I see or how to avoid them. I also realise that "don't scare
> the users" is so open-ended so as to be actively unhelpful and very
> quickly starts to sound like "I want my secret decoder ring to glow
> pink, or blue, or green when anyone asks". Given the complexity of the
> feature and pixels spilled in discussing it, I understand that it gets
> frustrating. That said, I believe there is an important case to be
> made here.

I am pleased you understood my feelings on this.  Our last big
discussion on this topic is here:


https://www.postgresql.org/message-id/flat/CA%2Bfd4k7q5o6Nc_AaX6BcYM9yqTbC6_pnH-6nSD%3D54Zp6NBQTCQ%40mail.gmail.com

and it was so unproductive that we started having closed voice calls
every other Friday so we could discuss this without lots of "decoder
ring" ideas that had to be explained.  The result is our wiki page:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption

It has taken me a while to understand why this topic seems to almost
uniquely gravitate toward "decoder ring" discussion.

> In summary, I believe that forcing an approach for key generation and
> wrapping onto users of Cluster File Encryption limits the Desirability
> of the feature.
> 
> Cluster File Encryption for Postgres would be Desirable to many users
> I meet if, and only if, the generation and handling of keys fits with
> their corporate policies. Therefore, giving a user the option to pass
> an encryption key to Postgres for CFE is Desirable. I realise that the
> option for feeding the keys in directly is an additional feature, and
> that it has a user experience impact if a passphrase is used. It is,
> however, a feature which opens up near-infinite flexibility. To
> stretch the analogy, it is the clip for attaching coloured or opaque
> coverings to the glowy bit on the secret decoder ring.
> 
> The generation of keys and the wrapping of keys are contentious issues
> and are frequently addressed/specified in corporate security policies,
> standards and technical papers (NIST 800-38F is often mentioned in
> product docs). There are, for instance, policies in the wild which
> require that keys for long term use are generated from the output of a
> True Random Number Generator - the practical implication being that
> the key used to encrypt data at rest should be created by an HSM. When
> and why to use symmetric or asymmetric cyphers for key wrapping is
> another item which different organisations have different policies on
> - the hardware and cloud service vendors all offer both options, even
> if they recommend one and make it easier to use.

To enable the direct injection of keys into the server, we would need a
new command for this, since trying to make the passphrase command do
this will lead to unnecessary complexity.  The passphrase command should
do one thing, and you can't have it changing its behavior based on
parameters, since the parameters are for the script to process, not to
change behavior of the server.

If we wanted to do this, we would need a new command, and one of the
parameters would be %k or something that would identify the key number
we want to retrieve from the KMS.  Stephen pointed out that we could
still validate that key;  the key would not be stored wrapped in the
file system, but we could store an HMAC in the file system, and use that
for validation.

On other interesting approach would be to allow the server to call out
for a KMS when it needs to generate the initial data keys that are
wrapped by the passphrase;  this is in contrast to calling the KMS
everytime it needs the data keys.

We should create code for the general use-case, which is currently
passphrase-wrapped system-generated data keys, and then get feedback on
what else we need.  However, I should point out that the community is
somewhat immune to the "my company needs this" kind of argument without
logic to back it up, though sometimes I think this entire feature is in
that category.  The data keys are generated using this random value
code:


https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/port/pg_strong_random.c;hb=HEAD

so someone would have to explain why generating this remotely in a KMS
is superior, not just based on company policy.

My final point is that we can find ways to do what you are suggesting as
an addition to what we are adding now.  What we need is clear
justification of why these additional features are needed.  Requiring
the use of a true random number generator is a valid argument, but we
need to figure out, once this is implemented, who really wants that, and
how to implement it cleanly.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a 

Re: Proposed patch for key managment

2020-12-19 Thread Alastair Turner
Hi Bruce

On Sat, 19 Dec 2020 at 02:38, Bruce Momjian  wrote:
>
> I am not going be as kind.  Our workflow is:
>
> Desirability -> Design -> Implement -> Test -> Review -> Commit
> https://wiki.postgresql.org/wiki/Todo#Development_Process
>
> I have already asked about the first item, and received a reply talking
> about the design --- that is not helpful.  I only have so much patience
> for the "I want my secret decoder ring to glow in the dark" type of
> feature additions to this already complex feature.  Unless we stay on
> Desirability, I will not be replying.  If you can't answer the
> Desirability question, well, talking about items farther right on the
> list is not very helpful.
>
> Now, I will say that your question about how a KMS will use this got me
> thinking about how to test this, and that got me to implement the AWS
> Secret Manager script, so that we definitely helpful.  My point is that
> I don't think it is helpful to get into long discussions unless the
> Desirability is clearly answered.  This is not just related to this
> thread --- this kind of jump-over-Desirability has happened a lot in
> relation to this feature, so I thought it would be good to clearly state
> it now.
>

Sorry, I have waved Desirability through under the headings of ease of
adoption or not raising barriers to adoption, without detailing what
barriers I see or how to avoid them. I also realise that "don't scare
the users" is so open-ended so as to be actively unhelpful and very
quickly starts to sound like "I want my secret decoder ring to glow
pink, or blue, or green when anyone asks". Given the complexity of the
feature and pixels spilled in discussing it, I understand that it gets
frustrating. That said, I believe there is an important case to be
made here.

In summary, I believe that forcing an approach for key generation and
wrapping onto users of Cluster File Encryption limits the Desirability
of the feature.

Cluster File Encryption for Postgres would be Desirable to many users
I meet if, and only if, the generation and handling of keys fits with
their corporate policies. Therefore, giving a user the option to pass
an encryption key to Postgres for CFE is Desirable. I realise that the
option for feeding the keys in directly is an additional feature, and
that it has a user experience impact if a passphrase is used. It is,
however, a feature which opens up near-infinite flexibility. To
stretch the analogy, it is the clip for attaching coloured or opaque
coverings to the glowy bit on the secret decoder ring.

The generation of keys and the wrapping of keys are contentious issues
and are frequently addressed/specified in corporate security policies,
standards and technical papers (NIST 800-38F is often mentioned in
product docs). There are, for instance, policies in the wild which
require that keys for long term use are generated from the output of a
True Random Number Generator - the practical implication being that
the key used to encrypt data at rest should be created by an HSM. When
and why to use symmetric or asymmetric cyphers for key wrapping is
another item which different organisations have different policies on
- the hardware and cloud service vendors all offer both options, even
if they recommend one and make it easier to use.

Thanks

Alastair




Re: Proposed patch for key managment

2020-12-19 Thread Alastair Turner
Hi Stephen

On Fri, 18 Dec 2020 at 21:36, Stephen Frost  wrote:
>
> Greetings Alastair,
>
> * Alastair Turner (min...@decodable.me) wrote:
> > On Wed, 16 Dec 2020 at 22:43, Stephen Frost  wrote:
...
> > passphrase key wrapper, the secret store and the cloud/HW KMS.
> >
> > Since the examples expand the purpose of cluster_passphrase_command,
> > let's call it cluster_key_challenge_command in the examples.
>
> These ideas don't seem too bad but I'd think we'd pass which key we want
> on the command-line using a %i or something like that, rather than using
> stdin, unless there's some reason that would be an issue..?  Certainly
> the CLI utilities I've seen tend to expect the name of the secret that
> you're asking for to be passed on the command line.

Agreed. I was working on the assumption that the calling process
(initdb or pg_ctl) would have access to the challenge material and be
passing it to the utility, so putting it in a command line would allow
it to leak. If the utility is accessing the challenge material
directly, then just an indicator of which key to work with would be a
lot simpler, true.

>
> > Starting with the passphrase key wrapper, since it's what's in place now.
> >
> >  - cluster_key_challenge_command = 'password_key_wrapper %q'
>
> I do tend to like this idea of having what
> cluster_key_challenge_command, or whatever we call it, expects is an
> actual key and have the command that is run be a separate command that
> takes the passphrase and runs the KDF (key derivation function) on it,
> when a passphrase is what the user wishes to use.
>
> That generally makes more sense to me than having the key derivation
> effort built into the backend which I have a hard time seeing any
> particular reason for, as long we're already calling out to some
> external utility of some kind to get the key.
>
...
>
> With the thought of trying to keep it a reasonably simple interface, I
> had a thought along these lines:
>
> - Separate command that runs the KDF, this is simple enough as it's
>   really just a hash, and it returns the key on stdout.
> - initdb time option that says if we're going to have PG manage the
>   sub-keys, or not.
> - cluster_key_command defined as "a command that is passed the ID of
>   the key, or keys, required for the cluster to start"
>
> initdb --pg-subkeys
>   - Calls cluster_key_command once with "$PG_SYSTEM_ID-main" or similar
> and expects the main key to be provided on stdout.  Everything else
> is then more-or-less as is today: PG generates DEK sub-keys for data
> and WAL and then encrypts them with the main key and stores them.
>
> As long as the enveloped keys file exists on the filesystem, when PG
> starts, it'll call the cluster_key_command and will expect the 'main'
> key to be provided and it'll then decrypt and verify the DEK sub-keys,
> very similar to today.
>
> In this scenario, cluster_key_command might be set to call a command
> which accepts a passphrase and runs a KDF on it, or it might be set to
> call out to an external vaulting system or to a Yubikey, etc.
>
> initdb --no-pg-subkeys
>   - Calls cluster_key_command for each of the sub-keys that "pg-subkeys"
> would normally generate itself, passing "$PG_SYSTEM_ID-keyid" for
> each (eg: $PG_SYSTEM_ID-data, $PG_SYSTEM_ID-wal), and getting back
> the keys on stdout to use.
>
> When PG starts, it sees that the enveloped keys file doesn't exist and
> does the same as initdb did- calls cluster_key_command multiple times to
> get the keys which are needed.  We'd want to make sure to have a way
> early on that checks that the data DEK provided actually decrypts the
> cluster, and bail out otherwise, before actually writing any data with
> that key.  I'll note though that this approach would actually allow you
> to change the WAL key, if you wanted to, though, which could certainly
> be nice (naturally there would be various caveats about doing so and
> that replicas would have to also be switched to the new key, etc, but
> that all seems reasonably solvable). Having a stand-alone utility that
> could do that for the --pg-subkeys case would be useful too (and just
> generally decrypt it for viewing/backup/replacement/etc).
>
> Down the line this might even allow us to do an online re-keying, at
> least once the challenges around enabling online data checksums are
> sorted out, but I don't think that's something to worry about today.
> Still, it seems like this potentially provides a pretty clean interface
> for that to happen eventually.
>

Changing the WAL key independently of the local storage key is the big
operational advantage I see of managing the two keys separately. It
lays the basis for a full key rotation story.

Rotating WAL keys during switchover between a pair with the new key
applicable from a certain timeline number maye be close enough to
online to satisfy most requirements. Not something to worry about
today, as you say.

>
> Right, I think we can agree on this aspect 

Re: Proposed patch for key managment

2020-12-18 Thread Bruce Momjian
On Fri, Dec 18, 2020 at 04:36:24PM -0500, Stephen Frost wrote:
> > Starting with the passphrase key wrapper, since it's what's in place now.
> > 
> >  - cluster_key_challenge_command = 'password_key_wrapper %q'
> 
> I do tend to like this idea of having what
> cluster_key_challenge_command, or whatever we call it, expects is an
> actual key and have the command that is run be a separate command that
> takes the passphrase and runs the KDF (key derivation function) on it,
> when a passphrase is what the user wishes to use.
> 
> That generally makes more sense to me than having the key derivation
> effort built into the backend which I have a hard time seeing any
> particular reason for, as long we're already calling out to some
> external utility of some kind to get the key.

I have modified the patch to do as you suggested, and as you explained
to me on the phone.  :-)  Instead of having the server hash the pass
phrase provided by the cluster_passphrase_command, have the
cluster_passphrase_command generate a sha256 hash if the user-provided
input is not already 64 hex bytes.  If it is already 64 hex bytes, e.g,
via openssl rand -hex 32, no need to have the server hash that again. 
Double-hashing is less secure.

I have modified the attached patch to do that, and the scripts --- all
my tests pass.  FYI, I moved hex_decode to src/common so that front-end
could use it, and removed it from ecpg since ecpg can use the common one
now too.

> > Sorry, I wasn't trying to hand wave it away. For automated
> > interactions, like big iron HSMs or cloud KSMs, the difference between
> > 2 operations and 10 operations to start a DB server won't matter. For
> > an admin/operator having to type 10 passwords or get 10 clean
> > thumbprint scans, it would be horrible. My underlying question was, is
> > that toil the only problem to be solved, or is there another reason to
> > get into key combination, key splitting and the related issues which
> > are less documented and less well understood than key wrapping.
> 
> I appreciate that you're not trying to hand wave it away but this also
> didn't really answer the actual question- what's the advantage to having
> all of the keys provided by something external rather than having that
> external thing provide just one 'main' key, which we then use to decrypt
> our enveloped keys that we actually use?  I can think of some possible
> advantages but I'd really like to know what advantages you're seeing in
> doing that.

I am not going be as kind.  Our workflow is:

Desirability -> Design -> Implement -> Test -> Review -> Commit
https://wiki.postgresql.org/wiki/Todo#Development_Process

I have already asked about the first item, and received a reply talking
about the design --- that is not helpful.  I only have so much patience
for the "I want my secret decoder ring to glow in the dark" type of
feature additions to this already complex feature.  Unless we stay on
Desirability, I will not be replying.  If you can't answer the
Desirability question, well, talking about items farther right on the
list is not very helpful.

Now, I will say that your question about how a KMS will use this got me
thinking about how to test this, and that got me to implement the AWS
Secret Manager script, so that we definitely helpful.  My point is that
I don't think it is helpful to get into long discussions unless the
Desirability is clearly answered.  This is not just related to this
thread --- this kind of jump-over-Desirability has happened a lot in
relation to this feature, so I thought it would be good to clearly state
it now.

> > > This seems like a crux of at least one concern- that the current patch
> > > is deriving the actual KEK from the passphrase and not just taking the
> > > provided value (at least, that's what it looks like from a *very* quick
> > > look into it), and that's the part that I was suggesting that we might
> > > add an option for- to indicate if the cluster passphrase command is
> > > actually returning a passphrase which should be used to derive a key, or
> > > if it's returning a key directly to be used.  That does seem to be a
> > > material difference to me and one which we should care about.
> > 
> > Yes. Caring about that is the reason I've been making a nuisance of myself.
> 
> Right, I think we can agree on this aspect and I've chatted with Bruce
> about it some also.  When a direct key can be provided, we should use
> that, and not run it through a KDF.  This seems like a particularly
> important case that we should care about even at this early stage.

Agreed, and done in the attached patch.  :-)  I am thankful for all the
ideas that has helped move this feature forward.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip


key-alter.diff.gz
Description: application/gzip


pass_aws.sh

Re: Proposed patch for key managment

2020-12-18 Thread Stephen Frost
Greetings Alastair,

* Alastair Turner (min...@decodable.me) wrote:
> On Wed, 16 Dec 2020 at 22:43, Stephen Frost  wrote:
> > If I'm following, you're suggesting something like:
> >
> > cluster_passphrase_command = 'aws get %q'
> >
> > and then '%q' gets replaced with "Please provide the WAL DEK: ", or
> > something like that?  Prompting the user for each key?  Not sure how
> > well that's work if want to automate everything though.
> >
> > If you have specific ideas, it'd really be helpful to give examples of
> > what you're thinking.
> 
> I can think of three specific ideas off the top of my head: the
> passphrase key wrapper, the secret store and the cloud/HW KMS.
> 
> Since the examples expand the purpose of cluster_passphrase_command,
> let's call it cluster_key_challenge_command in the examples.

These ideas don't seem too bad but I'd think we'd pass which key we want
on the command-line using a %i or something like that, rather than using
stdin, unless there's some reason that would be an issue..?  Certainly
the CLI utilities I've seen tend to expect the name of the secret that
you're asking for to be passed on the command line.

> Starting with the passphrase key wrapper, since it's what's in place now.
> 
>  - cluster_key_challenge_command = 'password_key_wrapper %q'

I do tend to like this idea of having what
cluster_key_challenge_command, or whatever we call it, expects is an
actual key and have the command that is run be a separate command that
takes the passphrase and runs the KDF (key derivation function) on it,
when a passphrase is what the user wishes to use.

That generally makes more sense to me than having the key derivation
effort built into the backend which I have a hard time seeing any
particular reason for, as long we're already calling out to some
external utility of some kind to get the key.

>  - Supplied on stdin to the process is the wrapped DEK (either a
> combined key for db files and WAL or one for each, on separate calls)
>  - %q is "Please provide WAL key wrapper password" or just "...provide
> key wrapper password"
>  - Returned on stdout is the unwrapped DEK

> For a secret store
> 
>  - cluster_key_challenge_command = 'vault_key_fetch'
>  - Supplied on stdin to the process is the secret's identifier 
> (pg_dek_xxUUIDxx)
>  - Returned on stdout is the DEK, which may never have been wrapped,
> depending on implementation
>  - Access control to the secret store is managed through the challenge
> command's own config, certs, HBA, ...

> For an HSM or cloud KMS
> 
>  - cluster_key_challenge_command = 'gcp_kms_key_fetch'
>  - Supplied on stdin to the process is the the wrapped DEK (individual
> or combined)
>  - Returned on stdout is the DEK (individual or combined)
>  - Access control to the kms is managed through the challenge
> command's own config, certs, HBA, ...
> 
> The secret store and HSM/KMS options may be promptless, and therefore
> amenable to automation, depending on the setup of those clients.

We can't really have what is passed on stdin, or not, be different
without having some other option say which it is and that seems like
it'd be making it overly complicated, and I get why Bruce would rather
not make this too complicated.

With the thought of trying to keep it a reasonably simple interface, I
had a thought along these lines:

- Separate command that runs the KDF, this is simple enough as it's
  really just a hash, and it returns the key on stdout.
- initdb time option that says if we're going to have PG manage the
  sub-keys, or not.
- cluster_key_command defined as "a command that is passed the ID of
  the key, or keys, required for the cluster to start"

initdb --pg-subkeys
  - Calls cluster_key_command once with "$PG_SYSTEM_ID-main" or similar
and expects the main key to be provided on stdout.  Everything else
is then more-or-less as is today: PG generates DEK sub-keys for data
and WAL and then encrypts them with the main key and stores them.

As long as the enveloped keys file exists on the filesystem, when PG
starts, it'll call the cluster_key_command and will expect the 'main'
key to be provided and it'll then decrypt and verify the DEK sub-keys,
very similar to today.

In this scenario, cluster_key_command might be set to call a command
which accepts a passphrase and runs a KDF on it, or it might be set to
call out to an external vaulting system or to a Yubikey, etc.

initdb --no-pg-subkeys
  - Calls cluster_key_command for each of the sub-keys that "pg-subkeys"
would normally generate itself, passing "$PG_SYSTEM_ID-keyid" for
each (eg: $PG_SYSTEM_ID-data, $PG_SYSTEM_ID-wal), and getting back
the keys on stdout to use.

When PG starts, it sees that the enveloped keys file doesn't exist and
does the same as initdb did- calls cluster_key_command multiple times to
get the keys which are needed.  We'd want to make sure to have a way
early on that checks that the data DEK provided actually decrypts the
cluster, and bail out 

Re: Proposed patch for key managment

2020-12-18 Thread Bruce Momjian
On Fri, Dec 18, 2020 at 11:13:44AM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Fri, Dec 18, 2020 at 08:18:53AM -0500, Stephen Frost wrote:
> > > - C API in backend - we should try to at least set up the structure to
> > >   allow multiple encryption implementations, either via different
> > >   libraries or if someone feels it'd be useful to write a built-in
> > >   implementation, but as I mentioned just a moment ago, we aren't going
> > >   to get this perfect and we should accept that.
> > 
> > All my OpenSSL-specific stuff is isolated in src/common.
> 
> ... and in 'openssl' files, with 'generic' functions on top, right?
> That's what I recall from before and seems like the right approach to at
> least start with, and then we likely make adjustments as needed to the
> API when we add another encryption library.

Uh, I did it the way cryptohash_openssl.c does it. Sawada-san's patch did
it kind of like that, but had #ifdef calls to OpenSSL versions, while
the cryptohash_openssl.c method has two files with exactly the same
function names, and they are conditionally compiled in the makefile
based on makefile defines, and that is how I did it.  Michael Paquier
pointed this out, and the msvc changes needed to handle it.

> > > > I split apart my patch to create cipher{_openssl}.c and hmac{_openssl}.c
> > > > so the single HMAC function is not in the cipher file anymore, attached.
> > > 
> > > Will try to look at this soon, but in general the idea seems right.
> > 
> > Should I be using the init/update/final HMAC API that SCRAM uses so it
> > is easier to integrate into Michael's patch?  I can do that, but didn't
> > because that is going to require me to create a much larger footprint in
> > the main code, and that might be harder to integrate once Michael's API
> > is final.  It is easier for me to change one line to five lines than to
> > change five lines to five different lines.
> 
> For my 2c, ideally we'd get a review done of Michael's changes and just
> get that committed and have your work here rebased over that.  I don't

That would be nice.

> see any reason why we can't get that done in relatively short order.
> Just because it's registered in the January CF doesn't mean we actually
> have to wait for that to get done, it just needs a review from another
> committer (or self certification from Michael if he's happy with it).

Agreed.  I just don't want to wait until mid-January to get this into
the tree, and I am going to defer to Michael's timeline on this, which
is why I figured I might need to go first.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-18 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Dec 18, 2020 at 08:18:53AM -0500, Stephen Frost wrote:
> > - C API in backend - we should try to at least set up the structure to
> >   allow multiple encryption implementations, either via different
> >   libraries or if someone feels it'd be useful to write a built-in
> >   implementation, but as I mentioned just a moment ago, we aren't going
> >   to get this perfect and we should accept that.
> 
> All my OpenSSL-specific stuff is isolated in src/common.

... and in 'openssl' files, with 'generic' functions on top, right?
That's what I recall from before and seems like the right approach to at
least start with, and then we likely make adjustments as needed to the
API when we add another encryption library.

> > - User API - we should avoid having OpenSSL-specific bits exposed to
> >   users, and I don't think we do today, so I don't think this is an
> >   issue at the moment.
> 
> Right, there is nothing OpenSSL-specific on the user side, but some of
> the scripts assume you can pass an open file descriptor to a
> PKCS11-enabled tool, and I don't know if non-OpenSSL libraries support
> that.

That generally seems alright to me.

> Ideally, we would have scripts for each library to use their
> command-line API for this.  I am hestitant to rename the scripts to
> contain the name openssl because I am unclear if we will ever have
> non-OpenSSL implementations of these.  I updated the script comments at
> the top to indicate "It uses OpenSSL with PKCS11 enabled via OpenSC.".

I'm not too worried about the specific names of those scripts.
Definitely having comments that indicate what the dependencies are is
certainly good.

> One point is that the passphrase scripts are useful for cluster file
> encryption _and_ unlocking TLS certificates.

That's certainly good.

> > > I split apart my patch to create cipher{_openssl}.c and hmac{_openssl}.c
> > > so the single HMAC function is not in the cipher file anymore, attached.
> > 
> > Will try to look at this soon, but in general the idea seems right.
> 
> Should I be using the init/update/final HMAC API that SCRAM uses so it
> is easier to integrate into Michael's patch?  I can do that, but didn't
> because that is going to require me to create a much larger footprint in
> the main code, and that might be harder to integrate once Michael's API
> is final.  It is easier for me to change one line to five lines than to
> change five lines to five different lines.

For my 2c, ideally we'd get a review done of Michael's changes and just
get that committed and have your work here rebased over that.  I don't
see any reason why we can't get that done in relatively short order.
Just because it's registered in the January CF doesn't mean we actually
have to wait for that to get done, it just needs a review from another
committer (or self certification from Michael if he's happy with it).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-18 Thread Bruce Momjian
On Fri, Dec 18, 2020 at 08:18:53AM -0500, Stephen Frost wrote:
> What I would be thinking about with this are really three pieces-
> 
> - C API for libpq (not relevant for this, but we have had issues in the
>   past with exposing OpenSSL-specific things there)

Right.

> - C API in backend - we should try to at least set up the structure to
>   allow multiple encryption implementations, either via different
>   libraries or if someone feels it'd be useful to write a built-in
>   implementation, but as I mentioned just a moment ago, we aren't going
>   to get this perfect and we should accept that.

All my OpenSSL-specific stuff is isolated in src/common.

> - User API - we should avoid having OpenSSL-specific bits exposed to
>   users, and I don't think we do today, so I don't think this is an
>   issue at the moment.

Right, there is nothing OpenSSL-specific on the user side, but some of
the scripts assume you can pass an open file descriptor to a
PKCS11-enabled tool, and I don't know if non-OpenSSL libraries support
that.

Ideally, we would have scripts for each library to use their
command-line API for this.  I am hestitant to rename the scripts to
contain the name openssl because I am unclear if we will ever have
non-OpenSSL implementations of these.  I updated the script comments at
the top to indicate "It uses OpenSSL with PKCS11 enabled via OpenSC.".

One point is that the passphrase scripts are useful for cluster file
encryption _and_ unlocking TLS certificates.

> > I split apart my patch to create cipher{_openssl}.c and hmac{_openssl}.c
> > so the single HMAC function is not in the cipher file anymore, attached.
> 
> Will try to look at this soon, but in general the idea seems right.

Should I be using the init/update/final HMAC API that SCRAM uses so it
is easier to integrate into Michael's patch?  I can do that, but didn't
because that is going to require me to create a much larger footprint in
the main code, and that might be harder to integrate once Michael's API
is final.  It is easier for me to change one line to five lines than to
change five lines to five different lines.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-18 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Dec 18, 2020 at 10:01:22AM +0900, Michael Paquier wrote:
> > On Thu, Dec 17, 2020 at 12:10:22PM -0500, Bruce Momjian wrote:
> > > On Thu, Dec 17, 2020 at 11:39:55AM -0500, Stephen Frost wrote:
> > >> I don't think there's any need for us to implement a fallback
> > >> implementation of AES.  I'm not entirely sure we need it for hashes
> > >> but since we've already got it...
> > 
> > We need fallback implementations for cryptohashes (MD5, SHA1/2) and
> > HMAC because we have SCRAM authentication, pgcrypto and SQL functions
> > that should be able to work even when not building with any SSL
> > libraries.  So that's here to stay to ensure compatibility with what
> > we do.  All this stuff is already in the tree for ages, it was just
> > not shaped for a more centralized reuse.
> 
> One question is whether we want to support cluster file encryption
> without OpenSSL --- right now we can't.  I am wondering if we really
> need the hardware acceleration of OpenSSL for AES so doing our own AES
> implementation might not even make sense, performance-wise.

No, I don't think we need to support file encryption independently of
any library being available.  Maybe some day we will, but that should
really just be another option to build with.

Guess it wasn't clear, but I was being a bit tounge-in-cheek regarding
the idea of dropping SCRAM support if we aren't built with OpenSSL.

> > > Agreed.  I think there is serious risk we would do AES in a different
> > > way than OpenSSL, especially if I did it.  ;-)  We can add a native AES
> > > one day if we want, but as stated by Michael Paquier, it has to be
> > > tested so we are sure it returns exactly the same values as OpenSSL.
> > 
> > I think that it would be good to put some generalization here, and
> > look at options that are offered by other SSL libraries, like libnss
> > so as we don't finish with a design that restricts the use of a given
> > feature only to OpenSSL.
> 
> Uh, you mean the C API or the user API?  I don't think we have any
> OpenSSL requirement at the user level, except that my examples use
> command-line openssl to generate the passphrase.

What I would be thinking about with this are really three pieces-

- C API for libpq (not relevant for this, but we have had issues in the
  past with exposing OpenSSL-specific things there)

- C API in backend - we should try to at least set up the structure to
  allow multiple encryption implementations, either via different
  libraries or if someone feels it'd be useful to write a built-in
  implementation, but as I mentioned just a moment ago, we aren't going
  to get this perfect and we should accept that.

- User API - we should avoid having OpenSSL-specific bits exposed to
  users, and I don't think we do today, so I don't think this is an
  issue at the moment.

> I split apart my patch to create cipher{_openssl}.c and hmac{_openssl}.c
> so the single HMAC function is not in the cipher file anymore, attached.

Will try to look at this soon, but in general the idea seems right.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-18 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Thu, Dec 17, 2020 at 12:10:22PM -0500, Bruce Momjian wrote:
> > Agreed.  I think there is serious risk we would do AES in a different
> > way than OpenSSL, especially if I did it.  ;-)  We can add a native AES
> > one day if we want, but as stated by Michael Paquier, it has to be
> > tested so we are sure it returns exactly the same values as OpenSSL.
> 
> I think that it would be good to put some generalization here, and
> look at options that are offered by other SSL libraries, like libnss
> so as we don't finish with a design that restricts the use of a given
> feature only to OpenSSL.

While I agree with the general idea proposed here, I don't know that we
need to push super hard on it to be somehow perfect right now because it
simply won't be until we actually add support for another library, and I
don't think that's really this patch's responsibility.

So, yes, let's lay the groundwork and structure and perhaps spend a bit
of time looking at other libraries, but not demand this patch also add
support for a second library today, and accept that that means that the
structure we put in place may not end up being exactly perfect.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-17 Thread Bruce Momjian
On Fri, Dec 18, 2020 at 10:01:22AM +0900, Michael Paquier wrote:
> On Thu, Dec 17, 2020 at 12:10:22PM -0500, Bruce Momjian wrote:
> > On Thu, Dec 17, 2020 at 11:39:55AM -0500, Stephen Frost wrote:
> >> I don't think there's any need for us to implement a fallback
> >> implementation of AES.  I'm not entirely sure we need it for hashes
> >> but since we've already got it...
> 
> We need fallback implementations for cryptohashes (MD5, SHA1/2) and
> HMAC because we have SCRAM authentication, pgcrypto and SQL functions
> that should be able to work even when not building with any SSL
> libraries.  So that's here to stay to ensure compatibility with what
> we do.  All this stuff is already in the tree for ages, it was just
> not shaped for a more centralized reuse.

One question is whether we want to support cluster file encryption
without OpenSSL --- right now we can't.  I am wondering if we really
need the hardware acceleration of OpenSSL for AES so doing our own AES
implementation might not even make sense, performance-wise.

> > Agreed.  I think there is serious risk we would do AES in a different
> > way than OpenSSL, especially if I did it.  ;-)  We can add a native AES
> > one day if we want, but as stated by Michael Paquier, it has to be
> > tested so we are sure it returns exactly the same values as OpenSSL.
> 
> I think that it would be good to put some generalization here, and
> look at options that are offered by other SSL libraries, like libnss
> so as we don't finish with a design that restricts the use of a given
> feature only to OpenSSL.

Uh, you mean the C API or the user API?  I don't think we have any
OpenSSL requirement at the user level, except that my examples use
command-line openssl to generate the passphrase.

I split apart my patch to create cipher{_openssl}.c and hmac{_openssl}.c
so the single HMAC function is not in the cipher file anymore, attached.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip


key-alter.diff.gz
Description: application/gzip


Re: Proposed patch for key managment

2020-12-17 Thread Bruce Momjian
On Fri, Dec 18, 2020 at 11:19:02AM +0800, Neil Chen wrote:
> 
> 
> On Fri, Dec 18, 2020 at 3:02 AM Bruce Momjian  wrote:
> 
> 
> Here is a run of all four authentication methods, and updated scripts.
> I have renamed Yubiki to PIV since the script should work with anY
> PIV-enabled deviced, like a CAC.
> 
> 
>  
> Thanks for attaching these patches. 
> The unfortunate thing is that I am not very familiar with yubikey, so I will
> try to read it but may not be able to give useful advice. 
> Regarding the location of script storage, why don't we name them like
> "pass_fd.sh.sample" and store them in the $DATA/share/postgresql directory
> after installation, where other .sample files are also stored here. In the
> source code directory, just put them in a directory related to KMGR.

Yeah, that makes sense.  They are small.

> Through your suggestions, I am learning about Cybertec's TDE which is a
> relatively "complete" implementation. I will continue to rely on these TDE
> patches and the goals listed in the Wiki to verify whether the KMS system can
> support our future feature.

Great to hear, thanks.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-17 Thread Neil Chen
On Fri, Dec 18, 2020 at 3:02 AM Bruce Momjian  wrote:

>
> Here is a run of all four authentication methods, and updated scripts.
> I have renamed Yubiki to PIV since the script should work with anY
> PIV-enabled deviced, like a CAC.
>
>
Thanks for attaching these patches.
The unfortunate thing is that I am not very familiar with yubikey, so I
will try to read it but may not be able to give useful advice.
Regarding the location of script storage, why don't we name them like
"pass_fd.sh.sample" and store them in the $DATA/share/postgresql directory
after installation, where other .sample files are also stored here. In the
source code directory, just put them in a directory related to KMGR.

Through your suggestions, I am learning about Cybertec's TDE which is a
relatively "complete" implementation. I will continue to rely on these TDE
patches and the goals listed in the Wiki to verify whether the KMS system
can support our future feature.

Thanks.
-- 
There is no royal road to learning.
HighGo Software Co.


Re: Proposed patch for key managment

2020-12-17 Thread Michael Paquier
On Thu, Dec 17, 2020 at 12:10:22PM -0500, Bruce Momjian wrote:
> On Thu, Dec 17, 2020 at 11:39:55AM -0500, Stephen Frost wrote:
>> I don't think there's any need for us to implement a fallback
>> implementation of AES.  I'm not entirely sure we need it for hashes
>> but since we've already got it...

We need fallback implementations for cryptohashes (MD5, SHA1/2) and
HMAC because we have SCRAM authentication, pgcrypto and SQL functions
that should be able to work even when not building with any SSL
libraries.  So that's here to stay to ensure compatibility with what
we do.  All this stuff is already in the tree for ages, it was just
not shaped for a more centralized reuse.

> Agreed.  I think there is serious risk we would do AES in a different
> way than OpenSSL, especially if I did it.  ;-)  We can add a native AES
> one day if we want, but as stated by Michael Paquier, it has to be
> tested so we are sure it returns exactly the same values as OpenSSL.

I think that it would be good to put some generalization here, and
look at options that are offered by other SSL libraries, like libnss
so as we don't finish with a design that restricts the use of a given
feature only to OpenSSL.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-17 Thread Bruce Momjian
On Mon, Dec 14, 2020 at 11:16:18PM -0500, Bruce Momjian wrote:
> On Tue, Dec 15, 2020 at 10:36:56AM +0800, Neil Chen wrote:
> > Since our implementation is not in contrib, I don't think we should put the
> > script there. Maybe we can refer to postgresql.conf.sample?  
> 
> Uh, the script are 20-60 lines long --- I am attaching them to this
> email.  Plus, when we allow user prompting for the SSL passphrase, we
> will have another script, or maybe three mor if people want to use a
> Yubikey to unlock the SSL passphrase.

Here is a run of all four authentication methods, and updated scripts. 
I have renamed Yubiki to PIV since the script should work with anY
PIV-enabled deviced, like a CAC.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee


aws_test.sh
--

+ cp /pgdev/cfe/pass_aws.sh /u/postgres/tmp
+ pgstop -w
waiting for server to shut down done
server stopped
+ rm -rf /u/pg/data/PG_VERSION /u/pg/data/base /u/pg/data/global 
/u/pg/data/pg_commit_ts /u/pg/data/pg_cryptokeys /u/pg/data/pg_dynshmem 
/u/pg/data/pg_hba.conf /u/pg/data/pg_ident.conf /u/pg/data/pg_logical 
/u/pg/data/pg_multixact /u/pg/data/pg_notify /u/pg/data/pg_replslot 
/u/pg/data/pg_serial /u/pg/data/pg_snapshots /u/pg/data/pg_stat 
/u/pg/data/pg_stat_tmp /u/pg/data/pg_subtrans /u/pg/data/pg_tblspc 
/u/pg/data/pg_twophase /u/pg/data/pg_wal /u/pg/data/pg_xact 
/u/pg/data/postgresql.auto.conf /u/pg/data/postgresql.conf 
/u/pg/data/postmaster.opts
+ rm -rf /u/postgres/.aws
+ mkdir /u/postgres/.aws
+ cp /root/.aws/AWS-ssh.pem /root/.aws/README /root/.aws/config 
/root/.aws/credentials /u/postgres/.aws
+ chown postgres.postgres /u/postgres/.aws/AWS-ssh.pem /u/postgres/.aws/README 
/u/postgres/.aws/config /u/postgres/.aws/credentials
+ aspg initdb -K 256 -R -c /u/postgres/tmp/pass_aws.sh "%d"
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start

+ aspg pg_ctl -R -l /u/pg/server.log start
waiting for server to start... done
server started
+ aspg pg_altercpass -R /u/postgres/tmp/pass_aws.sh "%d" 
/u/postgres/tmp/pass_aws.sh "%d"

+ aspg pg_altercpass -r
repair unnecessary
+ rm -rf /u/postgres/.aws

key_test.sh
--

+ cp /pgdev/cfe/pass_fd.sh /u/postgres/tmp
+ pgstop -w
waiting for server to shut down done
server stopped
+ rm -rf /u/pg/data/PG_VERSION /u/pg/data/base /u/pg/data/global 
/u/pg/data/pg_commit_ts /u/pg/data/pg_cryptokeys /u/pg/data/pg_dynshmem 
/u/pg/data/pg_hba.conf /u/pg/data/pg_ident.conf /u/pg/data/pg_logical 
/u/pg/data/pg_multixact /u/pg/data/pg_notify /u/pg/data/pg_replslot 
/u/pg/data/pg_serial /u/pg/data/pg_snapshots /u/pg/data/pg_stat 
/u/pg/data/pg_stat_tmp /u/pg/data/pg_subtrans /u/pg/data/pg_tblspc 
/u/pg/data/pg_twophase /u/pg/data/pg_wal /u/pg/data/pg_xact 
/u/pg/data/postgresql.auto.conf /u/pg/data/postgresql.conf 
/u/pg/data/postmaster.opts
+ aspg initdb -K 256 -R -c /u/postgres/tmp/pass_fd.sh %R "%P"
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok


Re: Proposed patch for key managment

2020-12-17 Thread Bruce Momjian
On Thu, Dec 17, 2020 at 11:39:55AM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Michael Paquier (mich...@paquier.xyz) wrote:
> > On Wed, Dec 16, 2020 at 05:04:12PM -0500, Bruce Momjian wrote:
> > >> fallback implementation.  Finally, pgcrypto is not touched, but we
> > > 
> > > I have a fallback implemention --- it fails?  ;-)  Did you want me to
> > > include an AES implementation?
> > 
> > No idea about this one yet.  There are no direct users of AES except
> > pgcrypto in core.  One thing that would be good IMO is to properly
> > split the patch of this thread into individual parts that could be
> > reviewed separately using for example "git format-patch" to generate
> > patch series.  What's presented is a mixed bag, so that's harder to
> > look at it and consider how this stuff should work, and if there are
> > pieces that should be designed better or not.
> 
> I don't think there's any need for us to implement a fallback
> implementation of AES.  I'm not entirely sure we need it for hashes
> but since we've already got it...

Agreed.  I think there is serious risk we would do AES in a different
way than OpenSSL, especially if I did it.  ;-)  We can add a native AES
one day if we want, but as stated by Michael Paquier, it has to be
tested so we are sure it returns exactly the same values as OpenSSL.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-17 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Wed, Dec 16, 2020 at 05:04:12PM -0500, Bruce Momjian wrote:
> >> fallback implementation.  Finally, pgcrypto is not touched, but we
> > 
> > I have a fallback implemention --- it fails?  ;-)  Did you want me to
> > include an AES implementation?
> 
> No idea about this one yet.  There are no direct users of AES except
> pgcrypto in core.  One thing that would be good IMO is to properly
> split the patch of this thread into individual parts that could be
> reviewed separately using for example "git format-patch" to generate
> patch series.  What's presented is a mixed bag, so that's harder to
> look at it and consider how this stuff should work, and if there are
> pieces that should be designed better or not.

I don't think there's any need for us to implement a fallback
implementation of AES.  I'm not entirely sure we need it for hashes
but since we've already got it...

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-16 Thread Michael Paquier
On Thu, Dec 17, 2020 at 01:15:37AM +0100, Daniel Gustafsson wrote:
> In vtls library contexts are abstracted to the core code, with implementations
> supplying a struct with a set of function pointers implementing functionality
> (this difference is due to libcurl supporting multiple TLS libraries compiled
> at the same time, something postgres IMO shouldn't do).  We do give
> implementations a bit more leeway with how feature complete they must be,
> mainly due to the wide variety of libraries supported (from OpenSSL to IBM
> GSKit and most ones in between).  While basic it has served us quite well and
> we have had first time contributors successfully come with a new TLS library 
> as
> a patch.

This infrastructure has been chosen because curl requires to be able
to use multiple types of libraries at run-time, right?  I don't think
we need to get down to that for Postgres and keep things so as we are
only able to use one TLS library at the same time, the one compiled
with.  This makes the protocol simpler.  But perhaps I just lack
ambition and vision.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-16 Thread Daniel Gustafsson
> On 16 Dec 2020, at 17:26, Stephen Frost  wrote:
> 
> Greetings,
> 
> * Michael Paquier (mich...@paquier.xyz) wrote:
>> On Tue, Dec 15, 2020 at 02:09:09PM -0500, Stephen Frost wrote:
>>> Yeah, looking at what's been done there seems like the right approach to
>>> me as well, ideally we'd have one set of APIs that'll support all these
>>> use cases (not unlike what curl does..).
>> 
>> Ooh...  This is interesting.  What did curl do wrong here?  It is
>> always good to hear from such past experiences.
> 
> Not sure that came across very well- curl did something right in terms
> of their vTLS layer which allows for building curl with a number of
> different SSL/TLS libraries without the core code having to know about
> all the differences.

The vtls layer in libcurl is actually quite similar to what we have in libpq
with be-secure.c and be-secure-*.c for TLS and with what Michael has been doing
lately with cryptohash.

In vtls library contexts are abstracted to the core code, with implementations
supplying a struct with a set of function pointers implementing functionality
(this difference is due to libcurl supporting multiple TLS libraries compiled
at the same time, something postgres IMO shouldn't do).  We do give
implementations a bit more leeway with how feature complete they must be,
mainly due to the wide variety of libraries supported (from OpenSSL to IBM
GSKit and most ones in between).  While basic it has served us quite well and
we have had first time contributors successfully come with a new TLS library as
a patch.

What we have so far in the tree (an abstract *reasonably generic* API hiding
all library context interactions) makes implementing support for a new TLS
library less daunting as no core code needs to be touched really.  Having
kicked the tyres on this a fair bit I really think we should maintain that
separation, it's complicated enough as it is given just how much code needs to
be written.

cheers ./daniel



Re: Proposed patch for key managment

2020-12-16 Thread Alastair Turner
On Wed, 16 Dec 2020 at 22:43, Stephen Frost  wrote:
>
> Greetings,
...
>
> If I'm following, you're suggesting something like:
>
> cluster_passphrase_command = 'aws get %q'
>
> and then '%q' gets replaced with "Please provide the WAL DEK: ", or
> something like that?  Prompting the user for each key?  Not sure how
> well that's work if want to automate everything though.
>
> If you have specific ideas, it'd really be helpful to give examples of
> what you're thinking.

I can think of three specific ideas off the top of my head: the
passphrase key wrapper, the secret store and the cloud/HW KMS.

Since the examples expand the purpose of cluster_passphrase_command,
let's call it cluster_key_challenge_command in the examples.

Starting with the passphrase key wrapper, since it's what's in place now.

 - cluster_key_challenge_command = 'password_key_wrapper %q'
 - Supplied on stdin to the process is the wrapped DEK (either a
combined key for db files and WAL or one for each, on separate calls)
 - %q is "Please provide WAL key wrapper password" or just "...provide
key wrapper password"
 - Returned on stdout is the unwrapped DEK

For a secret store

 - cluster_key_challenge_command = 'vault_key_fetch'
 - Supplied on stdin to the process is the secret's identifier (pg_dek_xxUUIDxx)
 - Returned on stdout is the DEK, which may never have been wrapped,
depending on implementation
 - Access control to the secret store is managed through the challenge
command's own config, certs, HBA, ...

For an HSM or cloud KMS

 - cluster_key_challenge_command = 'gcp_kms_key_fetch'
 - Supplied on stdin to the process is the the wrapped DEK (individual
or combined)
 - Returned on stdout is the DEK (individual or combined)
 - Access control to the kms is managed through the challenge
command's own config, certs, HBA, ...

The secret store and HSM/KMS options may be promptless, and therefore
amenable to automation, depending on the setup of those clients.

>
...
>
> > > ...That
> > > avoids the complication of having to have an API that somehow provides
> > > more than one key, while also using the primary DEK key as-is from the
> > > key management service and the KEK never being seen on the system where
> > > PG is running.
> >
> > Other than calling out (and therefore potentially prompting) twice,
> > what do you see as the complications of having more than one key?
>
> Mainly just a concern about the API and about what happens if, say, we
> decide that we want to have another sub-key, for example.  If we're
> handling them then there's really no issue- we just add another key in,
> but if that's not happening then it's going to mean changes for
> administrators.  If there's a good justification for it, then perhaps
> that's alright, but hand waving at what the issue is doesn't really
> help.
>

Sorry, I wasn't trying to hand wave it away. For automated
interactions, like big iron HSMs or cloud KSMs, the difference between
2 operations and 10 operations to start a DB server won't matter. For
an admin/operator having to type 10 passwords or get 10 clean
thumbprint scans, it would be horrible. My underlying question was, is
that toil the only problem to be solved, or is there another reason to
get into key combination, key splitting and the related issues which
are less documented and less well understood than key wrapping.

>
...
> >
> > I'd describe what the current patch does as using YubiKey to encrypt
> > and then decrypt an intermediate secret, which is then used to
> > generate/derive a KEK, which is then used to unwrap the stored,
> > wrapped DEK.
>
> This seems like a crux of at least one concern- that the current patch
> is deriving the actual KEK from the passphrase and not just taking the
> provided value (at least, that's what it looks like from a *very* quick
> look into it), and that's the part that I was suggesting that we might
> add an option for- to indicate if the cluster passphrase command is
> actually returning a passphrase which should be used to derive a key, or
> if it's returning a key directly to be used.  That does seem to be a
> material difference to me and one which we should care about.
>

Yes. Caring about that is the reason I've been making a nuisance of myself.

> > > There's an entirely independent discussion to be had about figuring out
> > > a way to actually off-load *entirely* the encryption/decryption of data
> > > to the linux crypto system or hardware devices, but unless someone can
> > > step up and write code for those today, I'd suggest that we table that
> > > effort until after we get this initial capability of having TDE with PG
> > > doing all of the encryption/decryption.
> >
> > I'm hopeful that the work on abstracting OpenSSL, nsstls, etc is going
> > to help in this direct.
>
> Yes, I agree with that general idea but it's a 'next step' kind of
> thing, not something we need to try and bake in today.
>

Agreed.

Thanks
Alastair




Re: Proposed patch for key managment

2020-12-16 Thread Michael Paquier
On Wed, Dec 16, 2020 at 05:04:12PM -0500, Bruce Momjian wrote:
> On Wed, Dec 16, 2020 at 10:23:23AM +0900, Michael Paquier wrote:
>> Hmm.  I don't think that this is right.  First, this still mixes
>> ciphers and HMAC.
> 
> I looked at the code.  The HMAC function body is just one function call
> to OpenSSL.  Do you want it moved to cryptohash.c and
> cryptohash_openssl.c?  To a new C file?
> 
>> Second, it is only possible here to use HMAC with
>> SHA512 but we want to be able to use SHA256 as well with SCRAM (per se
>> the scram_HMAC_* business in scram-common.c).  Third, this lacks a
> 
> I looked at the SCRAM HMAC code.  It looks complex, but I assume ideally
> that would be moved into a separate C file and used by cluster file
> encryption and SCRAM, right?  I need help to do that because I am
> unclear how much of the SCRAM hash code is specific to SCRAM.

I have gone though the same exercise for HMAC, with more success it
seems:
https://www.postgresql.org/message-id/x9m0nkejezipx...@paquier.xyz

This includes a fallback implementation that returns the same results
as OpenSSL, and the same results as samples in wikipedia or such
sources.  The set APIs in this refactoring could be reused here and
plugged into any SSL libraries, and my patch has changed SCRAM to
reuse that.  With this, it is also possible to remove all the HMAC
code from pgcrypto but this cannot happen without SHA1 support in
cryptohash.c, another patch I have submitted -hackers recently.  So I
think that it is a win as a whole.

>> fallback implementation.  Finally, pgcrypto is not touched, but we
> 
> I have a fallback implemention --- it fails?  ;-)  Did you want me to
> include an AES implementation?

No idea about this one yet.  There are no direct users of AES except
pgcrypto in core.  One thing that would be good IMO is to properly
split the patch of this thread into individual parts that could be
reviewed separately using for example "git format-patch" to generate
patch series.  What's presented is a mixed bag, so that's harder to
look at it and consider how this stuff should work, and if there are
pieces that should be designed better or not.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-16 Thread Stephen Frost
Greetings,

* Alastair Turner (min...@decodable.me) wrote:
> On Wed, 16 Dec 2020 at 21:32, Stephen Frost  wrote:
> > * Alastair Turner (min...@decodable.me) wrote:
> > > On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
> > > > The second approach is to make a new API for what you want
> > >
> > > I am trying to motivate for an alternate API. Specifically, an API
> > > which allows any potential adopter of Postgres and Cluster File
> > > Encryption to adopt them without having to accept any particular
> > > approach to key management, key derivation, wrapping, validation, etc.
> > > A passphrase key-wrapper with validation will probably be very useful
> > > to a lot of people, but making it mandatory and requiring twists and
> > > turns to integrate with already-established security infrastructure
> > > sounds like a barrier to adoption.
> >
> > Yeah, I mentioned this in one of the other threads where we were
> > discussing KEKs and DEKs and such.  Forcing one solution for working
> > with the KEK/DEK isn't really ideal.
> >
> > That said, maybe there's an option to have the user indicate to PG if
> > they're passing in a passphrase or a DEK, ...
> 
> Some options very much like the current cluster_passphrase_command,
> but that takes an input sounds to me like it would meet that
> requirement. The challenge sent to the command could be just a text
> prompt, or it could be the wrapped DEK. The selection of the command
> would indicate what the user was expected to pass. The return would be
> a DEK.

If I'm following, you're suggesting something like:

cluster_passphrase_command = 'aws get %q'

and then '%q' gets replaced with "Please provide the WAL DEK: ", or
something like that?  Prompting the user for each key?  Not sure how
well that's work if want to automate everything though.

If you have specific ideas, it'd really be helpful to give examples of
what you're thinking.

> > ...but we otherwise more-or-less
> > keep things as they are where the DEK that the user provides actually
> > goes to encrypting the sub-keys that we use for tables vs. WAL..?  ...
> 
> The idea of subkeys sounds great, if it can merge the two potential
> interactions into one and not complicate rotating the two parts
> separately. But having Postgres encrypting them, leaves us open to
> many of the same issues. That still feels like managing the keys, so a
> corporate who have strong opinions on how that should be done will
> still ask all sorts of pointy questions, complicating adoption.

This really needs more detail- what exactly is the concern?  What are
the 'pointy questions'?  What's complicated about using sub-keys?  One
of the great advantages of using sub-keys is that you can rotate the KEK
(or whatever is passed to PG) without having to actually go through the
entire system and decyprt/re-encrypt everything.  There's, of course,
the downside that then you don't get the keys rotated as often as you
might like to, but I am, at least, hopeful that we might be able to find
a way to do that in the future.

> > ...That
> > avoids the complication of having to have an API that somehow provides
> > more than one key, while also using the primary DEK key as-is from the
> > key management service and the KEK never being seen on the system where
> > PG is running.
> 
> Other than calling out (and therefore potentially prompting) twice,
> what do you see as the complications of having more than one key?

Mainly just a concern about the API and about what happens if, say, we
decide that we want to have another sub-key, for example.  If we're
handling them then there's really no issue- we just add another key in,
but if that's not happening then it's going to mean changes for
administrators.  If there's a good justification for it, then perhaps
that's alright, but hand waving at what the issue is doesn't really
help.

> > > > ...It would be
> > > > similar to the cluster_passphrase_command, but it would be simple in
> > > > some ways, but complex in others since we need at least two DEKs, and
> > > > key rotation might be very risky since there isn't validation in the
> > > > server.
> > >
> ...
> >
> > I'm not quite following this- I agree that we don't want PG, or anything
> > really, to see the private key that's on the Yubikey, as that wouldn't
> > be good- instead, we should be able to ask for the Yubikey to decrypt
> > the DEK for us and then use that, which I had thought was happening but
> > it's not entirely clear from the above discussion (and, admittedly, I've
> > not tried using the patch with my yubikey yet, but I do plan to...).
> >
> > Are we sure we're talking about the same thing here..?  That's really
> > the question I'm asking myself.
> 
> I'd describe what the current patch does as using YubiKey to encrypt
> and then decrypt an intermediate secret, which is then used to
> generate/derive a KEK, which is then used to unwrap the stored,
> wrapped DEK.

This seems like a crux of at least one concern- that 

Re: Proposed patch for key managment

2020-12-16 Thread Alastair Turner
On Wed, 16 Dec 2020 at 21:32, Stephen Frost  wrote:
>
> Greetings,
>
> * Alastair Turner (min...@decodable.me) wrote:
> > On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
> > > The second approach is to make a new API for what you want
> >
> > I am trying to motivate for an alternate API. Specifically, an API
> > which allows any potential adopter of Postgres and Cluster File
> > Encryption to adopt them without having to accept any particular
> > approach to key management, key derivation, wrapping, validation, etc.
> > A passphrase key-wrapper with validation will probably be very useful
> > to a lot of people, but making it mandatory and requiring twists and
> > turns to integrate with already-established security infrastructure
> > sounds like a barrier to adoption.
>
> Yeah, I mentioned this in one of the other threads where we were
> discussing KEKs and DEKs and such.  Forcing one solution for working
> with the KEK/DEK isn't really ideal.
>
> That said, maybe there's an option to have the user indicate to PG if
> they're passing in a passphrase or a DEK, ...

Some options very much like the current cluster_passphrase_command,
but that takes an input sounds to me like it would meet that
requirement. The challenge sent to the command could be just a text
prompt, or it could be the wrapped DEK. The selection of the command
would indicate what the user was expected to pass. The return would be
a DEK.

> ...but we otherwise more-or-less
> keep things as they are where the DEK that the user provides actually
> goes to encrypting the sub-keys that we use for tables vs. WAL..?  ...

The idea of subkeys sounds great, if it can merge the two potential
interactions into one and not complicate rotating the two parts
separately. But having Postgres encrypting them, leaves us open to
many of the same issues. That still feels like managing the keys, so a
corporate who have strong opinions on how that should be done will
still ask all sorts of pointy questions, complicating adoption.

> ...That
> avoids the complication of having to have an API that somehow provides
> more than one key, while also using the primary DEK key as-is from the
> key management service and the KEK never being seen on the system where
> PG is running.
>

Other than calling out (and therefore potentially prompting) twice,
what do you see as the complications of having more than one key?

> > > ...It would be
> > > similar to the cluster_passphrase_command, but it would be simple in
> > > some ways, but complex in others since we need at least two DEKs, and
> > > key rotation might be very risky since there isn't validation in the
> > > server.
> >
...
>
> I'm not quite following this- I agree that we don't want PG, or anything
> really, to see the private key that's on the Yubikey, as that wouldn't
> be good- instead, we should be able to ask for the Yubikey to decrypt
> the DEK for us and then use that, which I had thought was happening but
> it's not entirely clear from the above discussion (and, admittedly, I've
> not tried using the patch with my yubikey yet, but I do plan to...).
>
> Are we sure we're talking about the same thing here..?  That's really
> the question I'm asking myself.
>

I'd describe what the current patch does as using YubiKey to encrypt
and then decrypt an intermediate secret, which is then used to
generate/derive a KEK, which is then used to unwrap the stored,
wrapped DEK.

> There's an entirely independent discussion to be had about figuring out
> a way to actually off-load *entirely* the encryption/decryption of data
> to the linux crypto system or hardware devices, but unless someone can
> step up and write code for those today, I'd suggest that we table that
> effort until after we get this initial capability of having TDE with PG
> doing all of the encryption/decryption.

I'm hopeful that the work on abstracting OpenSSL, nsstls, etc is going
to help in this direct.

Regards

Alastair




Re: Proposed patch for key managment

2020-12-16 Thread Bruce Momjian
On Wed, Dec 16, 2020 at 01:42:57PM -0500, Bruce Momjian wrote:
> On Wed, Dec 16, 2020 at 06:07:26PM +, Alastair Turner wrote:
> > Hi Bruce
> > 
> > On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
> > >
> > ...
> > >
> > > The second approach is to make a new API for what you want
> > 
> > I am trying to motivate for an alternate API. Specifically, an API
> > which allows any potential adopter of Postgres and Cluster File
> > Encryption to adopt them without having to accept any particular
> > approach to key management, key derivation, wrapping, validation, etc.
> > A passphrase key-wrapper with validation will probably be very useful
> > to a lot of people, but making it mandatory and requiring twists and
> > turns to integrate with already-established security infrastructure
> > sounds like a barrier to adoption.
> 
> Attached is a script that uses the AWS Secrets Manager, and it does key
> rotation with the new pg_altercpass tool too, just like all the other
> methods.

Attached is an improved script that does not pass the secret on the
command line.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_aws.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-16 Thread Bruce Momjian
On Wed, Dec 16, 2020 at 04:32:00PM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Alastair Turner (min...@decodable.me) wrote:
> > On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
> > > The second approach is to make a new API for what you want
> > 
> > I am trying to motivate for an alternate API. Specifically, an API
> > which allows any potential adopter of Postgres and Cluster File
> > Encryption to adopt them without having to accept any particular
> > approach to key management, key derivation, wrapping, validation, etc.
> > A passphrase key-wrapper with validation will probably be very useful
> > to a lot of people, but making it mandatory and requiring twists and
> > turns to integrate with already-established security infrastructure
> > sounds like a barrier to adoption.
> 
> Yeah, I mentioned this in one of the other threads where we were
> discussing KEKs and DEKs and such.  Forcing one solution for working
> with the KEK/DEK isn't really ideal.
> 
> That said, maybe there's an option to have the user indicate to PG if
> they're passing in a passphrase or a DEK, but we otherwise more-or-less
> keep things as they are where the DEK that the user provides actually
> goes to encrypting the sub-keys that we use for tables vs. WAL..?  That

Yes, that is what I suggested in an earlier email.

> avoids the complication of having to have an API that somehow provides
> more than one key, while also using the primary DEK key as-is from the
> key management service and the KEK never being seen on the system where
> PG is running.

How is that diffeent from what we have now?  Did you read my reply today
to Alastair with the AWS script?

> > > ...It would be
> > > similar to the cluster_passphrase_command, but it would be simple in
> > > some ways, but complex in others since we need at least two DEKs, and
> > > key rotation might be very risky since there isn't validation in the
> > > server.
> > 
> > I guess that the risk you're talking about here is encryption with a
> > bogus key and bricking the data? In a world where the wrapped keys are
> > opaque to the application, a key would be validated through a
> > roundtrip: request the unwrapping of the key, encrypt a known string,
> > request the unwrapping again, decrypt the known string, compare. If
> > any of those steps fail, don't use the wrapped key provided.
> > Validating that the stored keys have not been fiddled/damaged is the
> > KMS/HSM's responsibility.
> 
> I'm not really sure what the validation concern being raised here is,
> but I can understand Bruce's point about having to set up an API that
> allows us to ask for two distinct DEK's- but I'd think that keeping the
> two keys we have today as sub-keys addresses that as I outline above.

Right, how is a passphrase different than subkeys?

> > >... I don't think this can be accomplished by a contrib module, but
> > > would actually be easy to implement, but perhaps hard to document
> > > because the user API might be tricky.
> > 
> > If integration through a pipeline isn't good enough (it would be a
> > pain for the passphrase, with multiple prompts), what else do you see
> > an API having to provide?
> 
> I certainly agree that it'd be good to have a way to get an encrypted
> cluster set up and running which doesn't involve actual prompts.

Uh, two of my four scripts do not require prompts.

> > > I think my big question is about your sentence, "A key feature of these
> > > key management approaches is that the application handling the
> > > encrypted data doesn't get the KEK, the HSM/KSM passes the DEK back
> > > after unwrapping it."  It is less secure for the HSM to return a KEK
> > > rather than a DEK?  I can't see why it would be.  The application never
> > > sees the KEK used to wrap the DEK it has stored in the file system,
> > > though that DEK is actually used as a passphrase by Postgres.  This is
> > > the same with the Yubikey --- Postgres never sees the private key used
> > > to unlock what it locked with the Yubikey public key.
> > 
> > The protocols and practices are designed for a lot of DEKs and small
> > number of KEK's. So the compromise of a KEK would, potentially, lead
> > to compromise of thousands of DEKs. In this particular case with 2
> > DEKs wrapped by one KEK, it doesn't sound like much of a difference, I
> > agree. From an acceptance and adoption point of view, I'm just
> > inclined to use the security ecosystem in an established and
> > understood way.
> 
> I'm not quite following this- I agree that we don't want PG, or anything
> really, to see the private key that's on the Yubikey, as that wouldn't
> be good- instead, we should be able to ask for the Yubikey to decrypt
> the DEK for us and then use that, which I had thought was happening but
> it's not entirely clear from the above discussion (and, admittedly, I've
> not tried using the patch with my yubikey yet, but I do plan to...).

What it does is to encrypt the passphrase on boot, store it on 

Re: Proposed patch for key managment

2020-12-16 Thread Bruce Momjian
On Wed, Dec 16, 2020 at 10:23:23AM +0900, Michael Paquier wrote:
> On Tue, Dec 15, 2020 at 04:02:12PM -0500, Bruce Momjian wrote:
> > I thought this was going to be a huge job, but once I looked at it, it
> > was clear exactly what you were saying.  Comparing cryptohash.c and
> > cryptohash_openssl.c I saw exactly what you wanted, and I think I have
> > completed it in the attached patch.  cryptohash.c implemented the hash
> > in C code if OpenSSL is not present --- I assume you didn't want me to
> > do that, but rather to split the API so it was easy to add another
> > implementation.
> 
> Hmm.  I don't think that this is right.  First, this still mixes
> ciphers and HMAC.

I looked at the code.  The HMAC function body is just one function call
to OpenSSL.  Do you want it moved to cryptohash.c and
cryptohash_openssl.c?  To a new C file?

> Second, it is only possible here to use HMAC with
> SHA512 but we want to be able to use SHA256 as well with SCRAM (per se
> the scram_HMAC_* business in scram-common.c).  Third, this lacks a

I looked at the SCRAM HMAC code.  It looks complex, but I assume ideally
that would be moved into a separate C file and used by cluster file
encryption and SCRAM, right?  I need help to do that because I am
unclear how much of the SCRAM hash code is specific to SCRAM.

> fallback implementation.  Finally, pgcrypto is not touched, but we

I have a fallback implemention --- it fails?  ;-)  Did you want me to
include an AES implementation?

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-16 Thread Stephen Frost
Greetings,

* Alastair Turner (min...@decodable.me) wrote:
> On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
> > The second approach is to make a new API for what you want
> 
> I am trying to motivate for an alternate API. Specifically, an API
> which allows any potential adopter of Postgres and Cluster File
> Encryption to adopt them without having to accept any particular
> approach to key management, key derivation, wrapping, validation, etc.
> A passphrase key-wrapper with validation will probably be very useful
> to a lot of people, but making it mandatory and requiring twists and
> turns to integrate with already-established security infrastructure
> sounds like a barrier to adoption.

Yeah, I mentioned this in one of the other threads where we were
discussing KEKs and DEKs and such.  Forcing one solution for working
with the KEK/DEK isn't really ideal.

That said, maybe there's an option to have the user indicate to PG if
they're passing in a passphrase or a DEK, but we otherwise more-or-less
keep things as they are where the DEK that the user provides actually
goes to encrypting the sub-keys that we use for tables vs. WAL..?  That
avoids the complication of having to have an API that somehow provides
more than one key, while also using the primary DEK key as-is from the
key management service and the KEK never being seen on the system where
PG is running.

> > ...It would be
> > similar to the cluster_passphrase_command, but it would be simple in
> > some ways, but complex in others since we need at least two DEKs, and
> > key rotation might be very risky since there isn't validation in the
> > server.
> 
> I guess that the risk you're talking about here is encryption with a
> bogus key and bricking the data? In a world where the wrapped keys are
> opaque to the application, a key would be validated through a
> roundtrip: request the unwrapping of the key, encrypt a known string,
> request the unwrapping again, decrypt the known string, compare. If
> any of those steps fail, don't use the wrapped key provided.
> Validating that the stored keys have not been fiddled/damaged is the
> KMS/HSM's responsibility.

I'm not really sure what the validation concern being raised here is,
but I can understand Bruce's point about having to set up an API that
allows us to ask for two distinct DEK's- but I'd think that keeping the
two keys we have today as sub-keys addresses that as I outline above.

> >... I don't think this can be accomplished by a contrib module, but
> > would actually be easy to implement, but perhaps hard to document
> > because the user API might be tricky.
> 
> If integration through a pipeline isn't good enough (it would be a
> pain for the passphrase, with multiple prompts), what else do you see
> an API having to provide?

I certainly agree that it'd be good to have a way to get an encrypted
cluster set up and running which doesn't involve actual prompts.

> > I think my big question is about your sentence, "A key feature of these
> > key management approaches is that the application handling the
> > encrypted data doesn't get the KEK, the HSM/KSM passes the DEK back
> > after unwrapping it."  It is less secure for the HSM to return a KEK
> > rather than a DEK?  I can't see why it would be.  The application never
> > sees the KEK used to wrap the DEK it has stored in the file system,
> > though that DEK is actually used as a passphrase by Postgres.  This is
> > the same with the Yubikey --- Postgres never sees the private key used
> > to unlock what it locked with the Yubikey public key.
> 
> The protocols and practices are designed for a lot of DEKs and small
> number of KEK's. So the compromise of a KEK would, potentially, lead
> to compromise of thousands of DEKs. In this particular case with 2
> DEKs wrapped by one KEK, it doesn't sound like much of a difference, I
> agree. From an acceptance and adoption point of view, I'm just
> inclined to use the security ecosystem in an established and
> understood way.

I'm not quite following this- I agree that we don't want PG, or anything
really, to see the private key that's on the Yubikey, as that wouldn't
be good- instead, we should be able to ask for the Yubikey to decrypt
the DEK for us and then use that, which I had thought was happening but
it's not entirely clear from the above discussion (and, admittedly, I've
not tried using the patch with my yubikey yet, but I do plan to...).

Are we sure we're talking about the same thing here..?  That's really
the question I'm asking myself.

There's an entirely independent discussion to be had about figuring out
a way to actually off-load *entirely* the encryption/decryption of data
to the linux crypto system or hardware devices, but unless someone can
step up and write code for those today, I'd suggest that we table that
effort until after we get this initial capability of having TDE with PG
doing all of the encryption/decryption.

Thanks,

Stephen


signature.asc
Description: PGP 

Re: Proposed patch for key managment

2020-12-16 Thread Bruce Momjian
On Wed, Dec 16, 2020 at 06:07:26PM +, Alastair Turner wrote:
> Hi Bruce
> 
> On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
> >
> ...
> >
> > The second approach is to make a new API for what you want
> 
> I am trying to motivate for an alternate API. Specifically, an API
> which allows any potential adopter of Postgres and Cluster File
> Encryption to adopt them without having to accept any particular
> approach to key management, key derivation, wrapping, validation, etc.
> A passphrase key-wrapper with validation will probably be very useful
> to a lot of people, but making it mandatory and requiring twists and
> turns to integrate with already-established security infrastructure
> sounds like a barrier to adoption.

Attached is a script that uses the AWS Secrets Manager, and it does key
rotation with the new pg_altercpass tool too, just like all the other
methods.

I am not inclined to add any more APIs unless there is clear value for
it, and I am not seeing that yet.

> > ...It would be
> > similar to the cluster_passphrase_command, but it would be simple in
> > some ways, but complex in others since we need at least two DEKs, and
> > key rotation might be very risky since there isn't validation in the
> > server.
> >
> 
> I guess that the risk you're talking about here is encryption with a
> bogus key and bricking the data? In a world where the wrapped keys are
> opaque to the application, a key would be validated through a
> roundtrip: request the unwrapping of the key, encrypt a known string,
> request the unwrapping again, decrypt the known string, compare. If
> any of those steps fail, don't use the wrapped key provided.
> Validating that the stored keys have not been fiddled/damaged is the
> KMS/HSM's responsibility.

OK, but we already have validation.

> >... I don't think this can be accomplished by a contrib module, but
> > would actually be easy to implement, but perhaps hard to document
> > because the user API might be tricky.
> >
> 
> If integration through a pipeline isn't good enough (it would be a
> pain for the passphrase, with multiple prompts), what else do you see
> an API having to provide?

The big problem is that at bootstrap time you have to call the command
at a specific time, and I don't see how that could be done via contrib.
Also, I am trying to see value in offering another API.  We don't need
to serve every need.

> > I think my big question is about your sentence, "A key feature of these
> > key management approaches is that the application handling the
> > encrypted data doesn't get the KEK, the HSM/KSM passes the DEK back
> > after unwrapping it."  It is less secure for the HSM to return a KEK
> > rather than a DEK?  I can't see why it would be.  The application never
> > sees the KEK used to wrap the DEK it has stored in the file system,
> > though that DEK is actually used as a passphrase by Postgres.  This is
> > the same with the Yubikey --- Postgres never sees the private key used
> > to unlock what it locked with the Yubikey public key.
> 
> The protocols and practices are designed for a lot of DEKs and small
> number of KEK's. So the compromise of a KEK would, potentially, lead
> to compromise of thousands of DEKs. In this particular case with 2
> DEKs wrapped by one KEK, it doesn't sound like much of a difference, I
> agree. From an acceptance and adoption point of view, I'm just
> inclined to use the security ecosystem in an established and
> understood way.

Right, if there were many DEKs I can see your point.  Using many DEKs in
a database is a big problem since so many parts are interrelated.  We
looked at per-user or per-tablespace DEKs, but found it unworkable.  We
use two DEKs so we can failover to a standby for DEK rotation purposes.

I think for your purposes, your KMS DEK ends up being a KEK for
Postgres.  I am guessing most applications don't have the validation and
key rotation needs Postgres has, so a DEK is fine, but for Postgres,
because we are supporing already four different authentication methods
via a single command, we have those features already, so getting a DEK
from a KMS that we treat as a KEK seems natural to me.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_aws.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-16 Thread Alastair Turner
Hi Bruce

On Wed, 16 Dec 2020 at 00:12, Bruce Momjian  wrote:
>
...
>
> The second approach is to make a new API for what you want

I am trying to motivate for an alternate API. Specifically, an API
which allows any potential adopter of Postgres and Cluster File
Encryption to adopt them without having to accept any particular
approach to key management, key derivation, wrapping, validation, etc.
A passphrase key-wrapper with validation will probably be very useful
to a lot of people, but making it mandatory and requiring twists and
turns to integrate with already-established security infrastructure
sounds like a barrier to adoption.

> ...It would be
> similar to the cluster_passphrase_command, but it would be simple in
> some ways, but complex in others since we need at least two DEKs, and
> key rotation might be very risky since there isn't validation in the
> server.
>

I guess that the risk you're talking about here is encryption with a
bogus key and bricking the data? In a world where the wrapped keys are
opaque to the application, a key would be validated through a
roundtrip: request the unwrapping of the key, encrypt a known string,
request the unwrapping again, decrypt the known string, compare. If
any of those steps fail, don't use the wrapped key provided.
Validating that the stored keys have not been fiddled/damaged is the
KMS/HSM's responsibility.

>
>... I don't think this can be accomplished by a contrib module, but
> would actually be easy to implement, but perhaps hard to document
> because the user API might be tricky.
>

If integration through a pipeline isn't good enough (it would be a
pain for the passphrase, with multiple prompts), what else do you see
an API having to provide?

>
> I think my big question is about your sentence, "A key feature of these
> key management approaches is that the application handling the
> encrypted data doesn't get the KEK, the HSM/KSM passes the DEK back
> after unwrapping it."  It is less secure for the HSM to return a KEK
> rather than a DEK?  I can't see why it would be.  The application never
> sees the KEK used to wrap the DEK it has stored in the file system,
> though that DEK is actually used as a passphrase by Postgres.  This is
> the same with the Yubikey --- Postgres never sees the private key used
> to unlock what it locked with the Yubikey public key.
>

The protocols and practices are designed for a lot of DEKs and small
number of KEK's. So the compromise of a KEK would, potentially, lead
to compromise of thousands of DEKs. In this particular case with 2
DEKs wrapped by one KEK, it doesn't sound like much of a difference, I
agree. From an acceptance and adoption point of view, I'm just
inclined to use the security ecosystem in an established and
understood way.

Regards
Alastair




Re: Proposed patch for key managment

2020-12-16 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Tue, Dec 15, 2020 at 02:09:09PM -0500, Stephen Frost wrote:
> > Yeah, looking at what's been done there seems like the right approach to
> > me as well, ideally we'd have one set of APIs that'll support all these
> > use cases (not unlike what curl does..).
> 
> Ooh...  This is interesting.  What did curl do wrong here?  It is
> always good to hear from such past experiences.

Not sure that came across very well- curl did something right in terms
of their vTLS layer which allows for building curl with a number of
different SSL/TLS libraries without the core code having to know about
all the differences.  I was suggesting that we might want to look at how
they did that, and naturally discuss with Daniel and ask him what
thoughts he has from having worked with curl and the vTLS layer.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-15 Thread Michael Paquier
On Tue, Dec 15, 2020 at 04:02:12PM -0500, Bruce Momjian wrote:
> I thought this was going to be a huge job, but once I looked at it, it
> was clear exactly what you were saying.  Comparing cryptohash.c and
> cryptohash_openssl.c I saw exactly what you wanted, and I think I have
> completed it in the attached patch.  cryptohash.c implemented the hash
> in C code if OpenSSL is not present --- I assume you didn't want me to
> do that, but rather to split the API so it was easy to add another
> implementation.

Hmm.  I don't think that this is right.  First, this still mixes
ciphers and HMAC.  Second, it is only possible here to use HMAC with
SHA512 but we want to be able to use SHA256 as well with SCRAM (per se
the scram_HMAC_* business in scram-common.c).  Third, this lacks a
fallback implementation.  Finally, pgcrypto is not touched, but we
should be able to simplify the area around int_digest_list in
px-hmac.c which lists for HMAC as available options MD5, SHA1 and all
four SHA2.

Please note that we have MD5 and SHA2 as choices in cryptohash.h, and
I have already a patch to add SHA1 to the cryptohash set pending for
review: https://commitfest.postgresql.org/31/2868/.  If all that is
done correctly, a lot of code can be cleaned up while making things
still pluggable with various SSL libraries.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-15 Thread Michael Paquier
Hi Stephen,

On Tue, Dec 15, 2020 at 02:09:09PM -0500, Stephen Frost wrote:
> Yeah, looking at what's been done there seems like the right approach to
> me as well, ideally we'd have one set of APIs that'll support all these
> use cases (not unlike what curl does..).

Ooh...  This is interesting.  What did curl do wrong here?  It is
always good to hear from such past experiences.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-15 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 11:13:12PM +, Alastair Turner wrote:
> Since it's exciting stuff, I've been trying to lash together a PoC integration
> with the crypto infrastructure we see at these customers. Unfortunately, in
> short, the API doesn't seem to suit integration with HSM big iron, like 
> Thales,
> Utimaco, (other options are available), or their cloudy lookalikes.
> 
> I understand the model of wrapping the Data Encryption Key and storing the
> wrapped key with the encrypted data. The thing I can't find support for in 
> your
> patch is passing a wrapped DEK to an external system for decryption. A key
> feature of these key management approaches is that the application handling 
> the
> encrypted data doesn't get the KEK, the HSM/KSM passes the DEK back after
> unwrapping it. Google have a neat illustration of their approach, which is 
> very
> similar to others, at https://cloud.google.com/kms/docs/envelope-encryption#
> how_to_decrypt_data_using_envelope_encryption
> 
> So instead of calling out to a passphrase command which returns input for a
> PBKDF, Postgres (in the form of initdb or postmaster) should be passing the
> wrapped DEK and getting back the DEK in plain. The value passed from Postgres
> to the external process doesn't even have to be a wrapped DEK, it could be a
> key in the key->value sense, for use in a lookup against Vault, CredHub or the
> Kubernetes secret store. Making the stored keys opaque to the Postgres
> processes and pushing the task of turning them into cryptographic material to
> an external hepler process probably wouldn't shrink the patch overall, but it
> would move a lot of code from core processes into the helper. Maybe that 
> helper
> should live in contrib, as discussed below, where it would hopefully be joined
> by a bridge for KMIP and others.

Well, I think we can go two ways with this.  First, if you look at my
attached Yubikey script, you will see that, at boot time, since
$DIR/yubipass.key does not exit, the script generates a passphrase,
wraps it with the Yubikey's PIV #3 public key, and stores it in the live
key directory.  The next time it is called, it decrypts the wrapped
passphrase using the Yubikey, and uses that as the KEK to decrypt the
two DEKs.  Now, you could modify the script to call the HSM at boot time
to retrieve a DEK wrapped in a KEK, and store it in the key directory. 
On all future calls, you can pass the DEK wrapped by KEK to the HSM, and
use the returned DEK as the KEK/passphrase to decrypt the real DEK.  The
big value of this is that the keys are validated, and it uses the
existing API.  Ideally we write a sample script of how to this and
include it in /contrib.

The second approach is to make a new API for what you want.  It would be
similar to the cluster_passphrase_command, but it would be simple in
some ways, but complex in others since we need at least two DEKs, and
key rotation might be very risky since there isn't validation in the
server.  I don't think this can be accomplished by a contrib module, but
would actually be easy to implement, but perhaps hard to document
because the user API might be tricky.

I think my big question is about your sentence, "A key feature of these
key management approaches is that the application handling the
encrypted data doesn't get the KEK, the HSM/KSM passes the DEK back
after unwrapping it."  It is less secure for the HSM to return a KEK
rather than a DEK?  I can't see why it would be.  The application never
sees the KEK used to wrap the DEK it has stored in the file system,
though that DEK is actually used as a passphrase by Postgres.  This is
the same with the Yubikey --- Postgres never sees the private key used
to unlock what it locked with the Yubikey public key.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_yubi_nopin.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-15 Thread Alastair Turner
Hi Bruce et al

Firstly, thanks for shaping the patch, getting it down to a manageable
scope of cluster file encryption. I think this is a great feature and it
matters to a lot of the customers I talk to at VMware about
adopting Postgres.

Since it's exciting stuff, I've been trying to lash together a PoC
integration with the crypto infrastructure we see at these customers.
Unfortunately, in short, the API doesn't seem to suit integration with HSM
big iron, like Thales, Utimaco, (other options are available), or their
cloudy lookalikes.

I understand the model of wrapping the Data Encryption Key and storing the
wrapped key with the encrypted data. The thing I can't find support for in
your patch is passing a wrapped DEK to an external system for decryption. A
key feature of these key management approaches is that the
application handling the encrypted data doesn't get the KEK, the HSM/KSM
passes the DEK back after unwrapping it. Google have a neat illustration of
their approach, which is very similar to others, at
https://cloud.google.com/kms/docs/envelope-encryption#how_to_decrypt_data_using_envelope_encryption

So instead of calling out to a passphrase command which returns input for a
PBKDF, Postgres (in the form of initdb or postmaster) should be passing the
wrapped DEK and getting back the DEK in plain. The value passed from
Postgres to the external process doesn't even have to be a wrapped DEK, it
could be a key in the key->value sense, for use in a lookup against Vault,
CredHub or the Kubernetes secret store. Making the stored keys opaque to
the Postgres processes and pushing the task of turning them into
cryptographic material to an external hepler process probably wouldn't
shrink the patch overall, but it would move a lot of code from core
processes into the helper. Maybe that helper should live in contrib, as
discussed below, where it would hopefully be joined by a bridge for KMIP
and others.

On Tue, 15 Dec 2020 at 19:09, Stephen Frost  wrote:

> Greetings,
>
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Tue, Dec 15, 2020 at 10:05:49AM +0900, Michael Paquier wrote:
> > > On Mon, Dec 14, 2020 at 06:06:15PM -0500, Bruce Momjian wrote:

 

> > > >   There are a few shell script I should include to show how to create
> > > >   commands.  Where should they be stored?  /contrib module?
> > >
> > > Why are these needed.  Is it a matter of documentation?
> >
> > I have posted some of the scripts.  They involved complex shell
> > scripting that I doubt the average user can do.  The scripts are for
> > prompting for a passphrase from the user's terminal, or using the
> > Yubikey, with our without typing a pin.  I can put them in the docs and
> > then users can copy them into a file.  Is that the preferred method?
>
> If we are going to include these in core anywhere, I would think a new
> directory under contrib would make the most sense.  I'd hope that we
> could find a way to automate the testing of them though, so that we have
> some idea when they break because otherwise I'd be concerned that
> they'll break somewhere down the line and we won't notice for quite a
> while.
>
> Regards

Alastair


Re: Proposed patch for key managment

2020-12-15 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 02:09:09PM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Tue, Dec 15, 2020 at 10:05:49AM +0900, Michael Paquier wrote:
> > > On Mon, Dec 14, 2020 at 06:06:15PM -0500, Bruce Momjian wrote:
> > > > I am getting close to applying these patches, probably this week.  The
> > > > patches are cumulative:
> > > > 
> > > > 
> > > > https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> > > > 
> > > > https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff
> > > 
> > > I strongly object to a commit based on the current state of the patch.
> 
> Yeah, I agree that this could use some work though I don't think it's
> all that far away from being something we can get in, to allow us to
> move forward with the other work around supporting TDE.

Glad to hear that.  Michael Paquier was right that the common/cipher*.c
API was wrong and should not be committed until fixed, which I think it
has been.  If there are other messy parts of this patch, I would like to
fix those too.

> > I have posted some of the scripts.  They involved complex shell
> > scripting that I doubt the average user can do.  The scripts are for
> > prompting for a passphrase from the user's terminal, or using the
> > Yubikey, with our without typing a pin.  I can put them in the docs and
> > then users can copy them into a file.  Is that the preferred method?
> 
> If we are going to include these in core anywhere, I would think a new
> directory under contrib would make the most sense.  I'd hope that we
> could find a way to automate the testing of them though, so that we have
> some idea when they break because otherwise I'd be concerned that
> they'll break somewhere down the line and we won't notice for quite a
> while.

I am doing automated testing here, but I have a Yubikey.  Michael
Paquier recommended TAP tests, I guess like we do for pg_upgrade, and I
will look into that.

> This doesn't seem like something that would make sense to only have in
> the documentation, which isn't a terribly convenient way to make use of
> them.

Yeah, it is hard to cut-paste 60 lines.  The /contrib directory would be
for SSL and cluster file encryption passphrase control, so it should
have more general usefulness.  Would those file be copied to the install
directory somewhere if you install /contrib?  Do we have any cases of
this?

> > The point is that the command-line options will be active, but will not
> > be documented.  It will not do anything on its own unless you specify
> > that command-line option.  I can comment-out the command-line options
> > from being active but the code it going to look very messy.
> 
> I'm a bit concerned with what's being contemplated here..  Ideally,
> we'll actually get everything committed for v14 but if we don't and this
> doesn't serve any use-case then I'm not sure that it should be
> included in the release.  I also don't like committing and reverting
> things either, but having command line options that aren't documented
> but which exist in the code isn't great either, nor is having random
> code commented out..

Agreed.  Once we use this code for SSL passphrase prompting, many of the
options will actually have usefulness.  What we could do is to not hide
anything, including the docs, and then hide them once we are ready to
start beta.  At that point, maybe putting in the #ifdefs will be
acceptable, since we would not be working on the feature until we
branch, and then we can just remove them again.  We had similar issues
with the Win32 port, but that had fewer visible knobs.

> > > I think that you should attach such patches directly to the emails
> > > sent to pgsql-hackers, if those github links disappear for some
> > > reason, it would become impossible to refer to see what has been
> > > discussed here.
> > 
> > Well, the patches are changing frequently.  I am attaching a version to
> > this email.
> 
> I agree with having the patches posted to the list.  I don't really
> agree with the argument of "they change frequently".

Sure, they are only 35k compressed.  My point is that I am modifying my
git tree all day, and with github, I can easily push the changes and
when someone looks at the diff, they see the most recent version.

> > I think that designing a correct set of APIs that can be plugged with
> > any SSL library is the correct move in the long term.  I have on my
> > agenda to clean up HMAC as SCRAM uses that with SHA256 and you would
> > use that with SHA512.  Daniel has mentioned that he has been touching
> > this area, and I also got a patch halfly done though pgcrypto needs
> > some extra thoughts.  So this is still WIP but you could reuse that
> > here.
> 
> Yeah, looking at what's been done there seems like the right approach to
> me as well, ideally we'd have one set of APIs that'll support all these
> use cases (not unlike what curl does..).

I think I accomplished this in a 

Re: Proposed patch for key managment

2020-12-15 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 02:20:33PM +0900, Michael Paquier wrote:
> On Mon, Dec 14, 2020 at 10:19:02PM -0500, Bruce Momjian wrote:
> > I am going to need someone to help me make these changes.  I don't feel
> > I know enough about the crypto API to do it, and it will take me 1+ week
> > to learn it.
> 
> I think that designing a correct set of APIs that can be plugged with
> any SSL library is the correct move in the long term.  I have on my
> agenda to clean up HMAC as SCRAM uses that with SHA256 and you would
> use that with SHA512.  Daniel has mentioned that he has been touching
> this area, and I also got a patch halfly done though pgcrypto needs
> some extra thoughts.  So this is still WIP but you could reuse that
> here.

I thought this was going to be a huge job, but once I looked at it, it
was clear exactly what you were saying.  Comparing cryptohash.c and
cryptohash_openssl.c I saw exactly what you wanted, and I think I have
completed it in the attached patch.  cryptohash.c implemented the hash
in C code if OpenSSL is not present --- I assume you didn't want me to
do that, but rather to split the API so it was easy to add another
implementation.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key-alter.diff.gz
Description: application/gzip


Re: Proposed patch for key managment

2020-12-15 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Dec 15, 2020 at 10:05:49AM +0900, Michael Paquier wrote:
> > On Mon, Dec 14, 2020 at 06:06:15PM -0500, Bruce Momjian wrote:
> > > I am getting close to applying these patches, probably this week.  The
> > > patches are cumulative:
> > > 
> > >   https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> > >   
> > > https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff
> > 
> > I strongly object to a commit based on the current state of the patch.

Yeah, I agree that this could use some work though I don't think it's
all that far away from being something we can get in, to allow us to
move forward with the other work around supporting TDE.

> > >   There are a few shell script I should include to show how to create
> > >   commands.  Where should they be stored?  /contrib module?
> > 
> > Why are these needed.  Is it a matter of documentation?
> 
> I have posted some of the scripts.  They involved complex shell
> scripting that I doubt the average user can do.  The scripts are for
> prompting for a passphrase from the user's terminal, or using the
> Yubikey, with our without typing a pin.  I can put them in the docs and
> then users can copy them into a file.  Is that the preferred method?

If we are going to include these in core anywhere, I would think a new
directory under contrib would make the most sense.  I'd hope that we
could find a way to automate the testing of them though, so that we have
some idea when they break because otherwise I'd be concerned that
they'll break somewhere down the line and we won't notice for quite a
while.

This doesn't seem like something that would make sense to only have in
the documentation, which isn't a terribly convenient way to make use of
them.

> > >   Are people okay with having the feature enabled, but invisible
> > >   since the docs and --help output are missing?  When we enable
> > >   ssl_passphrase_command to prompt from the terminal, some of the
> > >   command-line options will be useful.
> > 
> > You are suggesting to enable the feature by default, make it invisible
> > to the users and leave it undocumented?  Is there something I missing
> > here?
> 
> The point is that the command-line options will be active, but will not
> be documented.  It will not do anything on its own unless you specify
> that command-line option.  I can comment-out the command-line options
> from being active but the code it going to look very messy.

I'm a bit concerned with what's being contemplated here..  Ideally,
we'll actually get everything committed for v14 but if we don't and this
doesn't serve any use-case then I'm not sure that it should be
included in the release.  I also don't like committing and reverting
things either, but having command line options that aren't documented
but which exist in the code isn't great either, nor is having random
code commented out..

> > >   Do people like the command-letter choices?
> > > 
> > >   I called the alter passphrase utility pg_altercpass.  I could
> > >   have called it pg_clusterpass, but I wanted to highlight it is
> > >   only for changing the passphrase, not for creating them.
> > 
> > I think that you should attach such patches directly to the emails
> > sent to pgsql-hackers, if those github links disappear for some
> > reason, it would become impossible to refer to see what has been
> > discussed here.
> 
> Well, the patches are changing frequently.  I am attaching a version to
> this email.

I agree with having the patches posted to the list.  I don't really
agree with the argument of "they change frequently".

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Mon, Dec 14, 2020 at 10:19:02PM -0500, Bruce Momjian wrote:
> > On Tue, Dec 15, 2020 at 10:05:49AM +0900, Michael Paquier wrote:
> >> - The cipher split also should be done in its own patch, and reviewed
> >> on its own.  There is a mix of dependencies between non-OpenSSL and
> >> OpenSSL code paths, making the pluggin of a new SSL library harder to
> >> do.  In short, this requires proper API design, which is not something
> >> we have here.  There are also no changes in pgcrypto for that stuff.
> > 
> > I am going to need someone to help me make these changes.  I don't feel
> > I know enough about the crypto API to do it, and it will take me 1+ week
> > to learn it.
> 
> I think that designing a correct set of APIs that can be plugged with
> any SSL library is the correct move in the long term.  I have on my
> agenda to clean up HMAC as SCRAM uses that with SHA256 and you would
> use that with SHA512.  Daniel has mentioned that he has been touching
> this area, and I also got a patch halfly done though pgcrypto needs
> some extra thoughts.  So this is still WIP but you could reuse that
> here.

Yeah, looking at what's been done there seems like the right approach to
me as well, ideally we'd have one set of APIs that'll support all these
use cases (not 

Re: Proposed patch for key managment

2020-12-15 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 10:36:56AM +0800, Neil Chen wrote:
> 2. I tried to add support for AES_CTR mode, and the code for encrypting buffer
> blocks. In the process I found that in pg_cipher_ctx_create, the key length is
> declared as "byte". However, in the CryptoKey structure, the length is stored
> as "bit", which leads me to use a form similar to Key->klen / 8 when I call
> this function. Maybe we should unify the two to avoid unnecessary confusion.

Yes, I would also like to get opinions on this.  We certainly have to
have the key length be in _bit_ units when visible by users, but I see a
lot of cases where we allocate arrays based on bytes.  I am unclear
where the proper units should be.  At a minimum, we should specify the
units in the function parameter names.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-15 Thread Bruce Momjian
On Mon, Dec 14, 2020 at 11:16:18PM -0500, Bruce Momjian wrote:
> > 1. Previously, we added a variable bootstrap_keys_wrap that is used for
> > encryption during initdb. However, since we save the "wrapped" key, we need 
> > to
> > use a global KEK that can be accessed in boot mode to unwrap it before 
> > use... I
> > don't know if that's good. To make it simple, I modified the
> > bootstrap_keys_wrap to store the "unwrapped" key so that the encryption
> > function can get it correctly. (The variable name should be changed
> > accordingly).
> 
> I see what you are saying.  We store the wrapped in bootstrap mode, but
> the unwrapped in normal mode.  There is also the case of when we copy
> the keys from an old cluster.  I will work on a patch tomorrow and
> report back here.

I had not considered that we need the date keys available in bootstrap
mode, even if we copied them from another cluster during pg_upgrade.  I
have updated the diff URLs and attaching a patch showing the changes I
made. Basically, I had to separate BootStrapKmgr() into sections:

1.  copy or create an empty live key directory
2.  get the pass phrase
3.  populate the live key directory if we didn't copy it
4.  decrypt they keys into a file-scoped variable

Thanks for showing me this missing feature.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/src/backend/crypto/kmgr.c b/src/backend/crypto/kmgr.c
new file mode 100644
index 9143e72..24c5d7f
*** a/src/backend/crypto/kmgr.c
--- b/src/backend/crypto/kmgr.c
*** static KmgrShmemData *KmgrShmem;
*** 50,56 
  char   *cluster_passphrase_command = NULL;
  int		file_encryption_keylen = 0;
  
! CryptoKey	bootstrap_keys_wrap[KMGR_MAX_INTERNAL_KEYS];
  
  extern char *bootstrap_old_key_datadir;
  extern int	bootstrap_file_encryption_keylen;
--- 50,56 
  char   *cluster_passphrase_command = NULL;
  int		file_encryption_keylen = 0;
  
! CryptoKey	bootstrap_keys[KMGR_MAX_INTERNAL_KEYS];
  
  extern char *bootstrap_old_key_datadir;
  extern int	bootstrap_file_encryption_keylen;
*** static CryptoKey *generate_crypto_key(in
*** 65,74 
  void
  BootStrapKmgr(void)
  {
! 	PgKeyWrapCtx	*ctx;
  	char		passphrase[KMGR_MAX_PASSPHRASE_LEN];
- 	uint8		KEK_enc[KMGR_ENC_KEY_LEN];
- 	uint8		KEK_hmac[KMGR_MAC_KEY_LEN];
  	int			passlen;
  
  #ifndef USE_OPENSSL
--- 65,74 
  void
  BootStrapKmgr(void)
  {
! 	char		live_path[MAXPGPATH];
! 	CryptoKey	*keys_wrap;
! 	int			nkeys;
  	char		passphrase[KMGR_MAX_PASSPHRASE_LEN];
  	int			passlen;
  
  #ifndef USE_OPENSSL
*** BootStrapKmgr(void)
*** 78,83 
--- 78,85 
  			  errhint("Compile with --with-openssl to use cluster encryption.";
  #endif
  
+ 	snprintf(live_path, sizeof(live_path), "%s/%s", DataDir, LIVE_KMGR_DIR);
+ 
  	/* copy cluster file encryption keys from an old cluster? */
  	if (bootstrap_old_key_datadir != NULL)
  	{
*** BootStrapKmgr(void)
*** 87,122 
   bootstrap_old_key_datadir, LIVE_KMGR_DIR);
  		copydir(old_key_dir, LIVE_KMGR_DIR, true);
  	}
! 	/* generate new cluster file encryption keys */
  	else
  	{
! 		char live_path[MAXPGPATH];
! 
! 		if (mkdir(LIVE_KMGR_DIR, pg_dir_create_mode) < 0)
  			ereport(ERROR,
  	(errcode_for_file_access(),
  	 errmsg("could not create cluster file encryption directory \"%s\": %m",
  			LIVE_KMGR_DIR)));
  
- 		memset(bootstrap_keys_wrap, 0, sizeof(bootstrap_keys_wrap));
- 		/* bzero keys on exit */
- 		on_proc_exit(bzeroKmgrKeys, 0);
- 	
- 		/* Get key encryption key from the passphrase command */
- 		snprintf(live_path, sizeof(live_path), "%s/%s", DataDir, LIVE_KMGR_DIR);
- 		passlen = kmgr_run_cluster_passphrase_command(cluster_passphrase_command,
- 	  passphrase, KMGR_MAX_PASSPHRASE_LEN,
- 	  live_path);
- 		if (passlen < KMGR_MIN_PASSPHRASE_LEN)
- 			ereport(ERROR,
- 	(errmsg("passphrase must be at least %d bytes",
- 			KMGR_MIN_PASSPHRASE_LEN)));
- 	
  		/* Get key encryption key and HMAC key from passphrase */
  		kmgr_derive_keys(passphrase, passlen, KEK_enc, KEK_hmac);
  
- 		explicit_bzero(passphrase, passlen);
- 	
  		/* Create temporary key wrap context */
  		ctx = pg_create_keywrap_ctx(KEK_enc, KEK_hmac);
  		if (!ctx)
--- 89,128 
   bootstrap_old_key_datadir, LIVE_KMGR_DIR);
  		copydir(old_key_dir, LIVE_KMGR_DIR, true);
  	}
! 	/* create empty directory */
  	else
  	{
! 		 if (mkdir(LIVE_KMGR_DIR, pg_dir_create_mode) < 0)
  			ereport(ERROR,
  	(errcode_for_file_access(),
  	 errmsg("could not create cluster file encryption directory \"%s\": %m",
  			LIVE_KMGR_DIR)));
+ 	}
+ 
+ 	/*
+ 	 * Get key encryption key from the passphrase command.  The passphrase
+ 	 * command might want to check for the existance of files in the
+ 	 * live directory, so run this _after_ copying the directory in place.
+ 	 */
+ 	passlen = 

Re: Proposed patch for key managment

2020-12-15 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 02:20:33PM +0900, Michael Paquier wrote:
> > Uh, I got this code from pg_resetwal.c, which does something similar to
> > pg_altercpass.
> 
> Yeah, that's actually the part where it is a bad idea to just copy
> this pattern.  pg_resetwal should not do that in the long term in my
> opinion, but nobody has come to clean up this stuff.  (- -;)

Glad you asked about this.  It turns out pg_altercpass works fine with
postgres_fe.h, so I will now use that.  pg_resetwal still can't use it
though.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-14 Thread Michael Paquier
On Mon, Dec 14, 2020 at 10:19:02PM -0500, Bruce Momjian wrote:
> On Tue, Dec 15, 2020 at 10:05:49AM +0900, Michael Paquier wrote:
>> - The HMAC split is terrible, as mentioned upthread.  I think that you
>> would need to extract this stuff as a separate patch, and not add more
>> mess to the existing mess (SCRAM uses its own HMAC with SHA256, which
>> is already wrong).  We can and should have a fallback implementation,
>> because that's easy to do.  And we need to have the fallback
>> implementation depend on the contents of cryptohash.c (in a
>> src/common/hmac.c), while the OpenSSL portion requires a
>> hmac_openssl.c where you can choose the hash type based on
>> pg_cryptohash_type.  So ossl_HMAC_SHA512() does not do the right
>> thing.  APIs flexible enough to allow a new SSL library to plug into
>> this stuff are required.
>> - Not much a fan of the changes done in cryptohash.c for the resource
>> owners as well.  It also feels like this could be thought as something
>> directly for resowner.c.
>> - The cipher split also should be done in its own patch, and reviewed
>> on its own.  There is a mix of dependencies between non-OpenSSL and
>> OpenSSL code paths, making the pluggin of a new SSL library harder to
>> do.  In short, this requires proper API design, which is not something
>> we have here.  There are also no changes in pgcrypto for that stuff.
> 
> I am going to need someone to help me make these changes.  I don't feel
> I know enough about the crypto API to do it, and it will take me 1+ week
> to learn it.

I think that designing a correct set of APIs that can be plugged with
any SSL library is the correct move in the long term.  I have on my
agenda to clean up HMAC as SCRAM uses that with SHA256 and you would
use that with SHA512.  Daniel has mentioned that he has been touching
this area, and I also got a patch halfly done though pgcrypto needs
some extra thoughts.  So this is still WIP but you could reuse that
here.

> Uh, I got this code from pg_resetwal.c, which does something similar to
> pg_altercpass.

Yeah, that's actually the part where it is a bad idea to just copy
this pattern.  pg_resetwal should not do that in the long term in my
opinion, but nobody has come to clean up this stuff.  (- -;)

>> +#ifdef USE_OPENSSL
>> +   ctx = (PgCipherCtx *) palloc0(sizeof(PgCipherCtx));
>> +
>> +   ctx->encctx = ossl_cipher_ctx_create(cipher, key, klen, true);
>> +   ctx->decctx = ossl_cipher_ctx_create(cipher, key, klen, false);
>> +#endif
>> There is a cipher_openssl.c and a cipher.c that includes USE_OPENSSL.
>> This requires a cleaner split IMO.  We should avoid as much as
>> possible OpenSSL-specific code paths in files part of src/common/ when
>> not building with OpenSSL.  So this is now a mixed bag of
>> dependencies.
> 
> Again, I need help here.

My take would be to try to sort out the HMAC part first, then look at
the ciphers.  One step at a time.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-14 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 10:36:56AM +0800, Neil Chen wrote:
> Hi, Bruce  
> 
> I read your question and here are some of my thoughts.
> 
> 
>         Why is KmgrShmemData a struct, when it only has a single member? 
> Are
>         all shared memory areas structs?
> 
>  
> Yes, all areas created by ShmemInitStruct() are structs. I think the
> significance of using struct is that it delimits an "area"  to store the KMS
> related things, which makes our memory management more clear and unified.

OK, thanks, that helps.

>         What testing infrastructure should this have?
> 
>         There are a few shell script I should include to show how to 
> create
>         commands.  Where should they be stored?  /contrib module?
> 
>  
> 
>         Are people okay with having the feature enabled, but invisible
>         since the docs and --help output are missing? When we enable
>         ssl_passphrase_command to prompt from the terminal, some of the
>         command-line options will be useful.
> 
>  
> Since our implementation is not in contrib, I don't think we should put the
> script there. Maybe we can refer to postgresql.conf.sample?  

Uh, the script are 20-60 lines long --- I am attaching them to this
email.  Plus, when we allow user prompting for the SSL passphrase, we
will have another script, or maybe three mor if people want to use a
Yubikey to unlock the SSL passphrase.

> Actually, I wonder whether we can add the UDK(user data encryption key) to the
> first version of KMS, which can be provided to plug-ins such as pgsodium. In
> this way, users can at least use it to a certain extent.

I don't thinK I want to get into that at this point.  It can be done
later.

> Also, I have tested some KMS functionalities by adding very simple TDE logic.
> In the meantime, I found some confusion in the following places:

OK, I know Cybertec has a TDE patch too.

> 1. Previously, we added a variable bootstrap_keys_wrap that is used for
> encryption during initdb. However, since we save the "wrapped" key, we need to
> use a global KEK that can be accessed in boot mode to unwrap it before use... 
> I
> don't know if that's good. To make it simple, I modified the
> bootstrap_keys_wrap to store the "unwrapped" key so that the encryption
> function can get it correctly. (The variable name should be changed
> accordingly).

I see what you are saying.  We store the wrapped in bootstrap mode, but
the unwrapped in normal mode.  There is also the case of when we copy
the keys from an old cluster.  I will work on a patch tomorrow and
report back here.

> 2. I tried to add support for AES_CTR mode, and the code for encrypting buffer
> blocks. In the process I found that in pg_cipher_ctx_create, the key length is
> declared as "byte". However, in the CryptoKey structure, the length is stored
> as "bit", which leads me to use a form similar to Key->klen / 8 when I call
> this function. Maybe we should unify the two to avoid unnecessary confusion.

Agreed.  I will look at that too tomorrow.

> 3. This one is not a problem/bug. I have some doubts about the length of data
> encryption blocks. For the page, we do not encrypt the PageHeaderData, which 
> is
> 192 bit long. By default, the page size is 8K(65536 bits), so the length of 
> the
> data we want to encrypt is 65344 bits. This number can't be divisible by 128
> and 192 keys and 256 bits long keys. Will this cause a problem?

Since we are using CTR mode for everything, it should be fine.  We
encrypt WAL as it is written.

> Here is a simple patch that I wrote with references to Sawada's previous TDE
> works, hope it can help you.

Thanks.  I will work on the items you mentioned.  Can you look at the
Cybertec patch and then our wiki to see what needs to be done next?

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption

Thanks for getting a proof-of-concept patch out there.  :-)

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_fd.sh
Description: Bourne shell script


pass_yubi_nopin.sh
Description: Bourne shell script


pass_yubi_pin.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-14 Thread Bruce Momjian
On Tue, Dec 15, 2020 at 10:05:49AM +0900, Michael Paquier wrote:
> On Mon, Dec 14, 2020 at 06:06:15PM -0500, Bruce Momjian wrote:
> > I am getting close to applying these patches, probably this week.  The
> > patches are cumulative:
> > 
> > https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> > 
> > https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff
> 
> I strongly object to a commit based on the current state of the patch.
> Based on my lookup of the patches you are referring to, I see a couple
> of things that should be splitted up and refactored properly before
> thinking about the core part of the patch (FWIW, I don't have much
> thoughts to offer about the core part because I haven't really thought
> about it, but it does not prevent to do a correct refactoring).  Here
> are some notes:
> - The code lacks a lot of comments IMO.  Why is retrieve_passphrase()
> doing what it does?  It seems to me that pg_altercpass needs a large
> brushup.

Uh, pg_altercpass is a new file I wrote and almost every block has a
comment.  I added a few more, but can you be more specific?

> - There are no changes to src/tools/msvc/.  Seeing the diffs in
> src/common/, this stuff breaks Windows builds.

OK, done in patch at URL.

> - The HMAC split is terrible, as mentioned upthread.  I think that you
> would need to extract this stuff as a separate patch, and not add more
> mess to the existing mess (SCRAM uses its own HMAC with SHA256, which
> is already wrong).  We can and should have a fallback implementation,
> because that's easy to do.  And we need to have the fallback
> implementation depend on the contents of cryptohash.c (in a
> src/common/hmac.c), while the OpenSSL portion requires a
> hmac_openssl.c where you can choose the hash type based on
> pg_cryptohash_type.  So ossl_HMAC_SHA512() does not do the right
> thing.  APIs flexible enough to allow a new SSL library to plug into
> this stuff are required.
> - Not much a fan of the changes done in cryptohash.c for the resource
> owners as well.  It also feels like this could be thought as something
> directly for resowner.c.
> - The cipher split also should be done in its own patch, and reviewed
> on its own.  There is a mix of dependencies between non-OpenSSL and
> OpenSSL code paths, making the pluggin of a new SSL library harder to
> do.  In short, this requires proper API design, which is not something
> we have here.  There are also no changes in pgcrypto for that stuff.

I am going to need someone to help me make these changes.  I don't feel
I know enough about the crypto API to do it, and it will take me 1+ week
to learn it.

> > I do have a few questions:
> 
> That looks like a lot of things to sort out as well.
> 
> > Can anyone test this on Windows, particularly -R handling?
> > 
> > What testing infrastructure should this have?
> 
> Using TAP tests would be adapted for those two points.

OK, I will try that.

> > There are a few shell script I should include to show how to create
> > commands.  Where should they be stored?  /contrib module?
> 
> Why are these needed.  Is it a matter of documentation?

I have posted some of the scripts.  They involved complex shell
scripting that I doubt the average user can do.  The scripts are for
prompting for a passphrase from the user's terminal, or using the
Yubikey, with our without typing a pin.  I can put them in the docs and
then users can copy them into a file.  Is that the preferred method?

> > Are people okay with having the feature enabled, but invisible
> > since the docs and --help output are missing?  When we enable
> > ssl_passphrase_command to prompt from the terminal, some of the
> > command-line options will be useful.
> 
> You are suggesting to enable the feature by default, make it invisible
> to the users and leave it undocumented?  Is there something I missing
> here?

The point is that the command-line options will be active, but will not
be documented.  It will not do anything on its own unless you specify
that command-line option.  I can comment-out the command-line options
from being active but the code it going to look very messy.

> > Do people like the command-letter choices?
> > 
> > I called the alter passphrase utility pg_altercpass.  I could
> > have called it pg_clusterpass, but I wanted to highlight it is
> > only for changing the passphrase, not for creating them.
> 
> I think that you should attach such patches directly to the emails
> sent to pgsql-hackers, if those github links disappear for some
> reason, it would become impossible to refer to see what has been
> discussed here.

Well, the patches are changing frequently.  I am attaching a version to
this email.

> +/*
> + * We have to use postgres.h not postgres_fe.h here, because there's so much
> + * backend-only stuff in the kmgr include files we need.  But we need a
> + * frontend-ish environment otherwise.  Hence this 

Re: Proposed patch for key managment

2020-12-14 Thread Neil Chen
Hi, Bruce

I read your question and here are some of my thoughts.

Why is KmgrShmemData a struct, when it only has a single member?
> Are
> all shared memory areas structs?
>

Yes, all areas created by ShmemInitStruct() are structs. I think the
significance of using struct is that it delimits an "area"  to store the
KMS related things, which makes our memory management more clear and
unified.

What testing infrastructure should this have?
>
> There are a few shell script I should include to show how to create
> commands.  Where should they be stored?  /contrib module?



Are people okay with having the feature enabled, but invisible
> since the docs and --help output are missing? When we enable
> ssl_passphrase_command to prompt from the terminal, some of the
> command-line options will be useful.


Since our implementation is not in contrib, I don't think we should put the
script there. Maybe we can refer to postgresql.conf.sample?
Actually, I wonder whether we can add the UDK(user data encryption key) to
the first version of KMS, which can be provided to plug-ins such as
pgsodium. In this way, users can at least use it to a certain extent.

Also, I have tested some KMS functionalities by adding very simple TDE
logic. In the meantime, I found some confusion in the following places:

1. Previously, we added a variable bootstrap_keys_wrap that is used for
encryption during initdb. However, since we save the "wrapped" key, we need
to use a global KEK that can be accessed in boot mode to unwrap it before
use... I don't know if that's good. To make it simple, I modified the
bootstrap_keys_wrap to store the "unwrapped" key so that the encryption
function can get it correctly. (The variable name should be changed
accordingly).

2. I tried to add support for AES_CTR mode, and the code for encrypting
buffer blocks. In the process I found that in pg_cipher_ctx_create, the key
length is declared as "byte". However, in the CryptoKey structure, the
length is stored as "bit", which leads me to use a form similar to
Key->klen / 8 when I call this function. Maybe we should unify the two to
avoid unnecessary confusion.

3. This one is not a problem/bug. I have some doubts about the length of
data encryption blocks. For the page, we do not encrypt the PageHeaderData,
which is 192 bit long. By default, the page size is 8K(65536 bits), so the
length of the data we want to encrypt is 65344 bits. This number can't be
divisible by 128 and 192 keys and 256 bits long keys. Will this cause a
problem?

Here is a simple patch that I wrote with references to Sawada's previous
TDE works, hope it can help you.

Thanks.
-- 
There is no royal road to learning.
HighGo Software Co.


encrypt_buffer.diff
Description: Binary data


Re: Proposed patch for key managment

2020-12-14 Thread Michael Paquier
On Mon, Dec 14, 2020 at 06:06:15PM -0500, Bruce Momjian wrote:
> I am getting close to applying these patches, probably this week.  The
> patches are cumulative:
> 
>   https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
>   
> https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff

I strongly object to a commit based on the current state of the patch.
Based on my lookup of the patches you are referring to, I see a couple
of things that should be splitted up and refactored properly before
thinking about the core part of the patch (FWIW, I don't have much
thoughts to offer about the core part because I haven't really thought
about it, but it does not prevent to do a correct refactoring).  Here
are some notes:
- The code lacks a lot of comments IMO.  Why is retrieve_passphrase()
doing what it does?  It seems to me that pg_altercpass needs a large
brushup.
- There are no changes to src/tools/msvc/.  Seeing the diffs in
src/common/, this stuff breaks Windows builds.
- The HMAC split is terrible, as mentioned upthread.  I think that you
would need to extract this stuff as a separate patch, and not add more
mess to the existing mess (SCRAM uses its own HMAC with SHA256, which
is already wrong).  We can and should have a fallback implementation,
because that's easy to do.  And we need to have the fallback
implementation depend on the contents of cryptohash.c (in a
src/common/hmac.c), while the OpenSSL portion requires a
hmac_openssl.c where you can choose the hash type based on
pg_cryptohash_type.  So ossl_HMAC_SHA512() does not do the right
thing.  APIs flexible enough to allow a new SSL library to plug into
this stuff are required.
- Not much a fan of the changes done in cryptohash.c for the resource
owners as well.  It also feels like this could be thought as something
directly for resowner.c.
- The cipher split also should be done in its own patch, and reviewed
on its own.  There is a mix of dependencies between non-OpenSSL and
OpenSSL code paths, making the pluggin of a new SSL library harder to
do.  In short, this requires proper API design, which is not something
we have here.  There are also no changes in pgcrypto for that stuff.

> I do have a few questions:

That looks like a lot of things to sort out as well.

>   Can anyone test this on Windows, particularly -R handling?
>   
>   What testing infrastructure should this have?

Using TAP tests would be adapted for those two points.

>   There are a few shell script I should include to show how to create
>   commands.  Where should they be stored?  /contrib module?

Why are these needed.  Is it a matter of documentation?

>   Are people okay with having the feature enabled, but invisible
>   since the docs and --help output are missing?  When we enable
>   ssl_passphrase_command to prompt from the terminal, some of the
>   command-line options will be useful.

You are suggesting to enable the feature by default, make it invisible
to the users and leave it undocumented?  Is there something I missing
here?

>   Do people like the command-letter choices?
> 
>   I called the alter passphrase utility pg_altercpass.  I could
>   have called it pg_clusterpass, but I wanted to highlight it is
>   only for changing the passphrase, not for creating them.

I think that you should attach such patches directly to the emails
sent to pgsql-hackers, if those github links disappear for some
reason, it would become impossible to refer to see what has been
discussed here.

+/*
+ * We have to use postgres.h not postgres_fe.h here, because there's so much
+ * backend-only stuff in the kmgr include files we need.  But we need a
+ * frontend-ish environment otherwise.  Hence this ugly hack.
+ */
+#define FRONTEND 1
+
+#include "postgres.h"
I would advise to really understand why this happens and split up the
backend and frontend parts cleanly.  This style ought to be avoided as
much as possible.

@@ -95,9 +101,9 @@ pg_cryptohash_create(pg_cryptohash_type type)

if (state->evpctx == NULL)
{
+#ifndef FRONTEND
explicit_bzero(state, sizeof(pg_cryptohash_state));
explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
-#ifndef FRONTEND
I think that this change is incorrect.  Any clean up of memory should
be done for the frontend *and* the backend.

+#ifdef USE_OPENSSL
+   ctx = (PgCipherCtx *) palloc0(sizeof(PgCipherCtx));
+
+   ctx->encctx = ossl_cipher_ctx_create(cipher, key, klen, true);
+   ctx->decctx = ossl_cipher_ctx_create(cipher, key, klen, false);
+#endif
There is a cipher_openssl.c and a cipher.c that includes USE_OPENSSL.
This requires a cleaner split IMO.  We should avoid as much as
possible OpenSSL-specific code paths in files part of src/common/ when
not building with OpenSSL.  So this is now a mixed bag of
dependencies.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-14 Thread Bruce Momjian
On Wed, Dec  2, 2020 at 04:38:14PM -0500, Bruce Momjian wrote:
> Attached is a patch for key management, which will eventually be part of
> cluster file encryption (CFE), called TDE (Transparent Data Encryption)
> by Oracle.  It is an update of Masahiko Sawada's patch from July 31:
> 
>   
> https://www.postgresql.org/message-id/ca+fd4k6rjwnvztro3q2f5hsdd8hgyuc4cuy9u3e6ran4c6t...@mail.gmail.com
> 
> Sawada-san did all the hard work, and I just redirected the patch.  The
> general outline of this CFE feature can be seen here:
> 
>   https://wiki.postgresql.org/wiki/Transparent_Data_Encryption
> 
> The currently planned progression for this feature is to allow secure
> retrieval of key encryption keys (KEK) outside of the database, then use
> those to encrypt data keys that encrypt heap/index/tmpfile files.
...
> If most people approve of this general approach, and the design
> decisions made, I would like to apply this in the next few weeks, but
> this brings complications.  The syntax added by this commit might not
> provide a useful feature until PG 15, so how do we hide it from users. 
> I was thinking of not applying the doc changes (or commenting them out)
> and commenting out the --help output.

I am getting close to applying these patches, probably this week.  The
patches are cumulative:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff

I do have a few questions:

Why is KmgrShmemData a struct, when it only has a single member?  Are
all shared memory areas structs?

Should pg_altercpass be using fsync's for directory renames?

Can anyone test this on Windows, particularly -R handling?

What testing infrastructure should this have?

There are a few shell script I should include to show how to create
commands.  Where should they be stored?  /contrib module?

Are people okay with having the feature enabled, but invisible
since the docs and --help output are missing?  When we enable
ssl_passphrase_command to prompt from the terminal, some of the
command-line options will be useful.

Do people like the command-letter choices?

I called the alter passphrase utility pg_altercpass.  I could
have called it pg_clusterpass, but I wanted to highlight it is
only for changing the passphrase, not for creating them.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

 




Re: Proposed patch for key managment

2020-12-11 Thread Bruce Momjian
On Fri, Dec 11, 2020 at 01:01:14PM -0500, Bruce Momjian wrote:
> On Wed, Dec  9, 2020 at 08:40:50PM -0500, Bruce Momjian wrote:
> > My next task is to write a script for Yubikey authentication.
> 
> I know Craig Ringer wanted Yubikey support, which allows two-factor
> authentication, so I have added it to the most recent patch by adding a
> cluster_passphrase_command %d/directory parameter:
> 
>   https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> 
> You can also store the PIN in a file, so you don't need a PIN to be
> entered by the user for each server start.

Here is the output without requiring a PIN;  attached is the script used:

++ initdb -K 256 -R -c '/u/postgres/tmp/pass_yubi_nopin.sh "%d"'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... engine "pkcs11" set.

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.
engine "pkcs11" set.
ok
performing post-bootstrap initialization ... engine "pkcs11" set.
ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start

++ pg_ctl -R -l /u/pg/server.log start
waiting for server to start... done
server started
++ pg_altercpass -R '/u/postgres/tmp/pass_yubi_nopin.sh "%d"' 
'/u/postgres/tmp/pass_yubi_nopin.sh "%d"'
engine "pkcs11" set.
engine "pkcs11" set.

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.
engine "pkcs11" set.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_yubi_nopin.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-11 Thread Bruce Momjian
On Wed, Dec  9, 2020 at 08:40:50PM -0500, Bruce Momjian wrote:
> My next task is to write a script for Yubikey authentication.

I know Craig Ringer wanted Yubikey support, which allows two-factor
authentication, so I have added it to the most recent patch by adding a
cluster_passphrase_command %d/directory parameter:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

You can also store the PIN in a file, so you don't need a PIN to be
entered by the user for each server start.  Attached is the script I
with a PIN required.  Here is a session:

$ initdb -K 256 -R -c '/u/postgres/tmp/pass_yubi.sh %R "%d"'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ...
Enter Yubikey PIN:

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.

--> Enter Yubikey PIN:
ok
performing post-bootstrap initialization ...
--> Enter Yubikey PIN:
ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start


$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter Yubikey PIN:
 done
server started

It even allows for passphrase rotation using my pg_altercpass tool with
this patch:


https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff

The Yubikey-encrypted passphrase is stored in the key directory, so the
encrypted passphrase stays with the data/WAL keys during passphrase
rotation:

$ pg_altercpass -R '/u/postgres/tmp/pass_yubi.sh %R "%d"' 
'/u/postgres/tmp/pass_yubi.sh %R "%d"'

--> Enter Yubikey PIN:

--> Enter Yubikey PIN:

WARNING:  The Yubikey can be locked and require a reset if too many pin
attempts fail.  It is recommended to run this command manually and save
the passphrase in a secure location for possible recovery.

--> Enter Yubikey PIN:

Yubikey PIN rotation has to be done using the Yubikey tool, and data/WAL
key rotation has to be done via switching to a standby, which hasn't
been implemented yet.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



pass_yubi.sh
Description: Bourne shell script


Re: Proposed patch for key managment

2020-12-10 Thread Bruce Momjian
On Thu, Dec 10, 2020 at 07:26:48PM +0800, Neil Chen wrote:
> 
> 
> Hi, everyone
> 
> I have read the patch and did some simple tests. I'm not entirely sure
> about some code segments; e.g.:
> 
> In the BootStrapKmgr() we generate a data encryption key by:
> key = generate_crypto_key(file_encryption_keylen);
> 
> However, I found that the file_encryption_keylen is always 0 in bootstrap
> mode because there exitst another variable 
> bootstrap_file_encryption_keylen
> in xlog.c and bootstrap.c.

Oh, good point;  that is very helpful.  I was relying on SetConfigOption
to set file_encryption_keylen, but that happens _after_ we create the
keys, so they were zero length.  I have fixed this by passing
bootstrap_file_encryption_keylen to the boot routines.  The diff URL has
the fix:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

> We get the REL/WAL key by KmgrGetKey() call and it works like:
> return (const CryptoKey *) &(KmgrShmem->intlKeys[id]);
> 
> But in bootstrap mode, the KmgrShmem are not assigned. So, if we want to
> use it to encrypt something in bootstrap mode, I suggest we make the
> following changes:
> if ( in bootstrap mode)
> return intlKeys[id]; // a static variable which contains key
> else
> reutrn (const CryptoKey *) &(KmgrShmem->intlKeys[id]);

Yes, you are also correct here.  I had not gotten to using KmgrGetKey
yet, but it clearly needs your suggestion, so have done that.

Thanks for your help.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-10 Thread Bruce Momjian
On Wed, Dec  9, 2020 at 05:18:37PM -0500, Stephen Frost wrote:
> Greetings,
> 
> * Daniel Gustafsson (dan...@yesql.se) wrote:
> > > On 2 Dec 2020, at 22:38, Bruce Momjian  wrote:
> > > Attached is a patch for key management, which will eventually be part of
> > > cluster file encryption (CFE), called TDE (Transparent Data Encryption)
> > > by Oracle.
> > 
> > The attackvector protected against seems to be operating systems users
> > (*without* any legitimate database access) gaining access to the data files.
> 
> That isn't the only attack vector that this addresses (though it is one
> that this is envisioned to help with- to wit, someone rsync'ing the DB
> directory).  TDE also helps with traditional data at rest requirements,
> in environments where you don't trust the storage layer to handle that
> (for whatever reason), and it's at least imagined that backups with
> pg_basebackup would also be encrypted, helping protect against backup
> theft.
> 
> There's, further, certainly no shortage of folks asking for this.

Can we flesh this out more in the docs?  Any idea on wording compared to
what I have?

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-10 Thread Bruce Momjian
On Wed, Dec  9, 2020 at 10:34:59PM +0100, Daniel Gustafsson wrote:
> > On 2 Dec 2020, at 22:38, Bruce Momjian  wrote:
> > 
> > Attached is a patch for key management, which will eventually be part of
> > cluster file encryption (CFE), called TDE (Transparent Data Encryption)
> > by Oracle.
> 
> The attackvector protected against seems to be operating systems users
> (*without* any legitimate database access) gaining access to the data files.
> Such an attacker would also gain access to postgresql.conf and therefore may
> have the cluster passphrase command in case it's stored there.  That would
> imply the attacker is likely (or perhaps better phrased "not entirely
> unlikely") to be able to run that command and decrypt the data *iff* there is
> no interactive/2fa aspect to the passphrase command.  Is that an accurate
> interpretation?  Do Oracle TDE et.al use a similar setup where an database
> restart require manual intervention?

I have updated the docs to say "read" access more clearly:

   The purpose of cluster file encryption is to prevent users with read
   access to the directories used to store database files from being able to
   access the data stored in those files.  For example, when using cluster
   file encryption, users who have read access to the cluster directories
   for backup purposes will not be able to read the data stored in the
   these files.

I already said "read access" in the later part of the paragraph, but
obviously it needs to be mentioned early too.  If we need more text
here, please let me know.

As far as someone hijacking the passphrase command, that is a serious
risk.  No matter how you do the authentication, even TFA, you are going
to have someone able to grab that passphrase as it comes out of the
script and passes to the server.  It might help to store the script and
postgres binaries in a more secure location than the data, but I am not
sure how realistic that is.  I can if you are using a NAS for data store
and have local binaries and passphrase script, that would be more secure
than putting the script on the NAS.  Is that something we should explain?
Should we explicity say that there is no protection against write
access?  What can someone with write access to PGDATA do if
postgresql.conf is not stored there?  I don't know.

> Admittedly I haven't read the other threads on the topic so if it's discussed
> somewhere else then I'd appreciate a whacking with a cluestick.
> 
> A few other comments from skimming the patch:
> 
> +  is data encryption keys, specifically keys zero and one.  Key zero is
> +  the key uses to encrypt database heap and index files which are stored in
> ".. is the key used to .."?

Fixed, thanks.

> +   matches the initdb-supplied passphrase.  If this check fails, and the
> +   the server will refuse to start.
> Seems like something is missing, perhaps just s/and the//?

Fixed.

> +  url="https://tools.ietf.org/html/rfc2315;>RFC2315.
> Tiny nitpick: turns out that RFC 2315 (with a space) is the correct spelling.

Fixed.

> + * are read-only.  All wrapping and unwrapping key routines depends on
> + * the openssl library for now.
> OpenSSL is a name so it should be cased as OpenSSL in text like this.

Fixed, and fixed "depends".

>  #include 
> +#include 
> Why do we need this header in be-secure-openssl.c?  There are no other changes
> to that file?

Not sure, removed, since it compiles fine without it.
> 
> +/* Define to 1 if you have the `OPENSSL_init_crypto' function. */
> +#undef HAVE_OPENSSL_INIT_CRYPTO
> This seems to be unused?

Agreed, removed.

> KmgrSaveCryptoKeys doesn't seem to be explicitly clamping down permissions on
> the generated file, is that something we want for belt-and-suspender level
> paranoia around keys? The same goes for read_one_keyfile etc.

Well, KmgrSaveCryptoKeys is calling BasicOpenFile, which is calling
BasicOpenFilePerm(fileName, fileFlags, pg_file_create_mode).  The
question is whether we should honor the pg_file_create_mode specified on
the data directory, or make it tighter for these keys.  The file is
already owner-rw-only by default:

$ ls -l
drwx-- 2 postgres postgres 4096 Dec 10 13:13 live/
$ cd live/
$ ls -l
total 8
-rw--- 1 postgres postgres 356 Dec 10 13:13 0
-rw--- 1 postgres postgres 356 Dec 10 13:13 1

If the key files were mixed in with a bunch of other files of lesser
value, like with Apache, I could see not honoring it, but for this case,
since all the files are of equal value, I think just honoring it makes
sense.

> Also, KmgrSaveCryptoKeys assumes that keys are stored in plain files which is
> true for OpenSSL but won't necessarily be true for other crypto libraries.
> Would it make sense to have an API in be-kmgr.c with the OpenSSL specifics in
> be-kmgr-openssl.c like how the TLS backend support is written?  We have spent 
> a
> lot effort in making the backend not assume a particular TLS library, it seems
> a 

Re: Proposed patch for key managment

2020-12-10 Thread Neil Chen
Hi, everyone
>
> I have read the patch and did some simple tests. I'm not entirely sure
> about some code segments; e.g.:
>
> In the BootStrapKmgr() we generate a data encryption key by:
> key = generate_crypto_key(file_encryption_keylen);
>
> However, I found that the file_encryption_keylen is always 0 in bootstrap
> mode because there exitst another variable bootstrap_file_encryption_keylen
> in xlog.c and bootstrap.c.
>
> We get the REL/WAL key by KmgrGetKey() call and it works like:
> return (const CryptoKey *) &(KmgrShmem->intlKeys[id]);
>
> But in bootstrap mode, the KmgrShmem are not assigned. So, if we want to
> use it to encrypt something in bootstrap mode, I suggest we make the
> following changes:
> if ( in bootstrap mode)
> return intlKeys[id]; // a static variable which contains key
> else
> reutrn (const CryptoKey *) &(KmgrShmem->intlKeys[id]);
>
>

-- 
There is no royal road to learning.
Highgo Software Co.


Re: Proposed patch for key managment

2020-12-09 Thread Bruce Momjian
On Fri, Dec  4, 2020 at 10:32:45PM -0500, Bruce Momjian wrote:
> I can break out the -R/file descriptor passing part as a separate patch,
> and have the ssl_passphrase_command use that, but that's the only part I
> know can be useful on its own.
> 
> Since the patch is large, I found a way to push the branch to git and
> how to make a download link that tracks whatever I push to the 'key'
> branch on my github account.  Here is the updated patch link:
> 
>   https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

I have made some good progress on the patch.  I realized that pg_upgrade
can't just copy the keys from the old cluster --- they encrypt the user
heap/index files that are copied/linked by pg_upgrade, but also encrypt
the system tables, which initdb creates, so the keys have to be copied
at initdb bootstrap time --- I have added an option to do that.  I also
realized that pg_upgrade will be starting/stopping the server, so I need
to add an option to pg_upgrade to allow that prompting.  I can now
successfully pg_upgrade a cluster that uses cluster file encryption, and
keep the same keys.  All at the same URL.

In addition I have completed the command-line tool to allow changing of
the cluster passphrase, which applies over the first diff;  diff at:


https://github.com/bmomjian/postgres/compare/key...bmomjian:key-alter.diff

My next task is to write a script for Yubikey authentication.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-09 Thread Stephen Frost
Greetings,

* Daniel Gustafsson (dan...@yesql.se) wrote:
> > On 2 Dec 2020, at 22:38, Bruce Momjian  wrote:
> > Attached is a patch for key management, which will eventually be part of
> > cluster file encryption (CFE), called TDE (Transparent Data Encryption)
> > by Oracle.
> 
> The attackvector protected against seems to be operating systems users
> (*without* any legitimate database access) gaining access to the data files.

That isn't the only attack vector that this addresses (though it is one
that this is envisioned to help with- to wit, someone rsync'ing the DB
directory).  TDE also helps with traditional data at rest requirements,
in environments where you don't trust the storage layer to handle that
(for whatever reason), and it's at least imagined that backups with
pg_basebackup would also be encrypted, helping protect against backup
theft.

There's, further, certainly no shortage of folks asking for this.

> Such an attacker would also gain access to postgresql.conf and therefore may
> have the cluster passphrase command in case it's stored there.  That would
> imply the attacker is likely (or perhaps better phrased "not entirely
> unlikely") to be able to run that command and decrypt the data *iff* there is
> no interactive/2fa aspect to the passphrase command.  Is that an accurate
> interpretation?  Do Oracle TDE et.al use a similar setup where an database
> restart require manual intervention?

This is very much an 'it depends' as other products have different
capabilities in this area.  The most similar implementation to what is
being contemplated for PG today would be the big O's "tablespace" TDE,
which offers options of either "Password-based software keystores" (you
have to provide a password when you want to bring that tablespace
online), or "Auto-login software keystores" (there's a "system generated
password" that's used to encrypt the keystore, which seems to be
more-or-less the username and hostname of the system), or HSM based
options.  As such, a DB restart might require manual intervention (if
the keystore is password based, or an HSM that requires manual
involvement) or it might not (auto-login keystore of some kind).

> Admittedly I haven't read the other threads on the topic so if it's discussed
> somewhere else then I'd appreciate a whacking with a cluestick.

I'm sure it was, but hopefully the above helps you avoid having to dig
through the very (very (very)) long threads on this topic.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-09 Thread Daniel Gustafsson
> On 2 Dec 2020, at 22:38, Bruce Momjian  wrote:
> 
> Attached is a patch for key management, which will eventually be part of
> cluster file encryption (CFE), called TDE (Transparent Data Encryption)
> by Oracle.

The attackvector protected against seems to be operating systems users
(*without* any legitimate database access) gaining access to the data files.
Such an attacker would also gain access to postgresql.conf and therefore may
have the cluster passphrase command in case it's stored there.  That would
imply the attacker is likely (or perhaps better phrased "not entirely
unlikely") to be able to run that command and decrypt the data *iff* there is
no interactive/2fa aspect to the passphrase command.  Is that an accurate
interpretation?  Do Oracle TDE et.al use a similar setup where an database
restart require manual intervention?

Admittedly I haven't read the other threads on the topic so if it's discussed
somewhere else then I'd appreciate a whacking with a cluestick.

A few other comments from skimming the patch:

+  is data encryption keys, specifically keys zero and one.  Key zero is
+  the key uses to encrypt database heap and index files which are stored in
".. is the key used to .."?

+   matches the initdb-supplied passphrase.  If this check fails, and the
+   the server will refuse to start.
Seems like something is missing, perhaps just s/and the//?

+ https://tools.ietf.org/html/rfc2315;>RFC2315.
Tiny nitpick: turns out that RFC 2315 (with a space) is the correct spelling.

+ * are read-only.  All wrapping and unwrapping key routines depends on
+ * the openssl library for now.
OpenSSL is a name so it should be cased as OpenSSL in text like this.

 #include 
+#include 
Why do we need this header in be-secure-openssl.c?  There are no other changes
to that file?

+/* Define to 1 if you have the `OPENSSL_init_crypto' function. */
+#undef HAVE_OPENSSL_INIT_CRYPTO
This seems to be unused?

KmgrSaveCryptoKeys doesn't seem to be explicitly clamping down permissions on
the generated file, is that something we want for belt-and-suspender level
paranoia around keys? The same goes for read_one_keyfile etc.

Also, KmgrSaveCryptoKeys assumes that keys are stored in plain files which is
true for OpenSSL but won't necessarily be true for other crypto libraries.
Would it make sense to have an API in be-kmgr.c with the OpenSSL specifics in
be-kmgr-openssl.c like how the TLS backend support is written?  We have spent a
lot effort in making the backend not assume a particular TLS library, it seems
a shame to step away from that here with a tight coupling. (yes, I am biased)

The same goes for src/common/cipher.c which wraps calls in ifdefs, why not just
have an thin wrapper API which cipher-openssl.c implements?

+   case 'K':
+   file_encryption_keylen = atoi(optarg);
+   break;
atoi() will return zero on failing to parse the keylen, where 0 implies
"disabled".  Wouldn't it make sense to parse this with code which can't
silently fall back on disabling in case of users mistyping?

+* Skip cryptographic keys. It's generally not good idea to copy the
".. not *a* good idea .."

+   pg_log_fatal("cluser passphrase referenced %%R, but -R 
not specified");
s/cluser/cluster/  (there are few copy-pasteos of that elsewhere too)

+   elog(ERROR, "too many cryptographic kes");
s/kes/keys/

+#ifndef CIPHER_H
+#define CIPHER_H
The risk for headerguard collision with non-PG headers seem quite high, maybe
PG_CIPHER_H would be better?

I will try to dive in a bit deeper over the holidays.

cheers ./daniel



Re: Proposed patch for key managment

2020-12-06 Thread Bruce Momjian
On Mon, Dec  7, 2020 at 11:03:30AM +0900, Michael Paquier wrote:
> On Sat, Dec 05, 2020 at 10:42:05AM -0500, Bruce Momjian wrote:
> > On Sat, Dec  5, 2020 at 09:54:33PM +0900, Michael Paquier wrote:
> >> On Fri, Dec 04, 2020 at 10:52:29PM -0500, Bruce Momjian wrote:
> >>>   if (state->evpctx == NULL)
> >>>   {
> >>> - explicit_bzero(state, sizeof(pg_cryptohash_state));
> >>> - explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> >>>  #ifndef FRONTEND
> >>>   ereport(ERROR,
> >>>   (errcode(ERRCODE_OUT_OF_MEMORY),
> >> 
> >> And this original position is IMO more correct.
> > 
> > Do we even need them?
> 
> That's the intention to clean up things in a consistent way.

Ah, I see your point.  Added:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-06 Thread Bruce Momjian
On Mon, Dec  7, 2020 at 09:30:03AM +0900, Masahiko Sawada wrote:
> Thank you for updating the patch!
> 
> I think we need explicit_bzero() also in freeing the keywrap context.

pg_cryptohash_free() already has this:

explicit_bzero(state, sizeof(pg_cryptohash_state));
explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));

Do we need more?

> BTW, when we need -R option pg_ctl command to start the server, how
> can we start it in the single-user mode?

I added code for that, but I hadn't tested it yet.  Now that I tried it,
I realized that it is awkward to supply a file descriptor number (that
will be closed) from the command-line, so I added code and docs to allow
-1 to duplicate standard error, and it worked:

$ postgres --single -R -1 -D /u/pg/data

Enter password:
PostgreSQL stand-alone backend 14devel
backend> select 100;
 1: ?column?(typeid = 23, len = 4, typmod = -1, byval = t)

 1: ?column? = "100"(typeid = 23, len = 4, typmod = -1, 
byval = t)


Updated patch at the same URL:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-06 Thread Michael Paquier
On Sat, Dec 05, 2020 at 10:42:05AM -0500, Bruce Momjian wrote:
> On Sat, Dec  5, 2020 at 09:54:33PM +0900, Michael Paquier wrote:
>> On Fri, Dec 04, 2020 at 10:52:29PM -0500, Bruce Momjian wrote:
>>> if (state->evpctx == NULL)
>>> {
>>> -   explicit_bzero(state, sizeof(pg_cryptohash_state));
>>> -   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
>>>  #ifndef FRONTEND
>>> ereport(ERROR,
>>> (errcode(ERRCODE_OUT_OF_MEMORY),
>> 
>> And this original position is IMO more correct.
> 
> Do we even need them?

That's the intention to clean up things in a consistent way.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-06 Thread Masahiko Sawada
On Sun, 6 Dec 2020 at 00:42, Bruce Momjian  wrote:
>
> On Sat, Dec  5, 2020 at 09:54:33PM +0900, Michael Paquier wrote:
> > On Fri, Dec 04, 2020 at 10:52:29PM -0500, Bruce Momjian wrote:
> > > OK, I worked with Sawada-san and added the attached patch.  The updated
> > > full patch is at the same URL:  :-)
> > >
> > > 
> > > https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> >
> > Oh, I see that you use SHA256 during firstboot, which is why you
> > bypass the resowner paths in cryptohash_openssl.c.  Wouldn't it be
> > better to use IsBootstrapProcessingMode() then?
>
> I tried that, but we also use the resource references before the
> resource system is started even in non-bootstrap mode.  Maybe we should
> be creating a resource owner for this, but it is so early in the boot
> process I don't know if that is safe.
>
> > > @@ -72,14 +72,15 @@ pg_cryptohash_create(pg_cryptohash_type type)
> > > ctx = ALLOC(sizeof(pg_cryptohash_ctx));
> > > if (ctx == NULL)
> > > return NULL;
> > > +   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> > >
> > > state = ALLOC(sizeof(pg_cryptohash_state));
> > > if (state == NULL)
> > > {
> > > -   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> > > FREE(ctx);
> > > return NULL;
> > > }
> > > +   explicit_bzero(state, sizeof(pg_cryptohash_state));
> >
> > explicit_bzero() is used to give the guarantee that any sensitive data
> > gets cleaned up after an error or on free.  So I think that your use
> > is incorrect here for an initialization.
>
> It turns out what we were missing was a clear of state->resowner in
> cases where the resowner was null.  I removed the bzero calls completely
> and it now runs fine.
>
> > > if (state->evpctx == NULL)
> > > {
> > > -   explicit_bzero(state, sizeof(pg_cryptohash_state));
> > > -   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> > >  #ifndef FRONTEND
> > > ereport(ERROR,
> > > (errcode(ERRCODE_OUT_OF_MEMORY),
> >
> > And this original position is IMO more correct.
>
> Do we even need them?
>
> > Anyway, at quick glance, the backtrace of upthread is indicating a
> > double-free with an attempt to free a resource owner that has been
> > already free'd.
>
> I think that is now fixed too.  Updated patch at the same URL:
>
> 
> https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

Thank you for updating the patch!

I think we need explicit_bzero() also in freeing the keywrap context.

BTW, when we need -R option pg_ctl command to start the server, how
can we start it in the single-user mode?

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Proposed patch for key managment

2020-12-05 Thread Bruce Momjian
On Sat, Dec  5, 2020 at 09:54:33PM +0900, Michael Paquier wrote:
> On Fri, Dec 04, 2020 at 10:52:29PM -0500, Bruce Momjian wrote:
> > OK, I worked with Sawada-san and added the attached patch.  The updated
> > full patch is at the same URL:  :-)
> > 
> > https://github.com/postgres/postgres/compare/master...bmomjian:key.diff
> 
> Oh, I see that you use SHA256 during firstboot, which is why you
> bypass the resowner paths in cryptohash_openssl.c.  Wouldn't it be
> better to use IsBootstrapProcessingMode() then?

I tried that, but we also use the resource references before the
resource system is started even in non-bootstrap mode.  Maybe we should
be creating a resource owner for this, but it is so early in the boot
process I don't know if that is safe.

> > @@ -72,14 +72,15 @@ pg_cryptohash_create(pg_cryptohash_type type)
> > ctx = ALLOC(sizeof(pg_cryptohash_ctx));
> > if (ctx == NULL)
> > return NULL;
> > +   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> >  
> > state = ALLOC(sizeof(pg_cryptohash_state));
> > if (state == NULL)
> > {
> > -   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> > FREE(ctx);
> > return NULL;
> > }
> > +   explicit_bzero(state, sizeof(pg_cryptohash_state));
> 
> explicit_bzero() is used to give the guarantee that any sensitive data
> gets cleaned up after an error or on free.  So I think that your use
> is incorrect here for an initialization.

It turns out what we were missing was a clear of state->resowner in
cases where the resowner was null.  I removed the bzero calls completely
and it now runs fine.

> > if (state->evpctx == NULL)
> > {
> > -   explicit_bzero(state, sizeof(pg_cryptohash_state));
> > -   explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
> >  #ifndef FRONTEND
> > ereport(ERROR,
> > (errcode(ERRCODE_OUT_OF_MEMORY),
> 
> And this original position is IMO more correct.

Do we even need them?

> Anyway, at quick glance, the backtrace of upthread is indicating a
> double-free with an attempt to free a resource owner that has been
> already free'd.

I think that is now fixed too.  Updated patch at the same URL:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-05 Thread Michael Paquier
On Fri, Dec 04, 2020 at 10:52:29PM -0500, Bruce Momjian wrote:
> OK, I worked with Sawada-san and added the attached patch.  The updated
> full patch is at the same URL:  :-)
> 
>   https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

Oh, I see that you use SHA256 during firstboot, which is why you
bypass the resowner paths in cryptohash_openssl.c.  Wouldn't it be
better to use IsBootstrapProcessingMode() then?

> @@ -72,14 +72,15 @@ pg_cryptohash_create(pg_cryptohash_type type)
>   ctx = ALLOC(sizeof(pg_cryptohash_ctx));
>   if (ctx == NULL)
>   return NULL;
> + explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
>  
>   state = ALLOC(sizeof(pg_cryptohash_state));
>   if (state == NULL)
>   {
> - explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
>   FREE(ctx);
>   return NULL;
>   }
> + explicit_bzero(state, sizeof(pg_cryptohash_state));

explicit_bzero() is used to give the guarantee that any sensitive data
gets cleaned up after an error or on free.  So I think that your use
is incorrect here for an initialization.

>   if (state->evpctx == NULL)
>   {
> - explicit_bzero(state, sizeof(pg_cryptohash_state));
> - explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
>  #ifndef FRONTEND
>   ereport(ERROR,
>   (errcode(ERRCODE_OUT_OF_MEMORY),

And this original position is IMO more correct.

Anyway, at quick glance, the backtrace of upthread is indicating a
double-free with an attempt to free a resource owner that has been
already free'd.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-05 Thread Masahiko Sawada
On Sat, 5 Dec 2020 at 11:08, Bruce Momjian  wrote:
>
> On Wed, Dec  2, 2020 at 04:38:14PM -0500, Bruce Momjian wrote:
> > If most people approve of this general approach, and the design
> > decisions made, I would like to apply this in the next few weeks, but
> > this brings complications.  The syntax added by this commit might not
> > provide a useful feature until PG 15, so how do we hide it from users.
> > I was thinking of not applying the doc changes (or commenting them out)
> > and commenting out the --help output.
>
> Here is an updated patch to handle the new hash API introduced by
> commit 87ae9691d2.
>

Thank you for updating the patch and moving forward!

I've tried to use this patch on my environment (FreeBSD 12.1) but it
doesn't work. I got the following error:

$ bin/initdb -D data -E UTF8 --no-locale -c'echo
"1234567890123456789012345678901234567890123456789012345678901234567890"'
The files belonging to this database system will be owned by user "masahiko".
This user must also own the server process.

The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

creating directory data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Japan
creating configuration files ... ok
running bootstrap script ... child process was terminated by signal
10: Bus error
initdb: removing data directory "data"

The backtrace is:

(lldb) bt
* thread #1, name = 'postgres', stop reason = signal SIGBUS: hardware error
frame #0: 0x00d3b074
postgres`ResourceArrayRemove(resarr=0x7f7f7f7f7f7f80df,
value=34383251232) at resowner.c:308:23
frame #1: 0x00d3c0ef
postgres`ResourceOwnerForgetCryptoHash(owner=0x7f7f7f7f7f7f7f7f,
handle=34383251232) at resowner.c:1421:7
frame #2: 0x00d77c54
postgres`pg_cryptohash_free(ctx=0x00080166c720) at
cryptohash_openssl.c:228:3
  * frame #3: 0x00d6c065
postgres`kmgr_derive_keys(passphrase="1234567890123456789012345678901234567890123456789012345678901234567890\n",
passlen=71, enckey="\x01d

\x17(\x93\xa8id\xae\xa5\x86\x84\xb9Y>\xa84\x16\U0085\U009d'\r\x11123456789012345678901234567890
1234567890123456789012345678901234567890\n",
mackey="0\x1f\xb4_eg\x95\x02\x9e2\x0e\xba\t\b^\x18\x90U\xa0\x8e\xaei\x01oYJL\xa4\xb3E\x97a,\x06h4\x9fX\x9eS\xb2\x81%^d\xa4\x01d

\x17(\x93\xa8id\xae\xa5\x86\x84\xb9Y>\xa84\x16\U0085\U009d'\r\x1112345678901234567890123456789
01234567890123456789012345678901234567890\n") at kmgr_utils.c:239:2
frame #4: 0x00d60810 postgres`BootStrapKmgr at kmgr.c:93:2
frame #5: 0x00647fa1 postgres`BootStrapXLOG at xlog.c:5359:3
frame #6: 0x0066f034 postgres`AuxiliaryProcessMain(argc=7,
argv=0x7fffcdb8) at bootstrap.c:450:4
frame #7: 0x008e9cfb postgres`main(argc=8,
argv=0x7fffcdb0) at main.c:201:3
frame #8: 0x0051010f postgres`_start(ap=,
cleanup=) at crt1.c:76:7

Investigating the issue, it seems we need to initialize the context
and the state with 0. The following change fixes this issue.

diff --git a/src/common/cryptohash_openssl.c b/src/common/cryptohash_openssl.c
index e5233daab6..a45c86fa67 100644
--- a/src/common/cryptohash_openssl.c
+++ b/src/common/cryptohash_openssl.c
@@ -81,6 +81,8 @@ pg_cryptohash_create(pg_cryptohash_type type)
return NULL;
}

+   memset(ctx, 0, sizeof(pg_cryptohash_ctx));
+   memset(state, 0, sizeof(pg_cryptohash_state));
ctx->data = state;
ctx->type = type;

Regards,

-- 
Masahiko Sawadahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Proposed patch for key managment

2020-12-04 Thread Bruce Momjian
On Sat, Dec  5, 2020 at 12:15:13PM +0900, Masahiko Sawada wrote:
> diff --git a/src/common/cryptohash_openssl.c b/src/common/cryptohash_openssl.c
> index e5233daab6..a45c86fa67 100644
> --- a/src/common/cryptohash_openssl.c
> +++ b/src/common/cryptohash_openssl.c
> @@ -81,6 +81,8 @@ pg_cryptohash_create(pg_cryptohash_type type)
> return NULL;
> }
> 
> +   memset(ctx, 0, sizeof(pg_cryptohash_ctx));
> +   memset(state, 0, sizeof(pg_cryptohash_state));
> ctx->data = state;
> ctx->type = type;

OK, I worked with Sawada-san and added the attached patch.  The updated
full patch is at the same URL:  :-)

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee

diff --git a/src/common/cryptohash_openssl.c b/src/common/cryptohash_openssl.c
index e5233daab6..02dec1fd1b 100644
--- a/src/common/cryptohash_openssl.c
+++ b/src/common/cryptohash_openssl.c
@@ -72,14 +72,15 @@ pg_cryptohash_create(pg_cryptohash_type type)
 	ctx = ALLOC(sizeof(pg_cryptohash_ctx));
 	if (ctx == NULL)
 		return NULL;
+	explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
 
 	state = ALLOC(sizeof(pg_cryptohash_state));
 	if (state == NULL)
 	{
-		explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
 		FREE(ctx);
 		return NULL;
 	}
+	explicit_bzero(state, sizeof(pg_cryptohash_state));
 
 	ctx->data = state;
 	ctx->type = type;
@@ -97,8 +98,6 @@ pg_cryptohash_create(pg_cryptohash_type type)
 
 	if (state->evpctx == NULL)
 	{
-		explicit_bzero(state, sizeof(pg_cryptohash_state));
-		explicit_bzero(ctx, sizeof(pg_cryptohash_ctx));
 #ifndef FRONTEND
 		ereport(ERROR,
 (errcode(ERRCODE_OUT_OF_MEMORY),


Re: Proposed patch for key managment

2020-12-04 Thread Bruce Momjian
On Sat, Dec  5, 2020 at 11:39:18AM +0900, Michael Paquier wrote:
> On Fri, Dec 04, 2020 at 09:08:03PM -0500, Bruce Momjian wrote:
> > Here is an updated patch to handle the new hash API introduced by
> > commit 87ae9691d2.
> 
> +   if (!ossl_initialized)
> +   {
> +#ifdef HAVE_OPENSSL_INIT_SSL
> +   OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL);
> +#else
> +   OPENSSL_config(NULL);
> +   SSL_library_init();
> +   SSL_load_error_strings();
> +#endif
> +   ossl_initialized = true;
> This is a duplicate of what's done in be-secure-openssl.c, and it does
> not strike me as a good idea to do that potentially twice.

Yeah, I kind of wondered about that.  In fact, the code from the
original patch would not compile so I got this init code from somewhere
else. I have now removed it and it works fine.  :-)

> git diff --check complains.

Uh, can you be more specific?  I don't see any output from that command.

> +extern bool pg_HMAC_SHA512(const uint8 *key,
> +   const uint8 *in, int inlen,
> +   uint8 *out);
> I think that the split done in this patch makes the HMAC handling in
> the core code messier:
> - SCRAM makes use of HMAC internally, and we should try to use the
> HMAC of OpenSSL if building with it even for SCRAM.
> - For the first reason, I think that we should also have a fallback
> implementation.
> - This API layer should not depend directly on the SHA2 used (SCRAM
> uses SHA256 with HMAC).
> FWIW, I got plans to work on that once I am done with the business
> around MD5 and OpenSSL.

Uh, I just kind of kept all that code and didn't modify it.  It would be
great if you can help me improve it.  I will be using the hash code for
the command-line tool that alters the passphrase, so having that in
common/ does help me.

> The refactoring done with the ciphers moved from pgcrypto to
> src/common/ should be a separate patch.  In short, it would be good to

Uh, I am kind of unclear exactly what was done there since I just took
that part of the patch unchanged.

> rework this patch and split it into pieces that are independently
> useful.  This would make the review much easier as well.

I can break out the -R/file descriptor passing part as a separate patch,
and have the ssl_passphrase_command use that, but that's the only part I
know can be useful on its own.

Since the patch is large, I found a way to push the branch to git and
how to make a download link that tracks whatever I push to the 'key'
branch on my github account.  Here is the updated patch link:

https://github.com/postgres/postgres/compare/master...bmomjian:key.diff

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Proposed patch for key managment

2020-12-04 Thread Michael Paquier
On Fri, Dec 04, 2020 at 09:08:03PM -0500, Bruce Momjian wrote:
> Here is an updated patch to handle the new hash API introduced by
> commit 87ae9691d2.

+   if (!ossl_initialized)
+   {
+#ifdef HAVE_OPENSSL_INIT_SSL
+   OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL);
+#else
+   OPENSSL_config(NULL);
+   SSL_library_init();
+   SSL_load_error_strings();
+#endif
+   ossl_initialized = true;
This is a duplicate of what's done in be-secure-openssl.c, and it does
not strike me as a good idea to do that potentially twice.

git diff --check complains.

+extern bool pg_HMAC_SHA512(const uint8 *key,
+   const uint8 *in, int inlen,
+   uint8 *out);
I think that the split done in this patch makes the HMAC handling in
the core code messier:
- SCRAM makes use of HMAC internally, and we should try to use the
HMAC of OpenSSL if building with it even for SCRAM.
- For the first reason, I think that we should also have a fallback
implementation.
- This API layer should not depend directly on the SHA2 used (SCRAM
uses SHA256 with HMAC).
FWIW, I got plans to work on that once I am done with the business
around MD5 and OpenSSL.

The refactoring done with the ciphers moved from pgcrypto to
src/common/ should be a separate patch.  In short, it would be good to
rework this patch and split it into pieces that are independently
useful.  This would make the review much easier as well.
--
Michael


signature.asc
Description: PGP signature


Re: Proposed patch for key managment

2020-12-04 Thread Bruce Momjian
On Wed, Dec  2, 2020 at 04:38:14PM -0500, Bruce Momjian wrote:
> If most people approve of this general approach, and the design
> decisions made, I would like to apply this in the next few weeks, but
> this brings complications.  The syntax added by this commit might not
> provide a useful feature until PG 15, so how do we hide it from users. 
> I was thinking of not applying the doc changes (or commenting them out)
> and commenting out the --help output.

Here is an updated patch to handle the new hash API introduced by
commit 87ae9691d2.

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip


Proposed patch for key managment

2020-12-02 Thread Bruce Momjian
Attached is a patch for key management, which will eventually be part of
cluster file encryption (CFE), called TDE (Transparent Data Encryption)
by Oracle.  It is an update of Masahiko Sawada's patch from July 31:


https://www.postgresql.org/message-id/ca+fd4k6rjwnvztro3q2f5hsdd8hgyuc4cuy9u3e6ran4c6t...@mail.gmail.com

Sawada-san did all the hard work, and I just redirected the patch.  The
general outline of this CFE feature can be seen here:

https://wiki.postgresql.org/wiki/Transparent_Data_Encryption

The currently planned progression for this feature is to allow secure
retrieval of key encryption keys (KEK) outside of the database, then use
those to encrypt data keys that encrypt heap/index/tmpfile files.

This patch was changed from Masahiko Sawada's version by removing
references to non-cluster file encryption because having SQL-level keys
stored in this way was considered to have limited usefulness.  I have
also remove references to SQL-level KEK rotation since that is best done
as a command-line too.

If most people approve of this general approach, and the design
decisions made, I would like to apply this in the next few weeks, but
this brings complications.  The syntax added by this commit might not
provide a useful feature until PG 15, so how do we hide it from users. 
I was thinking of not applying the doc changes (or commenting them out)
and commenting out the --help output.

Once this patch is applied, I would like to use the /dev/tty file
descriptor passing feature for the ssl_passphrase_command parameter, so
the ssl passphrase can be prompted from the terminal.  (I am attaching
pass_fd.sh as a POC for how file descriptor handling works.)  I would
also then write the KEK rotation command-line tool.  After that, we can
start working on heap/index/tmpfile encryption using this patch as a
base.

Here is an example of the current patch in action, using a KEK like
'7CE7945EA2F7322536F103E4D7D91C21E52288089EF99D186B9A76D666EBCA66',
which is not echoed to the terminal:

$ initdb -R -c '/u/postgres/tmp/pass_fd.sh "Enter password:" %R'
The files belonging to this database system will be owned by user 
"postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.
Cluster file encryption is enabled.

fixing permissions on existing directory /u/pgsql/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ...
--> Enter password:ok
performing post-bootstrap initialization ...
--> Enter password:ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

pg_ctl -D /u/pgsql/data -l logfile start

$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter password: done
server started

A non-matching passphrase looks like this:

$ pg_ctl -R -l /u/pg/server.log start
waiting for server to start...
--> Enter password: stopped waiting
pg_ctl: could not start server
Examine the log output.

$ tail -2 /u/pg/server.log
2020-12-02 16:32:34.045 EST [13056] FATAL:  cluster passphrase does not 
match expected passphrase
2020-12-02 16:32:34.047 EST [13056] LOG:  database system is shut down

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee



key.diff.gz
Description: application/gzip


pass_fd.sh
Description: Bourne shell script