Hi Will,

I still don't see how public/private encryption helps. It seems the private 
keys in your idea(s) are needed in all the same places and for the same 
duration as the secret keys in mine. I think it's clear, though, that I didn't 
fully comprehend your post.

I can at least see that my proposal's use of key wrapping techniques came 
without an explanation, so I include one later in this post.

The core part of 'native encryption' is in the first commit: the mechanics of 
using AES in counter mode at the right places in couch_file so that all the 
bytes on disk are correctly encrypted and yet correctly decrypted on 
subsequent read, no matter which section of the file is read or in what order.

That bit, I hope, is not controversial (though it does require careful review).
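
To make that bit concrete, here is a rough sketch of the mechanics, assuming 
a 32-byte file key and 16-byte-aligned offsets (my illustration for this 
post, not the actual code from the branch):

%% Illustration only: in CTR mode the counter block can be derived from the
%% file offset, so any aligned region of the file can be encrypted or
%% decrypted independently of the rest, in any order.
-module(ctr_sketch).
-export([crypt_at/3]).

%% The same function serves encryption and decryption, since CTR mode is an
%% XOR of the data with a keystream.
crypt_at(FileKey, Offset, Bytes) when Offset rem 16 =:= 0 ->
    IV = <<(Offset div 16):128/unsigned>>, %% block index as initial counter
    crypto:crypto_one_time(aes_256_ctr, FileKey, IV, Bytes, true).

Because every 16-byte block gets a distinct counter, the order in which 
sections are written or read makes no difference.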

It is the layers above that which we, as a dev community, need to ponder.

To be useful, it must prevent an attacker from reading the data contained 
within the .couch files under as many scenarios as possible. We wouldn't want 
the confidentiality of data to be compromised easily.

To be secure, we must follow the security guidance for AES and for the selected 
mode (Counter Mode, in this case).  We don't want keys to live forever. We 
don't want to encrypt too much with the same key. We must never use the same 
key and the same counter to encrypt different plaintext, and so on.
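
To make the last rule concrete, here is a tiny demonstration, written by me 
for this post, of why reusing the same key and counter in counter mode is 
fatal: the keystreams cancel and the XOR of the two plaintexts leaks.

%% Encrypting two plaintexts under the same (key, IV) in CTR mode leaks
%% their XOR, because both ciphertexts use the identical keystream.
-module(ctr_reuse_demo).
-export([leak/0]).

leak() ->
    Key = crypto:strong_rand_bytes(32),
    IV = <<0:128>>, %% deliberately reused; never do this
    P1 = <<"attack at dawn!!">>,
    P2 = <<"attack at dusk!!">>,
    C1 = crypto:crypto_one_time(aes_256_ctr, Key, IV, P1, true),
    C2 = crypto:crypto_one_time(aes_256_ctr, Key, IV, P2, true),
    %% Returns true: the keystream has cancelled out entirely.
    crypto:exor(C1, C2) =:= crypto:exor(P1, P2).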

CouchDB can open .couch (and .view) files very frequently, so we probably 
cannot afford to contact an external key manager for every single couch_file 
open. There is an LRU for performance reasons but I'm wary of coupling that to 
a potentially slow unwrap service call.

I've used a key wrapping technique for a few reasons. First, so that every 
file has its own key, chosen at random on creation, that is entirely 
independent of any other key. This hugely helps with staying within the 
guidance above. It also means the counter can be the file offset, which, for 
our append-only files, ensures uniqueness. Second, it introduces a layer of 
management keys which can have a longer lifetime, managed externally to one 
degree or another. It is those keys and their management which I'm trying to 
decide on before merging.

Perhaps it would help if I make a strawman proposal for a key manager interface?

Before I do that, imagine a number of small changes to the branch of work I've 
presented already:

1. A key manager interface is defined.
2. The concrete implementation of that interface can be selected for a 
CouchDB installation somehow (perhaps in vm.args or default.ini; a 
hypothetical sketch follows this list).
3. couch_file delegates to this interface at the points where it currently 
needs to wrap or unwrap the file key.
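
For example, the default.ini route might look like this (section and key 
names entirely invented for illustration):

[encryption]
; module implementing the key manager interface below
key_manager = couch_keywrap_local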

An interface might look like this:

-callback new_wrapped_key(WrappingKeyId :: binary()) ->
    {ok, WrappedKey :: binary(), UnwrappedKey :: binary()} |
    {error, Reason :: term()}.

-callback unwrap_key(WrappingKeyId :: binary(), WrappedKey :: binary()) ->
    {ok, UnwrappedKey :: binary()} | {error, Reason :: term()}.

couch_file would call new_wrapped_key when creating a new file, and would 
receive the wrapped form, for writing to the header, and the unwrapped form, 
for initialising the ciphers held in the state variable.

For existing files, couch_file would read the wrapped key from the file and 
call unwrap_key to retrieve the unwrapped form, for the same purpose as 
before.

An implementation of this interface could be done in Erlang, as I've already 
shown, or could involve a remote network connection to some service that does 
it for us (and, one hopes, does so over HTTPS).
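
To make that concrete, a purely local implementation might look roughly like 
the following sketch. I've used AES-256-GCM from OTP's crypto module as a 
stand-in for the AES key wrap that aegis_keywrap provides, and the module 
name, config section and error terms are all invented for illustration:

-module(couch_keywrap_local).
-export([new_wrapped_key/1, unwrap_key/2]).

new_wrapped_key(WrappingKeyId) ->
    case lookup_kek(WrappingKeyId) of
        {ok, Kek} ->
            FileKey = crypto:strong_rand_bytes(32), %% fresh key per file
            IV = crypto:strong_rand_bytes(12),
            {CipherText, Tag} = crypto:crypto_one_time_aead(
                aes_256_gcm, Kek, IV, FileKey, WrappingKeyId, true),
            {ok, <<IV/binary, Tag/binary, CipherText/binary>>, FileKey};
        Error ->
            Error
    end.

unwrap_key(WrappingKeyId, <<IV:12/binary, Tag:16/binary, CipherText/binary>>) ->
    case lookup_kek(WrappingKeyId) of
        {ok, Kek} ->
            case crypto:crypto_one_time_aead(
                     aes_256_gcm, Kek, IV, CipherText, WrappingKeyId, Tag,
                     false) of
                error -> {error, unwrap_failed};
                FileKey -> {ok, FileKey}
            end;
        Error ->
            Error
    end.

lookup_kek(WrappingKeyId) ->
    %% Illustrative only: a hex-encoded 256-bit KEK held in couchdb config
    %% (binary:decode_hex/1 needs OTP 24+).
    case config:get("encryption_keys", binary_to_list(WrappingKeyId)) of
        undefined -> {error, unknown_wrapping_key};
        HexKek -> {ok, binary:decode_hex(list_to_binary(HexKek))}
    end.

A remote implementation would keep the same two callbacks and make an HTTPS 
call where lookup_kek reads from config, which is exactly the decoupling I'm 
after.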

So the question I'm most interested in discussing is whether this is the 
right level of abstraction and, if not, what others think would be.

I hope most folks can see that the above interface could be introduced in my 
branch fairly easily, and the parts of that work which use aegis_keywrap or 
read from config could be consolidated into an implementation of it. I'm happy 
to do that work if it helps.

B.

> On 18 May 2022, at 14:47, Will Young <lostnetwork...@gmail.com> wrote:
> 
> Hi Robert,
> 
>  I think it is best to try to clear up the matter of non-extractable
> keys since that is where I am confused about the capabilities without
> asymmetric keys. The setup I am used to seeing for non-extractable
> keys looks similar to crypto's OpenSSL engine references where erlang
> says it supports only RSA and the underlying OpenSSL supports RSA or
> some EC-variants. I think that is pretty much in line with ~smartcard-chip
> pkcs11 tokens like yubikeys, i.e. I have an old feitian epass2003
> which says it supports some symmetric key algorithms but really it
> supports generating only keypairs in pkcs11-tool and some
> ~acceleration of shared keys.
> 
> Looking at AWS' cloudHSM, OTOH, I see that it supports symmetric
> non-extractable keys, but also backing up, restoring and ending up
> with clones that are copies of one original HSM. I see how that could
> work with only symmetric keys, but I find that a little scary and I
> think clonability is supposed to never be the case for the portable
> non-extractable tokens I actually use.
> 
> If a non-clonable non-extractable device is used I don't think it is
> practical to do correct management with just non-extractable symmetric
> keys. They all have to be present to be encrypted to, making them all
> subject to mishap in production at the same time, and they can have no
> restorable backups, as that is indistinguishable from future clones.
> 
> Thus I arrived at asymmetric setup for tokens so encryption can occur
> to additional offline tokens. I would expect to need to pass an engine
> reference to keep access to the private key open to read a wrapped
> shard key and while theoretically one could blindly encrypt to public
> keys, I think there's probably a need to effectively do a
> sign/encrypt with a/the local private key to each public key.
> I.e.: a header would need at least 2 slots on a pure asymmetric setup:
>   to-keyid:local-token  encrypted-shard-KEY
>   to-keyid:backups encrypted-shard-KEY
> In compaction one would encrypt the new shard key to whichever
> public-keys one wants but would naturally want to choose at least one
> that is the local token or private key to not put the shard offline.
> 
> But as you alluded to, one doesn't really want a keypair for the node
> itself; one could mix one or more asymmetric keys to solve the
> management problems of trusting specific non-clonable tokens with a
> node local symmetric key for rewrapping to itself. As long as there's
> a trustworthy offline key, the admin could be much more confident that
> local keys could safely be rotated out and destroyed without
> permanently losing access to data or backups by mistake, so keys can
> actually get deleted from production. That introduces the same kind of
> limit on the value of current production wrapper keys as the frequent
> shard key rotation does, but applied to any access to shards in backups
> of the data volume; and even if there is no local HSM, there is still
> not much opportunity to snoop by copying the symmetric keys in use
> today in order to look at future backups.
> 
> If we are talking only about supporting "enterprise" HSMs that can be
> cloned to various backup clusters, etc, I'm not really sure if they
> fix the problems I was concerned about or only punt to managing risk
> with online clones, etc. In many cases I think they actually make it
> more likely to have a very complete HSM configuration of every key
> ever used cloned around to every host and create a lot of internal
> access to be able to look through all old backups and so on. I would
> also feel uncomfortable using an HSM that only works on one cloud
> without signing to an outside asymmetric key to ensure continuity if
> something went wrong with agreements to use that cloud.
> 
> So really I think I view the ability to encrypt to an offline
> asymmetric key that may or may not be an HSM key as kind of important
> to eventually feeling comfortable using encryption in production and
> tuning toward good security practice without fear of data loss.
> 
> At any rate I hope that makes the direction I have been thinking in a
> little clearer?
> Thanks,
> Will
