On Mon, Oct 10, 2016 at 10:11 PM, Pawel Jakub Dawidek <[email protected]> wrote:
> Hi Thomas,
>
> I'm going through your OpenZFS presentation and I'd like to get more
> info about IV storage.
>
> In the presentation you mention that you cannot generate IV, that it has
> to be stored somewhere. Generating it from 'DVA[0] + birth txg + salt'
> does sound like a good idea, but you mentioned that birth txg can rewind
> on import. It can, but we still generate new salt every time we store
> the block (don't we?) and birth txg could be fixed by simply starting
> from some sane value on import, eg. 'on-disk-birth-txg + 1024'.
> Consecutive crashes on import might be a problem, though.
> In the presentation you address why DVA[0] and birth txg don't give us
> uniqueness, but you don't talk about the salt, which is the most
> promising bit. Could you elaborate?
>
> There is also IV (96 bits) on the slide, but I think I saw you mention
> somewhere else (on github?) it is now 128 bits, right? If I'm wrong, could
> you please talk about 96 bits IV too?

TL;DR: The underlying encryption modes fall down before we run out of
IVs so we stopped considering larger IVs. We also can't use anything
based on the txg because the txg isn't cryptographically protected by
anything. We need to store the IV because any unique metadata we could
use to derive an IV from is lost when you do an encrypted send. Sorry
for the long-winded answer, but as long as this is going on the
mailing list, I might as well make the entire discussion visible for
anyone who might have a better idea.


So Matt and I actually talked about this for a really long while, but
I really should put together a big summary of everything we thought of
in case anyone can think of anything better. These were the options we
considered:

1) Using the bookmark: This won't work because (1) it can change per
snapshot / per dedup and (2) the same bookmark can be rewritten many
times, meaning the IV won't be unique.

2) Using the bookmark + txg: This solves half the problem of using the
bookmark, but we would still need to store the IV somewhere for
decryption because, again, a read might be issued from any snapshot.
This is the solution Oracle went with (they also store the IV in
DVA[2]).

3) DVA[0] + txg: (for completeness, although you already mentioned it)
The problem here is that blocks are written out to disk before the
transaction group. So it is technically possible for an IV to be
reused in the event of a crash since the most recent transaction group
would have data that was written to disk, but was not itself synced.
Granted, this would be very hard to take advantage of. The attacker
would need historical read access to the disk, but we didn't like the
idea of a known (albeit very theoretical) attack. We could have tried
to combat this by (as you said) adding some sane number to the txg on
import to make a repeat unlikely, but ultimately the problem is that
the txg itself is not protected by anything in particular. If
you have write access to the disk you could still rewind the txg
enough to counteract our txg buffer. If you are really ambitious you
can also rewrite the txg of every block that has a txg after the one
you rewound to (correcting the checksums manually as you went) so the
pool wouldn't even notice the difference.
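
To make the failure mode in option 3 concrete, here is a minimal Python
sketch. The derive_iv helper, the truncated SHA-256, the (vdev, offset)
tuple standing in for DVA[0] and the txg numbers are all assumptions
made for illustration, not what ZFS does. The only point is that any
deterministic function of (DVA[0], txg) repeats whenever both inputs
repeat, and that a fixed buffer on import can be cancelled out by
rewinding the txg by the same amount.

```python
import hashlib
import struct

def derive_iv(dva, txg):
    """Hypothetical 96-bit IV derived from DVA[0] and the birth txg."""
    vdev, offset = dva
    material = struct.pack("<QQQ", vdev, offset, txg)
    return hashlib.sha256(material).digest()[:12]

TXG_BUFFER = 1024  # the "sane value" added on import (illustrative)

# Last txg that was fully synced to disk before the crash.
synced_txg = 5000

# A block is written in txg 5001, but the pool crashes before 5001 syncs.
dva = (0, 0x4000)
iv_before_crash = derive_iv(dva, synced_txg + 1)

# On import without a buffer the pool resumes at txg 5001, and the
# allocator is free to hand out the same DVA again: same IV, new data.
iv_after_crash = derive_iv(dva, synced_txg + 1)
reused_without_buffer = iv_before_crash == iv_after_crash

# With the buffer, import resumes at synced_txg + 1024, so the IV differs...
iv_buffered = derive_iv(dva, synced_txg + TXG_BUFFER)
reused_with_buffer = iv_before_crash == iv_buffered

# ...unless someone with write access first rewinds the on-disk txg by
# the buffer size, which lands the pool back on an already used txg.
rewound_txg = synced_txg + 1 - TXG_BUFFER
iv_after_rewind = derive_iv(dva, rewound_txg + TXG_BUFFER)
reused_after_rewind = iv_before_crash == iv_after_rewind

print(reused_without_buffer, reused_with_buffer, reused_after_rewind)
```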

4) A counter maintained in the keychain: At about the same time we had
another idea to keep a counter in the keychain. This would enable us
to use the full 96-bit IV space without worrying about collisions.
This was important because, if you do the math for randomly chosen
IVs, writing out 1 million blocks per second you would get a
1-in-a-billion chance of an IV collision in about 3 hours. You would
get a 1-in-a-million chance in just over 2 days. The counter also had
the same problems as DVA[0] + txg.
There is another interesting problem that we found out about at this
point. AES-GCM and AES-CCM actually start to break down if you keep
using the same key, even without reusing an IV. If you follow through
with the math that is in the GCM spec (see
http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/gcm/gcm-spec.pdf,
section 7) you find out that (at 128k block size) GCM starts to fall
down well before we run out of our 96 bit IVs. At this point it became
obvious that we would need some more entropy for key rotation.
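
For anyone who wants to redo the collision math, here is a rough
Python sketch. It assumes uniformly random 96-bit IVs and the usual
small-probability birthday approximation p ~= n^2 / 2^(bits + 1); the
function names are made up for this example. Under that approximation
the 1-in-a-billion figure comes out at a few hours and the
1-in-a-million figure at a few days, in the same ballpark as the
numbers above. The exact values depend on the approximation and
rounding used.

```python
import math

IV_BITS = 96
BLOCKS_PER_SECOND = 1_000_000

def blocks_until_collision_probability(p, iv_bits=IV_BITS):
    """Number of random IVs after which a collision has probability ~p.

    Uses the birthday approximation p ~= n^2 / 2^(iv_bits + 1),
    which is only accurate for small p.
    """
    return math.sqrt(p * 2 ** (iv_bits + 1))

def seconds_until_collision_probability(p, rate=BLOCKS_PER_SECOND):
    """Time to reach collision probability ~p at a fixed write rate."""
    return blocks_until_collision_probability(p) / rate

for p in (1e-9, 1e-6):
    seconds = seconds_until_collision_probability(p)
    print(f"p = {p:g}: {seconds / 3600:.1f} hours ({seconds / 86400:.2f} days)")
```

Note that this only covers IV collisions. The GCM key-usage limits
mentioned above are a separate constraint and are not modelled here.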

5) DVA[0] + txg + salt: This was the best idea we had for quite a
while and it existed in the implementation for about a month. The idea
here is that the salt is regenerated every time the dataset is
remounted, so even if you could rewind the txg the newly generated
random salt wouldn't match. The salt could also be used for key
rotation (something we kept in the current implementation). We really
thought this would work, but then we realized that all of the
solutions that don't actually store the IV have one common flaw:
encrypted sends. The idea of encrypted sends is that we should be able
to take the data on disk and send it exactly as it is to an untrusted
server for backup / replication purposes. This is a really neat
feature to have and it would enable us to use ZFS as an end-to-end
encryption platform. Unfortunately, when you do the send the DVAs and
the txg are completely lost. The blocks end up in the receiving pool's
current txg and are placed wherever the receiving pool has room. At
this point we decided that we would need to store the IV somewhere.

6) Randomly generated IV: This is what we finally decided on. In the
end a PRNG is what is generally recommended for IV generation and it
should actually be cheaper to generate than a secure hash of anything
else. Generating the IV is simple and elegant even if storing it
isn't, but this was the best we could come up with after a month and
a half or so.
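
The encrypted send problem and the stored-IV fix can be sketched in a
few lines of Python. Everything here is a toy model: BlockPointer is
not the real blkptr_t layout, derive_iv and its truncated SHA-256 are
invented for the example, the 8-byte salt size is arbitrary, and the
assumption that the salt travels with the sent data is mine. No actual
encryption is done. The sketch only shows which IV the receiver can
still reconstruct.

```python
import hashlib
import os
import struct
from dataclasses import dataclass

@dataclass
class BlockPointer:
    """Toy stand-in for a block pointer (hypothetical fields)."""
    dva: tuple          # (vdev, offset), assigned by the local pool
    birth_txg: int      # assigned by the local pool
    salt: bytes         # assumed to travel with the encrypted data
    stored_iv: bytes    # random IV kept alongside the block

def derive_iv(bp):
    """Hypothetical derived IV: hash of DVA[0] + birth txg + salt."""
    material = struct.pack("<QQQ", bp.dva[0], bp.dva[1], bp.birth_txg) + bp.salt
    return hashlib.sha256(material).digest()[:12]

def write_block(dva, txg, salt):
    # 96 bits from the OS random source, stored with the block.
    return BlockPointer(dva, txg, salt, stored_iv=os.urandom(12))

def receive_block(sent, new_dva, new_txg):
    """Raw receive: salt and stored IV are copied verbatim, but the
    receiving pool picks its own DVA and birth txg."""
    return BlockPointer(new_dva, new_txg, sent.salt, sent.stored_iv)

source = write_block(dva=(0, 0x4000), txg=5001, salt=os.urandom(8))
derived_iv_at_write = derive_iv(source)

received = receive_block(source, new_dva=(2, 0x9000), new_txg=77)

# A derived IV cannot be recomputed on the receiving side...
derived_iv_survives = derive_iv(received) == derived_iv_at_write
# ...while a stored IV simply comes along with the block.
stored_iv_survives = received.stored_iv == source.stored_iv
print(derived_iv_survives, stored_iv_survives)
```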

Let me know if that doesn't answer it. I see now that since I started
typing this Matt might have answered your question, but it's still
probably best to get this discussion out there.

