On Mon, Oct 10, 2016 at 10:11 PM, Pawel Jakub Dawidek <[email protected]> wrote:
> Hi Thomas,
>
> I'm going through your OpenZFS presentation and I'd like to get more
> info about IV storage.
>
> In the presentation you mention that you cannot generate IV, that it has
> to be stored somewhere. Generating it from 'DVA[0] + birth txg + salt'
> does sound like a good idea, but you mentioned that birth txg can rewind
> on import. It can, but we still generate new salt every time we store
> the block (don't we?) and birth txg could be fixed by simply starting
> from some sane value on import, e.g. 'on-disk-birth-txg + 1024'.
> Consecutive crashes on import might be a problem, though.
> In the presentation you address why DVA[0] and birth txg don't give us
> uniqueness, but you don't talk about the salt, which is the most
> promising bit. Could you elaborate?
>
> There is also IV (96 bits) on the slide, but I think I saw you mention
> somewhere else (on github?) it is now 128 bits, right? If I'm wrong, could
> you please talk about 96 bits IV too?

TL;DR: The underlying encryption modes fall down before we run out of IVs, so we stopped considering larger IVs. We also can't use anything based on the txg because the txg isn't cryptographically protected by anything. We need to store the IV because any unique metadata we could derive an IV from is lost when you do an encrypted send.

Sorry for the long-winded answer, but as long as this is going on the mailing list, I might as well make the entire discussion visible for anyone who might have a better idea.

Matt and I talked about this for a long while, and I really should put together a summary of everything we thought of in case anyone can think of anything better. These were the options we considered:

1) Using the bookmark: This won't work because (1) it can change per snapshot / per dedup and (2) the same bookmark can be rewritten many times, meaning the IV won't be unique.

2) Using the bookmark + txg: This solves half the problem of using the bookmark, but we would still need to store it somewhere for decryption because, again, a read might be issued from any snapshot. This is the solution Oracle went with (they also store the IV in DVA[2]).

3) DVA[0] + txg (for completeness, although you already mentioned it): The problem here is that blocks are written out to disk before the transaction group itself is synced. So it is technically possible for an IV to be reused in the event of a crash, since the most recent transaction group would have data that was written to disk but was not itself synced. Granted, this would be very hard to take advantage of: the attacker would need historical read access to the disk. Still, we didn't like the idea of a known (albeit very theoretical) attack. We could have tried to combat this by (as you said) adding some sane number to the txg to make a collision unlikely, but ultimately the problem is that the txg itself is not protected by anything in particular.
If you have write access to the disk you could still rewind the txg enough to counteract our txg buffer. If you are really ambitious you could also rewrite the txg of every block that has a txg after the one you rewound to (correcting the checksums manually as you went) so the pool wouldn't even notice the difference.

4) A counter maintained in the keychain: At about the same time we had another idea: keep a counter in the keychain. This would enable us to use the full 96-bit IV space without worrying about collisions. This was important because, if you do the math, writing out 1 million blocks per second you would get a 1 in 1 billion chance of an IV collision in about 3 hours. You would get a 1 in 1 million chance in just over 2 days. This also had the same problems as DVA[0] + txg.

There is another interesting problem that we found out about at this point. AES-GCM and AES-CCM actually start to break down if you keep using the same key, even without reusing an IV. If you follow through with the math in the GCM spec (see http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/gcm/gcm-spec.pdf, section 7) you find that (at 128k block size) GCM starts to fall down well before we run out of our 96-bit IVs. At this point it became obvious that we would need some more entropy for key rotation.

5) DVA[0] + txg + salt: This was the best idea we had for quite a while and it existed in the implementation for about a month. The idea here is that the salt is regenerated every time the dataset is remounted, so even if you could rewind the txg the newly generated random salt wouldn't match. The salt could also be used for key rotation (something we kept in the current implementation). We really thought this would work, but then we realized that all of the solutions that don't actually store the IV have one common flaw: encrypted sends.

The idea of encrypted sends is that we should be able to take the data on disk and send it exactly as it is to an untrusted server for backup / replication purposes. This is a really neat feature to have and it would enable us to use ZFS as an end-to-end encryption platform. Unfortunately, when you do the send the DVAs and the txg are completely lost. The blocks end up in the receiving pool's current txg and are placed wherever the receiving pool has room. At this point we decided that we would need to store the IV somewhere.

6) Randomly generated IV: This is what we finally decided on. In the end a PRNG is what is generally recommended for IV generation, and it should actually be cheaper to generate than a secure hash of anything else. It is simple and elegant even if storing the resulting IV isn't, but this was the best we could come up with after a month and a half or so.

Let me know if that doesn't answer it. I see now that since I started typing this Matt might have answered your question, but it's still probably best to get this discussion out there.
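
For anyone who wants to redo the IV collision arithmetic mentioned under the keychain counter option, here is a minimal sketch. It assumes IVs are drawn uniformly at random from the 96-bit space and uses the standard birthday approximation p ~= n^2 / 2^(bits+1). The function names are made up for illustration and are not from the ZFS code.

```python
import math

IV_BITS = 96                    # IV size discussed in this thread
WRITES_PER_SECOND = 1_000_000   # write rate used in the estimate

def writes_until_collision_probability(p, iv_bits=IV_BITS):
    """Approximate number of random IVs drawn before the chance of any
    repeat reaches p (birthday approximation: p ~= n^2 / 2^(bits+1))."""
    return math.sqrt(2.0 * p * 2.0 ** iv_bits)

def seconds_until_collision_probability(p, rate=WRITES_PER_SECOND,
                                        iv_bits=IV_BITS):
    """Time in seconds to reach collision probability p at a fixed write rate."""
    return writes_until_collision_probability(p, iv_bits) / rate

if __name__ == "__main__":
    for p in (1e-9, 1e-6):
        secs = seconds_until_collision_probability(p)
        print(f"p = {p:g}: {secs / 3600:.1f} hours ({secs / 86400:.2f} days)")
```

Under these assumptions the one-in-a-billion point comes out at a few hours and the one-in-a-million point at a few days. The exact values depend on the approximation used.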
