Zooko Wilcox-O'Hearn wrote:
> Dear Darren J Moffat:
>
> I don't understand why you need a MAC when you already have the hash of
> the ciphertext. Does it have something to do with the fact that the
> checksum is non-cryptographic by default
> (http://docs.sun.com/app/docs/doc/819-5461/ftyue?a=view ), and is that
> still true? Your original design document  said you needed a way to
> force the checksum to be SHA-256 if encryption was turned on. But back
> then you were planning to support non-authenticating modes like CBC. I
> guess once you dropped non-authenticating modes then you could relax
> that requirement to force the checksum to be secure.
>
> Too bad, though! Not only are you now tight on space in part because
> you have two integrity values where one ought to do, but also a secure
> hash of the ciphertext is actually stronger than a MAC! A secure hash
> of the ciphertext tells whether the ciphertext is right (assuming the
> hash function is secure and implemented correctly). Given that the
> ciphertext is right, then the plaintext is right (given that the
> encryption is implemented correctly and you use the right decryption
> key).

Hmm. That may be too many "given"s. Tahoe (see www.allmydata.org) has an
open bug to add a plaintext hash, precisely because the encryption might
not be implemented correctly or the encryption key might not be correct:
<http://allmydata.org/trac/tahoe/ticket/453>

It seems as though ZFS (and many other protocols) is in the same position
as Tahoe, in wanting some way to validate that the ciphertext is correct
without needing the decryption key, but also wanting to minimize the risk
of some implementation error, and/or use of the wrong decryption key,
resulting in undetected errors in the plaintext.

I had something similar to the following in mind for the next update to
my proposal for Tahoe's new crypto protocol (simplified here to avoid
Tahoe-specific details and terminology):

 - a "plaintext verifier" is Hash1(index, salt, plaintext).
 - a "ciphertext verifier" is Hash2(index, ciphertext).
 - at a location determined by 'index', store:
     ciphertext = Encrypt[K](salt, plaintext)

This has the following advantages:

 - For integrity of the plaintext, you only need to assume that the
   implementation of the hash is correct. Moreover, if the hash
   implementation is not correct, that is very likely to cause it to
   fail to verify good data, which is noticeable as an error in normal
   operation. To get bad data to pass verification, the attacker would
   need to have some control over the output value of the incorrect
   hash; an error that effectively randomizes the value does not help
   them.

 - The verification also ensures integrity of the index. So, if a
   ciphertext ends up being stored in the wrong place, that will be
   detected.

 - Verification of the plaintext does not require the decryption key;
   it can be done using just the known plaintext verifier, and the
   purported values of 'salt' and 'plaintext' obtained from decryption.
   This is very important "if it must be possible to have all
   cryptographic key material stored and/or created entirely in a
   hardware device", as  states as a requirement for ZFS. If the
   verification can be done safely in software and if the encryption
   uses a standard mode, then it is more likely that existing crypto
   hardware, or at least hardware that has no specific dependency on
   ZFS, can be used.

 - Knowledge of the plaintext verifier by itself leaks no information
   about the plaintext, under the assumptions that the hash is one-way,
   and that there is no repetition of an (index, salt, plaintext)
   triple.

 - A non-malicious corruption of any of the plaintext verifier, the
   ciphertext, or the decryption key will cause the plaintext to fail
   to verify.

 - A malicious change to the ciphertext or any induced error in the
   decryption will cause the plaintext to fail to verify as long as the
   correct plaintext verifier is used.

Contrast with the case where we only use a ciphertext checksum, where
either an error in the decryption, or corruption of the decryption key,
will result in an undetected error in the plaintext.

Of course we also need to consider the space constraints. 384 bits would
fit two 192-bit hashes for the plaintext and ciphertext verifiers; but
then we would have no space to accommodate the ciphertext expansion that
results from encrypting the salt together with the plaintext. I'm not
familiar enough with ZFS's on-disk format to tell whether there is a way
around this. Note that the encrypted salt does not need to be stored in
the same place as either the verifiers or the rest of the ciphertext.

> A MAC on the plaintext tells you only that the plaintext was
> chosen by someone who knew the key. See what I mean? A MAC can't be
> used to give someone the ability to read some data while withholding
> from them the ability to alter that data. A secure hash can.

Right. If hashes are used instead of MACs, then the integrity of the
system does not depend on keeping secrets.
It only depends on preventing the attacker from modifying the root of
the Merkle tree. One consequence of this is that if there are
side-channel attacks against the implementations of crypto algorithms,
there is no information that they can leak to an attacker that would
allow compromising integrity.

(Of course, the integrity of the OS also needs to be protected. One way
of doing that would be to have a TPM, or the same hardware that is used
for crypto, store the root hash of the Merkle tree and also the hash of
a boot loader that supports ZFS. Then the boot loader would load an OS
from the ZFS filesystem, and only that OS would be permitted to update
the ZFS root hash.)

> One of the founding ideas of the whole design of ZFS was end-to-end
> integrity checking. It does that successfully now, for the case of
> accidents, using large checksums. If the checksum is secure then it
> also does it for the case of malice. In contrast a MAC doesn't do
> "end-to-end" integrity checking.

A cryptographic checksum on the ciphertext alone doesn't do end-to-end
integrity checking either. Even if everything is implemented correctly
and there are no hardware errors, it doesn't verify the integrity of the
decryption key.

> For example, if you've previously
> allowed someone to read a filesystem (i.e., you've given them access to
> the key), but you never gave them permission to write to it, but they
> are able to exploit the issues that you mention at the beginning of 
> such as "Untrusted path to SAN", then the MAC can't stop them from
> altering the file, nor can the non-secure checksum, but a secure hash
> can (provided that they can't overwrite all the way up the Merkle Tree
> of the whole pool and any copies of the Merkle Tree root hash).

The scheme I suggested above also has that advantage: if you have a
plaintext verifier, then you can check the integrity of the plaintext
even if an attacker knows the decryption key (and no separate MAC key is
needed).

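For concreteness, the plaintext-verifier / ciphertext-verifier scheme
described above could be sketched roughly as follows in Python. This is
only an illustration under several assumptions that are not part of the
proposal: SHA-256 with two distinct tag prefixes stands in for Hash1 and
Hash2, the length-prefixed field encoding is chosen here for
illustration, and toy_encrypt/toy_decrypt are hypothetical, insecure
placeholders for Encrypt[K] (a real system would use a standard cipher
mode).

```python
import hashlib
import hmac
import os

SALT_LEN = 16  # assumed salt size, for illustration only


def _encode(*fields: bytes) -> bytes:
    # Length-prefix each field so the tuple of fields is unambiguous.
    return b"".join(len(f).to_bytes(8, "big") + f for f in fields)


def plaintext_verifier(index: bytes, salt: bytes, plaintext: bytes) -> bytes:
    # Stand-in for Hash1(index, salt, plaintext).
    return hashlib.sha256(b"plaintext-verifier" + _encode(index, salt, plaintext)).digest()


def ciphertext_verifier(index: bytes, ciphertext: bytes) -> bytes:
    # Stand-in for Hash2(index, ciphertext).
    return hashlib.sha256(b"ciphertext-verifier" + _encode(index, ciphertext)).digest()


def _toy_keystream(key: bytes, n: int) -> bytes:
    # NOT SECURE: hash-counter keystream, used only to keep the sketch stdlib-only.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key: bytes, salt: bytes, plaintext: bytes) -> bytes:
    # Placeholder for Encrypt[K](salt, plaintext): the salt is encrypted with the data.
    data = salt + plaintext
    return bytes(a ^ b for a, b in zip(data, _toy_keystream(key, len(data))))


def toy_decrypt(key: bytes, ciphertext: bytes) -> tuple[bytes, bytes]:
    # Returns the purported (salt, plaintext); nothing is authenticated here.
    data = bytes(a ^ b for a, b in zip(ciphertext, _toy_keystream(key, len(ciphertext))))
    return data[:SALT_LEN], data[SALT_LEN:]


def store(key: bytes, index: bytes, plaintext: bytes) -> tuple[bytes, bytes, bytes]:
    # Produces the ciphertext to store at 'index' plus both verifiers.
    salt = os.urandom(SALT_LEN)
    ciphertext = toy_encrypt(key, salt, plaintext)
    return (ciphertext,
            plaintext_verifier(index, salt, plaintext),
            ciphertext_verifier(index, ciphertext))


def check_ciphertext(index: bytes, ciphertext: bytes, cv: bytes) -> bool:
    # Needs no decryption key.
    return hmac.compare_digest(ciphertext_verifier(index, ciphertext), cv)


def check_plaintext(index: bytes, salt: bytes, plaintext: bytes, pv: bytes) -> bool:
    # Needs only the decryption output and the known plaintext verifier.
    return hmac.compare_digest(plaintext_verifier(index, salt, plaintext), pv)
```

In this sketch, decrypting with the wrong key still passes
check_ciphertext (the ciphertext itself is intact) but fails
check_plaintext, and a ciphertext presented under the wrong index fails
check_ciphertext.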
> Likewise, a secure hash can be relied on as a dedupe tag *even* if
> someone with malicious intent may have slipped data into the pool. An
> insecure hash or a MAC tag can't -- a malicious actor could submit data
> which would cause a collision in an insecure hash or a MAC tag, causing
> tag-based dedupe to mistakenly unify two different blocks.

I agree. I don't think that Darren Moffat was suggesting to use the MAC
tag for dedupe. I also agree that a hash used for dedupe needs to be
quite long (256 bits would be nice, but 192 is probably OK).

>
> http://hub.opensolaris.org/bin/download/Project+zfs%2Dcrypto/files/zfs%2Dcrypto%2Ddesign.pdf

--
David-Sarah Hopwood  http://davidsarah.livejournal.com