[cryptography] “On the limits of the use cases for authenticated encryption”
Folks:

I posted this on Google+, which I'm effectively using as a blog: https://plus.google.com/108313527900507320366/posts/cMng6kChAAW

I'll paste the content of my essay below. It elicited some keen observations from Nikita Borisov in the comments on G+, but I guess you'll have to actually load the page yourself to read those. I also posted it on the tahoe-dev mailing list, where a small thread ensued: https://tahoe-lafs.org/pipermail/tahoe-dev/2012-April/007315.html

Regards,

Zooko

*“On the limits of the use cases for authenticated encryption”*

*What is authenticated encryption?*

“Authenticated Encryption” is an abstraction that is getting a lot of attention among cryptographers and crypto programmers nowadays. Authenticated Encryption is just like normal (symmetric) encryption, in that it prevents anyone who doesn't know the key from learning anything [*] about the text. The authenticated part is that it *also* prevents anyone who doesn't know the key from undetectably altering the text. (If someone who doesn't know the key does alter the text, then the recipient will cleanly reject it as corrupted rather than accepting the altered text.)

It is a classic mistake for engineers using crypto to confuse encryption with authentication. If you're trying to find weaknesses in someone's crypto protocol, one of the first things to check is whether the designers of the protocol assumed that by encrypting some data they were preventing that data from being undetectably modified. Encryption doesn't accomplish that, so if they made that mistake, you can attack the system by modifying the ciphertext. Depending on the details of their system, this could lead to a full break of the system, such that you can violate the security properties that they had intended to provide to their users.
Since this is such a common mistake, with such potentially bad consequences, and because fixing it is not that easy (especially due to timing and exception-oracle attacks against authentication schemes), cryptographers have studied how to efficiently and securely integrate both encryption and authentication into one package. The resulting schemes are called “Authenticated Encryption” schemes.

In the years since cryptographers developed some good authenticated encryption schemes, they've started thinking of them as a drop-in replacement for normal old unauthenticated encryption schemes, and started suggesting that everyone should use authenticated encryption schemes instead of unauthenticated encryption schemes in all cases. There was a recent move among cryptographers, spearheaded by the estimable Daniel J. Bernstein, to collectively focus on developing new improved authenticated encryption schemes. This would be a sort of community-wide collaboration, now that the community-wide collaboration on secure hash functions—the SHA-3 contest—is coming to an end. Several modern cryptography libraries, including “Keyczar” and Daniel J. Bernstein's “nacl”, try to make it easy for the programmer to use an authenticated encryption mode, and some of them make it difficult or impossible to use an unauthenticated encryption mode.

When Brian Warner and I presented Tahoe-LAFS at the RSA Conference in 2010, I was surprised and delighted when an audience member who approached me afterward turned out to be Prof. Phil Rogaway, renowned cryptographer and author of a very efficient authenticated encryption scheme (OCB mode). He said something nice about our presentation and then asked why we didn't use an authenticated encryption mode. Shortly before that conversation he had published a very stimulating paper named “Practice-Oriented Provable Security and the Social Construction of Cryptography”, but I didn't read it until years later.
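The classic mistake described above, and the fix that authenticated encryption provides, can be demonstrated in a few lines. The following is a toy Python sketch (the hash-based keystream and all names are illustrative only, not a production cipher): bare encryption is malleable, so an attacker who knows the plaintext format can flip bits without knowing any key, while an encrypt-then-MAC composition rejects the altered ciphertext.

```python
import hashlib
import hmac
import os

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a keystream by hashing key || nonce || counter (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))

decrypt = encrypt  # XOR stream cipher: decryption is the same operation

enc_key, mac_key, nonce = os.urandom(32), os.urandom(32), os.urandom(16)
msg = b"pay mallory $0000100"
ct = encrypt(enc_key, nonce, msg)

# The classic mistake: encryption alone is malleable.  Flipping one bit of
# ciphertext flips the same bit of plaintext, changing "$0000100" to "$9000100".
tampered = bytearray(ct)
tampered[13] ^= ord("0") ^ ord("9")
print(decrypt(enc_key, nonce, bytes(tampered)))  # b'pay mallory $9000100'

# Encrypt-then-MAC: the recipient rejects any modified ciphertext.
tag = hmac.digest(mac_key, nonce + ct, "sha256")
assert hmac.compare_digest(tag, hmac.digest(mac_key, nonce + ct, "sha256"))
assert not hmac.compare_digest(tag, hmac.digest(mac_key, nonce + bytes(tampered), "sha256"))
```

Note that the MAC check must happen before the plaintext is used at all, and comparisons should be constant-time (`hmac.compare_digest`), precisely because of the timing and exception-oracle attacks mentioned above.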
In that fascinating and wide-ranging paper he opines, among many other ideas, that authenticated encryption is one of “the most useful abstraction boundaries”. So, here's what I wish I had been quick-witted enough to say to him when we met in 2010: authenticated encryption can't satisfy any of my use cases!

*Tahoe-LAFS access control semantics*

I'm one of the original and current designers of the Tahoe-LAFS secure distributed filesystem. We started out, in 2006, by choosing the access control semantics that we wanted to offer our users and that we knew how to implement. Here's what we chose:

*There are two kinds of files: immutable and mutable. When you write a file to the filesystem you can choose which kind of file it will be in the filesystem. Immutable files can't be modified once they have been written. A mutable file can be modified by someone with read-write access to it. A user can have read-write access to a mutable file, read-only access to it, or no access to it at all.*

*In addition to read-write access and read-only access, we implement a third, more limited, form of access: verify-only access. You can grant someone the ability to check the integrity of your ciphertexts without also granting them the ability to read or modify the files.*
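For illustration only, here is one way such a chain of diminishing authority can be built from one-way hashes. This is a hypothetical sketch, not Tahoe-LAFS's actual capability derivation (which differs, e.g. mutable files involve signing keys); it only shows the structural idea that each weaker capability is derived one-way from the stronger one.

```python
import hashlib
import os

def derive(tag: bytes, parent: bytes) -> bytes:
    # One-way derivation: holders of the child capability cannot
    # recover the parent capability it was derived from.
    return hashlib.sha256(tag + parent).digest()

write_cap = os.urandom(32)                 # read-write capability (the strongest)
read_cap = derive(b"read", write_cap)      # read-only capability
verify_cap = derive(b"verify", read_cap)   # verify-only capability

# Handing out verify_cap grants integrity checking without granting
# read or write access, because the derivation cannot be inverted.
assert derive(b"verify", derive(b"read", write_cap)) == verify_cap
```

In this sketch the verify-only capability would be used to check ciphertext hashes, so a storage server or repair agent holding only `verify_cap` can detect corruption without ever seeing plaintext.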
Re: [cryptography] “On the limits of the use cases for authenticated encryption”
I think Tahoe-LAFS is the exception to any rule that one should use AE, and really, a very rare exception. Not the only exception, though this type of application might be the only kind of exception we want.

A ZFS-like COW filesystem with Merkle hash trees should have requirements similar to Tahoe's, specifically the ability to verify and repair on-disk structures, including encrypted file data (and some metadata!), without being able to decrypt said data. One way to do this is to use AE and also hash the encrypted data, storing both the hashes and the AE tags in block pointers, but this is rather wasteful. Storing only hashes has other problems, but these could be addressed by MACing the root hash of every encrypted stream and storing that MAC in an appropriate place (e.g., the pointer to the root node for that stream, or else separately in the node pointing to that root node).

You could argue that such a filesystem is the same sort of application as Tahoe-LAFS, even if it isn't networked (but nowadays the storage can be networked, so effectively ZFS belongs in the exact same bucket as Tahoe). But in traditional network protocols (TLS, SSHv2, ESP, ...) I have to strain to think of reasons not to use AE when you want confidentiality protection (encryption).

Nico

___ cryptography mailing list cryptography@randombit.net http://lists.randombit.net/mailman/listinfo/cryptography
Re: [cryptography] data integrity: secret key vs. non-secret verifier; and: are we winning? (was: “On the limits of the use cases for authenticated encryption”)
On 04/25/2012 10:11 PM, Zooko Wilcox-O'Hearn wrote:

> It goes like this: suppose you want to ensure the integrity of a chunk of data. There are at least two ways to do this (excluding public key digital signatures):
>
> 1. the secret-oriented way: you make a MAC tag of the chunk (or equivalently you use Authenticated Encryption on it) using a secret key known to the good guy(s) and unknown to the attacker(s).
>
> 2. the verifier-oriented way: you make a secure hash of the chunk, and make the resulting hash value known to the good guy(s) in an authenticated way.

Is option 2 sort of just pushing the problem around? What's going on under the hood in the term “in an authenticated way”? How do you do authentication in an automated system without someone somewhere keeping something secret? Is authenticating the hash value fundamentally different from ensuring the integrity of a chunk of data?

- Marsh
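The two options can be sketched in a few lines of Python (the names and the toy `chunk` are illustrative). The structural difference is what the verifier must hold and keep: in option 1 a secret key that can also forge tags; in option 2 a non-secret hash value that must merely arrive through an authenticated channel.

```python
import hashlib
import hmac
import os

chunk = b"some file contents"

# Option 1, the secret-oriented way: integrity rests on a key shared by
# the good guys.  Anyone holding `key` can verify tags -- and forge them.
key = os.urandom(32)
tag = hmac.digest(key, chunk, "sha256")
assert hmac.compare_digest(tag, hmac.digest(key, chunk, "sha256"))

# Option 2, the verifier-oriented way: no secret at all.  The verifier is
# just a hash value that must reach the good guys in an authenticated way
# (e.g. embedded in a capability string, as Tahoe-LAFS does).
verifier = hashlib.sha256(chunk).hexdigest()
assert hashlib.sha256(chunk).hexdigest() == verifier

# A holder of the verifier can check integrity but cannot make any other
# data pass the check -- there is nothing secret to leak or to forge with.
assert hashlib.sha256(chunk + b"tampered").hexdigest() != verifier
```

This doesn't dissolve Marsh's question, it relocates it: option 2 moves the authentication burden from keeping a key secret to distributing the hash value intact, which is a different (and sometimes easier) problem.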
Re: [cryptography] data integrity: secret key vs. non-secret verifier; and: are we winning? (was: “On the limits of the use cases for authenticated encryption”)
You'd have to ask Darren, but IIRC the design he settled on allows for unkeyed integrity verification and repair. I too think that's a critical feature to have, even if having it means leaking some information, such as file length in blocks and number of files; I look at this from an operations perspective.

Nico
Re: [cryptography] data integrity: secret key vs. non-secret verifier; and: are we winning? (was: “On the limits of the use cases for authenticated encryption”)
On Wed, Apr 25, 2012 at 10:27 PM, Marsh Ray ma...@extendedsubset.com wrote:

> On 04/25/2012 10:11 PM, Zooko Wilcox-O'Hearn wrote:
>> 2. the verifier-oriented way: you make a secure hash of the chunk, and make the resulting hash value known to the good guy(s) in an authenticated way.
>
> Is option 2 sort of just pushing the problem around? What's going on under the hood in the term “in an authenticated way”? How do you do authentication in an automated system without someone somewhere keeping something secret? Is authenticating the hash value fundamentally different from ensuring the integrity of a chunk of data?

You have two choices for providing AE and (2):

a) MAC the root of each file's (or directory's, or dataset's) Merkle hash tree, or

b) store a hash and a MAC, thereby forming a Merkle hash tree and a parallel Merkle MAC tree.

In terms of additional storage and compute power (a) is clearly superior. I believe the security of (a) is adequate.

Nico
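A minimal sketch of option (a), assuming one common binary Merkle tree convention (domain-separated leaf/node hashing, duplicate-last-node padding; not necessarily what ZFS does): the tree is built from unkeyed hashes, so a keyless agent can verify and repair every block, and a single MAC over the root authenticates the whole file.

```python
import hashlib
import hmac
import os

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Root of a binary Merkle tree over the leaves, duplicating the last
    node on odd-sized levels (one common convention)."""
    level = [h(b"\x00" + leaf) for leaf in leaves]   # domain-separate leaves
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block-%d" % i for i in range(5)]   # stand-ins for encrypted file blocks
root = merkle_root(blocks)

# Option (a): one MAC over the root only, instead of a parallel MAC tree.
mac_key = os.urandom(32)
root_tag = hmac.digest(mac_key, root, "sha256")

# Keyless integrity check and repair: re-hash the on-disk blocks and
# compare against the stored tree -- no key needed.
assert merkle_root(blocks) == root

# Keyed authentication of the whole file: verify one MAC, not one per block.
assert hmac.compare_digest(root_tag, hmac.digest(mac_key, root, "sha256"))
```

Option (b) would additionally store `hmac.digest(mac_key, block, ...)` alongside every block hash, roughly doubling the per-pointer integrity metadata, which is the waste being avoided here.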
Re: [cryptography] data integrity: secret key vs. non-secret verifier; and: are we winning? (was: “On the limits of the use cases for authenticated encryption”)
Also,

On Wed, Apr 25, 2012 at 10:11 PM, Zooko Wilcox-O'Hearn zo...@zooko.com wrote:

> Hello Nico Williams. Nice to hear from you.
>
> Yes, when David-Sarah Hopwood and I (both Tahoe-LAFS hackers) participated on the zfs-crypto mailing list with you and others, I learned about a lot of similarities between Tahoe-LAFS and ZFS.

Yes, I remember that too. It was fun and enlightening.

>> But in traditional network protocols (TLS, SSHv2, ESP, ...) I have to strain to think of reasons to not use AE when you want confidentiality protection (encryption).
>
> Yes, I agree with you on that. And OTR ¹, CurveCP ², mosh ³, tcpcrypt ⁴, and ZRTP ⁵. All of these eight protocols we've just named have in common that there are only two parties, that only current data in-flight is protected, and that the protocol has already ensured (more or less -- haha) a shared secret key known to both of the users and not to any attackers.

Remember that in ZFS we also speak of end-to-end integrity protection, except in ZFS there's a single end: the system that implements it, and the attackers are presumed to be between that system and its storage devices. It's end-to-end because even though there's only one end, that end is effectively communicating with itself [over untrusted storage media]. The on-disk format is the equivalent of a secure transport protocol (with SAS, IB, ... being the equivalent of TCP/IP). Of course, if you access said storage from multiple heads then there will be multiple ends, but since only one can be writing at any given time (and really, even reading)...

In ZFS with encryption there are no additional ends, and protection against a local privileged agent is not in scope, but protection of data at rest (on the storage devices making up the ZFS volumes) is in scope. Additional protection is available when, and for as long as, the keys are not loaded on the system running ZFS.
> I think the distinction between filesystems on the one hand and communication protocols on the other is that in the first case we always have snapshots of data that we can apply Merkle hash trees to, and we always have *all* the data available and subject to use and re-use in random access patterns at any given time, including years later. Whereas in the second case the data is ephemeral, consumed and thrown away or otherwise transformed (outside the scope of the transport protocol) as soon as possible -- there's no need to consider an attack where a block earlier in the octet/message stream gets modified after it's been received and consumed.

We could store files as TLS streams using PSK and have those shared keys be the files' keys, but that would be inefficient, particularly if you were to need to write in any fashion other than strictly append-only.

> This distinction is what I believe drives us to design/apply completely different cryptographic protocols to the two types of protocols. I don't question the usefulness of the Authenticated Encryption abstraction for protocols that fall into that category.

Right, me either. I can't even imagine not using AE in that context, whether by generic composition or -- much better -- via an integrated AE cipher/cipher mode.

Nico
Re: [cryptography] data integrity: secret key vs. non-secret verifier; and: are we winning? (was: “On the limits of the use cases for authenticated encryption”)
On 2012-04-26 1:11 PM, Zooko Wilcox-O'Hearn wrote:

> how are we doing? Are we winning?
>
> I don't know about you, but I consider myself to be primarily a producer of defense technology. I'd like for every individual on the planet to have confidentiality, data integrity, to be able to share certain access with chosen people while denying it to everyone else, and more. Judging from the news and the chatter, offense is winning. Apparently almost nobody actually enjoys this kind of protection against the modern attacker, even big organizations who pay big money for it, much less the rest of the populace.

Obviously this is a solved problem in principle. In principle, we know how to do it. In practice, not. So it is a UI problem. Zooko envisaged how to solve the UI problem in principle, but Tahoe does not seem user friendly, nor scalable as an information sharing and transmission solution. It is narrowly targeted at solving the cloud backup problem.