Hi everyone.

This is very possibly a newb question (or series of questions), and if so I apologize in advance. I scoured everywhere I could think of for the last couple of days trying to find information on this and came up empty, but maybe I just didn't know the right terms to search for.

--- Background ---

I was reverse-engineering a system recently and came across an issue that I know from experience and training is pretty widespread when developers without a strong cryptography background use cryptography without thinking things through: although strong cryptography (AES) was used, and the key was stored securely, the system itself unintentionally provided a means for an attacker to decrypt arbitrary data without ever knowing the key. I have a set of recommendations in mind for how to avoid this type of vulnerability, but I'd like to sanity-check them with people who actually do have a cryptography background.

What I'm hoping to avoid is being the security guy who makes a recommendation for improving something, but unintentionally introduces a different vulnerability as a result. I have gotten pretty good at exploiting commonly-made mistakes in software that uses cryptography, but I am not a cryptography expert, or even cryptography adept.

The system in question uses a fairly common mechanism where the state of certain non-sensitive variables is maintained on the client by means of encrypted data which the client doesn't have the key to decrypt. The only reason for the encryption is to prevent the client from tampering with the data. This allows multiple different load-balanced nodes on the back-end to respond to requests from the same client without having to sync their state. Think of ASP.NET's ViewState, except that here the variables are broken out into individual components instead of there being one giant encrypted blob that contains all of the data.

Like many systems that use this model, there is a flaw that would (probably?) be trivial in the absence of other factors: some of those non-sensitive values are displayed back to the user after being decrypted. In other words, as I mentioned above, the system includes the unintentional ability for users to decrypt arbitrary data, as long as that data was encrypted using the same key as the data it actually expects.

Unfortunately, there is other - sensitive - data in the system which is also encrypted using the same key. It's data that must be stored in a reversibly-encrypted format, but which end users should not be able to retrieve. For the sake of argument, let's say it's the password for a service account that the system uses to execute batch jobs, or a stored credit card number used to make purchases by a customer. In both cases, the system needs the ability to obtain the original value, but end users do not - they just refer to the value abstractly, such as "use this service account to execute this task", or "I want to make a purchase using the card whose number ends in 1234". I am using examples from other systems that I've looked at in the past here as opposed to the current one, so please don't get stuck on those two specific cases. Assume that there is a requirement that the system be able to decrypt the data, but that it should not be accessible to end users after it's originally entered.

The combination of those two aspects of the system means that if a user can obtain the encrypted version of the second type of data, they can feed it into their cookie, and the system will happily display to them the decrypted value, because it doesn't know any better and because the same key is used for both types of data.
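To make that confused-deputy flow concrete, here is a minimal toy sketch. This is my own illustration, not the actual system: a deterministic keystream XOR stands in for AES (never use that construction for real data), and all of the names are made up.

```python
import hashlib
import os

KEY = os.urandom(32)  # one key shared by every encrypted field -- the flaw

def keystream(length: int) -> bytes:
    # Deterministic keystream derived from the key (toy stand-in for AES;
    # do not use this construction for real data).
    out = b""
    block = hashlib.sha256(KEY).digest()
    while len(out) < length:
        out += block
        block = hashlib.sha256(block).digest()
    return out[:length]

def toy_encrypt(plaintext: bytes) -> bytes:
    return bytes(p ^ s for p, s in zip(plaintext, keystream(len(plaintext))))

toy_decrypt = toy_encrypt  # XOR is its own inverse

# Non-sensitive client state the server decrypts and echoes back:
theme_cookie = toy_encrypt(b"Autumn")

# Sensitive value encrypted under the SAME key elsewhere in the system:
stored_secret = toy_encrypt(b"hunter2")

# An attacker who obtains stored_secret just feeds it in as a cookie;
# the server cannot tell the two apart, so it decrypts and displays it.
leaked = toy_decrypt(stored_secret)
```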

Now, normally users can't actually obtain this sensitive data, even in encrypted format - there are OS- and database-level permissions that are supposed to prevent that - but over time, people have a tendency to forget why certain things were configured the way they were, someone makes a configuration change, and people who shouldn't be able to get to the encrypted data are suddenly able to.

--- Proposal/Question ---

Of course, one of my main recommendations is going to be "don't use the same key for multiple types of data!!", but because my background is in systems engineering, one of my interests is building redundant safety features into a system design so that any one failure or human error won't completely compromise the system.
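One way to make "different keys for different types of data" cheap enough that developers will actually do it is to derive per-purpose keys from a single master key. A minimal sketch, assuming HMAC-SHA256 as the derivation function (a real system should use a vetted KDF such as HKDF from RFC 5869; the names here are hypothetical):

```python
import hashlib
import hmac

def derive_key(master_key: bytes, purpose: str) -> bytes:
    # One master key, one derived key per purpose string. Ciphertext
    # produced under one derived key is useless to a decryption oracle
    # that operates under a different one.
    return hmac.new(master_key, b"derive:" + purpose.encode(), hashlib.sha256).digest()

master = bytes(32)  # placeholder; generate randomly in practice

state_key = derive_key(master, "client-state")
secret_key = derive_key(master, "stored-secrets")
```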

Part 1 of my proposal is that encrypted values should be wrapped in some kind of metadata to identify their type, as well as delimit where the plaintext value starts and ends (to help prevent block-shuffling attacks that change the length of the desired plaintext, e.g. if someone makes a mistake and uses ECB mode instead of CBC). Some really basic examples of the plaintext might be:

<password>12345? That's the same combination as on my luggage!</password>
versus
<customThemeName>Autumn</customThemeName>

...or...

[value&&type::password&&length::52]12345? That's the same combination as on my luggage![/value]
versus
[value&&type::customThemeName&&length::6]Autumn[/value]

This is obviously going to involve an increase in storage size. For example, using the "Autumn" example and XML-style wrapper, with a block size of 128 bits, the ciphertext balloons from (size of IV + 16 bytes) to (size of IV + 48 bytes). The benefit I see is that it allows the application to check that the data it has just decrypted is actually of the type it expects, to prevent other types of data from being returned to the user, and possibly to generate an alert if it was expecting e.g. the name of a custom webpage theme but found a service account password instead. There is a whole side-topic here related to making sure that mechanism isn't itself exploitable, but I will set that aside because then this email would be even longer.
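As a sketch of what that check might look like, using the bracket-style wrapper from above (the helper names are hypothetical, and the framing happens entirely on the plaintext side - applied just before encryption, verified just after decryption):

```python
import re

def wrap(value_type: str, plaintext: bytes) -> bytes:
    # Frame the plaintext with its type and exact length before encryption.
    header = "[value&&type::%s&&length::%d]" % (value_type, len(plaintext))
    return header.encode() + plaintext + b"[/value]"

def unwrap(expected_type: str, framed: bytes) -> bytes:
    # After decryption, refuse to return data whose wrapper is malformed,
    # whose type is wrong, or whose length doesn't match.
    m = re.match(rb"\[value&&type::([A-Za-z]+)&&length::(\d+)\]", framed)
    if m is None or not framed.endswith(b"[/value]"):
        raise ValueError("malformed wrapper")
    value_type, length = m.group(1).decode(), int(m.group(2))
    body = framed[m.end():-len(b"[/value]")]
    if value_type != expected_type:
        raise ValueError("decrypted a %r where a %r was expected"
                         % (value_type, expected_type))
    if len(body) != length:
        raise ValueError("length mismatch")
    return body
```

The type mismatch branch is where the "generate an alert" hook would go.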

As I said, the application I'm asking about uses strong encryption for which there are no known practical known-plaintext attacks. However, as soon as I thought of the above concept, I realized that if a practical known-plaintext attack were ever discovered for AES, this scheme would be setting the system up for compromise, because every value of a given type would have a highly predictable first block of plaintext.

So part 2 of my proposal is that the plaintext include a throwaway section *before* the actual data of concern, which has a length of one block, and is filled with random (or at least pseudo-random) data that is uniquely-generated for each encrypted value. As long as CBC mode was used, it seems to me that it would be sort of like a second IV (a "reinitialization vector", I guess? :)), except that it would never be stored outside of the ciphertext, would be immediately discarded upon decryption, and never intentionally reused. In other words, while I see it as serving a purpose somewhat related to an IV, I also see them as being complementary to each other instead of redundant - the IV helps ensure that identical plaintext encrypts to different ciphertext, and the "RIV" helps guard against future known-plaintext attacks when used with CBC encryption mode.
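The throwaway-block step itself is only a few lines. A sketch with hypothetical names, operating on the plaintext immediately before encryption and immediately after decryption:

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes

def add_throwaway_block(plaintext: bytes) -> bytes:
    # One fresh random block per value: the first ciphertext block never
    # covers predictable data, and the prefix is never stored separately
    # or intentionally reused.
    return os.urandom(BLOCK_SIZE) + plaintext

def strip_throwaway_block(decrypted: bytes) -> bytes:
    # Discard the throwaway block immediately after decryption.
    return decrypted[BLOCK_SIZE:]
```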

This is probably stating the obvious, but in the case of one of the examples above, if the encryption used were AES or another algorithm with a block size of 128 bits, the plaintext modified according to both parts 1 and 2 of my proposal would look like this:

XXXXXXXXXXXXXXXX[value&&type::password&&length::52]12345? That's the same combination as on my luggage![/value]

...where XXXXXXXXXXXXXXXX represents 16 bytes of random/pseudorandom values from 0-255. This whole long set of plaintext would then be encrypted, appended to the IV, and finally stored.

To hammer home the storage downside of this, the result is that what was originally going to be potentially an 80-byte value (16-byte IV + 64-byte ciphertext) has swollen to 128 bytes (16-byte IV + 112-byte ciphertext). The password in this example is unusually long, so the relative overhead here is modest; for shorter values, the scheme will generally double or triple the size of the stored data, and of course it increases CPU time for encryption and decryption.
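For anyone who wants to check those numbers, the arithmetic is easy to script, assuming PKCS#7-style padding (which always adds at least one byte) and a 16-byte IV prepended to the ciphertext:

```python
def stored_size(plaintext_len: int, iv_len: int = 16, block: int = 16) -> int:
    # PKCS#7 padding rounds up to the next block boundary, adding a full
    # extra block when the plaintext is already block-aligned.
    padded = (plaintext_len // block + 1) * block
    return iv_len + padded

# The 52-byte password alone:
bare = stored_size(52)
# 16-byte throwaway block + 35-byte header + 52-byte password + 8-byte footer:
wrapped = stored_size(16 + 35 + 52 + 8)
```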

However, at least superficially, I think it greatly reduces the likelihood of sensitive data being obtained by people who shouldn't have it, because it provides a means of allowing the application to perform "output validation" before displaying values to the user, and (unless I'm mistaken) it guards against future known-plaintext attacks on the encryption algorithm. In combination with using different encryption keys for different types of data (and of course using unique IVs for each encrypted value), it seems to me that it makes it much less likely for any one mistake to compromise the system.

--- Wrapping up ---

I can definitely see an argument that this is a bunch of over-engineering, but the type of flaw I'm describing is ridiculously widespread in commercial software. I'd already run into it myself, and then when I went to a SANS advanced web pen-testing course there was an entire day dedicated to it and related defects.

I feel like I need to be able to make some recommendations to developers who aren't cryptography experts that will let them design and build systems that have a degree of built-in redundancy so that the failure of any one design element related to the encrypted data won't result in a complete compromise of that system. I need to be able to come up with a simple recipe for that, and it can't be any one silver bullet (like "use different keys for different types of data"), because single mechanisms will always fail at some point. It also can't be something unrealistic like "become an awesome cryptographer before you design any system that uses cryptographic algorithms", because I know that's not going to happen and I have to account for the reality of the situation. I feel like it needs to be 3-5 overlapping design philosophies/patterns that are easy to remember, in addition to the ones that are well-known like "use existing, well-vetted cryptographic algorithms instead of writing your own".

From a cryptography perspective, is this a stupid idea? Are there better ways to achieve my goal? Am I introducing any new weaknesses into the system? Has any element of this topic been done to death and I just didn't know what to search for?

In any case, if anyone got to the end of this rambling email, thank you.

- Ben Lincoln
_______________________________________________
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography
