On Thu, Oct 1, 2009 at 4:37 PM, Kyle Hamilton <aerow...@gmail.com> wrote:
> The question becomes more one of: Why does the OP need to keep the
> HMAC computation key secret? Is the OP using the same key for HMAC
> calculation as for symmetric encryption? (If so, why? If not, why
> does the OP need to keep the verification key secret?)

I'm not using the same key for HMAC as for symmetric encryption. It was my understanding that the HMAC key needed to be secret even when used for verification. (Otherwise people could forge an HMAC.) But you ask a fair question (i.e. why keep the key secret?) and I'll try to answer. I'm not using an HMAC for message authentication, but for a more indirect purpose. (I apologize in advance for the length; explaining "why" requires a little context.)

I am working with a backup system where the files are stored and referenced by their hash (similar to how git stores its data). I would like it to be able to store those files in encrypted form. For this system to work, two different encryptions of the same file with the same key must produce exactly the same result. This rules out using a random initialization vector (IV).

With the exception of SIV (which isn't yet widely implemented), my understanding is that reusing an IV for two different messages opens up avenues of cryptanalytic attack. Thus we want to use a different IV for each file, but the same IV when the file contents are the same (*). The obvious choice is to use a cryptographic hash of the file's contents as the IV. It will be the same when the file contents are the same, but different when the file contents are different.

Now that works great except for one thing. For simplicity of implementation, we would like to store that calculated IV in clear text as a header at the front of the encrypted file and then just use one of the block-cipher modes that remains secure even when the IV is known to the attacker. However, storing the IV in the clear opens up a dictionary attack if the attacker can easily compute the hash used to derive the IV (**). To get around this, I was planning on using an HMAC with a secret key (so the attacker couldn't compute his own hashes), but passing that key on the command line leaks it.
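
The difference between the plain-hash IV and the HMAC-based IV can be sketched as follows. This is only an illustration under assumptions that are not fixed anywhere above: Python, SHA-256 as the hash, and a 16-byte IV (one AES-sized block) obtained by truncating the digest. All function names are made up for the example.

```python
import hashlib
import hmac

IV_LEN = 16  # assumption: 128-bit cipher block, e.g. AES


def plain_hash_iv(data: bytes) -> bytes:
    """Unkeyed IV: anyone who can guess the file can recompute it."""
    return hashlib.sha256(data).digest()[:IV_LEN]


def keyed_iv(key1: bytes, data: bytes) -> bytes:
    """Keyed IV: HMAC-SHA256 of the contents, truncated to one block."""
    return hmac.new(key1, data, hashlib.sha256).digest()[:IV_LEN]


def guess_contents(stored_iv: bytes, candidates):
    """Dictionary attack on a clear-text IV header.

    Hashes each candidate file with the public, unkeyed hash and
    compares it with the IV stored in front of the ciphertext.
    Returns the matching candidate, or None if nothing matches.
    """
    for guess in candidates:
        if hmac.compare_digest(plain_hash_iv(guess), stored_iv):
            return guess
    return None
```

With an unkeyed IV, `guess_contents(plain_hash_iv(f), candidates)` finds `f` whenever it is among the candidates. With `keyed_iv(key1, f)` in the header, the same search comes back empty, because the attacker cannot compute the HMAC without `key1`. Both IVs stay deterministic, so identical files still encrypt identically.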

To summarize: I would have done this:

    let IV = Hash(file)
    in concatenate(IV, encrypt(IV, Key2, file))

Except that sending the IV in the clear opens a dictionary attack on the contents of the file. So to fix that I was going to do this:

    let IV = HMAC(Key1, file)
    in concatenate(IV, encrypt(IV, Key2, file))

Except that doesn't gain anything over the previous one if Key1 isn't kept secret. So now what I'm thinking is to do this:

    let IV = encrypt(one-block-mode(***), Key1, Hash(file))
    in concatenate(IV, encrypt(IV, Key2, file))

Again, sorry for the length, but I hope that demystifies some of why I want to keep the key secret.

(*) Yes, this would open up a dictionary attack if the attacker could use the backup system as an encryption oracle. Fortunately, due to external factors, in our situation the attacker can't inject arbitrary data into the backup system and thus can't use it as an oracle.

(**) This is the same dictionary attack as in "(*)" except that now the attacker is attacking the IV and doesn't need to use the backup system as an oracle. He can just run the hash algorithm himself.

(***) We don't need a real block-cipher mode here (it's basically ECB mode on a single block) if we ensure that the hash length matches the cipher block length.

Michael D. Adams
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org