On Thu, Oct 1, 2009 at 4:37 PM, Kyle Hamilton <aerow...@gmail.com> wrote:
> The question becomes more one of: Why does the OP need to keep the
> HMAC computation key secret? Is the OP using the same key for HMAC
> calculation as for symmetric encryption?  (If so, why?  If not, why
> does the OP need to keep the verification key secret?)

I'm not using the same key for HMAC as for symmetric encryption.

It was my understanding that the HMAC key needed to be secret even
when used for verification.  (Otherwise people could forge an HMAC.)

But you ask a fair question (i.e. Why keep the key secret?) and I'll
try to answer.  I'm not using an HMAC for message authentication, but
for a more indirect purpose.  (I apologize in advance for the length;
explaining "why" requires a little context.)

I am working on a backup system in which files are stored and
referenced by their hash (similar to how git stores its data).  I
would like it to be able to store those files in encrypted form.
For this to work, two encryptions of the same file with the same key
must produce exactly the same result.
This rules out using a random initialization vector (IV).

With the exception of SIV (which isn't yet widely implemented), my
understanding is that reusing an IV for two different messages opens
up avenues of cryptanalytic attack.  Thus we want to use a
different IV for each file, but use the same IV when the file contents
are the same (*).  The obvious choice is to use a cryptographic hash
of the file's contents as the IV.  It will be the same when the file
contents are the same, but different when the file contents are
different.
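
A minimal sketch of that idea, using only the Python standard library.
SHA-256 and a 16-byte block size are assumptions chosen for illustration;
the text above does not fix a particular hash or cipher, and the name
content_iv is hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # assumed block size in bytes (e.g. a 128-bit block cipher)


def content_iv(data: bytes) -> bytes:
    """Derive a deterministic IV from the file contents."""
    # SHA-256 is an assumed choice; its digest is truncated to one block.
    return hashlib.sha256(data).digest()[:BLOCK_SIZE]


iv_a = content_iv(b"same contents")
iv_b = content_iv(b"same contents")
iv_c = content_iv(b"different contents")

print(iv_a == iv_b)  # identical contents give an identical IV
print(iv_a == iv_c)  # different contents give a different IV
```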

Now that works great except for one thing.  For simplicity of
implementation, we would like to store that calculated IV in clear
text as a header at the front of the encrypted file and then just use
one of the block-cipher modes that remains secure even when the IV is
known to the attacker.  However, storing the IV in the clear opens up
a dictionary attack if the attacker can easily compute the hash used
to compute the IV (**).
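
To make the concern concrete, here is a rough sketch of how such a
dictionary attack could look when the IV is an unkeyed hash stored in the
clear.  The hash choice, the truncation to 16 bytes, the candidate list and
the function names are all assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 16  # assumed IV length in bytes


def content_iv(data: bytes) -> bytes:
    # Unkeyed hash of the contents, truncated to one block (assumed scheme).
    return hashlib.sha256(data).digest()[:BLOCK_SIZE]


def guess_plaintext(header_iv: bytes, candidates):
    """Return the first candidate whose hash matches the cleartext IV, else None."""
    for candidate in candidates:
        if content_iv(candidate) == header_iv:
            return candidate
    return None


# The attacker only reads the IV header of a stored file...
observed_iv = content_iv(b"salary: 50000")

# ...and hashes guesses locally, without needing the encryption key.
dictionary = [b"salary: 40000", b"salary: 50000", b"salary: 60000"]
recovered = guess_plaintext(observed_iv, dictionary)
print(recovered)
```

The attack needs neither Key2 nor access to the backup system; it only
needs the guessed contents to be in the attacker's candidate list.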

To get around this, I was planning on using a secret key with an HMAC
(so the attacker couldn't compute his own hashes), but passing that
key on the command line leaks that secret key.
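
For reference, a minimal sketch of the keyed variant using Python's
standard hmac module.  HMAC-SHA-256, the truncation to 16 bytes, the
placeholder key values and the name keyed_iv are assumptions for
illustration, not part of the scheme described here.

```python
import hashlib
import hmac

BLOCK_SIZE = 16  # assumed IV length in bytes


def keyed_iv(key1: bytes, data: bytes) -> bytes:
    """Derive the IV as HMAC(Key1, file), truncated to one block."""
    return hmac.new(key1, data, hashlib.sha256).digest()[:BLOCK_SIZE]


key1 = b"\x01" * 32  # placeholder key; a real key would be random and secret

iv_same_1 = keyed_iv(key1, b"contents")
iv_same_2 = keyed_iv(key1, b"contents")
iv_other_key = keyed_iv(b"\x02" * 32, b"contents")
unkeyed = hashlib.sha256(b"contents").digest()[:BLOCK_SIZE]

print(iv_same_1 == iv_same_2)     # still deterministic for a given key
print(iv_same_1 == iv_other_key)  # changes when the key changes
print(iv_same_1 == unkeyed)       # differs from the plain hash
```

In this sketch the key is only ever held in process memory, so nothing
about it appears in a process listing.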

To summarize:

I would have done this: let IV = Hash(file) in concatenate(IV,
encrypt(IV, Key2, file)).
Except that sending IV in the clear opens a dictionary attack on the
contents of file.

So to fix that I was going to do this: let IV = HMAC(Key1, file) in
concatenate(IV, encrypt(IV, Key2, file)).
Except that doesn't gain anything over the previous one if Key1 isn't
kept secret.

So now what I'm thinking is to do this: let IV =
encrypt(one-block-mode(***), Key1, Hash(file)) in concatenate(IV,
encrypt(IV, Key2, file)).
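
All three variants above share the same stored layout, concatenate(IV,
ciphertext).  A small sketch of writing and reading that layout follows;
the 16-byte IV length and the pack/unpack names are assumptions, and the
ciphertext is treated as opaque bytes because no particular cipher is
fixed here.

```python
BLOCK_SIZE = 16  # assumed IV length in bytes


def pack(iv: bytes, ciphertext: bytes) -> bytes:
    """Prepend the cleartext IV header to the ciphertext."""
    if len(iv) != BLOCK_SIZE:
        raise ValueError("IV must be exactly one block long")
    return iv + ciphertext


def unpack(blob: bytes):
    """Split a stored blob back into (iv, ciphertext)."""
    if len(blob) < BLOCK_SIZE:
        raise ValueError("blob is too short to contain an IV header")
    return blob[:BLOCK_SIZE], blob[BLOCK_SIZE:]


iv = bytes(range(BLOCK_SIZE))
body = b"opaque ciphertext bytes"  # stands in for encrypt(IV, Key2, file)
stored = pack(iv, body)
print(unpack(stored) == (iv, body))
```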

Again, sorry for the length, but I hope that de-mystifies some of why
I want to keep the key secret.

(*) Yes, this would open up a dictionary attack if the attacker could
use the backup system as an encryption oracle.  Fortunately, due to
external factors, in our situation the attacker can't inject arbitrary
data into the backup system and thus can't use it as an oracle.

(**) This is the same dictionary attack as in "(*)" except that now
the attacker is attacking the IV and doesn't need to use the backup
system as an oracle.  He can just run the hash algorithm himself.

(***) We don't need a block-cipher mode here (it's basically ECB mode)
if we ensure that the hash length matches the cipher block length.

Michael D. Adams
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
