Hello,

triggered by a talk at Passwordconf'15 by Joseph Birr-Pixton about slow PBKDF2 implementations (where Java was particularly bad)
https://www.youtube.com/watch?v=k_szwKBuNBw
https://bsideslv2015.sched.org/event/3uke/pbkdf2-performance-matters
I had a look at the source, because I remembered that the PBKDF2 implementation re-uses an initialized HMAC and that the HMAC pre-computes the init vectors.

As it turns out, however, the HMAC implementation only pre-computes the padded keys, not the result of the first compression function. This is unfortunate not only for performance reasons, but also because it has to keep the key XORs (k_ipad and k_opad) in memory, which is an added security risk.

So what I would recommend is to add the compressions to the initialisation (in this case: pre-compute the state of the inner and the outer digest).

The problem with this, however, is that the only official interface for getting a pre-calculated digest state is clone(), and this would increase the memory pressure. So what would really help is a private method for getting and setting the state.

Here is the clone-based implementation, with comments hinting in the direction of state extraction:

    engineInit(..) {
        ...
        md.update(k_opad);
        mdOuter = md.clone();   // md.getState();
        md.reset();
        md.update(k_ipad);
        mdInner = md.clone();   // md.getState();
        null(k_opad, k_ipad);   // wipe the key pads, goooood!
        engineReset();
    }

    engineReset() {
        md = mdInner.clone();   // md.setState(mdInnerState);
    }

This has the big advantage that you can also remove the "first" flag from the update() methods:

    engineUpdate() {
        md.update(bytes);
    }

and the final step looks like this:

    engineDoFinal() {
        ...
        byte[] tmp = md.digest();
        md = mdOuter.clone();   // md.setState(mdOuterState);
        md.update(tmp);
        md.digest(tmp, 0, tmp.length);
        return tmp;
    }

Without the state construct we would have a few digest clones per iteration, which might be quite bad, but this can be avoided with the state access.

An alternative would be to code the HMAC without referring to a full-fledged hash primitive (using only the compression function), but I guess that would make validation of the construct harder.
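
To make the clone-based variant above concrete, here is a minimal, self-contained sketch that uses only the public MessageDigest API. The class name PrecomputedHmacSha256, the fixed choice of SHA-256 and the method names are assumptions made for illustration; this is not the JDK's HMAC code. It still pays for two clone() calls per MAC, which is what a private getState()/setState() pair would avoid.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical sketch of an HMAC with pre-computed inner and outer digests. */
final class PrecomputedHmacSha256 {
    private static final int BLOCK_SIZE = 64; // SHA-256 block length in bytes

    private final MessageDigest mdInner; // digest that has absorbed key ^ ipad
    private final MessageDigest mdOuter; // digest that has absorbed key ^ opad
    private MessageDigest md;            // working digest for the current message

    PrecomputedHmacSha256(byte[] key)
            throws NoSuchAlgorithmException, CloneNotSupportedException {
        MessageDigest base = MessageDigest.getInstance("SHA-256");
        // Keys longer than one block are hashed first (RFC 2104).
        byte[] k = key.length > BLOCK_SIZE ? base.digest(key) : key.clone();
        byte[] kIpad = new byte[BLOCK_SIZE];
        byte[] kOpad = new byte[BLOCK_SIZE];
        for (int i = 0; i < BLOCK_SIZE; i++) {
            byte b = i < k.length ? k[i] : 0; // zero-pad the key to one block
            kIpad[i] = (byte) (b ^ 0x36);
            kOpad[i] = (byte) (b ^ 0x5c);
        }
        base.update(kOpad);
        mdOuter = (MessageDigest) base.clone();
        base.reset();
        base.update(kIpad);
        mdInner = (MessageDigest) base.clone();
        // The key material is no longer needed in the clear.
        Arrays.fill(k, (byte) 0);
        Arrays.fill(kIpad, (byte) 0);
        Arrays.fill(kOpad, (byte) 0);
        reset();
    }

    /** Start a new message from the pre-computed inner digest. */
    void reset() throws CloneNotSupportedException {
        md = (MessageDigest) mdInner.clone();
    }

    /** No "first" flag needed: the key pad is already absorbed. */
    void update(byte[] data) {
        md.update(data);
    }

    byte[] doFinal() throws CloneNotSupportedException {
        byte[] tmp = md.digest();             // inner hash
        md = (MessageDigest) mdOuter.clone(); // continue from key ^ opad
        md.update(tmp);
        byte[] out = md.digest();             // outer hash = HMAC value
        reset();                              // ready for the next message
        return out;
    }
}
```

A caller would construct the object once per key and then call update()/doFinal() repeatedly, as a PBKDF2 loop does. How much work each clone() actually saves depends on whether the provider's digest compresses a full block as soon as it is supplied.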

For a PBKDF2 construct this would nearly double the speed (with the usual hashes, each iteration then needs two compression calls instead of four), which is important in order not to give attackers too much of a head start, but it might also speed up other repeated uses of HMACs.

What do you think? (BTW: I had asked about access to the state before, so maybe it really is a good idea to have an interface for this?)

Regards,
Bernd