This is a review request for the draft of a new Key Derivation API.  The goal of this API is to provide a framework that makes KDF algorithms like HKDF, TLS-PRF, PBKDF2 and so forth publicly accessible.  We also plan to provide an SPI that lets third parties create their own implementations of KDFs in their providers, rather than trying to force them into KeyGenerators, SecretKeyFactories and the like.

Rather than stuff this email full of the specification text (since it is likely to get quite a few iterations of comments and comments-to-comments), I have placed the API both in simple text form and as a Javadoc at the following locations:

spec: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/kdfspec.01.txt

javadoc: http://cr.openjdk.java.net/~jnimeh/reviews/kdfspec/javadoc.01/

Both have the same content, so just use whichever is friendlier for your eyes.

In addition, I have opened up the JEP as well:


Thanks to those who have contributed to very early internal drafts of this so far, and thanks in advance to those who will be contributing comments going forward.


Most of the following suggestions (and please take them as such regardless of any directive language) represent things I've had to do manually that I'd really prefer to do in a real key derivation API.  A few are related to how to keep things securely stored in an HSM.

Add a .reset() method to KeyDerivation.  Call this to clear the state of the KDF.

Add an .initialize(List<DerivationParameterSpec>, SecretKey masterSecret) method.  Remove the argument to deriveKey and deriveKeys.  This plays with the stuff to follow, but basically, a KDF may need all of the per-key derivation input to calculate the total length of the output key stream as an internal input to the KDF before ever emitting a single key.   Also - how exactly were you planning on keying the KDF? I guess you could pass that in in the KeyDerivation.getInstance() call or as part of the algorithmParameter but.... probably makes more sense to keep the KDF instance key-free to allow for reuse.
Well, let's get the easy one out of the way.  As you suspected I planned to pass the SecretKey in via AlgorithmParameterSpec.  The three classes unfortunately didn't show that.  Maybe on the next iteration I can put an HkdfParameterSpec in there just as a sample so folks can see where the key comes in.  The reason I went that way was because the goal was to provide all algorithm parameters at instantiation time, and the SecretKey was just another input.  I don't know if just making the KDF key-free would be enough for reuse, at least not for all cases.  Thinking about HKDF and TLS 1.3 for instance, the key is the same for a collection of keys (like the client and server app traffic master keys that come from the master secret) - what changes are the other inputs to HKDF.
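To make the HKDF point concrete, here is a minimal sketch of HKDF-Expand (RFC 5869) over HMAC-SHA256 using only javax.crypto.Mac.  The class name HkdfExpandSketch and the plain-string info labels are my own placeholders; real TLS 1.3 uses a structured HkdfLabel encoding that is not reproduced here, and none of this is the proposed KeyDerivation API.  It only shows one key producing several independent outputs as the info input changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class HkdfExpandSketch {
    /** HKDF-Expand: T(i) = HMAC(prk, T(i-1) || info || i), output is T(1) || T(2) || ... */
    static byte[] expand(byte[] prk, byte[] info, int length) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(prk, "HmacSHA256"));
            int hashLen = mac.getMacLength();
            if (length <= 0 || length > 255 * hashLen) {
                throw new IllegalArgumentException("length out of range for HKDF-Expand");
            }
            byte[] okm = new byte[length];
            byte[] previous = new byte[0];          // T(0) is the empty string
            int offset = 0;
            for (int i = 1; offset < length; i++) {
                mac.update(previous);
                mac.update(info);
                mac.update((byte) i);               // single-octet block counter
                previous = mac.doFinal();           // doFinal also resets the Mac
                int n = Math.min(previous.length, length - offset);
                System.arraycopy(previous, 0, okm, offset, n);
                offset += n;
            }
            return okm;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] masterSecret = new byte[32];         // placeholder key material, all zeros
        // Same key, different "other inputs": the labels below are illustrative only.
        byte[] client = expand(masterSecret, "client traffic".getBytes(StandardCharsets.UTF_8), 32);
        byte[] server = expand(masterSecret, "server traffic".getBytes(StandardCharsets.UTF_8), 32);
        System.out.println(java.util.Arrays.equals(client, server)); // the two outputs differ
    }
}
```

Note that HKDF-Expand does not mix the requested length into the PRF input, so a shorter output is a prefix of a longer one for the same key and info.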

Yup - but that's easily handled through the new initialization call - which again matches the way Cipher, Signature and KeyAgreement do things.   Simplifying (??) the interface just to make one use case easier is probably not a great tradeoff.

One issue that came up on an early internal rev of the API was that we didn't want to separate instantiation and initialization, so all the inputs to the KDF now come in at getInstance time through AlgorithmParameterSpecs, rather than doing getInstance/init/... like KeyAgreement does.  I wonder if it would be OK to still have an init (and a reset as you wanted) method so we can provide new inputs top-to-bottom into the KDF object.  All the getInstance forms would stay more or less the same, so there's no way to make a KDF object without it being in an initialized state.  But when you need new inputs you don't have to make a new object.  I like being able to reuse the object and reset it to its starting state.  I don't know if the folks that brought up the instance/init issue would have a problem with that.  I think we're still adhering to the spirit of what they wanted to see since getInstance still gives you a fully initialized object.

As I noted in my other email, that's not the general form of a contract in the JCA.

That's a bit different than what you're talking about with your initialize method, I kinda birdwalked a bit.  Let me ask a couple questions: When you proposed initialize(), were you envisioning that applications would always need to call it before derive*?

Yes, init would always need to be called before you begin derives. A KDF call would require an instantiation (where you pass the parameters of the mechanism - think about the SP800-108 chinese menu of stuff that needs to be specified), an initialization (to get the keys in place and to set up the queue of derivation material and calculate the total length L if needed by the KDF), and then one or more derive commands to convert the derived key stream bytes into the keys.
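A rough sketch of that instantiate / initialize / derive split, to show where each piece of state would live.  Everything here is hypothetical: ThreeStageKdfSketch is not the draft KeyDerivation class, a List<Integer> of output lengths stands in for List<DerivationParameterSpec>, and the expansion step is a toy HMAC counter construction rather than any standardized KDF.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class ThreeStageKdfSketch {
    private final String prfAlgorithm;                 // stage 1: mechanism parameters
    private final Deque<Integer> pending = new ArrayDeque<>();
    private byte[] stream;                             // derived key stream, set by initialize
    private int offset;

    ThreeStageKdfSketch(String prfAlgorithm) {
        this.prfAlgorithm = prfAlgorithm;
    }

    /** Stage 2: key the KDF, queue the derivations, and compute the total length L. */
    void initialize(List<Integer> outputLengths, byte[] masterSecret) {
        reset();
        int total = 0;
        for (int len : outputLengths) {
            if (len <= 0) throw new IllegalArgumentException("length must be positive");
            total += len;
            pending.add(len);
        }
        stream = expand(masterSecret, total);
    }

    /** Stage 3: hand out the next slice of the stream. */
    byte[] deriveNext() {
        if (stream == null || pending.isEmpty()) {
            throw new IllegalStateException("not initialized or nothing left to derive");
        }
        int len = pending.remove();
        byte[] out = Arrays.copyOfRange(stream, offset, offset + len);
        offset += len;
        return out;
    }

    /** Clears all per-use state; the mechanism parameters survive for reuse. */
    void reset() {
        if (stream != null) Arrays.fill(stream, (byte) 0);
        stream = null;
        offset = 0;
        pending.clear();
    }

    // Toy expansion (an assumption, not a standard): block i = HMAC(secret, i || total).
    private byte[] expand(byte[] secret, int total) {
        try {
            Mac mac = Mac.getInstance(prfAlgorithm);
            mac.init(new SecretKeySpec(secret, prfAlgorithm));
            byte[] out = new byte[total];
            int pos = 0;
            for (int counter = 1; pos < total; counter++) {
                mac.update(ByteBuffer.allocate(8).putInt(counter).putInt(total).array());
                byte[] block = mac.doFinal();
                int n = Math.min(block.length, total - pos);
                System.arraycopy(block, 0, out, pos, n);
                pos += n;
            }
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Usage would be new ThreeStageKdfSketch("HmacSHA256"), then initialize(Arrays.asList(16, 32), secret), then two deriveNext() calls; a later initialize() with new inputs reuses the same object.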

You could merge the init and derive states for simple things, but each call has a specific set of things it's trying to accomplish and it's probably better to keep the init/derive stages separate.

  Or did you really mean "may" and an implementation would have to go back and generate more material if they exhausted everything they knew about?  Given your changes to deriveKey(s) it looked more like you intended to know the total length up-front, since there's no other way to say some arbitrary next key is of a specific length with no argument to deriveKey[s].

Take a look at SP800-108 - the counter mode KDF. There's a parameter L in there that is mostly not optional.  It's there to ensure that if you twiddle with the length of the total output the entire underlying keystream changes.   This turns out to be a critical security aspect of these things especially if you're doing any of this in an HSM.
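For anyone who wants to see the effect of L, here is a small sketch of the SP 800-108 counter-mode construction, K(i) = PRF(K_I, [i]_2 || Label || 0x00 || Context || [L]_2), with HMAC-SHA256 as the PRF.  The 32-bit widths for the counter and for L are choices I made for the sketch (the standard leaves them as parameters), and the class and method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class Sp800108CounterSketch {
    /** Counter-mode KDF; lengthBits is the total output length L, in bits. */
    static byte[] derive(byte[] keyIn, byte[] label, byte[] context, int lengthBits) {
        if (lengthBits <= 0 || lengthBits % 8 != 0) {
            throw new IllegalArgumentException("L must be a positive multiple of 8 in this sketch");
        }
        try {
            Mac prf = Mac.getInstance("HmacSHA256");
            prf.init(new SecretKeySpec(keyIn, "HmacSHA256"));
            int h = prf.getMacLength();
            byte[] out = new byte[lengthBits / 8];
            int pos = 0;
            for (int i = 1; pos < out.length; i++) {
                prf.update(ByteBuffer.allocate(4).putInt(i).array());          // [i]_2
                prf.update(label);
                prf.update((byte) 0x00);                                       // separator
                prf.update(context);
                prf.update(ByteBuffer.allocate(4).putInt(lengthBits).array()); // [L]_2
                byte[] block = prf.doFinal();
                int n = Math.min(h, out.length - pos);
                System.arraycopy(block, 0, out, pos, n);
                pos += n;
            }
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[32];                      // placeholder key, all zeros
        byte[] label = "demo".getBytes(StandardCharsets.UTF_8);
        byte[] ctx = "ctx".getBytes(StandardCharsets.UTF_8);
        byte[] short256 = derive(key, label, ctx, 256);
        byte[] long512 = derive(key, label, ctx, 512);
        // Because L is an input to every PRF call, the first 32 bytes are not shared.
        System.out.println(Arrays.equals(short256, Arrays.copyOf(long512, 32)));
    }
}
```

Since L feeds every PRF invocation, asking for a longer total output does not simply extend the shorter stream, which is why the KDF needs to know the full set of derivations before emitting the first key.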

If you did want the total length of all keys/data/objects to be supplied before derivation, what if we were to supply that to the getInstance calls?
It's better to let the underlying function do the calculation as the number of key stream bytes might not actually be what you think it is for the assignment to a key.

A similar idea was put forth internally, but we decided to hold off on it and wait for some feedback from the field.  So if we were to go this route then getInstance calls might look like this:

public static KeyDerivation getInstance(String alg, AlgorithmParameterSpec params, List<DerivationParameterSpec> deriveParams);
public static KeyDerivation getInstance(String alg, String provider, AlgorithmParameterSpec params, List<DerivationParameterSpec> deriveParams);
public static KeyDerivation getInstance(String alg, Provider provider, AlgorithmParameterSpec params, List<DerivationParameterSpec> deriveParams);

You end up with a ready-to-use KDF right from the get-go.
Still not buying it.   This removes one line of user code at the cost of flexibility, and gives a model that looks nothing like those for Cipher, KeyAgreement, Signature etc.

If we're going that route though, *and* we try to make it reusable, then we have to specify both KDF parameters and derivation parameters in an initialize call.  If reusability isn't all that important then we don't have reset and initialize and you just make a new KDF every time.  I like the former approach better, myself - though I would like to know how others feel about it.

Rename DerivedKeyParameterSpec to DerivationParameterSpec and provide an algorithm name for "IV" or "Cleartext".  See below for .deriveData()
I think we could do that.  Those don't sound like names that would be a problem.  But maybe we go with an even more generic name like "data" or "raw".  Cleartext sounds too much like plaintext/ciphertext kind of lingo and IV is use specific.

Yup.  Names are easy.

deriveKey() emits the next key in the sequence using the data stream to key conversion rules.

deriveKeys() emits as many keys left in the stream to the next data derivation or the defined end of stream based on the input specs.  deriveKeys(int num) derives the next num keys.
Minor clarification: "...emits as many keys left in the stream to the next data /*or Object*/ derivation" (I'm asking, not stating, just making sure I understand what you intended).
deriveKeys will derive objects that are subclasses of java.security.Key.   If the next object is specifying raw bytes or an object that is not a Key, then it stops.  So your language is correct.  Maybe cleaner to say "many keys left in the stream until the next non-Key derivation"

Add a .deriveData() with a return class of byte[].   This gets a portion of the derived data stream in the clear. E.g. an IV.

Add a .deriveObject() with a return class of Object.  The returned object may not be an instance of java.security.Key. This takes the derived data stream and converts it into the object type specified by the derivation parameter.  In a hardware security module, this might be a reference to a secured set of data or even a confidential IV.
Again, just want to make sure I understand fully: So in a case where I want a given output to be an Object, I would provide a DerivationParameterSpec with an alg of..."Object" (?), a byte length, and Object-specific parameters provided through the "params" argument to the DPS?

Working this through, but it should be a Class being specified with a constructor of a byte array plus a length.

All of the derive methods throw an InvalidParameterSpecException if the next derivation parameter doesn't match the calling method (e.g. trying to deriveData when the parameter spec says emit a key).
Makes sense to me.  Are you OK with IllegalStateException when you try to derive a key after all elements in List<DerivationParameterSpec> have been previously returned?

Maybe - I was trying to figure out the nuances of returning a RuntimeException vs a normal exception in this case.  That's probably OK but I want to think about it and re-read the RuntimeException general contract stuff.
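To make the two failure modes easier to discuss, here is a sketch of how the derive methods could check the head of the derivation queue.  All names are hypothetical: Spec is a stripped-down stand-in for DerivationParameterSpec, the "data" algorithm name is one of the generic names floated above, and the key stream is simply passed in rather than produced by a real KDF.

```java
import java.security.spec.InvalidParameterSpecException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

final class TypedDeriveSketch {
    /** Hypothetical spec: algorithm "data" means raw bytes, anything else names a key algorithm. */
    static final class Spec {
        final String algorithm;
        final int length;
        Spec(String algorithm, int length) { this.algorithm = algorithm; this.length = length; }
        boolean isData() { return "data".equals(algorithm); }
    }

    private final Deque<Spec> queue;
    private final byte[] stream;
    private int offset;

    TypedDeriveSketch(List<Spec> specs, byte[] stream) {
        int total = 0;
        for (Spec s : specs) total += s.length;
        if (total > stream.length) throw new IllegalArgumentException("stream too short");
        this.queue = new ArrayDeque<>(specs);
        this.stream = stream.clone();
    }

    SecretKey deriveKey() throws InvalidParameterSpecException {
        Spec s = next(false);
        return new SecretKeySpec(take(s.length), s.algorithm);
    }

    byte[] deriveData() throws InvalidParameterSpecException {
        return take(next(true).length);
    }

    // Checks the head of the queue without consuming it on a mismatch.
    private Spec next(boolean wantData) throws InvalidParameterSpecException {
        Spec s = queue.peek();
        if (s == null) {
            throw new IllegalStateException("all derivations have already been returned");
        }
        if (s.isData() != wantData) {
            throw new InvalidParameterSpecException("next derivation is '" + s.algorithm + "'");
        }
        return queue.remove();
    }

    private byte[] take(int len) {
        byte[] out = Arrays.copyOfRange(stream, offset, offset + len);
        offset += len;
        return out;
    }
}
```

In this sketch a mismatched call leaves the queue untouched, so the caller can recover by invoking the right method; whether the real API should behave that way is an open design question.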

In KeyDerivation, change the output class of the deriveKey to java.security.Key; similarly, for deriveKeys change the output to List<Key>.   Basically, it's possible to use the output of a KDF stream to derive private keys and this should be supported. It's occasionally helpful (but not very often) for two devices to share a key pair that they create through a key agreement process (e.g. two HSMs acting as backup to each other).  Alternately, consider adding a "public KeyPair deriveKeyPair()" method.
Changing the output to Key makes sense.  For the HSM to HSM use case you're mentioning, that seems better suited to the KeyAgreement API, wouldn't it?

The way this works is:

1) HSM1 generates a key pair
2) HSM2 generates a key pair
3) HSM1 and 2 exchange the public keys from the key pair
4) HSM1 calculates ECDH (HSM1private, HSM2public) while HSM2 calculates ECDH (HSM2private, HSM1public) to get the same shared secret S.
5) HSM1 and HSM2 both instantiate a well-defined KDF and initialize it to emit a private key.  Basically, for an EC private key on P-256 the KDF emits 320 bits, which are converted into a big integer and then reduced modulo the order n of the curve to get the common private key p for the two boxes.  Depending on the mixin data for the KDF this could be one of a few different pairs or could be regenerated as needed if there were problems with storage.  Both sides can calculate the common public key P as P = pG where G is the basepoint of the curve.
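The last step can be sketched with the existing JCA classes.  This assumes the 320 figure is a bit count (40 bytes, i.e. 256 bits plus the 64 extra bits used by the FIPS 186-4 B.4.1 method) and uses that method's reduction d = (c mod (n - 1)) + 1 so the result is never zero; an actual HSM scheme may reduce differently.  The class name is made up, and computing the public point P = pG is omitted here.

```java
import java.math.BigInteger;
import java.security.AlgorithmParameters;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.interfaces.ECPrivateKey;
import java.security.spec.ECGenParameterSpec;
import java.security.spec.ECParameterSpec;
import java.security.spec.ECPrivateKeySpec;

final class SharedEcKeySketch {
    /** Turns 40 bytes of KDF output into a P-256 private key. */
    static ECPrivateKey privateKeyFrom(byte[] kdfOutput) {
        if (kdfOutput.length != 40) {
            throw new IllegalArgumentException("expected 320 bits of KDF output");
        }
        try {
            // Look up the P-256 domain parameters from the provider.
            AlgorithmParameters ap = AlgorithmParameters.getInstance("EC");
            ap.init(new ECGenParameterSpec("secp256r1"));
            ECParameterSpec curve = ap.getParameterSpec(ECParameterSpec.class);

            BigInteger n = curve.getOrder();
            BigInteger c = new BigInteger(1, kdfOutput);          // unsigned big-endian
            BigInteger d = c.mod(n.subtract(BigInteger.ONE)).add(BigInteger.ONE);

            return (ECPrivateKey) KeyFactory.getInstance("EC")
                    .generatePrivate(new ECPrivateKeySpec(d, curve));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("EC support unavailable", e);
        }
    }
}
```

Both boxes running this over the same KDF output arrive at the same private scalar, which is the property the backup scenario relies on.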

In other news I want to add an ability to generate a public key if all you have is a private key.  We fixed this for RSA in PKCS11 a few years ago and it would be nice to carry it forward here.
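For software RSA keys the idea looks roughly like the sketch below, which relies on the key being an RSAPrivateCrtKey (the CRT form carries the public exponent).  This is not the PKCS11 fix itself, just an illustration with existing JCA classes; the class name is invented, and a plain RSAPrivateKey without CRT fields is rejected because the public exponent cannot be read from it.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.interfaces.RSAPrivateCrtKey;
import java.security.spec.RSAPublicKeySpec;

final class PublicFromPrivateSketch {
    /** Rebuilds the RSA public key from a CRT-form private key. */
    static PublicKey rsaPublicFrom(PrivateKey priv) {
        if (!(priv instanceof RSAPrivateCrtKey)) {
            throw new IllegalArgumentException("need an RSAPrivateCrtKey to recover the public exponent");
        }
        RSAPrivateCrtKey crt = (RSAPrivateCrtKey) priv;
        try {
            return KeyFactory.getInstance("RSA").generatePublic(
                    new RSAPublicKeySpec(crt.getModulus(), crt.getPublicExponent()));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA KeyFactory unavailable", e);
        }
    }
}
```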

Consider adding a marker interface javax.crypto.MasterSecret (subclass of javax.crypto.SecretKey) and using that as the class for the initialize call argument.
Maybe OBE since I'm proposing to pass the secret through the AlgorithmParameterSpec.  If not, I would recommend not subclassing it from SecretKey.  The Secret won't always be a key.  For an alg like PBKDF2 it would be a password.

Point taken.  The MasterSecret markers would be useful to indicate key material that can only be used with a kdf - there are some attacks that can be avoided if you can keep master secrets from being able to key the underlying PRF and vice versa.  E.g. if the master secret is an HMAC secret key, the output of HMAC is public data, but the output of the KDF with HMAC is private (key) data.

Let me think about the password input case.  I think it actually is a SecretKey - and using the SecretKeySpec with PBKDF2 as the algorithm to get a SecretKey object could make sense.
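As a sketch of what that could look like: wrapPassword() below puts the UTF-8 password bytes into a SecretKeySpec tagged "PBKDF2", which is the idea floated above and not an established convention, and pbkdf2() shows how the same password is fed to PBKDF2 today through SecretKeyFactory and PBEKeySpec for comparison.  The class and method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class PasswordKeySketch {
    /** Treats a (non-empty) password as key material; the "PBKDF2" tag is an assumption. */
    static SecretKey wrapPassword(char[] password) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(password));
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes);
        return new SecretKeySpec(bytes, "PBKDF2");   // SecretKeySpec rejects an empty array
    }

    /** Today's route: the password goes in through PBEKeySpec, not through a SecretKey. */
    static byte[] pbkdf2(char[] password, byte[] salt, int iterations, int keyBits) {
        try {
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(
                    new PBEKeySpec(password, salt, iterations, keyBits)).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA256 unavailable", e);
        }
    }
}
```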

I'm happy to provide an edited .java file with these proposed changes - but not until at least next Monday; I'm on travel.

Let me know your thoughts on this and maybe I can cook up another rev of the spec/javadoc.  Thanks again for the feedback!



