Re: KDF API review, round 2

Michael StJohns Mon, 20 Nov 2017 08:18:55 -0800

Apologies in advance for top posting and a need to be a little pedantic about KDFs. I'll have some comments inline below as well.

KDFs aren't well understood, but people think they are. The key stream generation part is pretty straightforward (keyed PRBG), but the interaction of how the key stream is generated and how the key stream is assigned to actual cryptographic objects is not. Here's why:

1) KDFs are repeatable. Given the exact same inputs (key, mixin data) they produce the same key stream.

2) Any change in the inputs changes ALL of the key stream.

3) Unless the overall length property is included, then changing the length of the key stream will not change the prefix (e.g. if the original call was for 10 bytes and a second call was for 20, the first 10 bytes of both calls will produce the exact same key stream data).

4) The general format of each round of key stream generation is something like PRF (master key, mixins), where mixins are the concatenation of at least a label and context and a value to differentiate each round (a counter or the previous round's output for example). Including L in the mixin prevents the property described in (3) above. Including a length for each subcomponent as a mixin prevents the property described in (5) below.

5) Unless the length for each derived object is included in the mixin, then it is possible to move the assignment of key stream bytes between objects. For example, both TLS (1.2 and before) and IPSEC use KDFs that generate non-secret IV material along with secret session key material. This is less important for software-only KDFs as both the secret key material and the IV material are in the JVM memory domain. This is very important if you're trying to keep your secret key material secure in an HSM.
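Property (3), and the way mixing in L defeats it, can be sketched in Java. This is only an illustration under stated assumptions: it uses an HMAC-SHA256 counter-mode construction chosen for the demo, and the class name KdfPrefixDemo, the method keyStream and the includeL flag are hypothetical, not part of any proposed API.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

class KdfPrefixDemo {
    // Hypothetical helper: round i = HMAC(masterKey, counter_i || mixins [|| L]).
    // The construction is an assumption made for this demo only.
    static byte[] keyStream(byte[] masterKey, byte[] mixins, int length, boolean includeL) {
        try {
            Mac prf = Mac.getInstance("HmacSHA256");
            prf.init(new SecretKeySpec(masterKey, "HmacSHA256"));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int counter = 1; out.size() < length; counter++) {
                prf.update(ByteBuffer.allocate(4).putInt(counter).array()); // round differentiator
                prf.update(mixins);                                         // label and context
                if (includeL) {
                    // Total requested length L as a mixin.
                    prf.update(ByteBuffer.allocate(4).putInt(length).array());
                }
                byte[] block = prf.doFinal();
                out.write(block, 0, block.length);
            }
            // Truncate the last round to the requested length.
            return Arrays.copyOf(out.toByteArray(), length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the 10-byte request is a prefix of the 20-byte request.
    static boolean shortIsPrefixOfLong(boolean includeL) {
        byte[] key = new byte[32]; // all-zero demo key, never use in practice
        byte[] mixins = "demo label".getBytes(StandardCharsets.UTF_8);
        byte[] ten = keyStream(key, mixins, 10, includeL);
        byte[] twenty = keyStream(key, mixins, 20, includeL);
        return Arrays.equals(ten, Arrays.copyOf(twenty, 10));
    }

    public static void main(String[] args) {
        System.out.println("prefix shared without L: " + shortIsPrefixOfLong(false));
        System.out.println("prefix shared with L:    " + shortIsPrefixOfLong(true));
    }
}
```

Without L the two requests share their first 10 bytes; with L in the mixin every round input changes, so the prefix no longer matches.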

Example: a given TLS session may need two 256-bit AES keys and two 128-bit IVs. That is a requirement for 96 bytes of key stream (if I've got my calculation correct). We have the HSM produce this (see the PKCS11 calling sequence for example) and we get out the IVs. An attacker who has access to the HSM (which may or may not be on the same machine as the TLS instantiation) can call the derivation function with new output parameters (but with the same master key and mixins) that specify only IV material and have the function output the same key stream bytes that were previously assigned to the secret key material in the IV output. A very easy key extraction attack.

This is why TLS 1.3 only does single outputs per KDF call and makes the length of that output a mandatory mixin. An HSM can also look at the labels and make a determination as to whether an object need be protected (key material) or in the clear (IV).

Given (3) and (5) I believe that both L and l[i] (subcomponent length) may need to be provided BEFORE any key material is produced, which argues for input during the initialization phase.
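The extraction attack above, and the effect of binding per-output lengths into the mixins, can be sketched with a toy stand-in for the HSM. Everything here is an assumption made for illustration: the class IvRelabelDemo, its methods, the "keys first, then IVs" layout and the HMAC-SHA256 counter-mode stream are hypothetical and do not model PKCS11 or any real HSM interface.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

class IvRelabelDemo {
    // Full key stream as it exists inside the toy "HSM" (keys first, then IVs).
    static byte[] keyStream(byte[] masterKey, byte[] mixins,
                            int keyBytes, int ivBytes, boolean bindLengths) {
        int total = keyBytes + ivBytes;
        try {
            Mac prf = Mac.getInstance("HmacSHA256");
            prf.init(new SecretKeySpec(masterKey, "HmacSHA256"));
            ByteBuffer stream = ByteBuffer.allocate(total + prf.getMacLength());
            for (int counter = 1; stream.position() < total; counter++) {
                prf.update(ByteBuffer.allocate(4).putInt(counter).array());
                prf.update(mixins);
                if (bindLengths) {
                    // Per-component lengths as mixins (the l[i] of the discussion).
                    prf.update(ByteBuffer.allocate(8).putInt(keyBytes).putInt(ivBytes).array());
                }
                stream.put(prf.doFinal());
            }
            return Arrays.copyOf(stream.array(), total);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // What the toy "HSM" hands back in the clear: only the bytes assigned to IVs.
    static byte[] exportIvs(byte[] masterKey, byte[] mixins,
                            int keyBytes, int ivBytes, boolean bindLengths) {
        byte[] stream = keyStream(masterKey, mixins, keyBytes, ivBytes, bindLengths);
        return Arrays.copyOfRange(stream, keyBytes, keyBytes + ivBytes);
    }

    // True if a second call asking for 96 bytes of "IV" returns the 64 secret key bytes.
    static boolean attackRecoversKeys(boolean bindLengths) {
        byte[] master = new byte[32]; // all-zero demo key, never use in practice
        byte[] mixins = "session".getBytes(StandardCharsets.UTF_8);
        // Honest call: 2 x 32-byte keys stay inside, 2 x 16-byte IVs come out.
        byte[] secretKeys = Arrays.copyOf(keyStream(master, mixins, 64, 32, bindLengths), 64);
        // Attacker call: same master key and mixins, but all 96 bytes declared as IV.
        byte[] leaked = exportIvs(master, mixins, 0, 96, bindLengths);
        return Arrays.equals(secretKeys, Arrays.copyOf(leaked, 64));
    }

    public static void main(String[] args) {
        System.out.println("keys leak without bound lengths: " + attackRecoversKeys(false));
        System.out.println("keys leak with bound lengths:    " + attackRecoversKeys(true));
    }
}
```

With the lengths bound into every round, the attacker's re-labelled request produces an unrelated key stream, so the exported bytes no longer coincide with the protected keys.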



On 11/20/2017 5:12 AM, Jamil Nimeh wrote:

On 11/19/2017 12:45 PM, Michael StJohns wrote:
On 11/17/2017 1:07 PM, Adam Petcher wrote:
On 11/17/2017 10:04 AM, Michael StJohns wrote:
On 11/16/2017 2:15 PM, Adam Petcher wrote:
So it seems like they could all be supplied to init. Alternatively, algorithm names could specify more concrete algorithms that include the mode/PRF/etc. Can you provide more information to explain why these existing patterns won't work in this case?
What I need to do is provide a lifecycle diagram, but it's hard to do in text. But basically, the .getInstance() followed by .setParameters() builds a concrete engine while the .init() initializes that engine with a key and the derivation parameters. Think about a TLS 1.2 instance - the PRF is selected once, but the KDF may be used multiple times.
This is the information I was missing. There are two sets of parameters, and the first set should be fixed, but the second set should be changed on each init.
I considered the mode/PRF/etc stuff but that works for things like Cipher and Signature because most of those have exactly the same pattern. For the KDF pattern we've got fully specified KDFs (e.g. TLS 1.1 and before, IPSEC), almost fully specified KDFs (TLS 1.2 and HKDF need a PRF) and then the SP800 style KDFs which are defined to be *very* flexible. So translating that into a naming convention is going to be restrictive and may not cover all of the possible approaches. I'd rather do it as an algorithm parameter instead. With a given KDF implementation having a default if nothing is specified during instantiation.
I agree that this is challenging because there is so much variety in KDFs. But I don't think that SP 800-108 is a good example of something that should be exposed as an algorithm in JCA, because it is too broad. SP 800-108 is more of a toolbox that can be used to construct KDFs. Particular specializations of SP 800-108 are widely used, and they will get names that can be used in getInstance. For example, HKDF-Expand is a particular specialization of SP 800-108.
So I think the existing pattern of using algorithm names to specify concrete algorithms should work just as well in this API as it does in the rest of JCA. Of course, more flexibility in the API is a nice feature, but supporting this level of generality may be out of scope for this effort.
The more I think about it the more I think you're mostly right. But let's split this slightly as almost every KDF allows for the specification of the PRF. So
<kdfname>/<prf>    as the standard naming convention.
Or TLS13/HMAC-SHA256 and HKDF/HMAC-SHA256 (which are different because of the mandatory inclusion of "L" in the derivation parameters and each component object for TLS13)
Still - let's include the .setParameters() call as a failsafe as looking forward I can see the need for flexibility rearing its ugly head (e.g. adding PSS parameters to RSA signatures way late in the game.....) and it does match the pattern for Signature so it's not a new concept. A given provider need not support the call, but it's there if needed.
Signature appears to have setParameter because the initSign and initVerify didn't have APS parameters in their method signatures. Since we're talking about providing APS objects through both getInstance() for those locked to the algorithm and init() for things like salts, info, etc. that can be changed on successive inits it seems like we're covered without the need for a setParameter method.

You're missing the point that setParameter() provides information used in all future calls to the signature generation, while init() provides data specifically for a given key stream production. In Signature you call .setParameter() to set up the PSS parameters (or use the defaults). Each subsequent call to initSign or initVerify uses those PSS parameters. The equivalent part of .init() in KeyDerivation is actually the calls to .update() in Signature as they provide the specific information for the production of the output key stream. In fact, setting up an HMAC signature instance and passing it the mixin data as part of a .update() is a way of producing the key stream round.


So equivalences:

KeyDerivation.getInstance(PRF) == Signature.getInstance(HMAC)
KeyDerivation.setParameters() == Signature.setParameter()

KeyDerivation.init(key, List<Parameters>) == concatenation of the results of multiple calls (each key stream round based on the needed output length) to [Signature.initSign(Key) followed by Signature.update(converttobytearray(List<Parameters>)) followed by Signature.sign()] to produce the key stream

KeyDerivation.deriveKey() == various calls to key or object factories with parts of the key stream (signature).

(Hmm.. I think I forgot to get back to this comment - a KDF key should be tagged differently than an HMAC key even though the underlying functions are the same. It shouldn't be possible to use an HMAC SecretKey (or an AES secret key) as a KDF master key and vice versa, basically because of the property that an HMAC output is by definition non-secret data while the key stream production is by definition secret. You want to make sure that it's not trivial to do this).

One additional topic for discussion: Late in the week we talked about the current state of the API internally and one item to revisit is where the DerivationParameterSpec objects are passed. It was brought up by a couple people that it would be better to provide the DPS objects pertaining to keys at the time they are called for through deriveKey() and deriveKeys() (and possibly deriveData).
Originally we had them all grouped in a List in the init method. One reason for needing it up there was to know the total length of material to generate. If we can provide the total length through the AlgorithmParameterSpec passed in via init() then things like:
Key deriveKey(DerivationParameterSpec param);
List<Key> deriveKeys(List<DerivationParameterSpec> params);
become possible. To my eyes at least it does make it more clear what DPS you're processing since they're provided at derive time, rather than the caller having to keep track in their heads where in the DPS list they might be with each successive deriveKey or deriveKeys calls. And I think we could do away with deriveKeys(int), too.

See above - the key stream is logically produced in its entirety before any assignment of that stream is made to any cryptographic objects because the mixins (except for the round differentiator) are the same for each key stream production round. Simply passing in the total length may not give you the right result if the KDF requires a per-component length (and it should to defeat (5) or it should only produce a single key).

95% of the time this will be a call to produce a single key. 4% of the time it will be a call to produce multiple keys. Only 1% of the time will it need to intermix key, data and object productions. Anybody who is doing that is going to write a wrapper around this class to make sure they get the key and data production order correct for each call. So I'm not all that bothered by keeping the complexity as a price for keeping flexibility.

You could have a Key deriveKey(Key k, DerivationParameterSpec param) for some things like TLS 1.3 (where you can only make a single call to derive key between inits), but then you'd also need at least a byte[] deriveData(Key k, DerivationParameterSpec param) and an Object deriveObject(Key k, DerivationParameterSpec param).



I think the most common pattern will be

.init(Key k, DerivationParameterSpec param) followed by .deriveKey()  or
.init(Key k, List<DerivationParameterSpec> params) followed by .deriveKeys()

but the other intermixed patterns are just as valid.
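The init-then-derive pattern, with the whole key stream produced at init time and L and each l[i] mixed in, might look roughly like the following toy class. It is a sketch under loud assumptions: ToyKeyDerivation, its method signatures and the use of plain integer lengths in place of DerivationParameterSpec objects are all hypothetical, and the HMAC-SHA256 counter-mode stream is picked only for the demo. It is not the API under review.

```java
import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ToyKeyDerivation {
    // Key stream slices, already assigned to outputs, waiting to be handed out.
    private final Deque<byte[]> pending = new ArrayDeque<>();

    private static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Hypothetical init: produces the entire key stream before any assignment.
    // Each length must be positive; lengths stand in for DerivationParameterSpec.
    void init(byte[] masterKey, byte[] label, List<Integer> lengths) {
        pending.clear();
        int total = 0;
        for (int len : lengths) total += len;
        try {
            Mac prf = Mac.getInstance("HmacSHA256");
            prf.init(new SecretKeySpec(masterKey, "HmacSHA256"));
            ByteBuffer stream = ByteBuffer.allocate(total + prf.getMacLength());
            for (int counter = 1; stream.position() < total; counter++) {
                prf.update(intBytes(counter));                      // round differentiator
                prf.update(label);                                  // label / context
                prf.update(intBytes(total));                        // L
                for (int len : lengths) prf.update(intBytes(len));  // each l[i]
                stream.put(prf.doFinal());
            }
            stream.flip();
            for (int len : lengths) {
                byte[] part = new byte[len];
                stream.get(part);
                pending.add(part);
            }
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hands out the next slice in the order fixed at init time.
    SecretKey deriveKey(String algorithm) {
        byte[] part = pending.poll();
        if (part == null) throw new IllegalStateException("no key stream left; call init again");
        return new SecretKeySpec(part, algorithm);
    }

    // Hands out all remaining slices.
    List<SecretKey> deriveKeys(String algorithm) {
        List<SecretKey> keys = new ArrayList<>();
        while (!pending.isEmpty()) keys.add(deriveKey(algorithm));
        return keys;
    }
}
```

Because the lengths are fixed at init, a caller cannot later ask for a different split of the same key stream; asking for one 32-byte key instead of two yields an unrelated key.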


--Jamil
