Re: KDF API review, round 2

Jamil Nimeh Mon, 27 Nov 2017 22:17:23 -0800

Hi Mike, I know I said you made arguments in favor of specifying the keys up front in init, but I'm still really uncomfortable with this. It's been bothering me all day. Comments below:


On 11/27/2017 10:09 AM, Michael StJohns wrote:

On 11/27/2017 1:03 AM, Jamil Nimeh wrote:
One additional topic for discussion: Late in the week we talked about the current state of the API internally and one item to revisit is where the DerivationParameterSpec objects are passed. It was brought up by a couple people that it would be better to provide the DPS objects pertaining to keys at the time they are called for through deriveKey() and deriveKeys() (and possibly deriveData).
Originally we had them all grouped in a List in the init method. One reason for needing it up there was to know the total length of material to generate. If we can provide the total length through the AlgorithmParameterSpec passed in via init() then things like:
Key deriveKey(DerivationParameterSpec param);
List<Key> deriveKeys(List<DerivationParameterSpec> params);
become possible. To my eyes at least it does make it more clear what DPS you're processing since they're provided at derive time, rather than the caller having to keep track in their heads where in the DPS list they might be with each successive deriveKey or deriveKeys calls. And I think we could do away with deriveKeys(int), too.
See above - the key stream is logically produced in its entirety before any assignment of that stream is made to any cryptographic objects because the mixins (except for the round differentiator) are the same for each key stream production round. Simply passing in the total length may not give you the right result if the KDF requires a per component length (and it should to defeat (5) or it should only produce a single key).
From looking at 800-108, I don't see any place where the KDF needs a per-component length. It looks like it takes L (total length) as an input and that is applied to each round of the PRF. HKDF takes L up-front as an input too, though it doesn't use it as an input to the HMAC function itself. For TLS 1.3 that component length becomes part of the context info (HkdfLabel) through the HKDF-Expand-Label function...and it's only doing one key for a given label which is also part of that context specific info, necessitating an init() call. Seems like the length can go into the APS provided via init (for those KDFs that need it at least) and you shouldn't need a DPS list up-front.
HKDF and SP800-108 only deal with the creation of the key stream and ignore the issues with assigning the key stream to cryptographic objects. In the TLS version of HKDF, the L value is mandatory and only a single object is assigned per init/call to the KDF. An HSM can look at the HKDF label information and set the appropriate policies for the assigned cryptographic object (because if any of the label data changes, the entire key stream changes). That's not the case for the raw HKDF nor for any KDF that allows for multiple objects to be extracted out of a single key stream. Hence the per-component length values.

So enforce a no-zero-length key policy in your provider code. You probably can't affect the internals of the HSM, but you should be able to prevent it in the provider code. I can't get away from the feeling that this could be dealt with in other ways besides specifying all this up-front.
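
To make the provider-side idea concrete, here is a minimal sketch of such a check. Everything in it (the class name, the Kind enum, the Request type) is hypothetical and invented for illustration; none of it is part of the proposed KDF API or of any existing provider.

```java
import java.util.List;

// Hypothetical sketch: a provider rejects zero-length keys before deriving.
final class ZeroLengthKeyPolicySketch {
    // Assumed distinction between secret key output and public data (e.g. an IV).
    enum Kind { KEY, DATA }

    // One requested output object: what it is and how many bytes it consumes.
    static final class Request {
        final Kind kind;
        final int lengthBytes;

        Request(Kind kind, int lengthBytes) {
            this.kind = kind;
            this.lengthBytes = lengthBytes;
        }
    }

    // Throws if any request would turn key material into a zero-length key,
    // which is the length-twiddling step in the extraction scenario.
    static void check(List<Request> requests) {
        for (Request r : requests) {
            if (r.lengthBytes < 0) {
                throw new IllegalArgumentException("negative length");
            }
            if (r.kind == Kind.KEY && r.lengthBytes == 0) {
                throw new IllegalArgumentException("zero-length key requested");
            }
        }
    }
}
```

A check like this only rejects the degenerate zero-length case; it does not by itself bind the requested layout to the key stream.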

Ideally, there should be a complete object spec for each object to be generated that is part of the mixins (label and context) for any KDF. That allows an HSM to rely upon the object spec when setting policy controls for each generated object - and incidentally allows for a KDF to generate both public and non-public data in a secure way.

Between different generations of keystreams do you expect to have different sets of policy controls? The KDF API has no way for you to set those things so I would assume those would be pretty static, or at least controlled outside the KDF API. If so, why is the KDF API concerning itself with how some HSM sets its policy on objects it makes?

So as long as you allow for the specification of all of the production objects as part of the .init() I'm good. A given KDF might not require this - but I can't see any way of fixing the current KDFs to work in HSMs without something like this.
As far as your (5) scenario goes, I can see how you can twiddle the lengths to get the keystream output with zero-length keys and large IV buffers. But that scenario really glosses over what should be a big hurdle and a major access control issue that stands outside the KDF API: That the attacker shouldn't have access to the input keying material in the first place. Protect the input keying material properly and their attack cannot be done.
Let me give you an example. I'm running an embedded HSM - to protect TLS keys and to do all of the crypto. An attacker compromises the TLS server and now has access to the HSM. No problem - I'm going to notice if the attacker starts exfiltrating large amounts of data from the server (e.g. copies of the TLS in the clear but possibly reencrypted data stream) so this isn't a threat. Or is it? Smart attacker does an extraction attack on the TLS 1.2 (and earlier) KDF and turns all of the key stream material into IV material and exports it from the HSM. The attacker now has the much smaller key material so he can send a few messages with those keys and allow for the passive external interception of the traffic and decryption thereof without the risk of detection of all that traffic being sent. Alternately, I can place the key material in a picture via steganography and publish it as part of the server data.

"If the attacker compromises a TLS server" is the part that gets me...we're using external software bugs/security holes as a justification to make the KDF API in ways that I think are less clear to the consumer, to cover one class of providers (HSMs).

The idea is to protect extraction of the key material from an HSM _*even from authorized users of that key material*_.

That may well be a goal for the HSM, to be solved by the HSM or the provider that front-ends it. I do not see that as something to be solved by the KDF API.

KDFs don't currently do this well. Adding the overall length and per component length stuff as well as a per component spec to the data used to derive the key stream means that 1) changes to any of those change the entire key stream, 2) the per component spec data may be used by the security module policy engine to enforce restrictions and 3) because of (1) and (2) calling the KDF a second time gets me exactly the same objects rather than just the same key stream. The last isn't very important in a software based security domain, but turns out to have real implications for policy enforcing security modules.
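
One way to picture point (1) is to serialize the overall length and every per-component spec into the context bytes that get mixed into the KDF. The sketch below does only that encoding step. The byte layout, the class name and the Component type are assumptions made up for illustration; no existing KDF standard or the proposed API defines this format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch: fold the output layout into the KDF context input.
final class ComponentContextSketch {
    // One production object: an algorithm name and a length in bytes.
    static final class Component {
        final String algorithm;
        final int lengthBytes;

        Component(String algorithm, int lengthBytes) {
            this.algorithm = algorithm;
            this.lengthBytes = lengthBytes;
        }
    }

    // Assumed layout: total length, component count, then for each component
    // its name length, name bytes and length. All integers are 4-byte big-endian.
    static byte[] encodeContext(List<Component> components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int total = 0;
        for (Component c : components) {
            total += c.lengthBytes;
        }
        writeInt(out, total);
        writeInt(out, components.size());
        for (Component c : components) {
            byte[] name = c.algorithm.getBytes(StandardCharsets.UTF_8);
            writeInt(out, name.length);
            out.write(name, 0, name.length);
            writeInt(out, c.lengthBytes);
        }
        return out.toByteArray();
    }

    private static void writeInt(ByteArrayOutputStream out, int v) {
        out.write(v >>> 24);
        out.write(v >>> 16);
        out.write(v >>> 8);
        out.write(v);
    }
}
```

With an encoding like this, a 32-byte key plus 16-byte IV and a 0-byte key plus 48-byte IV share the same total length but yield different context bytes, so a KDF that mixes the context in would produce unrelated key streams for the two layouts.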

But there aren't KDFs that take individual component lengths as inputs, so alterations to individual key component lengths don't change the keystream (unless someone decides to write a KDF that does, but none that I've seen do). With the way the KDF API is taking shape, there's no enforcement that you get the same objects - none of that is locked to the instance. It can change between inits. If you reinitialize with the same key and KDF parameters, whether you specify all objects up front or one at a time in derive calls you can still ask for a different set of output objects. And changing lengths on various objects won't matter because HKDF, Counter-mode KDF, Feedback-mode KDF...none of those care a whit about individual component lengths. All they care about is the total length of the keystream (and HKDF only cares about that to make sure it's not more than 255 * Hmac length).
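
For reference, the HKDF case can be sketched directly from the Expand step in RFC 5869. This is an illustrative sketch, not the proposed API: the class name is made up, HmacSHA256 is an arbitrary choice, and the all-zero PRK in main is a placeholder.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch of HKDF-Expand (RFC 5869) with HMAC-SHA256; class name is hypothetical.
final class HkdfExpandSketch {
    private static final String HMAC_ALG = "HmacSHA256";

    // T(i) = HMAC(PRK, T(i-1) | info | i); the output is the first L bytes of T(1) | T(2) | ...
    static byte[] expand(byte[] prk, byte[] info, int totalLength) {
        try {
            Mac mac = Mac.getInstance(HMAC_ALG);
            int hashLen = mac.getMacLength();
            // L is only used for this bound and for truncation, never as HMAC input.
            if (totalLength < 0 || totalLength > 255 * hashLen) {
                throw new IllegalArgumentException("L must be between 0 and 255 * HashLen");
            }
            mac.init(new SecretKeySpec(prk, HMAC_ALG));
            byte[] okm = new byte[totalLength];
            byte[] t = new byte[0];
            int offset = 0;
            for (int counter = 1; offset < totalLength; counter++) {
                mac.update(t);
                mac.update(info);
                mac.update((byte) counter);
                t = mac.doFinal();
                int n = Math.min(t.length, totalLength - offset);
                System.arraycopy(t, 0, okm, offset, n);
                offset += n;
            }
            return okm;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC unavailable or key rejected", e);
        }
    }

    public static void main(String[] args) {
        byte[] prk = new byte[32];   // placeholder PRK, illustration only
        byte[] info = {1, 2, 3};
        byte[] longer = expand(prk, info, 64);
        byte[] shorter = expand(prk, info, 48);
        // The shorter output is a prefix of the longer one.
        System.out.println(Arrays.equals(Arrays.copyOf(longer, 48), shorter));
    }
}
```

Because L never enters the HMAC input, how the caller later slices the stream into keys and IVs has no effect on the bytes produced.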

This gets worse when you realize that the KDF key is under it all either a HASH, HMAC or CMAC key and all of those algorithms produce public data. Ideally you need a way of preventing a KDF key from calling the raw HASH/HMAC/CMAC functions directly (and vice versa).

I don't see how we'd prevent this in software. If I've got a key as input to a KDF (a SecretKey) there's no way to prevent it being used by anything else that takes a SecretKey. If you need to prevent that in hardware then that seems like a concern for your provider or the HSM itself.

I would rather see the DPS provided in the deriveKey. It couples what you want out with the call that makes the object and it makes a lot more sense to keep those two together than try to remember where in the submitted list of DPS objects you are.
95% of the time this will be a call to produce a single key. 4% of the time it will be a call to produce multiple keys. Only 1% of the time will it need to intermix key, data and object productions. Anybody who is doing that is going to write a wrapper around this class to make sure they get the key and data production order correct for each call. So I'm not all that bothered by keeping the complexity as a price for keeping flexibility.
You could have a Key deriveKey(Key k, DerivationParameterSpec param) for some things like TLS1.3 (where you can only make a single call to derive key between inits), but then you'd also need at least a byte[] deriveData(Key k, DerivationParameterSpec param) and an Object deriveObject(Key k, DerivationParameterSpec param).
I don't think those are necessary. If you're just doing HKDF-Expand (for the HKDF-Expand-Label TLS 1.3 key derivation) then you can provide the input key, label and max length and any other context info that goes into that HkdfLabel structure...all of that would go into init(). Then provide the key alg and desired length via the DPS at deriveKey time. Any subsequent keys in the TLS 1.3 key schedule would need a new init call anyway since the labels change and possibly the output length.
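
As a rough illustration of why each TLS 1.3 key needs its own init, the HkdfLabel structure from RFC 8446 section 7.1 can be serialized as below. The class and method names are hypothetical; only the wire layout (uint16 length, the "tls13 " label prefix, length-prefixed label and context) comes from the RFC.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Sketch of the HkdfLabel encoding from RFC 8446, section 7.1 (names are hypothetical).
final class HkdfLabelSketch {
    // struct { uint16 length; opaque label<7..255>; opaque context<0..255>; } HkdfLabel;
    static byte[] encode(int length, String label, byte[] context) {
        byte[] fullLabel = ("tls13 " + label).getBytes(StandardCharsets.US_ASCII);
        if (length < 0 || length > 0xFFFF) {
            throw new IllegalArgumentException("length must fit in a uint16");
        }
        if (fullLabel.length < 7 || fullLabel.length > 255) {
            throw new IllegalArgumentException("label must be 7..255 bytes with prefix");
        }
        if (context.length > 255) {
            throw new IllegalArgumentException("context must be at most 255 bytes");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(length >>> 8);           // uint16 length, high byte
        out.write(length & 0xFF);          // uint16 length, low byte
        out.write(fullLabel.length);       // one-byte label length prefix
        out.write(fullLabel, 0, fullLabel.length);
        out.write(context.length);         // one-byte context length prefix
        out.write(context, 0, context.length);
        return out.toByteArray();
    }
}
```

The resulting bytes are what HKDF-Expand-Label passes as the info input to HKDF-Expand, so a different label or a different output length means different info and therefore a fresh init in the flow described above.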
Over the next day or so I'm going to have to make some final decisions on this API as there are internal projects that are waiting on this API to proceed. I'm already past the cut-off date I set, but I recognize these discussions are important to have and I appreciate the input you and others have provided.
--Jamil
Reading this last I think I've lost the context. Here's where I think we are:
1) Get instance gets the default configuration of a given KDF (and that default will be attached to the instance name definition)
2) .setParameter() may be used to update the KDF configuration - once.
3) .init() takes at least the key, it may optionally take a set of derivation parameters. The derivation parameters provided in .init() are intended for use in forming the label and context mixins for the KDF. They may provide - for example - the total length of the key stream, the objects to be derived, the length of the objects, protection parameters for each of the objects etc.
4) A KDF generates a free-running or fixed length key stream depending on the derivation parameters (e.g. if "L" is not a mixin to the KDF then it is free-running and may produce as much key stream as desired or if the production object specifications are not part of the derivation mixins).
Doing (4) is mostly not a good idea, but someone might want to do this. In that case it may make the most sense to just allow them to do deriveData(int length) calls as the only function (a keyed PRNG basically).
Re the last version of your api - if you add the .setParameter() / .getParameter() calls to both KeyDerivation and KeyDerivationSpi I think I'm happy with this part of the API. I'm wondering if we should talk about KeyAgreement though.
