Hi Mike, I know I said you made arguments in favor of specifying the
keys up front in init, but I'm still really uncomfortable with this.
It's been bothering me all day. Comments below:
On 11/27/2017 10:09 AM, Michael StJohns wrote:
On 11/27/2017 1:03 AM, Jamil Nimeh wrote:
One additional topic for discussion: Late in the week we talked
about the current state of the API internally and one item to
revisit is where the DerivationParameterSpec objects are passed. It
was brought up by a couple people that it would be better to
provide the DPS objects pertaining to keys at the time they are
called for through deriveKey() and deriveKeys() (and possibly
deriveData).
Originally we had them all grouped in a List in the init method.
One reason for needing it up there was to know the total length of
material to generate. If we can provide the total length through
the AlgorithmParameterSpec passed in via init() then things like:
Key deriveKey(DerivationParameterSpec param);
List<Key> deriveKeys(List<DerivationParameterSpec> params);
become possible. To my eyes at least it does make it more clear
what DPS you're processing since they're provided at derive time,
rather than the caller having to keep track in their heads where in
the DPS list they might be with each successive deriveKey or
deriveKeys call. And I think we could do away with
deriveKeys(int), too.
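A minimal sketch of the derive-time shape described above, assuming the key stream has already been produced after init(). DeriveTimeSpecSketch and KeyRequest are hypothetical names used only for illustration (KeyRequest stands in for a DerivationParameterSpec); none of this is the proposed API itself.

```java
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

final class DeriveTimeSpecSketch {
    /** Hypothetical stand-in for a DerivationParameterSpec describing one key. */
    static final class KeyRequest {
        final String algorithm;
        final int lengthBytes;

        KeyRequest(String algorithm, int lengthBytes) {
            this.algorithm = algorithm;
            this.lengthBytes = lengthBytes;
        }
    }

    private final byte[] keystream; // assumed already produced after init()
    private int offset = 0;         // how much of the key stream has been assigned

    DeriveTimeSpecSketch(byte[] keystream) {
        this.keystream = keystream.clone();
    }

    /** Assigns the next lengthBytes of key stream to a key of the requested algorithm. */
    SecretKey deriveKey(KeyRequest request) {
        int remaining = keystream.length - offset;
        if (request.lengthBytes <= 0 || request.lengthBytes > remaining) {
            throw new IllegalArgumentException(
                "requested " + request.lengthBytes + " bytes, " + remaining + " available");
        }
        SecretKey key =
            new SecretKeySpec(keystream, offset, request.lengthBytes, request.algorithm);
        offset += request.lengthBytes;
        return key;
    }

    public static void main(String[] args) {
        DeriveTimeSpecSketch kdf = new DeriveTimeSpecSketch(new byte[48]);
        SecretKey enc = kdf.deriveKey(new KeyRequest("AES", 16));
        SecretKey mac = kdf.deriveKey(new KeyRequest("HmacSHA256", 32));
        System.out.println(enc.getAlgorithm() + " " + enc.getEncoded().length);
        System.out.println(mac.getAlgorithm() + " " + mac.getEncoded().length);
    }
}
```

The caller states what it wants at the point where the object is produced, so there is no list position to track between calls.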
See above - the key stream is logically produced in its entirety
before any assignment of that stream is made to any cryptographic
objects because the mixins (except for the round differentiator) are
the same for each key stream production round. Simply passing in
the total length may not give you the right result if the KDF
requires a per component length (and it should to defeat (5) or it
should only produce a single key).
From looking at 800-108, I don't see any place where the KDF needs a
per-component length. It looks like it takes L (total length) as an
input and that is applied to each round of the PRF. HKDF takes L
up-front as an input too, though it doesn't use it as an input to the
HMAC function itself. For TLS 1.3 that component length becomes part
of the context info (HkdfLabel) through the HKDF-Expand-Label
function...and it's only doing one key for a given label which is
also part of that context specific info, necessitating an init()
call. Seems like the length can go into the APS provided via init
(for those KDFs that need it at least) and you shouldn't need a DPS
list up-front.
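To make the 800-108 point concrete, here is a rough sketch of a counter-mode KDF with HMAC-SHA256 as the PRF. The class and method names are made up for illustration, and the 32-bit widths for the counter and for L are one permitted choice, not the only one. The total L goes into every PRF round, so changing the total changes the whole stream, while how the caller later slices that stream has no effect on it.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

final class CounterKdfSketch {
    // K(i) = HMAC(KI, [i]_32 || Label || 0x00 || Context || [L]_32), with L in bits.
    static byte[] keystream(byte[] ki, byte[] label, byte[] context, int totalBytes) {
        if (totalBytes <= 0 || totalBytes > Integer.MAX_VALUE / 8) {
            throw new IllegalArgumentException("unsupported total length");
        }
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(ki, "HmacSHA256"));
            int h = mac.getMacLength();
            int rounds = (totalBytes + h - 1) / h;
            ByteBuffer out = ByteBuffer.allocate(rounds * h);
            for (int i = 1; i <= rounds; i++) {
                mac.update(ByteBuffer.allocate(4).putInt(i).array());  // round counter
                mac.update(label);
                mac.update((byte) 0x00);
                mac.update(context);
                mac.update(ByteBuffer.allocate(4).putInt(totalBytes * 8).array()); // total L
                out.put(mac.doFinal());
            }
            return Arrays.copyOf(out.array(), totalBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] ki = new byte[32];
        byte[] label = "label".getBytes(StandardCharsets.US_ASCII);
        byte[] ctx = "ctx".getBytes(StandardCharsets.US_ASCII);
        byte[] a = keystream(ki, label, ctx, 64);
        byte[] b = keystream(ki, label, ctx, 48);
        // Different total L, so the first 48 bytes are not expected to match.
        System.out.println(Arrays.equals(Arrays.copyOf(a, 48), b));
    }
}
```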
HKDF and SP800-108 only deal with the creation of the key stream and
ignore the issues with assigning the key stream to cryptographic
objects. In the TLS version of HKDF, the L value is mandatory and
only a single object is assigned per init/call to the KDF. An HSM
can look at the HKDF label information and set the appropriate
policies for the assigned cryptographic object (because if any of the
label data changes, the entire key stream changes). That's not the
case for the raw HKDF nor for any KDF that allows for multiple objects
to be extracted out of a single key stream. Hence the per-component
length values.
So enforce a no-zero-length key policy in your provider code. You
probably can't affect the internals of the HSM, but you should be able
to prevent it in the provider code. I can't get away from the feeling
that this could be dealt with in other ways besides specifying all this
up-front.
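As a rough illustration of that kind of provider-side check (KeyLengthPolicy and requireNonEmptyKeys are hypothetical names, and a real provider would presumably apply this to whatever spec objects it receives rather than to a bare int array):

```java
final class KeyLengthPolicy {
    /** Rejects a derivation request in which any key component has no bytes. */
    static void requireNonEmptyKeys(int[] keyLengthsInBytes) {
        for (int i = 0; i < keyLengthsInBytes.length; i++) {
            if (keyLengthsInBytes[i] <= 0) {
                throw new IllegalArgumentException(
                    "key component " + i + " has non-positive length " + keyLengthsInBytes[i]);
            }
        }
    }

    public static void main(String[] args) {
        requireNonEmptyKeys(new int[] {32, 16});   // accepted
        try {
            requireNonEmptyKeys(new int[] {0, 0}); // the zero-length-key request
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```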
Ideally, there should be a complete object spec for each object to be
generated that is part of the mixins (label and context) for any
KDF. That allows an HSM to rely upon the object spec when setting
policy controls for each generated object - and incidentally allows
for a KDF to generate both public and non-public data in a secure way.
Between different generations of keystreams do you expect to have
different sets of policy controls? The KDF API has no way for you to
set those things so I would assume those would be pretty static, or at
least controlled outside the KDF API. If so, why is the KDF API
concerning itself with how some HSM sets its policy on objects it makes?
So as long as you allow for the specification of all of the production
objects as part of the .init() I'm good. A given KDF might not
require this - but I can't see any way of fixing the current KDFs to
work in HSMs without something like this.
As far as your (5) scenario goes, I can see how you can twiddle the
lengths to get the keystream output with zero-length keys and large
IV buffers. But that scenario really glosses over what should be a
big hurdle and a major access control issue that stands outside the
KDF API: That the attacker shouldn't have access to the input keying
material in the first place. Protect the input keying material
properly and their attack cannot be done.
Let me give you an example. I'm running an embedded HSM - to protect
TLS keys and to do all of the crypto. An attacker compromises the TLS
server and now has access to the HSM. No problem - I'm going to
notice if the attacker starts exfiltrating large amounts of data from
the server (e.g. copies of the TLS in-the-clear but possibly
re-encrypted data stream), so this isn't a threat. Or is it? A smart
attacker does an extraction attack on the TLS 1.2 and before KDF and
turns all of the key stream material into IV material and exports it
from the HSM. The attacker now has the much smaller key material so
he can send a few messages with those keys and allow for the passive
external interception of the traffic and decryption thereof without
the risk of detection of all that traffic being sent. Alternately, I
can place the key material in a picture via steganography and publish
it as part of the server data.
"If the attacker compromises a TLS server" is the part that gets
me...we're using external software bugs/security holes as a
justification to shape the KDF API in ways that I think are less clear to
the consumer, in order to cover one class of providers (HSMs).
The idea is to protect extraction of the key material from an HSM
_*even from authorized users of that key material*_.
That may well be a goal for the HSM, to be solved by the HSM or the
provider that front-ends it. I do not see that as something to be
solved by the KDF API.
KDFs don't currently do this well. Adding the overall length and per
component length stuff as well as a per component spec to the data
used to derive the key stream means that 1) changes to any of those
change the entire key stream, 2) the per component spec data may be
used by the security module policy engine to enforce restrictions and
3) because of (1) and (2) calling the KDF a second time gets me
exactly the same objects rather than just the same key stream. The
last isn't very important in a software based security domain, but
turns out to have real implications for policy enforcing security modules.
But there aren't KDFs that take individual component lengths as inputs,
so alterations to individual key component lengths don't change the
keystream (unless someone decides to write a KDF that does, but none
that I've seen do). With the way the KDF API is taking shape, there's
no enforcement that you get the same objects - none of that is locked to
the instance. It can change between inits. If you reinitialize with
the same key and KDF parameters, whether you specify all objects up
front or one at a time in derive calls you can still ask for a different
set of output objects. And changing lengths on various objects won't
matter because HKDF, Counter-mode KDF, Feedback-mode KDF...none of those
care a whit about individual component lengths. All they care about is
the total length of the keystream (and HKDF only cares about that to
make sure it's not more than 255 * Hmac length).
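For reference, a small sketch of HKDF-Expand as described in RFC 5869, using HMAC-SHA256 (the class and method names are illustrative only). L is not an input to HMAC here; it only bounds the number of rounds and sets where the output is truncated, so a shorter request is simply a prefix of a longer one with the same PRK and info.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;
import java.util.Arrays;

final class HkdfExpandSketch {
    static final int HASH_LEN = 32; // HmacSHA256 output size in bytes

    // T(i) = HMAC(PRK, T(i-1) || info || i); OKM = first L bytes of T(1) || T(2) || ...
    static byte[] expand(byte[] prk, byte[] info, int length) {
        if (length < 0 || length > 255 * HASH_LEN) {
            throw new IllegalArgumentException("L must not exceed 255 * HashLen");
        }
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(prk, "HmacSHA256"));
            ByteArrayOutputStream okm = new ByteArrayOutputStream();
            byte[] t = new byte[0];
            for (int i = 1; okm.size() < length; i++) {
                mac.update(t);
                mac.update(info);
                mac.update((byte) i); // single-byte round counter, hence the 255 limit
                t = mac.doFinal();
                okm.write(t, 0, t.length);
            }
            return Arrays.copyOf(okm.toByteArray(), length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] prk = new byte[HASH_LEN];
        byte[] info = new byte[] {1, 2, 3};
        byte[] shortOut = expand(prk, info, 16);
        byte[] longOut = expand(prk, info, 80);
        System.out.println(Arrays.equals(shortOut, Arrays.copyOf(longOut, 16))); // prints "true"
    }
}
```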
This gets worse when you realize that the KDF key is under it all
either a HASH, HMAC or CMAC key, and all of those algorithms produce
public data. Ideally you need a way of preventing a KDF key from
calling the raw HASH/HMAC/CMAC functions directly (and vice versa).
I don't see how we'd prevent this in software. If I've got a key as
input to a KDF (a SecretKey) there's no way to prevent it being used by
anything else that takes a SecretKey. If you need to prevent that in
hardware then that seems like a concern for your provider or the HSM itself.
I would rather see the DPS provided in the deriveKey. It couples
what you want out with the call that makes the object and it makes a
lot more sense to keep those two together than try to remember where
in the submitted list of DPS objects you are.
95% of the time this will be a call to produce a single key. 4% of
the time it will be a call to produce multiple keys. Only 1% of the
time will it need to intermix key, data and object productions.
Anybody who is doing that is going to write a wrapper around this
class to make sure they get the key and data production order
correct for each call. So I'm not all that bothered by keeping the
complexity as a price for keeping flexibility.
You could have a Key deriveKey(Key k, DerivationParameterSpec param)
for some things like TLS1.3 (where you can only make a single call
to derive key between inits), but then you'd also need at least a
byte[] deriveData(Key k, DerivationParameterSpec param) and an
Object deriveObject(Key k, DerivationParameterSpec param).
I don't think those are necessary. If you're just doing HKDF-Expand
(for the HKDF-Expand-Label TLS 1.3 key derivation) then you can
provide the input key, label and max length and any other context
info that goes into that HkdfLabel structure...all of that would go
into init(). Then provide the key alg and desired length via the DPS
at deriveKey time. Any subsequent keys in the TLS 1.3 key schedule
would need a new init call anyway since the labels change and
possibly the output length.
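For illustration, the HkdfLabel structure can be serialized roughly as follows. This sketch follows the layout in RFC 8446 section 7.1 (a 16-bit length, then the "tls13 "-prefixed label and the context, each with a one-byte length prefix); drafts current at the time of this thread may have used a different label prefix. The class and method names are hypothetical. Since the output length sits inside the info, asking for a different length produces a different info and therefore a different key stream.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class HkdfLabelSketch {
    /** Builds the bytes passed as "info" to HKDF-Expand by HKDF-Expand-Label. */
    static byte[] hkdfLabel(int length, String label, byte[] context) {
        byte[] fullLabel = ("tls13 " + label).getBytes(StandardCharsets.US_ASCII);
        if (length < 0 || length > 0xFFFF) {
            throw new IllegalArgumentException("length must fit in 16 bits");
        }
        if (fullLabel.length < 7 || fullLabel.length > 255 || context.length > 255) {
            throw new IllegalArgumentException("label or context out of range");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(length >>> 8);                 // uint16 length, high byte
        out.write(length);                       // uint16 length, low byte
        out.write(fullLabel.length);             // label length prefix
        out.write(fullLabel, 0, fullLabel.length);
        out.write(context.length);               // context length prefix
        out.write(context, 0, context.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] info = hkdfLabel(16, "key", new byte[0]);
        System.out.println(info.length); // 2 + 1 + 9 + 1 = 13
    }
}
```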
Over the next day or so I'm going to have to make some final
decisions on this API as there are internal projects that are waiting
on this API to proceed. I'm already past the cut-off date I set, but
I recognize these discussions are important to have and I appreciate
the input you and others have provided.
--Jamil
Reading this last I think I've lost the context. Here's where I
think we are:
1) Get instance gets the default configuration of a given KDF (and
that default will be attached to the instance name definition)
2) .setParameter() may be used to update the KDF configuration - once.
3) .init() takes at least the key, it may optionally take a set of
derivation parameters. The derivation parameters provided in .init()
are intended for use in forming the label and context mixins for the
KDF. They may provide - for example - the total length of the key
stream, the objects to be derived, the length of the objects,
protection parameters for each of the objects etc.
4) A KDF generates a free-running or fixed-length key stream depending
on the derivation parameters (e.g. it is free-running, and may produce
as much key stream as desired, if "L" is not a mixin to the KDF or if
the production object specifications are not part of the derivation
mixins).
Doing (4) is mostly not a good idea, but someone might want to do
this. In that case it may make the most sense to just allow them to
do deriveData(int length) calls as the only function (a keyed PRNG
basically).
Re the last version of your API - if you add the .setParameter() and
.getParameter() calls to both KeyDerivation and KeyDerivationSpi I
think I'm happy with this part of the API. I'm wondering if we should
talk about KeyAgreement though.