[COSE] Re: Thoughts about the Context Information in COSE

Orie Steele Thu, 26 Sep 2024 16:03:27 -0700

Protected headers and sig structure already sorta accomplish this for JWS
and cose-sign1.


There was a thread a while back about composite sigs and if PGP, JOSE and
CMS should have protocol specific context separation... I think if you are
going to do composite sigs, it would be ok to do them without protocol
specific context binding... But you have to agree on a context string...
And it would be weird for COSE to use OIDs for Ed25519 + ML-DSA... But it
would work.

Maybe in 2024 all signatures should have protocol specific context
separation.

OS



On Thu, Sep 26, 2024, 5:00 PM Phillip Hallam-Baker <[email protected]>
wrote:

> Now sign the same message with two signatures over the same data without
> hashing the message a second time.
>
> ML-DSAhash is a hack so that CMS and other existing ASN.1 things can be
> made to do the right thing without a digest substitution attack possibility.
>
> The context should be "COSE" or "JOSE" so that someone can't take a COSE
> signature and slap it on a JOSE message. But it should be the same string
> for a given envelope format. And then the application gets to fill in the
> COSE/JOSE manifest with its own semantic separation distinguisher.
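
To make the envelope-level context string concrete, here is a minimal Python sketch. It assumes the ML-DSA pure-mode message encoding (a zero byte, a one-byte context length, the context, then the message). The function name `encode_with_context` and the sample payload are hypothetical, and no actual signing is performed.

```python
def encode_with_context(ctx: bytes, message: bytes) -> bytes:
    """Build 0x00 || len(ctx) || ctx || message (assumed pure ML-DSA framing)."""
    if len(ctx) > 255:
        raise ValueError("context must be at most 255 bytes")
    return b"\x00" + bytes([len(ctx)]) + ctx + message


payload = b"example payload"  # hypothetical content
cose_input = encode_with_context(b"COSE", payload)
jose_input = encode_with_context(b"JOSE", payload)

# The same payload yields different signer inputs per envelope format,
# so a signature made for one format does not verify under the other.
print(cose_input != jose_input)
```

The context is one fixed string per envelope format; any application-level distinguisher would live inside the signed manifest rather than in the context.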
>
>
> There is a similar issue that arises with Concat KDF [NIST.800-56A] which
> does patch what could be a possible hole in certain applications. I have
> spent the past three hours convincing myself that I do not need Concat KDF
> [NIST.800-56A] and can leave PartyUInfo/PartyVInfo empty for my application
> because they are a mechanism for credential binding which I do explicitly
> through a different mechanism when I need it.
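
For readers unfamiliar with what PartyUInfo/PartyVInfo contribute, the following Python sketch may help. It assumes the JOSE profile of Concat KDF (RFC 7518, Section 4.6) with SHA-256; the function names, the all-zero shared secret, and the "Alice"/"Bob" values are placeholders for illustration only.

```python
import hashlib
import struct


def _len_prefixed(data: bytes) -> bytes:
    # 32-bit big-endian length followed by the data itself
    return struct.pack(">I", len(data)) + data


def concat_kdf(z: bytes, alg_id: bytes, key_len_bytes: int,
               party_u: bytes = b"", party_v: bytes = b"") -> bytes:
    """Single-step KDF: SHA-256(counter || Z || OtherInfo), repeated as needed."""
    other_info = (_len_prefixed(alg_id)
                  + _len_prefixed(party_u)
                  + _len_prefixed(party_v)
                  + struct.pack(">I", key_len_bytes * 8))  # SuppPubInfo: key length in bits
    out = b""
    counter = 1
    while len(out) < key_len_bytes:
        out += hashlib.sha256(struct.pack(">I", counter) + z + other_info).digest()
        counter += 1
    return out[:key_len_bytes]


z = bytes(32)  # placeholder shared secret, not a real ECDH output
k_empty = concat_kdf(z, b"A128GCM", 16)
k_bound = concat_kdf(z, b"A128GCM", 16, party_u=b"Alice", party_v=b"Bob")
print(k_empty != k_bound)
```

Leaving both fields empty still gives a well-defined key; the fields only matter when the derived key is supposed to be bound to party identities through the KDF itself.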
>
> These are now very large systems and some parts are 35 or more years old.
>
>
> On Thu, Sep 26, 2024 at 1:12 PM Sophie Schmieg <[email protected]>
> wrote:
>
>> You can use the comment on Algorithm 7, line 6. "message representative
>> that may optionally be computed in a different cryptographic module"
>> The hash function is SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64),
>> which already takes care of the context (by setting it to the empty string,
>> as it always should be, since that is a question for the protocol, not the
>> signature scheme). While you can argue that that breaks the cryptographic
>> module boundary (even if explicitly allowed by NIST's comment), the same is
>> true for HashML-DSA, with Algorithm 5 taking an arbitrary size message and
>> a hash function, not the hash of the message itself, so computing it
>> outside of the cryptographic module similarly breaks module boundaries,
>> this time without that being explicitly allowed by the standard.
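
The message representative quoted above can be sketched in Python with the standard library. This is only an illustration of the formula as written in this message (empty context); the function name and the placeholder public key bytes are assumptions, and nothing here performs an ML-DSA signature.

```python
import hashlib


def message_representative(pk: bytes, message: bytes) -> bytes:
    """SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64) with an empty context."""
    tr = hashlib.shake_256(pk).digest(64)  # 64-byte hash of the encoded public key
    return hashlib.shake_256(tr + b"\x00" + b"\x00" + message).digest(64)


pk = bytes(range(32))  # placeholder bytes, not a real ML-DSA public key
mu = message_representative(pk, b"example message")
print(len(mu))
```

Only the 64-byte value `mu` would need to cross into the signing module, whatever the size of the message.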
>>
>> On Thu, Sep 26, 2024 at 9:28 AM Phillip Hallam-Baker <
>> [email protected]> wrote:
>>
>>> Right now, we are discussing this in LAMPS, JOSE, COSE, OpenPGP and the
>>> NIST-PQC list. Which is really not a good way to go about things because
>>> this is a systems architecture issue and what we really need is for all the
>>> groups to arrive at the same approach so that we avoid issues of semantic
>>> substitution, digest substitution, etc. So added LAMPS because the
>>> conclusion is relevant there.
>>>
>>>
>>> The issue is that ML-DSA-pure doesn't use SHAKE(content), it uses
>>> SHAKE(0x00 + context length + context + content). And there is no way to
>>> divide that operation securely with half taking place inside the HSM and
>>> half taking place outside.
>>>
>>> From the NIST reference code:
>>>
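>>>         //5
>>>         int i = 0;
>>>         bytes[i++] = 0;                  // leading zero byte (pure mode)
>>>         if (context is null) {
>>>             bytes[i++] = 0;              // no context: length byte is 0
>>>             }
>>>         else {
>>>             bytes[i++] = (byte)clen;     // context length in one byte
>>>             Array.Copy(context, 0, bytes, i, clen);  // copy clen context bytes
>>>             i += clen;
>>>             }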
>>>
>>> Since my code is designed to support multiple signatures over the same
>>> content under different signature algorithms, being told which algorithm to
>>> use doesn't work for me because I am only hashing once.
>>>
>>> The NIST division is actually rather clever because it allows the HSM to
>>> have a bit more intelligence than in the past because the body is always at
>>> least a manifest. And the HSM can be configured to only sign specific types
>>> of content with specific authorization and log what it did.
>>>
>>>
>>> ML-DSA-hash is really only relevant for CMS/PKCS#7. Everything else we
>>> do either has a manifest already (COSE, JOSE, OpenPGP, XML-Signature) or is
>>> short enough that it can be done inside the HSM (except CRLs without
>>> distribution points). And most of the things that are short enough are
>>> exactly the sort of thing we would want an HSM to validate and log.
>>>
>>> So for example, I might have my HSM configured in such a way that it
>>> will only sign an object with the "PKIX Certificate" extension, if and only
>>> if TBSCertificate is well formatted DER and the requisite set of proofs
>>> (proof of right, validation assertion, CT insertion proof) have been
>>> supplied.
>>>
>>> And yes, that might be more complexity than you would want in a
>>> FIPS-140-4 module which is why you might have a second module acting as a
>>> front end to a 'signer only' module.
>>>
>>>
>>> So we should define a set of context strings to separate the COSE,
>>> JOSE, XML-Signature, SAML, Certificate, CRL, etc. domains which are all
>>> atomic strings with no parameters.
>>>
>>> CMS is special in that we have all this infrastructure already committed
>>> based on the RSA approach and so there we should probably allow the context
>>> string to be a prefix followed by the application specific context
>>> separator.
>>>
>>> So two new IANA registries. One for signature format context strings,
>>> one for CMS application contexts.
>>>
>>>
>>> I will write up a draft proposing this and a second draft proposing
>>> adding the ML-DSA hash digest to Ed-448 and Ed-25519 so we can use all the
>>> algorithms in the same fashion with the same API.
>>>
>>>
>>>
>>> On Thu, Sep 26, 2024 at 11:25 AM Sophie Schmieg <[email protected]>
>>> wrote:
>>>
>>>> SHAKE256 supports streaming. It's a sponge construction, you only need
>>>> to keep the sponge as state, and can stream in the data.
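
A short Python check of the streaming point, using `hashlib.shake_256` from the standard library: absorbing the data chunk by chunk gives the same output as hashing it in one call. The helper name, chunk size, and test data are arbitrary choices for illustration.

```python
import hashlib


def shake256_stream(chunks, out_len: int = 64) -> bytes:
    """Absorb an iterable of byte chunks; only the sponge state is kept in memory."""
    sponge = hashlib.shake_256()
    for chunk in chunks:
        sponge.update(chunk)
    return sponge.digest(out_len)


data = bytes(range(256)) * 1024  # 256 KiB of sample data
chunks = (data[i:i + 4096] for i in range(0, len(data), 4096))

streamed = shake256_stream(chunks)
one_shot = hashlib.shake_256(data).digest(64)
print(streamed == one_shot)
```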
>>>>
>>>> On Wed, Sep 25, 2024 at 4:24 PM Phillip Hallam-Baker <
>>>> [email protected]> wrote:
>>>>
>>>>> ML-DSAhash is designed to support streaming. That isn't necessary if
>>>>> you only work at the packet layer but you certainly don't want to sign 1TB
>>>>> files using the SHAKE256 construct and pushing all that data into a
>>>>> FIPS-140-3 HSM.
>>>>>
>>>>> For COSE and JOSE, ML-DSAhash is unnecessary because you are always
>>>>> signing over a manifest. So if you are going to sign a 1TB file, you hash
>>>>> with your favorite digest, specify the digest value and digest algorithm
>>>>> ID in the SignedHeader field and then sign over that. That is exactly what
>>>>> ML-DSA-pure is intended for.
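
The manifest approach can be sketched roughly as follows. The JSON layout and the field names `digest_alg` and `digest` are hypothetical and are not actual COSE or JOSE header parameters; the point is only that the large content is hashed once, outside the signer, and the small manifest is what gets signed.

```python
import hashlib
import json


def build_manifest(chunks, digest_alg: str = "sha512") -> bytes:
    """Hash the content in a streaming fashion and record the algorithm and value."""
    h = hashlib.new(digest_alg)
    for chunk in chunks:
        h.update(chunk)
    manifest = {"digest_alg": digest_alg, "digest": h.hexdigest()}
    # Deterministic serialization so signer and verifier see the same bytes
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


content = [b"first block of a large file", b"second block"]  # stand-in for 1TB of data
to_be_signed = build_manifest(content)
print(to_be_signed)
```

Because the digest algorithm identifier sits inside the signed bytes, swapping the digest algorithm changes what was signed. The same manifest bytes could also be handed to several signature algorithms without hashing the content again.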
>>>>>
>>>>> For other systems, the reason we need ML-DSAhash is that we have a lot
>>>>> of APIs that are built around the RSA interface of 'sign digest value and
>>>>> OID'. And ML-DSA needs to support a mode where it is a direct drop in
>>>>> substitute.
>>>>>
>>>>> If the digest algorithm identifier is omitted, there is a possibility
>>>>> of a digest substitution attack. So it is an important consideration.
>>>>>
>>>>>
>>>>> The other concern is semantic substitution, and there are two separate
>>>>> issues. One is crafting a CMS package so that it is also a legitimate
>>>>> COSE package, which seems far-fetched, but so did gifs that render as
>>>>> jpgs...
>>>>>
>>>>> The second is crafting a COSE package for application A so that it is
>>>>> accepted as legitimate by application B. This is a much more plausible
>>>>> attack.
>>>>>
>>>>>
>>>>> If every signature algorithm supported context, we could use the
>>>>> signature context slot for both. Since they don't and since taking data
>>>>> provided by the signer and putting it into the signature envelope directly
>>>>> is 'icky' to say the least, the better way to do this in a standardized
>>>>> envelope that has a manifest is to use the context string to identify the
>>>>> envelope format, 'XML-DIG-SIG', 'JOSE', 'COSE', etc. and make a slot in
>>>>> the envelope manifest for application level separation.
>>>>>
>>>>> This is actually done in the SAML assertion format which has an
>>>>> Audience field for the express purpose of semantic binding to the terms
>>>>> and conditions of the signature.
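
A rough Python sketch of the two layers of separation: a fixed context string per envelope format, plus an audience-style slot inside the signed manifest. The names `ENVELOPE_CONTEXT`, `make_manifest`, `accept`, and the `aud` field are hypothetical, and signature verification is assumed to have already succeeded and is not shown.

```python
import json

ENVELOPE_CONTEXT = b"COSE"  # assumed fixed string for this envelope format


def make_manifest(audience: str, digest_hex: str) -> bytes:
    """Signed manifest carrying the application-level separator."""
    return json.dumps({"aud": audience, "digest": digest_hex}, sort_keys=True).encode()


def accept(manifest_bytes: bytes, envelope_context: bytes, expected_audience: str) -> bool:
    """Check both layers after the signature itself has been verified."""
    if envelope_context != ENVELOPE_CONTEXT:
        return False  # signature was made for a different envelope format
    manifest = json.loads(manifest_bytes)
    return manifest.get("aud") == expected_audience


manifest = make_manifest("application-A", "00ff")
print(accept(manifest, b"COSE", "application-A"))
print(accept(manifest, b"COSE", "application-B"))
```

Application B rejects an envelope made for application A even though both use the same envelope format and context string.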
>>>>>
>>>>> CMS/PKCS#7 is really rooted in the way we did things 30 years ago and
>>>>> the signature context is really the only option.
>>>>>
>>>>>
>>>>> This may seem unnecessarily complex but it is much easier to block
>>>>> this class of attack completely than to spend time auditing every
>>>>> application to see if there is a problem.
>>>>>
>>>>> The way we form the key agreement output doing ECDH (P-256 etc) is a
>>>>> lot more verbose than X25519 because it binds to the keys used for the key
>>>>> agreement. I am really not at all sure what the advantage of doing it that
>>>>> way is when your key names are 'Alice' and 'Bob', which is what we have in
>>>>> the RFC, but that's what we did and maybe we should have done X25519
>>>>> exactly the same way so it was the same...
>>>>>
>>>>>
>>>>> On Wed, Sep 25, 2024 at 5:19 PM Sophie Schmieg <sschmieg=
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Note that there are two options for prehashing with ML-DSA: You can
>>>>>> use the comment on algorithm 7, line 6 and use the hash function
>>>>>> SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64), in which case it
>>>>>> works exactly the same as ECDSA (with a known hash function). I.e. the
>>>>>> hash can be computed elsewhere and transmitted to a signing oracle,
>>>>>> producing a signature that looks the same as if no prehashing has taken
>>>>>> place, so from the verifier's perspective this choice does not matter. Or
>>>>>> you use the (in my opinion strictly worse) option of using HashML-DSA,
>>>>>> where you prehash with say SHA512. In that case, the verifier needs to
>>>>>> know that you did so. By calling that algorithm HashML-DSA-SHA512 (and
>>>>>> putting the algorithm information in the public key), you can communicate
>>>>>> that, but honestly I do not see any reason to do so that would not be
>>>>>> better served by just using ML-DSA, prehashing with the SHAKE256
>>>>>> construction mentioned.
>>>>>>
>>>>>> On Thu, Sep 19, 2024 at 12:34 AM Ilari Liusvaara <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> On Wed, Sep 18, 2024 at 01:50:20PM -0700, Sophie Schmieg wrote:
>>>>>>> > On Tue, Sep 17, 2024 at 1:20 PM Ilari Liusvaara <[email protected]>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> > >
>>>>>>> > > In case of signed JWT, the very first thing that needs to be
>>>>>>> > > parsed out is "iss".
>>>>>>> > >
>>>>>>> > > ... Which is a bit problematic.
>>>>>>> >
>>>>>>> > Yeah, I somewhat intentionally did not mention iss, because yeah, it
>>>>>>> > is a bit problematic, and forces the "authorization decision passed
>>>>>>> > down to downstream system" as a pattern.
>>>>>>>
>>>>>>> Dedicated JWT validation code could call back to map issuer name to
>>>>>>> keyset. But that runs into somewhat annoying function color issues in
>>>>>>> many languages (fortunately synchronous factorization does not seem to
>>>>>>> be too bad)...
>>>>>>>
>>>>>>>
>>>>>>> > > Unfortunately, that runs into problems with pre-hashing.
>>>>>>> > >
>>>>>>> > > Currently, that only gets problematic for RSA, but supporting
>>>>>>> > > pre-hashed ML-DSA would also introduce the problem there.
>>>>>>> > >
>>>>>>> > > ECDSA has essentially fixed prehash (ok), and EdDSA in COSE/JOSE
>>>>>>> > > does not support pre-hashing.
>>>>>>> > >
>>>>>>> >
>>>>>>> > I'm not sure I follow. The hash function used with a signature scheme
>>>>>>> > is part of the signature scheme as well, and so the public key should
>>>>>>> > allow you to derive that information. Several common public key
>>>>>>> > serialization formats unfortunately do not properly include the hash
>>>>>>> > function, maybe that is what you are referring to? Or do you have a
>>>>>>> > system where the decision which hash function to use is taken
>>>>>>> > independently of the decision of which key to use? In that case, yeah
>>>>>>> > you have lots of incompatibilities, especially in the case of ML-DSA
>>>>>>> > where the hash function is fixed to SHAKE256, and has to be prefixed
>>>>>>> > with a hash of the public key, but I'm not sure why the algorithm has
>>>>>>> > to be part of the token to enable this use case.
>>>>>>>
>>>>>>> Because public keys frequently fail to include the hash function, one
>>>>>>> would have to deduce the hash function from the key itself.
>>>>>>>
>>>>>>> That works in practice for ECDSA, EdDSA and HSS-LMS. But it does not
>>>>>>> work for RSA (then there is the PSS versus PKCS#1 v1.5 stuff...).
>>>>>>>
>>>>>>> For ML-DSA, supporting pre-hash mode breaks deducing hash function.
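
The deduction described above might look like the following Python sketch. The table assumes the curve-to-hash pairings of the JOSE ES256/ES384/ES512 algorithms and the hash functions built into Ed25519 and Ed448; the names `IMPLIED_HASH` and `deduce_hash` are hypothetical. HSS-LMS is left out because its hash comes from the parameter set carried in the key.

```python
from typing import Optional

# Key type and curve -> hash function implied in practice (assumed pairings)
IMPLIED_HASH = {
    ("EC", "P-256"): "SHA-256",
    ("EC", "P-384"): "SHA-384",
    ("EC", "P-521"): "SHA-512",
    ("OKP", "Ed25519"): "SHA-512",
    ("OKP", "Ed448"): "SHAKE256",
}


def deduce_hash(kty: str, crv: Optional[str] = None) -> Optional[str]:
    """Return the implied hash, or None when the key alone does not determine it."""
    return IMPLIED_HASH.get((kty, crv))


print(deduce_hash("EC", "P-256"))
print(deduce_hash("RSA"))  # None: RSA keys do not pin a hash or padding mode
```

If ML-DSA keys could be used both in pure mode and with an arbitrary pre-hash, an ML-DSA entry would need the same `None` answer as RSA.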
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -Ilari
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> COSE mailing list -- [email protected]
>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Sophie Schmieg | Information Security Engineer | ISE Crypto |
>>>>>> [email protected]
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>
>

