[COSE] Re: Thoughts about the Context Information in COSE

Phillip Hallam-Baker Thu, 26 Sep 2024 15:00:46 -0700

Now sign the same message with two signatures over the same data without
hashing the message a second time.
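
To make the objection concrete, here is a minimal Python sketch. It is an
illustration only, not FIPS 204 reference code: the key bytes, the payload
and the manifest layout are made-up placeholders. It computes the externally
computed message representative quoted further down the thread,
SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64), for two different public
keys. Because the value depends on the key, each signer needs its own pass
over the payload, whereas hashing the payload once and signing a small
manifest keeps the per-signer work small.

```python
import hashlib

def external_mu(pk: bytes, message_chunks, ctx: bytes = b"") -> bytes:
    """Message representative for pure ML-DSA, absorbed chunk by chunk."""
    if len(ctx) > 255:
        raise ValueError("context must be at most 255 bytes")
    tr = hashlib.shake_256(pk).digest(64)      # 64-byte hash of the public key
    h = hashlib.shake_256()
    h.update(tr)
    h.update(bytes([0, len(ctx)]) + ctx)       # 0x00 || len(ctx) || ctx
    for chunk in message_chunks:               # SHAKE256 streams; only the sponge is state
        h.update(chunk)
    return h.digest(64)

# Placeholder keys (assumption): real ML-DSA public keys are far longer.
pk_a = b"placeholder public key A"
pk_b = b"placeholder public key B"
payload = [b"large ", b"payload ", b"in ", b"chunks"]

# Direct approach: the payload is absorbed once per signer.
mu_a = external_mu(pk_a, payload)
mu_b = external_mu(pk_b, payload)

# Manifest approach: hash the payload once, then each signer only
# processes the short manifest (hypothetical layout).
payload_digest = hashlib.sha512(b"".join(payload)).hexdigest()
manifest = ("alg=SHA-512;digest=" + payload_digest).encode()
mu_manifest_a = external_mu(pk_a, [manifest])
mu_manifest_b = external_mu(pk_b, [manifest])
```

In the manifest variant the expensive pass over the payload happens exactly
once, regardless of how many signature algorithms or keys are applied.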


ML-DSAhash is a hack so that CMS and other existing ASN.1 things can be
made to do the right thing without a digest substitution attack possibility.

The context should be "COSE" or "JOSE" so that someone can't take a COSE
signature and slap it on a JOSE message. But it should be the same string
for a given envelope format. And then the application gets to fill in the
COSE/JOSE manifest with its own semantic separation distinguisher.
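
A rough Python sketch of that split, under stated assumptions: the context
strings "COSE" and "JOSE" are the ones proposed above, while the key bytes,
the manifest layout and its "app" field are hypothetical names chosen for
illustration. The framing 0x00 || len(ctx) || ctx || message follows the pure
ML-DSA construction discussed in this thread.

```python
import hashlib
import json

def framed_message(ctx: bytes, message: bytes) -> bytes:
    """Pure ML-DSA framing: 0x00 || len(ctx) || ctx || message."""
    if len(ctx) > 255:
        raise ValueError("context must be at most 255 bytes")
    return bytes([0, len(ctx)]) + ctx + message

def message_representative(pk: bytes, ctx: bytes, message: bytes) -> bytes:
    """64-byte value that is actually signed; bound to key and context."""
    tr = hashlib.shake_256(pk).digest(64)
    return hashlib.shake_256(tr + framed_message(ctx, message)).digest(64)

pk = b"placeholder public key"  # stand-in, not a real ML-DSA key

# Hypothetical manifest: "app" carries the application-level separator,
# the envelope format goes into the signature context instead.
manifest = json.dumps(
    {
        "app": "example-app-A",
        "digest_alg": "SHA-512",
        "digest": hashlib.sha512(b"payload").hexdigest(),
    },
    sort_keys=True,
).encode()

# Same key, same manifest, different envelope format: different values,
# so a signature made for one envelope does not verify for the other.
mu_cose = message_representative(pk, b"COSE", manifest)
mu_jose = message_representative(pk, b"JOSE", manifest)
```

The length byte in the framing matters: without it, context "CO" followed by
a message starting with "SE" would collide with context "COSE".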


There is a similar issue with Concat KDF [NIST.800-56A], which patches
what could be a hole in certain applications. I have spent the past three
hours convincing myself that I do not need Concat KDF [NIST.800-56A] and can
leave PartyUInfo/PartyVInfo empty for my application, because they are a
mechanism for credential binding, which I do explicitly through a different
mechanism when I need it.
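
For readers who have not looked at it recently, a minimal Python sketch of
the single-step Concat KDF follows. It is a sketch under assumptions, not a
validated implementation: SHA-256 is assumed as the hash, and OtherInfo uses
the length-prefixed layout that JOSE adopts for ECDH-ES (AlgorithmID,
PartyUInfo, PartyVInfo, then the key length in bits as SuppPubInfo). The
function and parameter names are mine. It shows where PartyUInfo/PartyVInfo
enter the derivation and what leaving them empty means.

```python
import hashlib
import struct

def length_prefixed(data: bytes) -> bytes:
    """32-bit big-endian length followed by the data itself."""
    return struct.pack(">I", len(data)) + data

def concat_kdf(z: bytes, keydatalen_bits: int, algorithm_id: bytes,
               party_u: bytes = b"", party_v: bytes = b"") -> bytes:
    """Derive keydatalen_bits of key material from the shared secret z."""
    if keydatalen_bits % 8 != 0:
        raise ValueError("this sketch only handles whole bytes")
    other_info = (
        length_prefixed(algorithm_id)
        + length_prefixed(party_u)            # empty means just a zero length
        + length_prefixed(party_v)
        + struct.pack(">I", keydatalen_bits)  # SuppPubInfo
    )
    output = b""
    counter = 1
    while len(output) * 8 < keydatalen_bits:
        block = struct.pack(">I", counter) + z + other_info
        output += hashlib.sha256(block).digest()
        counter += 1
    return output[: keydatalen_bits // 8]

z = bytes(range(32))  # placeholder shared secret, not a real ECDH output

key_unbound = concat_kdf(z, 128, b"A128GCM")
key_bound = concat_kdf(z, 128, b"A128GCM", party_u=b"Alice", party_v=b"Bob")
```

With both party fields empty the derived key depends only on the shared
secret, the algorithm identifier and the key length, so any binding to
credentials has to come from elsewhere, as described above.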

These are now very large systems and some parts are 35 or more years old.


On Thu, Sep 26, 2024 at 1:12 PM Sophie Schmieg <[email protected]> wrote:

> You can use the comment on Algorithm 7, line 6. "message representative
> that may optionally be computed in a different cryptographic module"
> The hash function is SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64),
> which already takes care of the context (by setting it to the empty string,
> as it always should be, since that is a question for the protocol, not the
> signature scheme). While you can argue that that breaks the cryptographic
> module boundary (even if explicitly allowed by NIST's comment), the same is
> true for HashML-DSA, with Algorithm 5 taking an arbitrary size message and
> a hash function, not the hash of the message itself, so computing it
> outside of the cryptographic module similarly breaks module boundaries,
> this time without that being explicitly allowed by the standard.
>
> On Thu, Sep 26, 2024 at 9:28 AM Phillip Hallam-Baker <[email protected]>
> wrote:
>
>> Right now, we are discussing this in LAMPS, JOSE, COSE, OpenPGP and the
>> NIST-PQC list. Which is really not a good way to go about things because
>> this is a systems architecture issue and what we really need is for all the
>> groups to arrive at the same approach so that we avoid issues of semantic
>> substitution, digest substitution, etc. So I added LAMPS because the
>> conclusion is relevant there.
>>
>>
>> The issue is that ML-DSA-pure doesn't use SHAKE(content), it uses
>> SHAKE(0x00 + context length + context + content). And there is no way to
>> divide that operation securely with half taking place inside the HSM and
>> half taking place outside.
>>
>> From the NIST reference code:
>>
>>         //5
>>         int i = 0;
>>         bytes[i++] = 0;                  // domain separator: 0 = pure ML-DSA
>>         if (context is null) {
>>             bytes[i++] = 0;              // empty context: length byte is 0
>>             }
>>         else {
>>             bytes[i++] = (byte)clen;     // context length in bytes
>>             Array.Copy(context, 0, bytes, i, clen);
>>             i += clen;
>>             }
>>
>> Since my code is designed to support multiple signatures over the same
>> content under different signature algorithms, being told which algorithm to
>> use doesn't work for me because I am only hashing once.
>>
>> The NIST division is actually rather clever because it allows the HSM to
>> have a bit more intelligence than in the past because the body is always at
>> least a manifest. And the HSM can be configured to only sign specific types
>> of content with specific authorization and log what it did.
>>
>>
>> ML-DSA-hash is really only relevant for CMS/PKCS#7. Everything else we do
>> either has a manifest already (COSE, JOSE, OpenPGP, XML-Signature) or is
>> short enough that it can be done inside the HSM (except CRLs without
>> distribution points). And most of the things that are short enough are
>> exactly the sort of thing we would want an HSM to validate and log.
>>
>> So for example, I might have my HSM configured in such a way that it will
>> only sign an object with the "PKIX Certificate" extension, if and only if
>> TBSCertificate is well formatted DER and the requisite set of proofs (proof
>> of right, validation assertion, CT insertion proof) have been supplied.
>>
>> And yes, that might be more complexity than you would want in a
>> FIPS-140-3 module, which is why you might have a second module acting as a
>> front end to a 'signer only' module.
>>
>>
>> So we should define a set of context strings to separate the COSE, JOSE,
>> XML-Signature, SAML, Certificate, CRL, etc. domains, which are all atomic
>> strings with no parameters.
>>
>> CMS is special in that we have all this infrastructure already committed
>> based on the RSA approach and so there we should probably allow the context
>> string to be a prefix followed by the application specific context
>> separator.
>>
>> So two new IANA registries. One for signature format context strings, one
>> for CMS application contexts.
>>
>>
>> I will write up a draft proposing this and a second draft proposing
>> adding the ML-DSA hash digest to Ed448 and Ed25519 so we can use all the
>> algorithms in the same fashion with the same API.
>>
>>
>>
>> On Thu, Sep 26, 2024 at 11:25 AM Sophie Schmieg <[email protected]>
>> wrote:
>>
>>> SHAKE256 supports streaming. It's a sponge construction, you only need
>>> to keep the sponge as state, and can stream in the data.
>>>
>>> On Wed, Sep 25, 2024 at 4:24 PM Phillip Hallam-Baker <
>>> [email protected]> wrote:
>>>
>>>> ML-DSAhash is designed to support streaming. That isn't necessary if
>>>> you only work at the packet layer but you certainly don't want to sign 1TB
>>>> files using the SHAKE256 construct and pushing all that data into a
>>>> FIPS-140-3 HSM.
>>>>
>>>> For COSE and JOSE, ML-DSAhash is unnecessary because you are always
>>>> signing over a manifest. So if you are going to sign a 1TB file, you hash
>>>> with your favorite digest, specify the digest value and digest algorithm ID
>>>> in the SignedHeader field and then sign over that. That is exactly what
>>>> ML-DSA-pure is intended for.
>>>>
>>>> For other systems, the reason we need ML-DSAhash is that we have a lot
>>>> of APIs that are built around the RSA interface of 'sign digest value and
>>>> OID'. And ML-DSA needs to support a mode where it is a direct drop in
>>>> substitute.
>>>>
>>>> If the digest algorithm identifier is omitted, there is a possibility
>>>> of a digest substitution attack. So it is an important consideration.
>>>>
>>>>
>>>> The other concern is semantic substitution, and there are two
>>>> separate issues. One is crafting a CMS package so that it is a legitimate
>>>> COSE package, which seems far-fetched, but so did gifs that render as
>>>> jpgs...
>>>>
>>>> The second is crafting a COSE package for application A so that it is
>>>> accepted as legitimate by application B. This is a much more plausible
>>>> attack.
>>>>
>>>>
>>>> If every signature algorithm supported context, we could use the
>>>> signature context slot for both. Since they don't and since taking data
>>>> provided by the signer and putting it into the signature envelope directly
>>>> is 'icky' to say the least, the better way to do this in a standardized
>>>> envelope that has a manifest is to use the context string to identify the
>>>> envelope format, 'XML-DIG-SIG', 'JOSE', 'COSE', etc. and make a slot in the
>>>> envelope manifest for application level separation.
>>>>
>>>> This is actually done in the SAML assertion format which has an
>>>> Audience field for the express purpose of semantic binding to the terms and
>>>> conditions of the signature.
>>>>
>>>> CMS/PKCS#7 is really rooted in the way we did things 30 years ago and
>>>> the signature context is really the only option.
>>>>
>>>>
>>>> This may seem unnecessarily complex but it is much easier to block this
>>>> class of attack completely than to spend time auditing every application to
>>>> see if there is a problem.
>>>>
>>>> The way we form the key agreement output doing ECDH (P-256 etc) is a
>>>> lot more verbose than X25519 because it binds to the keys used for the key
>>>> agreement. I am really not at all sure what the advantage of doing it that
>>>> way is when your key names are 'Alice' and 'Bob', which is what we have in
>>>> the RFC, but that's what we did and maybe we should have done X25519
>>>> exactly the same way so it was the same...
>>>>
>>>>
>>>> On Wed, Sep 25, 2024 at 5:19 PM Sophie Schmieg <sschmieg=
>>>> [email protected]> wrote:
>>>>
>>>>> Note that there are two options for prehashing with ML-DSA: You can
>>>>> use the comment on algorithm 7, line 6 and use the hash function
>>>>> SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64), in which case it
>>>>> works exactly the same as ECDSA (with a known hash function). I.e. the
>>>>> hash can be computed elsewhere and transmitted to a signing oracle,
>>>>> producing a signature that looks the same as if no prehashing had taken
>>>>> place, so from the verifier's perspective this choice does not matter. Or
>>>>> you use the (in my opinion strictly worse) option of using HashML-DSA,
>>>>> where you prehash with say SHA512. In that case, the verifier needs to
>>>>> know that you did so. By calling that algorithm HashML-DSA-SHA512 (and
>>>>> putting the algorithm information in the public key), you can communicate
>>>>> that, but honestly I do not see any reason to do so that would not be
>>>>> better served by just using ML-DSA, prehashing with the SHAKE256
>>>>> construction mentioned.
>>>>>
>>>>> On Thu, Sep 19, 2024 at 12:34 AM Ilari Liusvaara <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> On Wed, Sep 18, 2024 at 01:50:20PM -0700, Sophie Schmieg wrote:
>>>>>> > On Tue, Sep 17, 2024 at 1:20 PM Ilari Liusvaara <[email protected]>
>>>>>> > wrote:
>>>>>> >
>>>>>> > >
>>>>>> > > In case of signed JWT, the very first thing that needs to be
>>>>>> > > parsed out is "iss".
>>>>>> > >
>>>>>> > > ... Which is a bit problematic.
>>>>>> >
>>>>>> > Yeah, I somewhat intentionally did not mention iss, because yeah, it
>>>>>> > is a bit problematic, and forces the "authorization decision passed
>>>>>> > down to downstream system" as a pattern.
>>>>>>
>>>>>> Dedicated JWT validation code could call back to map issuer name to
>>>>>> keyset. But that runs into somewhat annoying function color issues in
>>>>>> many languages (fortunately synchronous factorization does not seem to
>>>>>> be too bad)...
>>>>>>
>>>>>>
>>>>>> > > Unfortunately, that runs into problems with pre-hashing.
>>>>>> > >
>>>>>> > > Currently, that only gets problematic for RSA, but supporting
>>>>>> > > pre-hashed ML-DSA would also introduce the problem there.
>>>>>> > >
>>>>>> > > ECDSA has essentially fixed prehash (ok), and EdDSA in COSE/JOSE
>>>>>> > > does not support pre-hashing.
>>>>>> > >
>>>>>> >
>>>>>> > I'm not sure I follow. The hash function used with a signature
>>>>>> > scheme is part of the signature scheme as well, and so the public key
>>>>>> > should allow you to derive that information. Several common public key
>>>>>> > serialization formats unfortunately do not properly include the hash
>>>>>> > function, maybe that is what you are referring to? Or do you have a
>>>>>> > system where the decision which hash function to use is taken
>>>>>> > independently of the decision of which key to use? In that case, yeah
>>>>>> > you have lots of incompatibilities, especially in the case of ML-DSA
>>>>>> > where the hash function is fixed to SHAKE256, and has to be prefixed
>>>>>> > with a hash of the public key, but I'm not sure why the algorithm has
>>>>>> > to be part of the token to enable this use case.
>>>>>>
>>>>>> Because public keys frequently fail to include the hash function, one
>>>>>> would have to deduce the hash function from the key itself.
>>>>>>
>>>>>> That works in practice for ECDSA, EdDSA and HSS-LMS. But it does not
>>>>>> work for RSA (then there is the PSS versus PKCS#1 v1.5 stuff...).
>>>>>>
>>>>>> For ML-DSA, supporting pre-hash mode breaks deducing hash function.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -Ilari
>>>>>>
>>>>>> _______________________________________________
>>>>>> COSE mailing list -- [email protected]
>>>>>> To unsubscribe send an email to [email protected]
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Sophie Schmieg | Information Security Engineer | ISE Crypto |
>>>>> [email protected]
>>>>>
>>>>
>>>
>
