[COSE] Re: Thoughts about the Context Information in COSE

Phillip Hallam-Baker Thu, 26 Sep 2024 16:24:48 -0700

On Thu, Sep 26, 2024 at 7:02 PM Orie Steele <[email protected]>
wrote:


> Protected headers and sig structure already sorta accomplish this for JWS
> and cose-sign1.
>

I agree.

The hash version is only really useful for legacy ASN.1 stuff and for
people who roll their own for whatever reason.
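
For comparison, the pure-mode alternative described in the quoted messages below (the message representative from the comment on Algorithm 7, line 6) can be computed outside the signing module by streaming. A minimal sketch in Python, assuming exactly the formula quoted below with an empty context; the function name external_mu and the placeholder key bytes are my own illustration, not part of any library:

```python
import hashlib

def external_mu(pk: bytes, chunks) -> bytes:
    """Stream message chunks into the 64-byte message representative.

    Follows the formula quoted in this thread with an empty context:
    SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64).
    """
    tr = hashlib.shake_256(pk).digest(64)  # 64-byte hash of the public key
    sponge = hashlib.shake_256()
    sponge.update(tr)
    sponge.update(b"\x00\x00")  # domain byte 0x00, then context length 0
    for chunk in chunks:        # only the sponge state is kept between chunks
        sponge.update(chunk)
    return sponge.digest(64)

# Placeholder bytes for illustration, not a real ML-DSA public key.
pk = bytes(range(32))
mu = external_mu(pk, [b"part one, ", b"part two"])
```

The catch is that the result depends on the public key, so it cannot be shared between two signatures under different keys or algorithms.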


> There was a thread a while back about composite sigs and if PGP, JOSE and
> CMS should have protocol specific context separation... I think if you are
> going to do composite sigs, it would be ok to do them without protocol
> specific context binding... But you have to agree on a context string...
> And it would be weird for COSE to use OIDs for Ed25519 + ML-DSA... But it
> would work.
>

OIDs bleed through into all the protocols, it's just a fact of life. But in
this case, all we need for the context is an opaque sequence of bytes that
is different for COSE, JOSE, XML Sig, etc.

The bytes don't take up any space on the wire even, it's just a fixed
constant.
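
A minimal sketch of what that looks like, assuming the pure ML-DSA framing of 0x00 || len(ctx) || ctx || message. The context values and the names ENVELOPE_CONTEXT and signer_input are hypothetical; no registry of these strings exists yet:

```python
# Hypothetical per-envelope context constants; nothing here is registered.
ENVELOPE_CONTEXT = {
    "COSE": b"COSE",
    "JOSE": b"JOSE",
    "XML-DIG-SIG": b"XML-DIG-SIG",
}

def signer_input(envelope: str, manifest: bytes) -> bytes:
    """Build 0x00 || len(ctx) || ctx || manifest for the given envelope."""
    ctx = ENVELOPE_CONTEXT[envelope]
    if len(ctx) > 255:
        raise ValueError("context is limited to 255 bytes")
    return bytes([0, len(ctx)]) + ctx + manifest

manifest = b"same manifest bytes"
cose_input = signer_input("COSE", manifest)
jose_input = signer_input("JOSE", manifest)
```

The same manifest bytes give different signer inputs for COSE and JOSE, so a signature made for one envelope does not verify for the other, and the context itself never appears on the wire.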



> Maybe in 2024 all signatures should have protocol specific context
> separation.
>

I agree in spirit but 'protocol' is not a good term to use here because it
is a bit too squishy. It could stand for the envelope or for the
application.

I agree with Sophie that there should not be bleed through of the
application context separation identifier into the signature when signing
with COSE/JOSE/etc. Those envelope formats have plenty of slots where that
information can be added.

I disagree with using the empty string. It should be a fixed series of bytes
that is different for each envelope or data object.

IANA Content Type feels right. It is text and we have already defined
strings for most things.





> OS
>
>
>
> On Thu, Sep 26, 2024, 5:00 PM Phillip Hallam-Baker <[email protected]>
> wrote:
>
>> Now sign the same message with two signatures over the same data without
>> hashing the message a second time.
>>
>> ML-DSAhash is a hack so that CMS and other existing ASN.1 things can be
>> made to do the right thing without a digest substitution attack possibility.
>>
>> The context should be "COSE" or "JOSE" so that someone can't take a COSE
>> signature and slap it on a JOSE message. But it should be the same string
>> for a given envelope format. And then the application gets to fill in the
>> COSE/JOSE manifest with its own semantic separation distinguisher.
>>
>>
>> There is a similar issue that arises with Concat KDF [NIST.800-56A] which
>> does patch what could be a possible hole in certain applications. I have
>> spent the past three hours convincing myself that I do not need Concat KDF
>> [NIST.800-56A] and can leave PartyUInfo/PartyVInfo empty for my application
>> because they are a mechanism for credential binding which I do explicitly
>> through a different mechanism when I need it.
>>
>> These are now very large systems and some parts are 35 or more years old.
>>
>>
>> On Thu, Sep 26, 2024 at 1:12 PM Sophie Schmieg <[email protected]>
>> wrote:
>>
>>> You can use the comment on Algorithm 7, line 6. "message representative
>>> that may optionally be computed in a different cryptographic module"
>>> The hash function is SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m,
>>> 64), which already takes care of the context (by setting it to the empty
>>> string, as it always should be, since that is a question for the protocol,
>>> not the signature scheme). While you can argue that that breaks the
>>> cryptographic module boundary (even if explicitly allowed by NIST's
>>> comment), the same is true for HashML-DSA, with Algorithm 5 taking an
>>> arbitrary size message and a hash function, not the hash of the message
>>> itself, so computing it outside of the cryptographic module similarly
>>> breaks module boundaries, this time without that being explicitly allowed
>>> by the standard.
>>>
>>> On Thu, Sep 26, 2024 at 9:28 AM Phillip Hallam-Baker <
>>> [email protected]> wrote:
>>>
>>>> Right now, we are discussing this in LAMPS, JOSE, COSE, OpenPGP and the
>>>> NIST-PQC list. Which is really not a good way to go about things because
>>>> this is a systems architecture issue and what we really need is for all the
>>>> groups to arrive at the same approach so that we avoid issues of semantic
>>>> substitution, digest substitution, etc. So I added LAMPS because the
>>>> conclusion is relevant there.
>>>>
>>>>
>>>> The Issue is that ML-DSA-pure doesn't use SHAKE(content), it uses
>>>> SHAKE(0x00 + content + context). And there is no way to divide that
>>>> operation securely with half taking place inside the HSM and half taking
>>>> place outside.
>>>>
>>>> From the NIST reference code:
>>>>
>>>>         //5
>>>>         int i = 0;
>>>>         bytes[i++] = 0;
>>>>         if (context is null) {
>>>>             bytes[i++] = 0;
>>>>             }
>>>>         else {
>>>>             bytes[i++] = (byte)clen;
>>>>             Array.Copy(context, 0, bytes, i, clen);
>>>>             i += clen;
>>>>             }
>>>>
>>>> Since my code is designed to support multiple signatures over the same
>>>> content under different signature algorithms, being told which algorithm to
>>>> use doesn't work for me because I am only hashing once.
>>>>
>>>> The NIST division is actually rather clever because it allows the HSM
>>>> to have a bit more intelligence than in the past because the body is always
>>>> at least a manifest. And the HSM can be configured to only sign specific
>>>> types of content with specific authorization and log what it did.
>>>>
>>>>
>>>> ML-DSA-hash is really only relevant for CMS/PKCS#7. Everything else we
>>>> do either has a manifest already (COSE, JOSE, OpenPGP, XML-Signature) or is
>>>> short enough that it can be done inside the HSM (except CRLs without
>>>> distribution points). And most of the things that are short enough are
>>>> exactly the sort of thing we would want an HSM to validate and log.
>>>>
>>>> So for example, I might have my HSM configured in such a way that it
>>>> will only sign an object with the "PKIX Certificate" extension, if and only
>>>> if TBSCertificate is well formatted DER and the requisite set of proofs
>>>> (proof of right, validation assertion, CT insertion proof) have been
>>>> supplied.
>>>>
>>>> And yes, that might be more complexity than you would want in a
>>>> FIPS-140-4 module which is why you might have a second module acting as a
>>>> front end to a 'signer only' module.
>>>>
>>>>
>>>> So we should define a set of context strings to separate the COSE,
>>>> JOSE, XML-Signature, SAML, Certificate, CRL, etc. domains which are all
>>>> atomic strings with no parameters.
>>>>
>>>> CMS is special in that we have all this infrastructure already
>>>> committed based on the RSA approach and so there we should probably allow
>>>> the context string to be a prefix followed by the application specific
>>>> context separator.
>>>>
>>>> So two new IANA registries. One for signature format context strings,
>>>> one for CMS application contexts.
>>>>
>>>>
>>>> I will write up a draft proposing this and a second draft proposing
>>>> adding the ML-DSA hash digest to Ed-448 and Ed-25519 so we can use all the
>>>> algorithms in the same fashion with the same API.
>>>>
>>>>
>>>>
>>>> On Thu, Sep 26, 2024 at 11:25 AM Sophie Schmieg <[email protected]>
>>>> wrote:
>>>>
>>>>> SHAKE256 supports streaming. It's a sponge construction, you only need
>>>>> to keep the sponge as state, and can stream in the data.
>>>>>
>>>>> On Wed, Sep 25, 2024 at 4:24 PM Phillip Hallam-Baker <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> ML-DSAhash is designed to support streaming. That isn't necessary if
>>>>>> you only work at the packet layer but you certainly don't want to sign
>>>>>> 1TB files using the SHAKE256 construct and push all that data into a
>>>>>> FIPS-140-3 HSM.
>>>>>>
>>>>>> For COSE and JOSE, ML-DSAhash is unnecessary because you are always
>>>>>> signing over a manifest. So if you are going to sign a 1TB file, you hash
>>>>>> with your favorite digest, specify the digest value and digest algorithm
>>>>>> ID in the SignedHeader field and then sign over that. That is exactly what
>>>>>> ML-DSA-pure is intended for.
>>>>>>
>>>>>> For other systems, the reason we need ML-DSAhash is that we have a
>>>>>> lot of APIs that are built around the RSA interface of 'sign digest value
>>>>>> and OID'. And ML-DSA needs to support a mode where it is a direct drop in
>>>>>> substitute.
>>>>>>
>>>>>> If the digest algorithm identifier is omitted, there is a possibility
>>>>>> of a digest substitution attack. So it is an important consideration.
>>>>>>
>>>>>>
>>>>>> The other concern is semantic substitution, and there are two separate
>>>>>> issues there. One is crafting a CMS package so that it is also a
>>>>>> legitimate COSE package, which seems far-fetched, but so did gifs that
>>>>>> render as jpgs...
>>>>>>
>>>>>> The second is crafting a COSE package for application A so that it is
>>>>>> accepted as legitimate by application B. This is a much more plausible
>>>>>> attack.
>>>>>>
>>>>>>
>>>>>> If every signature algorithm supported context, we could use the
>>>>>> signature context slot for both. Since they don't and since taking data
>>>>>> provided by the signer and putting it into the signature envelope
>>>>>> directly is 'icky' to say the least, the better way to do this in a
>>>>>> standardized envelope that has a manifest is to use the context string to
>>>>>> identify the envelope format, 'XML-DIG-SIG', 'JOSE', 'COSE', etc. and make
>>>>>> a slot in the envelope manifest for application level separation.
>>>>>>
>>>>>> This is actually done in the SAML assertion format which has an
>>>>>> Audience field for the express purpose of semantic binding to the terms
>>>>>> and conditions of the signature.
>>>>>>
>>>>>> CMS/PKCS#7 is really rooted in the way we did things 30 years ago and
>>>>>> the signature context is really the only option.
>>>>>>
>>>>>>
>>>>>> This may seem unnecessarily complex but it is much easier to block
>>>>>> this class of attack completely than to spend time auditing every
>>>>>> application to see if there is a problem.
>>>>>>
>>>>>> The way we form the key agreement output doing ECDH (P-256 etc) is a
>>>>>> lot more verbose than X25519 because it binds to the keys used for the
>>>>>> key agreement. I am really not at all sure what the advantage of doing it
>>>>>> that way is when your key names are 'Alice' and 'Bob', which is what we
>>>>>> have in the RFC, but that's what we did and maybe we should have done
>>>>>> X25519 exactly the same way so it was the same...
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 25, 2024 at 5:19 PM Sophie Schmieg <sschmieg=
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Note that there are two options for prehashing with ML-DSA: You can
>>>>>>> use the comment on algorithm 7, line 6 and use the hash function
>>>>>>> SHAKE256(SHAKE256(pk, 64) || 0x00 || 0x00 || m, 64), in which case it
>>>>>>> works exactly the same as ECDSA (with a known hash function). I.e. the
>>>>>>> hash can be computed elsewhere and transmitted to a signing oracle,
>>>>>>> producing a signature that looks the same as if no prehashing has taken
>>>>>>> place, so from the verifier's perspective this choice does not matter. Or
>>>>>>> you use the (in my opinion strictly worse) option of using HashML-DSA,
>>>>>>> where you prehash with say SHA512. In that case, the verifier needs to
>>>>>>> know that you did so. By calling that algorithm HashML-DSA-SHA512 (and
>>>>>>> putting the algorithm information in the public key), you can communicate
>>>>>>> that, but honestly I do not see any reason to do so that would not be
>>>>>>> better served by just using ML-DSA, prehashing with the SHAKE256
>>>>>>> construction mentioned.
>>>>>>>
>>>>>>> On Thu, Sep 19, 2024 at 12:34 AM Ilari Liusvaara <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> On Wed, Sep 18, 2024 at 01:50:20PM -0700, Sophie Schmieg wrote:
>>>>>>>> > On Tue, Sep 17, 2024 at 1:20 PM Ilari Liusvaara <[email protected]>
>>>>>>>> > wrote:
>>>>>>>> >
>>>>>>>> > >
>>>>>>>> > > In case of signed JWT, the very first thing that needs to be
>>>>>>>> > > parsed out is "iss".
>>>>>>>> > >
>>>>>>>> > > ... Which is a bit problematic.
>>>>>>>> >
>>>>>>>> > Yeah, I somewhat intentionally did not mention iss, because yeah,
>>>>>>>> > it is a bit problematic, and forces the "authorization decision
>>>>>>>> > passed down to downstream system" as a pattern.
>>>>>>>>
>>>>>>>> Dedicated JWT validation code could call back to map issuer name to
>>>>>>>> keyset. But that runs into somewhat annoying function color issues in
>>>>>>>> many languages (fortunately synchronous factorization does not seem
>>>>>>>> to be too bad)...
>>>>>>>>
>>>>>>>>
>>>>>>>> > > Unfortunately, that runs into problems with pre-hashing.
>>>>>>>> > >
>>>>>>>> > > Currently, that only gets problematic for RSA, but supporting
>>>>>>>> > > pre-hashed ML-DSA would also introduce the problem there.
>>>>>>>> > >
>>>>>>>> > > ECDSA has essentially fixed prehash (ok), and EdDSA in
>>>>>>>> > > COSE/JOSE does not support pre-hashing.
>>>>>>>> > >
>>>>>>>> >
>>>>>>>> > I'm not sure I follow. The hash function used with a signature
>>>>>>>> > scheme is part of the signature scheme as well, and so the public
>>>>>>>> > key should allow you to derive that information. Several common
>>>>>>>> > public key serialization formats unfortunately do not properly
>>>>>>>> > include the hash function, maybe that is what you are referring
>>>>>>>> > to? Or do you have a system where the decision which hash function
>>>>>>>> > to use is taken independently of the decision of which key to use?
>>>>>>>> > In that case, yeah you have lots of incompatibilities, especially
>>>>>>>> > in the case of ML-DSA where the hash function is fixed to
>>>>>>>> > SHAKE256, and has to be prefixed with a hash of the public key,
>>>>>>>> > but I'm not sure why the algorithm has to be part of the token to
>>>>>>>> > enable this use case.
>>>>>>>>
>>>>>>>> Because public keys frequently fail to include hash function, one
>>>>>>>> would have to deduce the hash function from the key itself.
>>>>>>>>
>>>>>>>> That works in practice for ECDSA, EdDSA and HSS-LMS. But it does not
>>>>>>>> work for RSA (then there is the PSS versus PKCS#1 v1.5 stuff...).
>>>>>>>>
>>>>>>>> For ML-DSA, supporting pre-hash mode breaks deducing hash function.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -Ilari
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> COSE mailing list -- [email protected]
>>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Sophie Schmieg | Information Security Engineer | ISE Crypto |
>>>>>>> [email protected]
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Sophie Schmieg | Information Security Engineer | ISE Crypto |
>>>>> [email protected]
>>>>>
>>>>>
>>>
>>> --
>>>
>>> Sophie Schmieg | Information Security Engineer | ISE Crypto |
>>> [email protected]
>>>
>>
>

