Sophie wrote: For the algorithm header field, I am not aware of any reason to have that in the token (other than the fact that it historically has been there).
The primary reason that the “alg” is in the protected header is to cryptographically bind the algorithm used for the JWS or JWE to the computation so that an attacker cannot trick the implementation into using a different algorithm by providing a key using a different algorithm. Having “alg” in the protected header enables this mismatch to be caught before any cryptographic operations are performed.

Note that the Key ID does not cryptographically bind the key to the token. Depending upon the circumstances, the attacker is free to craft a key with content of their choosing, including a Key ID matching that in the token. The Key ID is a hint enabling optimized processing in the happy path but doesn’t add any security value.

For the record, these decisions are the result of discussions including cryptographers at the Internet Identity Workshop (IIW) 15 years ago this month, including cryptographers from Sun Microsystems. The primary decisions are recorded at https://self-issued.info/?p=361, which include:

* There is an envelope (a.k.a. header) that completely describes the cryptographic algorithm(s) used
* What to sign (envelope.payload or just payload)? Given that the envelope is extensible and therefore may contain security-sensitive information, we reached a consensus (with input from Ben Laurie<http://www.links.org/> via IM) that the combination envelope.payload must be signed.
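A minimal sketch of the two points above (signing the combination envelope.payload, and catching an "alg" mismatch before any cryptographic operation), using only the Python standard library and HS256. The `Key` record, its `alg` field, and the `sign`/`verify` function names are hypothetical and chosen for illustration; they are not the API of any JOSE library.

```python
import base64
import hashlib
import hmac
import json
from dataclasses import dataclass


def b64url(data: bytes) -> str:
    """base64url without padding, as used by JWS compact serialization."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


@dataclass(frozen=True)
class Key:
    """Hypothetical key record: the algorithm is stored with the key."""
    kid: str
    alg: str
    secret: bytes


def sign(key: Key, payload: dict) -> str:
    header = {"alg": key.alg, "kid": key.kid}
    # The MAC covers header and payload together (envelope.payload).
    signing_input = (
        b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
    )
    tag = hmac.new(key.secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(tag)


def verify(key: Key, token: str) -> dict:
    header_b64, payload_b64, tag_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    # Mismatch between header "alg" and the key is caught before any crypto runs.
    if header.get("alg") != key.alg:
        raise ValueError("header alg does not match the key's algorithm")
    if key.alg != "HS256":
        raise ValueError("only HS256 is implemented in this sketch")
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(key.secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(tag_b64)):
        raise ValueError("signature check failed")
    return json.loads(b64url_decode(payload_b64))
```

Because the tag is computed over the encoded header as well as the payload, swapping in a different header invalidates the tag, and a header whose "alg" disagrees with the key is rejected before the HMAC is ever computed.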

The full set of decisions made with cryptographers and implementers 15 years ago that resulted in the JWS<https://www.rfc-editor.org/rfc/rfc7515.html>, JWE<https://www.rfc-editor.org/rfc/rfc7516.html>, JWK<https://www.rfc-editor.org/rfc/rfc7517.html>, JWA<https://www.rfc-editor.org/rfc/rfc7518.html>, and JWT<https://www.rfc-editor.org/rfc/rfc7519.html> specs we have today is recorded here:

* JSON Token Spec Results at IIW on Tuesday<https://self-issued.info/?p=361>
* JSON Token Encryption Spec Results at IIW on Wednesday<https://self-issued.info/?p=378>
* JSON Token Naming Spec Results at IIW on Wednesday<https://self-issued.info/?p=386>
* JSON Public Key Spec Results at IIW on Thursday<https://self-issued.info/?p=390>

Many of the COSE decisions followed the JOSE precedents for the same reasons.

Best wishes,
-- Mike

From: lgl island-resort.com <[email protected]>
Sent: Wednesday, September 18, 2024 10:37 AM
To: Sophie Schmieg <[email protected]>
Cc: Orie Steele <[email protected]>; Hannes Tschofenig <[email protected]>; cose <[email protected]>
Subject: [COSE] Re: Thoughts about the Context Information in COSE

It seems OK and useful to put an algorithm ID in a header to help with processing efficiency on the receiving side. For example, the ID of the hash algorithm in a signature format can allow for one-pass processing. I think Illari referred to this as “pre-hashing”.

It’s OK to do lots of processing as long as the processor is robust and the data is not used or trusted until the receiving side work is complete, right?

What seems important is that, in the end, the ID of the algorithm came from a trusted source, probably out of band relative to the message. Possibly that is from a trusted data structure describing the key, but it doesn’t have to be.

Understood, that even “protected” headers aren’t to be trusted with blind faith.

LL

On Sep 17, 2024, at 11:35 AM, Sophie Schmieg <[email protected]<mailto:[email protected]>> wrote:

The main principle is to assume that everything in the token is crafted specifically to lie to you, unless you have been able to confirm it came from an honest party. This means that until you have verified the signature, any data in the token can only be used for the verification if you are entirely indifferent to which of the possible values it presents. This is unfortunately a bit more complicated than just saying "do not trust the header", but "do not trust the header" is a good first approximation.

The one example I know of header information safely being used in the verification is the use of a key ID, in the very specific scenario that the key ID allows the selection between multiple different equally trusted keys. I've written about that use case in a recent blog post [1]. If the keys are not equally trusted, this will allow the attacker to select the key that is least trusted. That still can be okay, if the downstream application takes that information into account and uses it for the authorization decisions it makes, but in my experience, this rarely happens correctly if it is not enforced by the JWT library. In other words, having a clear separation of identification, authentication, and authorization is a good idea.

For the algorithm header field, I am not aware of any reason to have that in the token (other than the fact that it historically has been there). To see why, we first need to look at the public key: The public key can never be part of the token, since it is trivial to create a token with a valid signature if the attacker gets to choose the public key; they can just create the public key themselves. But conceptually speaking, the public key includes the algorithm already; an RSA key, an ECDSA key, and an ML-DSA key are not interchangeable after all.
So if a token says it uses ECDSA as algorithm, but the public key that is supposed to be used for verification is an ML-DSA key, the token is clearly malformed, making the algorithm field a field that only ever has information that is either superfluous (you already knew it has to be ML-DSA because that is your public key's key type) or invalid (the algorithm field does not align with the public key). Therefore including the algorithm in the token is never useful.

But it gets worse: if the application is implemented in the wrong way, it will take the algorithm field of the token as authoritative and essentially reinterpret_cast the public key bytes to the type the header field suggested. This way, you get vulnerabilities casting, say, an ECDSA public key into an HMAC key, with the attacker now able to forge the MAC, since the public key is known. But even if you cast different public keys into each other, the results are undefined, and might very well be insecure. For example, if you cast an ML-DSA key into an ECDSA key (with the library truncating all the extra stuff), an adversary with a quantum computer has disabled your post-quantum protections, etc. Even just switching between two modes of the same algorithm (say RSA PKCS1 and RSA PSS or ECDSA and Schnorr signatures) is not guaranteed to be secure, since it might be possible to use an artifact obtained in one mode in the other mode, with the security analysis only ever looking at situations where all artifacts are created with a single mode.

Another important observation is that we cannot cure this problem by including the header in question in the signature. Since all these attacks are about manipulating the decisions leading up to the signature verification, the attacker either has already successfully abused them by the time the signature verifies, or they already failed at abusing them.
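The public-key-as-HMAC-key confusion described above can be sketched with the Python standard library alone. Everything here is illustrative: `forge`, `naive_verify`, and `strict_verify` are hypothetical names, the "public key" is just placeholder bytes, and real ECDSA verification is deliberately left out, since the attack never reaches it.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def hs256(key: bytes, signing_input: str) -> bytes:
    return hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()


def forge(public_key_bytes: bytes, claims: dict) -> str:
    """Attacker: label the token HS256 and MAC it with the *public* key bytes."""
    header = b64url(json.dumps({"alg": "HS256"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = header + "." + payload
    return signing_input + "." + b64url(hs256(public_key_bytes, signing_input))


def naive_verify(key_bytes: bytes, token: str) -> bool:
    """Broken: the token's header decides how key_bytes is interpreted."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    alg = json.loads(b64url_decode(header_b64))["alg"]
    if alg == "HS256":
        expected = hs256(key_bytes, header_b64 + "." + payload_b64)
        return hmac.compare_digest(expected, b64url_decode(sig_b64))
    raise NotImplementedError("asymmetric verification is omitted from this sketch")


def strict_verify(key_bytes: bytes, key_alg: str, token: str) -> bool:
    """The algorithm comes from the key; the header may only agree with it."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header_alg = json.loads(b64url_decode(header_b64)).get("alg")
    if header_alg != key_alg:
        return False  # rejected before any cryptographic operation
    if key_alg == "HS256":
        expected = hs256(key_bytes, header_b64 + "." + payload_b64)
        return hmac.compare_digest(expected, b64url_decode(sig_b64))
    raise NotImplementedError("asymmetric verification is omitted from this sketch")
```

With `pub` standing in for a published ECDSA key, `naive_verify(pub, forge(pub, {"admin": True}))` accepts the forgery, while `strict_verify(pub, "ES256", ...)` rejects the same token. The strict check gives the same answer whether or not the header is covered by the signature, since the decision is made before verification starts.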

Fields like the algorithm, which have only one possible valid value, can be verified to have that one value whether or not they are part of the signature; indeed, that is what Tink's JWT library does, for example [2]. Key IDs switching between equally trusted keys are implicitly verified, since a wrong value would switch to the wrong key, causing the signature to fail to verify.

Note that this does not extend to any possible header fields that are used after verification of the signature, just for header fields used in the decision making involved in signature verification. From a cryptographic standpoint, there is no difference between header and payload fields after verification; that is meaning ascribed to them by the application.

[1] https://bughunters.google.com/blog/6182336647790592/cryptographic-agility-and-key-rotation
[2] https://github.com/tink-crypto/tink-cc/blob/main/tink/jwt/internal/jwt_format.cc#L137

On Tue, Sep 17, 2024 at 9:11 AM Orie Steele <[email protected]<mailto:[email protected]>> wrote:

I agree with much of what you wrote.

Let's walk through an example of building an application layer protocol for HPKE to see where parameters show up, if we were designing from scratch and with 20/20 hindsight.

## HPKE Crypto Layer

recipientPublicKey, recipientPrivateKey = keyGen( ciphersuite )
contentCipherText, kemCipherText = encrypt(plaintext, recipientPublicKey)
recoveredPlaintext = decrypt(contentCipherText, kemCipherText, recipientPrivateKey)

HPKE has been built with the benefit of learning from ECDH-ES / KDFs / PartyU / PartyV. It internalizes a lot of things that we would have put in headers, previously.

However, you still need to convey contentCipherText, kemCipherText...
and handle errors that might be produced if kemCT is tampered with: https://datatracker.ietf.org/doc/html/rfc9180#base-crypto

## JOSE / COSE Application Protocol Layer

At this point, you are ready to consider protocol specific context information, the purpose of this step is to ensure that sender and receiver agree they are using COSE, or JOSE... with the assumption they are already supporting HPKE.

The first step is to construct a single message that contains both contentCipherText, kemCipherText ... it could use base64url and "." or cbor major types.

After this step the information conveyed is cborEnvelope or joseEnvelope... not contentCipherText, kemCipherText.

## Application Protocol Context Separation

Before encrypting or decrypting, sender and receiver need to agree to use a serialization and an hpke ciphersuite.

Here you can add protocol specific context separation:

- https://datatracker.ietf.org/doc/html/rfc9052#section-5.3
- https://datatracker.ietf.org/doc/html/rfc7516#section-3

JOSE and COSE go about this step differently... It's even more confusing because in JOSE AEADs are mandatory, whereas in COSE they are not...

The objective of this step is to commit some protocol information, into the encryption step... AEAD AAD is used where it can be... KDF context info can also be used here:

- https://datatracker.ietf.org/doc/html/rfc9053#name-context-information-structu

... in hindsight, this is a layer violation that forces both JOSE and COSE to maintain a separation between keys and algorithms... or if you want to think of it another way... it's the binding between algorithms and keys in both protocols.

... this is also the layer where we get "2 payloads", because in JOSE we have both the protected header and the payload... and you can put protocol parameters in either... Later this leads to JWT / CWT parameters in headers and payloads.

... it's inherited from ASN.1 supposedly... maintaining this design pattern is the "conservative approach", in that...
it's doing what we have "always done".

## Key Discovery

In the simple case that there is only 1 supported ciphersuite and each party only has 1 key, there is no need to communicate other information.

If there are multiple keys, the key that is being encrypted to needs to be identified, to avoid the recipient having to try all their keys.

At this stage we would add the key identifier as a parameter to the cborEnvelope or joseEnvelope.

There is never a need to convey the algorithm or ciphersuite... because they are always included in the key representation, so the key identifier explicitly communicates them.

In the pull request for ML-DSA key representations, we constructed a new key type for COSE and JOSE, called "algorithm key pair" : https://github.com/cose-wg/draft-ietf-cose-dilithium/pull/5/files

The algorithm property is mandatory for this key type, and the thumbprint is computed over it.

... some other comments

The fork in the road happens in "Application Protocol Context Separation"... this is where we see the AEAD differences and the context info differences... This is where we get protected header parameters... and where we first get our chance to put "algorithm information" in a "header parameter"...

Because of the design of JOSE and COSE, we are forced to take the same path through this step each time, and that is why we are always stuck handling algorithm identifiers and keys as separate things.

In JOSE "alg" is a mandatory header parameter... in COSE it is not... but COSE ends up making it mandatory in a different way, and enabling non-AEADs at the same time.

JOSE has alg none, which is also a problem at this layer of the design.

The counter argument to "don't put algorithms in headers" is "never use an algorithm which you do not trust" and "with a key it is not meant for"...
in code this means:

- restricting keys to specific algorithms (even though the specs do not mandate this)
- comparing algorithms in header to algorithms in keys (even though they are not required to be present in either)

I think time has shown that it would have been safer / simpler to just "not put algorithm identifiers in headers".

There is also the issue of bulk encryption / splitting key establishment and content encryption up... both JOSE and COSE do this, and it leads to "intermediate keys" and in JOSE, multiple algorithm identifiers in headers ("alg" and "enc"). JOSE could have shuffled things around like COSE did and avoided "enc" altogether... or internalized things like HPKE does... but JOSE came first.

... final thoughts

If I could wave a magic wand, I would 100% make algorithms part of keys, and make identifiers committing to keys, and handle the layering differently.

Regardless of the era in which these protocols were constructed, we have a responsibility to deprecate the parts of them that are problematic, and offer upgrade paths where possible.

For a recent example of this, see: https://datatracker.ietf.org/doc/html/draft-ietf-lamps-cms-cek-hkdf-sha256-04#name-use-of-of-hkdf-with-sha-256

COSE needs a draft that conceptually accomplishes the same thing.

New COSE work needs to account for attacks that were discovered after COSE was constructed; it can't just say "we've always done it this way".

If you got this far, thanks for reading.

OS

On Tue, Sep 17, 2024 at 3:33 AM <[email protected]<mailto:[email protected]>> wrote:

Hi all,

When I presented an update on the COSE HPKE draft at the last IETF meeting (see slides-120-cose-use-of-hpke-with-cose (ietf.org)<https://www.ietf.org/>), Sophie made an insightful remark that got me rethinking the construction of the context information. She noted, "you cannot trust the information in the headers", in response to my presentation.
This is particularly relevant because the current draft suggests placing all context information into the header so it is included in the authenticated data.

Ideally, when a recipient processes the message, the first step involves using the key ID to retrieve the key required to decrypt the payload (or identify the key used by the key exchange mechanism to derive the content encryption key). Best practices dictate that different keys should be used for different purposes, meaning there should be a one-to-one relationship between the key and the associated algorithm information. For instance, a key designated as a KEK for AES-KW should not be used directly for content encryption.

This implies that the parties involved in the communication should avoid including algorithm-related information in the message header. Instead, this information should be retrieved based on the key identifier. Thus, more than just the key ID and the key must be shared between the communicating parties; key-related metadata must also be exchanged.

If I understood Sophie correctly, the current approach of relying on header-based context information is not useful. We should reconsider why we are embedding all of this information in the header in the first place, as it may actually weaken security.

Ciao
Hannes

[1] Interestingly, I had already advocated for using the key ID to select all other parameters back in 2015.
See [COSE] alg Header Parameter (ietf.org)<https://mailarchive.ietf.org/arch/msg/cose/Ybou-lGY5C2DwYlorI8wRwxlmN0/>

_______________________________________________
COSE mailing list -- [email protected]<mailto:[email protected]>
To unsubscribe send an email to [email protected]<mailto:[email protected]>

--
ORIE STEELE
Chief Technology Officer
www.transmute.industries<http://www.transmute.industries/>

--
Sophie Schmieg | Information Security Engineer | ISE Crypto | [email protected]<mailto:[email protected]>
