Oh, I see. There actually is a new transformation here:
OLD: (json) --> UTF-8 --> base64
NEW: (json) --> UTF-8 --> json-string
I'm not totally convinced that this is actually a new thing, that is, that
JSON.stringify(protect) won't produce a JSON-safe string. In the JSON
parsers I'm looking at, the output of JSON.stringify() is a JSON-safe
string.
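To make the two pipelines concrete, here is a minimal sketch, assuming a Node.js environment (`Buffer` is Node-specific) and a hypothetical one-member header. The names `headerJson`, `oldField`, `newMessage` and `recovered` are illustrative only and do not come from any draft.

```javascript
// Hypothetical protected header; `protect` is the name used above.
var protect = { alg: "RS256" };
var headerJson = JSON.stringify(protect);

// OLD: (json) --> UTF-8 --> base64 (base64url alphabet, padding stripped)
var oldField = Buffer.from(headerJson, "utf8")
  .toString("base64")
  .replace(/\+/g, "-")
  .replace(/\//g, "_")
  .replace(/=+$/, "");

// NEW: (json) --> UTF-8 --> json-string; the serializer escapes the inner quotes
var newMessage = JSON.stringify({ protected: headerJson });

// Parsing the outer object hands back the original header text unchanged
var recovered = JSON.parse(newMessage).protected;

console.log(oldField);
console.log(newMessage);
console.log(recovered === headerJson);
```

In this sketch the escaping in the NEW case is done entirely by the stock serializer, which is the behavior the question above turns on.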
However, in the course of looking at that question, I came across another
thing that I think is a better reason to abandon this proposal. Namely,
the JSON parsers in both Chrome and Firefox (possibly others; haven't
tried) choke on parsing strings that have control characters. So while the
following line executes with no problem:
var x = {a:"one line \n two lines"}
calling JSON.parse(JSON.stringify(x)) results in either an error (Chrome)
or an incorrect parse (Firefox).
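For anyone who wants to reproduce this, it may help to separate two cases that are easy to conflate: the text that JSON.stringify() emits, where the newline is escaped, and hand-built JSON text that carries a raw LF inside a string. The sketch below is only a test harness under that assumption, and the `raw` input is hypothetical. As far as the ECMAScript JSON grammar goes, a raw control character inside a string is not allowed, so JSON.parse() is expected to throw on `raw`, while the escaped form is expected to round-trip. Results may therefore depend on how the string reaches the parser.

```javascript
var x = { a: "one line \n two lines" };

// JSON.stringify escapes the LF as the two characters backslash + n
var escaped = JSON.stringify(x);
var roundTrip = JSON.parse(escaped);

// Hypothetical hand-built JSON text: here \n in the JavaScript literal
// becomes a raw LF (U+000A) inside the JSON string
var raw = '{"a":"one line \n two lines"}';
var rawError = null;
try {
  JSON.parse(raw);
} catch (e) {
  rawError = e; // remember the failure so it can be inspected
}

console.log(escaped);
console.log(roundTrip.a === x.a);
console.log(rawError instanceof SyntaxError);
```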
So regardless of "canonicalization" concerns (which aren't really about
canonicalization), it doesn't seem to me that this is a change that would
be very deployable. Guess that answers Jim's question about "whether it's
even possible" :)
--Richard
On Mon, Jun 10, 2013 at 5:36 PM, Mike Jones <[email protected]> wrote:
> My point is that if you just base64url encode the string – like the one
> below – CRLFs and all, you don’t have to do any of the canonicalization
> steps to make the representation JSON-string-safe in the first place. And
> you’re using the same tool you’re already using for encoding every other
> piece of binary content that needs to be encoded in some way for JOSE.
>
> Otherwise, we also still have to worry about “innocuous” transformations
> that might occur during transport, like space to non-breaking-space, tab
> to spaces, spaces being added at the end of lines (which HTML mail
> transports are notorious for), etc.
>
> If we apply the KISS principle and use the same encoding in this case that
> we do in all other cases, we don’t have to even consider whether those and
> other content integrity threats that we may not have even thought of are
> going to hurt anyone in practice, because they can never arise.
>
> -- Mike
>
> *From:* Richard Barnes [mailto:[email protected]]
> *Sent:* Monday, June 10, 2013 2:30 PM
> *To:* Mike Jones
> *Cc:* Jim Schaad; [email protected]
> *Subject:* Re: [jose] Possible change to protected field
>
> Mike: They already have to do that stuff before they base64-encode it.
> All Jim is talking about is *not* doing the base64.
>
> On Mon, Jun 10, 2013 at 4:25 PM, Mike Jones <[email protected]> wrote:
>
> It’s a form of canonicalization when you take one representation of a
> string and apply a set of rules to transform it to a more
> stable/transmissible string. In this case, we’d be proposing that
> developers have to translate CRLF to \u000D\u000A, and other similar
> canonicalization transformations. That’s a kind of canonicalization.
> Worse, it’s one that I don’t believe there’s standard library support for,
> so developers would have to write it for every implementation and
> platform. They’d hate us, and with good reason.
>
> And this isn’t academic. It’s commonplace to use newlines in formatted
> JSON, such as:
>
> {"alg": "RSA-OAEP",
> "enc": "A256GCM",
> "kid": "7"
> }
>
> Occam’s Razor suggests always using the same encoding when a stable,
> transmissible string representation is needed – in this case base64url
> encoding.
>
> The somewhat improved readability of the canonicalization being discussed
> isn’t worth the interop problems or developer pain it would cost.
> Therefore, I sincerely hope that no one actually proposes this.
>
> -- Mike
>
> *From:* Jim Schaad [mailto:[email protected]]
> *Sent:* Monday, June 10, 2013 10:52 AM
> *To:* Mike Jones; 'Richard Barnes'
> *Cc:* [email protected]
> *Subject:* RE: [jose] Possible change to protected field
>
> I don’t understand why you think it takes us down the road to
> canonicalization. If there are canonicalization problems in this proposal I
> would say that they may already exist.
>
> I don’t know if you are reading the JSON list or not; however, you seem to
> be confusing the encoded and the unencoded versions of the string. The
> string does need to be encoded into a format that is legal to place into a
> JSON string field. In that case there are a number of ways that it can be
> done using quoting. However, when the JSON parser gives that string to the
> application – it will be the original UTF8 encoded string. This means that
> a bare LF character would be the “code point” 10. That is the UTF8
> encoding for a line feed character. It does not get canonicalized in any
> way. A UTF8 string does have possible code points that are not legal in a
> JSON string, but that is JSON’s problem, not ours. The JSON string encoding
> format is responsible for the string to text and text to string
> transformations and making sure that those transformations are lossless.
>
> I agree that it may be harder to do. I have not formally presented this as
> a change to the document and will not do so until I have done a full
> analysis on how hard it is. I am currently just trying to figure out if it
> is even technically feasible.
>
> Jim
>
> *From:* Mike Jones [mailto:[email protected]]
> *Sent:* Monday, June 10, 2013 10:09 AM
> *To:* Richard Barnes
> *Cc:* Jim Schaad; [email protected]
> *Subject:* RE: [jose] Possible change to protected field
>
> This would take us down the road to canonicalization. How would you
> represent a CRLF in the value? How would you represent a bare LF? How
> would you represent a tab? Canonical representations for all of these, and
> more, would have to be specified.
>
> Using a special encoding for this field that is different than the
> encoding used for all other fields (base64url encoding) is
> developer-hostile. It forces them to have (and test) special code to
> produce a different encoding for this field alone. This would undoubtedly
> result in interop problems, because the corner cases almost never get
> tested (and often never get coded correctly either).
>
> We already have a simple means of ensuring the correct transmission of
> fields. Occam’s Razor would suggest just using it.
>
> -- Mike
>
> *From:* Richard Barnes [mailto:[email protected]]
> *Sent:* Monday, June 10, 2013 10:01 AM
> *To:* Mike Jones
> *Cc:* Jim Schaad; [email protected]
> *Subject:* Re: [jose] Possible change to protected field
>
> I think you've misunderstood the proposal. It's not that the field will
> be unencoded, it's that it will be encoded as a UTF-8 string instead of a
> base64 string.
>
> OLD: (json) --> UTF-8 --> base64
> NEW: (json) --> UTF-8
>
> OLD: { protected: "eyJhbGciOiJSUzI1NiJ9Cg==" }
> NEW: { protected: "{\"alg\":\"RS256\"}" }
>
> Now, there might still be a problem with that, because JSON strings aren't
> canonical. In the example above, the quote characters could also have been
> presented as "\u0022". So if some JSON re-processor were to change the
> encoding of the string, it would break the signature. However, that's also
> a problem for the base64-encoded version ("a" vs. "\u0061").
>
> On Mon, Jun 10, 2013 at 12:38 PM, Mike Jones <[email protected]> wrote:
>
> The problem with this suggested change is that it requires unencoded JSON
> objects to be transmitted with no transformations whatsoever, whereas the
> problems with this assumption are well known. For starters, on UNIX-based
> systems, newlines are typically represented by a bare LF character, whereas
> on DOS-based systems, newlines are typically represented by a CRLF pair.
> In transmission, many systems, including mail agents, tend to convert
> newlines to the system’s normal format. This would break these JOSE
> objects.
>
> Some agents wrap lines after a certain length. Particularly when being
> transmitted in HTML or XML, many systems replace two or more consecutive
> spaces with a single space. Others replace two or more spaces with a
> single space and N-1 non-breaking space characters. I could go on…
>
> I appreciate the attempt to make things appear to be more uniform and
> readable, but the CRLF/LF problems alone are enough to make this a
> non-starter.
>
> Best wishes,
> -- Mike
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of*
> Richard Barnes
> *Sent:* Monday, June 10, 2013 9:29 AM
> *To:* Jim Schaad
> *Cc:* [email protected]
> *Subject:* Re: [jose] Possible change to protected field
>
> This sounds like a fine idea to me. It saves space and makes the JSON
> format more human-readable. It actually makes kind of a nice analogy to
> ASN.1, namely use of OCTET STRING to encapsulate more DER content.
>
> The compact serialization can continue to base64url-encode that field, so
> it would not be a breaking change for that serialization.
>
> --Richard
>
> On Mon, Jun 10, 2013 at 1:46 AM, Jim Schaad <[email protected]> wrote:
>
> <no hat>
>
> I am trying to figure out if I am missing something. This is not yet a
> formal proposal to actually change the document.
>
> I was thinking about proposing that we make a change to the content of the
> protected field in the JWS JSON serialization format. If we encoded this
> as a UTF8 string rather than the base64url encoded UTF8 string, then the
> content would be smaller. The computation of the signature would be
> unchanged in that it would still be computed over the base64url encoded
> string. I believe that the conversion from the UTF8 string to the
> base64url encoded UTF8 string is a deterministic encoding and thus would
> not generate any problems from that point.
>
> At this point I am trying to figure out if I missed anything that would
> preclude this from working. I am not worried about how hard or easy it
> would be to do, just if it is even possible.
>
> Jim
>
> _______________________________________________
> jose mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/jose