I don't understand why you think it takes us down the road to canonicalization. If there are canonicalization problems in this proposal, I would say that they may already exist.
I don't know if you are reading the JSON list or not; however, you seem to be confusing the encoded and the unencoded versions of the string. The string does need to be encoded into a format that is legal to place into a JSON string field, and there are a number of ways that this can be done using quoting. However, when the JSON parser gives that string to the application, it will be the original UTF8 encoded string. This means that a bare LF character would be the "code point" 10. That is the UTF8 encoding for a line feed character. It does not get canonicalized in any way.

A UTF8 string does have possible code points that are not legal in a JSON string, but that is JSON's problem, not ours. The JSON string encoding format is responsible for the string to text and text to string transformations and for making sure that those transformations are lossless.

I agree that it may be harder to do. I have not formally presented this as a change to the document and will not do so until I have done a full analysis of how hard it is. I am currently just trying to figure out if it is even technically feasible.

Jim

From: Mike Jones [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:09 AM
To: Richard Barnes
Cc: Jim Schaad; [email protected]
Subject: RE: [jose] Possible change to protected field

This would take us down the road to canonicalization. How would you represent a CRLF in the value? How would you represent a bare LF? How would you represent a tab? Canonical representations for all of these, and more, would have to be specified.

Using a special encoding for this field that is different than the encoding used for all other fields (base64url encoding) is developer-hostile. It forces them to have (and test) special code to produce a different encoding for this field alone. This would undoubtedly result in interop problems, because the corner cases almost never get tested (and often never get coded correctly either).
We already have a simple means of ensuring the correct transmission of fields. Occam's Razor would suggest just using it.

-- Mike

From: Richard Barnes [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:01 AM
To: Mike Jones
Cc: Jim Schaad; [email protected]
Subject: Re: [jose] Possible change to protected field

I think you've misunderstood the proposal. It's not that the field will be unencoded, it's that it will be encoded as a UTF-8 string instead of a base64 string.

OLD: (json) --> UTF-8 --> base64
NEW: (json) --> UTF-8

OLD: { protected: "eyJhbGciOiJSUzI1NiJ9Cg==" }
NEW: { protected: "{\"alg\":\"RS256\"}" }

Now, there might still be a problem with that, because JSON strings aren't canonical. In the example above, the quote characters could also have been presented as "\u0022". So if some JSON re-processor were to change the encoding of the string, it would break the signature. However, that's also a problem for the base64-encoded version ("a" vs. "\u0061").

On Mon, Jun 10, 2013 at 12:38 PM, Mike Jones <[email protected]> wrote:

The problem with this suggested change is that it requires unencoded JSON objects to be transmitted with no transformations whatsoever, whereas the problems with this assumption are well known.

For starters, on UNIX-based systems, newlines are typically represented by a bare LF character, whereas on DOS-based systems, newlines are typically represented by a CRLF pair. In transmission, many systems, including mail agents, tend to convert newlines to the system's normal format. This would break these JOSE objects. Some agents wrap lines after a certain length. Particularly when being transmitted in HTML or XML, many systems replace two or more consecutive spaces with a single space. Others replace two or more spaces with a single space and N-1 non-breaking space characters. I could go on.

I appreciate the attempt to make things appear to be more uniform and readable, but the CRLF/LF problems alone are enough to make this a non-starter.
Best wishes,
-- Mike

From: [email protected] [mailto:[email protected]] On Behalf Of Richard Barnes
Sent: Monday, June 10, 2013 9:29 AM
To: Jim Schaad
Cc: [email protected]
Subject: Re: [jose] Possible change to protected field

This sounds like a fine idea to me. It saves space and makes the JSON format more human-readable. It actually makes kind of a nice analogy to ASN.1, namely the use of OCTET STRING to encapsulate more DER content.

The compact serialization can continue to base64url-encode that field, so it would not be a breaking change for that serialization.

--Richard

On Mon, Jun 10, 2013 at 1:46 AM, Jim Schaad <[email protected]> wrote:

<no hat>

I am trying to figure out if I am missing something. This is not yet a formal proposal to actually change the document.

I was thinking about proposing that we make a change to the content of the protected field in the JWS JSON serialization format. If we encoded this as a UTF8 string rather than the base64url encoded UTF8 string, then the content would be smaller. The computation of the signature would be unchanged in that it would still be computed over the base64url encoded string. I believe that the conversion from the UTF8 string to the base64url encoded UTF8 string is a deterministic encoding and thus would not generate any problems from that point.

At this point I am trying to figure out if I missed anything that would preclude this from working. I am not worried about how hard or easy it would be to do, just if it is even possible.

Jim

_______________________________________________
jose mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/jose
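[Editorial illustration, not part of the original thread.] The point debated above (whether a "protected" value carried as a JSON string can be turned back into a stable base64url signing input) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the helper names b64url and signing_input_from_json_field are hypothetical, the header containing a line feed is made up for illustration, and unpadded base64url is assumed for the signing input. It is not the behaviour of any JOSE library.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """base64url-encode bytes, dropping '=' padding (hypothetical helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def signing_input_from_json_field(wire_json: str) -> str:
    """Parse the outer JSON, take the decoded 'protected' string,
    and re-encode it as base64url(UTF-8 bytes)."""
    protected_text = json.loads(wire_json)["protected"]
    return b64url(protected_text.encode("utf-8"))


# Two different JSON escapings of the same header text (\" versus \u0022).
wire_a = '{"protected": "{\\"alg\\":\\"RS256\\"}"}'
wire_b = '{"protected": "{\\u0022alg\\u0022:\\u0022RS256\\u0022}"}'

# Both decode to the same string, so the derived signing input is identical.
print(signing_input_from_json_field(wire_a))  # eyJhbGciOiJSUzI1NiJ9
print(signing_input_from_json_field(wire_a) == signing_input_from_json_field(wire_b))

# A made-up header containing a line feed. The JSON serializer carries the LF
# as the two-character escape \n, so the wire text holds no raw newline.
header_with_lf = '{"alg":\n"RS256"}'
wire_c = json.dumps({"protected": header_with_lf})
print("\n" in wire_c)
print(signing_input_from_json_field(wire_c) == b64url(header_with_lf.encode("utf-8")))

# If something did rewrite the decoded content (LF to CRLF), the signing input
# would change and a signature over it would no longer verify.
header_with_crlf = header_with_lf.replace("\n", "\r\n")
print(b64url(header_with_lf.encode("utf-8")) == b64url(header_with_crlf.encode("utf-8")))
```

The sketch only covers JSON-aware processing. It says nothing about intermediaries that edit the wire text without regard to JSON syntax (for example collapsing runs of spaces inside a string), which is the class of problem raised earlier in the thread.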
