It's a form of canonicalization when you take one representation of a string
and apply a set of rules to transform it to a more stable/transmissible string.
In this case, we'd be proposing that developers have to translate CRLF to
\u000D\u000A, and other similar canonicalization transformations. That's a
kind of canonicalization. Worse, it's one that I don't believe there's
standard library support for, so developers would have to write it for every
implementation and platform. They'd hate us, and with good reason.
And this isn't academic. It's commonplace to use newlines in formatted JSON,
such as:
{"alg": "RSA-OAEP",
"enc": "A256GCM",
"kid": "7"
}
Occam's Razor suggests always using the same encoding when a stable,
transmissible string representation is needed - in this case base64url encoding.
The somewhat improved readability of the canonicalization being discussed isn't
worth the interop problems or developer pain it would cost. Therefore, I
sincerely hope that no one actually proposes this.
-- Mike
From: Jim Schaad [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:52 AM
To: Mike Jones; 'Richard Barnes'
Cc: [email protected]
Subject: RE: [jose] Possible change to protected field
I don't understand why you think it takes us down the road to canonicalization.
If there are canonicalization problems in this proposal I would say that they may
already exist.
I don't know if you are reading the JSON list or not, however you seem to be
confusing the encoded and the unencoded versions of the string. The string
does need to be encoded into a format that is legal to place into a JSON string
field. In that case there are a number of ways that it can be done using
quoting. However when the JSON parser gives that string to the application -
it will be the original UTF8 encoded string. This means that a bare LF
character would be the "code point" 10. That is the UTF8 encoding for a line
feed character. It does not get canonicalized in any way. A UTF8 string does
have possible code points that are not legal in a JSON string, but that is
JSON's problem, not ours. The JSON string encoding format is responsible for the
string to text and text to string transformations and making sure that those
transformations are loss-less.
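
The round trip described above can be sketched in a few lines of Python. This
is only an illustration using the standard json module; the header text and
the "protected" member name are example values, not anything taken from the
draft.

```python
import json

# Illustrative header text containing a CRLF, a bare LF, and a tab.
original = '{"alg": "RS256",\r\n"kid": "7"\n\t}'

# Placing it in a JSON string escapes the control characters on the wire...
wire = json.dumps({"protected": original})

# ...and parsing hands the application the same code points back.
decoded = json.loads(wire)["protected"]

assert "\r" not in wire and "\n" not in wire and "\t" not in wire
assert decoded == original
assert [ord(c) for c in decoded[16:18]] == [13, 10]  # CR, LF preserved
```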
I agree that it may be harder to do. I have not formally presented this as a
change to the document and will not do so until I have done a full analysis on
how hard it is. I am currently just trying to figure out if it is even
technically feasible.
Jim
From: Mike Jones [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:09 AM
To: Richard Barnes
Cc: Jim Schaad; [email protected]
Subject: RE: [jose] Possible change to protected field
This would take us down the road to canonicalization. How would you represent
a CRLF in the value? How would you represent a bare LF? How would you
represent a tab? Canonical representations for all of these, and more, would
have to be specified.
Using a special encoding for this field that is different than the encoding
used for all other fields (base64url encoding) is developer-hostile. It forces
them to have (and test) special code to produce a different encoding for this field
alone. This would undoubtedly result in interop problems, because the corner
cases almost never get tested (and often never get coded correctly either).
We already have a simple means of ensuring the correct transmission of fields.
Occam's Razor would suggest just using it.
-- Mike
From: Richard Barnes [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:01 AM
To: Mike Jones
Cc: Jim Schaad; [email protected]
Subject: Re: [jose] Possible change to protected field
I think you've misunderstood the proposal. It's not that the field will be
unencoded, it's that it will be encoded as a UTF-8 string instead of a base64
string.
OLD: (json) --> UTF-8 --> base64
NEW: (json) --> UTF-8
OLD: { protected: "eyJhbGciOiJSUzI1NiJ9Cg==" }
NEW: { protected: "{\"alg\":\"RS256\"}" }
Now, there might still be a problem with that, because JSON strings aren't
canonical. In the example above, the quote characters could also have been
presented as "\u0022". So if some JSON re-processor were to change the
encoding of the string, it would break the signature. However, that's also a
problem for the base64-encoded version ("a" vs. "\u0061").
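
A small Python sketch of that point, assuming the standard json module and
properly quoted member names: the wire text differs between the escaped forms,
but a parser returns the same string for each, in both the NEW and the OLD
style. The "\u0065" escape for the leading "e" is just an example analogous to
"a" vs. "\u0061".

```python
import json

# Two wire forms of the NEW-style field; only the escaping of '"' differs.
new_a = r'{"protected": "{\"alg\":\"RS256\"}"}'
new_b = r'{"protected": "{\u0022alg\u0022:\u0022RS256\u0022}"}'

# Two wire forms of the OLD-style (base64) field; "e" is also sent as \u0065.
old_a = r'{"protected": "eyJhbGciOiJSUzI1NiJ9Cg=="}'
old_b = r'{"protected": "\u0065yJhbGciOiJSUzI1NiJ9Cg=="}'

# The wire bytes differ...
assert new_a != new_b and old_a != old_b

# ...but the parsed string values are identical.
assert json.loads(new_a)["protected"] == json.loads(new_b)["protected"]
assert json.loads(old_a)["protected"] == json.loads(old_b)["protected"]
```

So anything computed over the wire text is fragile in both styles, while
anything computed over the parsed value is not.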
On Mon, Jun 10, 2013 at 12:38 PM, Mike Jones
<[email protected]> wrote:
The problem with this suggested change is that it requires unencoded JSON
objects to be transmitted with no transformations whatsoever, whereas the
problems with this assumption are well known. For starters, on UNIX-based
systems, newlines are typically represented by a bare LF character, whereas on
DOS-based systems, newlines are typically represented by a CRLF pair. In
transmission, many systems, including mail agents, tend to convert newlines to
the system's normal format. This would break these JOSE objects.
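
As a rough Python illustration of the newline concern, assuming the signature
is computed over the exact bytes of the text: converting LF to CRLF changes
the bytes, so both the base64url form and any digest over them change. The
b64url helper and the header value are illustrative only.

```python
import base64
import hashlib

def b64url(data: bytes) -> str:
    """base64url-encode without padding (illustrative helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

# The same formatted header with UNIX and DOS line endings.
unix_form = b'{"alg": "RSA-OAEP",\n"enc": "A256GCM",\n"kid": "7"\n}'
dos_form = unix_form.replace(b"\n", b"\r\n")

# A newline-converting agent turns one into the other; nothing derived
# from the bytes survives that conversion.
assert b64url(unix_form) != b64url(dos_form)
assert hashlib.sha256(unix_form).digest() != hashlib.sha256(dos_form).digest()
```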
Some agents wrap lines after a certain length. Particularly when being
transmitted in HTML or XML, many systems replace two or more consecutive spaces
with a single space. Others replace two or more spaces with a single space and
N-1 non-breaking space characters. I could go on...
I appreciate the attempt to make things appear to be more uniform and readable,
but the CRLF/LF problems alone are enough to make this a non-starter.
Best wishes,
-- Mike
From: [email protected] [mailto:[email protected]] On Behalf Of
Richard Barnes
Sent: Monday, June 10, 2013 9:29 AM
To: Jim Schaad
Cc: [email protected]
Subject: Re: [jose] Possible change to protected field
This sounds like a fine idea to me. It saves space and makes the JSON format
more human-readable. It actually makes kind of a nice analogy to ASN.1, namely
use of OCTET STRING to encapsulate more DER content.
The compact serialization can continue to base64url-encode that field, so it
would not be a breaking change for that serialization.
--Richard
On Mon, Jun 10, 2013 at 1:46 AM, Jim Schaad
<[email protected]> wrote:
<no hat>
I am trying to figure out if I am missing something. This is not yet a formal
proposal to actually change the document.
I was thinking about proposing that we make a change to the content of the
protected field in the JWS JSON serialization format. If we encoded this as a
UTF8 string rather than the base64url encoded UTF8 string, then the content
would be smaller. The computation of the signature would be unchanged in that
it would still be computed over the base64url encoded string. I believe that
the conversion from the UTF8 string to the base64url encoded UTF8 string is a
deterministic encoding and thus would not generate any problems from that point.
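
A minimal Python sketch of the idea, under assumptions: the signing input is
taken to be base64url(protected) + "." + base64url(payload), and the names
b64url, signing_input, and the example payload are hypothetical. It only shows
that the verifier can rebuild the same signing input from either form of the
field; it does not perform an actual signature.

```python
import base64
import json

def b64url(data: bytes) -> str:
    """base64url-encode without padding (illustrative helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def signing_input(protected_text: str, payload: bytes) -> bytes:
    """Rebuild the signing input from the plain UTF-8 protected text."""
    encoded = b64url(protected_text.encode("utf-8"))
    return (encoded + "." + b64url(payload)).encode("ascii")

header_text = '{"alg":"RS256"}'
payload = b"example payload"

# Current form: the field carries the base64url-encoded header.
old_msg = json.dumps({"protected": b64url(header_text.encode("utf-8"))})
# Form under discussion: the field carries the header as a UTF-8 string.
new_msg = json.dumps({"protected": header_text})

old_protected = json.loads(old_msg)["protected"]  # already base64url
new_protected = json.loads(new_msg)["protected"]  # plain text

old_input = (old_protected + "." + b64url(payload)).encode("ascii")
new_input = signing_input(new_protected, payload)

assert old_input == new_input
```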
At this point I am trying to figure out if I missed anything that would
preclude this from working. I am not worried about how hard or easy it would
be to do, just if it is even possible.
Jim
_______________________________________________
jose mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/jose