Re: [jose] Possible change to protected field

Mike Jones Mon, 10 Jun 2013 13:25:49 -0700

It's a form of canonicalization when you take one representation of a string 
and apply a set of rules to transform it to a more stable/transmissible string. 
 In this case, we'd be proposing that developers have to translate CRLF to 
\u000D\u000A, and other similar canonicalization transformations.  That's a 
kind of canonicalization.  Worse, it's one that I don't believe there's 
standard library support for, so developers would have to write it for every 
implementation and platform.  They'd hate us, and with good reason.


And this isn't academic.  It's commonplace to use newlines in formatted JSON, 
such as:

{"alg": "RSA-OAEP",
"enc": "A256GCM",
"kid": "7"
}

Occam's Razor suggests always using the same encoding when a stable, 
transmissible string representation is needed - in this case base64url encoding.
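
A minimal Python sketch of that point, assuming the formatted header above is 
the value being carried.  The helper name b64url is hypothetical; only the 
standard library is used.  It shows that the encoded form uses an alphabet with 
no whitespace in it, so there is nothing in it for a newline-translating 
transport to rewrite.

```python
import base64


def b64url(data: bytes) -> str:
    """base64url-encode and drop the trailing '=' padding (hypothetical helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# The formatted header from the message, once with LF and once with CRLF newlines.
header_lf = '{"alg": "RSA-OAEP",\n"enc": "A256GCM",\n"kid": "7"\n}'
header_crlf = header_lf.replace("\n", "\r\n")

encoded_lf = b64url(header_lf.encode("utf-8"))
encoded_crlf = b64url(header_crlf.encode("utf-8"))

# The base64url alphabet: letters, digits, '-' and '_'. No CR, LF, tab or space.
alphabet = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_")

# Both encodings stay inside that alphabet even though the inputs contain newlines.
assert set(encoded_lf) <= alphabet
assert set(encoded_crlf) <= alphabet

# The two inputs are different octet strings, so their encodings differ too;
# the encoding preserves whichever newline convention the sender used.
assert encoded_lf != encoded_crlf
```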

The somewhat improved readability of the canonicalization being discussed isn't 
worth the interop problems or developer pain it would cost.  Therefore, I 
sincerely hope that no one actually proposes this.

                                                            -- Mike

From: Jim Schaad [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:52 AM
To: Mike Jones; 'Richard Barnes'
Cc: [email protected]
Subject: RE: [jose] Possible change to protected field

I don't understand why you think it takes us down the road to canonicalization. 
 If there are canonicalization problems in this proposal I would say that they may 
already exist.

I don't know if you are reading the JSON list or not, however you seem to be 
confusing the encoded and the unencoded versions of the string.  The string 
does need to be encoded into a format that is legal to place into a JSON string 
field.  In that case there are a number of ways that it can be done using 
quoting.  However when the JSON parser gives that string to the application - 
it will be the original UTF8 encoded string.  This means that a bare LF 
character would be the "code point" 10.  That is the UTF8 encoding for a line 
feed character.  It does not get canonicalized in any way.  A UTF8 string does 
have possible code points that are not legal in a JSON string, but that is 
JSON's problem, not ours.  The JSON string encoding format is responsible for the 
string to text and text to string transformations and making sure that those 
transformations are loss-less.
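
A small Python sketch of the encoded versus unencoded distinction, assuming a 
standard JSON parser (here the json module from the Python standard library). 
The sample strings are made up for illustration.

```python
import json

# Three JSON texts that write a line feed with different escapes.
variants = ['"a\\nb"', '"a\\u000ab"', '"a\\u000Ab"']
decoded = [json.loads(text) for text in variants]

# After parsing, every variant is the same three-character string with
# code point 10 in the middle; the choice of escape is no longer visible.
assert all(value == "a\nb" for value in decoded)
assert ord(decoded[0][1]) == 10

# A raw, unescaped LF inside a JSON string is not legal JSON, so a strict
# parser rejects it rather than passing it through.
try:
    json.loads('"a\nb"')
    raw_lf_accepted = True
except json.JSONDecodeError:
    raw_lf_accepted = False

assert raw_lf_accepted is False
```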

I agree that it may be harder to do, I have not formally presented this as a 
change to the document and will not do so until I have done a full analysis on 
how hard it is.  I am currently just trying to figure out if it is even 
technically feasible.

Jim


From: Mike Jones [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:09 AM
To: Richard Barnes
Cc: Jim Schaad; [email protected]<mailto:[email protected]>
Subject: RE: [jose] Possible change to protected field

This would take us down the road to canonicalization.  How would you represent 
a CRLF in the value?  How would you represent a bare LF?  How would you 
represent a tab?  Canonical representations for all of these, and more, would 
have to be specified.

Using a special encoding for this field that is different than the encoding 
used for all other fields (base64url encoding) is developer-hostile.  It forces 
them to have (and test) special code to produce a different encoding for this field 
alone.  This would undoubtedly result in interop problems, because the corner 
cases almost never get tested (and often never get coded correctly either).

We already have a simple means of ensuring the correct transmission of fields.  
Occam's Razor would suggest just using it.

                                                                -- Mike

From: Richard Barnes [mailto:[email protected]]
Sent: Monday, June 10, 2013 10:01 AM
To: Mike Jones
Cc: Jim Schaad; [email protected]<mailto:[email protected]>
Subject: Re: [jose] Possible change to protected field

I think you've misunderstood the proposal.  It's not that the field will be 
unencoded, it's that it will be encoded as a UTF-8 string instead of a base64 
string.

OLD: (json) --> UTF-8 --> base64
NEW: (json) --> UTF-8

OLD:  { protected: "eyJhbGciOiJSUzI1NiJ9Cg==" }
NEW: { protected: "{\"alg\":\"RS256\"}" }

Now, there might still be a problem with that, because JSON strings aren't 
canonical.  In the example above, the quote characters could also have been 
presented as "\u0022".  So if some JSON re-processor were to change the 
encoding of the string, it would break the signature.  However, that's also a 
problem for the base64-encoded version ("a" vs. "\u0061").
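
A Python sketch of the OLD and NEW forms above, under two assumptions: the 
member name is quoted so that the text is valid JSON, and the variable names 
are hypothetical.  It shows that two different escapings of the NEW form are 
different wire texts but parse to the same string, and that the same holds for 
"a" vs. "\u0061".

```python
import base64
import json

# OLD and NEW forms from the message, with the member name quoted.
old_wire = '{"protected": "eyJhbGciOiJSUzI1NiJ9Cg=="}'
new_wire = '{"protected": "{\\"alg\\":\\"RS256\\"}"}'
# Same NEW value, but with each quote written as \u0022 instead of \".
new_wire_alt = '{"protected": "{\\u0022alg\\u0022:\\u0022RS256\\u0022}"}'

# OLD: parse the JSON, then base64-decode the string to recover the header.
# Note that this particular example value carries a trailing LF octet.
old_value = base64.b64decode(json.loads(old_wire)["protected"]).decode("utf-8")
assert old_value == '{"alg":"RS256"}\n'

# NEW: the parsed string is the header text itself.
new_value = json.loads(new_wire)["protected"]
new_value_alt = json.loads(new_wire_alt)["protected"]

# The wire texts differ, but the parsed strings are identical.
assert new_wire != new_wire_alt
assert new_value == new_value_alt == '{"alg":"RS256"}'

# The same kind of re-escaping is possible for base64 text: "a" vs "\u0061".
assert json.loads('"a"') == json.loads('"\\u0061"') == "a"
```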




On Mon, Jun 10, 2013 at 12:38 PM, Mike Jones 
<[email protected]<mailto:[email protected]>> wrote:
The problem with this suggested change is that it requires unencoded JSON 
objects to be transmitted with no transformations whatsoever, whereas the 
problems with this assumption are well known.  For starters, on UNIX-based 
systems, newlines are typically represented by a bare LF character, whereas on 
DOS-based systems, newlines are typically represented by a CRLF pair.  In 
transmission, many systems, including mail agents tend to convert newlines to 
the system's normal format.  This would break these JOSE objects.
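
A minimal Python sketch of that failure mode.  It assumes, purely for 
illustration, an HMAC-SHA256 integrity check computed directly over the 
unencoded octets; the key, the header value and the helper name mac are all 
hypothetical.

```python
import hashlib
import hmac

key = b"hypothetical-demo-key"  # illustration only, not a real key


def mac(data: bytes) -> bytes:
    """HMAC-SHA256 over the exact octets given (hypothetical helper)."""
    return hmac.new(key, data, hashlib.sha256).digest()


# Sender on a UNIX-style system: newlines are bare LF.
sent = b'{"alg": "HS256",\n"kid": "7"\n}'
tag = mac(sent)

# An intermediary converts newlines to the DOS-style CRLF pair.
received = sent.replace(b"\n", b"\r\n")

# The untouched octets still verify; the converted octets do not.
assert hmac.compare_digest(mac(sent), tag)
assert not hmac.compare_digest(mac(received), tag)
```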

Some agents wrap lines after a certain length.  Particularly when being 
transmitted in HTML or XML, many systems replace two or more consecutive spaces 
with a single space.  Others replace two or more spaces with a single space and 
N-1 non-breaking space characters.  I could go on...

I appreciate the attempt to make things appear to be more uniform and readable, 
but the CRLF/LF problems alone are enough to make this a non-starter.

                                                                Best wishes,
                                                                -- Mike

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]<mailto:[email protected]>] On Behalf Of 
Richard Barnes
Sent: Monday, June 10, 2013 9:29 AM
To: Jim Schaad
Cc: [email protected]<mailto:[email protected]>
Subject: Re: [jose] Possible change to protected field

This sounds like a fine idea to me.  It saves space and makes the JSON format 
more human-readable.  It actually makes kind of a nice analogy to ASN.1, namely 
use of OCTET STRING to encapsulate more DER content.

The compact serialization can continue to base64url-encode that field, so it 
would not be a breaking change for that serialization.

--Richard

On Mon, Jun 10, 2013 at 1:46 AM, Jim Schaad 
<[email protected]<mailto:[email protected]>> wrote:
<no hat>

I am trying to figure out if I am missing something.  This is not yet a formal 
proposal to actually change the document.

I was thinking about proposing that we make a change to the content of the 
protected field in the JWS JSON serialization format.  If we encoded this as a 
UTF8 string rather than the base64url encoded UTF8 string, then the content 
would be smaller.  The computation of the signature would be unchanged in that 
it would still be computed over the base64url encoded string.  I believe that 
the conversion from the UTF8 string to the base64url encoded UTF8 string is a 
deterministic encoding and thus would not generate any problems from that point.
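
A Python sketch of the idea as described here, not of any adopted design.  It 
assumes the protected field arrives as an ordinary JSON string and that the 
value fed to the signature computation is the base64url encoding of its UTF-8 
octets; the function names are hypothetical.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """base64url-encode and drop the trailing '=' padding (hypothetical helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def protected_for_signing(wire_text: str) -> str:
    """Parse the JSON text, then re-derive the base64url form of 'protected'."""
    protected = json.loads(wire_text)["protected"]
    return b64url(protected.encode("utf-8"))


# Two wire texts that escape the quote characters differently.
wire_a = '{"protected": "{\\"alg\\":\\"RS256\\"}"}'
wire_b = '{"protected": "{\\u0022alg\\u0022:\\u0022RS256\\u0022}"}'

# Both parse to the same string, so the derived base64url value is the same.
assert protected_for_signing(wire_a) == "eyJhbGciOiJSUzI1NiJ9"
assert protected_for_signing(wire_b) == "eyJhbGciOiJSUzI1NiJ9"
```

Because the derived value depends only on the parsed string, the determinism 
claim above holds as long as the JSON parser hands back the original string 
unchanged.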

At this point I am trying to figure out if I missed anything that would 
preclude this from working.  I am not worried about how hard or easy it would 
be to do, just if it is even possible.

Jim


_______________________________________________
jose mailing list
[email protected]<mailto:[email protected]>
https://www.ietf.org/mailman/listinfo/jose
