I thought the concern was that some intermediary would get a JOSE object, deserialize it, then serialize it back to a different form. It seems like that corresponds to JSON.stringify(JSON.parse(x)).
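A quick sketch of that round trip in Node.js (the header value here is a made-up example, not one taken from the thread):

```javascript
// An intermediary parses a JOSE object and re-serializes it. The semantics
// survive, but the exact bytes may not -- and the bytes are what a
// signature covers.
const original = '{"alg": "RSA-OAEP",\n"enc": "A256GCM"}'; // hypothetical header
const roundTripped = JSON.stringify(JSON.parse(original));

// JSON.stringify drops the insignificant whitespace, so the two forms differ.
console.log(roundTripped);              // {"alg":"RSA-OAEP","enc":"A256GCM"}
console.log(original === roundTripped); // false
```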
On Tue, Jun 11, 2013 at 11:10 AM, Manger, James H <[email protected]> wrote:

> I think you are doing it in the wrong order. The issue is whether
> JSON.parse(JSON.stringify(x))
> returns the same string.
>
> --
> James Manger
>
> ----- Reply message -----
> From: "Richard Barnes" <[email protected]>
> Date: Wed, Jun 12, 2013 12:10 am
> Subject: [jose] Possible change to protected field
> To: "Manger, James H" <[email protected]>
> Cc: "Mike Jones" <[email protected]>, "Jim Schaad" <[email protected]>, "[email protected]" <[email protected]>
>
> On Mon, Jun 10, 2013 at 9:45 PM, Manger, James H <[email protected]> wrote:
>
>>> I was thinking about proposing that we make a change to the content of
>>> the protected field in the JWS JSON serialization format. If we encoded
>>> this as a UTF8 string rather than the base64url-encoded UTF8 string,
>>> then the content would be smaller.
>
> This would work. Mike's canonicalization worries are misplaced. Richard's
> Chrome/Firefox "bug" seems wrong.
>
> Example: a JOSE originator creates the following header, with U+000A for
> line endings.
>
> {"alg": "RSA-OAEP",
> "enc": "A256GCM",
> "kid": "7"
> }
>
> This header consists of 19+1+17+1+10+1+1+1 = 51 characters (including 3
> spaces and 4 newlines). Put these 51 chars as the value of the "protected"
> field:
>
> {"protected":"{\"alg\": \"RSA-OAEP\",\n\"enc\": \"A256GCM\",\n\"kid\": \"7\"\n}\n"}
>
> They appear as 51+16 = 67 chars inside the quotes of the "protected"
> field: the 4 newlines are replaced with 4 "n" chars, and 16 backslashes
> are added. This step could have used \u0022 instead of \", or \u000A or
> \u000a instead of \n, or escaped any of the other 35 chars. But that
> doesn't matter, since this escaping will be removed by the first JSON
> parse at the other end, which will give the original 51 chars to its
> caller (in whatever native string type the language uses).
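James's character arithmetic can be verified mechanically; the following Node.js sketch reproduces his 51- and 67-character counts (the variable names are mine):

```javascript
// James's header, with U+000A line endings as described above.
const header = '{"alg": "RSA-OAEP",\n"enc": "A256GCM",\n"kid": "7"\n}\n';
console.log(header.length); // 51

// Embed the raw header as the value of "protected" (no base64url step).
const wrapper = JSON.stringify({ protected: header });

// Inside the quotes, the 4 newlines become \n and the 12 inner quotes gain
// backslashes: 51 + 16 = 67 escaped characters.
const inner = wrapper.slice('{"protected":"'.length, -'"}'.length);
console.log(inner.length); // 67

// The first JSON parse at the recipient restores the original 51 chars.
console.log(JSON.parse(wrapper).protected === header); // true
```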
> If the originator used CRLF instead (giving 55 chars), the transmitted
> data will be {"protected":"...\r\n...\r\n...\r\n...\r\n"}. The first JSON
> parse at the recipient will recreate the correct 55 chars. No problem.
>
> There is plenty of potential to confuse a human doing this manually, as
> there are 2 layers of JSON string escaping. It is well-defined enough for
> a computer (or a spec), though.
>
> As to whether this is worthwhile...
> In this specific example the base64url overhead would be +17 chars,
> merely 1 more char than the overhead of omitting the base64url step.
>
> A more valuable change would be to apply the crypto operations AFTER
> removing the base64url encoding. That is, apply the crypto to a
> UTF-8-encoded JSON header, instead of to the US-ASCII encoding of the
> base64url-encoded UTF-8-encoded JSON header. Implementations always need
> the UTF-8 form at some point anyway. It means the crypto is applied to
> fewer bytes, so it is faster. It also means that an idea like Jim's,
> which doesn't need base64url to maintain the correct bytes in
> transmission, doesn't need to artificially add a base64url-encoding step
> at the receiver just to do the crypto.
>
> P.S.
>
>> the JSON parsers in both Chrome and Firefox (possibly others; haven't
>> tried) choke on parsing strings that have control characters
>
> JSON does not allow unescaped control chars when representing strings, so
> choking is the correct behaviour. JSON.parse(JSON.stringify(x)) works for
> me in Chrome. JSON.stringify does NOT leave unescaped control chars in
> quotes in the result.
>
> I have heard multiple independent reports of my incorrectness in this
> experiment, so I retract my assertion of infeasibility with regard to
> control characters.
>
> However, there's still an issue with code points that don't have to be
> escaped.
> Consider the following example:
>
> var x = '{"a":"caf\\u00e9"}';
> var x2 = JSON.stringify(JSON.parse(x))
>
> In a quick test of the environments I have on hand (several browsers +
> Node.js + PHP + Perl), I see a wide variety of behaviors, most of which
> would break the signature. The browsers and Node all converted the
> "\u00e9" sequence to an unescaped "é" character, while Perl (creatively)
> rendered it directly as a Latin-1 0xe9 byte. Only PHP rendered it back to
> the escape sequence. Test code here:
> <http://www.ipv.sx/jose/json_test.html>
>
> Now, even if we think that re-encoding is a problem, there's the question
> of whether we think the risk is acceptable to achieve better human
> readability. I'm pretty split, myself.
>
> --Richard
>
> --
> James Manger
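Richard's \u00e9 test is easy to reproduce in Node.js; as reported above, V8 re-serializes the escape as a literal é, so the bytes change and any signature computed over x would no longer verify:

```javascript
// The wire form contains the escape sequence \u00e9, not a literal é.
var x = '{"a":"caf\\u00e9"}';
var x2 = JSON.stringify(JSON.parse(x));

// Node (like the browsers in Richard's test) emits the unescaped character,
// so the round-tripped string no longer matches the original byte-for-byte.
console.log(x2);       // {"a":"café"}
console.log(x === x2); // false

// The parsed values are still equal; only the serialization changed.
console.log(JSON.parse(x).a === JSON.parse(x2).a); // true
```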
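James's control-character point upthread can also be checked directly: JSON.parse rejects a raw control character, but JSON.stringify never emits one, so that particular round trip is safe. A small sketch:

```javascript
// A JSON string containing a raw (unescaped) U+0001 control character.
const raw = '"a\u0001b"';
let choked = false;
try {
  JSON.parse(raw); // the JSON grammar forbids unescaped control chars, so this throws
} catch (e) {
  choked = true;
}
console.log(choked); // true: "choking is the correct behaviour"

// JSON.stringify escapes the control character, so parse succeeds afterwards.
const safe = JSON.stringify('a\u0001b');
console.log(JSON.parse(safe) === 'a\u0001b'); // true
```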
_______________________________________________
jose mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/jose
