On Tue, Jun 11, 2013 at 1:12 PM, Jim Schaad <[email protected]> wrote:
> From: Richard Barnes [mailto:[email protected]]
> Sent: Tuesday, June 11, 2013 7:11 AM
> To: Manger, James H
> Cc: Mike Jones; Jim Schaad; [email protected]
> Subject: Re: [jose] Possible change to protected field
>
> On Mon, Jun 10, 2013 at 9:45 PM, Manger, James H
> <[email protected]> wrote:
>
>>> I was thinking about proposing that we make a change to the content of
>>> the protected field in the JWS JSON serialization format. If we encoded
>>> this as a UTF8 string rather than the base64url encoded UTF8 string,
>>> then the content would be smaller.
>>
>> This would work. Mike's canonicalization worries are misplaced.
>> Richard's Chrome/Firefox "bug" seems wrong.
>>
>> Example: a JOSE originator creates the following header, with U+000A for
>> line endings.
>>
>> {"alg": "RSA-OAEP",
>> "enc": "A256GCM",
>> "kid": "7"
>> }
>>
>> This header consists of 19+1+17+1+10+1+1+1 = 51 characters (including
>> 3 spaces and 4 newlines). Put these 51 chars as the value of the
>> "protected" field:
>>
>> {"protected":"{\"alg\": \"RSA-OAEP\",\n\"enc\": \"A256GCM\",\n\"kid\":
>> \"7\"\n}\n"}
>>
>> They appear as 51+16 = 67 chars inside the quotes of the "protected"
>> field: 4 newlines are replaced with 4 "n" chars, plus 16 backslashes are
>> added. This step could have used \u0022 instead of \", or \u000A or
>> \u000a instead of \n, or escaped any of the other 35 chars. But that
>> doesn't matter, since this escaping will be removed by the first JSON
>> parse at the other end, which gives the original 51 chars to its caller
>> (in whatever native string type the language uses).
>>
>> If the originator used CRLF instead (giving 55 chars), the transmitted
>> data would be {"protected":"...\r\n...\r\n...\r\n...\r\n"}. The first
>> JSON parse at the recipient will recreate the correct 55 chars. No
>> problem.
>>
>> There is plenty of potential to confuse a human doing this manually, as
>> there are 2 layers of JSON string escaping.
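[Aside: the two escaping layers are easy to check mechanically. A quick Node sketch using the 51-char header above; the lengths are the ones James computes:]

```javascript
// James's 51-char header, with U+000A (\n) line endings.
var header = '{"alg": "RSA-OAEP",\n"enc": "A256GCM",\n"kid": "7"\n}\n';
console.log(header.length); // 51

// Embedding it as the "protected" value adds the second escaping layer:
// 12 quotes become \" and 4 newlines become \n, i.e. 16 backslashes added.
var wrapper = JSON.stringify({ protected: header });
console.log(JSON.stringify(header).length - 2); // 67 chars inside the quotes

// The first JSON parse at the recipient removes that layer exactly,
// handing the caller back the original 51 chars.
var recovered = JSON.parse(wrapper).protected;
console.log(recovered === header); // true
```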
>> It is well-defined enough for a computer (or spec) though.
>>
>> As to whether this is worthwhile... In this specific example the
>> base64url overhead would be +17 chars, merely 1 more char than the
>> overhead of omitting the base64url step.
>>
>> A more valuable change would be to apply the crypto operations AFTER
>> removing the base64url encoding. That is, apply the crypto to a
>> UTF-8-encoded JSON header, instead of to the US-ASCII encoding of the
>> base64url-encoded UTF-8-encoded JSON header. Implementations always need
>> the UTF-8 form at some point anyway. It means the crypto is applied to
>> fewer bytes, so it is faster. It means an idea like Jim's, which doesn't
>> need base64url to maintain the correct bytes in transmission, doesn't
>> need to artificially add a base64url-encoding step at the receiver just
>> to do the crypto.
>>
>> P.S.
>>
>>> the JSON parsers in both Chrome and Firefox (possibly others; haven't
>>> tried) choke on parsing strings that have control characters
>>
>> JSON does not allow unescaped control chars when representing strings,
>> so choking is the correct behaviour. JSON.parse(JSON.stringify(x)) works
>> for me in Chrome. JSON.stringify does NOT leave unescaped control chars
>> in quotes in the result.
>
> I have heard multiple independent reports of my incorrectness in this
> experiment. So I retract my assertion of infeasibility with regard to
> control characters.
>
> However, there's still an issue with code points that don't have to be
> escaped. Consider the following example:
>
> var x = '{"a":"caf\\u00e9"}';
> var x2 = JSON.stringify(JSON.parse(x))
>
> In a quick test of the environments I have on hand (several browsers +
> Node.JS + PHP + Perl), I see a wide variety of behaviors, most of which
> would break the signature.
> The browsers and Node all converted the "\u00e9" sequence to an
> unescaped "é" character, while Perl (creatively) rendered it directly as
> a Latin-1 0xe9 byte. Only PHP rendered it back to the escape sequence.
> Test code here:
>
> <http://www.ipv.sx/jose/json_test.html>
>
> Now, even if we think that re-encoding is a problem, there's the
> question of whether we think the risk is acceptable to achieve better
> human readability. I'm pretty split, myself.
>
> Do you think from your test that this would also be a problem if you did
>
> Base64(JSON.stringify(JSON.parse(Unbase64(x))))
>
> This would appear to be a general re-creation problem even if we use
> base64 encoding.

The base64 part in your line isn't actually relevant. It's the outer
parse/stringify (on the overall JWE object) that we're worried about
changing the string. The control case for the experiment would be:

var x = '{"a":"validBase64=="}';
var x2 = JSON.stringify(JSON.parse(x))

That works just fine with my standard test suite.

<http://www.ipv.sx/jose/json_test.html>

--Richard

> Jim
>
> --Richard
>
> --
> James Manger
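P.S. For anyone who wants to reproduce the \u00e9 experiment outside the browser harness, a minimal Node equivalent (same strings as in the thread; in V8, JSON.stringify emits the raw character rather than the escape):

```javascript
// Richard's problem case: the wire form contains an unnecessary escape.
var x = '{"a":"caf\\u00e9"}';            // literal text: {"a":"caf\u00e9"}
var x2 = JSON.stringify(JSON.parse(x));  // round-trip through the parser

console.log(x2);        // {"a":"café"} — the escape does not survive
console.log(x2 === x);  // false: a byte-exact signature over x would break

// The control case: base64url content uses only characters that never
// need escaping, so the round trip is byte-identical.
var y = '{"a":"validBase64=="}';
console.log(JSON.stringify(JSON.parse(y)) === y); // true
```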
_______________________________________________
jose mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/jose
