This is a known issue and is already addressed in the specifications. Uniformly,
each JSON structure definition includes language like "JWSs with duplicate
Header Parameter Names MUST be rejected"
(http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-04#section-4).
Also, there is a Security Considerations section specifically about JSON
parsing issues
(http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-04#section-8.2),
which says:

   Strict JSON validation is a security requirement. If malformed JSON is
   received, then the intent of the sender is impossible to reliably discern.
   Ambiguous and potentially exploitable situations could arise if the JSON parser
   used does not reject malformed JSON syntax.

   Section 2.2 of the JavaScript Object Notation (JSON) specification [RFC4627]
   states "The names within an object SHOULD be unique", whereas this
   specification states that "Header Parameter Names within this object MUST be
   unique; JWSs with duplicate Header Parameter Names MUST be rejected". Thus,
   this specification requires that the Section 2.2 "SHOULD" be treated as a
   "MUST". Ambiguous and potentially exploitable situations could arise if the
   JSON parser used does not enforce the uniqueness of member names.

It's entirely possible that a hardened JSON parser will need to be used that
rejects this and other kinds of malformed JSON if the native parser doesn't.
We've tried to be very explicit about this in the specifications.
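
As a rough illustration of what "hardened" could mean here, the sketch below wraps Python's standard json module so that duplicate member names are rejected instead of silently collapsed. It relies on the object_pairs_hook argument of json.loads, which receives every object's name/value pairs in document order before any dict is built. The names reject_duplicate_names and strict_loads are hypothetical, chosen for this example only; they do not come from the specifications or from any library.

```python
import json


def reject_duplicate_names(pairs):
    """object_pairs_hook: build a dict, raising on a repeated member name."""
    obj = {}
    for name, value in pairs:
        if name in obj:
            raise ValueError("duplicate member name: %r" % (name,))
        obj[name] = value
    return obj


def strict_loads(text):
    # Hypothetical helper: json.loads applies the hook to every JSON object
    # it decodes, including nested ones, so duplicates are caught at any depth.
    return json.loads(text, object_pairs_hook=reject_duplicate_names)


if __name__ == "__main__":
    bad_header = '{"typ":"JWT", "alg":"HS256", "alg":"hmac-md5"}'
    try:
        strict_loads(bad_header)
    except ValueError as err:
        print("rejected:", err)
```

This only covers the duplicate-name case; other kinds of malformed input would still need to be checked against whatever the native parser tolerates.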
Cheers,
-- Mike
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Richard L. Barnes
Sent: Wednesday, July 18, 2012 6:41 AM
To: [email protected]
Subject: [jose] JSON parsing pitfalls
I noticed the other day a widespread flaw in JSON parsers that makes JSON
parsing a lossy operation. I haven't worked out the implications, but it seems
worth thinking about, so I thought I would run it by the group.
The problem is that JSON does not require that field names MUST be unique, only
that they SHOULD. From RFC 4627:
"
An object structure is represented as a pair of curly brackets
surrounding zero or more name/value pairs (or members). A name is a
string. A single colon comes after each name, separating the name
from the value. A single comma separates a value from a following
name. The names within an object SHOULD be unique.
"
Unfortunately, most JSON parsers are parsing into object constructs where each
field name references only one value. However, they will happily parse JSON
objects containing redundant fields -- they just set each field value to the
last value they saw for that name. So for example, if we parse the following
object ...

{ "foo": 1, "bar": 2, "foo": "This above all: to thine own self be true." }

... then most parsers will return an object with two fields, which would stringify
to the following:

{ "foo": "This above all: to thine own self be true.", "bar": 2 }

A demonstration in four languages is at the bottom of this message.
Like I said at the top, I'm not sure exactly what the implications of this
ambiguity are for this protocol. Thoughts?
--Richard
-----BEGIN json-error.sh-----
#!/bin/bash
BADHEADER='{"typ":"JWT", "alg":"HS256", "alg":"hmac-md5"}'
###### Python
python3 <<EOF
import json
s = '$BADHEADER'  # named "s" to avoid shadowing the built-in str
obj = json.loads(s)
print(obj)
EOF
###### Perl
perl <<EOF
use JSON qw(to_json from_json);
\$str = '$BADHEADER';
\$obj = from_json(\$str);
printf "%s\n", to_json(\$obj);
EOF
###### Javascript (NodeJS / JSC)
node <<EOF
var str = '$BADHEADER';
var obj = JSON.parse(str);
console.log(JSON.stringify(obj));
EOF
###### PHP
php <<EOF
<?php
\$str = '$BADHEADER';
\$obj = json_decode(\$str);
echo json_encode(\$obj) . "\n";
?>
EOF
-----END json-error.sh-----
_______________________________________________
jose mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/jose