On 2018-03-16 16:38, C. Scott Ananian wrote:
Canonical JSON is often used to imply a security property: two JSON blobs
with identical contents are expected to have identical canonical JSON
forms (and thus identical hashed values).

Right.

However, Unicode normalization allows multiple representations
of "the same" string, which defeats this security property.

This is an aspect that I believe belongs to the "application" level.  This specification 
is only about "on the wire" format.

Rationale: if this were part of the specification it would either be ignored
(=useless) or be a showstopper (=dead) due to complexity.

If applications using the received data want to address this issue they can for 
example call
https://msdn.microsoft.com/en-us/library/windows/desktop/dd318671(v=vs.85).aspx
and reject if they want.

Or always normalize: 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
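
As a rough illustration of the two application-level options above (reject or always normalize), here is a minimal sketch. The helper name isNfc and the choice of NFC as the target form are assumptions made for the example; neither comes from the draft.

```javascript
// "é" written two ways: precomposed (U+00E9) and "e" + combining acute (U+0301).
const precomposed = "\u00e9";
const decomposed = "e\u0301";

// The code-unit sequences differ, so the serialized JSON (and any hash of it) differs.
const rawEqual = JSON.stringify(precomposed) === JSON.stringify(decomposed);

// Option "always normalize": after NFC both forms collapse to the same string.
const nfcEqual = precomposed.normalize("NFC") === decomposed.normalize("NFC");

// Option "reject": hypothetical policy check that a string is already in NFC.
function isNfc(s) {
  return s === s.normalize("NFC");
}

console.log(rawEqual, nfcEqual, isNfc(precomposed), isNfc(decomposed));
// prints: false true true false
```

Either policy lives entirely in the application; the wire format is untouched.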


Depending on your implementation language and use, a string with
precomposed accents could compare equal to a string with separated accents,
even though the canonical JSON or hash differed.

I don't want to go there for the reasons mentioned.


In an extreme case (with a weak hash function, say MD5), this can be used to
break security by re-encoding all strings in multiple variants
until a collision is found.  This is just a slight variant on the fact
that JSON allows multiple ways to encode a character using escape sequences.
You've already taken the trouble to disambiguate this case; security-conscious
applications should take care to perform Unicode normalization as well, for the
same reason.

If you are able to break the hash function all bets are off anyway because then 
you can presumably change *any* part of the object and it would still appear 
authentic.

Regarding escape normalization: if you skip it, signatures would typically fail
to verify. That is not really a "security" (=attacker) problem; it is rather a
"nuisance" of the same caliber as a server not responding.
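
To make the escape-sequence point concrete, here is a small sketch using only the built-in JSON object. The variable names are illustrative. It shows several distinct JSON texts for the same one-character string, and that re-serializing the parsed value yields a single form.

```javascript
// Four different JSON texts that all encode the one-character string "/".
const texts = ['"/"', '"\\/"', '"\\u002f"', '"\\u002F"'];

// Parsing discards the choice of escape; serializing again picks one form.
const reserialized = texts.map((t) => JSON.stringify(JSON.parse(t)));

const distinctInputs = new Set(texts).size;
const distinctOutputs = new Set(reserialized).size;

console.log(distinctInputs, distinctOutputs, reserialized[0]);
// prints: 4 1 "/"
```

A signature computed over any of the three escaped inputs would not match one computed over the re-serialized text, which is the breakage described above.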


Similarly, if you don't offer a verifier to ensure that the input is
in "canonical JSON" format, then an attacker can try to create collisions by violating the rules of canonical JSON format, whether by using different
escape sequences, adding whitespace, etc.  This can be used to make JSON which
is "the same" appear "different", violating the intent of the canonicalization.

Again, if the hash function is broken, there's nothing to do except maybe cry 
:-(

This is a Unicode problem, not a cryptographic problem.


Any security application of canonical JSON will require a strict mode for JSON.parse() as well as a strict mode for JSON.stringify().

Indeed, you must ALWAYS verify that input data conforms to the agreed conventions.
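
One way such a check could look is sketched below: parse the input, re-serialize it, and accept only if the text comes back byte for byte. The canonicalize function here is a simplified stand-in (sorted keys via the default sort order, no whitespace, built-in string and number serialization), not the scheme from the draft, and isCanonical is a hypothetical name.

```javascript
// Simplified stand-in for a canonical serializer: sorted keys, no whitespace.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort() // default sort compares UTF-16 code units
      .map((k) => JSON.stringify(k) + ":" + canonicalize(value[k]));
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}

// Hypothetical verifier: input is accepted only if it is already canonical.
function isCanonical(text) {
  let value;
  try {
    value = JSON.parse(text);
  } catch (e) {
    return false; // not JSON at all
  }
  return canonicalize(value) === text;
}

console.log(isCanonical('{"a":1,"b":[true,null]}')); // true
console.log(isCanonical('{"b":1,"a":2}'));           // false: keys unsorted
console.log(isCanonical('{"a": 1}'));                // false: extra whitespace
console.log(isCanonical('"\\u0041"'));               // false: needless escape
```

Because the comparison is against the raw text, whitespace, key order, redundant escapes, and duplicate keys are all rejected without a separate rule for each.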

Anyway, feel free to push a different JSON canonicalization scheme!

Here is another: http://gibson042.github.io/canonicaljson-spec/
It claims that you should support "lone surrogates" (invalid Unicode), which,
for example, the JDK doesn't.
I don't go there either.

Anders

   --scott

On Fri, Mar 16, 2018 at 4:48 AM, Anders Rundgren <[email protected]> wrote:

    On 2018-03-16 08:52, C. Scott Ananian wrote:

        See http://wiki.laptop.org/go/Canonical_JSON -- you should probably at least
        mention Unicode normalization of strings.


    Yes, I could add that Unicode normalization of strings is out of scope for
    this specification.


        You probably should also specify a validator: it doesn't matter if you
        emit canonical JSON if you can tweak the hash of the value by feeding
        non-canonical JSON as an input.


    Pardon me, but I don't understand what you are writing here.

    The only "raison d'être" of hash functions is to provide collision-safe checksums.

    thanx,
    Anders


            --scott

        On Fri, Mar 16, 2018 at 3:16 AM, Anders Rundgren <[email protected]> wrote:

             Dear List,

             Here is a proposal that I would be very happy to get feedback on,
             since it builds on ES but is not (at all) limited to ES.

             The request is for a complement to the ES "JSON" object called
             canonicalize() which would have identical parameters to the
             existing stringify() method.

             The JSON canonicalization scheme (including ES code for emulating it)
             is described in:
             https://cyberphone.github.io/doc/security/draft-rundgren-json-canonicalization-scheme.html

             Current workspace: https://github.com/cyberphone/json-canonicalization

             Thanx,
             Anders Rundgren
             _______________________________________________
             es-discuss mailing list
             [email protected]
             https://mail.mozilla.org/listinfo/es-discuss




