Anders,

How do we move forward? What can we do to make this happen?
Thanks,
Bret

PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
"Without cryptography vihv vivc ce xhrnrw, however, the only thing that
can not be unscrambled is an egg."

> On Oct 12, 2018, at 10:47 AM, Anders Rundgren <[email protected]> wrote:
>
> On 2018-10-12 17:20, Bret Jordan wrote:
>> Please correct me if I am wrong…. The way I understand the problem is
>> as follows:
>
> Hi Bret,
> Comments in line...
>
>> 1) If you verify the JSON string at consumption time, before it has
>> been unmarshaled, then all you need to do is decide how to handle
>> white space and carriage returns. You could basically use a regex to
>> strip all white space and CR/CRLF and have a workable solution.
>
> It depends on what your goal is. Canonicalization builds on the
> assumption that there is a unique representation of the data,
> preferably even after it has passed through a processor like an
> intermediary.
>
>> 2) Where this breaks down is when a tool unmarshals the data into a
>> map or struct: you then have no guarantee of recreating the keys in
>> the same order (a struct may force them into the order of the struct
>> definition), so you have no way of verifying the hash after the data
>> has been unmarshaled. Further, if you recreate the JSON and send it
>> back out, the next person that gets the data may be unable to verify
>> the hash from option 1 above.
>
> Right, therefore option 1 is not very useful. Sorting of keys is the
> cure for this issue (see the first sketch after the quoted thread).
>
>> 3) Another problem, once you have unmarshaled the data, is what to do
>> with JSON numbers.
>
> Right, but even string data needs adjustments: "\u0020" and " " both
> represent a space character.
>
>> Some programming languages store them as a float, some as
>> who-knows-what. So you would need a way to ensure that the number is
>> always stored in the same way, especially for strongly typed systems
>> (is this architecture dependent too?). One option: if the ontology /
>> semantics of the JSON data were well defined in a schema (meaning it
>> was standardized and documented), then the code could know what to
>> do, and interoperability tests could be made to ensure that it worked.
>
> This is (IMO) the only part of the puzzle that is non-trivial. In my
> take on the matter, I have "prescribed" that the JSON Number type must
> be coerced into an IEEE 754 double-precision number and be serialized
> according to ECMAScript V6+ rules.
>
> If your application needs higher precision or a bigger range, you are
> forced to use the quoted-string notation, which (AFAIK...) is used by
> every IETF standard of any significance to date that defines a JSON
> structure (see the second sketch after the quoted thread).
>
>> What am I not understanding here? And what am I missing?
>
> As I wrote earlier, there are (at least) two entirely different and
> documented approaches.
>
> Using a schema-based canonicalizer, as you mention, is also an option,
> but that is a much more ambitious project.
>
> Regards,
> Anders
>
>> Thanks,
>> Bret
>> PGP Fingerprint: 63B4 FC53 680A 6B7D 1447 F2C0 74F8 ACAE 7415 0050
>> "Without cryptography vihv vivc ce xhrnrw, however, the only thing
>> that can not be unscrambled is an egg."
>>
>>> On Oct 12, 2018, at 12:38 AM, Anders Rundgren <[email protected]> wrote:
>>>
>>> On 2018-10-11 22:05, Bret Jordan wrote:
>>>> Anders,
>>>> I really like what you have done with this. I am trying to figure
>>>> out whether it will work 100% for my needs or whether it will need
>>>> some tweaking. If it does work, then I think we should really try
>>>> to figure out how we get your work standardized.
>>>
>>> Thanx Bret!
>>>
>>> The https://tools.ietf.org/html/draft-erdtman-jose-cleartext-jws-01
>>> I-D provides quite a lot of features, including an extension option
>>> that can be used for adding possibly missing functionality.
>>>
>>> There is one thing that is good to know for anyone thinking about
>>> standardizing Canonical JSON: canonicalization can also be performed
>>> at the text level, as described by
>>> https://gibson042.github.io/canonicaljson-spec/
>>>
>>> This has the advantage of being very simple and of supporting the
>>> entire JSON RFC without restrictions.
>>>
>>> So why didn't I take this [superficially obvious] route? There are
>>> several reasons:
>>>
>>> A downside of source-level canonicalization is that it doesn't
>>> integrate with JSON parsers and serializers.
>>> https://tools.ietf.org/html/draft-rundgren-json-canonicalization-scheme-01
>>> was explicitly designed to eventually be an option in a standard
>>> JSON serializer, as it already is in my Java reference
>>> implementation.
>>>
>>> Another issue is that it is unclear what the value is of using the
>>> JSON "Number" format outside of the IEEE range. In fact, it excludes
>>> parsers like JavaScript's JSON.parse() unless JavaScript were
>>> updated to always use a "BigNumber" as its fundamental numeric type.
>>>
>>> Regards,
>>> Anders
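
The key-sorting cure and the ECMAScript number rules discussed above can
be illustrated with a short JavaScript sketch. This is a minimal
illustration, not the normative draft-rundgren algorithm: recursively
re-serialize an already-parsed value with lexicographically sorted
property names, and let ES6+ JSON.stringify handle the primitives.

    // Minimal sketch (illustrative, not the normative algorithm):
    // rebuild a unique serialization from already-parsed data.
    function canonicalize(value) {
      if (value === null || typeof value !== 'object') {
        // ES6 JSON.stringify yields the shortest round-trip Number
        // form and standard String escaping.
        return JSON.stringify(value);
      }
      if (Array.isArray(value)) {
        return '[' + value.map(canonicalize).join(',') + ']';
      }
      // Sorted keys make the output independent of whatever key order
      // an unmarshal/marshal round trip happens to produce (problem #2).
      return '{' + Object.keys(value).sort().map(
        k => JSON.stringify(k) + ':' + canonicalize(value[k])
      ).join(',') + '}';
    }

    // "\u0020" and " " parse to the same string, and 6e1 and 60 to the
    // same number, so all such variants yield one canonical form:
    canonicalize(JSON.parse('{"b": 6e1, "a": "\\u0020"}'));
    // -> '{"a":" ","b":60}'

Hashing the output of canonicalize() then produces the same digest no
matter how many parse/serialize round trips the data has been through,
which is the property that whitespace stripping (option 1) alone cannot
provide.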
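
The quoted-string convention for values beyond IEEE 754 double
precision, mentioned in both messages above, is also easy to
demonstrate (again JavaScript; BigInt here stands in for whatever
application-level conversion is appropriate):

    // JSON Numbers past 2^53 are silently rounded by JSON.parse(),
    // because every Number is coerced into an IEEE 754 double:
    JSON.parse('{"big": 9007199254740993}').big;  // 9007199254740992
    Number.MAX_SAFE_INTEGER;                      // 9007199254740991

    // The quoted-string notation keeps the value intact; the
    // application converts it explicitly when it needs arithmetic:
    BigInt(JSON.parse('{"big": "9007199254740993"}').big);
    // -> 9007199254740993n

Once the value has passed through a Number, the ES6 rules would also
re-serialize it as the rounded 9007199254740992, so the original is
unrecoverable; the string form sidesteps the problem entirely.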
