Yesterday I added implementations of JWS, JSON-B and JSON-C to my existing JSON encoding suite (PROTOGEN).
For the sake of fair comparison, I have not attempted any further optimization or made any changes to the encoding described here:

  http://tools.ietf.org/html/draft-hallambaker-jsonbcd-02

I could easily shave a few more bytes off the total with additional techniques, but I considered and rejected them as not being worth the space/complexity tradeoff.

Implementation took approximately 2 hours for the encoding scheme and 4 hours for JWS, much of the latter spent writing test code to make sure that the test vectors in draft-41 work (they do). While implementing additional encodings is extra work, the binary encodings are actually much easier to implement than the text encoding: there is no need to perform Base64 encoding, estimating space requirements is far simpler, and so on. If I were implementing a single encoder for a constrained device, I would much prefer to implement JSON-C than JSON.

On the decoder side, JSON-B is a strict superset of JSON, which means that a decoder must support both encodings unless a 'binary only' subset is defined. But it also means that a JSON-C decoder can decode JSON-B or traditional JSON: one decoder fits all.

For a test case, I used:

  http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-41

Encoding the HMAC signature example results in the following encoding sizes:

  In JSON:   244 bytes
  In JSON-B: 165 bytes
  In JSON-C: 129 bytes

In each case the payload data is the string given in the example; even though it is itself JSON, it is left as-is to provide a fair basis for comparison. The comparison slightly overstates the advantage of JSON-B, as my JSON encoding uses indentation. The main saving comes from avoiding the need to Base64-armor the binary blobs.

Looking at the internals:

  In JSON:   244 bytes  (Protected 35 / Payload 70 / Signature 32)
  In JSON-B: 165 bytes  (Protected 24 / Payload 70 / Signature 32)
  In JSON-C: 129 bytes  (Protected 13 / Payload 70 / Signature 32)

Since the payload and signature are the same in every case, 102 bytes of the message are irreducible. The move from text to binary blobs saves 79 bytes, 30 of which come from eliminating Base64. Tag and string compression saves another 34 bytes, but only in the case where we can pre-exchange the tag dictionary. JSON-C also supports an on-the-fly string compression technique, but that only provides savings for longer messages with large areas of repeated text.

Conclusions:

1) We do not need a new working group to specify a binary encoding of JOSE. The fact that CBOR requires hand tweaking to apply it to a JSON data structure is the reason that I and others objected to the approach from the start. Note that I wrote JSON-BCD in response to the statement made by the CBOR cabal that they were a private group that was not required to be open, to consider alternative approaches, or to respect IETF consensus. In particular, the statement was made repeatedly that 'CBOR is not intended to be a binary encoding of JSON'.

2) A binary encoding of JSON should not require additional IETF time, effort or review. The implementation of JSON-B is entirely mechanical and required no additional input whatsoever. The only additional input required for use of JSON-C is the compilation of a tag dictionary. The one I used has 88 defined code points, which I compiled by looking through the IANA considerations section of the draft; this could easily be produced automatically by a tool. Since JSON-B uses byte-aligned tags and there is only a need for 88 of them, the choice of tag values has absolutely no impact on the compression efficiency. (A minimal sketch of the tag-dictionary mechanism appears at the end of this message.)
3) A binary encoding should not require ongoing maintenance. What worries me most about the CBOR fiasco is that we risk a MIB-type situation in which every new IETF JSON protocol requires a parallel 'CBOR' encoding and this becomes an ongoing maintenance requirement. JSON-B is designed as a strict superset of JSON so that upwards compatibility is guaranteed. This allows use of a new version of the specification, or support for a privately defined tag that is not in the dictionary, without waiting for a new dictionary to be issued or a new 'binary' version of the specification to be defined.

4) The IETF needs a binary encoding of JSON that encodes precisely the JSON data model with (almost) nothing added or taken away. A binary encoding of JSON does need to add a binary data type, which is an extension of the JSON model. A case could also be made for a DateTime intrinsic type, which would be rendered as an RFC 3339 format string in JSON, but I have resisted this so far. One of the main reasons for rejecting many of the existing binary JSON formats is that their designers have found the temptation to add code points for their favorite random data types irresistible.

5) While it is possible to improve on JSON-B compression efficiency, the savings are unlikely to be very interesting. The JWS example is instructive because the only way to improve significantly on JSON-C would be to compress the payload. Out of the 129 bytes used in the JSON-C version, 104 are data elements and 25 are framing for two nested structures with a total of six structure elements. That is an average overhead of about 4 bytes per element, including the tag and length data. (A quick arithmetic check of these figures is sketched below.)
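For anyone who has not read the draft, here is a minimal sketch of how a pre-exchanged tag dictionary produces the JSON-C saving described under point 2. The marker bytes, framing and dictionary entries below are invented purely for illustration; the actual code points and wire format are the ones defined in draft-hallambaker-jsonbcd.

# Illustration only: the marker bytes and code point assignments below are
# NOT the ones defined in draft-hallambaker-jsonbcd; they merely show the
# mechanism of tag compression against a pre-exchanged dictionary.

TAG_STRING = 0xA0   # hypothetical marker: literal tag, length + UTF-8 bytes follow
TAG_CODE   = 0xA1   # hypothetical marker: one-byte dictionary code point follows

# A pre-exchanged dictionary, e.g. compiled from the IANA considerations
# section of the JWS draft (the one described above has 88 code points).
TAG_DICTIONARY = {"protected": 1, "payload": 2, "signature": 3}

def encode_tag(tag: str) -> bytes:
    """Emit a dictionary code point if the tag is known, else the literal string."""
    code = TAG_DICTIONARY.get(tag)
    if code is not None:
        return bytes([TAG_CODE, code])              # 2 bytes regardless of tag length
    data = tag.encode("utf-8")
    return bytes([TAG_STRING, len(data)]) + data    # 2 bytes of framing + the string

if __name__ == "__main__":
    print(len(encode_tag("signature")))            # 2  -- known tag, compressed
    print(len(encode_tag("example-private-tag")))  # 21 -- unknown tag, sent literally

The point is simply that a known tag costs a constant couple of bytes regardless of its length, while anything outside the dictionary falls back to the ordinary string form, which is why a privately defined tag (point 3) needs no new dictionary to be issued.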
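And a quick back-of-the-envelope check of the byte counts quoted above, using the figures as reported (a sketch only; the totals are measured, not derived here):

import math

# Figures reported above for the JWS HMAC example (bytes).
PAYLOAD, SIGNATURE = 70, 32
JSON_TOTAL, JSONB_TOTAL, JSONC_TOTAL = 244, 165, 129
JSONC_DATA, JSONC_FRAMING, JSONC_ELEMENTS = 104, 25, 6

def base64_len(n: int) -> int:
    """Characters needed to Base64-encode n bytes (with padding)."""
    return 4 * math.ceil(n / 3)

# Payload and signature are carried verbatim in every encoding: the irreducible part.
print(PAYLOAD + SIGNATURE)                    # 102

# Base64 armor expands binary data by roughly a third; the 32-byte signature
# alone costs an extra dozen characters when it has to be text-armored.
print(base64_len(SIGNATURE) - SIGNATURE)      # 12

# Moving from text JSON to the JSON-B binary encoding:
print(JSON_TOTAL - JSONB_TOTAL)               # 79 bytes saved

# The JSON-C total splits into data and framing as described in point 5,
# giving the per-element overhead quoted there.
print(JSONC_DATA + JSONC_FRAMING == JSONC_TOTAL)   # True
print(JSONC_FRAMING / JSONC_ELEMENTS)              # ~4.2 bytes per element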
