I agree with all of Shane's points below: better encoding of data, not
making all fields optional, handling of unknown RRTYPEs, and having a schema
for the data. These points (and many more) were thoroughly addressed
in the following draft on representing DNS in XML:
http://tools.ietf.org/html/draft-daley-dnsxml-00
If it helps, I can produce XSLT that takes data in the XML format specified
in that draft and validated against that schema, and turns it into whatever
flavour of JSON you require:
http://controlfreak.net/xml-to-json-in-xslt-a-toolkit/
*stands back and waits for the "but json is sooo much easier than xml"
brickbats*
Jay
On 22/08/2014, at 3:27 am, Shane Kerr <[email protected]> wrote:
> Paul,
>
> On Thu, 21 Aug 2014 07:01:01 -0700
> Paul Hoffman <[email protected]> wrote:
>
>> On Aug 21, 2014, at 1:22 AM, Shane Kerr <[email protected]>
>> wrote:
>>
>>> * I don't like the treatment of QNAME*/hostQNAME, NAME*/hostNAME,
>>> and so on. Since JSON includes encoded strings, wouldn't it make
>>> more sense just to always put the QNAME in there? (Especially since
>>> you'll end up with SRV queries always being encoded as they have
>>> underscore characters...)
>>
>> JSON requires its strings to be encoded in a particular character
>> set. Given that the labels in a QNAME/NAME can be any binary cruft,
>> you can't assume that every QNAME will be representable.
>
> I think you're making it too hard. Control characters, ", and \ are
> already required to be escaped. Just specify a similar requirement
> that octets 127 to 255 also be escaped, and we're done.
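A minimal sketch of the escaping rule Shane suggests, in Python (the function name and the choice of \uXXXX escapes are illustrative, not taken from any draft):

```python
def escape_dns_name(raw: bytes) -> str:
    # Pass printable ASCII through unchanged; escape control characters,
    # '"', '\' and octets 127-255 as \uXXXX so the result can be embedded
    # directly inside a JSON string.
    out = []
    for b in raw:
        if 0x20 <= b <= 0x7E and b not in (0x22, 0x5C):
            out.append(chr(b))
        else:
            out.append("\\u%04x" % b)
    return "".join(out)
```

With this rule, a name like _sip._tcp.example.com passes through untouched, while a label byte such as 0xFF comes out as \u00ff and round-trips losslessly.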
>
>>> * In general I'm not super enthusiastic about the mixing of binary
>>> and formatted data - I tend to think an application will want one
>>> or the other. Perhaps it makes more sense to define two formats,
>>> one binary and one formatted? Or...
>>
>> All fields are optional, so a profile could say "don't include these"
>> or "always include those". Further, and more importantly, most RDATA
>> are binary. I did not want to force implementations to use the
>> presentation format for RDATA.
>
> The problem with an "all fields are optional" approach is that it puts
> all the burden on the consumer of the data, right? You literally have
> no idea what to expect. (That's kind of why I proposed some sort of
> schema below.)
>
> I understand not wanting to force implementations to use the
> presentation format for RDATA... OTOH it seems likely that the reason
> people are putting data in JSON is so they can see what it is. We could
> always try the RFC 3597 approach for an unknown RTYPE?
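For reference, the RFC 3597 generic presentation Shane mentions is trivial to emit; a quick Python sketch (the helper names are mine):

```python
def rfc3597_type(rrtype: int) -> str:
    # RFC 3597 names unknown types as TYPE<decimal>, e.g. TYPE62347.
    return "TYPE%d" % rrtype

def rfc3597_rdata(rdata: bytes) -> str:
    # RFC 3597 generic RDATA form: '\#', the RDATA length, then the
    # data as hex (empty RDATA is just '\# 0').
    if not rdata:
        return r"\# 0"
    return r"\# %d %s" % (len(rdata), rdata.hex().upper())
```

So a 4-octet RDATA of 10.0.0.1 would render as \# 4 0A000001, which any RFC 3597-aware consumer can parse without knowing the RTYPE.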
>
>>> * Maybe it makes sense to define a meta-record so consumers can know
>>> what to expect? Something that lists which names will (or may)
>>> appear.
>>
>> That would be a JSON schema. Just using that phrase will cause
>> screaming in the Apps Area. Having said that, it's perfectly
>> reasonable for a profile to insist that each record have a profile
>> indicator such as "Profile": "Private DNS interchange v3.1".
>
> Screaming aside, applications will either have an implicit schema or an
> explicit one. Defining the problem to be out of scope may be necessary
> to get something published, but that's a symptom of IETF brokenness
> IMHO, since it reduces the usefulness of any such RFC. :(
>
>>> I'd be mildly curious to see a comparison of the compressed sizes of
>>> JSON-formatted data (without data duplicated as binary stuff) versus
>>> non-JSON-formatted data. My intuition is that compression will
>>> remove most of the horrible redundancy that is involved in JSON,
>>> but there's only one way to be sure. ;)
>>
>> Sure. It's pretty trivial to do, for example, a CBOR format that
>> follows this; there are now CBOR libraries for most popular modern
>> languages (see http://cbor.io/). If folks here want that, I can add
>> it as an appendix. To be clear, however, I haven't heard anyone
>> saying they want compression so badly they are willing to lose
>> readability of the data.
>
> Oh, I meant with gzip or the like, not some JSON-specific format.
>
> So the idea is:
>
> $ tcpdump -w somefile.pcap
> $ pcap2dnsjson somefile.pcap somefile.json
> $ gzip somefile.pcap
> $ gzip somefile.json
> $ ls -l somefile.{pcap,json}.gz
>
> Then compare the sizes of the compressed files.
>
> The idea being that when moving files around via scp or rsync or
> whatever they'd probably be compressed like this, and probably also for
> medium-term storage. My hope is that a compressed JSON is roughly the
> same size as a compressed raw pcap file, since basically they have the
> same entropy.
>
> The reason I bring this up is to give a feel for the size cost of a
> bloated text format in practice. :)
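The same comparison can be done without writing .gz files to disk; a rough Python sketch (gzipped_size is my own helper, not an existing tool):

```python
import gzip

def gzipped_size(path: str) -> int:
    # Size in bytes of the file's contents after gzip compression,
    # computed in memory rather than via a .gz file on disk.
    with open(path, "rb") as f:
        return len(gzip.compress(f.read()))
```

Running it over somefile.pcap and somefile.json would give the two numbers to compare directly; for highly redundant JSON the compressed size should be far below the raw size.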
>
> Cheers,
>
> --
> Shane
>
> _______________________________________________
> DNSOP mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/dnsop
--
Jay Daley
Chief Executive
.nz Registry Services (New Zealand Domain Name Registry Limited)
desk: +64 4 931 6977
mobile: +64 21 678840
linkedin: www.linkedin.com/in/jaydaley