Paul,

On Thu, 21 Aug 2014 07:01:01 -0700
Paul Hoffman <[email protected]> wrote:

> On Aug 21, 2014, at 1:22 AM, Shane Kerr <[email protected]>
> wrote:
> 
> > * I don't like the treatment of QNAME*/hostQNAME, NAME*/hostNAME,
> > and so on. Since JSON includes encoded strings, wouldn't it make
> > more sense just to always put the QNAME in there? (Especially since
> > you'll end up with SRV queries always being encoded as they have
> > underscore characters...)
> 
> JSON requires its strings to be encoded in a particular character
> set. Given that the labels in a QNAME/NAME can be binary cruft,
> you can't assume that every QNAME will be representable.

I think you're making it too hard. Control characters, ", and \ are
already required to be escaped. Just specify a similar requirement that
octets 127 to 255 also be escaped, and we're done.

> > * In general I'm not super enthusiastic about the mixing of binary
> > and formatted data - I tend to think an application will want one
> > or the other. Perhaps it makes more sense to define two formats,
> > one binary and one formatted? Or...
> 
> All fields are optional, so a profile could say "don't include these"
> or "always include those". Further, and more importantly, most RDATA
> are binary. I did not want to force implementations to use the
> presentation format for RDATA.

The problem with an "all fields are optional" approach is that it puts
all the burden on the consumer of the data, right? You literally have
no idea what to expect. (That's kind of why I proposed some sort of
schema below.)

I understand not wanting to force implementations to use the
presentation format for RDATA... OTOH it seems likely that the reason
people are putting data in JSON is so they can see what it is. We could
always try the RFC 3597 approach for an unknown RTYPE?
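By the RFC 3597 approach I mean its generic text representation for
RDATA: "\#", the octet count, then the raw octets in hex. A minimal
sketch (function name hypothetical):

```python
def rdata_to_rfc3597(rdata: bytes) -> str:
    """Render RDATA in the RFC 3597 generic form: \\# <length> <hex octets>."""
    return '\\# %d %s' % (len(rdata), rdata.hex())

# The RDATA of an A record for 192.0.2.1:
print(rdata_to_rfc3597(bytes([192, 0, 2, 1])))   # \# 4 c0000201
```

That keeps unknown RTYPEs both lossless and at least vaguely readable.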

> > * Maybe it makes sense to define a meta-record so consumers can know
> >  what to expect? Something that lists which names will (or may)
> > appear.
> 
> That would be a JSON schema. Just using that phrase will cause
> screaming in the Apps Area. Having said that, it's perfectly
> reasonable for a profile to insist that each record have a profile
> indicator such as "Profile": "Private DNS interchange v3.1".

Screaming aside, applications will either have an implicit schema or an
explicit one. Defining the problem to be out of scope may be necessary
to get something published, but that's a symptom of IETF brokenness
IMHO, since it reduces the usefulness of any such RFC. :(
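For what it's worth, even your profile-indicator idea is a kind of
schema: the consumer dispatches on an explicit marker instead of
guessing. Something like this on the consumer side (field names beyond
"Profile" are my invention):

```python
import json

# A record carrying Paul's suggested profile indicator:
record = json.loads(
    '{"Profile": "Private DNS interchange v3.1", "QNAME": "example.com."}')

# The consumer checks the indicator before interpreting anything else:
assert record.get('Profile') == 'Private DNS interchange v3.1'
print(record['QNAME'])
```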
 
> > I'd be mildly curious to see a comparison of the compressed sizes of
> > JSON-formatted data (without data duplicated as binary stuff) versus
> > non-JSON-formatted data. My intuition is that compression will
> > remove most of the horrible redundancy that is involved in JSON,
> > but there's only one way to be sure. ;)
> 
> Sure. It's pretty trivial to do, for example, a CBOR format that
> follows this; there are now CBOR libraries for most popular modern
> languages (see http://cbor.io/). If folks here want that, I can add
> it as an appendix. To be clear, however, I haven't heard anyone
> saying they want compression so badly they are willing to lose
> readability of the data.

Oh, I meant with gzip or the like, not some JSON crafted format.

So the idea is:

   $ tcpdump -w somefile.pcap
   $ pcap2dnsjson somefile.pcap somefile.json
   $ gzip somefile.pcap
   $ gzip somefile.json
   $ ls -l somefile.{pcap,json}.gz
   
Then compare the sizes of the compressed files.

The idea being that when moving files around via scp or rsync or
whatever they'd probably be compressed like this, and probably also for
medium-term storage. My hope is that a compressed JSON file is roughly
the same size as a compressed raw pcap file, since basically they have
the same entropy.

The reason I bring this up is to give a feel for the size cost of a
bloated text format in practice. :)
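Here's a toy version of the experiment in pure Python, with made-up
data standing in for a real capture (the encodings are illustrative,
not any proposed wire or JSON format):

```python
import gzip
import json
import struct

# 1000 identical queries: (name, qtype, qclass), as a stand-in capture.
queries = [('example.com.', 1, 1)] * 1000

# A compact binary encoding: 2-byte type, 2-byte class, then the name.
binary = b''.join(
    struct.pack('>HH', qtype, qclass) + name.encode()
    for name, qtype, qclass in queries)

# The same data as verbose JSON.
text = json.dumps(
    [{'QNAME': n, 'QTYPE': t, 'QCLASS': c} for n, t, c in queries]).encode()

# Compare raw and gzip-compressed sizes of each encoding.
for label, blob in (('binary', binary), ('json', text)):
    print('%-6s raw=%6d  gzipped=%6d' % (label, len(blob), len(gzip.compress(blob))))
```

I'd expect the gzipped sizes to land much closer together than the raw
sizes, but as I said, there's only one way to be sure.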

Cheers,

--
Shane

_______________________________________________
DNSOP mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dnsop