Andrew Lentvorski wrote:
Except that it doesn't look like they even *thought* about ASN.1. They just thought about how they were abusing XML.

True. There are a lot of good protocol mechanisms in there. My point was more that XML has an awful lot of overhead if what you want is to ship tagged bytes around. That overhead comes from using XML to ship tagged bytes around instead of shipping marked up text around.

Yes and no. That's true if everything is just internal to your own program. However, once you start dumping data into a generalized persistent store (e.g., BigTable), that overhead could be the difference between data that's still useful and terabytes of dead data that nobody can interpret because nobody remembers what program stuffed it there.

You could use UBF, where the data is the program that creates the data.

http://armstrongonsoftware.blogspot.com/2008/07/ubf-and-vm-opcocde-design.html

But yeah, ASN.1 has a premise that you're actually using it to describe standard data structures, so it needs to be documented in (say) CCITT or ISO standards. You can parse (most) ASN.1 without the description of what it is. Using XML doesn't really save you unless you're smart enough to use good tags.

Show me how to interpret XHTML without knowing the standard. Show me what <data><flag>UP</flag></data> is supposed to mean.

Knowing how the data is formatted is pretty independent of the container.
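To make that concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree and json modules. The helper name xml_to_tree and the JSON rendering of the same record are assumptions made for illustration, not anything from the thread. Both parsers hand back a tree without being told anything about the format, and neither tree says what "UP" means.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_tree(elem):
    """Turn an Element into a plain (tag, text, children) tuple (hypothetical helper)."""
    return (elem.tag, (elem.text or "").strip(), [xml_to_tree(child) for child in elem])


# The example from the message above, parsed with no schema or DTD.
xml_tree = xml_to_tree(ET.fromstring("<data><flag>UP</flag></data>"))

# An assumed JSON rendering of the same record, also parsed with no schema.
json_tree = json.loads('{"data": {"flag": "UP"}}')

print(xml_tree)   # structure only: a 'data' node holding a 'flag' node
print(json_tree)  # same structure, different container
```

Either way the result is a node named "flag" holding the string "UP". Whether that is a network link state, a sort direction, or something else still has to come from documentation.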

In addition, it loses the inline association between labeled delimiter and delimited data. That's a large loss that many people won't think about.

Maybe I've just dealt with stupid people too much, but my experience is that "we use XML" means "we don't have to document what our data means because we can just hand you an example and hope you intuit that 'ID' isn't actually the primary key of the record, etc." Much the same way that "the documentation is on the wiki" really means "we have our customers try to reverse-engineer our system because we don't design anything up front."

XML handles a lot of things *right*. Unicode is good.

That's only necessary for a system designed primarily to handle large chunks of text. Any counted format will handle Unicode just fine if your language does. If your language doesn't handle Unicode cleanly, neither will the XML library in that language.
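A rough sketch of what "counted format" means here, in Python. The wire layout (one length byte for the field name, a four-byte big-endian length for the payload) and the function names are made up for this example and are not any standard format. The text is carried as UTF-8 bytes behind a byte count, so nothing in it ever needs escaping.

```python
def encode_field(name: str, text: str) -> bytes:
    """Pack one field as: name length (1 byte), name, payload length (4 bytes), payload."""
    name_bytes = name.encode("ascii")
    payload = text.encode("utf-8")
    if len(name_bytes) > 255:
        raise ValueError("field name too long for this sketch")
    return bytes([len(name_bytes)]) + name_bytes + len(payload).to_bytes(4, "big") + payload


def decode_field(buf: bytes, pos: int = 0):
    """Unpack one field starting at buf[pos]; return (name, text, next_pos)."""
    name_len = buf[pos]
    pos += 1
    name = buf[pos:pos + name_len].decode("ascii")
    pos += name_len
    size = int.from_bytes(buf[pos:pos + 4], "big")
    pos += 4
    text = buf[pos:pos + size].decode("utf-8")
    return name, text, pos + size


original = "naïve <b>&</b> 日本語"
packed = encode_field("title", original)
name, text, end = decode_field(packed)
```

The payload contains angle brackets, an ampersand, and non-ASCII characters, and it round-trips unchanged because the decoder reads a byte count instead of scanning for a delimiter.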

Named closing delimiters have some nice advantages.

Assuming you're not using any tools beyond a text editor to look at the data, sure, perhaps.

It handles tree structures from the start.

It's worse. It handles <trees> with nodes <stuck>stuck</stuck> in the middle of other </trees>.
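For what it's worth, this is how that mixed content comes out of Python's xml.etree.ElementTree (only one possible API; a DOM parser would represent it as separate text nodes instead). The loose text ends up split between the parent's .text and the child's .tail, which is easy to miss when walking the tree.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<trees>with nodes <stuck>stuck</stuck> in the middle of other </trees>"
)
child = root[0]

leading_text = root.text    # text before the child element
child_text = child.text     # text inside the child element
trailing_text = child.tail  # text after the child, still inside <trees>

print(repr(leading_text), repr(child_text), repr(trailing_text))
```

Code that only looks at each element's .text will silently drop the trailing part of the sentence.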

It handles character escaping correctly--a task which *everyone* seems to get wrong in the first 10 versions of a format.

Again, assuming you're using a delimited format rather than a counted format. And lots of people manage to get the escaping right. XML's escaping is just uglier because it's designed to have text nodes that look like tree nodes.

Oh, and I've interfaced to plenty of XML implementations that don't get the escaping right. XML only gets it right when the person implementing it uses an XML library that's done right, instead of hand-rolling a one-off because they don't know any better.
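A small Python illustration of the difference, using the standard xml.etree.ElementTree module. The function names and the sample value are hypothetical. The hand-rolled version pastes the value between tags and produces something that is not well-formed; the library version escapes it on the way out.

```python
import xml.etree.ElementTree as ET


def hand_rolled(tag: str, value: str) -> str:
    """The one-off approach: string concatenation with no escaping."""
    return f"<{tag}>{value}</{tag}>"


def with_library(tag: str, value: str) -> str:
    """Let the XML library serialize (and therefore escape) the text."""
    elem = ET.Element(tag)
    elem.text = value
    return ET.tostring(elem, encoding="unicode")


value = "AT&T <east>"

broken = hand_rolled("name", value)
try:
    ET.fromstring(broken)
    hand_rolled_parses = True
except ET.ParseError:
    hand_rolled_parses = False

good = with_library("name", value)
round_tripped = ET.fromstring(good).text
```

The hand-rolled output fails to parse, while the library output parses and gives back the original string.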

The XML APIs actually do most of the hard parsing garbage for you. XML APIs exist in practically every language of any usefulness. And everybody now seems to use XML by default.

Yeah. These are actually real advantages, I'll grant you. :-)

I think the real difference is that you don't have to actually have the spec there to parse XML. That is, you don't really need to configure the library to do the parsing. You can just hand a blob of XML to a library and get back a tree. There are lots of formats that work this way these days, tho. The old stuff, like Sun's RPC encoding, was awful in this respect. But JSON, ASN.1, etc. all handle parsing without a spec underneath. (Of course, you have to avoid IMPLICIT in your ASN.1.)
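As an illustration of the ASN.1 point, here is a minimal Python sketch of a BER/DER tag-length-value walker. It is deliberately incomplete: it assumes single-byte tags and definite lengths and rejects everything else, and the function name read_tlv and the sample bytes are made up for the example. With universal tags the tree is self-describing; with an IMPLICIT context tag the same integer comes back as opaque bytes.

```python
def read_tlv(buf: bytes, pos: int = 0):
    """Decode one element at buf[pos].

    Returns ((tag_class, constructed, tag_number, content), next_pos).
    content is raw bytes for primitive elements, a list of children otherwise.
    """
    first = buf[pos]
    tag_class = first >> 6              # 0 = universal, 2 = context-specific
    constructed = bool(first & 0x20)
    tag_number = first & 0x1F
    if tag_number == 0x1F:
        raise ValueError("multi-byte tag numbers are not handled in this sketch")
    pos += 1

    length = buf[pos]
    pos += 1
    if length == 0x80:
        raise ValueError("indefinite lengths are not handled in this sketch")
    if length & 0x80:                   # long form: low 7 bits count the length bytes
        count = length & 0x7F
        length = int.from_bytes(buf[pos:pos + count], "big")
        pos += count

    value = buf[pos:pos + length]
    if len(value) != length:
        raise ValueError("truncated element")
    end = pos + length

    if constructed:
        children, child_pos = [], 0
        while child_pos < len(value):
            child, child_pos = read_tlv(value, child_pos)
            children.append(child)
        return (tag_class, True, tag_number, children), end
    return (tag_class, False, tag_number, value), end


# SEQUENCE { INTEGER 5, UTF8String "UP" } with universal tags.
tree, _ = read_tlv(bytes.fromhex("30070201050c025550"))

# The same integer tagged [0] IMPLICIT: the INTEGER tag is gone.
opaque, _ = read_tlv(bytes.fromhex("800105"))
```

In the first result the tag numbers 16, 2, and 12 identify a SEQUENCE, an INTEGER, and a UTF8String with no schema in hand. In the second, all the decoder can say is "context tag 0, one byte", and knowing that the byte is an integer requires the ASN.1 module.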

--
Darren New / San Diego, CA, USA (PST)
 Helpful housekeeping hints:
  Check your feather pillows for holes
   before putting them in the washing machine.

--
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
