Gabriel Sechan wrote:

Because XML is the Emperor's New Clothes of computer technology. Some people praise it to the heavens as the solution to everything. In reality, it does next to nothing. We've had parser generators since the 70s, at least. And parsing is the *easy* part of dealing with data.

And we've had s-expressions for longer than that. So why are there so many programs that *still* can't cope with any of the following "[EMAIL PROTECTED]&*()_+-={}[],.<>" in data? As a fun exercise, inject a NUL character into *anything* going into C. Oops. Security hole. Or, better yet, count the number of programs that still have hard-coded string and buffer lengths. Oops. Stack smash.

XML doesn't help at all with the hard part- acting on the parsed tokens.

Actually, you missed something more fundamental. It forces programmers to deal with "parsed tokens". That's a *big* step up for most programmers who normally just "throw a couple regexps down".

XML is overkill. It tries to swat a mosquito with a sledgehammer. It takes a problem that can generally be solved in minutes, and gives you hours of fun debugging XML code. I have *never* seen a problem solved by XML that couldn't have been done just as easily- if not more so- without it.

And yet, somehow, nobody ever *did* solve the problems better or more easily. Before XML, everybody generated their own binary formats in spite of the fact that a perfectly good standard existed (ASN.1). In addition, nobody ever documented these formats, either.

Instead of having to waste time reverse engineering the format, XML puts the format directly in front of the humans. Yes, normally you use the machine; however, when you *need* to look at it by eye, you *can*.
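A hypothetical fragment makes the point (the record format here is invented): even someone who has never seen the producing program can open this in a text editor and understand it, which is the whole contrast with an undocumented binary layout.

```xml
<!-- Hypothetical record, for illustration only. -->
<order id="1042">
  <customer>A. Nobody</customer>
  <item sku="X-17" qty="3"/>
</order>
```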

Writing a parser for a spec like XML is hard- that's why most XML parsers are buggy. Writing a parser for a small domain language is quite easy. It's a simple state machine. For middle-sized languages, you have lex and yacc.

The fact that you recommend lex and yacc sums up the problem quite nicely. There are many superior tools for parsing and lexing, and yet nobody ever uses them.

At least with XML, those better tools get encoded behind the SAX/DOM APIs.

XML *forces* these morons to have to interface with a structured, debugged parser. SAX and DOM have their faults, but at least they're structured and debugged.


No it doesn't. They still use regexes as often as not, which is a bad thing with XML, since XML is such a top-heavy, corner-case-ridden spec.

That isn't just XML. All formats eventually get corner cases.

And 90% of the time, this boundary case only exists in the parser's mind. An additional 9% of the time, the corner case is due to XML itself rather than a failure to follow the DTD/schema.

Actually, I find those percentages are about reversed, nowadays. YMMV and all that.

In 99% of apps, internationalization is overkill. Unless a human is meant to be editing the file (such as a config file), it's just a waste of CPU power and time.

The problem is that i18n can't be retrofitted easily. Either get it right up front or suffer forevermore.

As for being liberal in what you expect being a bad thing- let's try an experiment. For the next month, you can only go to webpages that are W3C-validated, and whose servers put out perfect HTTP. Come back to us with how many sites you visited. I'll be impressed if you can make double digits.

There's a reason most real world programs are liberal with inputs- they have to be. You can't expect the other guy to get his shit right, especially if he's not employed by you. And failing in front of the end user is not a good option, not when the error can be routed around.

Let's try a different experiment. Let's let TCP fill in zeros for packets it knows the size of but can't finish receiving and see how many files you get transferred.

I may only get to single digits, but I *know* my files and web pages are correct. You wind up with randomly corrupted files.

TCP demands *perfect* packets for a reason.

Being liberal is *not* always a good thing. Being pedantic is necessary when doing data interchange.

The only reason why being liberal with HTML works is that the end consumer (a human) is doing error correction in the wetware using the redundant information of language.

And you could have saved yourself a lot of work in 99% of cases by not using XML, and not having to worry about the nasty gnarly stuff at all. Just write a language that does what you need, no more, no less.

And my solution eventually evolves to be just as complex as XML. I'd rather start at the debugged endpoint, thanks.

-a

--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
