From: Andrew Lentvorski <[EMAIL PROTECTED]>
Gabriel Sechan wrote:

Because XML is the Emperor's New Clothes of computer technology. Some people praise it to the heavens as the solution to everything. In reality, it does next to nothing. We've had parser generators since the 70s, at least. And parsing is the *easy* part of dealing with data.

And we've had s-expressions for longer than that. So why are there so many programs that *still* can't cope with any of the following "[EMAIL PROTECTED]&*()_+-={}[],.<>" in data?

Generally, that's a command line issue, nothing more or less. And XML doesn't have anything to do with that problem.

As a fun exercise, inject a NUL character into *anything* going into C. Oops. Security hole.

Not at all. C code can handle NULs just fine. However, if it expects text data it may truncate your string at the first NUL. But there is no security hole here. And again, nothing to do with XML.
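
The truncation behaviour described above can be seen in a small sketch. It uses Python's ctypes module as a stand-in for a C char array, and the payload bytes are invented for illustration. Reading the buffer as a NUL-terminated string stops at the first NUL, while length-aware access sees every byte. Whether the difference matters depends on what the surrounding code does with each view.

```python
import ctypes

# Hypothetical payload with an embedded NUL byte (made up for this example).
payload = b"user=alice\x00;admin=1"

# create_string_buffer allocates len(payload) + 1 bytes, much like a C char
# array initialised from the data plus a terminating NUL.
buf = ctypes.create_string_buffer(payload)

# .value treats the buffer as a NUL-terminated C string and stops at the
# first NUL, which is what strlen/strcpy-style code would see.
as_c_string = buf.value

# .raw returns every byte in the buffer, embedded NUL and terminator included,
# which is what code that tracks an explicit length would see.
as_raw_bytes = buf.raw

print(as_c_string)
print(as_raw_bytes)
```
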

Or, better yet, the number of programs that still have hard-coded string and buffer lengths. Oops. Stack smash.

Bigger problem.  But still nothing to do with XML.


XML doesn't help at all with the hard part: acting on the parsed tokens.

Actually, you missed something more fundamental. It forces programmers to deal with "parsed tokens". That's a *big* step up for most programmers who normally just "throw a couple regexps down".


Except that it doesn't. It only forces you to do so if you use an XML parser that generates tokens. Not all do.
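
As a rough illustration of that distinction, here is a sketch using Python's standard xml.etree.ElementTree module. The sample document and its element names are invented. The event-style interface hands the caller a token-like stream of start and end events, while the tree-style interface builds the whole structure first, so the caller never sees individual tokens.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this example.
doc = b"<order><item qty='2'>widget</item><item qty='1'>gadget</item></order>"

# Event style: iterparse yields (event, element) pairs as the input is read,
# so the caller works through a token-like stream.
events = [
    (event, elem.tag)
    for event, elem in ET.iterparse(io.BytesIO(doc), events=("start", "end"))
]

# Tree style: fromstring parses everything and returns the root element;
# the caller queries the finished tree and never handles events.
root = ET.fromstring(doc)
items = [(item.get("qty"), item.text) for item in root.findall("item")]

print(events)
print(items)
```
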

XML is overkill. It tries to swat a mosquito with a sledgehammer. It takes a problem that can generally be solved in minutes, and gives you hours of fun debugging XML code. I have *never* seen a problem solved by XML that couldn't have been done just as easily, if not more so, without it.

And yet, somehow, nobody ever *did* solve the problems better or more easily.

Sure we did.  How else did anything get written before XML?

Before XML, everybody generated their own binary formats in spite of the fact that a perfectly good standard existed (ASN.1). In addition, nobody ever documented these formats, either.

And there are damn good reasons for those binary formats: speed of parsing, storage space, etc. Not being human readable is frequently a bonus too. For every coder who goes in and hacks something cool out of a human-readable file, 20 morons corrupt the file.


Instead of having to waste time reverse engineering the format, XML puts the format directly in front of the humans. Yes, normally you use the machine; however, when you *need* to look at it by eye, you *can*.

I can read binary or non-XML text quite well.  Just give me the format.


Writing a parser for a spec like XML is hard; that's why most XML parsers are buggy. Writing a parser for a small domain language is quite easy. It's a simple state machine. For middle-sized languages, you have lex and yacc.
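
For a sense of scale, here is a minimal sketch of such a state machine. The format is a made-up one for illustration: semicolon-separated key=value pairs where a value may be double-quoted so it can contain a semicolon. The function name parse_pairs and the three states are assumptions of this example, not anything from the thread.

```python
def parse_pairs(text):
    """Parse 'key=value;key="quoted;value"' into a dict, one character at a time."""
    KEY, VALUE, QUOTED = range(3)  # the three states of the machine
    state = KEY
    key, value = [], []
    result = {}

    def finish():
        # Store the pair collected so far and reset the buffers.
        name = "".join(key).strip()
        if name:
            result[name] = "".join(value)
        key.clear()
        value.clear()

    for ch in text:
        if state == KEY:
            if ch == "=":
                state = VALUE
            elif ch == ";":
                if "".join(key).strip():
                    raise ValueError("key without '='")
                key.clear()  # tolerate empty entries such as ';;'
            else:
                key.append(ch)
        elif state == VALUE:
            if ch == '"' and not value:
                state = QUOTED  # opening quote at the start of a value
            elif ch == ";":
                finish()
                state = KEY
            else:
                value.append(ch)
        else:  # QUOTED: everything up to the closing quote is literal
            if ch == '"':
                state = VALUE
            else:
                value.append(ch)

    if state == QUOTED:
        raise ValueError("unterminated quote")
    if state == VALUE:
        finish()
    elif "".join(key).strip():
        raise ValueError("key without '='")
    return result


print(parse_pairs('name=demo;path="/tmp/a;b";mode=fast'))
```

This sketch deliberately skips escapes inside quotes and error positions; a real format would need to decide on those.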

The fact that you recommend lex and yacc sums up the problem quite nicely. There are many superior tools for parsing and lexing, and yet nobody ever uses them.

Do you prefer flex and Bison?  Same thing, new model.

That isn't just XML.  All formats eventually get corner cases.

Nope. Not unless you misdesign it. Anything as big as XML will, but XML is stupidly big. That's why the idea of a meta-format for creating formats is not a good one.



TCP demands *perfect* packets for a reason.

Being liberal is *not* always a good thing. Being pedantic is necessary when doing data interchange.


Apparently you've never seen code to talk to Telnet or FTP servers. Writing a client for one of those requires more special cases and workarounds for specific servers than you can shake a stick at. Yet somehow, it all works.


And you could have saved yourself a lot of work in 99% of cases by not using XML, and not having to worry about the nasty gnarly stuff at all. Just write a language that does what you need, no more, no less.

And my solution eventually evolves to be just as complex as XML. I'd rather start at the debugged endpoint, thanks.

Except that it wouldn't. XML is as crufty as it is because it does everything. It incorporates Unicode, it has 15 cadgillion ways to validate, it slices, it dices, and yes, it even makes julienne fries. If you were looking to replace XML with another meta-standard, yes, it would be about as complex. If you were looking to make just what's needed, it wouldn't be 1/10th as bad.

Gabe


--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg