From: Andrew Lentvorski <[EMAIL PROTECTED]>
Gabriel Sechan wrote:
Because XML is the Emperor's New Clothes of computer technology. Some
people praise it to the heavens as the solution to everything. In
reality, it does next to nothing. We've had parser generators since the
70s, at least. And parsing is the *easy* part of dealing with data.
And we've had s-expressions for longer than that. So why are there so many
programs that *still* can't cope with any of the following
"[EMAIL PROTECTED]&*()_+-={}[],.<>" in data?
Generally, that's a command line issue, nothing more or less. And XML
doesn't have anything to do with that problem.
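For what it's worth, the command-line point can be sketched in a few lines of Python. This is only an illustration: the awkward value below is made up (it is not the string from the thread), and the child process is just a one-line echo.

```python
import shlex
import subprocess
import sys

# Hypothetical awkward value; not the exact string from the thread.
nasty = "a b;$HOME&*()_+-={}[],.<>'\""

# Passing an argument list means no shell ever re-parses the data.
child = "import sys; sys.stdout.write(sys.argv[1])"
result = subprocess.run(
    [sys.executable, "-c", child, nasty],
    capture_output=True, text=True, check=True,
)
echoed = result.stdout

# If a POSIX shell command line really is required, quote each value first.
quoted = shlex.quote(nasty)
round_tripped = shlex.split(quoted)
```

The value arrives in the child unchanged in both cases, which suggests the trouble lies in how the command line is built rather than in the data format.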
As a fun exercise, inject a NUL character into *anything* going into C.
Oops. Security hole.
Not at all. C code can handle NULs just fine. However, if it expects text
data it may truncate your string at the first NUL. But there is no
security hole here. And again, nothing to do with XML.
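A minimal sketch of that truncation, using Python's ctypes to borrow C string semantics. The payload and variable names are assumptions made up for the example.

```python
import ctypes

# Illustrative payload with an embedded NUL byte (made up for this sketch).
payload = b"user=alice\x00;admin=true"

# create_string_buffer copies the bytes into a C char array
# and appends a terminating NUL.
buf = ctypes.create_string_buffer(payload)

# .raw is the whole array, the way memcpy() or fwrite() would see it.
as_binary = buf.raw

# .value stops at the first NUL, the way strlen() or strcpy() would.
as_c_string = buf.value

print(len(as_binary), as_binary)
print(len(as_c_string), as_c_string)
```

Code that treats the buffer as binary keeps every byte; code that treats it as text sees only what comes before the NUL. Whether that difference matters depends on what the two halves of a program each assume.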
Or, better yet, the number of programs that still have hard-coded string
and buffer lengths. Oops. Stack smash.
Bigger problem. But still nothing to do with XML.
XML doesn't help at all with the hard part- acting on the parsed tokens.
Actually, you missed something more fundamental. It forces programmers to
deal with "parsed tokens". That's a *big* step up for most programmers who
normally just "throw a couple regexps down".
Except that it doesn't. It only forces you to do so if you use an XML
parser that generates tokens. Not all do.
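To illustrate the difference, here is a small sketch using two parsers from the Python standard library. The sample document is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# Invented sample document.
doc = "<order id='7'><item>widget</item></order>"

# Event style: the parser hands over a stream of tokens
# and the programmer writes a handler for each kind.
events = []
parser = expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: events.append(("start", name, attrs))
parser.EndElementHandler = lambda name: events.append(("end", name))
parser.CharacterDataHandler = lambda text: events.append(("text", text))
parser.Parse(doc, True)

# Tree style: the parser builds the whole structure
# and the programmer never sees a token.
root = ET.fromstring(doc)
order_id = root.get("id")
item_text = root.find("item").text
```

With the tree API the caller just asks for the values it wants, so nothing about XML itself forces anyone to think in tokens.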
XML is overkill. It tries to swat a mosquito with a sledgehammer. It
takes a problem that can generally be solved in minutes, and gives you hours
of fun debugging XML code. I have *never* seen a problem solved by XML
that couldn't have been done just as easily- if not more so- without it.
And yet, somehow, nobody ever *did* solve the problems better or more
easily.
Sure we did. How else did anything get written before XML?
Before XML, everybody generated their own binary formats in spite of the
fact that a perfectly good standard existed (ASN.1). In addition, nobody
ever documented these formats, either.
And there's damn good reasons for those binary formats- speed of parsing,
storage space, etc. Not being human readable is frequently a bonus too-
for every coder who goes in and hacks something cool out of a human-readable
file, 20 morons corrupt the file.
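A rough sketch of the storage-space side of that argument, in Python. The record layout and the XML element names are assumptions chosen for the example, and it measures size only, not parsing speed.

```python
import struct

# Hypothetical record: an integer id plus two coordinates.
point_id, lat, lon = 42, 32.7157, -117.1611

# Fixed binary layout: little-endian unsigned 32-bit int followed by two doubles.
RECORD = struct.Struct("<Idd")
binary = RECORD.pack(point_id, lat, lon)

# The same record written out as XML text.
xml_text = f'<point id="{point_id}"><lat>{lat}</lat><lon>{lon}</lon></point>'

print(len(binary), "bytes as binary")
print(len(xml_text.encode("ascii")), "bytes as XML")

# Reading the binary form back is one fixed-size unpack.
decoded = RECORD.unpack(binary)
```

The binary record is a fixed 20 bytes regardless of the values, while the XML form spends most of its length on tags.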
Instead of having to waste time reverse engineering the format, XML puts
the format directly in front of the humans. Yes, normally you use the
machine; however, when you *need* to look at it by eye, you *can*.
I can read binary or non-XML text quite well. Just give me the format.
Writing a parser for a spec like XML is hard- that's why most XML parsers
are buggy. Writing a parser for a small domain language is quite easy.
It's a simple state machine. For middle sized languages, you have lex and
yacc.
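As a sketch of what such a state machine might look like, here is a hand-written parser in Python for a made-up mini-language of the form key=value;key="quoted value". The grammar and the function name parse_pairs are assumptions for illustration only.

```python
def parse_pairs(text):
    """Parse 'key=value;key="quoted value"' into a dict, one character at a time."""
    KEY, VALUE, QUOTED = range(3)
    state, key, buf, pairs = KEY, "", [], {}
    for ch in text + ";":  # trailing sentinel flushes the last pair
        if state == KEY:
            if ch == "=":
                key = "".join(buf).strip()
                if not key:
                    raise ValueError("empty key")
                buf, state = [], VALUE
            elif ch == ";":
                if "".join(buf).strip():
                    raise ValueError("key without '='")
                buf = []
            else:
                buf.append(ch)
        elif state == VALUE:
            if ch == '"' and not buf:
                state = QUOTED  # opening quote at the start of a value
            elif ch == ";":
                pairs[key] = "".join(buf)
                buf, state = [], KEY
            else:
                buf.append(ch)
        else:  # QUOTED: everything up to the closing quote is literal
            if ch == '"':
                state = VALUE
            else:
                buf.append(ch)
    if state == QUOTED:
        raise ValueError("unterminated quote")
    return pairs


config = parse_pairs('name=gabe;motto="it slices; it dices"')
```

Three states and a loop cover this toy grammar, including a semicolon inside quotes. Anything with nesting or escapes is where a generator like lex and yacc starts to pay off.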
The fact that you recommend lex and yacc sums up the problem quite nicely.
There are many superior tools for parsing and lexing, and yet nobody ever
uses them.
Do you prefer flex and Bison? Same thing, new model.
That isn't just XML. All formats eventually get corner cases.
Nope. Not unless you misdesign it. Anything as big as XML will, but XML is
stupidly big. That's why the idea of a meta-format for creating formats is
not a good one.
TCP demands *perfect* packets for a reason.
Being liberal is *not* always a good thing. Being pedantic is necessary
when doing data interchange.
Apparently you've never seen code to talk to telnet servers, or ftp.
Writing a client for one of those requires more special cases and
workarounds for specific servers than you can shake a stick at. Yet
somehow, it all works.
And you could have saved yourself a lot of work in 99% of cases by not
using XML, and not having to worry about the nasty gnarly stuff at all.
Just write a language that does what you need, no more no less.
And my solution eventually evolves to be just as complex as XML. I'd
rather start at the debugged endpoint, thanks.
Except that it wouldn't. XML is as crufty as it is because it does
everything. It incorporates Unicode, it has 15 cadgillion ways to
validate, it slices, it dices, and yes it even makes julienne fries. If you
were looking to replace XML with another meta-standard, yes it would be
about as complex. If you were looking to make just what's needed, it
wouldn't be 1/10th as bad.
Gabe
--
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg