On Thu, 22 Oct 2009 17:08:21 -0300, <ru...@yahoo.com> wrote:
On 10/22/2009 03:23 AM, Gabriel Genellina wrote:
On Wed, 21 Oct 2009 15:14:32 -0300, <ru...@yahoo.com> wrote:
On Oct 21, 4:59 am, Bruno Desthuilliers <bruno.42.desthuilli...@websiteburo.invalid> wrote:
beSTEfar wrote:
(snip)
> When parsing strings, use Regular Expressions.
And now you have _two_ problems <g>
For some simple parsing problems, Python's string methods are powerful
enough to make REs overkill. And for any complex enough parsing (any
recursive construct for example - think XML, HTML, any programming
language, etc.), REs are just NOT enough by themselves - you need a
full-blown parser.
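
As a rough illustration of the "string methods are enough" case, here is a
minimal sketch that splits a "key = value" line without any RE. The function
name parse_setting and the input format are made up for this example, not
taken from the thread.

```python
def parse_setting(line):
    """Split a 'key = value' line using only string methods.

    Hypothetical helper: the 'key = value' format is an assumption
    chosen for illustration.
    """
    # partition() splits on the first '=' only, so the value may
    # itself contain '=' characters.
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError("no '=' found in %r" % line)
    return key.strip(), value.strip()


print(parse_setting("  timeout = 30 "))
print(parse_setting("path=/a=b"))
```

Because partition() stops at the first separator, nothing here needs
escaping or grouping, which is most of what an RE would add for this job.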
But keep in mind that many XML, HTML, etc parsing problems
are restricted to a subset where you know the nesting depth
is limited (often to 0 or 1), and for that large set of
problems, RE's *are* enough.
I don't think so. Nesting isn't the only problem. REs cannot handle
comments, for example. And you must support unquoted attributes, single and
double quotes, any attribute ordering, empty tags, arbitrary whitespace...
If you don't, you are not reading XML (or HTML), only a specific file
format that resembles XML but actually isn't.
OK, then let me rephrase my point as: in the real world it is often
not necessary to parse XML in its full generality; parsing, as you
put it, "a specific file format that resembles XML" is all that is
really needed.
Given that using a real XML parser like ElementTree is as easy as (or even
easier than) building a regular expression, and more robust, and more
likely to survive small changes in the input format, why use the worse
solution?
REs are good at solving some problems, but parsing XML isn't one of them.
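
A minimal sketch of what I mean, assuming a small made-up document (the
element and attribute names are hypothetical). ElementTree copes with the
comment, both quote styles, either attribute order, the empty tag, and the
odd whitespace; a naive RE written for one particular layout does not.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample input, invented for illustration only.
SAMPLE = """<items>
  <!-- <item id="0" name="commented out"/> -->
  <item id="1" name="first"/>
  <item name='second' id='2'></item>
  <item
      id = "3"
      name = "third" />
</items>"""


def names_with_elementtree(text):
    """Return (id, name) pairs using a real XML parser."""
    root = ET.fromstring(text)  # the default parser discards comments
    return [(item.get("id"), item.get("name")) for item in root.iter("item")]


# A naive pattern that assumes double quotes, 'id' before 'name',
# and exactly one space between tokens.
NAIVE = re.compile(r'<item id="(\d+)" name="([^"]*)"')


def names_with_naive_regex(text):
    """Return (id, name) pairs using the naive pattern above."""
    return NAIVE.findall(text)


print(names_with_elementtree(SAMPLE))
print(names_with_naive_regex(SAMPLE))
```

The ElementTree version returns the three real items. The RE version
returns the commented-out item plus the first one, and silently misses the
other two.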
--
Gabriel Genellina