On Jun 26, 11:07 am, Grant Edwards <[EMAIL PROTECTED]> wrote:
> On 2008-06-26, Stefan Behnel <[EMAIL PROTECTED]> wrote:
> >
> > Why not use an HTML parser instead?
> >
>
> Stating it differently: in order to correctly recognize HTML
> tags, you must use an HTML parser.  Trying to write an HTML
> parser in a single RE is probably not practical.
>

s/practical/possible/

It isn't *possible* to grok HTML with regular expressions. Individual
tags, yes. But not a full element, because elements can nest to any
depth and a regular expression can't keep count of the open tags it
still has to close. At least not properly.
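
A minimal sketch of the difference, using only the standard library.
The sample string NESTED, the class DivText, and the helper
outer_div_text are made-up names for illustration. The regex is just
one naive attempt, not the only pattern someone might try.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample input: one <div> nested inside another.
NESTED = "<div>outer <div>inner</div> tail</div>"

# Naive attempt: a non-greedy match stops at the FIRST </div>,
# so it cuts the outer element short.
naive = re.compile(r"<div>(.*?)</div>", re.DOTALL)
regex_result = naive.search(NESTED).group(1)


class DivText(HTMLParser):
    """Collect text inside <div> elements while tracking nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many <div> elements are currently open
        self.max_depth = 0    # deepest nesting seen so far
        self.chunks = []      # text found while inside at least one <div>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def outer_div_text(html):
    """Return (text inside the divs, maximum nesting depth)."""
    parser = DivText()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks), parser.max_depth


print(repr(regex_result))      # 'outer <div>inner'
print(outer_div_text(NESTED))  # ('outer inner tail', 2)
```

The regex hands back a fragment with an unclosed tag in it. The parser
gets the whole element because it keeps a depth counter, which is
exactly the state a regular expression has nowhere to store.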

Maybe we need some notes on the limits of regular expressions in the
re documentation for people who haven't taken the computer science
courses on parsing and grammars. Then we could explain the necessity
of real parsers and grammars, at least in layman's terms.
--
http://mail.python.org/mailman/listinfo/python-list
