On Jun 26, 11:07 am, Grant Edwards <[EMAIL PROTECTED]> wrote:
> On 2008-06-26, Stefan Behnel <[EMAIL PROTECTED]> wrote:
>
> > Why not use an HTML parser instead?
>
> Stating it differently: in order to correctly recognize HTML
> tags, you must use an HTML parser. Trying to write an HTML
> parser in a single RE is probably not practical.

s/practical/possible

It isn't *possible* to grok HTML with regular expressions. Individual
tags--yes. But not a full element where nesting is possible. At least
not properly.

Maybe we need some notes on the limits of regular expressions in the re
documentation for people who haven't taken the computer science courses
on parsing and grammars. Then we could explain the necessity of real
parsers and grammars, at least in layman's terms.
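
To make the nesting point concrete, here is a rough sketch, not a recommendation for any particular tool. It assumes Python 3's standard html.parser module; the sample string and the names SAMPLE, NAIVE_DIV, DivText, regex_divs and parser_divs are made up for illustration. The regex stops at the first closing tag it sees, so it cuts the outer element short. The parser-based version keeps a stack of open elements and pairs each closing tag with the right opener.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample input: a <div> nested inside another <div>.
SAMPLE = "<div>outer <div>inner</div> tail</div>"

# Naive attempt: non-greedy match from an opening <div> to the first </div>.
NAIVE_DIV = re.compile(r"<div>(.*?)</div>", re.DOTALL)


class DivText(HTMLParser):
    """Collect the text of each <div>, tracking nesting with a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one list of text chunks per currently open <div>
        self.elements = []   # text of completed elements, innermost first

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.stack.append([])

    def handle_data(self, data):
        # Text belongs to every <div> that is currently open.
        for chunks in self.stack:
            chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.elements.append("".join(self.stack.pop()))


def regex_divs(html):
    """Return what the naive regex thinks each <div> contains."""
    return NAIVE_DIV.findall(html)


def parser_divs(html):
    """Return the text of each <div> as seen by a real parser."""
    parser = DivText()
    parser.feed(html)
    parser.close()
    return parser.elements


if __name__ == "__main__":
    print(regex_divs(SAMPLE))   # ['outer <div>inner']
    print(parser_divs(SAMPLE))  # ['inner', 'outer inner tail']
```

Making the pattern greedy only moves the problem: it then swallows everything between the first opener and the last closer, which is wrong as soon as there are two sibling elements. Matching closers to openers takes some form of counting, and that is what the stack provides.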