On Sun, 7 Nov 2004, André Malo wrote:

> * Nick Kew <[EMAIL PROTECTED]> wrote:
>
> > BTW, the "what is a comment" problem is easier than it looks, as both
> > <script> and <style> are declared in HTML as having CDATA content.
> > That makes it trivial to distinguish them from "inert" comments.
>
> but in xhtml it's PCDATA, which makes them real xml comments...

So long as we're in web-browser-compatible land, we can parse XHTML with
an HTML parser that knows about CDATA.  And when we move out of it,
we're also leaving commented <script> and <style> contents behind.
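
To make the CDATA point concrete, here is a minimal sketch using Python's
standard html.parser, which treats <script> and <style> as CDATA content.
It is only an illustration: the names CommentSorter and sort_comments are
hypothetical and are not taken from any of the modules discussed in this
thread.

```python
from html.parser import HTMLParser


class CommentSorter(HTMLParser):
    """Separate real ("inert") comments from comment-like text in CDATA elements.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.inert_comments = []  # genuine markup comments
        self.cdata_text = []      # raw text found inside <script>/<style>
        self._open = None         # name of the CDATA element we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._open = tag

    def handle_endtag(self, tag):
        if tag == self._open:
            self._open = None

    def handle_comment(self, data):
        # Only reached for comments outside <script>/<style>, because the
        # parser passes the contents of those elements through as plain data.
        self.inert_comments.append(data)

    def handle_data(self, data):
        if self._open is not None:
            self.cdata_text.append(data)


def sort_comments(markup):
    """Return (list of inert comments, concatenated script/style text)."""
    parser = CommentSorter()
    parser.feed(markup)
    parser.close()
    return parser.inert_comments, "".join(parser.cdata_text)
```

Fed a page with one ordinary comment and one "commented-out" script body,
sort_comments reports only the former as a comment; the "<!--" inside
<script> comes back as part of the script text, so a comment stripper
built this way would leave it alone.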

I'll grant there are other pathological edge-cases due to the ways
people abuse markup.  That's one very good reason none of the modules
I mentioned defaults to stripping comments:-)

-- 
Nick Kew
