On Jun 7, 2007, at 15:00, Anne van Kesteren wrote:

> These should be converted to LF too. One thing that might be interesting to look into is the handling of LFCR in browsers (as opposed to CRLF). I haven't done that yet... Some browsers (just tested Opera) also normalize two newline entities following each other (CRLF pair).

This requires more code. I haven't analyzed the perf impact, but intuitively this requires either naïve and inefficient buffer retraversal in the tree builder or additional complexity in the tokenizer's buffer management (assuming the tokenizer is doing efficient buffering to begin with).
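
To illustrate the kind of extra state involved, here is a minimal Python sketch of a streaming filter that collapses CR and CRLF to LF even when the CR and the LF arrive in separate chunks (as they would if each came from its own character reference token). The class name `NewlineNormalizer` and its `feed` method are hypothetical and are not taken from any browser or parser; a real tokenizer would fold this flag into its existing buffer handling rather than copy characters one by one.

```python
class NewlineNormalizer:
    """Hypothetical streaming filter: CR and CRLF become LF across chunk boundaries."""

    def __init__(self):
        # State that must survive between chunks (or tokens).
        self.prev_was_cr = False

    def feed(self, chunk: str) -> str:
        out = []
        for ch in chunk:
            if ch == "\r":
                # Emit LF right away; remember the CR in case an LF follows.
                out.append("\n")
                self.prev_was_cr = True
            elif ch == "\n" and self.prev_was_cr:
                # Second half of a CRLF pair: the LF was already emitted.
                self.prev_was_cr = False
            else:
                out.append(ch)
                self.prev_was_cr = False
        return "".join(out)


if __name__ == "__main__":
    n = NewlineNormalizer()
    # The CR and the LF arrive in different chunks but still collapse to one LF.
    print(repr(n.feed("a\r") + n.feed("\nb")))
```

Note that under this rule LFCR is not a pair: the LF passes through and the lone CR becomes a second LF.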

You can't protect the DOM from getting CRs if someone insists on putting them there using JS or XML. Is it worthwhile to prevent escaped CRs from ending up in the DOM as CRs in HTML? Is special handling required for compat?

I'd try doing exactly what XML does here unless compat requires otherwise.
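
For reference, a rough Python sketch of the XML behaviour I mean: literal CRLF and lone CR in the input are normalized to LF before parsing, while a CR written as a numeric character reference is expanded afterwards and is left alone. The function names are made up for illustration, only numeric references are handled, and this is not code from any actual parser.

```python
import re

# Matches numeric character references such as &#13; or &#xD;
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def normalize_literal_newlines(text: str) -> str:
    """XML-style end-of-line handling: CRLF and lone CR both become LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def expand_char_refs(text: str) -> str:
    """Expand numeric character references with no newline normalization."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code)
    return _CHAR_REF.sub(repl, text)


def xml_style(text: str) -> str:
    """Normalize literal line breaks first, then expand references."""
    return expand_char_refs(normalize_literal_newlines(text))


if __name__ == "__main__":
    # Literal CRLF collapses to LF; the escaped CR LF pair survives as CR LF;
    # literal LFCR becomes two LFs.
    print(repr(xml_style("a\r\nb&#13;&#10;c\n\rd")))
```

With this ordering, escaped CRs reach the DOM as CRs, and LFCR is not treated as a pair.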

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/

