On 3/2/2010 6:54 PM, Ian Hickson wrote:
On Tue, 2 Mar 2010, Elliotte Rusty Harold wrote:
Briefly it seems that <? causes the parser to go into Bogus comment
state, which is fair enough. (I wouldn't really recommend that anyone
use processing instructions in HTML syntax anyway.) However the parser
comes out of that state at the first >. Because processing instructions
can contain > and terminate only at the two character sequence ?> this
could cause PI processing to terminate early and leave a lot more error
handling and a confused parser state in the text yet to come.
In HTML4, PIs ended at the first >, not at ?>. "<?target data>" is the
syntax of PIs when the SGML options used by HTML4 are applied.

In any case, the parser in HTML5 is based on what browsers do, which is
also to terminate at the first >. It's unlikely that we can change that,
given backwards-compatibility needs.
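
To make the difference between the two termination rules concrete, here is a minimal Python sketch. It is not taken from any parser implementation: the function names and the sample string are hypothetical, and falling back to end-of-input when no terminator is found is a simplification of my own. It only shows where each rule would consider a PI to end.

```python
def pi_end_html(text: str, start: int = 0) -> int:
    """Return the index just past the PI under the HTML rule (first '>')."""
    end = text.find(">", start)
    # Simplification: treat a missing terminator as running to end of input.
    return len(text) if end == -1 else end + 1


def pi_end_xml(text: str, start: int = 0) -> int:
    """Return the index just past the PI under the XML rule (first '?>')."""
    end = text.find("?>", start)
    # Simplification: treat a missing terminator as running to end of input.
    return len(text) if end == -1 else end + 2


# Hypothetical input: a PI whose data contains a bare '>'.
sample = "<?example a > b ?><p>rest</p>"

html_end = pi_end_html(sample)
xml_end = pi_end_xml(sample)

print(repr(sample[:html_end]), "| leftover:", repr(sample[html_end:]))
print(repr(sample[:xml_end]), "| leftover:", repr(sample[xml_end:]))
```

Under the first-> rule the construct ends after "<?example a >", so " b ?>" is left over and gets treated as ordinary text ahead of the <p> element. Under the ?> rule the whole PI is consumed and only the <p> element remains.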

Are there really a lot of folks out there depending on old HTML4-style processing instructions not being broken? As I understand it, such HTML4 processing instructions were not even used by any standard at that time, XHTML 1.0+ processing instructions brought the XML form into practice, and X/HTML5 has made a lot of progress on harmonizing HTML and XHTML. Given all that, I'd think it would really be ideal if this issue did not get in the way (along with the unfortunate loss of external DTDs)...

As long as website creators have the freedom to be sloppy, why not go a little further to make XML compatibility better? It'd be a whole lot more appealing to work in both environments out of the box than to deal with complex server-side conversion solutions...

Brett
