[whatwg] Missing comma (8.2.4.1)

2008-12-20 Thread Kartikaya Gupta
Section 8.2.4.1, for the '<' input, says: When the content model flag is set to either the RCDATA state or the CDATA state and the escape flag is false: switch to the tag open state. I think the lack of commas in this sentence makes it ambiguous: it can either be interpreted as (cmf ==
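
The two possible groupings of the quoted sentence can be spelled out as boolean expressions. This is a minimal sketch in Python; the names `cmf`, `escape`, `reading_a` and `reading_b` are hypothetical and chosen only for illustration, and neither reading is asserted here to be the one the spec intends.

```python
STATES = ("PCDATA", "RCDATA", "CDATA", "PLAINTEXT")


def reading_a(cmf: str, escape: bool) -> bool:
    # Grouping 1: (RCDATA or CDATA) and the escape flag is false.
    return cmf in ("RCDATA", "CDATA") and not escape


def reading_b(cmf: str, escape: bool) -> bool:
    # Grouping 2: RCDATA, or (CDATA and the escape flag is false).
    return cmf == "RCDATA" or (cmf == "CDATA" and not escape)


# Collect every (state, escape flag) pair on which the two groupings disagree.
differences = [
    (cmf, escape)
    for cmf in STATES
    for escape in (False, True)
    if reading_a(cmf, escape) != reading_b(cmf, escape)
]
print(differences)
```

Under these assumptions the two groupings disagree only when the content model flag is RCDATA and the escape flag is true, so that is the one case the missing comma leaves open.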

[whatwg] Byte-wise tokenization algorithm

2008-12-20 Thread Edward Z. Yang
I am currently working on a PHP5 implementation of the HTML5 specification. PHP has abysmal Unicode support, and implementing Unicode streams in userspace may be unacceptably slow. Thus, my questions: 1. Given an input stream that is known to be valid UTF-8, is it possible to implement the

Re: [whatwg] Byte-wise tokenization algorithm

2008-12-20 Thread Ian Hickson
On Sat, 20 Dec 2008, Edward Z. Yang wrote: I am currently working on a PHP5 implementation of the HTML5 specification. PHP has abysmal Unicode support, and implementing Unicode streams in userspace may be unacceptably slow. Thus, my questions: 1. Given an input stream that is known to be
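
One property of UTF-8 bears on the byte-wise question: in a valid UTF-8 stream, every byte of a multi-byte sequence is 0x80 or above, so a byte below 0x80 always stands for an ASCII character in its own right. The following is a minimal sketch in Python (not PHP, and not taken from the thread) of scanning for an ASCII delimiter at the byte level under that assumption. The function name `find_ascii_byte` and the sample input are hypothetical.

```python
def find_ascii_byte(data: bytes, delimiter: bytes, start: int = 0) -> int:
    """Return the byte offset of an ASCII delimiter in UTF-8 data, or -1.

    Assumes `data` is already known to be valid UTF-8; no validation is done.
    """
    if len(delimiter) != 1 or delimiter[0] >= 0x80:
        raise ValueError("delimiter must be a single ASCII byte")
    # A byte below 0x80 cannot be part of a multi-byte sequence in valid
    # UTF-8, so a plain byte search cannot match inside another character.
    return data.find(delimiter, start)


data = "héllo <b>wörld</b>".encode("utf-8")
offset = find_ascii_byte(data, b"<")

# The match falls on a character boundary, so the prefix decodes cleanly.
prefix = data[:offset].decode("utf-8")
```

The sketch covers only the search for ASCII-range delimiters; any step that needs the code point of a non-ASCII character would still have to decode the sequence.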