2006/12/21, Anne van Kesteren:
On Thu, 21 Dec 2006 11:08:51 +0100, Thomas Broyer wrote:

> Before DOCTYPE name state:
> http://www.whatwg.org/specs/web-apps/current-work/#before1
> """
> ↪ U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
>     Create a new DOCTYPE token. Set the token's name to the
> uppercase version of the current input character (subtract 0x0020 from
> the character's code point), and mark it as being in error. Switch to
> the DOCTYPE name state.
> """
>
> DOCTYPE name state
> http://www.whatwg.org/specs/web-apps/current-work/#doctype1
> """
> ↪ U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
>     Append the uppercase version of the current input character
> (subtract 0x0020 from the character's code point) to the current
> DOCTYPE token's name. Stay in the DOCTYPE name state.
> """
>
> Why is the DOCTYPE marked "in error" in the former case?
>
> In other words, why would <!DOCTYPE html> be "in error" while
> <!DOCTYPE Html> wouldn't?
>
> My guess is that it's a bug in the "Before DOCTYPE name state".

It's not. The "DOCTYPE name state" also has this paragraph: "Then, if the
name of the DOCTYPE token is exactly the four letters "HTML", then mark
the token as being correct. Otherwise, mark it as being in error."

But it also has this note, which is quite confusing: "Because
lowercase letters in the name are uppercased by the algorithm above,
the "HTML" letters are actually case-insensitive relative to the
markup."

However, section 8.1.1 says:
http://www.whatwg.org/specs/web-apps/current-work/#doctype
"""
In other words, <!DOCTYPE HTML>, case-insensitively.
"""

So I guess you're right.

Still, the tokenization stage remains a bit confusing…
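
To check my reading, here is a rough Python sketch of the two states as
quoted above. The names DoctypeToken, uppercase and tokenize_doctype_name
are mine, not the spec's, and it only covers the characters of the name
itself (no whitespace or ">" handling). Keeping non-lowercase characters
unchanged and treating an empty name as in error are assumptions on my part.

```python
from dataclasses import dataclass


@dataclass
class DoctypeToken:
    # Hypothetical token type: just the two fields discussed in this thread.
    name: str = ""
    in_error: bool = True


def uppercase(ch: str) -> str:
    # Quoted rule: for U+0061..U+007A, subtract 0x0020 from the code point.
    if "a" <= ch <= "z":
        return chr(ord(ch) - 0x0020)
    # Assumption: any other character is kept unchanged.
    return ch


def tokenize_doctype_name(name_chars: str) -> DoctypeToken:
    """Feed the characters of a DOCTYPE name through the two states."""
    token = None
    for ch in name_chars:
        if token is None:
            # "Before DOCTYPE name state": create the token and mark it
            # as being in error for now.
            token = DoctypeToken(name=uppercase(ch), in_error=True)
        else:
            # "DOCTYPE name state": append, then re-check the whole name.
            token.name += uppercase(ch)
            token.in_error = token.name != "HTML"
    # Assumption: an empty name yields a token that is in error.
    return token if token is not None else DoctypeToken()
```

With this, "html", "Html" and "HTML" all end up marked as correct, while
"htm" or "htmlx" stay in error. The "in error" mark set in the first state
is therefore only a provisional default that the second state overrides
once the name is exactly "HTML".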

--
Thomas Broyer
