The following entity tests all look like they're incorrect:

Name: Entity in attribute without semicolon ending in x
Input: <h a='&notx'>

Name: Entity in attribute without semicolon ending in 1
Input: <h a='&not1'>

Name: Entity in attribute without semicolon ending in i
Input: <h a='&noti'>

Name: Undefined named entity in attribute value ending in semicolon and whose name starts with a known entity name.
Input: <h a='&noti;'>

In each case, the test expects a single error (presumably for the lack of a trailing semicolon after the last matched character, which is 't'), but the wording of the spec does NOT make these cases an error. Here's the relevant text from the spec:

"If the character reference is being consumed as part of an attribute, and the last character matched is not a U+003B SEMICOLON character (;), and the next character is either a U+003D EQUALS SIGN character (=) or in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL LETTER Z, or U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL LETTER Z, then, for historical reasons, all the characters that were matched after the U+0026 AMPERSAND character (&) must be unconsumed, and nothing is returned."

In these tests:

1. The character reference is being consumed as part of an attribute.
2. The last character matched ('t') is not a semicolon.
3. In each case, the next character is alphanumeric, so the spec says to unconsume the matched characters and return nothing, without emitting an error.

The other set of tests that appear incorrect deal with the opposite case, in which no named entity was matched. These tests are the "Bad named entity: XXXXX without a semi-colon" tests along with a few other random ones:

Name: Entity name followed by the equals sign in an attribute value.
Input: <h a='&lang='>

Name: Entity without a name
Input: &;

Name: Non-allowed ' after ampersand in attribute value
Input: <z z=\"&'\">

Name: Non-allowed \" after ampersand in attribute value
Input: <z z='&\"'>

Name: Non-ASCII character reference name
Input: &\u00AC;

Name: Partial entity match at end of file
Input: I'm &no

Name: Text after bogus character reference
Input: <z z='&xlink_xmlns;'>bar<z>

Name: Unfinished entity
Input: &f

Here's the relevant text from the spec:

"If no match can be made, then no characters are consumed, and nothing is returned. In this case, if the characters after the U+0026 AMPERSAND character (&) consist of a sequence of one or more characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL LETTER Z, and U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL LETTER Z, followed by a U+003B SEMICOLON character (;), then this is a parse error."

These tests expect an error, but no error should be emitted. For an unmatched named entity to be an error, the ampersand needs to be followed by at least one alphanumeric character and then, immediately after that run, a semicolon, which is not the case in any of these tests.
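
To make this reading of the two quoted passages concrete, here is a rough Python sketch of how I would expect them to play out. It is not html5lib code: the names KNOWN, longest_match and is_parse_error are made up for illustration, KNOWN is a tiny stand-in for the real entity table, and the final "matched but no semicolon" branch relies on a spec sentence that is not quoted above.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# Hypothetical stand-in for the spec's named character reference table.
# Assumption: "not" is valid with or without a semicolon, "lang" only with one.
KNOWN = {"not": "\u00ac", "not;": "\u00ac", "lang;": "\u27e8"}


def longest_match(text):
    """Return the longest name in KNOWN that is a prefix of text, or None."""
    candidates = [name for name in KNOWN if text.startswith(name)]
    return max(candidates, key=len) if candidates else None


def is_parse_error(after_amp, in_attribute):
    """Apply the two quoted rules to the text that follows an ampersand."""
    name = longest_match(after_amp)
    if name is None:
        # No match: an error only for one or more alphanumerics followed by ';'.
        i = 0
        while i < len(after_amp) and after_amp[i] in ALNUM:
            i += 1
        return i > 0 and after_amp[i:i + 1] == ";"
    if name.endswith(";"):
        return False
    next_char = after_amp[len(name):len(name) + 1]
    if in_attribute and next_char and (next_char == "=" or next_char in ALNUM):
        # "Historical reasons" rule: unconsume everything, return nothing.
        return False
    # Matched without a trailing semicolon (assumed to be an error, based on
    # a spec sentence that is not quoted in this message).
    return True
```

Under these assumptions, is_parse_error("notx'>", True) and is_parse_error("xlink_xmlns;'>bar<z>", True) both come out False, while an unmatched name such as "foo;" is reported as an error.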