Re: [whatwg] Entity parsing

Allan Sandfeld Jensen Sat, 23 Jun 2007 05:57:21 -0700

On Friday 15 June 2007 03:05, Ian Hickson wrote:
> On Sun, 5 Nov 2006, Øistein E. Andersen wrote:
> > From section 9.2.3.1. Tokenising entities:
> > >  For some entities, UAs require a semicolon, for others they don't.
> >
> > This applies to IE.
> >
> > FWIW, the entities not requiring a semicolon are the ones encoding
> > Latin-1 characters, the other HTML 3.2 entities (&amp, &gt and &lt), as
> > well as &quot and the uppercase variants (&AMP, &COPY, &GT, &LT, &QUOT
> > and &REG). [...]
>
> I've defined the parsing and conformance requirements in a way that
> matches IE. As a side-effect, this has made things like "na&iumlve"
> actually conforming. I don't know if we want this. On the one hand, it's
> pragmatic (after all, why require the semicolon?), and is equivalent to
> not requiring quotes around attribute values. On the other, people don't
> want us to make the quotes optional either.


What about the Gecko entity parsing extension?

- IE consistently parses unterminated entities from Latin-1.
- Gecko parses all unterminated entities, even those beyond Latin-1, but only
in text content, not in attributes. (It seems my recent Firefox now also
supports the IE parsing in attributes.)
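
To make the difference concrete, here is a rough sketch of the two behaviours in Python. It is not how either browser is implemented: the entity table is a tiny hand-picked subset, the names decode, ENTITIES and SEMICOLON_OPTIONAL are made up for this illustration, and the longest-prefix matching for unterminated names is an assumption chosen because it reproduces results such as "&notat" becoming "¬at". Attribute values are not modelled at all.

```python
# Tiny illustrative table; a real parser would use the full HTML entity list.
ENTITIES = {
    "amp": "&", "lt": "<", "gt": ">", "quot": '"',
    "not": "\u00ac", "iuml": "\u00ef", "copy": "\u00a9",  # Latin-1
    "notin": "\u2209", "ge": "\u2265", "le": "\u2264",    # beyond Latin-1
}

# Names decoded even without a trailing semicolon (IE-style subset).
SEMICOLON_OPTIONAL = {"amp", "lt", "gt", "quot", "not", "iuml", "copy"}


def decode(text, optional=SEMICOLON_OPTIONAL):
    """Decode entities; names in `optional` do not need a semicolon."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "&":
            out.append(text[i])
            i += 1
            continue
        # Collect the run of letters and digits following the '&'.
        j = i + 1
        while j < len(text) and text[j].isalnum():
            j += 1
        run = text[i + 1:j]
        # Properly terminated entity: any known name is decoded.
        if j < len(text) and text[j] == ";" and run in ENTITIES:
            out.append(ENTITIES[run])
            i = j + 1
            continue
        # Unterminated: try the longest prefix that may omit the semicolon
        # (assumed matching rule, see the note above).
        for k in range(len(run), 0, -1):
            if run[:k] in optional:
                out.append(ENTITIES[run[:k]])
                i += 1 + k
                break
        else:
            # No match: the ampersand stays literal.
            out.append("&")
            i += 1
    return "".join(out)


# IE-style: only the Latin-1 and HTML 3.2 names work unterminated.
print(decode("&notin"))   # "not" matches, "in" is left as text
print(decode("&ge"))      # stays literal
# Gecko-style in text content: every known name may be unterminated.
print(decode("&notin", optional=set(ENTITIES)))
```

Passing the full set of names as `optional` approximates the Gecko extension for text content; what Gecko does with a name followed directly by more letters (say "&notina") is not something this sketch claims to reproduce.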

See the attached test-case.

`Allan

Test of HTML entities in quirky mode:

&amp;	&
&amp	&
&ample	&le
&not;	¬
&not	¬
&notat	¬at
&notin;	∉
&notin	¬in
&notina	¬ina
&ge;	≥
&ge	&ge
&gel	&gel

Test of entities in attributes:
