At 23:59 05/01/25, Julian Reschke wrote:
>Martin Duerst wrote:
>> At 22:29 05/01/25, Julian Reschke wrote:
>> >The big difference here is that XMLNS uses IRIs/URIs as identifiers and only for that. However, what is an XSLT that transforms Atom content to HTML supposed to do when it encounters an IRI which isn't a legal URI? For instance, it can't put it into an HTML href attribute without producing invalid HTML.
>> First, the Atom spec just says that HTML or XHTML goes into certain
>> elements, it leaves it to other specs to say what HTML or XHTML is.
>> So Atom doesn't deal with the question of whether IRIs are allowed
>> there or not, and I hope PaceIRI is worded correctly in that respect.
>>
>> Second, HTML href attributes are defined as CDATA, so in terms of
>> validity, any Unicode character goes anyway.
>
><http://www.w3.org/TR/html401/struct/links.html#adef-href>
>
>As far as I can tell, HTML 4.01 normatively refers to RFC2396, which doesn't allow non-ASCII characters. The *DTD* may allow any kind of string here, but the spec text doesn't.
I know. You were using 'invalid', not 'illegal'. The former is a technical term that is very strongly related to DTDs.
>> Third, the HTML 4 recommendation, dating back to 1997
>> (http://www.w3.org/TR/REC-html40-971218/appendix/notes.html#h-B.2)
>> contains language that in today's terms amounts to saying
>> "browsers should treat URIs with non-ASCII characters as IRIs,
>> even though strictly speaking, they're illegal". Many browsers
>> to some extent already do, and when the IRI RFC gets out, I guess
>> some more will.
>
>So are there plans to add an erratum to HTML 4.01?
If there were a working errata process for HTML 4.01, I guess such an erratum would already have happened.
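For illustration, the handling recommended in HTML 4 appendix B.2 (encode each non-ASCII character as UTF-8, then escape each resulting byte as %HH) can be sketched roughly as below. This is only a minimal sketch, not a full IRI-to-URI mapping: the function name iri_to_uri is hypothetical, and it deliberately ignores special treatment of internationalized host names.

```python
def iri_to_uri(iri: str) -> str:
    """Percent-escape non-ASCII characters of an IRI (hypothetical helper).

    Assumption: every non-ASCII character, wherever it occurs, is encoded
    as UTF-8 and each byte written as %HH. Host names are not treated
    specially here, which a complete conversion may need to do.
    """
    parts = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII characters, including existing %HH escapes, pass through.
            parts.append(ch)
        else:
            # One non-ASCII character becomes one or more %HH escapes.
            parts.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(parts)
```

An XSLT or other tool producing HTML could apply something like this before writing an href value, so that the result stays within what RFC 2396 allows.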
>And what's the situation for XHTML?
For XHTML 1.0/1.1, that's based on HTML 4..., so see above. For XHTML 2.0, that will just refer to the IRI RFC, I guess.
Regards, Martin.
