On 17.09.2010, at 15:32, David Carlisle wrote:

> adding a canonical decomposition doesn't imply deprecation.
> Depending on which canonical form is chosen, the canonicalisation mapping can 
> go either way, loosely speaking some forms prefer composite characters, some 
> use combining characters in preference (not that combining characters are 
> involved here)

This is not accurate. For a singleton decomposition, both NFC and NFD contain 
the decomposed form. See definition D113 (full composition exclusion) in 
Unicode 5.2.0 for details.
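
A quick way to check this is the sketch below. It assumes Python's standard 
unicodedata module, whose data follows whatever Unicode version the interpreter 
ships with; the helper name normal_forms is mine, for illustration only.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normal_forms(ch):
    """Return a dict mapping each normalization form name to the normalized string."""
    return {form: unicodedata.normalize(form, ch) for form in FORMS}

# Compare the old angle brackets (2329/232A) with the newer math ones (27E8/27E9).
for cp in (0x2329, 0x232A, 0x27E8, 0x27E9):
    forms = normal_forms(chr(cp))
    shown = {name: " ".join(f"U+{ord(c):04X}" for c in s) for name, s in forms.items()}
    # decomposition() returns the raw mapping field, or "" if there is none
    print(f"U+{cp:04X}", repr(unicodedata.decomposition(chr(cp))), shown)
```

A character with a singleton decomposition comes out replaced in every form, 
composed or decomposed; a character with no decomposition comes out unchanged.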

> 2329 was deprecated some years after the canonical mapping was added
> because it was realised that that mapping was wrong, but mappings are never 
> changed once added. It became deprecated not when the mapping to 3008 was 
> added; it became deprecated when it was replaced by 27E8
> I described it as a two step process because it happened in two stages.

Because of the above, I don't see how it could happen in two stages. Adding a 
singleton decomposition logically implies deprecation. And it wasn't until 
Unicode 5.2 that "deprecated" had a clearly defined meaning anyway.

> It was conformant to unicode 2 yes, the fact that unicode then added a 
> canonical form to 3xxx doesn't make them non conformant, systems don't have 
> to use NFC form and they don't have to use any particular glyph, so for 
> either reason it's perfectly conformant to use a math character for 2329.

Again, both composition and decomposition of U+2329 produce U+3008.

> The point is that there have been documents using those entities as math 
> character names in continuous use since the '80s why should they all be 
> broken? Not to mention the fact that the vast majority of use of those 
> entities in html will also be expecting a mathematical bracket (even if on 
> some systems, with some fonts the character glyph used was actually designed 
> for CJK punctuation).
> 
> In fact where classical ISO usage and HTML usage differed I followed HTML 
> usage in all cases (for all the obvious reasons) even when the HTML 
> definitions make no sense at all (eg asymp) but in this case
> external factors (ie Unicode moving the goalposts) meant that the "new" 
> Unicode 3.2 character should be used here.

Do these documents use the entities with the same "&...;" notation? MathML 
didn't exist in the '80s, so which documents actually conflict with HTML, or 
with compound XHTML documents?

I see that <http://www.w3.org/TR/xml-entity-names/> defines multiple names for 
the same code points: "lang, langle, LeftAngleBracket and rang, rangle, 
RightAngleBracket". Do they all really need to have the same meaning?
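
To compare what different entity tables do with these names, here is a minimal 
sketch. It assumes the tables bundled in Python's html.entities module 
(name2codepoint for the HTML 4 set, html5 for the HTML5 set), which are not the 
xml-entity-names draft itself; the lookup helper is hypothetical.

```python
from html.entities import html5, name2codepoint

NAMES = ("lang", "langle", "LeftAngleBracket",
         "rang", "rangle", "RightAngleBracket")

def lookup(name):
    """Return (HTML 4 code point or None, HTML5 replacement string or None)."""
    html4 = name2codepoint.get(name)   # keys have no trailing semicolon
    h5 = html5.get(name + ";")         # keys include the trailing semicolon
    return html4, h5

for name in NAMES:
    html4, h5 = lookup(name)
    html4_text = f"U+{html4:04X}" if html4 is not None else "undefined"
    h5_text = " ".join(f"U+{ord(c):04X}" for c in h5) if h5 else "undefined"
    print(f"&{name};  HTML 4: {html4_text}  HTML5: {h5_text}")
```

Names that resolve to the same code point are pure aliases in that table, which 
is the situation I am asking about.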

>>> the only fix the UTC suggest for that is just not using 2329 at all
>>> and use 27E8 instead. Which is what the entity spec recommends.
>> 
>> 
>> Did they actually suggest to use it for the lang entity in HTML, or
>> did they suggest to use it when a math character is desired?
> 
> xhtml entities have document scope it is not possible for an xhtml+mathml 
> document to have different definitions for html and mathml use, but even for 
> pure html use it is fairly clear that 27e8 is the correct choice.

I wasn't asking about HTML vs. XHTML - both used to define &rang; in the same 
way. I can re-phrase my question as "Did they actually suggest to use it for 
the lang entity in (X)HTML, or did they suggest to use it when a math character 
is desired?"

> rang was never defined to be 3009, it was defined to be 232A and documented 
> as being a math angle bracket. Unicode have deprecated 232A and suggest that 
> any uses of that be replaced by 27E9 because 232A is effectively unusable as 
> it is subject to an essentially accidental and incorrect normalisation to 
> 3009.
> 
> It would be bizarre in the extreme to redefine rang to be 3009 (is there any 
> evidence of anyone ever having used that entity name and wanting a CJK 
> character?)

I don't think that characterizing what we did in WebKit as bizarre in the 
extreme is fair. The Unicode spec said that 232A is deprecated in favor of 
3009, so it was the only formally correct thing to do.

- WBR, Alexey Proskuryakov

_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev