"John H. Jenkins" wrote:

> 
> Yes, because you do not *encode* characters using IDC's.  You describe 
> them.  This is carefully explained in the standard.

I stand corrected.

> 
> Of course, using the taboo variant selector is about as vague as an 
> IDC, so it doesn't make that much difference.

My point is that if the commonly encountered taboo variants are already
encoded in CJK-B, then either the other taboo variants should also be added
to CJK-B or they could be *described* using IDCs. Adding a taboo variant
selector does make a difference, because then there'll be more than one way
to reference the same character.

On the other hand, given the lack of font support for CJK-B, perhaps a taboo
variant selector would be preferable ... now I don't know where I stand on
this!

> 
> As to the proposed location, note that the byte-order mark got stuck 
> with a bunch of Arabic compatibility forms.

U+FEFF is only stuck with a bunch of Arabic compatibility forms because it's
the byte-swapped form of U+FFFE, and as far as I'm aware it's not actually a
BOM character, but a code point that is "used solely with the semantic of
BOM" (TR28 Section 3.9).
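
For anyone who wants to check the byte-order relationship for themselves, here
is a minimal Python sketch. It is only an illustration and not part of the
standard or of TR28; the variable names are my own, and it assumes nothing
beyond Python's built-in UTF-16 codecs and the `unicodedata` module. It shows
that the two bytes of U+FEFF, read in the opposite byte order, come out as
U+FFFE.

```python
import unicodedata

BOM = "\uFEFF"  # the code point used as the byte-order mark

# Serialise U+FEFF in both UTF-16 byte orders.
be_bytes = BOM.encode("utf-16-be")  # bytes FE FF
le_bytes = BOM.encode("utf-16-le")  # bytes FF FE

# Decode the big-endian bytes with the wrong (little-endian) byte order.
misread = be_bytes.decode("utf-16-le")

print(be_bytes.hex(), le_bytes.hex())   # feff fffe
print(f"U+{ord(misread):04X}")          # U+FFFE
print(unicodedata.name(BOM))            # ZERO WIDTH NO-BREAK SPACE
print(unicodedata.category(misread))    # Cn (not an assigned character)
```

Since U+FFFE is never an assigned character, a reader that sees it at the
start of a stream can infer that the byte order is the reverse of what it
assumed.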

> Sometimes the odd 
> character gets stuck in an odd place; as you say, there wasn't any room 
> left in the more logical location, and this spot in the KangXi radicals 
> block was pretty much never going to be used otherwise.  Six of one, as 
> it were.
> 

I simply can't accept this.

For argument's sake, what are you going to do when I publish the manuscript
copy of a draft edition of the Kangxi dictionary, recently purchased in a
second-hand bookstore in London, that includes ten supplementary radicals
not found in the printed editions?

In principle, as has been argued convincingly in another thread recently, you
can never assume that any unused code point will always remain vacant. The
Kangxi Radical block may look as if it will never change, but we shouldn't
rely on that being the case.

Given that there are going to be proposals for additional CJK symbols and
punctuation marks in the future (if no one else does, I've got a few I'll
propose), surely it would be better to simply create a "CJK Symbols and
Punctuation B" block for the proposed IDEOGRAPHIC TABOO VARIATION INDICATOR.
It's irrelevant that the block will only have one character to start with.
It's got to be better than polluting other blocks with characters that just
don't belong there.

Andrew
