Sorry for the belated response to this. I hope it is still relevant.

Patrick T. Rourke <[EMAIL PROTECTED]> wrote:
>> I would think you could simply use the version number of the Unicode
>> Standard. For example, the use of Tagalog would have been conformant
>> to this proposed PUA registry until Unicode version 3.2, at which
>> time it would have to be removed from the registry because of its
>> introduction into Unicode.
>
> This had not occurred to me! The only thing that would militate
> against this would be if additional characters were identified which
> had not yet been proposed and were proposed at a later date; that
> would require a new version number which would not be a Unicode
> version number, and so might be distinguished using a letter, etc.
> (I don't foresee this happening, but it's better to be safe than
> sorry, no?)

Patrick is right. I neglected to consider that the proprietors of this
private registry would want to add characters between releases of
Unicode. Of course they would; the Unicode release cycle is completely
irrelevant to their process. In any event, the exact technique used
for version control of such a registry is a minor detail.

> The target users for the registry would be a small number of
> electronic scholarly publishers in the community.

Originally I thought Patrick was talking about a relatively large
group. In the situation he describes, it should not be difficult to
enforce adherence to the registry and to make sure everyone is on the
most current version.

> The point is that the registry would not be "rushing characters into
> use," but that they would be characters which were already in use
> with a variety of non-standardized methods and which are widely used
> in print in the community.

Absolutely. It is much better to use Unicode, even if that means the
PUA, than to putter along with a private 8-bit encoding.

> Another serious issue. The characters are such that I doubt they
> would be approved for the BMP. Most of the tools being used by the
> users in the community in question (mostly Windows 98 and Mac OS 9
> word processors and web browsers - yes, Mac OS 9 will be a problem
> anyway) are not yet able to handle secondary plane characters, at
> least not without serious intervention. The PUA code points which
> would be used would be in the BMP because use of the secondary plane
> PUA (I don't remember the code points, so forgive me for not knowing
> what plane(s) they're in)

Planes 15 and 16 (U+Fxxxx and U+10xxxx). The exact ranges are spelled
out in the short sketch at the end of this message.

> would be obstacles to adoption. The problem will be getting the
> targeted content providers to agree beforehand to convert their
> content to the approved code points when they become available, as
> the BMP code points are easier to support. Does anyone have any
> advice / prior experience for dealing with this issue?

Sorry, I can't offer much moral support here. The supplementary planes
have existed since 1993, and the designation of private-use code
points in Planes 15 and 16 has existed since the release of Unicode
2.0 in 1996. That's six years ago. Vendors of "Unicode-compliant"
software that can't handle supplementary characters -- and there's a
lot of it out there, make no mistake -- really need to catch up with
the times.

> Finally, are there any existing resources describing / testing
> support for PUA characters in existing applications, besides Alan
> Wood's test page? Perhaps at ConScript?

This is going to be a bit tricky, almost by definition, because
characters in the PUA are intended for privately defined exchange
only. You are not going to find many fonts on the Web that contain
PUA characters.
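To make the plane question concrete, here is a minimal Python sketch
(my own illustration, not from any registry document; the function
name pua_block is invented) of the three PUA ranges and of the UTF-16
surrogate pair that a supplementary-plane code point requires -- the
part that pre-surrogate-aware software mishandles:

    def pua_block(cp):
        """Classify a code point against the three Unicode PUA ranges."""
        if 0xE000 <= cp <= 0xF8FF:
            return "BMP PUA (U+E000..U+F8FF)"
        if 0xF0000 <= cp <= 0xFFFFD:
            return "Plane 15 PUA (U+F0000..U+FFFFD)"
        if 0x100000 <= cp <= 0x10FFFD:
            return "Plane 16 PUA (U+100000..U+10FFFD)"
        return "not private use"

    print(pua_block(0xF0001))
    # -> Plane 15 PUA (U+F0000..U+FFFFD)

    # A Plane 15 code point takes two UTF-16 code units (a surrogate
    # pair), while a BMP PUA code point takes only one:
    print(chr(0xF0001).encode("utf-16-be").hex())   # db80dc01
    print(chr(0xE000).encode("utf-16-be").hex())    # e000

That single-code-unit encoding is exactly why the BMP code points are
so much easier for older tools to support.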
Personally, I'd like to see a font that covers all or most of the
ConScript characters, but that seems impossible, since so many of the
ConScript glyphs have become unavailable, possibly forever.

-Doug Ewell
 Fullerton, California

