On 08/21/2011 08:19 AM, Asmus Freytag wrote:
> The best default would be an explicit "PU" - undefined behavior in the absence of a private agreement.
Hm -- but really this would only serve to allay concerns like Michael's, which stem from a presumption that the BC is "deeper" than other properties (which I should concede is not entirely false). But you can't define explicit undefined values for *all* properties (even those that you can change despite stability), can you?
> There are some properties where stability guarantees prevent adding a new value. In that case, the documentation should point out that the intended effect was to have a PU value, but for historical / stability reasons, the tables contain a different entry.
Which properties are these? The standard says that the canonical decomposition will not be changed. Mark Davis said the GC cannot be changed[*]. What else?
[* There is no need to *officially* change the GC of the PUA characters, but PUA-supporting implementations will certainly need to be able to handle letters, marks, numbers, etc. as if they were encoded characters, and Mark has said he is fine with that.]
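To make the footnote concrete, here is a minimal sketch of how a PUA-supporting implementation might layer a private agreement over the default property values without touching the published tables. It uses Python's standard `unicodedata` module; the `PRIVATE_AGREEMENT` table, its entries, and the `effective_properties` helper are hypothetical names invented for illustration and are not part of any standard or library.

```python
import unicodedata

# Hypothetical private agreement: code point -> overridden property values.
# The entries are invented for illustration only.
PRIVATE_AGREEMENT = {
    0xE000: {"gc": "Lo", "bc": "R"},    # treated as a right-to-left letter
    0xE001: {"gc": "Mn", "bc": "NSM"},  # treated as a nonspacing mark
}


def effective_properties(ch):
    """Return (General_Category, Bidi_Class) for a single character.

    Overrides are applied only to private-use characters (GC "Co");
    the values for every other character come straight from the tables.
    """
    gc = unicodedata.category(ch)
    bc = unicodedata.bidirectional(ch)
    if gc == "Co":
        override = PRIVATE_AGREEMENT.get(ord(ch), {})
        gc = override.get("gc", gc)
        bc = override.get("bc", bc)
    return gc, bc
```

A PUA code point that the agreement does not mention simply keeps its table defaults, so the published GC stays unchanged, as the footnote describes.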
> Suggesting a "structure" on the private use area, by suggesting different default properties, ipso facto makes the PUA less private. That should be a non-starter.
I entirely agree (obviously).

-- Shriramana Sharma

