On 7/3/2014 11:02 AM, Richard COOK wrote:
On Jul 2, 2014, at 8:02 AM, Karl Williamson <[email protected]> wrote:
Corrigendum #9 has changed this so much that people are coming to me and saying
that inputs may very well have non-characters, and that the default should be
to pass them through. Since we have no published wording for how the TUS will
absorb Corrigendum #9, I don't know how this will play out. But this abrupt a
change seems wrong to me, and it was done without public input or really
adequate time to consider its effects.
Asmus,
I think you will recall that in late 2012 and early 2013, when the subject of
the proposed changes (or clarifications) to text relating to noncharacters
first arose, we (at Wenlin) expressed our concerns. Some concerns were grave,
and some of the discussion and comments were captured in this web page:
<http://wenlininstitute.org/UnicodeNoncharacters/>
There was much back and forth on the editorial list. Discussion clarified some
of the issues for me, and mollified some of my concerns.
At that time we did implement support for noncharacters in Wenlin, controlled
by an Advanced Option to:
Replace noncharacters with [U+FFFD]
This user preference is turned on by default.
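For illustration only, here is a minimal Python sketch of what a replacement option of this kind might look like. It is not Wenlin's actual code: the function names and the `enabled` flag are hypothetical. It assumes only the standard definition of the 66 noncharacters (U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes).

```python
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER


def is_noncharacter(cp: int) -> bool:
    """Return True if code point cp is one of the 66 Unicode noncharacters."""
    # Contiguous block of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of every plane: U+nFFFE and U+nFFFF, n = 0..16.
    return cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


def replace_noncharacters(text: str, enabled: bool = True) -> str:
    """Replace each noncharacter with U+FFFD when the (hypothetical) option is on.

    With enabled=False the text is passed through unchanged.
    """
    if not enabled:
        return text
    return "".join(REPLACEMENT if is_noncharacter(ord(ch)) else ch for ch in text)
```

With the option on (the default in this sketch), `replace_noncharacters("a\uFDD0b")` yields `"a\uFFFDb"`; with `enabled=False` the input is returned as is.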
Not sure if revisiting any of our prior discussion would help clarify the
evolution of thinking on this issue.
But I did want to mention that the comment “without public input” is not quite
correct.
Richard,
"public input" is best understood as PRI or similar process, not
discussions by members or other people closely associated with the
project. Also, in particular, discussions on the editorial list are
invisible to the public.
As is so often the case, and as the web page above shows, there was input and
discussion. Whether the amount of time given to this was really adequate is
another question. Work required may expand to fill the available time, and
perhaps more time is now available.
Given the wide-ranging nature of the implementations this "clarification"
affected, I believe the process failed to provide the necessary safeguards.
Conformance changes are really significant, and a Corrigendum, no matter
how much it is presented as harmless clarification, does affect conformance.
The UTC would be well served to formally adopt a process that requires a
PRI as well as resolutions taken at two separate UTCs to approve any
Corrigendum.
There are changes to properties and algorithms that would also benefit
from such an extended process that has a guaranteed minimum number of
times for the change to be debated, to surface in minutes and to surface
in calls for public input, rather than sailing quietly and quickly into
the standard.
The threshold for this should really be rather low -- as the standard
has matured, the number and nature of implementations that depend on it
have multiplied, to the point where even a diverse membership is no
guarantee that issues can be correctly identified and averted.
With the minutes from the UTC recording only decisions, one change, requiring
an initial and a confirming resolution at separate meetings,
would allow more issues to surface. It would also help if proposal
documents were updated to reflect the initial discussion, much as is
done with character encoding proposals, which are updated to address
additional concerns as they are identified or resolved.
That said, I could imagine a possible exception for true errata (typos),
where correcting a clear mistake should not be unnecessarily drawn out,
so the error can be removed promptly. Such cases usually turn on
facts: was there an editing mistake, or is there new data about how a
character is used that makes an original property assignment a mistake
(rather than a less-than-optimal choice)?
Despite being called a "clarification", this corrigendum is not in the
nature of an erratum.
A./
-Richard
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode