On Wed, 27 Jun 2001, [EMAIL PROTECTED] wrote:
[earlier correspondents]
Personally, I think that the codecs should report an error in the
appropriate fashion when presented with a python unicode string
which contains values that are not allowed, such as lone
surrogates.
Other people have
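
As a rough illustration of the behaviour argued for above (a codec reporting an error on lone surrogates), here is a minimal sketch. It assumes a present-day Python 3 interpreter, not the 2001 codecs under discussion, and the helper names `encode_strict` and `contains_surrogate` are hypothetical, introduced only for this example.

```python
def encode_strict(text: str) -> bytes:
    """Encode to UTF-8 with the default errors="strict" handler.

    Assumption: Python 3, where the strict UTF-8 codec raises
    UnicodeEncodeError for any surrogate code point (U+D800..U+DFFF).
    """
    return text.encode("utf-8")


def contains_surrogate(text: str) -> bool:
    """Return True if the strict UTF-8 codec rejects the string."""
    try:
        encode_strict(text)
    except UnicodeEncodeError:
        return True
    return False


if __name__ == "__main__":
    print(contains_surrogate("a\ud800b"))       # lone high surrogate
    print(contains_surrogate("abc\U00010000"))  # ordinary supplementary character
```

The error surfaces as an exception rather than a silently emitted byte sequence, which is one possible reading of "report an error in the appropriate fashion".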
Mark,
You are correct that the text is not nearly as clear as it should be,
and is open to different interpretations. My view of the status in Unicode
3.1 is represented on http://www.macchiato.com/utc/utf_comparison.htm.
Corresponding computations are on
Martin v. Loewis [EMAIL PROTECTED] wrote:
It seems to be unclear to many, including myself, what exactly was
clarified with Unicode 3.1. Where exactly does it say that processing
a six-byte two-surrogates sequence as a single character is
non-conforming?
It's not non-conforming, it's
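
To make the "six-byte two-surrogates sequence" in the question concrete, the sketch below compares it with the four-byte form of the same character, taking U+10000 as the example. It assumes a present-day Python 3 interpreter and its `surrogatepass` error handler; the names `SIX_BYTE`, `FOUR_BYTE`, `decodes_strictly` and `surrogates_to_code_point` are hypothetical. It shows only how one modern decoder behaves and does not settle what Unicode 3.1 requires.

```python
# U+10000 written as the surrogate pair D800 DC00, each half encoded
# separately as a three-byte sequence (six bytes in total).
SIX_BYTE = b"\xed\xa0\x80\xed\xb0\x80"

# The same character encoded directly as one four-byte sequence.
FOUR_BYTE = "\U00010000".encode("utf-8")


def decodes_strictly(data: bytes) -> bool:
    """Return True if Python 3's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def surrogates_to_code_point(high: int, low: int) -> int:
    """Combine a high/low surrogate pair into one supplementary code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    print(decodes_strictly(FOUR_BYTE))  # four-byte form
    print(decodes_strictly(SIX_BYTE))   # six-byte form
    # "surrogatepass" lets the two surrogate code points through unchanged.
    pair = SIX_BYTE.decode("utf-8", "surrogatepass")
    print(hex(surrogates_to_code_point(ord(pair[0]), ord(pair[1]))))
```

A decoder that treats the six bytes as a single character is, in effect, performing the `surrogates_to_code_point` step itself.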
If you still find the definitions and discussion in the technical report
to be unclear, then the Unicode editorial committee would undoubtedly like
to hear about it.
There is no question that there are still things that are unclear and
things that are anachronistic in the definitions. I have
The next version of the Unicode Standard will be Version 3.1.1, due for
release in August, 2001. The beta period for this version will be until July
31, 2001. During this beta period, updated Unicode Character Database files
are available for public comment. We strongly encourage implementers to