Philippe Verdy verd...@wanadoo.fr wrote:
|glibc is no more broken than any other C library implementing toupper and
|tolower from the legacy ctype standard library. These are old APIs that
|are just widely used and still have valid contexts where they are simple and
|safe to use. But they are
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
glibc is no more broken than any other C library implementing toupper
and tolower from the legacy ctype standard library. These are old
APIs that are just widely used and still have valid contexts where they
are simple and safe to use.
Successors to convert strings instead of just isolated characters (sorry,
they are NOT what we need to handle texts, they are not even equivalent
to Unicode characters, they are just code units, most often 8-bit with
char or 16-bit only with wchar_t !) already exist in all C libraries
(including
The equivalent of strtolower() and strtoupper() is implemented in every C
library I know and have worked with on various OSes over many years (yes,
including glibc), even if the names differ (because of the unfortunate
lack of standardization of their interaction with C locales).
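
The distinction drawn in this thread between converting isolated code units and converting whole strings can be sketched as follows. This is an illustration only, written in Python rather than C: bytes.upper() is used here as a stand-in for a per-byte toupper() loop in the "C" locale, which is an assumption of the sketch and not a description of glibc's implementation.

```python
# Illustration only: bytes.upper() stands in for a per-byte toupper() loop
# in the "C" locale (an assumption of this sketch, not glibc's actual code).
text = "caf\u00e9"  # "café", with a precomposed e-acute

# Code-unit view: only the ASCII bytes change; the two UTF-8 bytes
# that encode the e-acute are left untouched.
per_unit = text.encode("utf-8").upper().decode("utf-8")

# String view: the whole text is converted using Unicode case mappings.
per_string = text.upper()

print(per_unit)    # CAFé
print(per_string)  # CAFÉ
```

The per-unit result is still valid text, but the non-ASCII letter is silently skipped, which is the kind of limitation the messages above describe.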
Philippe Verdy verd...@wanadoo.fr wrote:
|Successors to convert strings instead of just isolated characters (sorry,
|they are NOT what we need to handle texts, they are not even equivalent
|to Unicode characters, they are just code units, most often 8-bit with
|char or 16-bit only with wchar_t
Philippe Verdy verd...@wanadoo.fr wrote:
|The standard C++ string package could then have used this standard
|internally in the methods exposed in its API. I cannot understand why this
|simple effort was never made for such basic functionality, needed and used in
|almost all software and OSes.
Philippe Verdy verd...@wanadoo.fr wrote:
Note that tolower() and toupper() can only work at the single-character
level; they are not recommended for changing the case of plain text.
For correct handling of locales, tolower and toupper should be replaced by
strtolower and strtoupper (or their
Do not try to get consistent results with only a character-to-character
mapping; it does not work with all letters, because sometimes you need 1-2
or 2-1 mappings (not all composable characters exist in precombined forms,
or sometimes the combination must be split into its canonical decomposed
So glibc is broken. This doesn't make it a Unicode problem.
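
The point made above about one-to-two mappings can be checked with a small sketch. Python is used only for illustration; upper_one_to_one is a hypothetical helper (not an API of any C library) that loosely models the constraint of a character-to-character mapping by refusing any result longer than one character.

```python
def upper_one_to_one(text: str) -> str:
    """Hypothetical helper: uppercase a character only when the result
    is still exactly one character; otherwise keep it unchanged."""
    out = []
    for ch in text:
        up = ch.upper()
        out.append(up if len(up) == 1 else ch)
    return "".join(out)

word = "stra\u00dfe"               # "straße"
full = word.upper()                # string-level mapping, may change length
limited = upper_one_to_one(word)   # one-to-one only

print(full)                  # STRASSE
print(limited)               # STRAßE
print(len(word), len(full))  # 6 7
print("\ufb01".upper())      # FI  (U+FB01 LATIN SMALL LIGATURE FI)
```

Because the converted string can be longer than the input, a function that returns one character per input character cannot produce the full result.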
On Sat, Nov 8, 2014 at 8:22 PM, Mike FABIAN mfab...@redhat.com wrote:
Philippe Verdy verd...@wanadoo.fr wrote:
Note that tolower() and toupper() can only work at the single-character
level; they are not recommended for changing
glibc is no more broken than any other C library implementing toupper and
tolower from the legacy ctype standard library. These are old APIs that
are just widely used and still have valid contexts where they are simple and
safe to use. But they are not meant to convert text.
The i18n data just
Note that tolower() and toupper() can only work at the single-character
level; they are not recommended for changing the case of plain text. Their purpose
should be limited to use cases where letters can be safely isolated from
their context, for example when handling letters as numbers (e.g. section
Philippe Verdy verd...@wanadoo.fr wrote:
This is a feature of the Greek alphabet: the lowercase iota subscript
can be capitalized in two different ways, either as a subscript below the
uppercase main letter, or as a standard iota capitalized. The subscript
form is a combining
I have a question about “Uppercase” in DerivedCoreProperties.txt:
U+1F80 ᾀ GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI
is listed as “Lowercase” in
http://www.unicode.org/Public/7.0.0/ucd/DerivedCoreProperties.txt :
1F80..1F87; Lowercase # L [8] GREEK SMALL LETTER ALPHA
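
The properties of the character quoted above can be inspected with a short sketch. It uses Python's unicodedata module purely as an illustration; the Unicode data bundled with a given Python build is newer than the 7.0.0 file linked above, although the values shown here are the same in both.

```python
import unicodedata

ch = "\u1f80"  # the character quoted in the question above

print(unicodedata.name(ch))      # GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI
print(unicodedata.category(ch))  # Ll

upper = ch.upper()   # full uppercase mapping
title = ch.title()   # titlecase mapping

print([f"U+{ord(c):04X}" for c in upper])  # ['U+1F08', 'U+0399']
print([f"U+{ord(c):04X}" for c in title])  # ['U+1F88']
print(unicodedata.category(title))         # Lt
```

The uppercase form is two characters long, while the single-character counterpart U+1F88 has general category Lt (titlecase letter) rather than Lu.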
'&' sign which originates from the et ligature, or the
German umlaut which inherits some old behavior of the superscripted small
Latin letter e behaving like the Greek iota script in Fraktur font styles)
2014-11-06 16:55 GMT+01:00 Mike FABIAN maiku.fab...@gmail.com:
I have a question about “Uppercase
,
L.
-Original Message-
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Mike FABIAN
Sent: Thursday, November 6, 2014 12:32 AM
To: unicode@unicode.org
Subject: Question about Uppercase in DerivedCoreProperties.txt
I have a question about “Uppercase