On 7/26/21 2:39 PM, David Malcolm wrote:
On Mon, 2021-07-26 at 14:21 -0400, Andrew MacLeod via Gcc-patches
wrote:
Remove lower case characters from the range of to_upper and likewise,
upper case characters from to_lower.
I also looked at adding only the upper case characters for which there
is a lower case character in the range, but it seemed of limited use
given some odd usage patterns we emit. Instead, I simply took the
incoming range, removed the "from" character set, and added the "to"
character set. This'll preserve any odd things that are passed into it
while providing the basic functionality.
Easy enough for someone to enhance if they feel so inclined.
Bootstrapped on x86_64-pc-linux-gnu with no regressions. Pushed.
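
To make the approach described above concrete, here is a minimal sketch in
plain C. It is not the code from the patch: the set of possible character
values is modelled as a simple 256-entry boolean table, and the names
swap_char_sets and toupper_result_set are hypothetical. It also assumes the
letters are contiguous ('a'..'z' and 'A'..'Z'), which holds for ASCII only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NCHARS 256

/* Hypothetical helper: OUT becomes IN with every value in
   [from_lo, from_hi] removed and every value in [to_lo, to_hi] added.
   Anything else that was in IN is preserved untouched.  */
static void
swap_char_sets (const bool in[NCHARS], bool out[NCHARS],
                unsigned char from_lo, unsigned char from_hi,
                unsigned char to_lo, unsigned char to_hi)
{
  assert (from_lo <= from_hi && to_lo <= to_hi);
  memcpy (out, in, NCHARS * sizeof (bool));
  for (int c = from_lo; c <= from_hi; c++)
    out[c] = false;               /* Remove the "from" character set.  */
  for (int c = to_lo; c <= to_hi; c++)
    out[c] = true;                /* Add the "to" character set.  */
}

/* Possible results of toupper, given the possible argument values.
   Assumes contiguous letters (ASCII); see the EBCDIC question below.  */
static void
toupper_result_set (const bool in[NCHARS], bool out[NCHARS])
{
  swap_char_sets (in, out, 'a', 'z', 'A', 'Z');
}
```

Note that the whole "to" set is added unconditionally, even when the input
contains no letters at all. That is the conservative choice described above:
the result may be wider than strictly necessary, but never too narrow.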
Awkward question: does this work with character sets where there are
non-letter characters between 'a' and 'z' and between 'A' and 'Z' (e.g.
EBCDIC [1])?
For example toupper('~') should return '~', but '~' is between 'a' and
'z' in EBCDIC; likewise tolower('}') should return '}', but '}' is
between 'A' and 'Z' in EBCDIC.
Dave
[1] https://en.wikipedia.org/wiki/EBCDIC
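
The EBCDIC concern above can be checked with a small sketch. The code points
below are taken from EBCDIC code page 037, which is an assumption made for
illustration (other EBCDIC code pages move some punctuation around). The
helper names are hypothetical and nothing here is GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Code points from EBCDIC code page 037 (assumed for illustration).  */
enum
{
  EBCDIC_a = 0x81, EBCDIC_z = 0xA9,
  EBCDIC_A = 0xC1, EBCDIC_Z = 0xE9,
  EBCDIC_TILDE = 0xA1,    /* '~' */
  EBCDIC_RBRACE = 0xD0    /* '}' */
};

/* The naive test: is C anywhere between LO and HI?  */
static bool
in_naive_range (unsigned char c, unsigned char lo, unsigned char hi)
{
  return lo <= c && c <= hi;
}

/* In code page 037 the lower case letters form three separate runs:
   a-i, j-r and s-z.  */
static bool
ebcdic_islower (unsigned char c)
{
  return ((0x81 <= c && c <= 0x89)
          || (0x91 <= c && c <= 0x99)
          || (0xA2 <= c && c <= 0xA9));
}

/* Likewise for upper case: A-I, J-R and S-Z.  */
static bool
ebcdic_isupper (unsigned char c)
{
  return ((0xC1 <= c && c <= 0xC9)
          || (0xD1 <= c && c <= 0xD9)
          || (0xE2 <= c && c <= 0xE9));
}

/* '~' and '}' fall inside the naive ranges without being letters.  */
static void
ebcdic_demo (void)
{
  assert (in_naive_range (EBCDIC_TILDE, EBCDIC_a, EBCDIC_z));
  assert (!ebcdic_islower (EBCDIC_TILDE));
  assert (in_naive_range (EBCDIC_RBRACE, EBCDIC_A, EBCDIC_Z));
  assert (!ebcdic_isupper (EBCDIC_RBRACE));
}
```

So a range that drops everything from 'a' to 'z' for toupper would wrongly
exclude '~' as a possible result under this code page.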
Do we support non-ANSI / EBCDIC? I thought I saw other places where we
use 'a'-'z' and stuff like that.
If we do support it, then how does one determine which set is being used,
so that we only do it for ANSI?
Andrew
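
One possible way to approach the question above, sketched in plain C: instead
of asking which character set is in use, test the property the optimization
actually relies on, namely that the letters are contiguous. The function
names are hypothetical. Note the important limitation that these checks
describe the character set of whichever compiler builds this file; they do
not by themselves say anything about the target execution character set of
the code GCC is compiling, which is the set that matters for the range.

```c
#include <assert.h>
#include <stdbool.h>

/* True when 'a'..'z' and 'A'..'Z' each occupy 26 consecutive code
   points.  This holds for ASCII; in EBCDIC the letters are split into
   several runs, so the differences are larger than 25.  */
static bool
letters_are_contiguous (void)
{
  return 'z' - 'a' == 25 && 'Z' - 'A' == 25;
}

/* A stricter spot check of a few well-known ASCII code points.  */
static bool
looks_like_ascii (void)
{
  return 'a' == 0x61 && 'A' == 0x41 && '~' == 0x7E && '}' == 0x7D;
}

/* An ASCII layout implies contiguous letters, never the reverse.  */
static void
check_consistency (void)
{
  assert (!looks_like_ascii () || letters_are_contiguous ());
}
```

With a guard like letters_are_contiguous, the range adjustment could be
skipped whenever the assumption does not hold, leaving the incoming range
unchanged in that case.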