On 11/15/15, Ulrich Mueller <u...@gentoo.org> wrote:
> Description:
>       In an UTF-8 locale like en_US.UTF-8, the case-modifying
>       parameter expansions sometimes return invalid UTF-8 encodings.
>
>       This seems to happen when the UTF-8 byte sequences that are
>       encoding upper and lower case have different lengths.
>
> Repeat-By:
>       $ LC_ALL=en_US.UTF-8
>       $ x=$'\xc4\xb1'  # LATIN SMALL LETTER DOTLESS I
>       $ echo -n "${x^}" | od -t x1
>       0000000 49 b1
>       0000002
>
>       This should have output "49" for "I" only. The "b1" is illegal
>       as the first byte of an UTF-8 sequence.
>
>       $ x=$'\xe1\xba\x9e'  # LATIN CAPITAL LETTER SHARP S
>       $ echo -n "${x,}" | od -t x1
>       0000000 c3 9f 9e
>       0000003
>
>       This should have output "c3 9f" (for "sharp s") only.
>
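
For anyone who would rather not read od output by eye, here is a rough sketch of a validity check. It assumes an iconv that exits non-zero on an illegal input sequence; the helper names is_valid_utf8 and check_bytes are made up for illustration and are not part of bash. Bytes are written as octal escapes so the printf format stays portable, with the hex values from the report in the comments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if stdin is valid UTF-8.
# Assumption: iconv returns a non-zero status on an illegal sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Hypothetical helper: print "valid" or "invalid" for a byte string
# given as printf escapes (the argument is used as the printf format).
check_bytes() {
    if printf "$1" | is_valid_utf8; then
        echo valid
    else
        echo invalid
    fi
}

check_bytes '\111\261'      # 49 b1    (reported output of "${x^}")  -> invalid
check_bytes '\111'          # 49       (expected output, "I")        -> valid
check_bytes '\303\237\236'  # c3 9f 9e (reported output of "${x,}")  -> invalid
check_bytes '\303\237'      # c3 9f    (expected output, "sharp s")  -> valid
```

The same filter can be applied to the expansions directly, e.g. `printf '%s' "${x^}" | is_valid_utf8`, provided the en_US.UTF-8 locale is actually installed on the machine.
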
Both examples should work as expected in 4.4-beta.

---
xoxo iza