2016-11-04 12:29:03 +0000, Stephane Chazelas:
[...]
> $ LC_ALL=zh_HK.big5hkscs locale charmap
> BIG5-HKSCS
>
> Most of the problematic characters are the ones ending in 0x5c
> (which happens to be backslash in ASCII (or in BIG5-HKSCS when
> standing alone).
[...]

Those characters are also a problem for "read", "echo" and
probably many other cases:

$ echo $'\u3b1 b c' | bash -c 'read a b c; echo $b'
c
$ echo $'\u3b1 b c' | ksh93 -c 'read a b c; echo $b'
c
$ echo $'\u3b1 b c' | zsh -c 'read a b c; echo $b'
b
$ echo $'\u3b1 b c' | yash -c 'read a b c; echo $b'
b
$ locale charmap
BIG5-HKSCS

(ksh93 has a similar bug).

\u3b1 is the Greek lower case alpha character, encoded as a3 5c
in that Hong Kong charset.

Also:

$ export alpha=$'\u3b1'
$ printf 'A%sB\n' "$alpha" | bash -c 'IFS=$alpha read a b c; echo $b'
<empty-line>

(that one is OK in ksh93, zsh and yash).

$ bash -c 'echo -e "a${alpha}b"' | LC_ALL=C sed -n l
a\243\b$

(the second byte of \u3b1 together with "b" gets expanded to BS).
(same bug in zsh and ksh93, only yash is OK).
(same with $'...' and printf)

-- 
Stephane
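
For anyone without a BIG5-HKSCS locale at hand, the byte-level
mechanism behind the echo -e/printf case can be reproduced in the C
locale. The following is only a sketch: it hard-codes the bytes a3 5c
given above for \u3b1, and the helper name hex_bytes is made up for
illustration. It shows what a parser that looks at bytes rather than
characters does with the sequence; it does not exercise any shell's
multibyte handling.

```shell
#!/bin/sh
# Work byte-wise on purpose: no BIG5-HKSCS locale is needed for this.
LC_ALL=C
export LC_ALL

# hex_bytes (hypothetical helper): print stdin as hex bytes on one line.
hex_bytes() {
  od -An -v -t x1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# "a", then alpha as encoded in BIG5-HKSCS (a3 5c), then "b".
raw=$(printf 'a\243\134b' | hex_bytes)

# The same four bytes used as a printf format string: a byte-wise
# parser takes the trailing 5c plus "b" (62) as the escape \b (BS, 08).
fmt=$(printf 'a\243\134b')
cooked=$(printf "$fmt" | hex_bytes)

echo "raw:    $raw"
echo "cooked: $cooked"
```

Run with sh, this should print "raw:    61 a3 5c 62" and
"cooked: 61 a3 08", which is the same a\243\b that sed -n l shows
above.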