Re: Pattern replacement fails if string contains multibyte characters
Bernd Eggink [EMAIL PROTECTED] writes:

> This happens on a utf-8 based system (CRUX 2.3), LANG=de_DE.UTF-8:
>
>     t=123abc456äöüABCD
>     echo ${t//[a-c]/}
>     # output: 123456öüCD

Which is correct. [a-c] matches every character between a and c (inclusive) in the collating sequence defined by the locale. For your locale that includes characters like ä and A. You should avoid the use of ranges when not using the C locale.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
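For anyone wanting the result Bernd expected, here is a minimal sketch of two ways to follow the advice above: list the characters explicitly instead of using a range, or switch to the C locale for the expansion only. The variable names explicit and c_locale are made up for this illustration, and the subshell approach assumes a bash that re-reads the locale when LC_ALL is assigned.

```shell
#!/usr/bin/env bash
t='123abc456äöüABCD'

# Option 1: name the characters explicitly. With no range in the
# bracket expression, the locale's collating sequence is not consulted.
explicit=${t//[abc]/}

# Option 2: keep the range, but evaluate it in the C locale.
# The assignment happens inside the command substitution's subshell,
# so the caller's locale settings are left untouched.
c_locale=$(LC_ALL=C; printf '%s' "${t//[a-c]/}")

printf '%s\n' "$explicit" "$c_locale"
```

Both variables should end up holding 123456äöüABCD. Option 1 is the more portable choice, since it does not depend on how the shell treats multibyte input while in the C locale.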
Re: Pattern replacement fails if string contains multibyte characters
I wrote:

> The difference is in the gnu libc implementation of strcoll(), which bash uses to compare characters for range matching. The glibc implementation ignores the locale; the other systems incorporate the current locale's collating sequence into their strcoll implementation.

Sorry, that's backwards. On systems where strcoll() honors the current locale's collating sequence, you'll get the output you see on Linux. Systems that either don't have locale support or don't reflect the locale's collating sequence in strcoll() will produce the output you expect.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
Live Strong. No day but today.
Chet Ramey, ITS, CWRU [EMAIL PROTECTED] http://cnswww.cns.cwru.edu/~chet/
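To make the collation point concrete, here is a rough sketch that checks whether a character falls between a and c under a given locale. It relies on the < and > operators of [[ ]], which recent bash versions document as comparing according to the current locale. The function in_range is a hypothetical helper written for this illustration; it is not bash's internal range-matching code, only an approximation of the idea.

```shell
#!/usr/bin/env bash
# in_range LOCALE CHAR  (hypothetical helper, for illustration only)
# Succeeds when CHAR sorts between "a" and "c", inclusive, in LOCALE.
in_range() {
  (
    LC_ALL=$1                      # affects only this subshell
    [[ ! $2 < a && ! $2 > c ]]     # i.e. a <= CHAR <= c
  )
}

for ch in a b c A ä; do
  if in_range C "$ch"; then
    echo "C locale: $ch is inside a..c"
  else
    echo "C locale: $ch is outside a..c"
  fi
done
```

In the C locale only a, b and c should be reported as inside the range. Passing de_DE.UTF-8 instead of C, on a system where that locale is installed and strcoll() honors it, would be expected to put A and ä inside as well, which matches the output reported at the start of this thread.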