On Fri, 6 Aug 2010 18:39:20 +0200 Cedric Blancher wrote:
> On 6 August 2010 17:26, Glenn Fowler <[email protected]> wrote:
> >
> > tr is on the ast i18n todo list
> > what locale is your example?

> en_US.UTF-8

> > btw, I tried this on linux and solaris
> >
> >   LC_ALL=de_DE.UTF-8 /usr/bin/tr '[:lower:]' '[:upper:]' <<<$'a\303\274z'

> This works with AST tr, too. But my test case is backwards, upper to lower:
> ksh93 -c "LC_ALL=de_DE.UTF-8 ./arch/sol11.i386/bin/tr '[:upper:]'
> '[:lower:]' <<<$'aÄÄz'"
> <trash>

I was showing that it didn't work for linux and solaris
(does for /usr/xpg6/bin/tr -- thanks)
I just had the UTF-8 bytes for lower case u-umlaut in hand,
so I exchanged lower/upper

> ksh93 -c "LC_ALL=de_DE.UTF-8 ./arch/sol11.i386/bin/tr '[:upper:]'
> '[:lower:]' <<<$'a\303\274z'"
> <trash>

> > ($'\303\274' is UTF-8 lower case u-umlaut)
> >
> > and got lower case u-umlaut on the output

for the test above ast tr does not print trash;
it simply copies the \303\274 UTF-8 bytes unchanged
is your window/display/xterm set for UTF-8?
you can factor out the display by piping to od -c
ast tr (not multibyte aware yet) should produce

0000000   A 303 274   Z  \n
0000005

_______________________________________________
ast-developers mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-developers
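
The difference between a byte-oriented and a multibyte-aware case mapping, as discussed in the message above, can be sketched roughly as follows. This is only an illustration: Python stands in for tr, the helper `od_c` is a hypothetical approximation of `od -c` output (no offsets or column padding), and none of it is taken from the AST sources.

```python
# Illustration only: Python stands in for tr; this is not AST tr's code.

def od_c(data: bytes) -> str:
    """Hypothetical helper: render bytes roughly like `od -c`.

    Printable ASCII is shown as-is, newline as \\n, and every other
    byte as three-digit octal. Offsets and padding are omitted.
    """
    cells = []
    for b in data:
        if b == 0x0A:
            cells.append("\\n")
        elif 0x20 <= b < 0x7F:
            cells.append(chr(b))
        else:
            cells.append(f"{b:03o}")
    return " ".join(cells)

# 'a', the two UTF-8 bytes of lower case u-umlaut, 'z', newline
data = b"a\303\274z\n"

# Byte-oriented mapping: only ASCII letters change, so the two
# bytes \303\274 pass through untouched.
byte_mapped = data.upper()

# Character-aware mapping: decode first, map whole characters,
# then re-encode. Lower case u-umlaut becomes upper case U-umlaut.
char_mapped = data.decode("utf-8").upper().encode("utf-8")

print(od_c(byte_mapped))  # A 303 274 Z \n
print(od_c(char_mapped))  # A 303 234 Z \n
```

The first line matches the od -c output expected from a tr that is not multibyte aware; the second shows what a multibyte-aware mapping would emit instead (\303\234 is UTF-8 upper case U-umlaut). Looking at the octal bytes rather than the rendered glyphs takes the terminal's encoding out of the picture.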
