On Fri, 6 Aug 2010 18:39:20 +0200 Cedric Blancher wrote:
> On 6 August 2010 17:26, Glenn Fowler <[email protected]> wrote:
> >
> > tr is on the ast i18n todo list
> > what locale is your example?

> en_US.UTF-8

> > btw, I tried this on linux and solaris
> >
> > LC_ALL=de_DE.UTF-8 /usr/bin/tr '[:lower:]' '[:upper:]' <<<$'a\303\274z'

> This works with AST tr, too. But my test case is backwards, upper to lower:
> ksh93 -c "LC_ALL=de_DE.UTF-8 ./arch/sol11.i386/bin/tr '[:upper:]'
> '[:lower:]' <<<$'aÄÄz'"
> <trash>

I was showing that it didn't work with /usr/bin/tr on linux and solaris
(it does work with /usr/xpg6/bin/tr -- thanks).
I just had the UTF-8 bytes for lower case u-umlaut in hand,
so I exchanged lower/upper.

> ksh93 -c "LC_ALL=de_DE.UTF-8 ./arch/sol11.i386/bin/tr '[:upper:]'
> '[:lower:]' <<<$'a\303\274z'"
> <trash>

> > ($'\303\274' is UTF-8 lower case u-umlaut)
> > and got lower case u-umlaut on the output

for the test above ast tr does not print trash;
it simply copies the \303\274 UTF-8 bytes unchanged.
is your window/display/xterm set for UTF-8?

you can factor out the display by piping to od -c;
ast tr (not multibyte aware yet) should produce
0000000   A 303 274   Z  \n
0000005
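
a minimal sketch of that check, for anyone who wants to reproduce it
without ast tr at hand. the assumption here (not part of the original
test) is that running the system tr under LC_ALL=C is a fair stand-in
for a tr that is not multibyte aware, since it then works on single
bytes only. printf is used instead of <<<$'...' so it runs in any POSIX
shell, and od also runs under LC_ALL=C so the bytes always print as
octal regardless of the terminal.

```shell
# Assumption: LC_ALL=C mimics a byte-oriented (not multibyte aware) tr.
# Input is "a", the two UTF-8 bytes of lower case u-umlaut, "z", newline.
out=$(printf 'a\303\274z\n' |
      LC_ALL=C tr '[:lower:]' '[:upper:]' |
      LC_ALL=C od -An -c)

# Only the ASCII letters change; the \303\274 bytes are copied unchanged.
printf '%s\n' "$out"
```

a multibyte-aware tr in a UTF-8 locale would instead turn \303\274 into
\303\234 (upper case U-umlaut), so the od output tells the two cases
apart no matter how the xterm is configured.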
