On Fri, Mar 15, 2013 at 03:57:20PM +0100, Cedric Blancher wrote:
> On 14 March 2013 23:01, Roland Mainz <[email protected]> wrote:
> > On Thu, Mar 14, 2013 at 2:19 PM, Cedric Blancher
> > <[email protected]> wrote:
> >> How do I match accented e (i.e. é) using an equivalence class in AST tr?
> >>
> >> Doing that in sed is easy:
> >> ~/bin/sed -r "s/[[=e=]]/X/g" <<<"8é8" ; printf "\n"
> >> 8X8
> >>
> >> But in tr I am not able to get it working:
> >> ksh -c 'builtin tr ; tr -Cd "[=e=]" <<<"1e2é3" ; print'
> >> e
> >>
> >> AFAIK this should print "eé".
> >>
> >> I used:
> >> version tr (AT&T Research) 2012-11-12
> >> version sed (AT&T Research) 2012-03-28
> >
> > Erm... without digging around... does AST "tr" support the POSIX
> > equivalence class syntax yet (Glenn... ping!) ? My first guess would
> > be to try another platform like Solaris to see if the issue is
> > libc-related...
>
> Glenn, does AST tr support the [=e=] syntax?
> Werner, does GNU tr support the [=e=] syntax?
The manual page of tr says:
[=CHAR=]
all characters which are equivalent to CHAR
... nevertheless
werner@noether:~> echo $LANG
POSIX
werner@noether:~> tr -Cd "[=a=]" <<<"1e2b3a"; echo
a
werner@noether:~> tr -d "[=a=]" <<<"1e2b3a"
1e2b3
werner@noether:~> tr -d "[:alpha:]" <<<"1e2b3a"
123
werner@noether:~> LANG=fr_FR.UTF-8
werner@noether:~> tr -Cd "[=e=]" <<<"1e2é3a"; echo
e
werner@noether:~> tr -d "[=e=]" <<<"1e2é3a"
12é3a
werner@noether:~> tr -d "[:alpha:]" <<<"1e2é3a"
12é3
werner@fatou:~> tr -Cs "[=e=]" '[\n*]' <<<"1e2é3a"
e
werner@fatou:~> tr -s "[=e=]" '[\n*]' <<<"1e2é3a"
1
2é3a
... it seems that multibyte characters may cause problems as well
as equivalence classes. The tr is from GNU coreutils 8.17.
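
One way to look at the multibyte suspicion is to check what tr actually
receives. The sketch below assumes the input is UTF-8, where é (U+00E9) is
the two bytes 0xC3 0xA9. It forces the C locale so the result does not depend
on the installed locale data. The helper name hexbytes is made up for this
example and is not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print stdin as one hex string.
hexbytes() {
    od -An -tx1 | tr -d ' \n'
}

# é written as its two UTF-8 bytes with octal escapes, so the script does
# not depend on the encoding of this file.
printf '\303\251' | hexbytes; echo
# prints: c3a9

# In the C locale, [=e=] can only stand for the single byte 'e' (0x65).
# Only that byte is deleted; the two bytes of é pass through unchanged.
printf '1e2\303\2513a' | LC_ALL=C tr -d '[=e=]' | hexbytes; echo
# prints: 3132c3a93361
```

A tr that works byte by byte sees é as 0xC3 followed by 0xA9, and neither
byte is 'e'. If the same byte string comes out under fr_FR.UTF-8, that would
be consistent with tr not decoding multibyte input at all. It would not by
itself show whether the equivalence class lookup is also at fault.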
Werner
--
"Having a smoking section in a restaurant is like having
a peeing section in a swimming pool." -- Edward Burr
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers