On Fri, Mar 15, 2013 at 5:27 PM, Dr. Werner Fink <[email protected]> wrote:
> On Fri, Mar 15, 2013 at 03:57:20PM +0100, Cedric Blancher wrote:
>> On 14 March 2013 23:01, Roland Mainz <[email protected]> wrote:
>> > On Thu, Mar 14, 2013 at 2:19 PM, Cedric Blancher
>> > <[email protected]> wrote:
>> >> How do I match accented e (i.e. é) using an equivalence class in AST tr?
>> >>
>> >> Doing that in sed is easy:
>> >> ~/bin/sed -r "s/[[=e=]]/X/g" <<<"8é8" ; printf "\n"
>> >> 8X8
>> >>
>> >> But in tr I am not able to get it working:
>> >> ksh -c 'builtin tr ; tr -Cd "[=e=]" <<<"1e2é3" ; print'
>> >> e
>> >>
>> >> AFAIK this should print "eé".
>> >>
>> >> I used:
>> >> version tr (AT&T Research) 2012-11-12
>> >> version sed (AT&T Research) 2012-03-28
>> >
>> > Erm... without digging around... does AST "tr" support the POSIX
>> > equivalence class syntax yet (Glenn... ping!) ? My first guess would
>> > be to try another platform like Solaris to see if the issue is
>> > libc-related...
>>
>> Glenn, does AST tr support the [=e=] syntax?
>> Werner, does GNU tr support the [=e=] syntax?
>
> The manual page of tr says:
>
>
> [=CHAR=]
> all characters which are equivalent to CHAR
>
> ... nevertheless
>
> werner@noether:~> echo $LANG
> POSIX
> werner@noether:~> tr -Cd "[=a=]" <<<"1e2b3a"; echo
> a
> werner@noether:~> tr -d "[=a=]" <<<"1e2b3a"
> 1e2b3
> werner@noether:~> tr -d "[:alpha:]" <<<"1e2b3a"
> 123
> werner@noether:~> LANG=fr_FR.UTF-8
> werner@noether:~> tr -Cd "[=e=]" <<<"1e2é3a"; echo
> e
> werner@noether:~> tr -d "[=e=]" <<<"1e2é3a"
> 12é3a
> werner@noether:~> tr -d "[:alpha:]" <<<"1e2é3a"
> 12é3
> werner@fatou:~> tr -Cs "[=e=]" '[\n*]' <<<"1e2é3a"
>
> e
> werner@fatou:~> tr -s "[=e=]" '[\n*]' <<<"1e2é3a"
> 1
> 2é3a
>
>
> ... it seems that multibyte may cause problems as well
> as equivalence classes. The tr is from GNU coreutils 8.17.

Maybe AST tr uses the wrong libast regex function? I noticed this:

/usr/ast/bin/sed -E "s/[=e=]/X/g" <<<"1e2é3ae4"
1X2é3aX4
/usr/ast/bin/sed -E "s/[[=e=]]/X/g" <<<"1e2é3ae4"
1X2X3aX4

In the first sed example é is not matched by [=e=] but the second
matches it with [[=e=]]. Maybe AST tr must call the regex function for
[[=e=]] and not [=e=]?

Simon
_______________________________________________
ast-users mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-users
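
A note on notation that may help when comparing the sed and tr results in
this thread: in a regular expression, an equivalence class is only
recognised inside a bracket expression, so a bare [=e=] is an ordinary
bracket expression listing '=' and 'e', while [[=e=]] is the class itself.
In tr operands, by contrast, [=e=] on its own is the POSIX class notation.
The sketch below illustrates this with whatever sed and tr are first in
PATH (not necessarily the AST builds) and pins the C locale so that only
ASCII is involved. The helper names bare_sed, class_sed and class_tr are
made up for illustration, and the sketch says nothing about which libast
function AST tr actually calls.

```shell
#!/bin/sh
# Pin the C locale so only ASCII rules apply; locale-specific
# equivalence classes (e.g. e/é in fr_FR.UTF-8) are deliberately left out.
LC_ALL=C
export LC_ALL

# sed, bare form: an ordinary bracket expression listing '=' and 'e'.
bare_sed()  { printf '%s\n' "$1" | sed 's/[=e=]/X/g'; }

# sed, nested form: the equivalence class of 'e' inside a bracket expression.
class_sed() { printf '%s\n' "$1" | sed 's/[[=e=]]/X/g'; }

# tr: the operand notation [=e=] is already the equivalence class.
class_tr()  { printf '%s\n' "$1" | tr -d '[=e=]'; }

bare_sed  'a=e'   # both '=' and 'e' are replaced
class_sed 'a=e'   # only 'e' is replaced
class_tr  'a=e'   # only 'e' is deleted, '=' survives
```

If the bare form also rewrites '=' while the nested form and tr leave it
alone, the two notations are being parsed as described. Whether é then
joins the class of e in a UTF-8 locale is a separate question that depends
on the locale data and on the multibyte handling of the tool.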
