On Tue, 20 Nov 2012 10:04:36 +0100 Cedric Blancher wrote:
> On 17 November 2012 11:25, Roland Mainz <[email protected]> wrote:
> > On Fri, Nov 16, 2012 at 6:00 PM, Roland Mainz <[email protected]>
> > wrote:
> >> On Fri, Nov 16, 2012 at 5:57 PM, Roland Mainz <[email protected]>
> >> wrote:
> >>> The following testcase (which should basically test whether the
> >>> SystemV "tr" range expression [a-z] works with 'a' and 'z' replaced
> >>> with \u[20a0] and \u[20af] ...) ...
> >>> -- snip --
> >>> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
> >>> $\'[:digit:][\u[20a0]-\u[20af]][:alpha:]\' "[\\n*]" <<<$\'hello
> >>> chicken \u[20ac] world\' ; true'
> >>> -- snip --
> >>> ... should AFAIK print something like this:
> >>> -- snip --
> >>> + builtin tr
> >>> + tr -c $'[:digit:][\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
> >>> + 0<<< hello chicken world
> >>> hello
> >>> chicken
> >>>
> >>>
> >>> world
> >>> + true
> >>>
> >>> -- snip --
> >>> ... but ast-ksh.2012-11-24 with Glenn's latest tr.c changes gives this
> >>> output:
> >>> -- snip --
> >>> + builtin tr
> >>> + tr -c $'[:digit:][\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
> >>> + 0<<< hello chicken world
> >>> hello
> >>> chicken
> >>>
> >>> world
> >>> + true
> >>>
> >>> -- snip --
> >>>
> >>> Erm... does anyone spot the mistake ? Or is this a AST "tr" bug ?
> >>
> >> BTW: It seems to work if I remove the leading [:digit:] expression:
> >> -- snip --
> >> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
> >> $\'[\u[20a0]-\u[20af]][:alpha:]\' "[\\n*]" <<<$\'hello chicken
> >> \u[20ac] world\' ; true'
> >> + builtin tr
> >> + tr -c $'[\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
> >> + 0<<< hello chicken world
> >> hello
> >> chicken
> >>
> >> world
> >> + true
> >> -- snip --
> >
> > ... or if I put the [:digit:] at the end:
> > -- snip --
> > $ ~/bin/ksh -x -c $'builtin tr ; tr -c
> > $\'[\u[20a0]-\u[20af]][:alpha:][:digit:]\' "[\\n*]" <<<$\'hello
> > chicken 6a \u[20ac] world\' ; true'
> > + builtin tr
> > + tr -c $'[\u[20a0]-\u[20af]][:alpha:][:digit:]' '[\n*]'
> > + 0<<< hello chicken 6a world
> > hello
> > chicken
> > 6a
> >
> > world
> > + true
> > -- snip --
> >
> > ... erm... question for Glenn:
> > Must range patterns (e.g. [a-z] or 'a' and 'z' replaced by Unicode
> > characters) be sorted before character classes like [:digit:] or
> > [:alpha:] (this may be a case where a --strict option should
> > warn/complain if the arguments must be sorted) ?
> The current implementation requires the argument to be sorted -
> characters first, then ranges and finally character classes
> ([:digit:]) - but I'm not seeing that the standard requires this.
> Glenn, can you elaborate on this?

the current implementation of ast tr?
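
As a rough cross-check of the ordering question, the sketch below runs the system tr (not the AST builtin) over an ASCII stand-in for the testcase, once with the character classes before the range and once after it. It assumes a POSIX-style tr that accepts the range as a-z (without the SystemV brackets) and the '[\n*]' repeat form in string2; the helper name split_on_complement and the ASCII range a-z are illustrative substitutions for the \u[20a0]-\u[20af] range quoted above, not anything taken from tr.c.

```shell
# Hypothetical helper: read stdin and replace every character that is NOT
# in the set given as $1 with a newline (same idea as the tr -c testcase).
split_on_complement() {
    LC_ALL=C tr -c "$1" '[\n*]'
}

# ASCII stand-in for the input used in the thread.
input='hello chicken 6a world'

# Character class first, then the range, then another class.
classes_first=$(printf '%s\n' "$input" | split_on_complement '[:digit:]a-z[:alpha:]')

# Range first, character classes afterwards.
range_first=$(printf '%s\n' "$input" | split_on_complement 'a-z[:alpha:][:digit:]')

if [ "$classes_first" = "$range_first" ]; then
    echo "same result for both orderings"
else
    echo "results differ: operand order matters in this tr"
fi
```

If the two results match on a given tr, that only shows that this particular implementation treats string1 as an unordered set; it says nothing about what the AST tr does with multibyte ranges.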
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
