On 20 November 2012 16:27, Glenn Fowler <[email protected]> wrote:
>
> On Tue, 20 Nov 2012 10:04:36 +0100 Cedric Blancher wrote:
>> On 17 November 2012 11:25, Roland Mainz <[email protected]> wrote:
>> > On Fri, Nov 16, 2012 at 6:00 PM, Roland Mainz <[email protected]>
>> > wrote:
>> >> On Fri, Nov 16, 2012 at 5:57 PM, Roland Mainz <[email protected]>
>> >> wrote:
>> >>> The following testcase (which should basically test whether the
>> >>> SystemV "tr" range expression [a-z] works with 'a' and 'z' replaced
>> >>> with \u[20a0] and \u[20af] ...) ...
>> >>> -- snip --
>> >>> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
>> >>> $\'[:digit:][\u[20a0]-\u[20af]][:alpha:]\' "[\\n*]" <<<$\'hello
>> >>> chicken \u[20ac] world\' ; true'
>> >>> -- snip --
>> >>> ... should AFAIK print something like this:
>> >>> -- snip --
>> >>> + builtin tr
>> >>> + tr -c $'[:digit:][\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
>> >>> + 0<<< hello chicken € world
>> >>> hello
>> >>> chicken
>> >>>
>> >>>
>> >>> world
>> >>> + true
>> >>>
>> >>> -- snip --
>> >>> ... but ast-ksh.2012-11-24 with Glenn's latest tr.c changes gives this
>> >>> output:
>> >>> -- snip --
>> >>> + builtin tr
>> >>> + tr -c $'[:digit:][\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
>> >>> + 0<<< hello chicken € world
>> >>> hello
>> >>> chicken
>> >>> €
>> >>> world
>> >>> + true
>> >>>
>> >>> -- snip --
>> >>>
>> >>> Erm... does anyone spot the mistake ? Or is this a AST "tr" bug ?
>> >>
>> >> BTW: It seems to work if I remove the leading [:digit:] expression:
>> >> -- snip --
>> >> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
>> >> $\'[\u[20a0]-\u[20af]][:alpha:]\' "[\\n*]" <<<$\'hello chicken
>> >> \u[20ac] world\' ; true'
>> >> + builtin tr
>> >> + tr -c $'[\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
>> >> + 0<<< hello chicken € world
>> >> hello
>> >> chicken
>> >> €
>> >> world
>> >> + true
>> >> -- snip --
>> >
>> > ... or if I put the [:digit:] at the end:
>> > -- snip --
>> > $ ~/bin/ksh -x -c $'builtin tr ; tr -c
>> > $\'[\u[20a0]-\u[20af]][:alpha:][:digit:]\' "[\\n*]" <<<$\'hello
>> > chicken 6a \u[20ac] world\' ; true'
>> > + builtin tr
>> > + tr -c $'[\u[20a0]-\u[20af]][:alpha:][:digit:]' '[\n*]'
>> > + 0<<< hello chicken 6a € world
>> > hello
>> > chicken
>> > 6a
>> > €
>> > world
>> > + true
>> > -- snip --
>> >
>> > ... erm... question for Glenn:
>> > Must range patterns (e.g. [a-z] or 'a' and 'z' replaced by Unicode
>> > characters) be sorted before character classes like [:digit:] or
>> > [:alpha:] (this may be a case where a --strict option should
>> > warn/complain if the arguments must be sorted) ?
>
>> The current implementation requires the argument to be sorted -
>> characters first, then ranges and finally character classes
>> ([:digit:]) - but I'm not seeing that the standard requires this.
>> Glenn, can you elaborate on this?
>
> the current implementation of ast tr?

./arch/linux.i386-64/bin/ksh -c 'builtin tr ; tr --version'
  version         tr (AT&T Research) 2012-11-12

Rephrasing my question:
1. Does the standard, whatever its name or version, require the tr
arguments to be sorted like regex arguments need to be sorted?
2. Does the current AST tr implementation (tr (AT&T Research)
2012-11-12) require the arguments to be sorted?

Ced
-- 
Cedric Blancher <[email protected]>
Institute Pasteur
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
