On 17 November 2012 11:25, Roland Mainz <[email protected]> wrote:
> On Fri, Nov 16, 2012 at 6:00 PM, Roland Mainz <[email protected]> 
> wrote:
>> On Fri, Nov 16, 2012 at 5:57 PM, Roland Mainz <[email protected]> 
>> wrote:
>>> The following testcase (which should basically test whether the
>>> SystemV "tr" range expression [a-z] works with 'a' and 'z' replaced
>>> with \u[20a0] and \u[20af] ...) ...
>>> -- snip --
>>> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
>>> $\'[:digit:][\u[20a0]-\u[20af]][:alpha:]\' "[\\n*]" <<<$\'hello
>>> chicken \u[20ac] world\' ; true'
>>> -- snip --
>>> ... should AFAIK print something like this:
>>> -- snip --
>>> + builtin tr
>>> + tr -c $'[:digit:][\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
>>> + 0<<< hello chicken € world
>>> hello
>>> chicken
>>>
>>>
>>> world
>>> + true
>>>
>>> -- snip --
>>> ... but ast-ksh.2012-11-24 with Glenn's latest tr.c changes gives this 
>>> output:
>>> -- snip --
>>> + builtin tr
>>> + tr -c $'[:digit:][\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
>>> + 0<<< hello chicken € world
>>> hello
>>> chicken
>>> €
>>> world
>>> + true
>>>
>>> -- snip --
>>>
>>> Erm... does anyone spot the mistake ? Or is this an AST "tr" bug ?
>>
>> BTW: It seems to work if I remove the leading [:digit:] expression:
>> -- snip --
>> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
>> $\'[\u[20a0]-\u[20af]][:alpha:]\' "[\\n*]" <<<$\'hello chicken
>> \u[20ac] world\' ; true'
>> + builtin tr
>> + tr -c $'[\u[20a0]-\u[20af]][:alpha:]' '[\n*]'
>> + 0<<< hello chicken € world
>> hello
>> chicken
>> €
>> world
>> + true
>> -- snip --
>
> ... or if I put the [:digit:] at the end:
> -- snip --
> $ ~/bin/ksh -x -c $'builtin tr ; tr -c
> $\'[\u[20a0]-\u[20af]][:alpha:][:digit:]\' "[\\n*]" <<<$\'hello
> chicken 6a \u[20ac] world\' ; true'
> + builtin tr
> + tr -c $'[\u[20a0]-\u[20af]][:alpha:][:digit:]' '[\n*]'
> + 0<<< hello chicken 6a € world
> hello
> chicken
> 6a
> €
> world
> + true
> -- snip --
>
> ... erm... question for Glenn:
> Must range patterns (e.g. [a-z] or 'a' and 'z' replaced by Unicode
> characters) be sorted before character classes like [:digit:] or
> [:alpha:] (this may be a case where a --strict option should
> warn/complain if the arguments must be sorted) ?

The current implementation requires the operand to be sorted: plain
characters first, then ranges, and finally character classes such as
[:digit:]. I don't see anything in the standard that requires this
ordering. Glenn, can you elaborate on this?
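
For comparison, here is a minimal sketch of an order-independence check
that can be run against any tr. It is an ASCII analogue of the testcase,
not the testcase itself: the range #-& stands in for
\u[20a0]-\u[20af], '%' stands in for the Euro sign, and the range is
written POSIX-style without the SystemV brackets. It assumes the system
tr accepts the "[\n*]" repeat syntax (GNU and BSD tr do) and it does
not exercise the AST builtin.

```shell
# Force byte-wise semantics so the result does not depend on the locale.
LC_ALL=C
export LC_ALL

# ASCII stand-in for the original input; '%' lies inside the range #-&.
input='hello chicken 6a % world'

# Character class first, then the range, then another class
# (the ordering that misbehaved in the report above).
class_first=$(printf '%s\n' "$input" | tr -c '[:digit:]#-&[:alpha:]' '[\n*]')

# Range first, character classes last (the ordering that worked).
range_first=$(printf '%s\n' "$input" | tr -c '#-&[:alpha:][:digit:]' '[\n*]')

if [ "$class_first" = "$range_first" ]; then
    echo "same result for both orderings"
else
    echo "results differ: operand order matters in this tr"
fi
```

If both orderings give the same result here but the AST builtin still
differs between them, that would point at the builtin's parsing of the
operand rather than at the testcase.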

Ced
-- 
Cedric Blancher <[email protected]>
Institute Pasteur
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers