Hello,
More progress on tr with multibyte support, available here:
https://files.housegordon.org/src/coreutils-multibyte-2017-12-23.patch.xz
Translation is (mostly) working:
$ echo abcdefg | ./src/tr 'abcd' 'αβγδ'
αβγδefg
$ echo '1234 ABCD ΨΔΩΣ *$%()' \
| ./src/tr -c '[:alpha:][:cntrl:]' 'Ψ'
ΨΨΨΨΨABCDΨΨΔΩΣΨΨΨΨΨΨ
$ echo 'αααββββ' | ./src/tr -s 'β' 'χ'
αααχ
$ echo 'aAbBcC ✀ χΧλΛσΣ' | ./src/tr '[:lower:]' '[:upper:]'
AABBCC ✀ ΧΧΛΛΣΣ
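For anyone who wants to reason about the translate and squeeze cases above without opening the patch, here is a rough sketch of the per-character logic on wide characters. It is NOT code from the patch: translate_wc and translate_buf are hypothetical names, the input is assumed to be already decoded into wchar_t (e.g. with mbrtowc), and character classes, ranges, -c and -d are left out entirely.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not from the patch): map C from SET1 to SET2
   by position.  If SET2 is shorter than SET1, its last character is
   reused.  Characters not found in SET1 pass through unchanged.  */
static wchar_t
translate_wc (wchar_t c, const wchar_t *set1, const wchar_t *set2)
{
  const wchar_t *p = wcschr (set1, c);
  size_t len2 = wcslen (set2);

  /* wcschr also matches the terminating L'\0', so exclude it.  */
  if (p == NULL || c == L'\0' || len2 == 0)
    return c;

  size_t idx = (size_t) (p - set1);
  return set2[idx < len2 ? idx : len2 - 1];
}

/* Hypothetical helper: translate the NUL-terminated BUF in place.
   When SQUEEZE is nonzero, runs of the same character are collapsed
   to one if that character is a member of SET2 (checked after
   translation).  Returns the new length.  */
static size_t
translate_buf (wchar_t *buf, const wchar_t *set1, const wchar_t *set2,
               int squeeze)
{
  size_t out = 0;

  for (size_t in = 0; buf[in] != L'\0'; in++)
    {
      wchar_t c = translate_wc (buf[in], set1, set2);

      /* Skip a repeat of the previous output character.  */
      if (squeeze && out > 0 && buf[out - 1] == c
          && wcschr (set2, c) != NULL)
        continue;

      buf[out++] = c;
    }

  buf[out] = L'\0';
  return out;
}
```

Working on wchar_t keeps the set lookup independent of how many bytes each character takes; a real implementation would still have to decide what to do with invalid multibyte sequences on input, which this sketch does not address.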
The current implementation could be a starting point for
testing and discussing specific edge-cases (some tests are already
included).
It is not yet tuned for efficiency (neither in the implementation
itself nor in run-time performance).
There's a lot of code duplication due to keeping the entire current
unibyte code-path intact.
Comments welcome.
- assaf