Hello Tilman,

On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.

That is correct.

> tschmidt@sl-vm-redmine01:~$ cat test.txt
> Text ohne Umlaute
> Täxt müt Umläuten
> tschmidt@sl-vm-redmine01:~$ expand test.txt
> Text ohne Umlaute
> Täxt müt Umläuten
>
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.

Multibyte support is not available yet (neither in version 8.21, which is
4 years old, nor in the current version, 8.27).

However, there is an ongoing effort to add multibyte support to all
coreutils programs, including 'expand'. You can read more technical
details about it here:
  http://crashcourse.housegordon.org/coreutils-multibyte-support.html

In the current (work-in-progress) internationalization patch, the 'expand'
program does support multibyte locales, and expands your input correctly:

multibyte locale:

  $ ./src/expand bug28038.txt
  Text ohne Umlaute
  Täxt müt Umläuten

versus forcing single-byte locale:

  $ LC_ALL=C ./src/expand bug28038.txt
  Text ohne Umlaute
  Täxt müt Umläuten

The latest version of the patch is available for download and
experimentation here:
  http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html

However, it should not be considered stable.

regards,
 - assaf
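
For illustration only, here is a minimal Python sketch of why a byte-oriented tab expansion misaligns UTF-8 text. It is not the coreutils code and not the patch's algorithm; the function names `expand_bytes` and `expand_chars` are hypothetical, and the character-based version assumes every code point occupies exactly one column (it ignores wide and combining characters, which a real implementation would have to handle).

```python
def expand_bytes(line: bytes, tabstop: int = 8) -> bytes:
    """Expand tabs while counting every byte as one column.

    This mimics single-byte behaviour: a two-byte UTF-8 character such as
    'ä' advances the column counter by 2, so the padding comes out short.
    """
    out = bytearray()
    col = 0
    for b in line:
        if b == 0x09:  # horizontal tab
            pad = tabstop - (col % tabstop)
            out.extend(b" " * pad)
            col += pad
        else:
            out.append(b)
            col += 1
    return bytes(out)


def expand_chars(line: str, tabstop: int = 8) -> str:
    """Expand tabs while counting every decoded character as one column.

    Assumption: each code point is one column wide.
    """
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = tabstop - (col % tabstop)
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out)


if __name__ == "__main__":
    sample = "T\u00e4xt\tm\u00fct"  # "Täxt<TAB>müt"
    print(repr(expand_bytes(sample.encode("utf-8")).decode("utf-8")))
    print(repr(expand_chars(sample)))
```

With the sample above, "Täxt" is 4 characters but 5 bytes in UTF-8, so the byte-counting version inserts 3 spaces before the next tab stop while the character-counting version inserts 4. For pure ASCII input the two functions agree.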
