Hello Tilman,

On 10/08/17 10:10 AM, Tilman Schmidt wrote:
> it seems the expand(1) command does not properly support multi-byte
> characters.

That is correct.

> tschmidt@sl-vm-redmine01:~$ cat test.txt
> Text  ohne    Umlaute
> Täxt  müt     Umläuten
> tschmidt@sl-vm-redmine01:~$ expand test.txt
> Text    ohne    Umlaute
> Täxt   müt    Umläuten
> 
> Using Ubuntu 14.04.5 LTS with coreutils 8.21-1ubuntu.

Multibyte support is not available yet, neither in version 8.21 (which
is 4 years old) nor in the current version 8.27.

However, there is an ongoing effort to add multibyte support
to all coreutils programs, including 'expand'.

You can read more technical details about it here:
  http://crashcourse.housegordon.org/coreutils-multibyte-support.html

In the current (work-in-progress) internationalization patch,
the 'expand' program does support multibyte locales, and expands
your input correctly:

multibyte locale:

   $ ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt    müt     Umläuten

versus forcing single-byte locale:

   $ LC_ALL=C ./src/expand bug28038.txt
   Text    ohne    Umlaute
   Täxt   müt    Umläuten
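
To illustrate why the two outputs differ, here is a rough sketch in
Python. It is not the coreutils code; the function name and the
'bytewise' parameter are made up for this example. It assumes UTF-8
input, a tab stop of 8, and that every character occupies exactly one
column (which does not hold for wide or combining characters):

```python
def expand_columns(line: str, tabstop: int = 8, bytewise: bool = False) -> str:
    """Replace each tab with spaces up to the next tab stop.

    bytewise=True  advances the column by one per UTF-8 byte, which is
                   roughly what happens in a single-byte locale.
    bytewise=False advances the column by one per character.
    """
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            # Pad with spaces up to the next multiple of tabstop.
            pad = tabstop - col % tabstop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            # "ä" is 2 bytes in UTF-8, so it advances 2 columns when
            # counting bytes but only 1 when counting characters.
            col += len(ch.encode("utf-8")) if bytewise else 1
    return "".join(out)


line = "Täxt\tmüt\tUmläuten"
print(expand_columns(line))                 # character-wise counting
print(expand_columns(line, bytewise=True))  # byte-wise counting
```

With byte-wise counting, "Täxt" is treated as 5 columns instead of 4,
so one space too few is emitted before the next tab stop. That is the
misalignment visible in the LC_ALL=C output.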


The latest version of the patch is available for download and
experimentation here:
  http://lists.gnu.org/archive/html/coreutils/2017-04/msg00009.html
However, it should not be considered stable.

regards,
 - assaf




