Hi,

I did some further debugging by piping the output of printf through hexdump:

#!/bin/bash
for l in de_DE en_US nb_NO nn_NO ; do
   echo "LC_NUMERIC=$l.UTF-8"
   for n in 1 100 1000 10000 100000 1000000 10000000 ; do
      LC_NUMERIC=$l.UTF-8 /usr/bin/printf "<%'10d>" $n | hexdump -C
   done
done

The output is:

...
LC_NUMERIC=nb_NO.UTF-8
00000000  3c 20 20 20 20 20 20 20  20 20 31 3e              |<         1>|
0000000c
00000000  3c 20 20 20 20 20 20 20  31 30 30 3e              |<       100>|
0000000c
00000000  3c 20 20 20 31 e2 80 af  30 30 30 3e              |<   1...000>|
0000000c
00000000  3c 20 20 31 30 e2 80 af  30 30 30 3e              |<  10...000>|
0000000c
00000000  3c 20 31 30 30 e2 80 af  30 30 30 3e              |< 100...000>|
0000000c
00000000  3c 31 e2 80 af 30 30 30  e2 80 af 30 30 30 3e     |<1...000...000>|
0000000f
00000000  3c 31 30 e2 80 af 30 30  30 e2 80 af 30 30 30 3e  |<10...000...000>|
00000010
LC_NUMERIC=nn_NO.UTF-8
00000000  3c 20 20 20 20 20 20 20  20 20 31 3e              |<         1>|
0000000c
00000000  3c 20 20 20 20 20 20 20  31 30 30 3e              |<       100>|
0000000c
00000000  3c 20 20 20 31 e2 80 af  30 30 30 3e              |<   1...000>|
0000000c
00000000  3c 20 20 31 30 e2 80 af  30 30 30 3e              |<  10...000>|
0000000c
00000000  3c 20 31 30 30 e2 80 af  30 30 30 3e              |< 100...000>|
0000000c
00000000  3c 31 e2 80 af 30 30 30  e2 80 af 30 30 30 3e     |<1...000...000>|
0000000f
00000000  3c 31 30 e2 80 af 30 30  30 e2 80 af 30 30 30 3e  |<10...000...000>|
00000010
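
Looking at the final offsets: every line up to 100000 is 12 bytes long (0000000c), but the lines with a separator contain fewer characters than bytes, because three of the bytes encode a single character. This looks like the field width is counted in bytes rather than characters, but that is only my reading of the dump. A minimal sketch to count both without hexdump (count_bytes and count_chars are made-up helper names; the character count assumes valid UTF-8 input):

```bash
# Count the bytes on stdin.
count_bytes() {
   wc -c | tr -d ' '
}

# Count the UTF-8 characters on stdin by deleting all continuation
# bytes (0x80-0xbf, octal 200-277), so one byte per character remains.
count_chars() {
   LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

# The line for n=1000 from the dump above, with the separator in octal:
printf '<   1\342\200\257000>' | count_bytes   # 12, matches offset 0000000c
printf '<   1\342\200\257000>' | count_chars   # 10, i.e. only 8 between < and >
```

For comparison, the line for n=100 gives 12 for both counts, so it is two columns wider on screen than the line for n=1000.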

printf seems to insert the 3-byte UTF-8 sequence 0xe2 0x80 0xaf as the thousands separator. This is the UTF-8 encoding of U+202F NARROW NO-BREAK SPACE -> https://www.fileformat.info/info/unicode/char/202f/index.htm . However, terminal output (tested with Konsole and XTerm) uses fixed-width character cells, so the "narrow space" should probably be a regular space or a regular no-break space (U+00A0, 0xc2 0xa0 in UTF-8, HTML "&nbsp;") instead? Note that LibreOffice also fails to render U+202F NARROW NO-BREAK SPACE correctly, even with proportional fonts, when the output of the test script is loaded as a text file.
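
As a stopgap on the script side, the separator could be replaced after formatting and the padding applied afterwards. This is only a sketch of a possible workaround, not a fix for printf itself; normalize_sep and pad_field are made-up helper names, and the loop assumes that the nb_NO.UTF-8 locale is installed (otherwise printf simply prints no separator):

```bash
# U+202F NARROW NO-BREAK SPACE as UTF-8 bytes (e2 80 af), written in octal.
NNBSP=$(printf '\342\200\257')

# Replace every U+202F on stdin with a regular ASCII space.
normalize_sep() {
   sed "s/$NNBSP/ /g"
}

# Right-align the argument in a 10-column field. The argument is pure
# ASCII after normalize_sep, so bytes and characters are the same here.
pad_field() {
   printf '<%10s>\n' "$1"
}

for n in 1 1000 1000000 ; do
   # Format without a field width first, then normalize, then pad.
   grouped=$(LC_NUMERIC=nb_NO.UTF-8 /usr/bin/printf "%'d" "$n" | normalize_sep)
   pad_field "$grouped"
done
```

Padding only after the replacement avoids the problem that the width would otherwise be computed on the 3-byte separator.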

Screenshots for illustration:

 * Terminal output:
   https://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/2058775/+attachment/5758462/+files/Screenshot_20240322_213947.png
 * LibreOffice output:
   https://bugs.launchpad.net/ubuntu/+source/coreutils/+bug/2058775/+attachment/5758464/+files/Screenshot_20240322_222052.png

--
Best regards / Mit freundlichen Grüßen / Med vennlig hilsen

=======================================================================
 Thomas Dreibholz

 Simula Metropolitan Centre for Digital Engineering
 Centre for Resilient Networks and Applications
 Pilestredet 52
 0167 Oslo, Norway
-----------------------------------------------------------------------
 E-Mail:dre...@simula.no
 Homepage:http://simula.no/people/dreibh
=======================================================================

