On 03/04/2025 07:10, Avid Seeker wrote:
```
$ echo -n this is a 🐕 | od -cx
0000000   t   h   i   s       i   s       a     360 237 220 225
            6874    7369    6920    2073    2061    9ff0    9590
0000016
```

Can od print UTF-8 characters verbatim instead of encoding them in octal?

I guess, as its name suggests, that's not possible. But if it can do it
for ASCII characters, what prevents it from doing the same for UTF-8
characters?

If it's not possible, any suggestions or alternative tools would be
appreciated.

Avid


Well, there is a bit of a layout issue with multi-byte chars.
With which byte do you align the literal character?
Also, if padding with spaces, it becomes ambiguous whether
a space was present in the input or not.

  $ echo -n this is á 🐕 | od -tc -tx1
  0000000   t   h   i   s       i   s     303 241     360 237 220 225
           74  68  69  73  20  69  73  20  c3  a1  20  f0  9f  90  95

  $ echo -n this is á 🐕 | od  -tx1z
  0000000 74 68 69 73 20 69 73 20 c3 a1 20 f0 9f 90 95     >this is .. ....<


Now, in the first form above at least, I guess there isn't much ambiguity
with spaces, and we could align each multi-byte char with the last byte
of its sequence.
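
To make that layout concrete, here is a minimal Python sketch. It is not od
code and not a proposal for the actual implementation. The names utf8_len,
label_bytes and dump are made up for this example, and so is the "**"
placeholder used for the leading bytes of a sequence. Bytes that do not form
a valid, printable character fall back to 3-digit octal. Note that wide
characters such as emoji occupy two terminal columns, which this sketch does
not compensate for.

```python
def utf8_len(lead: int) -> int:
    """Length of the UTF-8 sequence announced by a lead byte (1 if none)."""
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1  # ASCII, stray continuation byte, or invalid lead byte


def label_bytes(data: bytes) -> list:
    """Return one label per input byte; a character sits on its last byte."""
    labels = []
    i = 0
    while i < len(data):
        n = utf8_len(data[i])
        try:
            ch = data[i:i + n].decode("utf-8")
        except UnicodeDecodeError:
            ch = ""  # invalid or truncated sequence
        if len(ch) == 1 and ch.isprintable():
            # Hypothetical convention: "**" marks the bytes before the last.
            labels += ["**"] * (n - 1) + [ch]
            i += n
        else:
            # Non-printable or invalid: show this single byte in octal.
            labels.append(f"{data[i]:03o}")
            i += 1
    return labels


def dump(data: bytes, width: int = 16) -> str:
    """Format a character row above a hex row, with octal offsets."""
    labels = label_bytes(data)
    lines = []
    for off in range(0, len(data), width):
        chars = "".join(lab.rjust(4) for lab in labels[off:off + width])
        hexes = "".join(f"  {b:02x}" for b in data[off:off + width])
        lines.append(f"{off:07o}{chars}")
        lines.append(f"{'':7}{hexes}")
    lines.append(f"{len(data):07o}")  # final offset line
    return "\n".join(lines)


if __name__ == "__main__":
    print(dump("this is á 🐕".encode("utf-8")))
```

Because every column still corresponds to exactly one input byte, a real
space in the input shows up as 20 in the hex row, while the padding before
a multi-byte char shows up as "**" in this sketch.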

thanks,
Pádraig
