Hi,
Seen with diffutils 'master' branch today:
$ cat foo1.txt
你好
$ cat foo2.txt
早晨, 你好
$ src/diff -y foo1.txt foo2.txt
你好 | 早晨, 你好
$ src/diff -l -y foo1.txt foo2.txt
2023-07-04 18:41 diff -l -y foo1.txt foo2.txt Page 1
| ,
你好早晨你好
(and the last line is not terminated by a newline).
This is strange, isn't it? 'diff' has separated the ASCII parts of the
input from the non-ASCII parts. With the attached patch, the output becomes:
$ src/diff -l -y foo1.txt foo2.txt
2023-07-04 18:42 diff -l -y foo1.txt foo2.txt Page 1
你好 | 早晨, 你好
(which is more what I had expected).
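
For illustration, here is a minimal sketch of the failure mode. It is not
the actual code in src/side.c. It assumes that with -l the output stream is
something other than stdout (diff passes paginated output through 'pr'), so
any bytes written directly to stdout bypass that stream. The names
emit_buggy, emit_fixed and bytes_written are hypothetical, and the byte-wise
ASCII test is a simplification of the real multibyte handling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the pre-patch behaviour: ASCII bytes go to
   OUT, but the bytes of multibyte characters go to stdout.  */
static void
emit_buggy (char const *s, FILE *out)
{
  for (; *s; s++)
    {
      if ((unsigned char) *s < 0x80)
        putc (*s, out);
      else
        fwrite (s, 1, 1, stdout);   /* wrong stream */
    }
}

/* Hypothetical stand-in for the patched behaviour: every byte goes
   to OUT.  */
static void
emit_fixed (char const *s, FILE *out)
{
  fwrite (s, 1, strlen (s), out);
}

/* Return the number of bytes written to FP so far, assuming FP is a
   seekable stream that was opened empty.  */
static long
bytes_written (FILE *fp)
{
  assert (fp != NULL);
  fflush (fp);
  return ftell (fp);
}
```

Calling emit_buggy with a string that mixes ASCII and UTF-8 text and a
stream from tmpfile () leaves only the ASCII bytes in that stream, while
emit_fixed leaves all of them there, which matches the split seen above.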
From cb52fa88d5f2d9bc4894a7eccd90fdc2e03f5af4 Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Tue, 4 Jul 2023 18:45:33 +0200
Subject: [PATCH] diff: Fix output of "diff -l -y" for non-ASCII input files
* src/side.c (print_half_line): Output the multibyte character to out,
not stdout.
---
src/side.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/side.c b/src/side.c
index 2f566f8..46ef095 100644
--- a/src/side.c
+++ b/src/side.c
@@ -146,7 +146,7 @@ print_half_line (char const *const *line, intmax_t indent, intmax_t out_bound)
           if (in_position <= out_bound)
             {
               out_position = in_position;
-              fwrite (tp0, 1, bytes, stdout);
+              fwrite (tp0, 1, bytes, out);
             }
           text_pointer = tp0 + bytes;
           break;
--
2.34.1