Hello,

> On Jun 27, 2016, at 06:56, Pádraig Brady <[email protected]> wrote:
>
> On 27/06/16 06:17, Assaf Gordon wrote:
>> Hello Pádraig and all,
>>
>>> On Jun 25, 2016, at 07:20, Pádraig Brady <[email protected]> wrote:
>>>
>>> As part of this, or at least before looking at multibyte changes,
>>> it would be worth considering this proposal for changing the
>>> unexpand algorithm: http://bugs.gnu.org/23335
>>
>> The above bug-report addresses this TODO item:
>> ===
>> unexpand: [http://www.opengroup.org/onlinepubs/007908799/xcu/unexpand.html]
>> printf 'x\t \t y\n'|unexpand -t 8,9 should print its input, unmodified.
>> printf 'x\t \t y\n'|unexpand -t 5,8 should print "x\ty\n"
>> ===
>
> I think the second command is wrong there actually?
> Surely it should print "x\t\t y\n"
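To make the column arithmetic behind the two readings easier to follow, here
is a rough sketch that traces where each input character lands for a given
tab list. It is only a simplified model, not the algorithm of any actual
unexpand: the function name trace_columns is made up for illustration,
columns are assumed to be 0-based, and a tab past the last tab stop is
assumed to advance a single column.

```python
def trace_columns(line, tab_stops):
    """Return (char, start_col, end_col) for each character of `line`.

    Columns are 0-based (an assumption). A tab advances to the next stop
    in `tab_stops`; past the last stop it is assumed to advance one column.
    """
    col = 0
    trace = []
    for ch in line:
        if ch == "\t":
            later_stops = [s for s in tab_stops if s > col]
            end = later_stops[0] if later_stops else col + 1
        else:
            end = col + 1
        trace.append((ch, col, end))
        col = end
    return trace


# Input from the TODO item (without the trailing newline), tab list 5,8.
for ch, start, end in trace_columns("x\t \t y", [5, 8]):
    print(f"{ch!r:6} columns {start} -> {end}")
```

Under this model, with stops at 5 and 8, the blanks after 'x' run from
column 1 to column 9: they pass both tab stops and leave one blank column
beyond the last stop. That would be consistent with the "x\t\t y\n" reading,
though the 0-based column convention is itself an assumption here.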
Digging a bit deeper into the various 'unexpand' implementations, it seems
there are more differences.

Attached is a summary of most of coreutils' unexpand tests on various systems.
The trivial cases give the same results, but the trickier cases (e.g. the
'blanks' and 'posix' tests) do differ.

The test script is here:
  http://files.housegordon.org/tmp/test-unexpand-2.sh
(the last 'ff' octet for AIX can be ignored; I suspect a bug in AIX's unexpand
when lines are not '\n' terminated).

Example (the inputs are 'blanks-1' and 'blanks-11' from
<coreutils>/tests/misc/unexpand.pl):

  blanks-1   AIX-1                  09 62 09 09 63 09 09 09 64
  blanks-1   Darwin-14.4.0          20 62 09 20 63 09 09 20 64
  blanks-1   FreeBSD-10.1-RELEASE   20 62 09 20 63 09 09 20 64
  blanks-1   Linux-3.16.0-4-amd64   09 62 09 09 63 09 09 09 64
  blanks-1   SunOS-5.11             20 62 20 20 63 20 20 20 64

  blanks-11  AIX-1                  09 09 34
  blanks-11  Darwin-14.4.0          09 34
  blanks-11  FreeBSD-10.1-RELEASE   09 34
  blanks-11  Linux-3.16.0-4-amd64   09 09 34
  blanks-11  SunOS-5.11             09 20 34

And so I wonder if it's best to leave unexpand's algorithm as-is, for the sake
of backwards compatibility (in case someone relies on coreutils' current
behavior), and then focus back on multibyte character processing in 'expand'
(with or without using the refactoring patches).
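For anyone who wants to produce a line in the same "test-name system octets"
format on another machine, something along these lines might do. This is not
the test-unexpand-2.sh script linked above, just a minimal sketch: the helper
names hex_octets and run_case are made up, and the sample case at the bottom
is a hypothetical one, not taken from unexpand.pl.

```python
import platform
import shutil
import subprocess


def hex_octets(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex octets."""
    return " ".join(f"{b:02x}" for b in data)


def run_case(name, args, data):
    """Run the local unexpand on `data` and return one summary line."""
    result = subprocess.run(
        ["unexpand", *args],
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    system = f"{platform.system()}-{platform.release()}"
    return f"{name} {system} {hex_octets(result.stdout)}"


if __name__ == "__main__":
    if shutil.which("unexpand") is None:
        print("unexpand not found in PATH")
    else:
        # Hypothetical case: 'a', eight spaces, 'b', no trailing newline.
        print(run_case("example-1", ["-a"], b"a        b"))
```

The octets printed will depend on the local unexpand, which is the point of
the comparison.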
Attachment: unexpand-comparison.txt.xz (binary data)
regards,
 - assaf
