(Adding Ian Jackson for dpkg/debian-version details)

Hello,

On Tue, May 28, 2019 at 02:53:39AM +0200, Vincent Lefevre wrote:
> With GNU coreutils 8.30 under Debian/unstable, I get:
>
> $ LC_ALL=C ls
> ab-cd abb abe
> $ LC_ALL=C ls -v
> abb abe ab-cd
>
> The hyphen-minus character should still be regarded as being less
> than the letters (there are no digits, so both are expected to be
> equivalent). The GNU coreutils manual says:
> [...]

Thanks for the report and the clear details.

To summarize, "ls -v" and "sort -V" (coreutils' version sort) behave
differently from other implementations (dpkg's, for example) with
regard to the minus character:

    $ printf "%s\n" abb ab-cd | sort -V
    abb
    ab-cd

    $ v1="abb"
    $ v2="ab-cd"
    $ dpkg --compare-versions "$v1" lt "$v2" \
        && printf "%s\n" "$v1" "$v2" \
        || printf "%s\n" "$v2" "$v1"
    ab-cd
    abb

If I understand correctly, the reason is that in Debian's version
comparison algorithm [1], the minus character has a special meaning:
it separates the "upstream version" part from the "debian revision"
part.

In Debian's implementation [2], a version string is first split into
three parts (epoch, upstream version, debian revision), using ":" as
the epoch delimiter and "-" as the revision delimiter. Only then are
the three parts compared, each separately [3].

[1] https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
[2] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/parsehelp.c#n191
[3] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/version.c#n140

On the other hand, coreutils' implementation (from gnulib [4]) does
not break the version string into three parts: it treats the entire
string as a single "upstream version" part.

The rules for sorting the "upstream version" string say:

    "... The lexical comparison is a comparison of ASCII values
    modified so that all the letters sort earlier than all the
    non-letters and so that a tilde sorts before anything"
    (from [1])

[4] https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/filevercmp.c

Therefore, dpkg first separates "ab" from "cd", then compares "ab" to
"abb", and "ab" comes first. Coreutils compares "ab-cd" to "abb" (or
technically, just "ab-" to "abb"), and because "letters sort earlier
than all the non-letters", "abb" comes first.

I hope this helps explain the differences (I also hope this
explanation is correct, and I invite others to chime in).

regards,
 - assaf
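
P.S. In case it helps to see the two behaviours side by side, here is
a rough Python sketch of the comparison rule quoted from [1], applied
once to the whole string and once after splitting at the last "-".
This is only an illustration under my own assumptions: it is neither
the dpkg nor the gnulib code, the function names (char_order,
compare_part, split_revision, compare_split) are made up for this
example, the epoch part is ignored, and whatever else filevercmp does
beyond this rule is left out.

```python
from functools import cmp_to_key

DIGITS = "0123456789"


def char_order(ch):
    """Sort weight of one character; "" stands for end of string."""
    if ch == "" or ch in DIGITS:
        return 0
    if ch == "~":
        return -1             # tilde sorts before anything, even the end
    if ch.isascii() and ch.isalpha():
        return ord(ch)        # letters keep their ASCII order
    return ord(ch) + 256      # every non-letter sorts after all letters


def compare_part(a, b):
    """Compare two strings by the 'upstream version' rule: -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character using char_order.
        while (i < len(a) and a[i] not in DIGITS) or \
              (j < len(b) and b[j] not in DIGITS):
            wa = char_order(a[i] if i < len(a) else "")
            wb = char_order(b[j] if j < len(b) else "")
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Digit run: compare numerically; an empty run counts as zero.
        k, m = i, j
        while k < len(a) and a[k] in DIGITS:
            k += 1
        while m < len(b) and b[m] in DIGITS:
            m += 1
        na, nb = int(a[i:k] or "0"), int(b[j:m] or "0")
        if na != nb:
            return -1 if na < nb else 1
        i, j = k, m
    return 0


def split_revision(version):
    """Split at the LAST hyphen into (upstream, revision)."""
    head, sep, tail = version.rpartition("-")
    return (head, tail) if sep else (version, "")


def compare_split(a, b):
    """Split-then-compare, in the spirit of what dpkg does (no epoch)."""
    ua, ra = split_revision(a)
    ub, rb = split_revision(b)
    return compare_part(ua, ub) or compare_part(ra, rb)


names = ["ab-cd", "abb", "abe"]
# Whole string treated as one "upstream version" (the coreutils-like case):
print(sorted(names, key=cmp_to_key(compare_part)))   # ['abb', 'abe', 'ab-cd']
# Split at the hyphen first (the dpkg-like case):
print(sorted(names, key=cmp_to_key(compare_split)))  # ['ab-cd', 'abb', 'abe']
```

With the whole-string comparison, "-" is weighed against "b" and loses
to it, since non-letters sort after letters. With the split
comparison, "ab" is compared to "abb" and wins by being a prefix, so
the hyphen never takes part in the comparison at all.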