Vincent Lefevre writes ("Re: bug#35939: version sort is incorrect with hyphen-minus"):
> On 2019-06-26 18:40:50 -0700, Paul Eggert wrote:
> > Perhaps the coreutils manual could be improved to make this all clearer, and
> > perhaps it should refer to the Debian manual if it doesn't already.
>
> In this case, there should be a new ordering option to provide
> true numeric sort with strings mixing non-negative integers and
> characters.

I think the Debian algorithm is such an algorithm, but it has a
wrinkle which you are not expecting.  Here is the specification:
  https://www.debian.org/doc/debian-policy/ch-controlfields.html#version

Note in particular
 | The lexical comparison is a comparison of ASCII values modified so
 | that all the letters sort earlier than all the non-letters and so
 | that a tilde sorts before anything, even the end of a part

So in the Debian algorithm, `-' sorts after `a'.

I specified this rule.  I did it mainly because of versions like
`1.0beta3', which is probably a prerelease of `1.0' and therefore
earlier than `1.0.3'.  So `b' has to sort before `.' and my rule
seemed the simplest one to achieve that.  (The version comparison
algorithm is a tradeoff between complexity and breadth of support for
people's then-existing practices.)

Nowadays Debian invariably writes `1.0~beta3' but when I invented this
scheme I did not include the (invaluable) `~' feature.

When this is extended to UTF-8, presumably the ordering should be an
ordering of Unicode scalar values, with the rule about letters
interpreted as referring to anything which Unicode considers a letter.

If you want to test the Debian algorithm and have access to a copy of
dpkg, you can append -1 to both strings to be the "Debian revision",
and prepend "1:" to be the "epoch", and then the middle part should be
compared the same way as sort -V etc.

Vincent, what is your use case for a comparison algorithm which is
like the Debian one but which sorts letters after punctuation?

Ian.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.
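
For readers without dpkg to hand, the lexical rule quoted above can be
sketched roughly as follows.  This is only an illustration written from
the policy text, assuming ASCII input and handling just the middle
(upstream) part, with no epoch or Debian revision.  The names
char_order, compare_nondigit and compare_upstream are made up for this
sketch; it is not dpkg's code.

```python
import re


def char_order(c):
    """Sort weight of one character in a non-digit part ('' = end of part)."""
    if c == "~":
        return -1                 # tilde sorts before anything, even the end
    if c == "":
        return 0                  # end of the part
    if "a" <= c <= "z" or "A" <= c <= "Z":
        return ord(c)             # letters keep their ASCII order
    return ord(c) + 256           # every non-letter sorts after every letter


def compare_nondigit(a, b):
    """Compare two non-digit parts; returns -1, 0 or 1."""
    for i in range(max(len(a), len(b))):
        wa = char_order(a[i] if i < len(a) else "")
        wb = char_order(b[i] if i < len(b) else "")
        if wa != wb:
            return -1 if wa < wb else 1
    return 0


def compare_upstream(a, b):
    """Compare two version strings part by part; returns -1, 0 or 1."""
    while a or b:
        # Leading non-digit parts are compared with the modified ordering.
        na = re.match(r"[^0-9]*", a).group()
        nb = re.match(r"[^0-9]*", b).group()
        result = compare_nondigit(na, nb)
        if result:
            return result
        a, b = a[len(na):], b[len(nb):]
        # Leading digit parts are compared numerically; empty counts as 0.
        da = re.match(r"[0-9]*", a).group()
        db = re.match(r"[0-9]*", b).group()
        ia, ib = int(da or "0"), int(db or "0")
        if ia != ib:
            return -1 if ia < ib else 1
        a, b = a[len(da):], b[len(db):]
    return 0
```

With this sketch, compare_upstream("1.0beta3", "1.0.3") is negative
because `b' gets a lower weight than `.', while
compare_upstream("1.0~beta3", "1.0") is negative only because of the
tilde rule.  To check against the real thing, dpkg --compare-versions
with the "1:" prefix and "-1" suffix described above should agree.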