Hello Carsten,

On 05/14/2016 10:17 AM, Carsten Hey wrote:
the man page sort(1) contains a misleading description of the option -n:
[...]
$ man sort | grep -A1 -- --numeric-sort | sed -n -e 's/^ *//' -e '1!p'
compare according to string numerical value
[...]
This description reads as if this command:
$ printf '%s\n' 'x 9' 'x 10' | sort -n
x 10
x 9
[...]
but instead, -n stops doing its magic after finding the first non-numeric, non-whitespace character. There is a short and simple way to summarize this behaviour.

IIUC, you are disputing the accuracy (or clarity) of the term "string numerical value" on the manual page, and not the actual behavior of "sort -n" (which is mandated by POSIX and has been this way for many, many years, as opposed to "sort -V", which was only introduced as a GNU extension in coreutils version 7.0 in 2008).

The description says "string numerical value", which (to me) does not mean anything other than numeric value (implying letters will not be sorted properly), but opinions clearly differ.

Using the "--debug" option would immediately reveal the error:

    $ printf '%s\n' 'x 9' 'x 10' | sort --debug -n
    sort: using ‘en_US.UTF-8’ sorting rules
    x 10
    ^ no match for key
    ____
    x 9
    ^ no match for key
    ___

If you have a suggestion for improved wording, I'm sure it can be considered for inclusion. A patch against function usage() in sort.c would go an even longer way.

Note that unlike FreeBSD/OpenBSD, the description in the man page is derived from "sort --help", and thus kept brief.

For completeness, here are similar descriptions of "sort -n" from other sources:

POSIX says (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html):

    -n
        Restrict the sort key to an initial numeric string, consisting of optional <blank> characters, optional minus-sign, and zero or more digits with an optional radix character and thousands separators (as defined in the current locale), which shall be sorted by arithmetic value. An empty digit string shall be treated as zero. Leading zeros and signs on zeros shall not affect ordering.

The GNU Coreutils manual (which is the official documentation, not the man page) says (http://www.gnu.org/software/coreutils/manual/coreutils.html#sort-invocation):

    -n
    --numeric-sort
    --sort=numeric
        Sort numerically. The number begins each line and consists of optional blanks, an optional ‘-’ sign, and zero or more digits possibly separated by thousands separators, optionally followed by a decimal-point character and zero or more digits. An empty number is treated as ‘0’. The LC_NUMERIC locale specifies the decimal-point character and thousands separator. By default a blank is a space or a tab, but the LC_CTYPE locale can change this.

OpenBSD's man page has:

    -n, --numeric-sort, --sort=numeric
        An initial numeric string, consisting of optional blank space, optional minus sign, and zero or more digits (including decimal point) is sorted by arithmetic value. Leading blank characters are ignored.

FreeBSD's man page has:

    -n, --numeric-sort, --sort=numeric
        Sort fields numerically by arithmetic value. Fields are supposed to have optional blanks in the beginning, an optional minus sign, zero or more digits (including decimal point and possible thousand separators).

I'm leaving the bug open; other comments and feedback are welcome.

regards,
 - assaf
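
P.S. In case it helps anyone reading this report later, here is a minimal sketch of the difference between a whole-line numeric key and a numeric key restricted to one field. It assumes the intent of the example was to order the lines by the number in the second field, which the report does not actually state; the variable names are only illustrative, and LC_ALL=C is set so that the last-resort comparison is a plain byte comparison.

```shell
# Assumption: the goal is to order the lines by the number in field 2.
input=$(printf '%s\n' 'x 9' 'x 10')

# -n applied to the whole line: each line starts with "x", so the initial
# numeric string is empty and both keys count as 0. The tie is then broken
# by the last-resort whole-line comparison, which puts "x 10" first.
whole_line=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

# Restrict the key to field 2 and compare that field numerically.
by_field=$(printf '%s\n' "$input" | LC_ALL=C sort -k2,2n)

printf 'sort -n:\n%s\n' "$whole_line"
printf 'sort -k2,2n:\n%s\n' "$by_field"
```

With the key limited to the second field, "x 9" sorts before "x 10"; with plain -n the order stays as shown in the report.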