On 2025/03/12 20:33, Ingo Schwarze wrote:
>
> See below for a patch to improve some of the aspects related to the
> present report. I do not claim this patch fixes all problems in the
> vicinity, but i fear rabbit holes and prefer incremental progress.
>
> > but the -n behaviour seems valid and, importantly, matches the common
> > other implementation and does not seem to violate posix.
> >
> > -h is of course an extension, but matching -n seems right.
>
> I agree with all of that.
>
> One aspect i still don't understand is the interaction of -n with "-t.",
> for example why "sort -n -t. -k1 -k2 -k3 -k4 < test.in" doesn't
> work on the input provided by the OP (maybe parsing "." as a decimal
> point takes precedence over the "-t." making it a field separator?
> I'm not sure). I'm not sure how the standard expects field splitting
> and number parsing to be related to each other. But one thing at a time,
> so here comes my diff:
>
> Rationale:
> The main point is that for all the numeric sort options, we need to say
> explicitly what the key is, because the key is what the description
> of the -u option refers to.
>
> In the order of the patch, the detailed rationale is:
>  1. "implies a stable sort (see below)" is just wrong.
>     If anything, -s is above -u, not below - but saying that would
>     be useless, it's better to just point to -s directly.
>  2. Fix -g in a similar way as -n (see below).
>  3. "handles general floating points" sounds logically wrong.
>     The text isn't talking about multiple points, but multiple numbers.
>  4. Fix -h in a similar way as -n (see below).
>  5. Fix the cross reference to df(1).
>  6. Say what the key is.
>  7. Add the missing indefinite article "an optional minus sign".
>  8. Avoid needlessly turning the postpositive participle "including"
>     into a parenthetic remark.
>  9. Add the missing indefinite article to "decimal point".
> 10. Clarify that the decimal point is optional.
>
> OK?
>   Ingo

This seems a clear improvement to me. Of course, further changes can be
made on top if wanted, but this is a big improvement already.

OK sthen

> Index: sort.1
> ===================================================================
> RCS file: /cvs/src/usr.bin/sort/sort.1,v
> diff -u -r1.65 sort.1
> --- sort.1 31 Mar 2022 17:27:27 -0000 1.65
> +++ sort.1 12 Mar 2025 19:26:15 -0000
> @@ -121,7 +121,8 @@
>  is not defined.
>  .It Fl u , Fl Fl unique
>  Unique: suppress all but one in each set of lines having equal keys.
> -This option implies a stable sort (see below).
> +This option implies
> +.Fl s .
>  If used with
>  .Fl C
>  or
> @@ -148,24 +149,25 @@
>  Consider all lowercase characters that have uppercase
>  equivalents to be the same for purposes of comparison.
>  .It Fl g , Fl Fl general-numeric-sort , Fl Fl sort Ns = Ns Cm general-numeric
> -Sort by general numerical value.
> +Use an initial numeric string as the key and sort numerically.
>  As opposed to
>  .Fl n ,
> -this option handles general floating points.
> +this option handles general floating point numbers.
>  It has a more
>  permissive format than that allowed by
>  .Fl n
>  but it has a significant performance drawback.
>  .It Fl h , Fl Fl human-numeric-sort , Fl Fl sort Ns = Ns Cm human-numeric
> -Sort by numerical value, but take into account the SI suffix,
> -if present.
> +Use an initial numeric string with an optional SI suffix as the key.
>  Sorts first by numeric sign (negative, zero, or
>  positive); then by SI suffix (either empty, or `k' or `K', or one
>  of `MGTPEZY', in that order); and finally by numeric value.
>  The SI suffix must immediately follow the number.
>  For example, '12345K' sorts before '1M', because M is "larger" than K.
>  This sort option is useful for sorting the output of a single invocation
> -of 'df' command with
> +of a
> +.Xr df 1
> +command with
>  .Fl h
>  or
>  .Fl H
> @@ -176,9 +178,9 @@
>  Sort by month abbreviations.
>  Unknown strings are considered smaller than valid month names.
>  .It Fl n , Fl Fl numeric-sort , Fl Fl sort Ns = Ns Cm numeric
> -An initial numeric string, consisting of optional blank space, optional
> -minus sign, and zero or more digits (including decimal point)
> -is sorted by arithmetic value.
> +Use an initial numeric string as the key, consisting of optional
> +blank space, an optional minus sign, and zero or more digits including
> +an optional decimal point, and sort numerically.
>  Leading blank characters are ignored.
>  .It Fl R , Fl Fl random-sort , Fl Fl sort Ns = Ns Cm random
>  Sort lines in random order.
>
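
For illustration only, here is a rough Python sketch of the keys as the
revised -n and -h wording describes them. It is a reading of the manual
text, not of the sort(1) source. The names numeric_key and human_key are
made up for this example, "." is assumed to be the decimal point, an
empty numeric string is assumed to compare as zero, and the ordering
among negative values with suffixes is a guess.

```python
import re

# Optional blank space, an optional minus sign, and zero or more digits
# including an optional decimal point (wording of the revised -n text).
_NUM = re.compile(r"[ \t]*(-?)(\d*)(?:\.(\d*))?")

# Suffix order from the -h text: empty, then k or K, then MGTPEZY.
_SI_RANK = {"": 0, "k": 1, "K": 1, "M": 2, "G": 3, "T": 4,
            "P": 5, "E": 6, "Z": 7, "Y": 8}


def _parse(key):
    """Return (value, end offset) of the initial numeric string in key."""
    m = _NUM.match(key)  # always matches, since every part is optional
    sign, whole, frac = m.group(1), m.group(2), m.group(3) or ""
    if not whole and not frac:
        return 0.0, 0  # assumption: no digits means the key counts as zero
    value = float(f"{whole or '0'}.{frac or '0'}")
    return (-value if sign else value), m.end()


def numeric_key(key):
    """Key for -n as described: the initial numeric string, as a number."""
    return _parse(key)[0]


def human_key(key):
    """Key for -h as described: sign first, then SI suffix, then value."""
    value, end = _parse(key)
    suffix = key[end:end + 1]  # the suffix must immediately follow the number
    rank = _SI_RANK.get(suffix, 0) if end else 0
    sign = (value > 0) - (value < 0)
    # Assumption: for negative numbers a larger suffix sorts earlier.
    return (sign, sign * rank, value)


print(numeric_key("10.1.1.1"))
print(sorted(["1M", "12345K", "3", "0"], key=human_key))
```

Under this reading, a key that spans the whole line "10.1.1.1" yields
10.1, and '12345K' sorts before '1M', matching the example in the -h
text.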