2021-07-04 15:47:55 +0700, Robert Elz via austin-group-l at The Open Group:
>     Date:        Fri, 2 Jul 2021 14:41:50 +0100
>     From:        "Geoff Clare via austin-group-l at The Open Group" 
> <austin-group-l@opengroup.org>
>     Message-ID:  <20210702134150.GB16587@localhost>
> 
>   | I've always assumed that the intention for -c is to answer the
>   | question "if I ran this command without -c would the output be 
>   | the same as the input?"  So the NetBSD behaviour seems wrong
>   | to me.
> 
> But:
>       jinx$ printf '%s\n' a,b a,a 
>       a,b
>       a,a
>       jinx$ printf '%s\n' a,b a,a | sort -t, -k1,1
>       a,b
>       a,a

That would make it non-compliant then.

SUS> When there are multiple key fields, later keys shall be
SUS> compared only after all earlier keys compare equal. Except
SUS> when the -u option is specified, lines that otherwise
                                      ^^^^^^^^^^^^^^^^^^^^
SUS> compare equal shall be ordered as if none of the options
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SUS> -d, -f, -i, -n, or -k were present (but with -r still in
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SUS> effect, if it was specified) and with all bytes in the
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SUS> lines significant to the comparison. The order in which
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SUS> lines that still compare equal are written is unspecified.

[...]
> ie: When -k args are given, there is no fallback to "whole record" matching,
> if one wants that, one can easily add a final -k1 option to make that happen:
[...]
> which is the way it should be - if one has taken the trouble to specify
> what parts of the record are the keys for sorting (and -u comparisons)
> then sort should not be gratuitously adding more - that it used to do so
> was widely regarded as a bug (especially given that there was no way to
> defeat it, but enabling it is so simple when it is not the default).
[...]

I don't know what the original rationale was, but /one/
rationale could be to guarantee a deterministic and total order,
so that two files with the same lines (though in different
orders) produce the same output when sorted, whatever the
sorting specification.
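
To illustrate that rationale, here is a minimal Python sketch (a
model only, not the code of any real sort implementation). The
helper names key_field, sort_key_only and sort_with_last_resort are
made up for this example, and Python's code-point string comparison
stands in for C-locale collation. It sorts every permutation of the
same three lines on the first comma-separated field, once with ties
left in input order and once with the whole line as a final
tie-breaker:

```python
from itertools import permutations

def key_field(line, sep=",", field=1):
    # Hypothetical helper: extract the 1-based field, roughly like -t, -k1,1.
    return line.split(sep)[field - 1]

def sort_key_only(lines):
    # Python's sorted() is stable, so lines with equal keys keep input order.
    return sorted(lines, key=key_field)

def sort_with_last_resort(lines):
    # Equal keys are broken by comparing the whole line (last-resort comparison).
    return sorted(lines, key=lambda line: (key_field(line), line))

lines = ["a,b", "a,a", "b,c"]

# Collect the distinct outputs produced over all input orderings.
key_only_outputs = {tuple(sort_key_only(list(p))) for p in permutations(lines)}
last_resort_outputs = {tuple(sort_with_last_resort(list(p))) for p in permutations(lines)}

print(len(key_only_outputs))     # 2: the result depends on the input order
print(len(last_resort_outputs))  # 1: the result is the same for every input order
```

With the tie-breaker, the output is a function of the set of lines
alone; without it, the relative order of "a,a" and "a,b" leaks
through from the input.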

That guarantee is broken in locales whose collation order is not
total, which was the subject of recent changes.

POSIX sort sorts as specified and, in cases the user did not
cover (sort keys equal but lines different), makes one of
several possible decisions: here, a last-resort comparison of
the full line (falling back to a memcmp() comparison when
strcoll() finds them equal, if need be), whilst NetBSD sort keeps
the original order. Note that POSIX doesn't require the sort to
be stable, and leaves unspecified which line is selected by
sort -uk1,1, for instance.
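
Applied to the -c question quoted at the top, the two tie-handling
choices give different answers for the same input. The following
Python sketch is again only a model under simplifying assumptions
(code-point comparison instead of strcoll(), no -r, and hypothetical
helper names key_only, key_then_whole_line and is_sorted); it is not
how either implementation is written:

```python
def key_field(line, sep=",", field=1):
    # Hypothetical helper: extract the 1-based field, roughly like -t, -k1,1.
    return line.split(sep)[field - 1]

def key_only(line):
    # Compare on the key field alone; equal keys count as "in order".
    return key_field(line)

def key_then_whole_line(line):
    # Compare on the key field, then on the whole line as a last resort.
    return (key_field(line), line)

def is_sorted(lines, key):
    # Model of a -c style check: every adjacent pair must be in order.
    return all(key(a) <= key(b) for a, b in zip(lines, lines[1:]))

lines = ["a,b", "a,a"]

print(is_sorted(lines, key_only))             # True
print(is_sorted(lines, key_then_whole_line))  # False
print(sorted(lines, key=key_then_whole_line)) # ['a,a', 'a,b']
```

Under the last-resort rule, sorting would reorder the two lines, so
a check that asks "would the output equal the input?" has to report
the input as unsorted; a key-only check accepts it.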

-- 
Stephane
