On Thu 11-Jan-07 4:21am -0600, Matthew Winn wrote:

> On Wed, 10 Jan 2007 12:19:08 -0600, Bill McCarthy wrote:

>> On Tue 9-Jan-07 4:12pm -0600, Arun Easi wrote:

>>> Lines with same 3rd field are sorted lexicographically.
>>> So, if you have MM/DD/YYYY format, that should be good. If
>>> you have DD/MM/YYYY, it cannot be (Try adding 31/11/1996
>>> and 01/12/1996).

>> Is this documented behavior?  It certainly isn't mentioned
>> in the brief docs that come with the GnuWin32 version.  Do
>> you have a URL that documents this?

> As far as I can remember sort has always performed that way, using
> the entire line as a "last resort" sort to resolve keys that would
> otherwise sort identically. I have vague memories of seeing this
> documented but I don't know where. I think the idea was that if you
> cared about the order of the lines you'd specify it, and if you didn't
> care then sort was free to use whatever algorithm it chose.

Thanks for sharing that.  This time, I googled with "last
resort" included - I found some GnuWin32 documentation.

I went back to the GnuWin32 site and discovered that I had
not downloaded all the CoreUtils material (I had the zip
file, and there is a slightly bigger exe file).  I
downloaded and installed the exe.  It contains two PDF files
I didn't have before, one with much better documentation for
sort.

There is a very good explanation of the sort:

    A pair of lines is compared as follows: sort
    compares each pair of fields, in the order specified
    on the command line, according to the associated
    ordering options, until a difference is found or no
    fields are left.  If no key fields are specified,
    sort uses a default key of the entire line.
    Finally, as a last resort when all keys compare
    equal, sort compares entire lines as if no ordering
    options other than ‘--reverse’ (‘-r’) were
    specified.  The ‘--stable’ (‘-s’) option disables
    this last-resort comparison so that lines in which
    all fields compare equal are left in their original
    relative order.  The ‘--unique’ (‘-u’) option also
    disables the last-resort comparison.
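
To check my reading of that passage, here is a rough Python
sketch of the comparison it describes.  It is a simplified
model, not GNU sort's code: the names compare_lines and
sort_lines are made up for illustration, fields are split on
whitespace, each key is a single field compared as a plain
string, and locale and '--reverse' handling are ignored.

```python
from functools import cmp_to_key

def compare_lines(a, b, key_fields, stable=False):
    """Compare two lines: key fields first, whole line as a last resort."""
    fa, fb = a.split(), b.split()
    for idx in key_fields:              # 0-based field indexes, in the order given
        ka = fa[idx] if idx < len(fa) else ""
        kb = fb[idx] if idx < len(fb) else ""
        if ka != kb:
            return -1 if ka < kb else 1
    if stable:
        return 0                        # like -s: skip the last-resort comparison
    return (a > b) - (a < b)            # last resort: compare the entire lines

def sort_lines(lines, key_fields, stable=False):
    # Python's sorted() is itself stable, so returning 0 above
    # leaves equal-key lines in their original relative order.
    cmp = lambda a, b: compare_lines(a, b, key_fields, stable)
    return sorted(lines, key=cmp_to_key(cmp))

lines = ["b x 1996", "a y 1996", "c z 1995"]
print(sort_lines(lines, [2]))               # ['c z 1995', 'a y 1996', 'b x 1996']
print(sort_lines(lines, [2], stable=True))  # ['c z 1995', 'b x 1996', 'a y 1996']
```

The two 1996 lines tie on the third field.  Without the
stable flag the whole-line comparison puts "a y 1996" first;
with it they stay in input order.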

It is interesting to note that the penultimate sentence
above appears to imply that the utility uses a stable sort.
A non-stable sort, like quicksort, would not return records
with equal keys in "original relative order" merely because
the last-resort comparison was skipped.  Of course, when the
sort switches imply that a last-resort comparison will be
made, quicksort could be used, since the comparison itself
then decides the order of records whose keys are equal.
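
A small Python sketch of the stability point.  The quicksort
below is a textbook Lomuto-partition version written only
for this illustration; it says nothing about which algorithm
GNU sort actually uses.  The helper names (first_field,
key_only_less, last_resort_less) are hypothetical, and the
key is simply the first whitespace-separated field.

```python
def quicksort(items, less):
    """Textbook in-place quicksort (Lomuto partition); not stable."""
    a = list(items)

    def sort(lo, hi):
        if lo >= hi:
            return
        pivot = a[hi]
        i = lo
        for j in range(lo, hi):
            if less(a[j], pivot):
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return a

def first_field(line):
    return line.split()[0]

def key_only_less(x, y):
    # Compare on the key alone; ties are left to the algorithm.
    return first_field(x) < first_field(y)

def last_resort_less(x, y):
    # Compare on the key, then on the entire line to break ties.
    return (first_field(x), x) < (first_field(y), y)

records = ["2 a", "1 c", "2 b"]

# Key-only comparison: the two "2" records come out swapped
# relative to their input order.
print(quicksort(records, key_only_less))     # ['1 c', '2 b', '2 a']

# With the last-resort comparison the order of distinct lines is
# fully determined, so stability no longer matters.
print(quicksort(records, last_resort_less))  # ['1 c', '2 a', '2 b']
```

A stable sort on the key alone would have kept "2 a" ahead
of "2 b"; the quicksort does not.  Once whole lines break
the ties, any correct algorithm produces the same output.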

-- 
Best regards,
Bill
