On Sunday, 17 November 2013 at 00:56:46 UTC, Andrei Alexandrescu
wrote:
Very simple. Any sequence with a large number of equivalent
elements will cause timsort to work more than an unstable
algorithm, because it would need to shift around those
elements. (Consider e.g. 1, 0, 0, 0, ..., 0.) The discussion
then shifts to how many equivalent elements are in the typical
range etc.
Ah, very nicely described. So a stable sort can take more time
when there is more than one equal element than a theoretical
unstable sort.
There are well known tweaks to quicksort that make it operate
well on sorted or almost sorted sequences.
I'd have to see them to comment on them. I'd doubt that they only
use quicksort if they support taking advantage of large parts of
the array being sorted. I'd expect some sort of merging to occur
in that case.
We could do that, but that would be masking a bug instead of
fixing it. We should instead focus on fixing unstable sort. One
additional issue with making stable sort the default for a
while is that people will grow to assume sort() is stable, and
will have broken code when we switch back.
Let's fix unstable sort. If it's slower on any other data than
Timsort's happy case, we have a bug.
Sure, but could we at least agree on a timescale? If we can't
come up with something significantly better than Timsort in the
next year, it seems pointless to hold out for something that may
never come. Hidden bug or not, deliberately exposing a bug that
has no effective plan to be fixed isn't a good tactic either.
Especially if D is expected to be competitive to other native
languages in speed. Seeing D falter in the "sort" column of a
shootout could be disconcerting to some and we can't count on
everyone knowing to specify stable sort in those head-to-heads
until we work out improving unstable sort.
That said, your point that people might grow comfortable with
sort being stable is a good one. Unfortunately, I don't have an
answer for that. Documentation detailing that sort is not
guaranteed a particular stability is about the best that could be
done. Perhaps a variant of Timsort could be prepared that
deliberately changes the order of equal elements. Designed
instability could suffice as a stopgap against that (at least
until the unstable sort is worked out).