On Sunday, 17 November 2013 at 00:56:46 UTC, Andrei Alexandrescu wrote:
Very simple. Any sequence with a large number of equivalent elements will cause timsort to work more than an unstable algorithm, because it would need to shift around those elements. (Consider e.g. 1, 0, 0, 0, ..., 0.) The discussion then shifts to how many equivalent elements are in the typical range etc.

Ah, very nicely described. So a stable sort can take more time when there is more than one equal element than a theoretical unstable sort.

There are well known tweaks to quicksort that make it operate well on sorted or almost sorted sequences.

I'd have to see them to comment on them. I'd doubt that they only use quicksort if they support taking advantage of large parts of the array being sorted. I'd expect some sort of merging to occur in that case.

We could do that, but that would be masking a bug instead of fixing it. We should instead focus on fixing unstable sort. One additional issue with making stable sort the default for a while is that people will grow to assume sort() is stable, and will have broken code when we switch back.

Let's fix unstable sort. If it's slower on any other data than Timsort's happy case, we have a bug.

Sure, but could we at least agree on a timescale? If we can't come up with something significantly better than Timsort in the next year, it seems pointless to hold out for something that may never come. Hidden bug or not, deliberately exposing a bug that has no effective plan to be fixed isn't a good tactic either. Especially if D is expected to be competitive to other native languages in speed. Seeing D falter in the "sort" column of a shootout could be disconcerting to some and we can't count on everyone knowing to specify stable sort in those head-to-heads until we work out improving unstable sort.

That said, your point that people might grow comfortable with sort being stable is a good one. Unfortunately, I don't have an answer for that. Documentation detailing that sort is not guaranteed a particular stability is about the best that could be done. Perhaps a variant of Timsort could be prepared that deliberately changes the order of equal elements. Designed instability could suffice as a stopgap against that (at least until the unstable sort is worked out).

Reply via email to