[issue33989] ms.key_compare is not initialized in all pathes of list_sort_impl
Elliot Gorokhovsky added the comment: You can always fall back on safe_object_compare. So unless there's an obvious reason why your edge case can't be triggered, it would be worth putting that in as a failsafe. The additional branch should be 100% predictable, so there shouldn't be any performance penalty. -- ___ Python tracker <https://bugs.python.org/issue33989> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment: OK, I added the safety check to unsafe_object_compare. I verified that it matches the current implementation on all the tests proposed in this thread.
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

I am embarrassed! That's why I said IIRC... I remembered that either RichCompare calls RichCompareBool, or the other way around, and I was too lazy to check :) I did remember about do_richcompare, though!

On Sat, Mar 11, 2017 at 10:07 PM Tim Peters <rep...@bugs.python.org> wrote:
> Tim Peters added the comment:
>
> Elliot, PyObject_RichCompareBool calls PyObject_RichCompare. That in turn
> does some checks, hides a small mountain of tests in the expansions of the
> recursion-checking macros, and calls do_richcompare. That in turn does
> some useless (in the cases you're aiming at) tests and finally gets around
> to invoking tp_richcompare. Your patch gets to that final step at once.
>
> I'm surprised you didn't know that ;-)
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment: Eryk Sun: Thanks for your detailed response. I'm curious, though: can you figure out why those other two examples *didn't* fail? It's driving me crazy! For reference: https://bugs.python.org/msg289435
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

Ya, that makes sense... I just don't get why it's faster at all, then! Because if we add the (v==w) check, and the tp_richcompare check, how is unsafe_object_compare any different from PyObject_RichCompareBool? Is it that we're saving function calls? (PyObject_RichCompareBool calls do_richcompare, so it's one extra call IIRC).

On Sat, Mar 11, 2017 at 9:32 PM Tim Peters <rep...@bugs.python.org> wrote:
> Tim Peters added the comment:
>
> The impact would be small: it would add one (or so) pointer-equality
> compare that, in practice, will always say "yup, they're equal". Dirt
> cheap, and the branch is 100% predictable.
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

It was a release build -- it would blow up in a debug build. Now, regarding the fix you propose: I'll benchmark it tomorrow. If the impact is small, I agree that it would be an elegant fix. If not, however, perhaps we just have to get rid of unsafe_object_compare entirely? The common use cases would still work just fine, and maybe we could add a case for bytes or something to compensate for the loss.

On Sat, Mar 11, 2017 at 9:12 PM Tim Peters <rep...@bugs.python.org> wrote:
> Tim Peters added the comment:
>
> Elliot, did you run the example in a release build or a debug build? I'm
> wondering why this:
>
>     assert(v->ob_type == w->ob_type &&
>            v->ob_type->tp_richcompare != NULL &&
>            v->ob_type->tp_richcompare == compare_funcs.key_richcompare);
>
> didn't blow up (in `unsafe_object_compare`).
>
> If that does blow up in a debug build, it suggests "a fix":
> unconditionally check whether the tp_richcompare slot is the expected
> value. If not, use `PyObject_RichCompareBool(v, w, Py_LT)` instead.
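The guard Tim proposes maps naturally onto a pure-Python sketch. This is illustrative only: the real patch works on C-level tp_richcompare slots inside listobject.c, and make_guarded_compare / generic_lt are invented names, not part of the patch.

```python
def make_guarded_compare(sample):
    """Specialize a less-than to sample's type, with a per-call guard.

    The guard is one highly predictable check; if either operand's type
    (or the cached __lt__) has changed since the pre-pass, we fall back
    to the generic compare, the analogue of PyObject_RichCompareBool.
    """
    expected_type = type(sample)
    expected_lt = expected_type.__lt__   # analogue of caching tp_richcompare

    def generic_lt(v, w):
        return v < w                     # full safety checks happen here

    def unsafe_lt(v, w):
        if (type(v) is not expected_type or type(w) is not expected_type
                or expected_type.__lt__ is not expected_lt):
            return generic_lt(v, w)      # guard failed: safe fallback
        return expected_lt(v, w)         # fast path

    return unsafe_lt

lt = make_guarded_compare(0)       # specialized to int
assert lt(1, 2) is True            # fast path
assert lt(2.5, 2) is False         # mixed types: generic fallback
```

As Tim notes later in the thread, the extra branch is essentially free because it is taken the same way on (almost) every comparison.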
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

On Sat, Mar 11, 2017 at 9:01 PM Tim Peters <rep...@bugs.python.org> wrote:
> Elliot, I don't care if the example behaves differently. Although someone
> else may ;-)
>
> The only things `.sort()` has ever tried to guarantee in the presence of
> mutations (of either the list or the elements) during sorting are that (a)
> the implementation won't segfault; and, (b) the list at the end is _some_
> permutation of the input list (no elements are lost or duplicated).
>
> If crazy mutation examples can provoke a segfault, that's possibly "a
> problem" - but different results really aren't (at least not to me).

That's great to hear. (Of course, one could always remove unsafe_object_compare from the patch and keep the rest, but that would be a real shame). I don't think segfaults are possible if the code is pure-Python, because all the builtin/stdlib functions type-check anyway, so you would just get an exception. Right? Of course, using the C API you could probably provoke segfaults, but there are much easier ways to segfault using the C API :).
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment: I just ran it. With the patched interpreter, I get no error. With the unpatched interpreter, I get ValueError. :(.
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

Actually, I just ran this in the patched interpreter, and it worked! It printed:

zebra gizmo
zebra gizmo
zebra gizmo
zebra gizmo
zebra gizmo
zebra gizmo
zebra gizmo

Inspired by the above result, I ran your counterexample (below) to see if it would work as well:

from random import *

class ClassAssignmentCanBreakChecks():
    def __init__(self, i):
        self._i = i
    def __lt__(self, other):
        print('gizmo')
        last.__class__ = OrdinaryOldInteger
        return self._i < (other._i if hasattr(other, '_i') else other)

class OrdinaryOldInteger:
    def __init__(self, i):
        self._i = i
    def __lt__(self, other):
        print('rocket')
        return self._i < (other._i if hasattr(other, '_i') else other)

lst = [ClassAssignmentCanBreakChecks(i) for i in range(10)]
shuffle(lst)
last = lst[-1]
lst.sort()

And it did! It printed:

gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo gizmo rocket rocket rocket

Note the "rocket" prints at the end; those could not have printed if the compare method didn't change! Do I have any idea *why* these tests work? No. But I swear, I *just* re-made the patched interpreter in the directory where I ran the tests. You will definitely be able to reproduce my results on your system. Wacky! (seriously though I have no idea *why* this works, it just... does... I'm scared...)
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

Yup, I was completely wrong. If your classes were defined in pure Python, this would raise an exception (since the pure-Python operators/functions check for bad types, correct me if I'm wrong). However, if you defined your compares at the C API level, you could get a segfault. There are much easier ways to get segfaults using the C API, however :)

Overall, I feel like if you're mutating the objects while they're being sorted, you should be prepared for undefined behavior. Specifically, in the current implementation, the sort result can be incorrect if you mutate as you sort; no sort algorithm will recheck comparisons periodically to see if you've tried to fool it.

Anyway, here's what Tim Peters said on the Github PR comments, where I asked about the issue you raised:

> About the __class__-changing example, I don't care provided it doesn't
> blow up. But someone else should weigh in on that. I hope we can get,
> e.g., Raymond Hettinger to stare at this issue too.

Either way, great catch! Thanks for the feedback.

On Fri, Mar 10, 2017 at 6:15 PM ppperry <rep...@bugs.python.org> wrote:
> ppperry added the comment:
>
> > Elliot Gorokhovsky added the comment:
> >
> > Your code changes __class__, not type, which would remain equal to
> > "instance". (my understanding, could be wrong). The docs say the
> > following:
>
> Nope:
>
>     class A: pass
>     class B: pass
>     a = A()
>     a.__class__ = B
>     type(a)
>
> returns "<class '__main__.B'>"
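ppperry's correction is easy to verify in any Python 3 interpreter: type() tracks a __class__ assignment, so a type cached during the pre-pass really can become stale mid-sort. A minimal demonstration:

```python
class A:
    pass

class B:
    pass

a = A()
assert type(a) is A
a.__class__ = B      # permitted for ordinary heap classes with compatible layouts
assert type(a) is B  # type() reports the new class immediately
```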
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

Your code changes __class__, not type, which would remain equal to "instance". (my understanding, could be wrong). The docs say the following: https://docs.python.org/3.7/reference/datamodel.html

> Like its identity, an object’s type is also unchangeable. [1]
>
> [1] It is possible in some cases to change an object’s type, under certain controlled conditions.
> It generally isn’t a good idea though, since it can lead to some very strange behavior if it is
> handled incorrectly.

So I think it's safe to assume that type doesn't change; if you change type, you get undefined ("very strange") behavior. Based on the comment in the footnote, other code clearly assumes that type doesn't change, so I don't see why list.sort() should be any different.

On Fri, Mar 10, 2017 at 5:08 PM ppperry <rep...@bugs.python.org> wrote:
> ppperry added the comment:
>
> Does this work with wacky code like this?
>
>     @functools.total_ordering
>     class ClassAssignmentCanBreakChecks():
>         def __init__(self, i):
>             self._i = i
>         def __lt__(self, other):
>             last.__class__ = OrdinaryOldInteger
>             return self._i < (other._i if hasattr(other, '_i') else other)
>
>     @functools.total_ordering
>     class OrdinaryOldInteger:
>         def __init__(self, i):
>             self._i = i
>         def __lt__(self, other):
>             return self._i < (other._i if hasattr(other, '_i') else other)
>
>     lst = [ClassAssignmentCanBreakChecks(i) for i in range(10)]
>     random.shuffle(lst)
>     last = lst[-1]
>     lst.sort()
>
> It looks like it will initially say that all are the same type, and
> attempt that optimization, which will probably lead to unexpected results
> as that condition is no longer true after the first compare.
>
> --
> nosy: +ppperry
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

On Fri, Mar 10, 2017 at 12:26 AM Serhiy Storchaka <rep...@bugs.python.org> wrote:
> Serhiy Storchaka added the comment:
>
> The issue shouldn't be closed until it resolved or rejected.

Ya, sorry about that. This is my first time contributing.

> I like the idea, and benchmarking results for randomized lists look nice.
> But could you please run benchmarks for already sorted lists?

David Mertz asked for the same thing on python-ideas. Here's what I replied (you can also find these numbers in my pull request description):

***

You are entirely correct, as the benchmarks below demonstrate. I used the benchmark lists from Objects/listsort.txt, which are:

\sort: descending data
/sort: ascending data
3sort: ascending, then 3 random exchanges
+sort: ascending, then 10 random at the end
%sort: ascending, then randomly replace 1% of elements w/ random values
~sort: many duplicates
=sort: all equal

My results are below (the script can be found at https://github.com/embg/python-fastsort-benchmark/blob/master/bench-structured.py):

Homogeneous ([int]):
\sort: 54.6%
/sort: 56.5%
3sort: 53.5%
+sort: 55.3%
%sort: 52.4%
~sort: 48.0%
=sort: 45.2%

Heterogeneous ([int]*n + [0.0]):
\sort: -17.2%
/sort: -19.8%
3sort: -18.0%
+sort: -18.8%
%sort: -10.0%
~sort: -2.1%
=sort: -21.3%

As you can see, because there's a lot less non-comparison overhead in the structured lists, the impact of the optimization is much greater, both in performance gain and in worst-case cost. However, I would argue that these data do not invalidate the utility of my patch: the probability of encountering a type-heterogeneous list is certainly less than 5% in practice. So the expected savings, even for structured lists, is something like (5%)(-20%) + (95%)(50%) = 46.5%. And, of course, not *all* the lists one encounters in practice are structured; certainly not *this* structured. So, overall, I would say the numbers above are extremely encouraging.

Thanks for pointing out the need for this benchmark, though!

***

Thanks for the feedback!
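The expected-savings estimate in the message above is straightforward arithmetic; assuming, as the message does, that about 5% of real-world lists are type-heterogeneous:

```python
p_hetero = 0.05        # assumed share of heterogeneous lists (from the message)
gain_hetero = -0.20    # ~20% slowdown on the bad case
gain_homog = 0.50      # ~50% speedup on the common case

# Expected fractional savings across all sorts.
expected = p_hetero * gain_hetero + (1 - p_hetero) * gain_homog
print(f"{expected:.1%}")  # prints 46.5%, matching the message
```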
[issue28685] Optimizing list.sort() by performing safety checks in advance
Changes by Elliot Gorokhovsky <elliot.gorokhov...@gmail.com>: -- pull_requests: +479
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment: Will post the final version of this patch as a pull request on Github. -- stage: -> resolved status: open -> closed
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment: Oh wait... uh... never mind... we want "faster" to refer to total time taken, so 1-dev/ref is indeed the correct formula. I just got confused because perf outputs ref/dev, but that doesn't make sense for percents.
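The two formulas being mixed up here are worth spelling out side by side. Using the float-benchmark numbers from elsewhere in this thread (ref = 8.34 us unpatched, dev = 3.33 us patched) purely as an illustration:

```python
ref, dev = 8.34, 3.33    # microseconds; unpatched vs. patched medians

speedup = ref / dev      # what perf compare_to reports: "2.50x faster"
savings = 1 - dev / ref  # fraction of total time eliminated: ~60%

assert round(speedup, 2) == 2.50
assert round(savings, 2) == 0.60
```

Both are correct statistics; they just answer different questions (a multiplicative ratio vs. a percentage of time saved), which is why quoting one as the other looks wrong.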
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

So thanks for pointing out that perf has a --compare_to option: it turns out I had calculated the times wrong! Specifically, I had used (ref-dev)/ref while I should have used ref/dev, which is what perf --compare_to uses. Anyway, the actual times are even more incredible than I could have imagined! First, here's my benchmark script:

#!/bin/bash
rm ref.json dev.json 2> /dev/null
python-dev/python -m perf timeit -s "$1" "sorted(l)" --rigorous -o dev.json > /dev/null
python-ref/python -m perf timeit -s "$1" "sorted(l)" --rigorous -o ref.json > /dev/null
python-ref/python -m perf compare_to ref.json dev.json

And here are the results:

./bench.sh "import random; l=[random.uniform(-1,1) for _ in range(0,100)]"
Median +- std dev: [ref] 8.34 us +- 0.18 us -> [dev] 3.33 us +- 0.13 us: 2.50x faster

So it's 150% faster! (i.e. 150% + 100% = 250%). 150% faster sorting for floats!!! If we make them tuples, it's even more incredible:

Median +- std dev: [ref] 20.9 us +- 1.0 us -> [dev] 4.99 us +- 0.27 us: 4.19x faster

319% faster!!! And earlier, I had thought 75% was impressive... I mean, 319%!!! And again, this is an application that is directly useful: DEAP spends a great deal of time sorting tuples of floats, this will make their EAs run a lot faster.

"import random; l=[str(random.uniform(-1,1)) for _ in range(0,100)]"
Median +- std dev: [ref] 15.7 us +- 0.9 us -> [dev] 9.24 us +- 0.52 us: 1.70x faster

"import random; l=[int(random.uniform(-1,1)*2**15) for _ in range(0,100)]"
Median +- std dev: [ref] 8.59 us +- 0.19 us -> [dev] 4.35 us +- 0.13 us: 1.98x faster
[issue28685] Optimizing list.sort() by performing safety checks in advance
Changes by Elliot Gorokhovsky <elliot.gorokhov...@gmail.com>: Added file: http://bugs.python.org/file45508/fastsort.patch
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

On Mon, Nov 14, 2016 at 10:32 AM STINNER Victor <rep...@bugs.python.org> wrote:
> You can use perf timeit --compare-to to check if the result is significant
> or not, and it displays you the "N.NNx faster" or "N.NNx slower" if it's
> significant.

Will do -- I'm writing this up as a paper since this is my science fair project, so I'll redo the measurements that way and upload a pdf here.

> About benchmarks, I also would like to see a benchmark on the bad case,
> when specialization is not used. And not only on an empty list :-) For
> example, sort 1000 objects which implement compare operators and/or a sort
> function.

The worst case is the third benchmark from the top -- a list of floats with a single, sad, solitary long at the end. That disables optimization because keys_are_all_same_type gets set to 0 when assign_compare_function finds the long (since key_type == _Type). That benchmark is the *absolute* worst case for two reasons:

1. Float compares are really cheap, so the ratio doesn't get messed up by the common term "time spent actually comparing floats" (since the total time is "time spent on overhead + time spent actually comparing floats").
2. The list is of size 10. I guess I could've used size 3 or 4, but it shouldn't be too far off... smaller lists give worse ratios because the overhead is O(n) while sorting is O(n log n).

So, again, the absolute worst possible case is the third benchmark, which suffers a 10% slowdown. Certainly a reasonable price to pay considering how rare that case is in practice, and considering the 40-75% gains we get on the common cases.
[issue28685] Optimizing list.sort() by performing safety checks in advance
Elliot Gorokhovsky added the comment:

Sure, if it compiles without that def, I'll remove it from the patch. I added it because I did all the development in an extension module, which also included Python.h, but for some reason it gave me a "function implicitly defined" error for using Py_ABS. Even though I had included Python.h. Weird, right? So I assumed I had to leave it in the patch as well. Thanks for pointing this out!

On Mon, Nov 14, 2016 at 6:09 AM Julien Palard <rep...@bugs.python.org> wrote:
> Julien Palard added the comment:
>
> Hi Elliot, nice spot!
>
> Why are you redefining Py_ABS, which looks already defined in `pymacro.h`,
> included itself by `Python.h`? I'm not fan of undefining it later; it may
> surprise someone later expecting it to be there.
>
> I tried to compile without your definition of Py_ABS, just in case I
> missed something in the includes, and it works.
>
> --
> nosy: +mdk
[issue28685] Optimizing list.sort() by performing safety checks in advance
Changes by Elliot Gorokhovsky <elliot.gorokhov...@gmail.com>: -- nosy: +tim.peters
[issue28685] Optimizing list.sort() by performing safety checks in advance
New submission from Elliot Gorokhovsky:

When Python compares two objects, there are many safety checks that have to be performed before the actual comparison can take place. These include checking the types of the two objects, as well as checking various type-specific properties, like character width for strings or the number of machine words that represent a long. Obviously, the vast majority of the work done during a list.sort() is spent in comparisons, so even small optimizations can be important.

What I noticed is that since we have n objects, but we're doing O(n log n) comparisons, it pays off to do all the safety checks in a single pass and then select an optimized comparison function that leverages the information gained to avoid as many sort-time checks as possible.

I made the following assumptions:

1. In practice, lists to be sorted are almost always type-homogeneous.
2. Most lists to be sorted consist of homogeneous elements of the following types:
   a. Latin strings (i.e. strings whose characters are one byte wide)
   b. Longs that fit in a single machine word
   c. Floats
   d. Tuples of the above
3. The above assumptions also hold for lists constructed by taking a list of tuples to be sorted and extracting the first elements.
4. The vast majority of tuple comparisons never get past the first element (i.e. the first elements of tuples are rarely equal).

My patch adds the following routine to list.sort():

1. Go through the list and see which of the assumptions, if any, hold (except (4), which can't be checked in advance).
2. Replace PyObject_RichCompareBool() with an optimized compare function that is able to ignore some safety checks by applying the assumption we've already verified to be true.

There are then two questions: when none of the assumptions hold, how expensive is (1)? And when we do get a "hit", how much time do we save by applying (2)? Those questions, of course, can only be answered by benchmarks.
So here they are, computed on an isolated CPU core, on two Python interpreters, both compiled with ./configure --with-optimizations:

# This is a runnable script. Just build the reference interpreter in
# python-ref and the patched interpreter in python-dev.

# Pathological cases (where we pay for (1) but don't get any savings
# from (2)): what kind of losses do we suffer?

# How expensive is (1) for empty lists?
python-dev/python -m perf timeit -s "l=[]" "sorted(l)" --rigorous
python-ref/python -m perf timeit -s "l=[]" "sorted(l)" --rigorous
# Median +- std dev: 212 ns +- 9 ns
# Median +- std dev: 216 ns +- 10 ns
# (difference is within std dev)

# How expensive is (1) for singleton lists?
python-dev/python -m perf timeit -s "l=[None]" "sorted(l)" --rigorous
python-ref/python -m perf timeit -s "l=[None]" "sorted(l)" --rigorous
# Median +- std dev: 235 ns +- 16 ns
# Median +- std dev: 238 ns +- 15 ns
# (difference is within std dev)

# How expensive is (1) for non-type-homogeneous lists?
# (we use small lists because as n gets large, the fraction of time spent in (1) gets smaller)
python-dev/python -m perf timeit -s "import random; l=[random.uniform(-1,1) for _ in range(0,10)]+[1]" "sorted(l)" --rigorous
python-ref/python -m perf timeit -s "import random; l=[random.uniform(-1,1) for _ in range(0,10)]+[1]" "sorted(l)" --rigorous
# Median +- std dev: 784 ns +- 35 ns
# Median +- std dev: 707 ns +- 51 ns
# 10% slower. While this is unfortunate, this case almost never occurs
# in practice, and the losses are certainly offset by the gains we'll
# see below. Also, note that for large lists, the difference would be
# *much* smaller, since the time spent in (1) gets smaller like 1/log(n).
# So basically, we only pay a penalty for non-type-homogeneous small lists.

# OK, now for the cases that actually occur in practice:

# What kind of gains do we get for floats?
python-dev/python -m perf timeit -s "import random; l=[random.uniform(-1,1) for _ in range(0,100)]" "sorted(l)" --rigorous
python-ref/python -m perf timeit -s "import random; l=[random.uniform(-1,1) for _ in range(0,100)]" "sorted(l)" --rigorous
# Median +- std dev: 3.63 us +- 0.20 us
# Median +- std dev: 8.81 us +- 0.37 us
# Wow! 59% faster! And if we used a large list, we would've gotten even better numbers.

# What kind of gains do we get for latin strings?
python-dev/python -m perf timeit -s "import random; l=[str(random.uniform(-1,1)) for _ in range(0,100)]" "sorted(l)" --rigorous
python-ref/python -m perf timeit -s "import random; l=[str(random.uniform(-1,1)) for _ in range(0,100)]" "sorted(l)" --rigorous
# Median +- std dev: 9.51 us +- 0.28 us
# Median +- std dev: 15.9 us +- 0.7 us
# 40% faster!

# What kind of gains do we get for non-latin strings (which I imagine aren't that
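The two-step routine described in this submission can be sketched in pure Python. This is a hedged illustration only: the actual patch specializes comparisons in C inside listobject.c, and pick_compare / generic_lt are invented names for this sketch, not functions in the patch.

```python
def pick_compare(keys):
    """One O(n) pre-pass over the keys, then return a compare function.

    Homogeneous float keys get a compare that skips per-call safety
    checks; anything else falls back to the generic compare (the
    analogue of PyObject_RichCompareBool).
    """
    def generic_lt(v, w):
        return v < w             # full per-comparison safety checks

    if not keys:
        return generic_lt
    key_type = type(keys[0])
    # Step (1): the single-pass homogeneity check.
    if any(type(k) is not key_type for k in keys[1:]):
        return generic_lt        # assumption 1 failed: no optimization

    # Step (2): dispatch a specialized compare.
    if key_type is float:
        return float.__lt__      # type checks were already done once
    # (the real patch has analogous branches for str, int, and tuple)
    return generic_lt

assert pick_compare([2.0, 1.0])(1.0, 2.0) is True   # specialized path
assert pick_compare([1, 2.0])(1, 2.0) is True       # mixed: generic fallback
```

The pre-pass costs O(n) while the comparisons cost O(n log n), which is exactly why the one-time check amortizes well on anything but tiny heterogeneous lists.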