On Sun, Mar 5, 2017 at 10:51 AM David Mertz <me...@gnosis.cx> wrote: > Thanks for the hard work, this looks very promising. >
Thank you! > Real world data tends to be mostly-sorted. So it would be useful for your > benchmarks to include: > > A) Performance on completely sorted data > i) Of homogenous type > ii) Of mixed type > B) Performance on mostly-sorted data > i) Of homogenous type > ii) Of mixed type > You are entirely correct, as the benchmarks below demonstrate. I used the benchmark lists from Objects/listsort.txt, which are: \sort: descending data /sort: ascending data 3sort: ascending, then 3 random exchanges +sort: ascending, then 10 random at the end %sort: ascending, then randomly replace 1% of elements w/ random values ~sort: many duplicates =sort: all equal My results are below (the script can be found at https://github.com/embg/python-fastsort-benchmark/blob/master/bench-structured.py ): Homogeneous ([int]): \sort: 54.6% /sort: 56.5% 3sort: 53.5% +sort: 55.3% %sort: 52.4% ~sort: 48.0% =sort: 45.2% Heterogeneous ([int]*n + [0.0]): \sort: -17.2% /sort: -19.8% 3sort: -18.0% +sort: -18.8% %sort: -10.0% ~sort: -2.1% =sort: -21.3% As you can see, because there's a lot less non-comparison overhead in the structured lists, the impact of the optimization is much greater, both in performance gain and in worst-case cost. However, I would argue that these data do not invalidate the utility of my patch: the probability of encountering a type-heterogeneous list is certainly less than 5% in practice. So the expected savings, even for structured lists, is something like (5%)(-20%) + (95%)(50%) = 46.5%. And, of course, not *all* the lists one encounters in practice are structured; certainly not *this* structured. So, overall, I would say the numbers above are extremely encouraging. Thanks for pointing out the need for this benchmark, though! Elliot
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/