On Thu, Mar 9, 2017 at 2:04 AM, Barry Scott <ba...@barrys-emacs.org> wrote:
> It was not clear to me that you need to scan the list at the start to make > sure its homogeneous. Given that the type check is so cheap will it > slow the sort if you do the pointer check in the compare code? I am not > suggesting you run rich compare full fat on each compare. > I think you have a point here. IIUC, the current code makes no assumptions about type homogeneity. So it must do something generic at each compare. Perhaps that generic thing (of course, Elliot knows that is) does do a pointer compare, and then something smart and fast for some built-ins. but there is still a few steps: these are both the same type what type are they how do I compare them? do the compare These may well all be fast, but it's still a few steps, and have to be done O(n*log(n)) times IIUC, Elliot's patch does something like: First go through the whole list and do: What is the type of the first item (only once) How do I compare that type (only once) Is everything else in the list the same type ( O(n) ) Then the actual sort: - do the compare ( O(n*log(n)) ) So there is one operation done n times, and one done n*log(n) times, rather than 4 done n*log(n) times, if the constants are about the same, that saves 3n*log(n) - n operations, or -- if n is non-trivial, then n*3*log(n) operations so we go from 4*n*log(n) to n*log(n) about a 4 times speed up -- in theory. with all the other complications of computer performance, only profiling will tell you! (also, it's not 4 times on the whole sort -- just the compare part -- you still need to shuffle the values around, which presumably takes at last as long as each of the above operations) (not sure about all four of those steps, it may be only three -- but still a 3 time speed up) Now Barry's idea: Assume that the list is homogenous, and crap out if it turns out it's not: Start off: What is the type of the first item (only once) How do I compare that type (only once) Do the sort: Is the next item the correct type? O(n*log(n)) Do the compare ( O(n*log(n)) ) So we now have 2 operations to be run O(n*log(n)) -- so not as good at only one, but still better than the 3 or 4 of the fully general sort. And this would have less impact on the non-homogenous case. Though Elliot points out that this would be much harder to branch-predict, etc... so maybe not. Maybe worth trying out and profiling?? NOTE: I really have no idea what really goes in in that code -- so I may have this all wrong... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/