Re: How to make Python run as fast (or faster) than Julia
On Thu, Mar 08, 2018 at 08:44:16AM +1100, Chris Angelico wrote:
> On Thu, Mar 8, 2018 at 8:36 AM, Python wrote:
> > On Mon, Mar 05, 2018 at 04:09:48PM -0800, Dan Stromberg wrote:
> >> On Mon, Mar 5, 2018 at 3:53 PM, Python wrote:
> >> > On Sat, Mar 03, 2018 at 08:18:03AM +1100, Chris Angelico wrote:
> >> >> > Python is often a preferred solution because it is often fantastic for
> >> >> > rapid implementation and maintainability. The GIL's interference
> >> >> > with threaded code performance has, for me at least, on several
> >> >> > occasions been... disappointing (perf costs of removing it aside)
> >> >> > because it gets in the way of choosing Python for such solutions.
> >> >> > Jython and IronPython are simply not feasible options for me, for
> >> >> > multiple reasons that have zero to do with their technical
> >> >> > suitability.
> >> >>
> >> >> Have you actually tried it and run into problems,
> >> >
> >> > Yes. It was years ago and I forget the details, but I even posted
> >> > some sample code here and was told (quite possibly by you) that it was
> >> > the GIL that was eating my lunch. Someone suggested writing the bits
> >> > I wanted to thread as a C extension, which largely defeated the
> >> > purpose of using Python. In at least one case I just used C++, and in
> >> > another I just ignored the problem until it went away.
> >>
> >> So how about a little Cython?
> >
> > Nope. As a practical matter, the only Python option I have is
> > Python proper (i.e. CPython).
>
> So you can use C++, but only if you write the entire program in it,
> instead of using Cython and only transforming the part that's actually
> slow?

That describes the situation closely enough to correct that I don't feel
the need to elaborate, yes. I will say that /in theory/ I can do
whatever I want; in practice there are significant barriers to
integrating new tools into the build system I'm required to use,
barriers that make doing so prohibitive.
There are complexities there which I don't care to describe, but the
point remains: effectively, the only Python option I have available is
CPython.
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Thu, Mar 8, 2018 at 8:36 AM, Python wrote:
> On Mon, Mar 05, 2018 at 04:09:48PM -0800, Dan Stromberg wrote:
>> On Mon, Mar 5, 2018 at 3:53 PM, Python wrote:
>> > On Sat, Mar 03, 2018 at 08:18:03AM +1100, Chris Angelico wrote:
>> >> > Python is often a preferred solution because it is often fantastic for
>> >> > rapid implementation and maintainability. The GIL's interference
>> >> > with threaded code performance has, for me at least, on several
>> >> > occasions been... disappointing (perf costs of removing it aside)
>> >> > because it gets in the way of choosing Python for such solutions.
>> >> > Jython and IronPython are simply not feasible options for me, for
>> >> > multiple reasons that have zero to do with their technical
>> >> > suitability.
>> >>
>> >> Have you actually tried it and run into problems,
>> >
>> > Yes. It was years ago and I forget the details, but I even posted
>> > some sample code here and was told (quite possibly by you) that it was
>> > the GIL that was eating my lunch. Someone suggested writing the bits
>> > I wanted to thread as a C extension, which largely defeated the
>> > purpose of using Python. In at least one case I just used C++, and in
>> > another I just ignored the problem until it went away.
>>
>> So how about a little Cython?
>
> Nope. As a practical matter, the only Python option I have is
> Python proper (i.e. CPython).

So you can use C++, but only if you write the entire program in it,
instead of using Cython and only transforming the part that's actually
slow?

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Mon, Mar 05, 2018 at 04:09:48PM -0800, Dan Stromberg wrote:
> On Mon, Mar 5, 2018 at 3:53 PM, Python wrote:
> > On Sat, Mar 03, 2018 at 08:18:03AM +1100, Chris Angelico wrote:
> >> > Python is often a preferred solution because it is often fantastic for
> >> > rapid implementation and maintainability. The GIL's interference
> >> > with threaded code performance has, for me at least, on several
> >> > occasions been... disappointing (perf costs of removing it aside)
> >> > because it gets in the way of choosing Python for such solutions.
> >> > Jython and IronPython are simply not feasible options for me, for
> >> > multiple reasons that have zero to do with their technical
> >> > suitability.
> >>
> >> Have you actually tried it and run into problems,
> >
> > Yes. It was years ago and I forget the details, but I even posted
> > some sample code here and was told (quite possibly by you) that it was
> > the GIL that was eating my lunch. Someone suggested writing the bits
> > I wanted to thread as a C extension, which largely defeated the
> > purpose of using Python. In at least one case I just used C++, and in
> > another I just ignored the problem until it went away.
>
> So how about a little Cython?

Nope. As a practical matter, the only Python option I have is
Python proper (i.e. CPython).
Re: How to make Python run as fast (or faster) than Julia
On Mon, Mar 5, 2018 at 3:53 PM, Python wrote:
> On Sat, Mar 03, 2018 at 08:18:03AM +1100, Chris Angelico wrote:
>> > Python is often a preferred solution because it is often fantastic for
>> > rapid implementation and maintainability. The GIL's interference
>> > with threaded code performance has, for me at least, on several
>> > occasions been... disappointing (perf costs of removing it aside)
>> > because it gets in the way of choosing Python for such solutions.
>> > Jython and IronPython are simply not feasible options for me, for
>> > multiple reasons that have zero to do with their technical
>> > suitability.
>>
>> Have you actually tried it and run into problems,
>
> Yes. It was years ago and I forget the details, but I even posted
> some sample code here and was told (quite possibly by you) that it was
> the GIL that was eating my lunch. Someone suggested writing the bits
> I wanted to thread as a C extension, which largely defeated the
> purpose of using Python. In at least one case I just used C++, and in
> another I just ignored the problem until it went away.

So how about a little Cython? It has decent GIL control, isn't much
different from Python syntactically, and can be used to create C
extension modules callable from CPython. It allows you to pretty freely
intermix Python data types and C data types - just be careful about
implicit conversions from one to the other - they can slow things down.
Re: How to make Python run as fast (or faster) than Julia
On Sat, Mar 03, 2018 at 08:18:03AM +1100, Chris Angelico wrote:
> > Python is often a preferred solution because it is often fantastic for
> > rapid implementation and maintainability. The GIL's interference
> > with threaded code performance has, for me at least, on several
> > occasions been... disappointing (perf costs of removing it aside)
> > because it gets in the way of choosing Python for such solutions.
> > Jython and IronPython are simply not feasible options for me, for
> > multiple reasons that have zero to do with their technical
> > suitability.
>
> Have you actually tried it and run into problems,

Yes. It was years ago and I forget the details, but I even posted
some sample code here and was told (quite possibly by you) that it was
the GIL that was eating my lunch. Someone suggested writing the bits
I wanted to thread as a C extension, which largely defeated the
purpose of using Python. In at least one case I just used C++, and in
another I just ignored the problem until it went away.
Re: How to make Python run as fast (or faster) than Julia
On Fri, 02 Mar 2018 14:45:55 -0600, Python wrote:
> On Mon, Feb 26, 2018 at 09:57:06AM +, Steven D'Aprano wrote:
>> Besides, if you want Python with no GIL so you can run threaded code,
>> why aren't you using IronPython or Jython?
>
> But this is just another oversimplified argument. In the real world
> there rather often exist constraints which have nothing to do with what
> is technically possible, but which make impractical or infeasible what
> may be technically possible.

Of course it is simplified. As is the argument that the GIL is the
source of all evil, er, why Python threading is slow.

In reality, we may have good reasons for not using IronPython or
Jython, such as the desire for the latest 3.x version, or because we
rely on C external libraries that don't work under the CLR or JVM.

[...]

> But back to the point of the thread, I think it's ironic that you're
> basically making the benchmarks' argument: Choose the tool which suits
> the solution you need. You argue we should use IronPython or Jython
> when threads are needed, and the Julia benchmarks *imply* (but never
> actually argue, AFAICT) that if you need a solution which relies on
> heavy use of function calls, since Python will be slower than Julia,
> you should therefore choose Julia instead.

Indeed. While I haven't said so explicitly in this thread, I agree
entirely with everything you say here.

One of the things I'm very interested in is the possibility of writing
number-crunching code in Julia and calling it from Python. I haven't
tried it yet, but it is on my bucket-list :-)

> But that's only a sensible
> conclusion if all other things be equal, which of course they are not,
> neither for Julia vs. Python, nor for Python vs. Jython or IronPython
> (though likely the latter case is slightly closer to being true).

There are always constraints on what language you might use. Foo-lang
might be the best language for the job, but since I don't know the
language (or have even heard of it), I can't use it.
You may be constrained by familiarity, shop politics, customer
requirements, the availability of staff, personal preferences, legal
requirements, etc. But taking all of those things into consideration,
the conclusion still fits: use the language that suits you and the job
best. If that language is Python, then this is how you can get the best
performance if and when needed.

--
Steve
Re: How to make Python run as fast (or faster) than Julia
On Sat, Mar 3, 2018 at 7:45 AM, Python wrote:
> On Mon, Feb 26, 2018 at 09:57:06AM +, Steven D'Aprano wrote:
>> Besides, if you want Python with no GIL so you can run threaded code, why
>> aren't you using IronPython or Jython?
>
> But this is just another oversimplified argument. In the real world
> there rather often exist constraints which have nothing to do with
> what is technically possible, but which make impractical or infeasible
> what may be technically possible.
>
> Python is often a preferred solution because it is often fantastic for
> rapid implementation and maintainability. The GIL's interference
> with threaded code performance has, for me at least, on several
> occasions been... disappointing (perf costs of removing it aside)
> because it gets in the way of choosing Python for such solutions.
> Jython and IronPython are simply not feasible options for me, for
> multiple reasons that have zero to do with their technical
> suitability.

Have you actually tried it and run into problems, or do you succumb to
the FUD and give up before implementing? I can think of only one time
when I tried to speed up a program by multithreading it and was unable
to due to locking. (And actually, it wasn't even the GIL that was the
problem - it was a non-thread-safe library I was calling on, and *that*
meant everything serialized.)

"One time, people!" -- Wasabi

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Mon, Feb 26, 2018 at 09:57:06AM +, Steven D'Aprano wrote:
> Besides, if you want Python with no GIL so you can run threaded code, why
> aren't you using IronPython or Jython?

But this is just another oversimplified argument. In the real world
there rather often exist constraints which have nothing to do with what
is technically possible, but which make impractical or infeasible what
may be technically possible.

Python is often a preferred solution because it is often fantastic for
rapid implementation and maintainability. The GIL's interference with
threaded code performance has, for me at least, on several occasions
been... disappointing (perf costs of removing it aside) because it gets
in the way of choosing Python for such solutions. Jython and IronPython
are simply not feasible options for me, for multiple reasons that have
zero to do with their technical suitability.

But back to the point of the thread, I think it's ironic that you're
basically making the benchmarks' argument: Choose the tool which suits
the solution you need. You argue we should use IronPython or Jython
when threads are needed, and the Julia benchmarks *imply* (but never
actually argue, AFAICT) that if you need a solution which relies on
heavy use of function calls, since Python will be slower than Julia,
you should therefore choose Julia instead. But that's only a sensible
conclusion if all other things be equal, which of course they are not,
neither for Julia vs. Python, nor for Python vs. Jython or IronPython
(though likely the latter case is slightly closer to being true).
Re: How to make Python run as fast (or faster) than Julia
On Tue, 27 Feb 2018 11:27:21 +, bartc wrote:

[...]
>> The built-in generator is using a completely different algorithm
>> though, so rate of generation isn't the entire story. How long is the
>> period of the one you're using? (How long before it loops?)
>
> I believe it's 5*2**1320480*(2**64-1) according to the author's comment.
>
> I haven't tested that.

What, you haven't run a full period of the generator? I thought your
code was fast! *wink*

Sorry, I couldn't resist... :-)

But for the record, if we could generate a million million numbers per
second, that is one every picosecond, the total time would be of the
order of 10**397504 years. That's a one followed by three hundred
ninety-seven thousand zeroes.

> (By looping I understand that to mean, before the same sequence starts
> again. Because as the numbers are 64 bits, individual numbers will
> inevitably be repeated from time to time.)

Yes, that's what the period means: how long it takes for the entire
sequence to repeat, not just a single number.

--
Steve
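Steven's order-of-magnitude figure can be checked with a few lines of Python. This is my own back-of-envelope calculation, not code from the thread; it works in log10 space to avoid materialising the 1.3-million-bit period, and approximates 2**64 - 1 by 2**64.

```python
# Sanity-check the claim: at one number per picosecond, exhausting a
# period of 5 * 2**1320480 * (2**64 - 1) takes ~10**397504 years.
import math

# log10 of the claimed period (2**64 - 1 approximated by 2**64)
log10_period = math.log10(5) + 1320480 * math.log10(2) + 64 * math.log10(2)

rate = 1e12                        # numbers per second (one per picosecond)
secs_per_year = 60 * 60 * 24 * 365.25

log10_years = log10_period - math.log10(rate) - math.log10(secs_per_year)
print(int(log10_years))  # → 397504
```

The period itself is about 10**397524, so even a million-fold faster generator would barely dent the exponent.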
Re: How to make Python run as fast (or faster) than Julia
On 27/02/2018 02:27, Chris Angelico wrote:
> On Tue, Feb 27, 2018 at 12:57 PM, bartc wrote:
>> On 27/02/2018 00:35, Chris Angelico wrote:
>> Anyway, even this pure Python version can deliver pseudo random
>> numbers at some 200,000 per second, while the built-in generator does
>> 450,000 per second, so it's not bad going.
>
> The built-in generator is using a completely different algorithm
> though, so rate of generation isn't the entire story. How long is the
> period of the one you're using? (How long before it loops?)

I believe it's 5*2**1320480*(2**64-1) according to the author's comment.

I haven't tested that.

(By looping I understand that to mean, before the same sequence starts
again. Because as the numbers are 64 bits, individual numbers will
inevitably be repeated from time to time.)

--
Bartc
Re: How to make Python run as fast (or faster) than Julia
Christian Gollwitzer writes:

> George Marsaglia is a researcher who invented a couple of PRNGs which
> are both simple (one of the first was called KISS) yet surprisingly
> good.

s/is/was/

Sadly, he died a few years ago (2011).

--
Ben.
Re: How to make Python run as fast (or faster) than Julia
On 27.02.18 at 03:27, Chris Angelico wrote:
> On Tue, Feb 27, 2018 at 12:57 PM, bartc wrote:
>> Anyway, even this pure Python version can deliver pseudo random
>> numbers at some 200,000 per second, while the built-in generator does
>> 450,000 per second, so it's not bad going.
>
> The built-in generator is using a completely different algorithm
> though, so rate of generation isn't the entire story. How long is the
> period of the one you're using? (How long before it loops?) If you
> churn the generator to an arbitrary number and then take the next 100
> values it generates, are they randomly distributed?

George Marsaglia is a researcher who invented a couple of PRNGs which
are both simple (one of the first was called KISS) yet surprisingly
good. See here:

https://en.wikipedia.org/wiki/KISS_(algorithm)

They possess all these desirable properties you mention above and pass
the DIEHARD random number generator tests (also invented by Marsaglia).

> Can you reconstruct the RNG's internal state by watching it generate
> numbers for a short while?

https://eprint.iacr.org/2011/007.pdf

> Obviously no PRNG is going to be perfect at this, but there are
> certainly degrees of quality to be compared.

It is a field of research and the generator shown by Bart is one of the
better generators constructed by an expert in the field.

Christian
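For readers who haven't seen KISS, here is a pure-Python sketch of the combined generator Christian describes. This is my own rendering of the KISS99 recurrences (a multiply-with-carry pair, a 3-shift register, and a linear congruential generator, all modulo 2**32); treat the exact shift constants and default seeds as illustrative rather than authoritative.

```python
# A sketch of Marsaglia's KISS generator (KISS99-style constants).
# Python ints are unbounded, so each step is masked back to 32 bits.
M32 = 0xFFFFFFFF

class KISS:
    def __init__(self, z=362436069, w=521288629,
                 jsr=123456789, jcong=380116160):
        self.z, self.w, self.jsr, self.jcong = z, w, jsr, jcong

    def next_u32(self):
        # Multiply-with-carry pair
        self.z = (36969 * (self.z & 0xFFFF) + (self.z >> 16)) & M32
        self.w = (18000 * (self.w & 0xFFFF) + (self.w >> 16)) & M32
        mwc = ((self.z << 16) + self.w) & M32
        # 3-shift register (xorshift)
        self.jsr ^= (self.jsr << 17) & M32
        self.jsr ^= self.jsr >> 13
        self.jsr ^= (self.jsr << 5) & M32
        # Linear congruential generator
        self.jcong = (69069 * self.jcong + 1234567) & M32
        # Combine the three streams
        return ((mwc ^ self.jcong) + self.jsr) & M32
```

The masking on every step is exactly the "overflow into undesirable bignums" issue discussed elsewhere in the thread: in C the wraparound is free, while in Python it must be done explicitly.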
Re: How to make Python run as fast (or faster) than Julia
On Tue, Feb 27, 2018 at 12:57 PM, bartc wrote:
> On 27/02/2018 00:35, Chris Angelico wrote:
>>
>> On Tue, Feb 27, 2018 at 11:17 AM, Steven D'Aprano wrote:
>>>
>>> On Tue, 27 Feb 2018 02:09:53 +1100, Chris Angelico wrote:
>>>
>>>> You're still reimplementing the C code in Python, which is
>>>> inefficient. Have you considered going back to the *actual
>>>> algorithm* and implementing that idiomatically in Python? I think
>>>> you'll find that (a) the code is more readable, and (b) the run
>>>> time is much lower.
>>>
>>> Chris, I think this is incredibly optimistic, if not naive. We're
>>> talking about a PRNG by Marsaglia, so my guess is that the "original
>>> algorithm" *is* the C code. Or possibly Fortran.
>>>
>>> Even if not, even if there is actually a language-neutral algorithm,
>>> it's a PRNG which means it's going to be filled with bit-twiddling
>>> and number-crunching operations. Pure Python without a JIT is never
>>> going to be competitive with C or Fortran for code like that.
>>
>> I may have been a little unclear. It's highly unlikely that the run
>> time of the properly-implemented Python code will be lower than the
>> original C or Fortran. But it most certainly CAN be more efficient
>> than the Python reimplementation of the C implementation, which would
>> narrow the gap considerably. Implementing Python idiotically instead
>> of idiomatically gives you suboptimal performance.
>
> Nonsense. You shouldn't need to care about such things. An algorithm
> is an algorithm. And the better ones should be easily written in any
> language.
>
> This particular one, which was of interest because the calculations
> tended to overflow into undesirable bignums, is just a bunch of
> calculations. 'Bit-twiddling'.
>
> Anyway, even this pure Python version can deliver pseudo random
> numbers at some 200,000 per second, while the built-in generator does
> 450,000 per second, so it's not bad going.
The built-in generator is using a completely different algorithm
though, so rate of generation isn't the entire story. How long is the
period of the one you're using? (How long before it loops?) If you
churn the generator to an arbitrary number and then take the next 100
values it generates, are they randomly distributed? Can you reconstruct
the RNG's internal state by watching it generate numbers for a short
while?

Obviously no PRNG is going to be perfect at this, but there are
certainly degrees of quality to be compared.

ChrisA
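The "churn, then check the distribution" test ChrisA proposes can be sketched in a few lines. This is my own illustration using CPython's built-in Mersenne Twister; it is a crude bucket-count eyeball test, not a substitute for a real suite like DIEHARD.

```python
# Churn the generator to an arbitrary point, then see whether the
# values that follow spread evenly across ten buckets.
import random

rng = random.Random(12345)      # fixed seed for reproducibility
for _ in range(10_000):         # churn past the starting state
    rng.random()

n = 100_000
buckets = [0] * 10
for _ in range(n):
    buckets[int(rng.random() * 10)] += 1

# Each bucket should hold roughly n/10 = 10,000 draws
print(buckets)
```

For a decent generator every bucket lands within a few standard deviations (about 95 here) of 10,000; a generator that fails even this coarse check is badly broken.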
Re: How to make Python run as fast (or faster) than Julia
On 27/02/2018 00:35, Chris Angelico wrote:
> On Tue, Feb 27, 2018 at 11:17 AM, Steven D'Aprano wrote:
>> On Tue, 27 Feb 2018 02:09:53 +1100, Chris Angelico wrote:
>>> You're still reimplementing the C code in Python, which is
>>> inefficient. Have you considered going back to the *actual
>>> algorithm* and implementing that idiomatically in Python? I think
>>> you'll find that (a) the code is more readable, and (b) the run time
>>> is much lower.
>>
>> Chris, I think this is incredibly optimistic, if not naive. We're
>> talking about a PRNG by Marsaglia, so my guess is that the "original
>> algorithm" *is* the C code. Or possibly Fortran.
>>
>> Even if not, even if there is actually a language-neutral algorithm,
>> it's a PRNG which means it's going to be filled with bit-twiddling
>> and number-crunching operations. Pure Python without a JIT is never
>> going to be competitive with C or Fortran for code like that.
>
> I may have been a little unclear. It's highly unlikely that the run
> time of the properly-implemented Python code will be lower than the
> original C or Fortran. But it most certainly CAN be more efficient
> than the Python reimplementation of the C implementation, which would
> narrow the gap considerably. Implementing Python idiotically instead
> of idiomatically gives you suboptimal performance.

Nonsense. You shouldn't need to care about such things. An algorithm is
an algorithm. And the better ones should be easily written in any
language.

This particular one, which was of interest because the calculations
tended to overflow into undesirable bignums, is just a bunch of
calculations. 'Bit-twiddling'.

Anyway, even this pure Python version can deliver pseudo random numbers
at some 200,000 per second, while the built-in generator does 450,000
per second, so it's not bad going.

Of course, the C version will generate them at over 100 million per
second.

--
bartc
Re: How to make Python run as fast (or faster) than Julia
On Tue, Feb 27, 2018 at 11:17 AM, Steven D'Aprano wrote:
> On Tue, 27 Feb 2018 02:09:53 +1100, Chris Angelico wrote:
>
>> You're still reimplementing the C code in Python, which is inefficient.
>> Have you considered going back to the *actual algorithm* and
>> implementing that idiomatically in Python? I think you'll find that (a)
>> the code is more readable, and (b) the run time is much lower.
>
> Chris, I think this is incredibly optimistic, if not naive. We're
> talking about a PRNG by Marsaglia, so my guess is that the "original
> algorithm" *is* the C code. Or possibly Fortran.
>
> Even if not, even if there is actually a language-neutral algorithm,
> it's a PRNG which means it's going to be filled with bit-twiddling and
> number-crunching operations. Pure Python without a JIT is never going
> to be competitive with C or Fortran for code like that.

I may have been a little unclear. It's highly unlikely that the run
time of the properly-implemented Python code will be lower than the
original C or Fortran. But it most certainly CAN be more efficient
than the Python reimplementation of the C implementation, which would
narrow the gap considerably. Implementing Python idiotically instead
of idiomatically gives you suboptimal performance.

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Tue, 27 Feb 2018 02:09:53 +1100, Chris Angelico wrote:

> You're still reimplementing the C code in Python, which is inefficient.
> Have you considered going back to the *actual algorithm* and
> implementing that idiomatically in Python? I think you'll find that (a)
> the code is more readable, and (b) the run time is much lower.

Chris, I think this is incredibly optimistic, if not naive. We're
talking about a PRNG by Marsaglia, so my guess is that the "original
algorithm" *is* the C code. Or possibly Fortran.

Even if not, even if there is actually a language-neutral algorithm,
it's a PRNG which means it's going to be filled with bit-twiddling and
number-crunching operations. Pure Python without a JIT is never going
to be competitive with C or Fortran for code like that.

--
Steve
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 20:27, bartc wrote:
> On 26/02/2018 19:50, Chris Angelico wrote:
>> On Tue, Feb 27, 2018 at 6:37 AM, Rick Johnson wrote:
>>> So what? Latency is latency. And whether it occurs over the course
>>> of one heavily recursive algorithm that constitutes the depth and
>>> breadth of an entire program (a la fib()), or it is the incremental
>>> cumulative consequence of the entire program execution, the fact
>>> remains that function call overhead contributes to a significant
>>> portion of the latency inherent in some trivial, and *ALL*
>>> non-trivial, modern software.

[Sorry, the bit of your (Chris') post I replied to got chopped by
mistake. Here is my post again with the right context:]

CA:
> By saying "the fact remains", you're handwaving away the work of
> actually measuring that function call overhead is "significant". Can
> you show me those numbers? Steve's point is that it is NOT
> significant, because non-toy functions have non-trivial bodies. If you
> wish to disagree, you have to demonstrate that the function call is
> *itself* costly, even when there's a real body to it.

Take the last bit of Python I posted, which was that RNG test. It uses
this function:

def i64(x): return x & 0xFFFFFFFFFFFFFFFF

This is a small function, but it can't be a toy one, as was suggested
by Ned Batchelder. And the test program wasn't at all recursive.

Running the program with N=10_000_000 took 46 seconds. Replacing the
i64() call with 'x & m' (where m is a global set to
0xFFFFFFFFFFFFFFFF), it took 38 seconds. So putting that fix into a
function, which is convenient for coding, maintenance and readability,
cost 20% in runtime.

Going the other way, having i64() call another function i64a() which
does the work, common when using wrapper functions, it took 60 seconds.
A 30% slowdown, even though most of the work is doing numeric shifts
and adds.

So function calls do seem to be expensive operations in CPython, and
any benchmarks which highlight that fact should be welcomed.
(Note that none of this seems to affect PyPy, which ran at the same
speed with i64(), without it, or with both i64() and i64a().)

--
bartc
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 19:50, Chris Angelico wrote:
> On Tue, Feb 27, 2018 at 6:37 AM, Rick Johnson wrote:
>> So what? Latency is latency. And whether it occurs over the course of
>> one heavily recursive algorithm that constitutes the depth and
>> breadth of an entire program (a la fib()), or it is the incremental
>> cumulative consequence of the entire program execution, the fact
>> remains that function call overhead contributes to a significant
>> portion of the latency inherent in some trivial, and *ALL*
>> non-trivial, modern software.

Take the last bit of Python I posted, which was that RNG test. It uses
this function:

def i64(x): return x & 0xFFFFFFFFFFFFFFFF

This is a small function, but it can't be a toy one, as was suggested
by Ned Batchelder. And the test program wasn't at all recursive.

Running the program with N=10_000_000 took 46 seconds. Replacing the
i64() call with 'x & m' (where m is a global set to
0xFFFFFFFFFFFFFFFF), it took 38 seconds. So putting that fix into a
function, which is convenient for coding, maintenance and readability,
cost 20% in runtime.

Going the other way, having i64() call another function i64a() which
does the work, common when using wrapper functions, it took 60 seconds.
A 30% slowdown, even though most of the work is doing numeric shifts
and adds.

So function calls do seem to be expensive operations in CPython, and
any benchmarks which highlight that fact should be welcomed.

(Note that none of this seems to affect PyPy, which ran at the same
speed with i64(), without it, or with both i64() and i64a().)

--
bartc
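The inline-versus-helper comparison bartc describes is easy to reproduce. The sketch below is my own reconstruction of that kind of measurement, not his actual test program; absolute times will differ by machine, but the relative gap between the two loops is the point.

```python
# Compare masking to 64 bits inline versus through a small helper
# function, the shape of the overhead bartc measured.
import timeit

MASK = 0xFFFFFFFFFFFFFFFF

def i64(x):
    return x & MASK

def inline(n):
    total = 0
    for i in range(n):
        total = (total + i) & MASK   # mask applied inline
    return total

def with_call(n):
    total = 0
    for i in range(n):
        total = i64(total + i)       # same mask via a function call
    return total

n = 100_000
t_inline = timeit.timeit(lambda: inline(n), number=10)
t_call = timeit.timeit(lambda: with_call(n), number=10)
print(f"inline: {t_inline:.3f}s  with call: {t_call:.3f}s")
```

On CPython the called version is typically noticeably slower even though both loops do identical arithmetic, which is consistent with bartc's 20% figure; on PyPy the JIT inlines the helper and the difference largely disappears.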
Re: How to make Python run as fast (or faster) than Julia
On Tue, Feb 27, 2018 at 6:37 AM, Rick Johnson wrote:
> On Monday, February 26, 2018 at 4:39:22 AM UTC-6, Steven D'Aprano wrote:
>> On Sun, 25 Feb 2018 19:26:12 -0800, Rick Johnson wrote:
>>
>> > On Friday, February 23, 2018 at 8:48:55 PM UTC-6, Steven D'Aprano wrote:
>> > [...]
>> > > Take the Fibonacci double-recursion benchmark. Okay, it
>> > > tests how well your language does at making millions of
>> > > function calls. Why?
>> >
>> > Because making "millions of function calls" is what
>> > software *DOES*!
>>
>> You're right -- I worded that badly, and ended up saying
>> something I didn't really mean. Mea culpa. Of course over
>> the course of the entire program run, most non-trivial
>> programs will end up having millions (if not billions)
>> of function calls *in total*. But they're not all happening
>> *together*.
>
> So what? Latency is latency. And whether it occurs over the
> course of one heavily recursive algorithm that constitutes
> the depth and breadth of an entire program (a la fib()), or
> it is the incremental cumulative consequence of the entire
> program execution, the fact remains that function call
> overhead contributes to a significant portion of the latency
> inherent in some trivial, and *ALL* non-trivial, modern
> software.

By saying "the fact remains", you're handwaving away the work of
actually measuring that function call overhead is "significant". Can
you show me those numbers? Steve's point is that it is NOT
significant, because non-toy functions have non-trivial bodies. If you
wish to disagree, you have to demonstrate that the function call is
*itself* costly, even when there's a real body to it.

> I'm very glad you brought this up, because it proves my main
> point that Python's maximum recursion error is one feature
> they got _right_. If you don't like the default recursion
> limit of 1000, or maybe 1000 recursions is breaking a valid
> recursive algorithm, you can set the recursion limit to
> whatever you like...
> ## Python 2 ##
> >>> import sys
> >>> sys.setrecursionlimit
> <built-in function setrecursionlimit>
> >>> sys.setrecursionlimit(10)
> >>> count = 1
> >>> def bad():
> ...     global count
> ...     print count
> ...     count += 1
> ...     bad()
> ...
> >>> bad()
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
>   File "<stdin>", line 5, in bad
> RuntimeError: maximum recursion depth exceeded
>
> It is however unfortunate that we get slammed with N
> recursive exception messages :-(, but hey, at least we can
> modify the maximum recursion limit! It's a step in the right
> direction. :-)

Ah, been a while since you tried Python 3, is it? :)

[Previous line repeated X more times]

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Monday, February 26, 2018 at 4:39:22 AM UTC-6, Steven D'Aprano wrote:
> On Sun, 25 Feb 2018 19:26:12 -0800, Rick Johnson wrote:
>
> > On Friday, February 23, 2018 at 8:48:55 PM UTC-6, Steven D'Aprano wrote:
> > [...]
> > > Take the Fibonacci double-recursion benchmark. Okay, it
> > > tests how well your language does at making millions of
> > > function calls. Why?
> >
> > Because making "millions of function calls" is what
> > software *DOES*!
>
> You're right -- I worded that badly, and ended up saying
> something I didn't really mean. Mea culpa. Of course over
> the course of the entire program run, most non-trivial
> programs will end up having millions (if not billions)
> of function calls *in total*. But they're not all happening
> *together*.

So what? Latency is latency. And whether it occurs over the
course of one heavily recursive algorithm that constitutes
the depth and breadth of an entire program (a la fib()), or
it is the incremental cumulative consequence of the entire
program execution, the fact remains that function call
overhead contributes to a significant portion of the latency
inherent in some trivial, and *ALL* non-trivial, modern
software.

> What I tried to get across is, how often do we use millions
> of function calls to get one value?

All the time! For, in the abstract, the fundamental function
of software is to extract a "value" from one or more inputs --
is it not? So, whether that "value" be: (1) a document in an
office productivity suite, or (2) a flight path for a mars
lander or a jumbo jet or child's quad-copter, or (3) a
modified image in an image processing program, or (4) a design
drawing or technical schematic for a mars rover bot, or (5) an
HD quality streaming porno flick, or yes, (6) even a recursive
function which unwittingly underscores some of the less
attractive aspects of the Python language (aka: function call
overhead!) -- all of these examples, and countless others,
extract a value from inputs.
Of course, much like physical traits I suppose, some "values" are more desirable than others.

> > By default, Python will only allow one thousand function
> > calls in a stack before raising RecursionError.

By _default_, yes.

> py> def bad():
> ... return bad()
> ...
> py> bad()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 2, in bad
>   [ previous line repeats 998 times ]
> RecursionError: maximum recursion depth exceeded
>
> There's nothing unique about recursion that will trigger
> this error. Any call stack of more than 1000 function calls
> will do it. So forget about code getting a million
> functions deep. The average is probably more like a dozen,
> perhaps a hundred tops.

I'm very glad you brought this up, because it proves my main point: Python's maximum recursion error is one feature they got _right_. If you don't like the default recursion limit of 1000, or maybe 1000 recursions is breaking a valid recursive algorithm, you can set the recursion limit to whatever you like...

## Python 2 ##

>>> import sys
>>> sys.setrecursionlimit
<built-in function setrecursionlimit>
>>> sys.setrecursionlimit(10)
>>> count = 1
>>> def bad():
...     global count
...     print count
...     count += 1
...     bad()
...
>>> bad()
1
2
3
4
5
6
7
8
9
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
  File "<stdin>", line 5, in bad
RuntimeError: maximum recursion depth exceeded

It is however unfortunate that we get slammed with N recursive exception messages :-(, but hey, at least we can modify the maximum recursion limit! It's a step in the right direction. :-)

> [...]
> > > For most application code, executing the function is far
> > > more costly than the overhead of calling it,
> >
> > What? Your functions are only called _once_ during the
> > lifetime of your code execution? Enough with the leg
> > pulling.
> > Let me explain what I intended to say. Here is a really > poor estimate of the overhead of calling a function on my > computer, measured in iPython. > > [...snip valid examples...] > > Now I know that this is not always the case. But I maintain > that for *most* (but not all) practical, real-world code > that actually does something useful, the overhead of > calling the functions is *usually* (but not always) low > compared to the cost of actually doing whatever it is that > the functions do. True. But again you failed to address my main point that: "Lazy features (when feasible) must be offered _optionally_". Yes, function call overhead may be tiny in relation to the code encapsulated within (at least for non-trivial code), but the fact remains that function calls are inherent in modern software, and as
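The snipped iPython measurement above is easy to reproduce with the standard timeit module. A rough sketch (the absolute numbers vary by machine and interpreter; the gap between the two timings is the per-call overhead being argued about):

```python
import timeit

# Time a bare expression, then the same expression wrapped in a function
# call. The difference between the two is roughly the per-call overhead.
inline_time = timeit.timeit("1 + 1", number=1_000_000)
call_time = timeit.timeit("f()", setup="def f(): return 1 + 1", number=1_000_000)

print(inline_time, call_time)
```

On a typical machine the called version takes several times longer per iteration, which supports both sides of the argument: the overhead is real, and it is also tiny in absolute terms unless the call count is enormous.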
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 17:05, Ben Bacarisse wrote: bartc writes: A C version is given below. (One I may have messed around with, which I'm not sure works properly. For an original, google for Marsaglia and KISS64 or SUPRKISS64.) The version I know uses unsigned integers. Did you change them to signed? Yeah, I don't know what I was doing with that version. For a Python version, go back to the original C and work from there. The original C makes confusing use of macros and comma operators. A version without macros or comma expressions is here (tweaked from generated C): https://pastebin.com/raw/k4jFK5TN This runs the 1-billion loop in 6 seconds. -- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Monday, February 26, 2018 at 3:59:40 AM UTC-6, Steven D'Aprano wrote: > On Sun, 25 Feb 2018 20:22:17 -0800, Rick Johnson wrote: > (We tried painting Go Faster stripes on the server, and it > didn't work.) Well of course the server won't work after you drip water-based paint all over the circuit boards. Way to go! In the future i would suggest draping some old linens over the electrical bits before throwing your paint around (just be sure to consult with the wife before rummaging through her linen closet, as men have been known to meet an untimely demise for doing much less). And might i suggest some skulls and faux flames? Hey, it may not make the thing go any faster, but it will surely make the other kids jealous. There's no reason why apple and microsoft should be allowed to suck up every last merchandising dollar. > There's no point suggesting such major changes to Python > that requires going back to the drawing board, to Python > 0.1 or earlier, and changing the entire execution and > memory model of the language. Absolutely not. Probably too late for that now. I'm only suggesting that we, when at all possible, provide a switch to disable these lazy-features in those rare cases when they do create bottlenecks. Lazy iterators and dict-views are great ideas, but i'm sure there are instances which prove them to be a PITA and even a bottleneck. -- https://mail.python.org/mailman/listinfo/python-list
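For what it's worth, something like the "switch" being asked for already exists for dict views: they are lazy by default, and list() gives an eager snapshot whenever that is what you want. A minimal illustration:

```python
d = {"a": 1, "b": 2}

live = d.keys()       # lazy view: stays in sync with the dict
snapshot = list(d)    # eager copy: frozen at this point

d["c"] = 3

print("c" in live)      # True  -- the view saw the new key
print("c" in snapshot)  # False -- the eager list did not
```

So for dict views at least, opting out of laziness is one builtin call away; the complaint really applies to constructs where no such escape hatch is convenient.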
Re: How to make Python run as fast (or faster) than Julia
bartc writes: > A C version is given below. (One I may have messed around with, which > I'm not sure works properly. For an original, google for Marsaglia and > KISS64 or SUPRKISS64.) The version I know uses unsigned integers. Did you change them to signed? For a Python version, go back to the original C and work from there. -- Ben. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 15:09, Chris Angelico wrote: On Tue, Feb 27, 2018 at 2:02 AM, bartc wrote: On 26/02/2018 14:04, bartc wrote: On 26/02/2018 13:42, Ned Batchelder wrote: Well, once you notice that the Python code had N=1e5, and the C code had N=1e9 :) If you want to experiment, with N=1e5, the final number should be 5255210926702073855. OK, I'll try that. I have that Python version working now. It's necessary to apply that masking function to wherever numbers can get bigger. I don't know how long a 1-billion loop will take, but a 10-million loop took 46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple of years ago. (And on Windows, which has a somewhat slower CPython than Linux.) You're still reimplementing the C code in Python, which is inefficient. Have you considered going back to the *actual algorithm* and implementing that idiomatically in Python? I think you'll find that (a) the code is more readable, and (b) the run time is much lower. Do you really think that? The algorithm seems to require this sequence of calculations to be gone through. Otherwise anything more efficient can also be done in any language. So how would idiomatic Python help, by seeing if such a routine already exists in some library? That wouldn't be playing the game. If it helps, I remember playing with a version in Algol 68 Genie (interpreted). It's still buggy, as the output is different, but it does about the same amount of work. A 10-million loop would take an estimated 1000 seconds, on the same machine where CPython took 46 seconds. So four magnitudes slower than C. -- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 2/26/18 10:09 AM, Chris Angelico wrote: On Tue, Feb 27, 2018 at 2:02 AM, bartcwrote: On 26/02/2018 14:04, bartc wrote: On 26/02/2018 13:42, Ned Batchelder wrote: Well, once you notice that the Python code had N=1e5, and the C code had N=1e9 :) If you want to experiment, with N=1e5, the final number should be 5255210926702073855. OK, I'll try that. I have that Python version working now. It's necessary to apply that masking function to wherever numbers can get bigger. I don't know how long a 1-billion loop will take, but a 10-million loop took 46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple of years ago. (And on Windows, which has a somewhat slower CPython than Linux.) You're still reimplementing the C code in Python, which is inefficient. Have you considered going back to the *actual algorithm* and implementing that idiomatically in Python? I think you'll find that (a) the code is more readable, and (b) the run time is much lower. Don't bother: C will win this kind of race every time. --Ned. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Tue, Feb 27, 2018 at 2:02 AM, bartcwrote: > On 26/02/2018 14:04, bartc wrote: >> >> On 26/02/2018 13:42, Ned Batchelder wrote: > > >> Well, once you notice that the >>> >>> Python code had N=1e5, and the C code had N=1e9 :) If you want to >>> experiment, with N=1e5, the final number should be 5255210926702073855. >> >> >> OK, I'll try that. > > > I have that Python version working now. It's necessary to apply that masking > function to wherever numbers can get bigger. > > I don't know how long a 1-billion loop will take, but a 10-million loop took > 46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple of years > ago. (And on Windows, which has a somewhat slower CPython than Linux.) You're still reimplementing the C code in Python, which is inefficient. Have you considered going back to the *actual algorithm* and implementing that idiomatically in Python? I think you'll find that (a) the code is more readable, and (b) the run time is much lower. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 14:04, bartc wrote: On 26/02/2018 13:42, Ned Batchelder wrote: Well, once you notice that the Python code had N=1e5, and the C code had N=1e9 :) If you want to experiment, with N=1e5, the final number should be 5255210926702073855. OK, I'll try that. I have that Python version working now. It's necessary to apply that masking function to wherever numbers can get bigger. I don't know how long a 1-billion loop will take, but a 10-million loop took 46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple of years ago. (And on Windows, which has a somewhat slower CPython than Linux.) Result should be x=11240129907685265998. By comparison, the C version compiled with -O3 took 0.11 seconds. (The C version I posted will work, if adjusted to a 1000 loop, but you have to change 'signed' to 'unsigned'. Apparently they weren't interchangeable after all. I've no idea why I used 'signed' there. That version is rather cryptic, but it can be better written and without the macros, and it will run just as fast. (Marsaglia may have been hot with random number routines, but his C could have done with some work...) My interpreter, using 64-bit numbers, managed 4.8 seconds. But unsigned arithmetic, which is uncommon, is not accelerated.) 
---------------------------------------------

Q = 0
carry = 36243678541
xcng = 12367890123456
xs = 521288629546311
indx = 20632

def i64(x): return x & 0xFFFFFFFFFFFFFFFF

def refill():
    global Q, carry, indx
    for i in range(20632):
        h = carry & 1
        z = i64((i64(Q[i]<<41)>>1) + (i64(Q[i]<<39)>>1) + (carry>>1))
        carry = i64((Q[i]>>23) + (Q[i]>>25) + (z>>63))
        Q[i] = i64(~(i64(i64(z<<1)+h)))
    indx = 1
    return Q[0]

def start():
    global Q, carry, xcng, xs, indx
    Q = [0,]*20632

    for i in range(20632):
        xcng = i64(6906969069 * xcng + 123)
        xs = i64(xs ^ (xs<<13))
        xs = i64(xs ^ (xs>>17))
        xs = i64(xs ^ (xs<<43))
        Q[i] = i64(xcng + xs)

    N = 1000
    for i in range(N):
        if indx < 20632:
            s = Q[indx]
            indx += 1
        else:
            s = refill()
        xcng = i64(6906969069 * xcng + 123)
        xs = i64(xs ^ (xs<<13))
        xs = i64(xs ^ (xs>>17))
        xs = i64(xs ^ (xs<<43))
        x = i64(s + xcng + xs)

    print ("Does x= 4013566000157423768")
    print ("      x=", x)

start()

-- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 14:34, Chris Angelico wrote: I'm glad _someone_ funded PyPy, anyhow. It's a great demonstration of what can be done. And it's good that /someone/ at least understands how PyPy works, as I don't think many people do. Apparently it doesn't optimise 'hot paths' within a Python program, but optimises hot paths in the special Python interpreter. One written in [R]Python. Or something... -- Bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Tue, Feb 27, 2018 at 12:03 AM, Steven D'Apranowrote: > On Mon, 26 Feb 2018 22:34:00 +1100, Chris Angelico wrote: >> Removing the GIL from CPython is not about "speeding up" the language or >> the interpreter, but about improving parallelism. > > It is about speeding up threaded code which is CPU-bound. I call that > speeding up the language :-) > > Hypothetically, it could also speed up even unthreaded code. In its purest sense, no, it cannot speed up unthreaded code. Since many Python programs are single-threaded, this restricts the benefits to those programs which can actually parallelize, but the penalties (mainly overhead) usually apply to all programs. > Some language features could internally launch threads and operate using > implicit parallelism. For instance, map(func, seq) could run in parallel > in separate threads whenever Python knew that func had no side-effects, > say for built-ins. There are also parallel algorithms for bignum > arithmetic. If threads were quick enough, I'm sure we could come up with > many more examples. Hypothetically speaking. These are ways of converting single-threaded code into multi-threaded code, which could then benefit from parallelization, which can reduce wall-time for the process as a whole. > (In practice, the old hope that parallel computing would speed everything > up turned out to be flawed. Relatively few tasks are embarrassingly > parallel, most have sequential bottlenecks, and many are inherently > sequential.) Exactly; and it goes further: many modern programmers do not think in terms of parallelism. Even event-driven code takes a bit of getting your head around. Why are async functions preferred over threads? It's not JUST because today's OSes have thread count limits far below socket limits - if that were the sole justification, we wouldn't have language-level support for async/await - it'd be a niche technique used only in the highest-throughput applications. 
No, async functions are preferred because they *reduce the points where context switching can occur*, thus making it easier *for the programmer*. If there were a way for the language to automatically run things in parallel, the biggest risk is that the programmer who wrote it can no longer understand it. >>> (If it weren't for the EU government funding it, PyPy would probably be >>> languishing in oblivion. Everyone wants a faster Python so long as they >>> don't have to pay for it, or do the work.) >> >> Hm, I didn't know the EU was funding PyPy. Okay. Cool. > > I don't know if they still receive funding, but I know that PyPy really > only got going in a big way when they got a grant from the EU. I think it > paid for at least one developer to work on it full time for a year. > DuckDuckGo is probably your friend if you care about the details. Meh, I don't care that much about what the EU does or doesn't fund. Was just a mild case of curiosity. I had about as much interest as my savings account accrued last year. I'm glad _someone_ funded PyPy, anyhow. It's a great demonstration of what can be done. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
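The "fewer context-switch points" argument can be made concrete with a minimal asyncio sketch (the coroutine names here are illustrative): control can only move between coroutines at an explicit await, so everything between awaits runs without interleaving.

```python
import asyncio

async def worker(name, log):
    log.append(name + " start")   # runs without interruption...
    await asyncio.sleep(0)        # ...until this explicit yield point
    log.append(name + " resume")

async def main():
    log = []
    # gather schedules both workers; switches happen only at the awaits
    await asyncio.gather(worker("a", log), worker("b", log))
    return log

log = asyncio.run(main())
print(log)
```

With preemptive threads the interleaving of the four appends would be up to the scheduler; here the programmer can see every point where it can happen.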
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 13:42, Ned Batchelder wrote: On 2/26/18 7:13 AM, bartc wrote: A C version is given below. (One I may have messed around with, which I'm not sure works properly. For an original, google for Marsaglia and KISS64 or SUPRKISS64.) Most integers are unsigned, which have well-defined overflow in C. With proper 64-bit masking (def only64(x): return x & 0xFFFFFFFFFFFFFFFF), the Python version produces the correct answer using a reasonable amount of memory. I did try something like that, but I must have missed something because I didn't get quite the same results as a working version. And with interpreted code, you tend not to test using loops of a billion iterations. Well, once you notice that the Python code had N=1e5, and the C code had N=1e9 :) If you want to experiment, with N=1e5, the final number should be 5255210926702073855. OK, I'll try that. Also, I note that you said, "Most integers are unsigned", but the C code has them all declared as signed? It doesn't seem to have mattered to your result, but I'm not an expert on C portability guarantees. The C code I first pasted used 'unsigned', but the main program logic wasn't right, and I found another version that looked better. That one used 'signed' for some reason, which I completely missed. Even if with C it works with either, the signed version might have 'undefined behaviour'. As said, google for the original; the ones I can see have 'unsigned'. But I can also see a Fortran version that just uses 'integer*8', which I believe is signed 64-bit. -- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 26/02/18 13:42, Ned Batchelder wrote: Also, I note that you said, "Most integers are unsigned", but the C code has them all declared as signed? It doesn't seem to have mattered to your result, but I'm not an expert on C portability guarantees. C explicitly leaves the behaviour of signed arithmetic overflow undefined, so you have no portability guarantees there. -- Rhodri James *-* Kynesim Ltd -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 2/26/18 7:13 AM, bartc wrote: On 26/02/2018 11:40, Chris Angelico wrote: On Mon, Feb 26, 2018 at 10:13 PM, bartcwrote: Below is the first draft of a Python port of a program to do with random numbers. (Ported from my language, which in turned ported it from a C program by George Marsaglia, the random number guy.) However, running it quickly exhausts the memory in my machine. The reason is that Python unhelpfully widens all results to bignums as needed. The code relies on calculations being modulo 2^64. Note that restricting integer ops to 64 bits probably still won't work, as I believe the numbers need to be unsigned. No, it's because the original implementation assumed integer wrap-around (at least, I think that's what's happening; I haven't analyzed the code in great detail). That means all your integer operations are doing two jobs: the one they claim to, and then a masking to 64 bits signed. That's two abstract operations that happen, due to the nature of the CPU, to work efficiently together. If you don't implement both halves of that in your Python port, you have failed at porting. What if you were porting a program from a 72-bit chip that assumed Binary Coded Decimal? Would you complain that C's integers are badly behaved? And that's without even asking whether a C program that assumes integer wrap-around counts as portable. At least with Python, you have a guarantee that integer operations are going to behave the same way on all compliant implementations of the language. A C version is given below. (One I may have messed around with, which I'm not sure works properly. For an original, google for Marsaglia and KISS64 or SUPRKISS64.) Most integers are unsigned, which have well-defined overflow in C (they just wrap as expected). In C, a mixed signed/unsigned op is performed as unsigned. 
---------------------------------------------

/* SUPRKISS64.c, period 5*2^1320480*(2^64-1) */
#include <stdio.h>
#include <stdlib.h>
#include "timer.c"

static signed long long Q[20632], carry=36243678541LL,
       xcng=12367890123456LL, xs=521288629546311LL, indx=20632;

#define CNG  ( xcng=6906969069LL*xcng+123 )
#define XS   ( xs^=xs<<13, xs^=xs>>17, xs^=xs<<43 )
#define SUPR ( indx<20632 ? Q[indx++] : refill() )
#define KISS SUPR+CNG+XS

signed long long refill(void) {
    int i;
    signed long long z, h;
    for (i=0; i<20632; i++) {
        h = (carry&1);
        z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1);
        carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63);
        Q[i] = ~((z<<1)+h);
    }
    indx = 1;
    return (Q[0]);
}

int main() {
    int i;
    signed long long x;
    for (i=0; i<20632; i++) Q[i] = CNG+XS;
    for (i=0; i<10; i++) x = KISS;
    printf("Does x=4013566000157423768\n x=%llu.\n", x);
}

With proper 64-bit masking (def only64(x): return x & 0xFFFFFFFFFFFFFFFF), the Python version produces the correct answer using a reasonable amount of memory. Well, once you notice that the Python code had N=1e5, and the C code had N=1e9 :) If you want to experiment, with N=1e5, the final number should be 5255210926702073855. Also, I note that you said, "Most integers are unsigned", but the C code has them all declared as signed? It doesn't seem to have mattered to your result, but I'm not an expert on C portability guarantees. --Ned. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Mon, 26 Feb 2018 22:34:00 +1100, Chris Angelico wrote: [...] >> (We tried painting Go Faster stripes on the server, and it didn't >> work.) > > Steven Middlename D'Aprano! You should know better than that. "It didn't > work" is not good enough. What actually happened? A truck crashed into the power pole outside our building and the power went off. Then the UPS ran down and the server crashed. Clearly this is a bug in the go-faster stripes. We tried reporting it: Expected behaviour: server goes faster. Actual behaviour: a truck crashes into the power pole across the street and the lights go out. but the maintainers closed the issue "Works for me". Very disappointed! [...] > Removing the GIL from CPython is not about "speeding up" the language or > the interpreter, but about improving parallelism. It is about speeding up threaded code which is CPU-bound. I call that speeding up the language :-) Hypothetically, it could also speed up even unthreaded code. Some language features could internally launch threads and operate using implicit parallelism. For instance, map(func, seq) could run in parallel in separate threads whenever Python knew that func had no side-effects, say for built-ins. There are also parallel algorithms for bignum arithmetic. If threads were quick enough, I'm sure we could come up with many more examples. Hypothetically speaking. (In practice, the old hope that parallel computing would speed everything up turned out to be flawed. Relatively few tasks are embarrassingly parallel, most have sequential bottlenecks, and many are inherently sequential.) >> (If it weren't for the EU government funding it, PyPy would probably be >> languishing in oblivion. Everyone wants a faster Python so long as they >> don't have to pay for it, or do the work.) > > Hm, I didn't know the EU was funding PyPy. Okay. Cool. I don't know if they still receive funding, but I know that PyPy really only got going in a big way when they got a grant from the EU. 
I think it paid for at least one developer to work on it full time for a year. DuckDuckGo is probably your friend if you care about the details. -- Steve -- https://mail.python.org/mailman/listinfo/python-list
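The hypothetical parallel map(func, seq) discussed above can already be approximated in today's Python with the stdlib concurrent.futures module. A sketch (for CPU-bound func, CPython's GIL still limits the thread-pool gains; the point here is the interface, not the speedup):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

# Executor.map has the same shape as the builtin map, but fans the calls
# out across worker threads, and still returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(10)))

print(results)
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor sidesteps the GIL entirely for CPU-bound functions, at the cost of pickling arguments and results between processes.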
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 12:06, Antoon Pardon wrote: On 23-02-18 02:27, Steven D'Aprano wrote: Why do you care about the 50 million calls? That's crazy -- the important thing is *calculating the Fibonacci numbers as efficiently as possible*. Not necessarily. David Beazley in his talks sometimes uses an inefficient algorithm for calculating fibonacci numbers because he needs something that uses the cpu intensively. Calculating the fibonacci numbers in that context as efficiently as possible would defeat that purpose. So in the context of a benchmark it is not unreasonable to assume those 50 million calls are the purpose, and not calculating the Fibonacci numbers as efficiently as possible. I don't think Steven is ever going to concede this point. Because Python performs badly compared to Julia or C, and it's not possible to conveniently offload the task to some fast library because it only uses a handful of primitive byte-codes. (I have the same trouble with my own interpreted language. Although somewhat brisker than CPython, it will always be much slower than a C-like language on such micro-benchmarks. But I accept that; I don't have an army of people working on acceleration projects and tracing JIT compilers. To those people however, such a benchmark can be a useful yardstick of progress.) -- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 24-02-18 06:18, Terry Reedy wrote: > On 2/23/2018 11:32 AM, Python wrote: >> >> Doing that would completely fail to accomplish the task of comparing >> the performance of recursive function calls in the two languages, >> which is what the benchmark was designed to do. > > That is a non-goal because *languages* do not have clock-time speeds, > only programs written in concrete implementations run on real hardware. > > Why do you think it fair to compare function call times using the > slowest rather than the fastest implementation of Python function > calls? Real Python programmers who are seriously concerned about time > try to use the fastest implementation appropriate to the situation. If the slowest happens to be the standard implementation I see nothing wrong with it. > > > So, no actually, it shouldn't. > > To me, it is 'not using the fastest Python' that fails to make apples > to apples comparisons. > > It has been said here that comparisons should use the same algorithm > doing much the same work in both implementations. However, a > Python function call does *much more work* than in most languages, > because the meaning of 'function call' is much more flexible in Python > than most languages. So? I don't see what is wrong with pointing out that all that extra work, which a lot of people don't care about, is costing performance. > The usual comparison is like lemons (other language calls) to oranges > (Python language calls, much more flexible). To be fair, the > comparison should be to a Python implementation that either notices or > accepts a declaration that, for instance, fib(n) only needs to pass an > int by position. > > Comparing int addition to python object addition also compares unlike > operations. To compare using the same addition algorithm, one must > use an implementation that can restrict '+' to just int addition. 
I don't agree. If the performance loss comes from the standard implementation of the language being unable to do that, then people testing languages shouldn't be burdened with searching for some implementation that can. -- Antoon Pardon -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 11:40, Chris Angelico wrote: On Mon, Feb 26, 2018 at 10:13 PM, bartcwrote: Below is the first draft of a Python port of a program to do with random numbers. (Ported from my language, which in turned ported it from a C program by George Marsaglia, the random number guy.) However, running it quickly exhausts the memory in my machine. The reason is that Python unhelpfully widens all results to bignums as needed. The code relies on calculations being modulo 2^64. Note that restricting integer ops to 64 bits probably still won't work, as I believe the numbers need to be unsigned. No, it's because the original implementation assumed integer wrap-around (at least, I think that's what's happening; I haven't analyzed the code in great detail). That means all your integer operations are doing two jobs: the one they claim to, and then a masking to 64 bits signed. That's two abstract operations that happen, due to the nature of the CPU, to work efficiently together. If you don't implement both halves of that in your Python port, you have failed at porting. What if you were porting a program from a 72-bit chip that assumed Binary Coded Decimal? Would you complain that C's integers are badly behaved? And that's without even asking whether a C program that assumes integer wrap-around counts as portable. At least with Python, you have a guarantee that integer operations are going to behave the same way on all compliant implementations of the language. A C version is given below. (One I may have messed around with, which I'm not sure works properly. For an original, google for Marsaglia and KISS64 or SUPRKISS64.) Most integers are unsigned, which have well-defined overflow in C (they just wrap as expected). In C, a mixed signed/unsigned op is performed as unsigned. 
---------------------------------------------

/* SUPRKISS64.c, period 5*2^1320480*(2^64-1) */
#include <stdio.h>
#include <stdlib.h>
#include "timer.c"

static signed long long Q[20632], carry=36243678541LL,
       xcng=12367890123456LL, xs=521288629546311LL, indx=20632;

#define CNG  ( xcng=6906969069LL*xcng+123 )
#define XS   ( xs^=xs<<13, xs^=xs>>17, xs^=xs<<43 )
#define SUPR ( indx<20632 ? Q[indx++] : refill() )
#define KISS SUPR+CNG+XS

signed long long refill(void) {
    int i;
    signed long long z, h;
    for (i=0; i<20632; i++) {
        h = (carry&1);
        z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1);
        carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63);
        Q[i] = ~((z<<1)+h);
    }
    indx = 1;
    return (Q[0]);
}

int main() {
    int i;
    signed long long x;
    for (i=0; i<20632; i++) Q[i] = CNG+XS;
    for (i=0; i<10; i++) x = KISS;
    printf("Does x=4013566000157423768\n x=%llu.\n", x);
}

-- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 23-02-18 02:27, Steven D'Aprano wrote: > Why do you care about the 50 million calls? That's crazy -- the important > thing is *calculating the Fibonacci numbers as efficiently as possible*. Not necessarily. David Beazley in his talks sometimes uses an inefficient algorithm for calculating fibonacci numbers because he needs something that uses the cpu intensively. Calculating the fibonacci numbers in that context as efficiently as possible would defeat that purpose. So in the context of a benchmark it is not unreasonable to assume those 50 million calls are the purpose, and not calculating the Fibonacci numbers as efficiently as possible. -- Antoon. -- https://mail.python.org/mailman/listinfo/python-list
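Antoon's point is easy to make concrete: the naive double recursion exists precisely to generate calls. A counting sketch of the benchmark in question:

```python
calls = 0

def fib(n):
    # deliberately inefficient double recursion -- the workload *is*
    # the function calls, not the Fibonacci values themselves
    global calls
    calls += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(25)
print(result, calls)   # one modest value costs hundreds of thousands of calls
```

Replacing this with an iterative or memoized fib would compute the same number in a few dozen calls, which is exactly why it would defeat the purpose of a call-overhead benchmark.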
Re: How to make Python run as fast (or faster) than Julia
On Mon, Feb 26, 2018 at 10:13 PM, bartcwrote: > Below is the first draft of a Python port of a program to do with random > numbers. (Ported from my language, which in turned ported it from a C > program by George Marsaglia, the random number guy.) > > However, running it quickly exhausts the memory in my machine. The reason is > that Python unhelpfully widens all results to bignums as needed. The code > relies on calculations being modulo 2^64. > > Note that restricting integer ops to 64 bits probably still won't work, as I > believe the numbers need to be unsigned. No, it's because the original implementation assumed integer wrap-around (at least, I think that's what's happening; I haven't analyzed the code in great detail). That means all your integer operations are doing two jobs: the one they claim to, and then a masking to 64 bits signed. That's two abstract operations that happen, due to the nature of the CPU, to work efficiently together. If you don't implement both halves of that in your Python port, you have failed at porting. What if you were porting a program from a 72-bit chip that assumed Binary Coded Decimal? Would you complain that C's integers are badly behaved? And that's without even asking whether a C program that assumes integer wrap-around counts as portable. At least with Python, you have a guarantee that integer operations are going to behave the same way on all compliant implementations of the language. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Mon, Feb 26, 2018 at 8:57 PM, Steven D'Apranowrote: > On Sun, 25 Feb 2018 20:22:17 -0800, Rick Johnson wrote: > >> So of course, speed is not and should not be the >> primary concern, but to say that execution speed is of _no_ concern is >> quite absurd indeed. > > I'm pretty sure that nobody here has said that speed is of no concern. > > Rather, I would argue that the position we're taking is that *in general* > Python is fast enough for the sorts of tasks we use it for (except, of > course, when it isn't, in which case you have our sympathy, and if we can > suggest some solutions we will). > > Of course we'd take speed optimizations if they were free, but they're > never free: > > - they often require more memory to run; > > - they usually require more code, which adds to the maintenance burden > and increases the interpreter bloat; > > - and increases the risk of bugs; > > - somebody has to write, debug, document and maintain the code, > and developer time and effort is in short supply; > > - or the optimization requires changes to the way Python operates; > > - even if we wanted to make that change, it will break backwards > compatibility; > > - and often what people imagine is a simple optimization (because > they haven't tried it) isn't simple at all; > > - or simply doesn't work; > > - and most importantly, just saying "Make it go faster!" doesn't work, > we actually need to care about the details of *how* to make it faster. Or it reduces memory usage, improves performance, and makes things easier on the programmer... but might place unexpected (or unwanted) demands on other implementations of Python. CPython, by virtue of being the "default" Python interpreter, has to be careful of appearing to mandate something. That's why buy-in from other Python implementation teams (I believe Jython was the most notable here) was important in the discussion about the recent compact dict update. Even what looks like a no-brainer still can have unwanted consequences. 
> (We tried painting Go Faster stripes on the server, and it didn't work.)

Steven Middlename D'Aprano! You should know better than that. "It didn't work" is not good enough. What actually happened? Did the stripes smoulder and smoke until you powered down the server? Did the server raise NotImplementedError when you touched it with the paintbrush? Did your boss walk up behind you and hand you a pink slip? Be specific!

> There have been at least two (maybe three) attempts to remove the GIL
> from CPython. They've all turned out to increase complexity by a huge
> amount, and not actually provide the hoped-for speed increase. Under
> many common scenarios, the GIL-less CPython actually ran *slower*.
>
> (I say "hoped for", but it was more wishful thinking than a rational
> expectation that removing the GIL would speed things up. I don't
> believe any of the core developers were surprised that removing the GIL
> didn't increase speeds much, if at all, and sometimes slowed things
> down. The belief that the GIL slows Python down is mostly due to a
> really simplistic understanding of how threading works.)

Removing the GIL from CPython is not about "speeding up" the language or the interpreter, but about improving parallelism. And I think all the core devs understand this (hence the lack of surprise there), but people who call for it often don't.

> (If it weren't for the EU government funding it, PyPy would probably be
> languishing in oblivion. Everyone wants a faster Python so long as they
> don't have to pay for it, or do the work.)

Hm, I didn't know the EU was funding PyPy. Okay. Cool.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 26/02/2018 09:22, Steven D'Aprano wrote:
> On Sun, 25 Feb 2018 21:19:19 -0800, Rick Johnson wrote:
>
>> I agree with your sarcasm. And that's why these types of auto
>> conversions should be optional. I agree that most times it's more
>> practical to let python handle the dirty details. But in some cases,
>> where you need to squeeze out a few more drops of speed juice, you
>> won't mind the extra trouble.
>
> And that's where you call out to a library like numpy, or write a C
> extension, or use a tool like Numba or Cython to optimise your Python
> code to use native ints. (Numba is still a work in progress, but Cython
> is a mature working product.)
>
> Or to put it another way... if you want machine ints in Python, the way
> you get them is something like:
>
> from numba import jit
>
> @jit
> def myfunction():
>     ...
>
> The core language doesn't have to support these things when there is a
> healthy language ecosystem that can do it.

Below is the first draft of a Python port of a program to do with random numbers. (Ported from my language, which in turn ported it from a C program by George Marsaglia, the random number guy.)

However, running it quickly exhausts the memory in my machine. The reason is that Python unhelpfully widens all results to bignums as needed. The code relies on calculations being modulo 2^64.

Note that restricting integer ops to 64 bits probably still won't work, as I believe the numbers need to be unsigned.
--
Q=0
carry=36243678541
xcng=12367890123456
xs=521288629546311
indx=20632

def refill():
    global Q, carry, indx
    for i in range(20632):
        h = carry & 1
        z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1)
        carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63)
        Q[i] = ~((z<<1)+h)
    indx=1
    return Q[0]

def start():
    global Q, carry, xcng, xs, indx
    Q=[0,]*20632
    for i in range(20632):
        xcng=6906969069 * xcng + 123
        xs ^= (xs<<13)
        xs ^= (xs>>17)
        xs ^= (xs<<43)
        Q[i] = xcng + xs

    N = 10
    for i in range(N):
        if indx<20632:
            s = Q[indx]
            indx+=1
        else:
            s = refill()
        xcng=6906969069 * xcng + 123
        xs ^= (xs<<13)
        xs ^= (xs>>17)
        xs ^= (xs<<43)
        x = s+xcng+xs

    print ("Does x= 4013566000157423768")
    print ("     x=",x)

start()
--

(The code performs N iterations of a random number generator. You get the result expected, ie. x=401...768, when N is a billion.)

--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list
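[Editorial note: the bignum blow-up Bart describes can be avoided in pure CPython by masking each result back to 64 bits, which keeps values small and unsigned so the interpreter never promotes them. A minimal sketch of the idiom; the helper name `u64` is illustrative, not from Bart's code:]

```python
# Emulate unsigned 64-bit wraparound arithmetic by masking results,
# so CPython never widens intermediate values into huge bignums.

MASK64 = (1 << 64) - 1  # 0xFFFFFFFFFFFFFFFF

def u64(x):
    """Truncate x to an unsigned 64-bit value, as C would."""
    return x & MASK64

def xorshift64_step(xs):
    # One step of the xorshift part of the generator above, masked
    # after every operation that can grow past 64 bits.
    xs = u64(xs ^ (xs << 13))
    xs = u64(xs ^ (xs >> 17))
    xs = u64(xs ^ (xs << 43))
    return xs

print(xorshift64_step(521288629546311))
```

The masking costs a little speed per operation, but it bounds memory use, which is the failure mode described above.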
Re: How to make Python run as fast (or faster) than Julia
On Sun, 25 Feb 2018 19:26:12 -0800, Rick Johnson wrote:
> On Friday, February 23, 2018 at 8:48:55 PM UTC-6, Steven D'Aprano wrote:
> [...]
>> Take the Fibonacci double-recursion benchmark. Okay, it tests how well
>> your language does at making millions of function calls. Why?
>
> Because making "millions of function calls" is what software *DOES*!

You're right -- I worded that badly, and ended up saying something I didn't really mean. Mea culpa. Of course over the course of the entire program run, most non-trivial programs will end up having millions (if not billions) of function calls *in total*. But they're not all happening *together*.

What I tried to get across is, how often do we use millions of function calls to get one value?

By default, Python will only allow one thousand function calls in a stack before raising RecursionError.

py> def bad():
...     return bad()
...
py> bad()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in bad
  [ previous line repeats 998 times ]
RecursionError: maximum recursion depth exceeded

There's nothing unique about recursion that will trigger this error. Any call stack of more than 1000 function calls will do it. So forget about code getting a million functions deep. The average is probably more like a dozen, perhaps a hundred tops.

[...]
>> For most application code, executing the function is far more costly
>> than the overhead of calling it,
>
> What? Your functions are only called _once_ during the lifetime of your
> code execution? Enough with the leg pulling.

Let me explain what I intended to say. Here is a really poor estimate of the overhead of calling a function on my computer, measured in iPython.

In [50]: def call():
    ...:     pass
    ...:

In [51]: %timeit call()
1000000 loops, best of 3: 717 ns per loop

So, on my computer, in iPython, it takes 717 ns to call a function that does nothing. How long does it take to do the tiniest smidgen of useful work?
In [52]: def do_something():
    ...:     a = 1
    ...:     b = a+1
    ...:     return b
    ...:

In [53]: %timeit do_something()
1000000 loops, best of 3: 923 ns per loop

Only an extra 200 ns! Wow, function calls are expensive!!!

(Well, yes, I knew that.)

However, let's do some *real* work, not a Mickey Mouse function like do_something().

In [54]: import pyprimes

In [55]: %timeit pyprimes.nth_prime(100000)
1 loops, best of 3: 2.6 s per loop

I happen to know that calling nth_prime ends up making a couple of hundred function calls. (Because I wrote it.) Let's call it 10,000 function calls, to be on the safe side. So 10,000 x 800 nanoseconds is 0.008 second, let's call it 0.01 second, which is just 0.4% of the total cost of calculating the number I was after.

(For the record, that's 1299709.) So just a trivial fraction of the total cost of the function.

Now I know that this is not always the case. But I maintain that for *most* (but not all) practical, real-world code that actually does something useful, the overhead of calling the functions is *usually* (but not always) low compared to the cost of actually doing whatever it is that the functions do.

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list
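[Editorial note: the same experiment can be reproduced without iPython using the stdlib timeit module. A rough sketch; the absolute numbers will differ by machine, only the call-versus-work split matters:]

```python
import timeit

def call():
    pass

def do_something():
    a = 1
    b = a + 1
    return b

n = 100_000
# Per-call time includes the function-call machinery itself;
# the difference between the two is the cost of the "useful work".
t_call = timeit.timeit(call, number=n) / n
t_work = timeit.timeit(do_something, number=n) / n

print(f"empty call: {t_call * 1e9:.0f} ns")
print(f"tiny work:  {t_work * 1e9:.0f} ns")
```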
Re: How to make Python run as fast (or faster) than Julia
On Sun, 25 Feb 2018 20:22:17 -0800, Rick Johnson wrote:
> So of course, speed is not and should not be the primary concern, but
> to say that execution speed is of _no_ concern is quite absurd indeed.

I'm pretty sure that nobody here has said that speed is of no concern.

Rather, I would argue that the position we're taking is that *in general* Python is fast enough for the sorts of tasks we use it for (except, of course, when it isn't, in which case you have our sympathy, and if we can suggest some solutions we will).

Of course we'd take speed optimizations if they were free, but they're never free:

- they often require more memory to run;

- they usually require more code, which adds to the maintenance burden and increases the interpreter bloat;

- and increases the risk of bugs;

- somebody has to write, debug, document and maintain the code, and developer time and effort is in short supply;

- or the optimization requires changes to the way Python operates;

- even if we wanted to make that change, it will break backwards compatibility;

- and often what people imagine is a simple optimization (because they haven't tried it) isn't simple at all;

- or simply doesn't work;

- and most importantly, just saying "Make it go faster!" doesn't work, we actually need to care about the details of *how* to make it faster.

(We tried painting Go Faster stripes on the server, and it didn't work.)

There's no point suggesting major changes to Python that require going back to the drawing board, to Python 0.1 or earlier, and changing the entire execution and memory model of the language. That would just mean we've swapped from a successful, solid, reliable version 3.6 of the language to an untried, unstable, unproven, bug-ridden version 0.1 of New Python. And like New Coke, it won't attract new users (they're already using Javascript or Go or whatever...) and it will alienate existing users (if they wanted Javascript or Go they'd already be using it).
There have been at least two (maybe three) attempts to remove the GIL from CPython. They've all turned out to increase complexity by a huge amount, and not actually provide the hoped-for speed increase. Under many common scenarios, the GIL-less CPython actually ran *slower*.

(I say "hoped for", but it was more wishful thinking than a rational expectation that removing the GIL would speed things up. I don't believe any of the core developers were surprised that removing the GIL didn't increase speeds much, if at all, and sometimes slowed things down. The belief that the GIL slows Python down is mostly due to a really simplistic understanding of how threading works.)

Besides, if you want Python with no GIL so you can run threaded code, why aren't you using IronPython or Jython?

Another example: Unladen Swallow. That (mostly) failed to meet expectations because the compiler tool chain being used wasn't mature enough. If somebody were to re-do the project again now, the results might be different. But it turns out that not that many people care enough about making Python faster to actually invest money in it. Everyone wants to just stamp their foot and get it for free.

(If it weren't for the EU government funding it, PyPy would probably be languishing in oblivion. Everyone wants a faster Python so long as they don't have to pay for it, or do the work.)

A third example: Victor Stinner's FatPython, which seemed really promising in theory, but turned out not to make enough progress fast enough, and he lost interest.

I have mentioned FatPython here a number of times. All you people who complain that Python is "too slow" and that the core devs ought to do something about it... did any of you volunteer any time to the FatPython project?

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sun, 25 Feb 2018 21:19:19 -0800, Rick Johnson wrote:
> I agree with your sarcasm. And that's why these types of auto
> conversions should be optional. I agree that most times it's more
> practical to let python handle the dirty details. But in some cases,
> where you need to squeeze out a few more drops of speed juice, you
> won't mind the extra trouble.

And that's where you call out to a library like numpy, or write a C extension, or use a tool like Numba or Cython to optimise your Python code to use native ints. (Numba is still a work in progress, but Cython is a mature working product.)

Or to put it another way... if you want machine ints in Python, the way you get them is something like:

from numba import jit

@jit
def myfunction():
    ...

The core language doesn't have to support these things when there is a healthy language ecosystem that can do it.

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list
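[Editorial note: a slightly fuller sketch of the @jit approach Steven describes. The try/except fallback is an editorial addition, not part of Numba's API; it lets the same file run even where Numba isn't installed, though the machine-int speed-up only appears when it is:]

```python
# Opt into machine-int speed via Numba where available, falling
# back to plain Python ints otherwise.
try:
    from numba import jit
except ImportError:
    def jit(func):
        # No-op stand-in so the code still runs without Numba.
        return func

@jit
def sum_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

print(sum_squares(1000))  # 332833500
```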
Re: How to make Python run as fast (or faster) than Julia
On Sunday, February 25, 2018 at 10:35:29 PM UTC-6, Steven D'Aprano wrote:
[...]
> Ah, you mean just like the way things were in Python 1.0 through 2.1?
> Hands up anyone who has seen an integer OverflowError in the last 10
> years? Anyone?

I think Python 2.1 is much older than 10 years, so yeah, of course almost no one (save a few old timers) has seen that error. :-)

[...]
> I really miss having to either add, or delete, an "L" suffix from my
> long ints, and having to catch OverflowError to get any work done, and
> generally spending half my time worrying how my algorithms will behave
> when integer operations overflow.

I agree with your sarcasm. And that's why these types of auto conversions should be optional. I agree that most times it's more practical to let python handle the dirty details. But in some cases, where you need to squeeze out a few more drops of speed juice, you won't mind the extra trouble.

And perhaps you (meaning specifically you) will never need such a flexible feature. But hey. The Python community is diverse. So please do keep that in mind.

--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sun, 25 Feb 2018 18:33:47 -0800, Rick Johnson wrote:
> On Friday, February 23, 2018 at 10:41:45 AM UTC-6, Steven D'Aprano
> wrote: [...]
>> There are dozens of languages that have made the design choice to
>> limit their default integers to 16- 32- or 64-bit fixed size, and let
>> the user worry about overflow. Bart, why does it upset you so that
>> Python made a different choice?
>
> A default "integer-diversity-and-inclusivity-doctrine" is all fine and
> dandy by me, (Hey, even integers need safe spaces), but i do wish we
> pythonistas had a method to turn off this (and other) cycle burning
> "features" -- you know -- in the 99.9 percent of time that we don't
> need them.

Ah, you mean just like the way things were in Python 1.0 through 2.1? Hands up anyone who has seen an integer OverflowError in the last 10 years? Anyone?

[steve@ando ~]$ python1.5 -c "print 2**64"
Traceback (innermost last):
  File "<string>", line 1, in ?
OverflowError: integer pow()

I really miss having to either add, or delete, an "L" suffix from my long ints, and having to catch OverflowError to get any work done, and generally spending half my time worrying how my algorithms will behave when integer operations overflow.

Good times. I miss those days. I also miss the days when everyone had scabies.

As someone who wrote Python code when bignums were *not* the default, I can tell you that:

- it was a real PITA for those who cared about their code working correctly and being bug-free;

- and the speed-up actually wasn't that meaningful.

As is so often the case with these things, using fixed size ints looks good in benchmark games, but what's fast in a toy benchmark and what's fast in real code are not always the same.

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sunday, February 25, 2018 at 8:45:56 PM UTC-6, Chris Angelico wrote:
> On Mon, Feb 26, 2018 at 1:33 PM, Rick Johnson wrote:
> [...]
>> A default "integer-diversity-and-inclusivity-doctrine" is all fine
>> and dandy by me, (Hey, even integers need safe spaces),
>
> In Python 3.6+, integers have safe underscores instead.

You spacist!!!

--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sunday, February 25, 2018 at 8:45:56 PM UTC-6, Chris Angelico wrote:
> On Mon, Feb 26, 2018 at 1:33 PM, Rick Johnson wrote:
> [...]
>> but i do wish we pythonistas had a method to turn off this (and
>> other) cycle burning "features" -- you know -- in the 99.9 percent of
>> time that we don't need them.
>
> Definitely. We should also provide a way for people to manually
> allocate memory, for the times when that would be more efficient. And
> to switch out the syntax to include braces.

Now you're throwing the baby out with the bath water. We needn't re-implement the C language simply to achieve reasonable and practical code-execution efficiency. Python was never intended to be the fastest language. Heck! Python was never intended to be the fastest _scripting_ language! So of course, speed is not and should not be the primary concern, but to say that execution speed is of _no_ concern is quite absurd indeed.

As Steven mentioned, few people, even if they _could_ afford the absurd price, will ever purchase that high-end sports car. However, by using this example Steve is committing the argumentation sin of comparing apples to oranges, because transportation is more a matter of practicality. Sure, you may arrive at your destination more quickly in the high-end sports car; however, the trade-off will be that you could be thrown in jail for speeding or be injured or killed in an accident. But such existential risks are not the case for software... Code that executes more quickly simply (wait for it...) yes, executes more quickly! And I can assure you: no one will die, be injured, or even become "Bubba's" cell mate should their code commit the unforgivable "sin" of executing more quickly[1].

--
[1] Quick! Someone please call a priest! We need to confess!!!
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Friday, February 23, 2018 at 8:48:55 PM UTC-6, Steven D'Aprano wrote:
[...]
> Take the Fibonacci double-recursion benchmark. Okay, it tests how well
> your language does at making millions of function calls. Why?

Because making "millions of function calls" is what software *DOES*!

Granted, and as others have testified in this thread, i myself have almost never implemented an algorithm like that monstrosity of recursion that is implemented in fib(), but whether we use recursive functions or not, a high number of function calls is inevitable in our software. And if you don't believe me, then use sys.settrace(myFunc) to trace all function calls to stdout, and you shall be enlightened[1].

> How often do you make millions of function calls?

All the time! Of course, if the extent of your code base is a single script containing the source `print 'Hello World'`, perhaps your experience may vary from others here.

> For most application code, executing the function is far more costly
> than the overhead of calling it,

What? Your functions are only called _once_ during the lifetime of your code execution? Enough with the leg pulling.

> and the call overhead is dwarfed by the rest of the application.

Of course, because it's calling _other_ functions.

> For many, many applications, the *entire* program run could take
> orders of magnitude fewer function calls than a single call to fib(38).

Hmm, let's see:

def main():
    print 'hello world'

main()

By golly, he's right! O_O

--
[1] But, in the meantime, you may want to make yourself a taco; go play a round of golf; and possibly pay a visit to a national monument (or three!), while all that crap is printing to stdout.
--
https://mail.python.org/mailman/listinfo/python-list
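[Editorial note: Rick's settrace suggestion can be turned into a call counter instead of a firehose to stdout. A small sketch in Python 3; the helper names are editorial, not from the thread:]

```python
import sys

def count_calls(func, *args):
    """Run func(*args), counting every Python-level function call."""
    counter = {"calls": 0}

    def tracer(frame, event, arg):
        if event == "call":
            counter["calls"] += 1
        return None  # no per-line tracing needed

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, counter["calls"]

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

value, ncalls = count_calls(fib, 20)
print(value, ncalls)  # even fib(20) makes over twenty thousand calls
```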
Re: How to make Python run as fast (or faster) than Julia
On Mon, Feb 26, 2018 at 1:33 PM, Rick Johnson wrote:
> On Friday, February 23, 2018 at 10:41:45 AM UTC-6, Steven D'Aprano wrote:
> [...]
>> There are dozens of languages that have made the design choice to
>> limit their default integers to 16- 32- or 64-bit fixed size, and let
>> the user worry about overflow. Bart, why does it upset you so that
>> Python made a different choice?
>
> A default "integer-diversity-and-inclusivity-doctrine" is all fine and
> dandy by me, (Hey, even integers need safe spaces),

In Python 3.6+, integers have safe underscores instead.

> but i do wish we pythonistas had a method to turn off this (and other)
> cycle burning "features" -- you know -- in the 99.9 percent of time
> that we don't need them.

Definitely. We should also provide a way for people to manually allocate memory, for the times when that would be more efficient. And to switch out the syntax to include braces. And all those other things people like from other languages. Hey, here's an idea: maybe Python shouldn't stop people from using other languages, if those other languages are better suited to the job?

> And BTW... Am i the *ONLY* person here who feels that Python
> optimizations should be more than merely the tossing of dead weight
> overboard?

I dunno. We optimize this mailing list that way...

*ducking for cover*

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Friday, February 23, 2018 at 10:41:45 AM UTC-6, Steven D'Aprano wrote: [...] > There are dozens of languages that have made the design > choice to limit their default integers to 16- 32- or 64-bit > fixed size, and let the user worry about overflow. Bart, > why does it upset you so that Python made a different > choice? A default "integer-diversity-and-inclusivity-doctrine" is all fine and dandy by me, (Hey, even integers need safe spaces), but i do wish we pythonistas had a method to turn off this (and other) cycle burning "features" -- you know -- in the 99.9 percent of time that we don't need them. And BTW... Am i the *ONLY* person here who feels that Python optimizations should be more than merely the tossing of dead weight overboard? -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 24/02/2018 02:05, Steven D'Aprano wrote:
> On Fri, 23 Feb 2018 19:25:35 +0000, bartc wrote:
>
>> Python is 10 times slower than a competitor = doesn't matter
>> My language is 1.5 times slower than the big boys' = matters
>> a great deal
>
> As for Python's order-of-magnitude speed difference, thank you for
> being generous.

Actually that comparison was with a competitor, ie. another dynamic language, because I understand such languages work in different fields from the Cs and C++s. I'm sure there must be some that are faster (years since I've looked at the field), but I vaguely had in mind mine. Although since then, CPython has gotten faster.

Note that there are JIT-based implementations now which can give very good results (other than PyPy) with dynamic languages. My own efforts are still byte-code based so are unlikely to get any faster. But they are also very simple.

> So it is quite possible to get practical work done and be a
> competitive, useful language despite being (allegedly) a thousand or
> more times slower than C.

Of course. I've been using a dynamic scripting language as an adjunct to my compiled applications since the mid 80s. Then they were crude and hopelessly slow (and machines were a lot slower too), but they could still be tremendously useful with the right balance. But the faster they are, the more work they can take over.

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 24/02/2018 02:46, Steven D'Aprano wrote:
> Take the Fibonacci double-recursion benchmark. Okay, it tests how well
> your language does at making millions of function calls. Why? How
> often do you make millions of function calls?

Very often. Unless you are writing code 1970s style with everything in one big main function.

I've done some tests with my interpreter [sorry, Ned], on one real task:

  Total number of byte-code calls: 3.2 million
  Total number of real x64 calls: 520 million

On this specific benchmark: 48 million and 580 million.

Changing the way those x64 calls were made (using a different call convention) made some byte-code programs take up to 50% longer to execute. [In this interpreter, each byte-code, no matter what it is, is dispatched via a function call.]

> For most application code, executing the function is far more costly
> than the overhead of calling it, and the call overhead is dwarfed by
> the rest of the application.

Any actual figures?

In the case of interpreters, you want to optimise each byte-code, and one way of doing that is to write a small program which features that byte-code heavily. And then you start tweaking.

It is true that when working with heavy-duty data, or offloading work to external, non-byte-code functions, the byte-code execution overheads are small. But Python's weakness is when it /is/ executing actual algorithms using actual byte-code.

> And actually, performance of CPython does seem to have improved
> tremendously over the years. So some people have their eye on the
> ball. Clearly not you.

> If you have a language with tail recursion elimination, you can bet
> that its benchmarks will include examples of tail recursion and tail
> recursion will be a favoured idiom in that language. If it doesn't, it
> won't.

Benchmarks need to be honest. But Fibonacci I think can't use that optimisation (although gcc seems to have found another way of not doing that much work).

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list
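[Editorial note: Bart's "each byte-code is dispatched via a function call" design can be illustrated with a toy stack interpreter, where executing a program costs one Python function call per instruction. The names and opcodes here are invented for illustration:]

```python
# Toy byte-code interpreter: each opcode is handled by a function,
# so every instruction executed costs one dispatch call.

def op_push(stack, arg):
    stack.append(arg)

def op_add(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def op_mul(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)

HANDLERS = {"PUSH": op_push, "ADD": op_add, "MUL": op_mul}

def run(program):
    stack = []
    for opcode, arg in program:
        HANDLERS[opcode](stack, arg)  # one call per instruction
    return stack[-1]

# (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None)]
print(run(program))  # 20
```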
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 1:16 PM, Chris Angelico wrote:
> On Sat, Feb 24, 2018 at 6:09 AM, Python wrote:
>> On Sat, Feb 24, 2018 at 05:56:25AM +1100, Chris Angelico wrote:
>>> No, not satisfied. Everything you've said would still be satisfied if
>>> all versions of the benchmark used the same non-recursive algorithm.
>>> There's nothing here that says it's testing recursion, just that (for
>>> consistency) it's testing the same algorithm. There is no reason to
>>> specifically test *recursion*, unless that actually aligns with what
>>> you're doing.
>>
>> It seems abundantly clear to me that testing recursion is the point of
>> writing a benchmark implementing recursion (and very little of
>> anything else). Again, you can decide for yourself the suitability of
>> the benchmark, but I don't think you can really claim it doesn't
>> effectively test what it means to.
>
> Where do you get that it's specifically a recursion benchmark though?
> I can't find it anywhere in the explanatory text.

I hope I am not missing something blatantly obvious, but in the page Python linked to (https://julialang.org/benchmarks/), the opening paragraph states:

    These micro-benchmarks, while not comprehensive, do test compiler
    performance on a range of common code patterns, such as function
    calls, string parsing, sorting, numerical loops, random number
    generation, recursion, and array operations.

Recursion is listed above as one code pattern to specifically target with one of their micro-benchmarks. I did not exhaustively go through each language's code examples, but it does appear that the authors of these benchmarks are implementing as exactly as possible the same algorithm chosen for each of these code patterns in each language.

Now to me the real question is whether the Julia people were trying to be intellectually honest in their choices of coding patterns and algorithms, or were they deliberately cherry-picking these to make Julia look better than the competition?
Myself, I would rather be charitable than accusatory about the benchmarkers' intentions. For instance, the authors were aware of numpy and used it for some of the python coding -- the array operations they were targeting IIRC. Instead, if they were being deliberately dishonest, they could have come up with some really contrived python code. -- boB -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 2/23/2018 11:32 AM, Python wrote:
> On Fri, Feb 23, 2018 at 03:11:36AM -0500, Terry Reedy wrote:
>> Why do you care about the 50 million calls? That's crazy -- the
>> important thing is *calculating the Fibonacci numbers as efficiently
>> as possible*.
>
> If you are writing practical programs, that's true. But the Julia
> benchmarks are not practical programs; they are designed to compare
> the performance of various language features across a range of
> languages.

If that were so, then the comparison should use the fastest *Python* implementation.

> Doing that would completely fail to accomplish the task of comparing
> the performance of recursive function calls in the two languages,
> which is what the benchmark was designed to do.

That is a non-goal, because *languages* do not have clock-time speeds; only programs written in concrete implementations run on real hardware. Why do you think it fair to compare function call times using the slowest rather than the fastest implementation of Python function calls? Real Python programmers who are seriously concerned about time try to use the fastest implementation appropriate to the situation.

> So, no actually, it shouldn't.

To me, it is 'not using the fastest Python' that fails to make apples-to-apples comparisons.

It has been said here that comparisons should use the same algorithm doing much the same work in both implementations. However, a Python function call does *much more work* than in most languages, because the meaning of 'function call' is much more flexible in Python than in most languages. The usual comparison is like lemons (other language calls) to oranges (Python language calls, much more flexible). To be fair, the comparison should be to a Python implementation that either notices or accepts a declaration that, for instance, fib(n) only needs to pass an int by position.

Comparing int addition to Python object addition also compares unlike operations.
To compare using the same addition algorithm, one must use an implementation that can restrict '+' to just int addition. The Julia fib benchmark uses the type of function call Julia is good at. Suppose we define fib in a way that uses Python features.

def fib(n, dummy):
    if n >= 2:
        return fib(n=n-1, dummy=dummy) + fib(dummy=dummy, n=n-2)
    elif n >= 0:
        return 1
    else:
        return None, n  # or some error indicator including the bad value

If the language does not support 'parameter=value' syntax (which existed long before Python), use ('n', n) and ('dummy', dummy) instead. Now let's compare call times.

Or let's try 'fib(n-1, dummy) + fib(dummy=dummy, n=n-2)' to compare functions that can identify arguments either by position or by name. Or

f(n-1, dummy) + f(dummy=dummy, n=n-2) + f(*(n-3, dummy))

and change '2' to '3', to utilize another Python call feature. If the language does not support *args, omit '*'.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
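[Editorial note: Terry's point that Python's flexible call protocol does extra work is easy to measure with the stdlib timeit module. A rough sketch comparing the three call styles from his example; absolute times are machine-dependent, and the trivial function body is a stand-in so the measurement is dominated by call overhead:]

```python
import timeit

def f(n, dummy):
    # Trivial body: we want to measure call overhead, not work.
    return n

t_pos  = timeit.timeit("f(10, 'x')", globals=globals(), number=100_000)
t_kw   = timeit.timeit("f(n=10, dummy='x')", globals=globals(), number=100_000)
t_star = timeit.timeit("f(*(10, 'x'))", globals=globals(), number=100_000)

print(f"positional: {t_pos:.4f}s")
print(f"keyword:    {t_kw:.4f}s")
print(f"*args:      {t_star:.4f}s")
```

In CPython the keyword-argument form is typically measurably slower, which is exactly the flexibility cost being described.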
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 12:43:06 -0600, Python wrote:
> Even if testing optimized code is the point, as the article claims, it
> utterly fails to do that. Bad science.

You've used that statement two or three times now. *This isn't science*. There's nothing scientific about writing benchmarks, or even objective. It is through and through subjective choices given a paper-thin patina of objectivity because the results include numbers.

But those numbers depend on the precise implementation of the benchmark. They depend on the machine you run them on, sometimes strongly enough that the order of which language is faster can swap. I remember a bug in Python's urllib module, I think it was, that made code using it literally hundreds of times slower on Windows than Linux or OS X.

The choice of algorithms used is not objective, or fair. Most of it is tradition: the famous "whetstone" benchmark apparently measures something which has little or no connection to anything software developers should care about. It, like the Dhrystone variant, was invented to benchmark CPU performance. The relevance to comparing languages is virtually zero.

    "As this data reveals, Dhrystone is not a particularly
    representative sample of the kinds of instruction sequences that
    are typical of today's applications. The majority of embedded
    applications make little use of the C libraries for example, and
    even desktop applications are unlikely to have such a high
    weighting of a very small number of specific library calls."

http://dell.docjava.com/courses/cr346/data/papers/DhrystoneMIPS-CriticismbyARM.pdf

Take the Fibonacci double-recursion benchmark. Okay, it tests how well your language does at making millions of function calls. Why? How often do you make millions of function calls? For most application code, executing the function is far more costly than the overhead of calling it, and the call overhead is dwarfed by the rest of the application.
For many, many applications, the *entire* program run could take orders of magnitude fewer function calls than a single call to fib(38).

If you have a language with tail recursion elimination, you can bet that its benchmarks will include examples of tail recursion, and tail recursion will be a favoured idiom in that language. If it doesn't, it won't.

I'm going to end with a quote:

"And of course, the very success of a benchmark program is a danger in that people may tune their compilers and/or hardware to it, and with this action make it less useful."

Reinhold P. Weicker, Siemens AG, April 1989
Author of the Dhrystone Benchmark

-- Steve -- https://mail.python.org/mailman/listinfo/python-list
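The scale of fib(38) is easy to quantify: the naive doubly-recursive algorithm makes calls(n) = calls(n-1) + calls(n-2) + 1 calls in total. A short sketch that counts them without actually running the 126-million-call recursion (the memoised counter here is my own illustration, not code from the thread):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def calls(n):
    """Number of function calls a naive doubly-recursive fib(n) performs."""
    if n < 2:
        return 1  # fib(0) or fib(1) is a single call
    # one call to fib(n), plus everything its two recursive calls do
    return calls(n - 1) + calls(n - 2) + 1

print(calls(38))  # 126491971 -- over 126 million calls for one fib(38)
```

So a single fib(38) makes more function calls than many entire programs do in their whole run.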
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 6:06 PM, bartc wrote:
> On 24/02/2018 00:45, Dan Stromberg wrote:
>> But again, those techniques are only infrequently relevant, and pretty
>> much never relevant to an entire application.
> That your page lists 23 different approaches to making Python faster
> suggests that its speed can be an issue. Certainly there seem to be a lot
> of projects going on trying to fix that aspect.

Of course CPU performance can be an issue. For any language. When I was a kid, programs were written in BASIC. If BASIC was too slow, you'd rewrite the hotspots in assembler. Does that mean everything should have been written in assembler? Not at all. Assembler was slow to code in and tended to produce crashy solutions.

Obsessing about CPU time is outdated thinking for almost all problems. Developer time is much more likely to matter to the bottom line.

> Perhaps it ought to have been clear that this was CPython, if it in fact
> was.

CPython != Python. g++ != C++. gcc != C. Python is a language. Not an implementation of a language.

> There are so many possibilities with Python, that using anything else
> would confuse matters.

Gosh no. A language's success can, in significant part, be judged by the number of compatible implementations and dialects.

You might find this interesting: http://stromberg.dnsalias.org/~strombrg/why-is-python-slow/ It's just one microbenchmark, but it's me addressing someone asking why Python is so much slower than C++.

-- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 24/02/2018 00:45, Dan Stromberg wrote:
> On Fri, Feb 23, 2018 at 1:32 PM, bartc wrote:
>> But the difference in runtime speed between Python and other dynamic
>> languages, if you look at benchmarks doing actual work in the language,
>> can be much greater than two times, yet that doesn't appear to matter.
> The best way to use Python is to: 1) Write your whole application in it.
> 2) IF things are too slow, profile the app. Usually performance will be
> fast enough from the start, but not always. 3) After (if) identifying a
> hotspot, optimize it in any of a variety of ways, while still reaping
> the benefits of rapid development (which often matters much more than
> CPU speed today) I have a page about speeding up python:
> http://stromberg.dnsalias.org/~strombrg/speeding-python/ But again,
> those techniques are only infrequently relevant, and pretty much never
> relevant to an entire application.

That your page lists 23 different approaches to making Python faster suggests that its speed can be an issue. Certainly there seem to be a lot of projects going on trying to fix that aspect.

So I'll rephrase, and say that some individuals in this group are saying that speed doesn't matter. For many others it apparently does.

(I've just looked at the benchmark results referred to in the OP's link [https://julialang.org/benchmarks/] and the comparison for the recursion benchmark (let's just call it that) seems fair enough to me. Perhaps it ought to have been clear that this was CPython, if it in fact was. There are so many possibilities with Python, that using anything else would confuse matters. People who use Python will be aware there are acceleration techniques.

(A note regarding the C timing in that table: I've looked at what gcc does with this benchmark with its unreasonably fast timing, and it appears to cheat. That is, it does not call that function (ie. enter via the normal prologue code) the number of times it says. I don't know if that was the case when they tested C for that table.
But it can happen that a sufficiently simple benchmark can lend itself to super-optimisation that is much harder with real code. For that reason I no longer take account of gcc-O3 timings for micro-benchmarks.)) -- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 19:25:35 +, bartc wrote: > On 23/02/2018 18:05, Steven D'Aprano wrote: >> On Fri, 23 Feb 2018 13:51:33 +, Ben Bacarisse wrote: > >> Stop writing crap code and then complaining that the language is "too >> slow". Write better code, and then we'll take your complaints >> seriously. > > I write compilers for static languages, which are generally considered > toys, partly because the code generated is not great. I have no idea whether your compilers are toys, or what "not great" means for your generated code. - too slow for the intended purpose? - full of bugs? - generated code is enormous? - requires too much memory? - all of the above? If the actual answer is, "none of the above", then apart from your failure to attract users and build a full ecosystem around your compiler, I see no reason to call them toys. If your compilers can get within 10% of best-of-breed C compilers, even if only on trivial problems, that's nothing to sneer at. If you can do it on non-trivial applications (say, the Linux kernel) without introducing a vast number of compiler-generated bugs, then you're doing very well indeed. In another post (responding to Chris) you wrote: > Python is 10 times slower than a competitor = doesn't matter > My language is 1.5 times slower than the big boys' = matters > a great deal Maybe it matters to *you*. You are the one who stated it was a toy compiler. To me, I have *no idea* whether the compiler is any good. Nobody but you uses it. I once tried to compile it, and couldn't get it to compile. As for Python's order-of-magnitude speed difference, thank you for being generous. If you cherry-pick the right benchmarks, you could easily have said it was two orders-of-magnitude slower. On the other hand, if you use PyPy, and similarly cherry-pick the right benchmarks, you can justifiably say that *Python is faster than C*. (On carefully chosen examples which probably aren't representative of full application performance.) 
But okay, let's take "Python is an order of magnitude (between 10 and 90 times) slower than C" for the sake of the argument. Does it matter? Maybe. Maybe not. Sometimes it matters. Sometimes it doesn't. The world's fastest street-legal production car is, so I'm told, the Bugatti Veyron Super Sport. It has a top speed of 268 mph and the engine has 1200 horsepower. And yet literally *billions* of car drivers choose to drive something slower. They wouldn't buy a Veyron even if they could afford it. Why shouldn't people make choices based on factors other than speed, speed, speed, speed, speed? Ruby has fallen out of favour now, but for quite a few years it was looking like a contender despite being an order of magnitude slower than Python. Which is an order of magnitude slower than Java, which is an order of magnitude slower than C, according to some benchmarks at least. People write business applications using Excel spreadsheets and macros. And why not? So it is quite possible to get practical work done and be a competitive, useful language despite being (allegedly) a thousand or more times slower than C. [...] > Have you ever written a benchmark in your life? If so, did it do > anything useful? Other than what benchmarks are normally used for. Yes I have, not for what benchmarks are normally used for (dick-measuring contests between programmers competing to see whose language is more macho) but for the very practical purposes of: (1) Deciding which algorithm to use in practice; (2) Ensuring that updates to the code don't suffer serious performance regressions; (3) And checking that the code is *fast enough* (not as fast as conceivably possible) for its purpose. They are the sorts of benchmarks I care about. -- Steve -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 1:32 PM, bartc wrote:
> On 23/02/2018 20:12, Chris Angelico wrote:
> So I'll keep it generic. Let's say the Tiny C compiler is not taken
> seriously because it might be a couple of times slower than gcc-O3, even
> though it's 1% of the size and compiles 1000% as fast.

What's generic about a specific example? (that's rhetorical)

I've used tcc in preference to gcc before; I had an application where performance didn't matter as much as security; tcc has buffer overrun checking.

> But the difference in runtime speed between Python and other dynamic
> languages, if you look at benchmarks doing actual work in the language,
> can be much greater than two times, yet that doesn't appear to matter.

The best way to use Python is to:

1) Write your whole application in it.
2) IF things are too slow, profile the app. Usually performance will be fast enough from the start, but not always.
3) After (if) identifying a hotspot, optimize it in any of a variety of ways, while still reaping the benefits of rapid development (which often matters much more than CPU speed today)

I have a page about speeding up python: http://stromberg.dnsalias.org/~strombrg/speeding-python/

But again, those techniques are only infrequently relevant, and pretty much never relevant to an entire application.

-- https://mail.python.org/mailman/listinfo/python-list
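Dan's step 2 can be as simple as running the standard-library profiler over the whole app. A minimal sketch of that workflow (the function names are made-up placeholders, not anything from the thread):

```python
import cProfile
import io
import pstats

def slow_part():
    # stand-in for a hotspot: some busy numeric work
    return sum(i * i for i in range(100_000))

def app():
    # stand-in for "your whole application"
    return [slow_part() for _ in range(50)]

profiler = cProfile.Profile()
profiler.enable()
app()
profiler.disable()

# Sort by cumulative time so the hotspot floats to the top of the report
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The top lines of the report point straight at slow_part(), which is then the only piece worth optimizing (step 3).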
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 8:32 AM, bartc wrote:
> So I'll keep it generic. Let's say the Tiny C compiler is not taken
> seriously because it might be a couple of times slower than gcc-O3, even
> though it's 1% of the size and compiles 1000% as fast.

Except that nobody has said that. You're doing an excellent job of fighting against a strawman. I'm done. Have fun!

ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 20:12, Chris Angelico wrote:
> On Sat, Feb 24, 2018 at 7:02 AM, bartc wrote:
>> I don't know what point you're making here. Unless it's that no software
>> is worth writing unless it's done on a big scale with a huge team of
>> people, and is very well known.
>> My own point should be clear:
>> Python is 10 times slower than a competitor = doesn't matter
>> My language is 1.5 times slower than the big boys' = matters a great deal
> Most performance is a tradeoff. Python has other advantages; what are
> yours? So far, all you've said is that you're "not too much worse".
> That's great for a toy language!

Ned Batchelder has banned me from mentioning any of my private work here. So I'll keep it generic. Let's say the Tiny C compiler is not taken seriously because it might be a couple of times slower than gcc-O3, even though it's 1% of the size and compiles 1000% as fast.

But the difference in runtime speed between Python and other dynamic languages, if you look at benchmarks doing actual work in the language, can be much greater than two times, yet that doesn't appear to matter.

Even defining how fast or slow Python actually is, is fraught with complexities, making it hard to establish just what its comparative speed is: no one seems to be able to agree on what benchmarks ought to be used. Some don't seem to get benchmarks. Then it's a question of whether you run CPython. Or PyPy. Or IronPython or Jython. Or Numba with @jit. Or Numpy. Or @cache or whatever it is. Or you run Cython. Or maybe you can combine @jit, @cache, Cython, and PyPy in the same program; I don't know. (Have I ever mentioned Python is complicated?)

So, you were asking about the benefits of having a small, simple self-contained implementation of a language. I really couldn't tell you...

-- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 2/23/18 3:02 PM, bartc wrote:
> On 23/02/2018 19:47, Chris Angelico wrote:
>> On Sat, Feb 24, 2018 at 6:25 AM, bartc wrote:
>>> The difference between Python and another dynamic language might be a
>>> magnitude, yet you say it doesn't matter.
>>> Thanks, that makes me feel much better about my own work! If a
>>> magnitude difference in performance can be so casually dismissed.
>> Yep! Your work is awesome, because your code is only a little bit slower
>> than the same algorithm with the same bugs in a better-known language
>> that's supported on more platforms and has a large team of people
>> supporting it.
>> Remind me what's actually advantageous about your language?
> I don't know what point you're making here. Unless it's that no software
> is worth writing unless it's done on a big scale with a huge team of
> people, and is very well known.
> My own point should be clear:
> Python is 10 times slower than a competitor = doesn't matter
> My language is 1.5 times slower than the big boys' = matters a great deal

I've lost track of the points anyone here is making. Can we PLEASE avoid getting into this tarpit of comparing Bart's language to Python yet again?

--Ned. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 7:02 AM, bartc wrote:
> On 23/02/2018 19:47, Chris Angelico wrote:
>> On Sat, Feb 24, 2018 at 6:25 AM, bartc wrote:
>>> The difference between Python and another dynamic language might be a
>>> magnitude, yet you say it doesn't matter.
>>>
>>> Thanks, that makes me feel much better about my own work! If a
>>> magnitude difference in performance can be so casually dismissed.
>>
>> Yep! Your work is awesome, because your code is only a little bit
>> slower than the same algorithm with the same bugs in a better-known
>> language that's supported on more platforms and has a large team of
>> people supporting it.
>>
>> Remind me what's actually advantageous about your language?
>
> I don't know what point you're making here. Unless it's that no software
> is worth writing unless it's done on a big scale with a huge team of
> people, and is very well known.
>
> My own point should be clear:
>
> Python is 10 times slower than a competitor = doesn't matter
> My language is 1.5 times slower than the big boys' = matters a great deal

Most performance is a tradeoff. Python has other advantages; what are yours? So far, all you've said is that you're "not too much worse". That's great for a toy language!

ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 19:47, Chris Angelico wrote:
> On Sat, Feb 24, 2018 at 6:25 AM, bartc wrote:
>> The difference between Python and another dynamic language might be a
>> magnitude, yet you say it doesn't matter.
>>
>> Thanks, that makes me feel much better about my own work! If a magnitude
>> difference in performance can be so casually dismissed.
>
> Yep! Your work is awesome, because your code is only a little bit slower
> than the same algorithm with the same bugs in a better-known language
> that's supported on more platforms and has a large team of people
> supporting it.
>
> Remind me what's actually advantageous about your language?

I don't know what point you're making here. Unless it's that no software is worth writing unless it's done on a big scale with a huge team of people, and is very well known.

My own point should be clear:

Python is 10 times slower than a competitor = doesn't matter
My language is 1.5 times slower than the big boys' = matters a great deal

-- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 01:41:17PM -0600, Python wrote: > On Sat, Feb 24, 2018 at 06:16:38AM +1100, Chris Angelico wrote: > > > It seems abundantly clear to me that testing recursion is the point of > > > writing a benchmark implementing recursion (and very little of > > > anything else). Again, you can decide for yourself the suitability of > > > the benchmark, but I don't think you can really claim it doesn't > > > effectively test what it means to. > > > > Where do you get that it's specifically a recursion benchmark though? > > I can't find it anywhere in the explanatory text. > > I don't know how you could conclude it could be meant to test anything > else. Actually I will make one small correction, which is to say that the benchmark tests the performance of algorithms which are predominated by many function calls, of which recursion is an obvious subset. But I think that's splitting hairs. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Thu, 22 Feb 2018 17:13:02 -0800, Rick Johnson wrote: > On Thursday, February 22, 2018 at 1:55:35 PM UTC-6, Jack Fearnley wrote: > [...] >> I realize that this thread is about benchmarking and not really about >> generating fibonacci numbers, but I hope nobody is using this code to >> generate them on a 'production' basis, > > I've been raising the warning flags about that awful Fibonacci example > for years! > > First it was in the tutorial, and then, they plastered it on the front > page of Python.org like it were some sort of trophy... > > At this point i have concluded that it must be: (1) a sadistic noob > hazing device, or (2) a straw-stuffed, > anthropomorphic, crow-poop-riddled monster that sends the seasoned > programmers off in the opposite direction. Excellent thread! I see I'm preaching to the converted. Jack -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 6:25 AM, bartc wrote:
> On 23/02/2018 18:05, Steven D'Aprano wrote:
>> On Fri, 23 Feb 2018 13:51:33 +, Ben Bacarisse wrote:
>> Stop writing crap code and then complaining that the language is "too
>> slow". Write better code, and then we'll take your complaints seriously.
>
> I write compilers for static languages, which are generally considered
> toys, partly because the code generated is not great.

Doesn't matter what language you're writing in, if your algorithm is flawed. Bubble sort is bubble sort no matter what you're using. This function will calculate triangle numbers just as inefficiently in Python as it would in C:

def triangle(n):
    if n <= 1:
        return n
    return triangle(n-1) + triangle(n-1) - triangle(n-2) + 1

> But in a real application, the difference between my compiler, and
> gcc-O3 or MSVC/O2 might only be 10-50%. 1.1 to 1.5 times as fast (or as
> slow), and that's an application that does few external library calls.
>
> The difference between Python and another dynamic language might be a
> magnitude, yet you say it doesn't matter.
>
> Thanks, that makes me feel much better about my own work! If a magnitude
> difference in performance can be so casually dismissed.

Yep! Your work is awesome, because your code is only a little bit slower than the same algorithm with the same bugs in a better-known language that's supported on more platforms and has a large team of people supporting it.

Remind me what's actually advantageous about your language?

ChrisA -- https://mail.python.org/mailman/listinfo/python-list
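Chris's point can be made concrete: the exponential-time recursive version and a constant-time closed form give identical answers, and no language choice rescues the former. A quick sketch (the closed form n*(n+1)/2 is the standard identity for triangle numbers, not something from the thread):

```python
def triangle_slow(n):
    # Chris's deliberately inefficient doubly-recursive version
    if n <= 1:
        return n
    return triangle_slow(n - 1) + triangle_slow(n - 1) - triangle_slow(n - 2) + 1

def triangle_fast(n):
    # Same result via the closed form n*(n+1)/2 -- O(1) in any language
    return n * (n + 1) // 2

print([triangle_slow(n) for n in range(8)])  # [0, 1, 3, 6, 10, 15, 21, 28]
print(triangle_fast(1000))                   # 500500
```

triangle_slow(1000) would never finish; the algorithm, not the interpreter, is the problem.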
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 06:16:38AM +1100, Chris Angelico wrote:
> > It seems abundantly clear to me that testing recursion is the point of
> > writing a benchmark implementing recursion (and very little of
> > anything else). Again, you can decide for yourself the suitability of
> > the benchmark, but I don't think you can really claim it doesn't
> > effectively test what it means to.
>
> Where do you get that it's specifically a recursion benchmark though?
> I can't find it anywhere in the explanatory text.

I don't know how you could conclude it could be meant to test anything else. The code is:

## recursive fib ##
fib(n) = n < 2 ? n : fib(n-1) + fib(n-2)

What else is there of substance to test? Ternary operators? Addition? The comment even calls out that it is recursive. I further draw inference from the two statements:

1. "Instead, the benchmarks are written to test the performance of identical algorithms and code patterns implemented in each language."

Calling out explicitly that it is algorithms that they are testing, and here the algorithm is recursion.

2. "For example, the Fibonacci benchmarks all use the same (inefficient) doubly-recursive algorithm..."

They expressly called out this algorithm as being inefficient, and expressly called it out as being double recursion.

I'll grant you they never explicitly said so unequivocally (that I saw--my reading of their site was not exhaustive), but I can't see any way an informed reader could conclude any different purpose other than benchmarking recursion.

-- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 18:56, Chris Angelico wrote:
> On Sat, Feb 24, 2018 at 5:43 AM, Python wrote:
>> Satisfied?
> No, not satisfied. Everything you've said would still be satisfied if
> all versions of the benchmark used the same non-recursive algorithm.

Why? Does it matter?

> There's nothing here that says it's testing recursion, just that (for
> consistency) it's testing the same algorithm. There is no reason to
> specifically test *recursion*, unless that actually aligns with what
> you're doing. For instance, when Google Chrome rolled out its new V8
> JavaScript engine, it was specifically optimized for recursion, because
> many Google apps used recursion heavily. If you're implementing a new
> LISP interpreter, you should probably optimize for recursion, because
> most LISP code is going to be written recursively. But otherwise,
> there's no particular reason to stick to recursion.

You list plenty of reasons, then say there's no reason to use recursion!

The recursion is a red herring; but to get a program with lots of function calling, then using a recursive algorithm - any algorithm - will do the job. Would you have a different opinion if Python excelled at that benchmark, and Julia sucked? (I can't remember what the results were.)

(I was testing one of my static compilers today; it has three call convention modes, all operating within the Win64 ABI. I used the Fibonacci benchmark with fib(42) to test it, with these results:

mode 1   5.8 seconds (non-ABI)
mode 2   6.5 seconds (part ABI conformance)
mode 3   7.4 seconds (full ABI conformance)

Other compilers (C compilers that have to conform) ranged from 4.3 to 6.2 seconds, including MSVC/O2. Except for gcc-O3, which managed 1.7 seconds. How did it do that? I'll have to investigate to see how much it cheated.

The point of all this: I was using the recursive Fibonacci benchmark for these tests, as there is little else going on besides function calls and returns. It's perfect. Why some people hate it here, I really don't know.)

> I won't dispute that part. The correct way to do this would be to
> optimize both sides fairly - either not at all, or equivalently (for
> some definition of equivalence). But switching both sides to an
> unoptimized iterative algorithm would be perfectly fair. Recursion is
> NOT essential to the benchmark.

So, what benchmark would you use instead if you wanted an idea of how well each language dealt with it?

-- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 18:05, Steven D'Aprano wrote: On Fri, 23 Feb 2018 13:51:33 +, Ben Bacarisse wrote: Stop writing crap code and then complaining that the language is "too slow". Write better code, and then we'll take your complaints seriously. I write compilers for static languages, which are generally considered toys, partly because the code generated is not great. But in a real application, the difference between my compiler, and gcc-O3 or MSVC/O2 might only be 10-50%. 1.1 to 1.5 times as fast (or as slow), and that's an application that does few external library calls. The difference between Python and another dynamic language might be a magnitude, yet you say it doesn't matter. Thanks, that makes me feel much better about my own work! If a magnitude difference in performance can be so casually dismissed. "Python takes hours to sort a million integers using BubbleSort! Its too damn slow!!!" "Why not use the built-in sort? That's fast." "NOO I MUST USE BUBBLESORT, BECAUSE REASONS!!!1!" *wink* Have you ever written a benchmark in your life? If so, did it do anything useful? Other than what benchmarks are normally used for. -- bartc -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 6:09 AM, Python wrote:
> On Sat, Feb 24, 2018 at 05:56:25AM +1100, Chris Angelico wrote:
>> No, not satisfied. Everything you've said would still be satisfied if
>> all versions of the benchmark used the same non-recursive algorithm.
>> There's nothing here that says it's testing recursion, just that (for
>> consistency) it's testing the same algorithm. There is no reason to
>> specifically test *recursion*, unless that actually aligns with what
>> you're doing.
>
> It seems abundantly clear to me that testing recursion is the point of
> writing a benchmark implementing recursion (and very little of
> anything else). Again, you can decide for yourself the suitability of
> the benchmark, but I don't think you can really claim it doesn't
> effectively test what it means to.

Where do you get that it's specifically a recursion benchmark though? I can't find it anywhere in the explanatory text.

ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 05:56:25AM +1100, Chris Angelico wrote: > No, not satisfied. Everything you've said would still be satisfied if > all versions of the benchmark used the same non-recursive algorithm. > There's nothing here that says it's testing recursion, just that (for > consistency) it's testing the same algorithm. There is no reason to > specifically test *recursion*, unless that actually aligns with what > you're doing. It seems abundantly clear to me that testing recursion is the point of writing a benchmark implementing recursion (and very little of anything else). Again, you can decide for yourself the suitability of the benchmark, but I don't think you can really claim it doesn't effectively test what it means to. -- https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 06:14:22PM +, Steven D'Aprano wrote:
> > I never said the benchmarks chosen were awesome... :-) What I'm saying
> > is this:
> >
> > 1. Insistence that the most efficient python implementation of Fib
> >    completely misses the point (and defeats the purpose) of the
> >    benchmarks, and as such the entire article is lost in the weeds.
>
> The point of the benchmarks is to highlight how great Julia is

For a particular set of common code patterns. You left that part out, and it's the crux of the issue.

> and why people should choose it over Python.

I'd argue not why, but WHEN. Or really I'd argue nothing of the sort... and simply point out that the benchmarks do exactly what the Julia team say they do. It's up to you to decide how useful that is.

> The point of the article is to show how to take badly written,
> unidiomatic, slow code written in Python, and modify it to be better,
> faster code competitive with (if not faster than) Julia, while remaining
> within the Python ecosystem.

Even if that was the point, the article fails, because it compares non-optimized Julia code to optimized Python code.

> We aren't limited to the two options the benchmarks suggest

It actually suggested quite a lot more than that... benchmarks compared the performance of 12 languages, using exactly the same programming patterns (as near as I can tell anyway, I didn't scrutinize all of the code).

-- https://mail.python.org/mailman/listinfo/python-list
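A common way the Python side of such comparisons gets "optimized" is memoisation, which changes the algorithm's complexity class rather than its constant factors; that is exactly the unfairness being debated here. A sketch of the contrast (my own illustration, not code from the article):

```python
from functools import lru_cache

def fib_naive(n):
    # the benchmark's doubly-recursive algorithm: exponential time
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # same source shape, but caching makes it linear time --
    # a different algorithm in all but appearance
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(20), fib_memo(20))  # both print 6765
print(fib_memo(90))                 # instant; fib_naive(90) would never finish
```

Comparing a memoised Python version against the deliberately naive Julia version is comparing two different algorithms, whatever the languages involved.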
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 5:43 AM, Python wrote:
> On Sat, Feb 24, 2018 at 03:42:43AM +1100, Chris Angelico wrote:
>> >> If that were so, then the comparison should use the fastest *Python*
>> >> implementation.
>> >
>> > Doing that would completely fail to accomplish the task of comparing
>> > the performance of recursive function calls in the two languages,
>> > which is what the benchmark was designed to do. So, no actually, it
>> > shouldn't.
>>
>> Where does the author say that the benchmark is designed to compare
>> recursion?
>
> Chris, you're a smart guy... Are you suggesting that the reason the
> fibonacci sequence was selected as a benchmark is because it's such an
> amazingly useful problem that it, in and of itself, warrants having
> such a benchmark? Or, do you think the reason it makes sense to have
> such a benchmark is because, like the reason it's presented in pretty
> much every CS program ever, it presents an opportunity to consider a
> particular class of problems and different techniques for solving
> those problems, and the performance characteristics of those
> solutions?
>
> But, to answer your question more directly, here:
>
> https://julialang.org/benchmarks/
>
> "It is important to note that the benchmark codes are not written
> for absolute maximal performance (the fastest code to compute
> recursion_fibonacci(20) is the constant literal 6765). Instead,
> the benchmarks are written to test the performance of identical
> algorithms and code patterns implemented in each language. For
> example, the Fibonacci benchmarks all use the same (inefficient)
> doubly-recursive algorithm..."
>
> Satisfied?

No, not satisfied. Everything you've said would still be satisfied if all versions of the benchmark used the same non-recursive algorithm. There's nothing here that says it's testing recursion, just that (for consistency) it's testing the same algorithm.
There is no reason to specifically test *recursion*, unless that actually aligns with what you're doing. For instance, when Google Chrome rolled out its new V8 JavaScript engine, it was specifically optimized for recursion, because many Google apps used recursion heavily. If you're implementing a new LISP interpreter, you should probably optimize for recursion, because most LISP code is going to be written recursively. But otherwise, there's no particular reason to stick to recursion.

> So, by changing the algorithm, the article defeats the purpose of the
> benchmark. It makes some fine points about code optimization, but it
> completely fails at its stated purpose (to make the benchmarks more
> fair). The comparisons it makes are substantially less valid than the
> ones made by the Julia benchmarks, on account of optimizing only the
> algorithm used by Python, and not testing with a similarly optimized
> algorithm in Julia, but rather using its results for the intentionally
> unoptimized algorithm those benchmarks used. Even if testing
> optimized code is the point, as the article claims, it utterly fails
> to do that. Bad science.

I won't dispute that part. The correct way to do this would be to optimize both sides fairly - either not at all, or equivalently (for some definition of equivalence). But switching both sides to an unoptimized iterative algorithm would be perfectly fair. Recursion is NOT essential to the benchmark.

ChrisA -- https://mail.python.org/mailman/listinfo/python-list
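Chris's "unoptimized iterative algorithm" is easy to state precisely. A sketch of both shapes, to make the comparison concrete (my own wording of the code, not anything from the benchmark suite):

```python
def fib_recursive(n):
    # the doubly-recursive shape the Julia benchmarks use
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # the plain, unoptimized iterative equivalent
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_recursive(20), fib_iterative(20))  # both print 6765
```

Both are trivial to write in any of the benchmarked languages, so porting the iterative version to every side would keep the comparison fair.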
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 03:42:43AM +1100, Chris Angelico wrote:
> >> If that were so, then the comparison should use the fastest *Python*
> >> implementation.
> >
> > Doing that would completely fail to accomplish the task of comparing
> > the performance of recursive function calls in the two languages,
> > which is what the benchmark was designed to do. So, no actually, it
> > shouldn't.
>
> Where does the author say that the benchmark is designed to compare
> recursion?

Chris, you're a smart guy... Are you suggesting that the reason the fibonacci sequence was selected as a benchmark is because it's such an amazingly useful problem that it, in and of itself, warrants having such a benchmark? Or, do you think the reason it makes sense to have such a benchmark is because, like the reason it's presented in pretty much every CS program ever, it presents an opportunity to consider a particular class of problems and different techniques for solving those problems, and the performance characteristics of those solutions?

But, to answer your question more directly, here:

https://julialang.org/benchmarks/

"It is important to note that the benchmark codes are not written for absolute maximal performance (the fastest code to compute recursion_fibonacci(20) is the constant literal 6765). Instead, the benchmarks are written to test the performance of identical algorithms and code patterns implemented in each language. For example, the Fibonacci benchmarks all use the same (inefficient) doubly-recursive algorithm..."

Satisfied?

> Recursion is sometimes a good way to describe an algorithm, but
> rarely a good way to implement it at a low level.

I'm well aware, and said the equivalent elsewhere. As I also said elsewhere, I never claimed it was a particularly useful benchmark. It is, nevertheless, designed to accomplish the stated goal, and does exactly that. You can decide for yourself how useful that goal is, but you can't really argue that it doesn't serve that purpose.
So, by changing the algorithm, the article defeats the purpose of the benchmark. It makes some fine points about code optimization, but it completely fails at its stated purpose (to make the benchmarks more fair). The comparisons it makes are substantially less valid than the ones made by the Julia benchmarks, on account of optimizing only the algorithm used by Python, and not testing with a similarly optimized algorithm in Julia, but rather using its results for the intentionally unoptimized algorithm those benchmarks used. Even if testing optimized code is the point, as the article claims, it utterly fails to do that. Bad science. -- https://mail.python.org/mailman/listinfo/python-list
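For concreteness, here is a sketch (in Python; the function names are mine, not the benchmark's) of the two algorithms being argued over: the doubly-recursive fib the Julia benchmarks deliberately mandate, and the kind of algorithmically-improved variant the article substitutes. They produce the same answer while exercising wildly different numbers of calls, which is exactly the objection above.

```python
from functools import lru_cache

def fib_naive(n):
    """The (inefficient) doubly-recursive algorithm the Julia
    benchmarks use identically in every language."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """A memoized variant -- the kind of algorithmic change the article
    makes. It changes what is being measured: O(n) calls instead of
    exponentially many."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(20))  # 6765, the value the benchmark page itself cites
print(fib_memo(20))   # 6765, via ~21 calls instead of ~21891
```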
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 5:05 AM, Steven D'Aprano wrote:
> But guess what? The benchmarks are flawed. The performance of real-world
> Julia code doesn't match the performance of the benchmarks.
>
> "What’s disappointing is the striking difference between
> the claimed performance and the observed one. For example,
> a trivial hello world program in Julia runs ~27x slower
> than Python’s version and ~187x slower than the one in C."
>
> http://zverovich.net/2016/05/13/giving-up-on-julia.html
>
> Admittedly "Hello World" is not what *I* would call a real world program,
> but that's just an illustration of the discrepancy between the
> performance in artificial benchmarks and the performance in useful code.

No, but Hello World is a good test of application startup time. (A null program doesn't always test everything, and doesn't look very good either.) If Hello World takes half a second to run, you have a language that is terrible for short scripts or command-line tools. That doesn't mean it's a "slow language" necessarily - nor even a slow interpreter, which is what you're actually testing - but it does mean that, well...

rosuav@sikorsky:~$ time pypy -c 'print("Hello, world!")'
Hello, world!

real    0m0.429s
user    0m0.036s
sys     0m0.032s

rosuav@sikorsky:~$ time python -c 'print("Hello, world!")'
Hello, world!

real    0m0.024s
user    0m0.016s
sys     0m0.008s

... CPython does have its place still. Or maybe...

rosuav@sikorsky:~$ time pypy -c 'print("Hello, world!")'
Hello, world!

real    0m0.050s
user    0m0.016s
sys     0m0.032s

rosuav@sikorsky:~$ time python -c 'print("Hello, world!")'
Hello, world!

real    0m0.022s
user    0m0.016s
sys     0m0.004s

... that PyPy doesn't live in my disk cache the way CPython does. :) PyPy is still slower than CPython for startup time, but not THAT much slower. However, if you DO have something that has a ton of startup overhead, you need to know about it.
(Interestingly, I was actually able to compile and run a C hello-world in comparable time to PyPy, once the cache was warmed up. Curious. I expected that to be slower. Once again, it proves that intuition can be extremely wrong.)

In terms of "which language is faster?", real-world code is both the ONLY way to measure, and a completely unfair comparison. Life is tough.

ChrisA
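The `time interpreter -c ...` measurements above can be reproduced from inside Python; this is a rough sketch of the same idea (best-of-N wall-clock time, so the warm-cache effect noted above shows up as the difference between the first and later runs). The interpreter path is whatever you pass in; substitute "pypy" if you have it installed.

```python
import subprocess
import sys
import time

def startup_time(interpreter, runs=5):
    """Best-of-N wall-clock time to launch `interpreter` and print
    hello world -- the spirit of `time python -c ...` above."""
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run([interpreter, "-c", 'print("Hello, world!")'],
                       capture_output=True, check=True)
        best = min(best, time.perf_counter() - t0)
    return best

# sys.executable is whatever interpreter runs this script.
print(f"{sys.executable}: {startup_time(sys.executable):.3f}s")
```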
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 4:49 AM, Steven D'Aprano wrote:
> On Sat, 24 Feb 2018 00:03:06 +1100, Chris Angelico wrote:
>>> Is numpy a general purpose C library that can also be called from any
>>> language that can use a C API? Or is it specific to Python?
>>
>> No, it's a general purpose FORTRAN library that can also be called from
>> any language that can use a generic C, FORTRAN, COBOL, etc API.
>
> Numpy itself is a collection of Python interfaces to a number of C and
> Fortran libraries. You can't generally call numpy from other languages --
> you can't even call numpy from other Python implementations, unless they
> support calling C/Fortran. Jython, for example, can't do it without help:
>
> https://jyni.org/
>
> and while PyPy can call numpy, doing so is slow, and there's a custom
> fork of numpy specially for PyPy:
>
> https://bitbucket.org/pypy/numpy

Right. I was kinda handwaving a bit; numpy itself isn't the thing you'd call from elsewhere, but BLAS libraries in general are intended to be available from many languages. Point is, it's not Python-specific.

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 10:50:50 -0600, Python wrote:
> On Fri, Feb 23, 2018 at 01:38:01AM -0500, Terry Reedy wrote:
>> It is no secret that Python's interpreted function calls are slower
>> than function calls compiled to machine code and that Python's infinite
>> precision integer arithmetic is slower than machine int arithmetic.
>> So there is almost no point in combining both in a naive drastically
>> inefficient algorithm and declaring that Python is slower.
>
> I never said the benchmarks chosen were awesome... :-) What I'm saying
> is this:
>
> 1. Insistence on the most efficient python implementation of Fib
>    completely misses the point (and defeats the purpose) of the
>    benchmarks, and as such the entire article is lost in the weeds.

The point of the benchmarks is to highlight how great Julia is, and why people should choose it over Python.

The point of the article is to show how to take badly written, unidiomatic, slow code written in Python, and modify it to be better, faster code competitive with (if not faster than) Julia, while remaining within the Python ecosystem.

We aren't limited to the two options the benchmarks suggest: "Fast Julia! Slow Python!". (Three, if you include C.) We can take a third option: "we reject your crappy algorithms and write code like a professional".

https://allthetropes.org/wiki/Take_a_Third_Option

-- Steve
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 13:51:33 +, Ben Bacarisse wrote: [...] >> I don't know why the Julia programmers chose to use such a poor >> algorithm: > > It's odd indeed, but given that they did, what you take to be the point > of the article -- to write a good Python algorithm as fast as the > terrible Julia one -- seems a bit pointless. Someone looking at the Julia benchmarks will look at the results and go, "Wow! Julia is so much faster than Python, I should choose to use Julia instead!". But guess what? The benchmarks are flawed. The performance of real-world Julia code doesn't match the performance of the benchmarks. "What’s disappointing is the striking difference between the claimed performance and the observed one. For example, a trivial hello world program in Julia runs ~27x slower than Python’s version and ~187x slower than the one in C." http://zverovich.net/2016/05/13/giving-up-on-julia.html Admittedly "Hello World" is not what *I* would call a real world program, but that's just an illustration of the discrepancy between the performance in artificial benchmarks and the performance in useful code. The benchmarks emphasise what Julia is good at, and use poor, unidiomatic Python code. People who are either too naive to know better, or who *do* know better but for some bizarre reason still take benchmarks at face value, believe them, and conclude that Julia is faster than Python. Not just a bit faster, but enough to conclude that Python is uncompetitive. Okay, so Julia is fast, at least for toy benchmarks, maybe or maybe not for actual code you care about. Great. Python is fast too, if you stop writing shitty code. The *FIRST* lesson in optimization is to re-write your crap implementation with a better algorithm. The point of the post is to teach that lesson. Stop writing crap code and then complaining that the language is "too slow". Write better code, and then we'll take your complaints seriously. 
Or, you never know, maybe you'll find it's fast enough and you don't actually have to abandon your existing code base and migrate to a whole new language.

"Python takes hours to sort a million integers using BubbleSort! Its too damn slow!!!"

"Why not use the built-in sort? That's fast."

"NOO I MUST USE BUBBLESORT, BECAUSE REASONS!!!1!"

*wink*

-- Steve
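The BubbleSort jab above is easy to make concrete; this is a sketch (mine, not from the thread) showing the "better algorithm first" lesson on a few thousand integers. The exact timings will vary by machine, but the built-in Timsort wins by orders of magnitude.

```python
import random
import time

def bubble_sort(a):
    """Textbook bubble sort: O(n**2) comparisons."""
    a = list(a)
    n = len(a)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:  # already sorted: stop early
            break
    return a

data = [random.randrange(10**6) for _ in range(2000)]

t0 = time.perf_counter()
result = bubble_sort(data)
t_bubble = time.perf_counter() - t0

t0 = time.perf_counter()
expected = sorted(data)
t_builtin = time.perf_counter() - t0

assert result == expected
print(f"bubble: {t_bubble:.3f}s, built-in: {t_builtin:.5f}s")
```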
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 12:17:50 +, bartc wrote:
> On 23/02/2018 01:27, Steven D'Aprano wrote:
>> On Thu, 22 Feb 2018 17:53:30 +, bartc wrote:
>>> The actual result is irrelevant, so long as it's correct. The important
>>> thing is those 50 million calls.
>>
>> Why do you care about the 50 million calls?
>
> Because we are interested in comparing call-efficiency across languages?

Are we? What on earth makes you think that?

[...]

> You're responsible for implementing the call-mechanism of a new
> language, and you want to make it efficient. What program do you test it
> with, one that contains zero calls (clearly, that will be most
> efficient!), or one that contains lots of calls?

This has nothing to do with the topic under consideration. It is irrelevant. This discussion has nothing, absolutely nothing, to do with "implementing the call-mechanism of a new language". Perhaps you have missed that Python is a mature language well over 20 years old now, and even Julia is a few years old.

[...]

> OK, then, if you are at all interested in Julia vs. Python (and it was
> you who posted that article!), then compare a version of a benchmark
> that uses as few calls as possible, if you like. But what exactly is
> that benchmark then highlighting? Loops?

What on earth makes you think that the point of this is to highlight any one particular feature of a language? Loops, calls, addition, who cares? The point is to show how to take badly written slow code written in Python, and modify it to be better, faster code while remaining within the Python ecosystem.

[...]

> OK, then do as I did, and keep a global tally of how many calls it
> makes, and print that as the result. Then the correct output for
> 'fibcalls(36)' should be:
>
>     48315633
>
> If yours shows only 37, then /your/ program is wrong.

I'm not interested in such a function.
The words have not yet been invented to explain how little I care about how fast Python or Julia can make millions and millions of pointless function calls for no good reason.

But if it makes you feel good about yourself, if it makes you feel that you've won some sort of little victory, then I concede: If a programmer cares about writing useless, crappy, pointless code that just makes function call after function call for no good reason, then Python will be terrible for the job. I would never use Python for such code, it is awful at it. If you think it is important to run an empty loop a billion times in a nanosecond, Python will be crap at that too. Python is only useful for writing useful, meaningful code that gets useful work done, not for your "fibcalls" function.

If you have a need for "fibcalls", then go ahead and use Julia, or C, or whatever language you prefer, and be glad that you have a wide choice of languages to choose from.

-- Steve
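For the record, the call tally bartc refers to can be sketched like this (the code is mine, not his). The closed form calls(n) = 2*fib(n+1) - 1 reproduces his 48315633 figure for n = 36 without sitting through all 48 million calls:

```python
def fibcalls(n):
    """Count the calls the doubly-recursive fib actually makes:
    each call counts itself plus its two children."""
    if n < 2:
        return 1
    return 1 + fibcalls(n - 1) + fibcalls(n - 2)

def fib(n):
    """Iterative Fibonacci, used only to check the identity cheaply."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# The identity calls(n) == 2*fib(n+1) - 1, checked at a small n:
assert fibcalls(20) == 2 * fib(21) - 1
print(fibcalls(20))          # 21891
print(2 * fib(37) - 1)       # 48315633, the figure quoted for fibcalls(36)
```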
Re: How to make Python run as fast (or faster) than Julia
On Sat, 24 Feb 2018 00:03:06 +1100, Chris Angelico wrote:
>> Is numpy a general purpose C library that can also be called from any
>> language that can use a C API? Or is it specific to Python?
>
> No, it's a general purpose FORTRAN library that can also be called from
> any language that can use a generic C, FORTRAN, COBOL, etc API.

Numpy itself is a collection of Python interfaces to a number of C and Fortran libraries. You can't generally call numpy from other languages -- you can't even call numpy from other Python implementations, unless they support calling C/Fortran. Jython, for example, can't do it without help:

https://jyni.org/

and while PyPy can call numpy, doing so is slow, and there's a custom fork of numpy specially for PyPy:

https://bitbucket.org/pypy/numpy

Remember that numpy showcases the exact reason Python was invented: to act as an easy to use "glue language" to join together efficient and useful, but not so easy to use, libraries written in C and Fortran. Jython was invented to support Java libraries, and IronPython to make it easy to integrate with the Dot-Net ecosystem.

-- Steve
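The "glue language" point is easy to demonstrate even without numpy: ctypes, in the standard library, will call straight into a compiled C library. A minimal sketch, assuming a Unix-like system where the C math library can be located (the fallback name is a glibc assumption):

```python
import ctypes
import ctypes.util

# Locate and load the C math library; "libm.so.6" is a glibc fallback.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the C signature: double cos(double)
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0 -- computed by the C routine, not Python's math.cos
```

This is the same pattern, writ small, that numpy uses at scale: Python as the convenient front end, compiled code doing the arithmetic.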
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 01:38:01AM -0500, Terry Reedy wrote:
> It is no secret that Python's interpreted function calls are slower
> than function calls compiled to machine code and that Python's
> infinite precision integer arithmetic is slower than machine int
> arithmetic. So there is almost no point in combining both in a
> naive drastically inefficient algorithm and declaring that Python is
> slower.

I never said the benchmarks chosen were awesome... :-) What I'm saying is this:

1. Insistence on the most efficient python implementation of Fib completely misses the point (and defeats the purpose) of the benchmarks, and as such the entire article is lost in the weeds.

2. Inasmuch as someone might want to measure the performance of recursive function calls in a particular language, the choice of recursively implementing fib is a fine one, and useful for that purpose.

3. If you want to say that choosing the most efficient implementation in Python is the only way to do a fair comparison, then you must also choose the most efficient implementation in Julia, which this is not, by all appearances. But again, this misses the point of the benchmark and defeats its purpose.

Now if you want to complain that the purpose is silly, I won't even argue that... In 30 years of writing code I don't think I've ever implemented a recursive solution to any problem that wasn't purely academic. Maybe once or twice... maybe. But the purpose is what it is, and arguing that it must be done a different way is bogus.
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 3:39 AM, Steven D'Aprano wrote:
> On Fri, 23 Feb 2018 23:41:44 +1100, Chris Angelico wrote:
>
> [...]
>>> Integer pixel values
>>
>> Maybe in 64 bits for the time being, but 32 certainly won't be enough.
>> As soon as you do any sort of high DPI image manipulation, you will
>> exceed 2**32 total pixels in an image (that's just 65536x65536, or
>> 46341x46341 if you're using signed integers);
>
> I don't think there's any sensible reason to use signed ints for pixel
> offsets.

How about "my language's default integer type is signed"? That's what gave us all manner of problems, like the way a whole lot of old (and some not even all THAT old) programs have trouble if you have more than 2GB of free space - because they query disk free space as a 32-bit number, then interpret that number as signed. (If, as on many modern systems, you have hundreds or thousands of gig free, it depends whether that number modulo 4GB is more than 2GB, which is a very fun thing to manage.)

>> and it wouldn't surprise
>> me if some image manipulation needs that many on a single side - if not
>> today, then tomorrow. So 64 bits might not be enough once you start
>> counting total pixels.
>
> 64-bit int will limit your image size to a maximum of 4294967296 x
> 4294967296.
>
> If you stitched the images from, let's say, the Hubble Space Telescope
> together into one panoramic image of the entire sky, you'd surely exceed
> that by a couple of orders of magnitude. There's a lot of sky and the
> resolution of the Hubble is very fine.
>
> Or made a collage of all the duck-face photos on Facebook *wink*
>
> But you wouldn't be able to open such a file on your computer. Not until
> we get a 128-bit OS :-)

Hmm, yeah, I didn't actually do the math on that, heh.

> While all of this is very interesting, let's remember that listing all
> the many things which can be counted in 64 bits (or 128 bits, or 1024
> bits...) doesn't disprove the need for bignums.
> The argument Bart makes is to list all the things that don't need
> refrigerating:
>
> - tinned beans
> - nuts and bolts
> - DVDs
> - books
> - spoons
>
> as evidence that we don't need refrigerators. Yeah, I get it, there are
> approximately a hundred thousand million billion trillion things that
> don't need bignums. There are so many things that don't need bignums, you
> need a bignum to count them all!

Touche!

> And I don't give a rat's arse that this means that adding 1 to 10 to get
> 11 in Python takes thirty nanoseconds instead of ten nanoseconds as a
> consequence. If I cared about that, I wouldn't be using Python.

Which is why you don't do crypto in Python - because you DO care about whether it takes thirty sometimes and ten sometimes, and various other things. So there *are* situations where you specifically do not want bignums; but they're a lot rarer than situations where you don't care what the underlying implementation is, and it's incredibly freeing to not have to take integer magnitude limitations into account.

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Sat, Feb 24, 2018 at 3:32 AM, Python wrote:
> On Fri, Feb 23, 2018 at 03:11:36AM -0500, Terry Reedy wrote:
>> >> Why do you care about the 50 million calls? That's crazy -- the important
>> >> thing is *calculating the Fibonacci numbers as efficiently as possible*.
>>
>> > If you are writing practical programs, that's true. But the Julia
>> > benchmarks are not practical programs; they are designed to compare
>> > the performance of various language features across a range of
>> > languages.
>>
>> If that were so, then the comparison should use the fastest *Python*
>> implementation.
>
> Doing that would completely fail to accomplish the task of comparing
> the performance of recursive function calls in the two languages,
> which is what the benchmark was designed to do. So, no actually, it
> shouldn't.

Where does the author say that the benchmark is designed to compare recursion?

If you specifically test recursion, some language interpreters will perform really well (because they convert your recursion into iteration) and others will not (because they faithfully execute the code you actually ask them to). Recursion is sometimes a good way to describe an algorithm, but rarely a good way to implement it at a low level.

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 23:41:44 +1100, Chris Angelico wrote: [...] >> Integer pixel values > > Maybe in 64 bits for the time being, but 32 certainly won't be enough. > As soon as you do any sort of high DPI image manipulation, you will > exceed 2**32 total pixels in an image (that's just 65536x65536, or > 46341x46341 if you're using signed integers); I don't think there's any sensible reason to use signed ints for pixel offsets. > and it wouldn't surprise > me if some image manipulation needs that many on a single side - if not > today, then tomorrow. So 64 bits might not be enough once you start > counting total pixels. 64-bit int will limit your image size to a maximum of 4294967296 x 4294967296. If you stitched the images from, let's say, the Hubble Space Telescope together into one panoramic image of the entire sky, you'd surely exceed that by a couple of orders of magnitude. There's a lot of sky and the resolution of the Hubble is very fine. Or made a collage of all the duck-face photos on Facebook *wink* But you wouldn't be able to open such a file on your computer. Not until we get a 128-bit OS :-) While all of this is very interesting, let's remember that listing all the many things which can be counted in 64 bits (or 128 bits, or 1024 bits...) doesn't disprove the need for bignums. The argument Bart makes is to list all the things that don't need refrigerating: - tinned beans - nuts and bolts - DVDs - books - spoons as evidence that we don't need refrigerators. Yeah, I get it, there are approximately a hundred thousand million billion trillion things that don't need bignums. There are so many things that don't need bignums, you need a bignum to count them all! I'm truly happy for Bart that he never, or almost never, uses numbers larger than 2**64. I just wish he would be less of a prig about this and stop trying to deny me my bignums. 
One of the things I like about Python is that I stopped needing to worry about integer overflow back around Python 2.2 when the default integer type became a bignum. I used to write code like:

    try:
        calculation(n)
    except OverflowError:
        calculation(long(n))

all the time back then. Now, if I have an integer calculation, I just calculate it without having to care whether it fits in 64 bits or not. And I don't give a rat's arse that this means that adding 1 to 10 to get 11 in Python takes thirty nanoseconds instead of ten nanoseconds as a consequence. If I cared about that, I wouldn't be using Python.

Just the other day I needed to calculate 23! (factorial) for a probability calculation, and Python made it trivially easy. I didn't need to import a special library, or use special syntax to say "use a bignum". I just multiplied 1 through 23.

Why, just now, on a whim, simply because I can, I calculated 100!! (double-factorial):

    py> reduce(operator.mul, range(100, 0, -2), 1)
    34243224702511976248246432895208185975118675053719198827915654463488000000000000

(Yes, I'm a maths geek, and I love this sort of thing. So sue me.)

There are dozens of languages that have made the design choice to limit their default integers to 16- 32- or 64-bit fixed size, and let the user worry about overflow. Bart, why does it upset you so that Python made a different choice?

-- Steve
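That double-factorial one-liner can be cross-checked with a little algebra: 100!! = 100·98·…·2 = 2^50 · 50!, so Python's bignums let you verify the whim directly (the checking code is mine, not from the post):

```python
from functools import reduce
import math
import operator

# 100!! exactly as in the post
dfact = reduce(operator.mul, range(100, 0, -2), 1)

# Identity check: (2n)!! == 2**n * n!, here with n = 50
assert dfact == math.factorial(50) * 2**50

# 50! contributes exactly twelve factors of 5 (10 multiples of 5,
# plus 2 extra from 25 and 50), so 100!! ends in twelve zeros.
assert str(dfact).endswith("0" * 12)

print(dfact)
```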
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 03:11:36AM -0500, Terry Reedy wrote:
> >> Why do you care about the 50 million calls? That's crazy -- the important
> >> thing is *calculating the Fibonacci numbers as efficiently as possible*.
>
> > If you are writing practical programs, that's true. But the Julia
> > benchmarks are not practical programs; they are designed to compare
> > the performance of various language features across a range of
> > languages.
>
> If that were so, then the comparison should use the fastest *Python*
> implementation.

Doing that would completely fail to accomplish the task of comparing the performance of recursive function calls in the two languages, which is what the benchmark was designed to do. So, no actually, it shouldn't.
Re: How to make Python run as fast (or faster) than Julia
Steven D'Aprano writes:
> On Fri, 23 Feb 2018 00:26:33 +, bartc wrote:
>> The point of the article was Julia vs. Python. You can't make Python
>> faster by switching to a faster algorithm; you have to use the same one
>> on both.
>
> No, the point of the article was to write Python code that is as fast as
> the Julia code.

I did not get that impression. In fact I was not really sure what the point was. At the start, the author says "My take on this kind of cross language comparison is that the benchmarks should be defined by tasks to perform, then have language experts write the best code they can to perform these tasks." but if that is the point (rather than the one you picked up on) then the article could be, at best, only half the story.

> I don't know why the Julia programmers chose to use such a poor
> algorithm:

It's odd indeed, but given that they did, what you take to be the point of the article -- to write a good Python algorithm as fast as the terrible Julia one -- seems a bit pointless.

-- Ben.
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 12:41, Chris Angelico wrote:
> On Fri, Feb 23, 2018 at 11:17 PM, bartc wrote:
>> Integer pixel values
>
> Maybe in 64 bits for the time being, but 32 certainly won't be enough.

Why, people's eyes will evolve to see quintillions of colours?

> As soon as you do any sort of high DPI image manipulation, you will
> exceed 2**32 total pixels in an image (that's just 65536x65536, or
> 46341x46341 if you're using signed integers)

(Huh?)

> ; and it wouldn't surprise me if some image manipulation needs that
> many on a single side - if not today, then tomorrow. So 64 bits might
> not be enough once you start counting total pixels.

A four-billion x four-billion image would be quite something. I wonder how long it would take CPython to loop through all 18e18 pixels? (BTW this still fits into unsigned 64 bits.)

These speculations are pointless. Graphics is very hardware-oriented now, and this is mainly the concern of graphics processors which will already have wide datapaths, much wider than 64 bits. Because you can of course use more than one 64-bit value.

>> Most calculations on these can also be done in 64 bits (except multiply
>> and power where the ceiling might be 32 bits or lower).
>
> Exactly. Most.

OK, good, we've agreed on something: Most. Actually most values in my programs fit into 32 bits. Most of /those/ are likely to fit into 16 bits. In a program, most integers are going to represent small values.

> And then you'll run into some limitation somewhere. Let's say you want
> to know how many IP addresses, on average, are allocated to one person.
> You'll probably get a single digit number as your answer. Can you
> calculate that using 32-bit integers? What about 64-bit? Should you use
> bignums just in case? Take your pick, let me know, and then I'll tell
> you my answer - and why.

I'm not saying bignums are needed 0% of the time. But they don't need to be used for 100% of all integer values, just in case.

> What sort of handicap is it? Performance?

I don't know.
Wasn't that brought up in the discussion? That Python's fib() routine could seamlessly accommodate bigger values as needed, and Julia's couldn't, and maybe that was one reason that Julia had the edge.

> Or maybe, in Bartville, this flexibility is worth paying for but large
> integers are useless?

In 40 years of coding I can say that I have very rarely needed arbitrary precision integer values. (They come up in some recreational programs, or for implementing arbitrary precision integers in the languages used to run those programs!) But I have needed flexible strings and lists all the time.

Is my programming experience really that different from anybody else's?

-- bartc
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 11:57 PM, bartc wrote:
> On 23/02/2018 08:11, Terry Reedy wrote:
>> * Python has an import statement. But 'comparisons' disallow 'import
>> numpy', a quite legal Python statement, and similar others.
>
> If I'm duplicating a benchmark [in another language] then the last thing I
> want to see is something like 'import numpy', or other special language
> feature, as that cannot be replicated.
>
> I want to see a pure algorithm. Not the equivalent of:
>
>     os.system("C_version.exe")
>
>> The ability to import Python wrappers of Fortran and C libraries is part
>> of the fundamental design of cpython. It is a distributed project.
>
> Is numpy a general purpose C library that can also be called from any
> language that can use a C API? Or is it specific to Python?

No, it's a general purpose FORTRAN library that can also be called from any language that can use a generic C, FORTRAN, COBOL, etc API.

ChrisA
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 08:11, Terry Reedy wrote:
> * Python has an import statement. But 'comparisons' disallow 'import
> numpy', a quite legal Python statement, and similar others.

If I'm duplicating a benchmark [in another language] then the last thing I want to see is something like 'import numpy', or other special language feature, as that cannot be replicated.

I want to see a pure algorithm. Not the equivalent of:

    os.system("C_version.exe")

> The ability to import Python wrappers of Fortran and C libraries is part
> of the fundamental design of cpython. It is a distributed project.

Is numpy a general purpose C library that can also be called from any language that can use a C API? Or is it specific to Python?

-- bartc
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 11:17 PM, bartc wrote:
>>> The fact is that the vast majority of integer calculations don't need to
>>> use big integers (pretty much 100% of mine). Probably most don't even
>>> need 64 bits, but 32 bits.
>>
>> And here we have the World According To Bart again: "since *I* don't need
>> more than 32-bits, then obviously it is a FACT that nobody else could
>> need them".
>
> Do you have better statistics than me? If so please share. I guess the
> number of calculations (that /can/ be done in 64 bits) is less than 100%,
> but not 0%, so somewhere in between.
>
> I'm saying it's nearer to 100% than to 0%; you're saying, what . ?

I don't know what he's saying, but I'm saying that unless it is 100%, your language should just natively support bigints. How do you know when you'll need that extra room? Are you going to pick bigints for everything that's affected by user input? If so, you may as well use them for everything and not bother with the distinction; if not, some day, your program will need larger numbers than you guessed for it, and it'll break.

> As examples of quantities that can be represented in 64 bits:
>
> Character codes or code points

Sure. That's why we have a special optimization for sequences of 21-bit numbers. We call it "str".

> Integer pixel values

Maybe in 64 bits for the time being, but 32 certainly won't be enough. As soon as you do any sort of high DPI image manipulation, you will exceed 2**32 total pixels in an image (that's just 65536x65536, or 46341x46341 if you're using signed integers); and it wouldn't surprise me if some image manipulation needs that many on a single side - if not today, then tomorrow. So 64 bits might not be enough once you start counting total pixels.

> Most enumerations
> For-loop counting (if it needs more than 64 bits, then we might be
> waiting a while for it to finish)

Most loops don't need any counting. Should we implement a 0-bit integer for those?
> Array indexing
> Numbers of people, cars, guns, countries, houses, books, iphones etc
> on Earth
> ... endless examples of numbers less than 900 used
> in programming

As soon as you say "the number of X is always going to be less than Y", you open yourself up to risk. https://xkcd.com/865/

> Most calculations on these can also be done in 64 bits (except multiply and
> power where the ceiling might be 32 bits or lower).

Exactly. Most. And then you'll run into some limitation somewhere. Let's say you want to know how many IP addresses, on average, are allocated to one person. You'll probably get a single digit number as your answer. Can you calculate that using 32-bit integers? What about 64-bit? Should you use bignums just in case? Take your pick, let me know, and then I'll tell you my answer - and why.

>> Yes Bart, the reason that so many languages have support for Bignums is
>> that everyone except you is an idiot who loves writing slow code for
>> absolutely no reason at all.
>
> If bignum support is a handicap, even when they are not needed, then it is
> something to consider.

What sort of handicap is it? Performance? That's what optimization is for. Some languages (Pike, for instance) have a single data type that is sometimes represented internally by a native integer and sometimes by a bignum; it's completely transparent apart from with timings. Others (eg Python 2) have two separate data types, but will smoothly transition to bignum any time it's needed. IMO CPython 3.x could probably benefit from the transparent optimization... but Python (the language) benefits hugely from the core int type simply being a bignum. There are no semantically significant effects at the boundary between "small numbers" and "large numbers".

Auto-growing lists are a cost, too. It's expensive to have to allocate more memory to allow another element to be added. Should we lock in the sizes of all lists/arrays at creation? Why should we pay the price for that flexibility?
Or maybe, in Bartville, this flexibility is worth paying for but large
integers are useless?

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
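[Editorial aside, not part of the thread: ChrisA's claim above - that
Python 3's int has no semantically significant boundary between small and
large values - is easy to demonstrate with standard Python; nothing here
is from the original posts.]

```python
# Python 3 has one integer type: values beyond 64 bits are handled
# transparently, with no overflow and no separate "bignum" type.
small = 2 ** 30           # fits comfortably in a machine word
big = 2 ** 100            # far beyond 64 bits

print(type(small) is type(big))   # True - one type, no boundary
print(big * big == 2 ** 200)      # True - arithmetic just keeps working
```

In a fixed-width language the second computation would wrap, saturate, or
raise, depending on the language; in Python the distinction is purely an
internal optimization.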
Re: How to make Python run as fast (or faster) than Julia
On 23/02/2018 01:27, Steven D'Aprano wrote:
> On Thu, 22 Feb 2018 17:53:30 +, bartc wrote:
>
>> The actual result is irrelevant, so long as it's correct. The
>> important thing is those 50 million calls.
>
> Why do you care about the 50 million calls?

Because we are interested in comparing call-efficiency across languages?
NOT in Fibonacci numbers.

> That's crazy -- the important thing is *calculating the Fibonacci
> numbers as efficiently as possible*. I don't believe for a second that
> you would write code like this:
>
> # need a Fibonacci number, but the implementation we have

We don't. If Fibonacci causes problems for you because you can't see past
the inefficiency, then we might use the Ackermann function instead. (But
I've found Ackermann can cause stack overflow in some languages.)

> so why on earth do you care about the 50 million calls? The only reason
> I care about them is to reduce the number as much as possible.

Why do you care about them? You're responsible for implementing the
call-mechanism of a new language, and you want to make it efficient. What
program do you test it with, one that contains zero calls (clearly, that
will be most efficient!), or one that contains lots of calls?

>     arr := some array
>     for i := 1 to len(arr):
>         value = arr[i]
>         process(value)
>
> If I'm writing Ruby or Python, that means *NOT* writing loops like
> that, but writing them like this:
>
>     for value in arr:
>         process(value)

Suppose it turned out that in Python, /this/ was actually faster:

    for i in range(len(arr)):
        value = arr[i]
        process(value)

?

> I don't give a flying fox about how fast the compiler can do those 48
> million calls, I want to know how to get Fibonacci numbers as fast as I
> can, and if that means avoiding making the calls, then you better
> believe that I will avoid making those calls.

OK, then, if you are at all interested in Julia vs. Python (and it was
you who posted that article!), then compare a version of a benchmark that
uses as few calls as possible, if you like.
But what exactly is that benchmark then highlighting? Loops?

By running a different program? Fine, then here's my submission for the
N=20 case used in the tests:

    def fib(n): return 6765

> The program has to be *correct*, and your program is not.

OK, then do as I did, and keep a global tally of how many calls it makes,
and print that as the result. Then the correct output for 'fibcalls(36)'
should be:

    48315633

If yours shows only 37, then /your/ program is wrong. (BTW, this value
will always fit in 64 bits, for the foreseeable future.)

> When comparing two compilers, you are ALWAYS comparing two different
> programs. That's part of the art of doing it properly. Either the
> compilers generate precisely the same machine/byte code, in which case
> there is no point in comparing them[1], or they generate different
> machine/byte code, in which case you're comparing two different running
> programs.

So the people who worked for years on PyPy never once bothered to compare
results with running the same source program with CPython, because there
would be no point? OK, I never knew that...

>> The fact is that the vast majority of integer calculations don't need
>> to use big integers (pretty much 100% of mine). Probably most don't
>> even need 64 bits, but 32 bits.
>
> And here we have the World According To Bart again: "since *I* don't
> need more than 32-bits, then obviously it is a FACT that nobody else
> could need them".

Do you have better statistics than me? If so please share. I guess the
number of calculations (that /can/ be done in 64 bits) is less than 100%,
but not 0%, so somewhere in between.

I'm saying it's nearer to 100% than to 0%; you're saying, what... ?

As examples of quantities that can be represented in 64 bits:

  Character codes or code points
  Integer pixel values
  Most enumerations
  For-loop counting (if it needs more than 64 bits, then we might be
  waiting a while for it to finish)
  Array indexing
  Numbers of people, cars, guns, countries, houses, books, iphones etc
  on Earth
  ...
  endless examples of numbers less than 900 used in programming

Most calculations on these can also be done in 64 bits (except multiply
and power where the ceiling might be 32 bits or lower).

> Yes Bart, the reason that so many languages have support for Bignums is
> that everyone except you is an idiot who loves writing slow code for
> absolutely no reason at all.

If bignum support is a handicap, even when they are not needed, then it
is something to consider.

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list
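[Editorial aside, not part of the thread: bartc's "keep a global tally of
how many calls it makes" idea can be sketched as below. The `fibcalls`
helper is hypothetical, but the call count follows directly from the
naive recurrence: 48,315,633 calls for n=36, as cited in the post, and
21,891 calls for the N=20 case.]

```python
# Sketch of the call-tally benchmark: every entry into fib() bumps a
# global counter, so two implementations can be checked against each
# other by the amount of work done, not just the final value.
calls = 0

def fib(n):
    global calls
    calls += 1                      # one tick per call, however cheap
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def fibcalls(n):
    global calls
    calls = 0
    fib(n)
    return calls

print(fibcalls(20))   # 21891 calls to compute fib(20) == 6765
# fibcalls(36) prints 48315633, matching the figure in the thread, but
# takes a long while in CPython - for exactly the reason under debate.
```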
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 11:00:28 +0100, Stefan Behnel wrote:
> I've even seen proper Cython benchmark code that a C compiler can fully
> analyse as static and replaces by a constant, and then get wonder
> speedups from it that would never translate to any real-world gains.

This is one of the reasons why typical benchmarks are junk. If you're not
benchmarking actual, real-world code you care about, you have no idea how
the language will perform with actual, real-world code.

There is a popular meme that "Python is slow" based on the results of
poorly-written code. And yet, in *real world code* that people actually
use, Python's performance is not that bad -- especially when you play to
its strengths, rather than its weaknesses, and treat it as a glue
language for stitching together high-performance libraries written in
high-performance languages like C, Fortran, Rust and Java.

Speaking of bad benchmarks, I recall an anecdote about a programmer
running some benchmarks of Fortran code on various mainframes, who found
that one specific machine was millions of times faster than the others.
It turned out that the code being tested was something like this
(translated into Python):

    for i in range(1, 100):
        pass

and the Fortran compiler on that one machine was clever enough to treat
it as dead code and remove it, turning the entire benchmark into a single
no-op.

Your point that we must consider benchmarks carefully is a good one.

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list
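[Editorial aside, not part of the thread: for contrast with the Fortran
anecdote above, CPython performs no such dead-code elimination - an empty
loop really executes and costs measurable time. A quick check, assuming
only the standard library:]

```python
import timeit

# CPython does not remove dead loops the way that Fortran compiler did,
# so the "benchmark" loop from the anecdote genuinely runs every time.
empty_loop = timeit.timeit("for i in range(1, 100): pass", number=10_000)
no_op = timeit.timeit("pass", number=10_000)

print(empty_loop > no_op)   # True: the dead loop has a real cost
```

The same benchmark that was a no-op on that mainframe would therefore
measure real loop overhead under CPython, which is one more way the two
"identical" benchmarks are not comparable.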
Re: How to make Python run as fast (or faster) than Julia
Steven D'Aprano schrieb am 22.02.2018 um 11:59:
> https://www.ibm.com/developerworks/community/blogs/jfp/entry/Python_Meets_Julia_Micro_Performance?lang=en

Thanks for sharing, Steven. While it was already suggested between the
lines in some of the replies, I'd like to emphasise that the combination
of timeit and result caching (memoizing) in this post is misleading and
not meaningful. It actually shows a common mistake that easily happens
when benchmarking code.

Since timeit repeats the benchmark runs and takes the minimum, it will
*always* return the time it takes to look up the final result in the
cache, and never report the actual performance of the code that is meant
to be benchmarked here. From what I read, this was probably not intended
by the author of the post.

I'm guilty of having run into this more than once myself, and have seen
it on many occasions. I've even seen proper Cython benchmark code that a
C compiler can fully analyse as static and replace by a constant, and
then get wonder speedups from it that would never translate to any
real-world gains.

These things happen, but they are mistakes. All they tell us is that we
must always be careful when evaluating benchmarks and their results, and
take good care that the benchmarks match the intended need, which is to
help us understand and solve a real-world performance problem. [1]

Stefan

[1] That's also the problem in the post above and in the original
benchmarks it refers to: there is no real-world problem to be solved
here. Someone wrote slow code and then showed that one language can
evaluate that slow code faster than another. Nothing to see here, keep
walking...

--
https://mail.python.org/mailman/listinfo/python-list
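[Editorial aside, not part of the thread: Stefan's timeit-plus-memoizing
pitfall can be reproduced directly with functools.lru_cache. Only the
very first of timeit's many repetitions computes anything; every later
repetition - including the minimum that gets reported - measures a cache
lookup. Illustrative code, not from the post:]

```python
import timeit
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# timeit runs the statement 1000 times, but fib(30) is only ever
# computed once; 999 of the measured runs are pure cache hits.
t = timeit.timeit("fib(30)", globals=globals(), number=1000)

hits, misses, _, _ = fib.cache_info()
print(misses)        # 31: fib(0)..fib(30), each computed exactly once
print(hits > 1000)   # True: nearly every measured run was a cache hit
```

So the timing `t` says almost nothing about the recursion being
"benchmarked" - exactly the mistake Stefan describes.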
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 03:11:36 -0500, Terry Reedy wrote:
> On 2/22/2018 10:31 PM, Python wrote:
>
>>> Why do you care about the 50 million calls? That's crazy -- the
>>> important thing is *calculating the Fibonacci numbers as efficiently
>>> as possible*.
>
>> If you are writing practical programs, that's true. But the Julia
>> benchmarks are not practical programs; they are designed to compare
>> the performance of various language features across a range of
>> languages.
>
> If that were so, then the comparison should use the fastest *Python*
> implementation. Which is what the article discussed here did. But the
> comparison is a lie when the comparison is compiled machine code versus
> bytecode interpreted by crippled cpython*. And people use this sort of
> benchmark to choose a language for *practical programs*, and don't know
> that they are not seeing *Python* times, but *crippled cpython* times.

Benchmarks are completely pointless anyway. The question when choosing a
language should not be 'How fast is it?'; it should always be 'Is it fast
enough?'. Once you have a choice of languages that are fast enough, other
criteria for suitability become important (such as readability & ease of
debugging).

--
If you're going to do something tonight that you'll be sorry for tomorrow
morning, sleep late.
		-- Henny Youngman
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On Fri, Feb 23, 2018 at 5:38 PM, Terry Reedy wrote:
> As to the vague 'class of problems implemented in a similar manner':
> Any function f of count N that depends on values of f for counts < N
> can be memoized the same way in Python as fibonacci. Everything said
> about P vs J for fib applies to such similar problems. The same is
> almost true for functions that depend on a pair of counts, such as the
> number of combinations of N things taken K at a time. The time benefit
> per space used is just a bit less.

The naive recursive Fibonacci function can be memoized with maxsize=2 and
it gets virtually all the benefit - in fact, it becomes effectively
iterative. (I think even maxsize=1 will do that.) With most memoization,
that kind of benefit is less blatant.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
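[Editorial aside, not part of the thread: ChrisA's maxsize=2 claim can be
checked with functools.lru_cache. Because fib(n-1) and fib(n-2) are still
cached when fib(n) combines them, the exponential recursion collapses to
roughly linear work, and inputs that would otherwise be unreachable
finish instantly. A sketch, not code from the thread:]

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def fib(n):
    # Naive recursive definition; a two-entry LRU cache keeps fib(n-1)
    # and fib(n-2) alive just long enough to avoid the exponential
    # blow-up of recomputation.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Uncached, fib(100) would need on the order of 10**20 calls - far
# beyond any machine. With maxsize=2 it returns immediately.
print(fib(100))   # 354224848179261915075
```

This is ChrisA's "effectively iterative" behaviour: the cache turns the
call tree into a chain, which is why a tiny maxsize already captures
nearly all of the benefit of full memoization.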
Re: How to make Python run as fast (or faster) than Julia
On 2/22/2018 10:31 PM, Python wrote:
>> Why do you care about the 50 million calls? That's crazy -- the
>> important thing is *calculating the Fibonacci numbers as efficiently
>> as possible*.
>
> If you are writing practical programs, that's true. But the Julia
> benchmarks are not practical programs; they are designed to compare the
> performance of various language features across a range of languages.

If that were so, then the comparison should use the fastest *Python*
implementation. Which is what the article discussed here did. But the
comparison is a lie when the comparison is compiled machine code versus
bytecode interpreted by crippled cpython*. And people use this sort of
benchmark to choose a language for *practical programs*, and don't know
that they are not seeing *Python* times, but *crippled cpython* times.

* Python has an import statement. But 'comparisons' disallow 'import
numpy', a quite legal Python statement, and similar others. The ability
to import Python wrappers of Fortran and C libraries is part of the
fundamental design of cpython. It is a distributed project. The fact that
numerical python, numarray, and now numpy are distributed separately, in
spite of proposals to include them, syntax additions for their benefit,
and their overwhelming usefulness, is a matter of social and
administrative convenience, not necessity.

There is also a proposal, which I consider likely to happen, to enhance
the cpython Windows and Mac installers by making installation of numpy
and other great 'external' modules selectable options.

It takes just 2 or 3 legal Python lines that do run with cpython as
installed to make 'import numpy' work, without resorting to subprocess
(though that is *also* legal Python):

    from ensurepip import bootstrap; bootstrap()  # only if needed
    from pip import main
    main(['install', 'numpy'])

    >>> import numpy
    >>> dir(numpy)
    ['ALLOW_THREADS', 'AxisError', 'BUFSIZE', 'CLIP', 'ComplexWarning',
    ...
    'vstack', 'warnings', 'where', 'who', 'zeros', 'zeros_like']

So, given a realistic number-crunching benchmark that takes at least
several minutes, one could add those two lines at the top and be off.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
Re: How to make Python run as fast (or faster) than Julia
On 2/22/2018 8:36 PM, Python wrote:
> On Thu, Feb 22, 2018 at 11:29:53PM +1100, Chris Angelico wrote:
>>> The idea of the Fibonacci benchmark is to test how effectively an
>>> implementation manages large numbers of recursive function calls.
>>> Then, fib(36) would normally involve 48,315,633 calls. This version
>>> does only 37, giving a misleading impression.
>>
>> Not overly misleading; the point of it is to show how trivially easy
>> it is to memoize a function in Python.
>
> This does not appear to me at all to be the point of the article. The
> point of the article seems to be that the Julia benchmarks compare
> unfairly the performance of Python to Julia, because they do not use
> algorithms that suit "the best way to use Python."
>
> But I have to agree with bartc, that is not at all the point of the
> benchmarks. The point of the benchmarks is to compare the performance
> of a particular algorithm or set of algorithms across a variety of
> languages.
>
> It's fine to say that this method of computing Fibonacci sequences is
> inefficient; but anyone who's spent any time on the problem knows that
> recursive function calls are not the most efficient way to calculate
> them in most languages. So it must follow logically that the point of
> the benchmark is not to prove that Julia is faster than Python at
> solving Fibonacci sequences, but rather that Julia is faster than
> Python at solving the class of problems that would be implemented in a
> similar manner as this particular implementation of Fibonacci
> sequences.

It is no secret that Python's interpreted function calls are slower than
function calls compiled to machine code, and that Python's infinite
precision integer arithmetic is slower than machine int arithmetic. So
there is almost no point in combining both in a naive, drastically
inefficient algorithm and declaring that Python is slower.
As to the vague 'class of problems implemented in a similar manner': Any
function f of count N that depends on values of f for counts < N can be
memoized the same way in Python as fibonacci. Everything said about P vs
J for fib applies to such similar problems. The same is almost true for
functions that depend on a pair of counts, such as the number of
combinations of N things taken K at a time. The time benefit per space
used is just a bit less.

Now consider a general recursive problem: a graph with N nodes and E
edges, and recursive node functions that depend on the value of the
function at args(node) other nodes. Example: the minimum length of a path
from node a to node b. In the field of graph algorithms, it is completely
routine to save f(node) for intermediate nodes when calculated. The idea
of naively implementing the recursive definition, in any non-trivial
practical situation, without saving intermediate values, is pretty
ludicrous.

Fib can be viewed as a function on a directed graph where all but the 2
base nodes have two incoming and two outgoing edges.

--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list
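[Editorial aside, not part of the thread: Terry's pair-of-counts example
- the number of combinations of N things taken K at a time - memoizes
exactly like fib. The naive recursion branches exponentially; saving
intermediate values makes it O(N*K). An illustrative sketch, not code
from the post:]

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def comb(n, k):
    # Pascal's rule: C(n, k) = C(n-1, k-1) + C(n-1, k). Without the
    # cache this recursion performs C(n, k) additions of 1; with it,
    # each (n, k) pair is computed once.
    if k == 0 or k == n:
        return 1
    return comb(n - 1, k - 1) + comb(n - 1, k)

# Instant with memoization; utterly infeasible without it.
print(comb(100, 50) == math.comb(100, 50))   # True
```

As Terry notes, the time benefit per unit of cache space is a bit less
than for fib, since the table here is two-dimensional.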
Re: How to make Python run as fast (or faster) than Julia
On Fri, 23 Feb 2018 00:26:33 +, bartc wrote:
> The point of the article was Julia vs. Python. You can't make Python
> faster by switching to a faster algorithm; you have to use the same one
> on both.

No, the point of the article was to write Python code that is as fast as
the Julia code.

I don't know why the Julia programmers chose to use such a poor
algorithm: maybe they are show-casing just how amazingly awesome the
Julia compiler is that it can even make a rubbish algorithm run fast. I
don't care -- they can choose whatever algorithm they like for their
Julia code, but I don't really care how good the language is at running
bad code that I would never use in real life.

--
Steve
--
https://mail.python.org/mailman/listinfo/python-list