Re: Generators vs. Functions?
On Sun, 05 Feb 2006 19:14:29 +1100, Steven D'Aprano [EMAIL PROTECTED] wrote: On Sun, 05 Feb 2006 03:31:24 +, Neil Schemenauer wrote: Peter Hansen [EMAIL PROTECTED] wrote: More precisely, the state of the function is *saved* when a yield occurs, so you certainly don't *recreate* it from scratch, but merely restore the state, and this should definitely be faster than creating it from scratch in the first place. Right. Resuming a generator is faster than calling a function. Have you actually measured this, or are you just making a wild guess? According to a short test performed by Magnus Lycka, resuming a generator takes more time than calling a function. My own test agrees. Here is my test, using Python 2.3. I've tried to make the test as fair as possible, with the same number of name lookups in both pieces of test code. # straight function, two name lookups import timeit t1 = timeit.Timer(stmt=func.next(), setup= ... class K: ... pass ... ... def next(): ... return 1 ... ... func = K() ... func.next = next ... ) t1.timeit() 0.63980388641357422 # generator, two name lookups t2 = timeit.Timer(stmt=gen.next(), setup= ... def g(): ... while 1: yield 1 ... ... gen = g() ... ) t2.timeit() 0.82081794738769531 # straight function, one name lookup t3 = timeit.Timer(stmt=f(), setup= ... def f(): ... return 1 ... ) t3.timeit() 0.47273492813110352 # generator, one name lookup t4 = timeit.Timer(stmt=gnext(), setup= ... def g(): ... while 1: yield 1 ... ... gnext = g().next ... ) t4.timeit() 0.55085492134094238 So on the basis of my tests, there is a small, but significant speed advantage to _calling_ a function versus _resuming_ a generator. Of course the other advantages of generators often far outweigh the tiny setup cost each time you call one. In addition, for any complex function with significant execution time, the call/resume time may be an insignificant fraction of the total execution time. There is little or no point in avoiding generators due to a misplaced and foolish attempt to optimise your code. I show an advantage favoring generator resumption vs function call: from time import clock def f(): return clock() ... def g(): yield clock(); yield clock() ... max(f()-f() for x in xrange(1)) -9.2190462142316409e-006 max(f()-f() for x in xrange(1)) -9.2190462139818408e-006 max(float.__sub__(*g()) for x in xrange(1)) -7.5428559682677587e-006 max(float.__sub__(*g()) for x in xrange(1)) -7.5428559682677587e-006 max(float.__sub__(*g()) for x in xrange(1)) -7.5428559682677587e-006 (It'll probably go ten times faster on a recent box ;-) Regards, Bengt Richter -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
On Sun, 05 Feb 2006 03:31:24 +, Neil Schemenauer wrote: Peter Hansen [EMAIL PROTECTED] wrote: More precisely, the state of the function is *saved* when a yield occurs, so you certainly don't *recreate* it from scratch, but merely restore the state, and this should definitely be faster than creating it from scratch in the first place. Right. Resuming a generator is faster than calling a function. Have you actually measured this, or are you just making a wild guess? According to a short test performed by Magnus Lycka, resuming a generator takes more time than calling a function. My own test agrees. Here is my test, using Python 2.3. I've tried to make the test as fair as possible, with the same number of name lookups in both pieces of test code. # straight function, two name lookups import timeit t1 = timeit.Timer(stmt=func.next(), setup= ... class K: ... pass ... ... def next(): ... return 1 ... ... func = K() ... func.next = next ... ) t1.timeit() 0.63980388641357422 # generator, two name lookups t2 = timeit.Timer(stmt=gen.next(), setup= ... def g(): ... while 1: yield 1 ... ... gen = g() ... ) t2.timeit() 0.82081794738769531 # straight function, one name lookup t3 = timeit.Timer(stmt=f(), setup= ... def f(): ... return 1 ... ) t3.timeit() 0.47273492813110352 # generator, one name lookup t4 = timeit.Timer(stmt=gnext(), setup= ... def g(): ... while 1: yield 1 ... ... gnext = g().next ... ) t4.timeit() 0.55085492134094238 So on the basis of my tests, there is a small, but significant speed advantage to _calling_ a function versus _resuming_ a generator. Of course the other advantages of generators often far outweigh the tiny setup cost each time you call one. In addition, for any complex function with significant execution time, the call/resume time may be an insignificant fraction of the total execution time. There is little or no point in avoiding generators due to a misplaced and foolish attempt to optimise your code. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Steven D'Aprano wrote: So on the basis of my tests, there is a small, but significant speed advantage to _calling_ a function versus _resuming_ a generator. now add state handling to your micro-benchmark, and see if the function example still runs faster. (hint: functions and generators do different things, and are designed for different use cases. they're not two different ways to do the same thing, and benchmarks that ignore that simple fact are pretty much use- less, except, perhaps, for a very small group of VM developers.) /F -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Steven D'Aprano wrote: t1.timeit() 0.63980388641357422 ... t2.timeit() 0.82081794738769531 So on the basis of my tests, there is a small, but significant speed advantage to _calling_ a function versus _resuming_ a generator. I get the same, but the difference is much less on my system: min(t1.timeit() for i in range(3)) 0.43929577641858941 min(t2.timeit() for i in range(3)) 0.46473169075954956 You missed though what is perhaps a more representative case. Generators have state, so for a fair comparison you need to restore some state. I think a fairer comparison is: t5 = timeit.Timer(stmt=func.next(), setup= class K: pass def next(self): return 1 func = K() ) min(t5.timeit() for i in range(3)) 0.58508302032805659 The method call is slower than the generator resumption and that is without even accessing any of the saved state. If you do access the saved state the generator runs at about the same speed as the original (local variable access is about as fast as accessing a constant), but the method slows down even more: t6 = timeit.Timer(stmt=gen.next(), setup= def g(n): while 1: yield n gen = g(42) ) min(t6.timeit() for i in range(3)) 0.46405506845144373 t7 = timeit.Timer(stmt=func.next(), setup= class K: def __init__(self, n): self.n = n def next(self): return self.n func = K(42) ) min(t7.timeit() for i in range(3)) 0.67426781895460408 -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
On Sun, 05 Feb 2006 09:49:21 +0100, Fredrik Lundh wrote: Steven D'Aprano wrote: So on the basis of my tests, there is a small, but significant speed advantage to _calling_ a function versus _resuming_ a generator. now add state handling to your micro-benchmark, and see if the function example still runs faster. ¿Que Mr Fawlty? Sorry, I'm not sure I follow what you mean. Do you mean, Make the function and generator do something significant? I expected that people would understand that I was talking only about the overhead of calling the function or generator, not the overall time needed to perform some useful task. Sorry for the less than clear explanation. (hint: functions and generators do different things, and are designed for different use cases. they're not two different ways to do the same thing, and benchmarks that ignore that simple fact are pretty much use- less, except, perhaps, for a very small group of VM developers.) I never meant to imply that generators were somehow worse than functions. As you say, they have different purposes, and for the sort of use case that generators are good for, the tiny extra overhead in restoring a generator is a cost well worth paying. But it is a cost. I personally don't believe it is a significant cost, except maybe for the odd special case or two. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Steven D'Aprano [EMAIL PROTECTED] wrote: Have you actually measured this, or are you just making a wild guess? I haven't timed it until now but my guess it not so wild. I'm pretty familiar with the generator implementation (having written the initial version of it). In Python 2.3, resuming a generator does a small amount of setup and then calls eval_frame(). Calling a function does more setup work and then also calls eval_frame(). Here is my test, using Python 2.3. I've tried to make the test as fair as possible, with the same number of name lookups in both pieces of test code. On my machine t4 is faster than t3. Your test is not so fair because the generator is doing a while loop (executing more bytecode instructions) while the function is just returning a value (one instruction). On your machine the function call may be faster due to CPU cache effects or branch prediction. In any case, the difference you are trying to measure is extremely small. Try adding some arguments to the functions (especially keyword arguments). What your test does show is that the speed difference should not come into the decision of which construct to use. Neil -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
On Sun, 05 Feb 2006 16:14:54 +, Neil Schemenauer wrote: Steven D'Aprano [EMAIL PROTECTED] wrote: Have you actually measured this, or are you just making a wild guess? I haven't timed it until now but my guess it not so wild. I'm pretty familiar with the generator implementation (having written the initial version of it). Well I guess you're forgiven then *sheepish grin* In Python 2.3, resuming a generator does a small amount of setup and then calls eval_frame(). Calling a function does more setup work and then also calls eval_frame(). It takes MORE setup to call a function than it takes to resume a generator? Here is my test, using Python 2.3. I've tried to make the test as fair as possible, with the same number of name lookups in both pieces of test code. On my machine t4 is faster than t3. Your test is not so fair because the generator is doing a while loop (executing more bytecode instructions) while the function is just returning a value (one instruction). A fair criticism, but then a generator with just one instruction is, well, pointless. On your machine the function call may be faster due to CPU cache effects or branch prediction. In any case, the difference you are trying to measure is extremely small. Try adding some arguments to the functions (especially keyword arguments). Small in absolute terms, but significant in relative terms: as an order of magnitude, a factor of about 1/10th. Of course, I never expected that calling/resuming cost to be significant for most real world uses. If I gave anyone that impression, it wasn't intended. What your test does show is that the speed difference should not come into the decision of which construct to use. I never said it should. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Duncan Booth wrote: Steven D'Aprano wrote: So on the basis of my tests, there is a small, but significant speed advantage to _calling_ a function versus _resuming_ a generator. I get the same, but the difference is much less on my system: With Python 2.4? Doesn't surprise me a bit. I tested with 2.3 (vanilla Red Hat EL4 install) and it seems Steven used 2.3 as well. My little test was just an attempt to test a claim that the setup time would be shorter for generator calls than for function calls. It's so easy to test timing with Python, so it's surprising that people speculate so much about theories with no measurements. Who knows what the call time ratios will be in Python 2.6? I think the important point is the one Fredrik is making: You won't have a function implementation or a generator implementation with the same code body. The other differences in the code will typically mean much more than the call overhead. If we're in some nested loop where we call a function so trivial that the call overhead makes performance suffer, by all means, inline these few lines of code in the loop! -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Wolfgang Keller wrote: If this is actually also true in the general case, and not due to eventual non-representativeness of the test mentioned above, is it simply due to a less-than-optimum implementation of generators in the current Pyython interpreter and thus likely to change in the future or is this a matter of principle and will consequently remain like this forever? I am not a CPython or PyPy hacker, but I would guess that it will always be slower as a matter of principal. When resuming a generator you have to resetup the state the function was in when it was last called, which I think should always be more costly than calling the function with a clean state. Someone want to correct me? Whether or not the difference is that significant though I am unsure. It may be small enough that for most applications no one cares. -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Joseph Garvin wrote: I am not a CPython or PyPy hacker, but I would guess that it will always be slower as a matter of principal. When resuming a generator you have to resetup the state the function was in when it was last called, which I think should always be more costly than calling the function with a clean state. Someone want to correct me? In cases where there are thousands of (large) values to return, the list (as returned by the function) may be large enough to require memory paging, whereas the generator only returns one value at a time. Whether or not the difference is that significant though I am unsure. It may be small enough that for most applications no one cares. I just wrote an application which retrieves values from a 300mb database, and got a significant speedup using iterators. --Max -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Joseph Garvin wrote: Wolfgang Keller wrote: If this is actually also true in the general case, and not due to eventual non-representativeness of the test mentioned above, is it simply due to a less-than-optimum implementation of generators in the current Pyython interpreter and thus likely to change in the future or is this a matter of principle and will consequently remain like this forever? I am not a CPython or PyPy hacker, but I would guess that it will always be slower as a matter of principal. When resuming a generator you have to resetup the state the function was in when it was last called, which I think should always be more costly than calling the function with a clean state. Someone want to correct me? Sure. You have to resetup the state of the function... depending on what resetup means (not a usual English word, so we might all imagine different meanings for it), either the first or the second part of the last sentence is false. More precisely, the state of the function is *saved* when a yield occurs, so you certainly don't *recreate* it from scratch, but merely restore the state, and this should definitely be faster than creating it from scratch in the first place. I haven't looked at the source, but this wouldn't have to involve much beyond a little memory copying, or even a few pointer changes, whereas the original could involve a lot of work, depending on how many arguments were passed, how many locals exist, and so on. -Peter -- http://mail.python.org/mailman/listinfo/python-list
Re: Generators vs. Functions?
Peter Hansen [EMAIL PROTECTED] wrote: More precisely, the state of the function is *saved* when a yield occurs, so you certainly don't *recreate* it from scratch, but merely restore the state, and this should definitely be faster than creating it from scratch in the first place. Right. Resuming a generator is faster than calling a function. Neil -- http://mail.python.org/mailman/listinfo/python-list