Re: Generators vs. Functions?

2006-02-06 Thread Bengt Richter
On Sun, 05 Feb 2006 19:14:29 +1100, Steven D'Aprano [EMAIL PROTECTED] wrote:

On Sun, 05 Feb 2006 03:31:24 +, Neil Schemenauer wrote:

 Peter Hansen [EMAIL PROTECTED] wrote:
 More precisely, the state of the function is *saved* when a yield 
 occurs, so you certainly don't *recreate* it from scratch, but merely 
 restore the state, and this should definitely be faster than creating it 
 from scratch in the first place.
 
 Right.  Resuming a generator is faster than calling a function.

Have you actually measured this, or are you just making a wild guess?

According to a short test performed by Magnus Lycka, resuming a generator
takes more time than calling a function. My own test agrees.

Here is my test, using Python 2.3. I've tried to make the test as fair as
possible, with the same number of name lookups in both pieces of test code.

# straight function, two name lookups

 import timeit

 t1 = timeit.Timer(stmt=func.next(), setup=
... class K:
...   pass
...
... def next():
...   return 1
...
... func = K()
... func.next = next
... )

 t1.timeit()
0.63980388641357422


# generator, two name lookups

 t2 = timeit.Timer(stmt=gen.next(), setup=
... def g():
...   while 1: yield 1
...
... gen = g()
... )

 t2.timeit()
0.82081794738769531


# straight function, one name lookup

 t3 = timeit.Timer(stmt=f(), setup=
... def f():
...   return 1
... )
 
 t3.timeit()
0.47273492813110352


# generator, one name lookup 

 t4 = timeit.Timer(stmt=gnext(), setup=
... def g():
...   while 1: yield 1
...
... gnext = g().next
... )

 t4.timeit()
0.55085492134094238


So on the basis of my tests, there is a small, but significant speed
advantage to _calling_ a function versus _resuming_ a generator.

Of course the other advantages of generators often far outweigh the tiny
setup cost each time you call one. In addition, for any complex function
with significant execution time, the call/resume time may be an
insignificant fraction of the total execution time. There is little or no
point in avoiding generators due to a misplaced and foolish attempt to
optimise your code.

I show an advantage favoring generator resumption vs function call:

  from time import clock
  def f(): return clock()
 ...
  def g(): yield clock(); yield clock()
 ...
  max(f()-f() for x in xrange(1))
 -9.2190462142316409e-006
  max(f()-f() for x in xrange(1))
 -9.2190462139818408e-006
  max(float.__sub__(*g()) for x in xrange(1))
 -7.5428559682677587e-006
  max(float.__sub__(*g()) for x in xrange(1))
 -7.5428559682677587e-006
  max(float.__sub__(*g()) for x in xrange(1))
 -7.5428559682677587e-006

(It'll probably go ten times faster on a recent box ;-)

Regards,
Bengt Richter
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-05 Thread Steven D'Aprano
On Sun, 05 Feb 2006 03:31:24 +, Neil Schemenauer wrote:

 Peter Hansen [EMAIL PROTECTED] wrote:
 More precisely, the state of the function is *saved* when a yield 
 occurs, so you certainly don't *recreate* it from scratch, but merely 
 restore the state, and this should definitely be faster than creating it 
 from scratch in the first place.
 
 Right.  Resuming a generator is faster than calling a function.

Have you actually measured this, or are you just making a wild guess?

According to a short test performed by Magnus Lycka, resuming a generator
takes more time than calling a function. My own test agrees.

Here is my test, using Python 2.3. I've tried to make the test as fair as
possible, with the same number of name lookups in both pieces of test code.

# straight function, two name lookups

 import timeit

 t1 = timeit.Timer(stmt=func.next(), setup=
... class K:
...   pass
...
... def next():
...   return 1
...
... func = K()
... func.next = next
... )

 t1.timeit()
0.63980388641357422


# generator, two name lookups

 t2 = timeit.Timer(stmt=gen.next(), setup=
... def g():
...   while 1: yield 1
...
... gen = g()
... )

 t2.timeit()
0.82081794738769531


# straight function, one name lookup

 t3 = timeit.Timer(stmt=f(), setup=
... def f():
...   return 1
... )
 
 t3.timeit()
0.47273492813110352


# generator, one name lookup 

 t4 = timeit.Timer(stmt=gnext(), setup=
... def g():
...   while 1: yield 1
...
... gnext = g().next
... )

 t4.timeit()
0.55085492134094238


So on the basis of my tests, there is a small, but significant speed
advantage to _calling_ a function versus _resuming_ a generator.

Of course the other advantages of generators often far outweigh the tiny
setup cost each time you call one. In addition, for any complex function
with significant execution time, the call/resume time may be an
insignificant fraction of the total execution time. There is little or no
point in avoiding generators due to a misplaced and foolish attempt to
optimise your code.


-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-05 Thread Fredrik Lundh
Steven D'Aprano wrote:

 So on the basis of my tests, there is a small, but significant speed
 advantage to _calling_ a function versus _resuming_ a generator.

now add state handling to your micro-benchmark, and see if the function
example still runs faster.

(hint: functions and generators do different things, and are designed
for different use cases.  they're not two different ways to do the same
thing, and benchmarks that ignore that simple fact are pretty much use-
less, except, perhaps, for a very small group of VM developers.)

/F



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-05 Thread Duncan Booth
Steven D'Aprano wrote:

 t1.timeit()
 0.63980388641357422
...
 t2.timeit()
 0.82081794738769531
 
 So on the basis of my tests, there is a small, but significant speed
 advantage to _calling_ a function versus _resuming_ a generator.

I get the same, but the difference is much less on my system:

 min(t1.timeit() for i in range(3))
0.43929577641858941
 min(t2.timeit() for i in range(3))
0.46473169075954956

You missed though what is perhaps a more representative case. Generators 
have state, so for a fair comparison you need to restore some state. I 
think a fairer comparison is:

 t5 = timeit.Timer(stmt=func.next(), setup=
class K:
   pass

   def next(self):
  return 1

func = K()
)
 
 min(t5.timeit() for i in range(3))
0.58508302032805659


The method call is slower than the generator resumption and that is without 
even accessing any of the saved state. If you do access the saved state the 
generator runs at about the same speed as the original (local variable 
access is about as fast as accessing a constant), but the method slows down 
even more:

 t6 = timeit.Timer(stmt=gen.next(), setup=
def g(n):
  while 1: yield n
gen = g(42)
)
 min(t6.timeit() for i in range(3))
0.46405506845144373
 t7 = timeit.Timer(stmt=func.next(), setup=
class K:
   def __init__(self, n):
   self.n = n

   def next(self):
  return self.n

func = K(42)
)
 min(t7.timeit() for i in range(3))
0.67426781895460408


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-05 Thread Steven D'Aprano
On Sun, 05 Feb 2006 09:49:21 +0100, Fredrik Lundh wrote:

 Steven D'Aprano wrote:
 
 So on the basis of my tests, there is a small, but significant speed
 advantage to _calling_ a function versus _resuming_ a generator.
 
 now add state handling to your micro-benchmark, and see if the function
 example still runs faster.

¿Que Mr Fawlty?

Sorry, I'm not sure I follow what you mean. Do you mean, Make the
function and generator do something significant?

I expected that people would understand that I was talking only about the
overhead of calling the function or generator, not the overall time needed
to perform some useful task. Sorry for the less than clear explanation.


 (hint: functions and generators do different things, and are designed
 for different use cases.  they're not two different ways to do the same
 thing, and benchmarks that ignore that simple fact are pretty much use-
 less, except, perhaps, for a very small group of VM developers.)

I never meant to imply that generators were somehow worse than
functions. As you say, they have different purposes, and for the sort of
use case that generators are good for, the tiny extra overhead in
restoring a generator is a cost well worth paying.

But it is a cost. I personally don't believe it is a significant cost,
except maybe for the odd special case or two. 


-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Generators vs. Functions?

2006-02-05 Thread Neil Schemenauer
Steven D'Aprano [EMAIL PROTECTED] wrote:
 Have you actually measured this, or are you just making a wild
 guess?

I haven't timed it until now but my guess it not so wild.  I'm
pretty familiar with the generator implementation (having written
the initial version of it).  In Python 2.3, resuming a generator
does a small amount of setup and then calls eval_frame().  Calling a
function does more setup work and then also calls eval_frame().

 Here is my test, using Python 2.3. I've tried to make the test as
 fair as possible, with the same number of name lookups in both
 pieces of test code.

On my machine t4 is faster than t3.  Your test is not so fair
because the generator is doing a while loop (executing more
bytecode instructions) while the function is just returning a value
(one instruction).

On your machine the function call may be faster due to CPU cache
effects or branch prediction.  In any case, the difference you are
trying to measure is extremely small.  Try adding some arguments to
the functions (especially keyword arguments).

What your test does show is that the speed difference should not
come into the decision of which construct to use.

  Neil
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-05 Thread Steven D'Aprano
On Sun, 05 Feb 2006 16:14:54 +, Neil Schemenauer wrote:

 Steven D'Aprano [EMAIL PROTECTED] wrote:
 Have you actually measured this, or are you just making a wild
 guess?
 
 I haven't timed it until now but my guess it not so wild.  I'm
 pretty familiar with the generator implementation (having written
 the initial version of it).

Well I guess you're forgiven then *sheepish grin*

 In Python 2.3, resuming a generator
 does a small amount of setup and then calls eval_frame().  Calling a
 function does more setup work and then also calls eval_frame().

It takes MORE setup to call a function than it takes to resume a
generator?


 Here is my test, using Python 2.3. I've tried to make the test as
 fair as possible, with the same number of name lookups in both
 pieces of test code.
 
 On my machine t4 is faster than t3.  Your test is not so fair
 because the generator is doing a while loop (executing more
 bytecode instructions) while the function is just returning a value
 (one instruction).

A fair criticism, but then a generator with just one instruction is, well,
pointless.

 
 On your machine the function call may be faster due to CPU cache
 effects or branch prediction.  In any case, the difference you are
 trying to measure is extremely small.  Try adding some arguments to
 the functions (especially keyword arguments).


Small in absolute terms, but significant in relative terms: as an order of
magnitude, a factor of about 1/10th.

Of course, I never expected that calling/resuming cost to be significant
for most real world uses. If I gave anyone that impression, it wasn't
intended.

 What your test does show is that the speed difference should not
 come into the decision of which construct to use.

I never said it should.



-- 
Steven.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-05 Thread Magnus Lycka
Duncan Booth wrote:
 Steven D'Aprano wrote:
So on the basis of my tests, there is a small, but significant speed
advantage to _calling_ a function versus _resuming_ a generator.
 
 I get the same, but the difference is much less on my system:

With Python 2.4? Doesn't surprise me a bit.

I tested with 2.3 (vanilla Red Hat EL4 install) and it seems Steven
used 2.3 as well. My little test was just an attempt to test a claim
that the setup time would be shorter for generator calls than for
function calls. It's so easy to test timing with Python, so it's
surprising that people speculate so much about theories with no
measurements.

Who knows what the call time ratios will be in Python 2.6?

I think the important point is the one Fredrik is making: You
won't have a function implementation or a generator implementation
with the same code body.

The other differences in the code will typically mean much more than
the call overhead. If we're in some nested loop where we call a
function so trivial that the call overhead makes performance suffer,
by all means, inline these few lines of code in the loop!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-04 Thread Joseph Garvin
Wolfgang Keller wrote:

If this is actually also true in the general case, and not due to eventual 
non-representativeness of the test mentioned above, is it simply due to a 
less-than-optimum implementation of generators in the current Pyython 
interpreter and thus likely to change in the future or is this a matter of 
principle and will consequently remain like this forever?
  


I am not a CPython or PyPy hacker, but I would guess that it will always 
be slower as a matter of principal. When resuming a generator you have 
to resetup the state the function was in when it was last called, which 
I think should always be more costly than calling the function with a 
clean state.

Someone want to correct me?

Whether or not the difference is that significant though I am unsure. It 
may be small enough that for most applications no one cares.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-04 Thread Max
Joseph Garvin wrote:
 
 I am not a CPython or PyPy hacker, but I would guess that it will always 
 be slower as a matter of principal. When resuming a generator you have 
 to resetup the state the function was in when it was last called, which 
 I think should always be more costly than calling the function with a 
 clean state.
 
 Someone want to correct me?

In cases where there are thousands of (large) values to return, the list 
(as returned by the function) may be large enough to require memory 
paging, whereas the generator only returns one value at a time.

 
 Whether or not the difference is that significant though I am unsure. It 
 may be small enough that for most applications no one cares.

I just wrote an application which retrieves values from a 300mb 
database, and got a significant speedup using iterators.

--Max
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-04 Thread Peter Hansen
Joseph Garvin wrote:
 Wolfgang Keller wrote:
If this is actually also true in the general case, and not due to eventual 
non-representativeness of the test mentioned above, is it simply due to a 
less-than-optimum implementation of generators in the current Pyython 
interpreter and thus likely to change in the future or is this a matter of 
principle and will consequently remain like this forever?
 
 I am not a CPython or PyPy hacker, but I would guess that it will always 
 be slower as a matter of principal. When resuming a generator you have 
 to resetup the state the function was in when it was last called, which 
 I think should always be more costly than calling the function with a 
 clean state.
 
 Someone want to correct me?

Sure.  You have to resetup the state of the function... depending on 
what resetup means (not a usual English word, so we might all imagine 
different meanings for it), either the first or the second part of the 
last sentence is false.

More precisely, the state of the function is *saved* when a yield 
occurs, so you certainly don't *recreate* it from scratch, but merely 
restore the state, and this should definitely be faster than creating it 
from scratch in the first place.  I haven't looked at the source, but 
this wouldn't have to involve much beyond a little memory copying, or 
even a few pointer changes, whereas the original could involve a lot of 
work, depending on how many arguments were passed, how many locals 
exist, and so on.

-Peter

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Generators vs. Functions?

2006-02-04 Thread Neil Schemenauer
Peter Hansen [EMAIL PROTECTED] wrote:
 More precisely, the state of the function is *saved* when a yield 
 occurs, so you certainly don't *recreate* it from scratch, but merely 
 restore the state, and this should definitely be faster than creating it 
 from scratch in the first place.

Right.  Resuming a generator is faster than calling a function.

  Neil
-- 
http://mail.python.org/mailman/listinfo/python-list