Ok, I think that the bottom line is this :

For all the algorithms that run after their tail in an FP way, like the 
Hamming problem, or the Fibonacci sequence, (but unlike Sieve of Eratosthene 
-- there's a subtle difference), i.e. all those algorithms that typically 
rely upon recursion to get the beginning of the generated elements in order 
to generate the next one ...

... so for this family of algorithms, we have, somehow, to abandon recursion, 
and to explicitely maintain one or many lists of what had been generated.

One solution for this is suggested in "test_generators.py". But I think that 
this solution is evil as it looks like recursion is used but it is not and it 
depends heavily on how the m235 function (for example) is invoked. 
Furthermore, it is _NOT_ memory efficient as pretended : it leaks ! It 
internally maintains a lists of generated results that grows forever while it 
is useless to maintain results that had been "consumed". Just for a gross 
estimate, on my system, percentage of memory usage for this program grows 
rapidly, reaching 21.6 % after 5 minutes. CPU usage varies between 31%-36%.
Ugly and inefficient.

The solution of Jeff Epler is far more elegant. The "itertools.tee" function 
does just what we want. It internally maintain a list of what had been 
generated and deletes the "consumed" elements as the algo unrolls. To follow 
with my gross estimate, memory usage grows from 1.2% to 1.9% after 5 minutes 
(probably only affected by the growing size of Huge Integer). CPU usage 
varies between 27%-29%.
Beautiful and effecient.

You might think that we shouldn't be that fussy about memory usage on 
generating 100 digits numbers but we're talking about a whole family of 
useful FP algorithms ; and the best way to implement them should be 
established.

For this family of algorithms, itertools.tee is the way to go.

I think that the semi-tutorial in "test_generators.py" should be updated to 
use "tee". Or, at least, a severe warning comment should be written.

Thank you,

Francis Girard
FRANCE

Le dimanche 23 Janvier 2005 23:27, Jeff Epler a ÃcritÂ:
> Your formulation in Python is recursive (hamming calls hamming()) and I
> think that's why your program gives up fairly early.
>
> Instead, use itertools.tee() [new in Python 2.4, or search the internet
> for an implementation that works in 2.3] to give a copy of the output
> sequence to each "multiply by N" function as well as one to be the
> return value.
>
> Here's my implementation, which matched your list early on but
> effortlessly reached larger values.  One large value it printed was
> 6412351813189632 (a Python long) which indeed has only the distinct
> prime factors 2 and 3. (2**43 * 3**6)
>
> Jeff
>
> from itertools import tee
> import sys
>
> def imerge(xs, ys):
>     x = xs.next()
>     y = ys.next()
>     while True:
>         if x == y:
>             yield x
>             x = xs.next()
>             y = ys.next()
>         elif x < y:
>             yield x
>             x = xs.next()
>         else:
>             yield y
>             y = ys.next()
>
> def hamming():
>     def _hamming(j, k):
>         yield 1
>         hamming = generators[j]
>         for i in hamming:
>             yield i * k
>     generators = []
>     generator = imerge(imerge(_hamming(0, 2), _hamming(1, 3)), _hamming(2,
> 5)) generators[:] = tee(generator, 4)
>     return generators[3]
>
> for i in hamming():
>     print i,
>     sys.stdout.flush()

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to