Ok, I'll submit the patch with the prose pretty soon. Thank you.

Francis Girard
FRANCE
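P.S. For the record, here is roughly the shape of the tee-based m235 I have in
mind for the patch -- just a sketch, and the helper name `merge` is mine (the
actual patch may well differ):

```python
from itertools import islice, tee

def merge(g, h):
    # Merge two ascending streams, dropping duplicates.
    a, b = next(g), next(h)
    while True:
        if a < b:
            yield a
            a = next(g)
        elif b < a:
            yield b
            b = next(h)
        else:
            yield a
            a, b = next(g), next(h)

def m235():
    def _m235():
        yield 1
        # Each branch re-reads the sequence produced so far,
        # scaled by 2, 3 and 5, merged back in ascending order.
        for n in merge((2 * x for x in m2),
                       merge((3 * x for x in m3),
                             (5 * x for x in m5))):
            yield n
    # tee gives each "recursive" reader its own cursor over one shared
    # underlying generator; buffered items are dropped once the slowest
    # reader has consumed them -- no list growing forever.
    m2, m3, m5, result = tee(_m235(), 4)
    return result

print(list(islice(m235(), 15)))
# -> [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24]
```

Note that `_m235` can refer to m2, m3 and m5 before they exist because its
body doesn't run until the first value is requested, by which time tee() has
bound them.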
On Tuesday, 25 January 2005 04:29, Tim Peters wrote:
> [Francis Girard]
> > For all the algorithms that run after their tail in an FP way, like
> > the Hamming problem, or the Fibonacci sequence (but unlike the Sieve
> > of Eratosthenes -- there's a subtle difference), i.e. all those
> > algorithms that typically rely upon recursion to get the beginning
> > of the generated elements in order to generate the next one ...
> >
> > ... so for this family of algorithms, we have, somehow, to abandon
> > recursion, and to explicitly maintain one or more lists of what has
> > been generated.
> >
> > One solution for this is suggested in "test_generators.py". But I
> > think that this solution is evil, as it looks like recursion is used
> > but it is not, and it depends heavily on how the m235 function (for
> > example) is invoked.
>
> Well, yes -- "Heh.  Here's one way to get a shared list, complete with
> an excruciating namespace renaming trick" was intended to warn you in
> advance that it wasn't pretty <wink>.
>
> > Furthermore, it is _NOT_ memory efficient as claimed: it leaks!
>
> Yes.  But there are two solutions to the problem in that file, and the
> second one is in fact extremely memory-efficient compared to the first
> one.  "Efficiency" here was intended in a relative sense.
>
> > It internally maintains a list of generated results that grows
> > forever, while it is useless to keep results that have already been
> > "consumed". Just for a gross estimate, on my system, the memory
> > usage of this program grows rapidly, reaching 21.6% after 5 minutes.
> > CPU usage varies between 31% and 36%. Ugly and inefficient.
>
> Try the first solution in the file for a better understanding of what
> "inefficient" means <wink>.
>
> > The solution of Jeff Epler is far more elegant. The "itertools.tee"
> > function does just what we want. It internally maintains a list of
> > what has been generated and deletes the "consumed" elements as the
> > algorithm unrolls.
> > To follow up on my gross estimate, memory usage grows from 1.2% to
> > 1.9% after 5 minutes (probably only affected by the growing size of
> > the huge integers). CPU usage varies between 27% and 29%. Beautiful
> > and efficient.
>
> Yes, it is better.  tee() didn't exist when generators (and
> test_generators.py) were written, so of course nothing in the test
> file uses them.
>
> > You might think that we shouldn't be that fussy about memory usage
> > when generating 100-digit numbers, but we're talking about a whole
> > family of useful FP algorithms, and the best way to implement them
> > should be established.
>
> Possibly -- there really aren't many Pythonistas who care about this.
>
> > For this family of algorithms, itertools.tee is the way to go.
> >
> > I think that the semi-tutorial in "test_generators.py" should be
> > updated to use "tee". Or, at least, a severe warning comment should
> > be added.
>
> Please submit a patch.  The purpose of that file is to test
> generators, so you should add a third way of doing it, not replace
> the two ways already there.  It should also contain prose explaining
> why the third way is better (just as there's prose now explaining why
> the second way is better than the first).
-- 
http://mail.python.org/mailman/listinfo/python-list
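The same tee trick carries over to the other member of the family mentioned
above, the Fibonacci sequence. Again only a sketch, with names of my own
choosing (not taken from test_generators.py):

```python
from itertools import islice, tee

def fib():
    def _fib():
        yield 1
        yield 1
        # Each new term is the sum of the term two back (fibA) and the
        # term one back (fibB) in the very sequence being produced.
        for a, b in zip(fibA, fibB):
            yield a + b
    # Three cursors over one shared generator: one for the consumer,
    # two lagging readers for the recurrence.
    result, fibA, fibB = tee(_fib(), 3)
    next(fibB)   # shift one reader so the pairs line up as (F(n), F(n+1))
    return result

print(list(islice(fib(), 10)))
# -> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

As with m235, tee() keeps only the window between the slowest reader and the
head of the sequence, so nothing grows beyond what the recurrence actually
needs.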