On Fri, May 26, 2017 at 2:45 PM, Stephan Houben <stephan...@gmail.com> wrote: > Hi all, > > The itertools.tee function can hold on to objects "unnecessarily". > In particular, if you do > > iter2 = itertools.tee(iter1, 2)[0] > > i.e. you "leak" one of the returned iterators, then all returned > objects are not collected until also iter2 is collected. > > I propose a different implementation, namely the one in: > > https://github.com/stephanh42/streamtee > > streamtee.tee is a drop-in alternative for itertools.tee but > as you can see from the test in the repo, it will not hold > on to the generated objects as long. > > I propose this as an improved implementation of itertools.tee. > > Thanks, > > Stephan
For convenience, the implementation itself is here (33 lines including comments): https://github.com/stephanh42/streamtee/blob/master/streamtee.py I like this. It uses an infinite generator as a thunk. Though I think it belongs on bugs.python.org rather than -ideas, because the interface should be the same and the memory/time use is asymptotically the same. The current tee implementation in C (lines 378 to 852): https://github.com/python/cpython/blob/bf623ae8843dc30b28c574bec8d29fc14be59d86/Modules/itertoolsmodule.c#L378 What bothers me is, the current implementation looks very similar to what I imagine the C implementation of this looks like. Instead of thunks with a single element, it has thunks with up to LINKCELLS elements. The comment from the commit (https://github.com/python/cpython/commit/ad983e79d6f215235d205245c2599620e33cf719): > Formerly, underlying queue was implemented in terms of two lists. The > new queue is a series of singly-linked fixed length lists. > > The new implementation runs much faster, supports multi-way tees, and > allows tees of tees without additional memory costs. The delay in deletion, then, seems to be a feature for efficiency, and can be implemented with `#define LINKCELLS 1`. P.S.: I couldn't find a pure Python implementation of tee in the CPython repo. _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/