http://bugs.python.org/issue14288
Raymond suggested that this patch should be discussed here, so here goes:
How this came about:
There are frameworks, such as the Nagare web framework
(http://www.nagare.org/), that rely on suspending execution at some point and
resuming it again. Nagare does this using Stackless Python: it pickles the
execution state of a tasklet and resumes it later (possibly elsewhere). Other
frameworks are doing similar things in cloud computing. I have seen such
presentations at previous PyCons, and they have had to write their own picklers
to get around the problems described below.
The problem is this: While pickling execution state (frame objects, functions)
might be considered exotic, and indeed Stackless has modifications unique to it
to do it, these frameworks quickly run into trouble that has nothing to do
with the fact that they are doing such exotic things.
For example, the fact that the very common dictiter is implemented in C and not
Python means that special pickle support has to be added for it;
otherwise only some contexts can be pickled (those that are not currently
iterating through a dict) and not others.
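
To make the dictiter case concrete, here is a small sketch (the can_pickle
helper and the sample dict are made up for illustration). It pauses part-way
through a dict and asks pickle whether the iterator can be saved. The answer
depends on the interpreter: with support of the kind proposed here it succeeds,
without it pickle raises TypeError.

```python
import pickle


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if pickle rejects it."""
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


# Made-up sample data, standing in for whatever a suspended task is working on.
settings = {"host": "example", "port": 8080}

it = iter(settings)  # in CPython this is a C-implemented dict key iterator
next(it)             # the "task" is now part-way through the dict

print("plain dict picklable:   ", can_pickle(settings))
print("dict iterator picklable:", can_pickle(it))  # interpreter dependent
```

The dict itself is always picklable; only the in-progress iteration is at
issue, which is why the failure looks arbitrary from the application's side.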
Stackless has tried to provide this functionality for many years, and indeed
has special pickling support for dictiters, listiters, etc. (stuff that has
nothing to do with the stacklessness of Stackless, really). However,
(somewhat) recently a lot of the itertools were moved into C. Suddenly
iterators that were previously picklable (by virtue of being in .py) stopped
being so, just because they became C objects. In addition, a bunch of other
iterators started showing up (stringiter, bytesiter). This started to cause
problems. Suddenly you have to arbitrarily restrict what you can and can't do
in code that is using these approaches. For Stackless (and Nagare), it was
necessary to ban the usage of the _itertools module in web programs.
Instead of adding this to Stackless, and thus widening the gap between
Stackless and CPython, I think it is a good idea simply to fix this in CPython
itself. Note that I also consider this to be of general utility to regular,
non-exotic applications: Why should an application that is working with a
bunch of data, but wants to stop for a bit and maybe save its state to disk,
have to worry about transforming the data into valid primitive data structures
before doing so?
In my opinion, any objects that have simple and obvious pickle semantics should
be picklable. Iterators are just regular objects with some state. They are
not file pointers or sockets or database cursors. And again, I argue that if
these objects were implemented in .py, they would already be automatically
picklable (indeed, itertools.py was). The detail that some iterators in
standard Python are implemented in C should not automatically restrict their
usage for no particular reason.
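
As an illustration of what "simple and obvious pickle semantics" can mean for
an iterator, here is a minimal sketch. It is not taken from the patch, and the
ListCursor class is hypothetical. The iterator's entire state is a sequence and
a position, so it can describe itself to pickle as "an iterator over the items
not yet consumed".

```python
import pickle


class ListCursor:
    """Hypothetical iterator whose whole state is a list and an index."""

    def __init__(self, items):
        self.items = list(items)
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.items):
            raise StopIteration
        value = self.items[self.index]
        self.index += 1
        return value

    def __reduce__(self):
        # Tell pickle to rebuild this object by calling the builtin iter()
        # on the items that have not been consumed yet.
        return (iter, (self.items[self.index:],))


cursor = ListCursor([10, 20, 30, 40])
next(cursor)
next(cursor)                      # two items consumed, two remaining

saved = pickle.dumps(cursor)      # suspend: could be written to disk
restored = pickle.loads(saved)    # resume: possibly in another process

print(list(restored))             # [30, 40]
```

Note that in this sketch the restored object is a plain list iterator rather
than a ListCursor. That is a design choice made to keep the example short; a
type could equally well reconstruct an instance of itself.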
The patch is straightforward. Most of it is tests, in fact. But it does use a
few tricks in some places to get around the fact that some of those iterator
types are hidden. We did try to be complete and find all the C iterators,
but the bulk of this work was done a year ago and something might
have been added in the meantime.
Anyway, that's my pitch.
Kristján
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev