Hi all,

warning: long e-mail ahead, don't read in a hurry!

I've given generator functions some thought. Implementing them sounds
simpler than it actually is, not because of the state keeping, but because
of the refactoring (point 1 below) that I would like to see done before
going there.

Here's what I think should be done:

1) refactor def functions into a Python wrapper and a static C function
   * Python wrapper does all argument unpacking, return value packing and
     the final exception propagation
   * C function contains the complete body of the original function and
     returns the return value directly
   a) non-closure functions:
      - C function has signature as written in the code
      - Python wrapper calls C function to execute the body
   b) closure functions:
      - C function has METH_NOARGS signature
      - Python wrapper creates closure and fills in arguments
      - Python wrapper calls C function with closure as 'self'

2) support writing utility code in Cython (does this work already?)
   * likely just compile TreeFragments inside of the utility_scope?
     (does the utility_scope actually have a unique mangling prefix
      or will it interfere with a user provided "utility" module?)

3) implement a generic 'generator' type in Cython code (see code below)
   * methods: __iter__, __next__, send, throw, close (as in PEP 342, see
       http://www.python.org/dev/peps/pep-0342/ )
   * fields: closure, exception, __weakref__, C function pointer

4) implement generators as extension to 1b)
   * Python wrapper works mostly as in 1b), but
     - does not call the C function
     - creates and returns a generator instance instead and fills in the
       created closure and the pointer to the C function part of the
       generator function
   * generator functions become modified closure functions:
     - METH_O signature instead of METH_NOARGS to receive the send(x) value
       directly (note that gen.__next__() is defined as gen.send(None) and
       gen.throw(exc) could be implemented as gen.send(NULL))
     - closures additionally contain function temps (I'm thinking of a
       union of structs, i.e. one struct for each set of temps that existed
       during the code generation for a yield node, but I guess storing
       all temps is just fine to start with - won't impact performance,
       just memory)
     - closures have an additional C field to store the execution state
       (void* to a function label, initially NULL)
     - "sendval = (yield [expr])" emits the following code:
       - store away all current temp values in the closure
       - set "closure._resume_label" to the resume label (see below, uses
         the C operator "&&")
       - return the expression result (or None) - return immediately
         without cleanup (the temp that holds the expression result must be
         unmanaged to prevent DECREF()-ing on resume; INCREF()-ing the
         return value will keep it alive for too long)
       - here goes the resume label ("__Lxyz_resume_from_yield:")
       - restore all saved temp values from the closure
       - if an exception is to be raised (gen.throw() was called, which has
         already set the exception externally), use normal exception path
       - set the result temp of the yield node to the send value argument
         that was passed (INCREF or not, as for parameters)
   * generator C function basically implements gen.send(x)
     - receives both the closure and the current send value as parameters
     - if "closure._resume_label" is not NULL, jump to the label;
       otherwise, check that 'x' is None (raise an exception if not) and
       execute the function body normally
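
To make the yield/resume steps in 4) a bit more concrete, here is a rough
Python model of what the generated C function might do for a tiny generator.
It is only a sketch under assumptions: Python has no labels, so a plain
constant stands in for the "&&label" pointer, the exception path used by
gen.throw() is left out, and the names Closure, temp_lhs, resume_label and
run are made up for illustration.

```python
def acc(start):
    """The generator being modelled; 'total' is live in a temp across the yield."""
    total = start
    while True:
        total = total + (yield total)


LABEL_RESUME_FROM_YIELD = "L1_resume_from_yield"  # hypothetical label id


class Closure:
    """Models the extended closure: arguments, locals, temps, resume state."""

    def __init__(self, start):
        self.start = start        # function argument
        self.total = None         # local variable
        self.temp_lhs = None      # temp that must survive the yield
        self.resume_label = None  # stays None (NULL) until the first yield


def run(closure, sendval):
    """Models the generator C function, i.e. gen.send(sendval)."""
    if closure.resume_label is None:
        # First call: only None may be sent, then the body starts normally.
        if sendval is not None:
            raise TypeError(
                "can't send non-None value to a just-started generator")
        closure.total = closure.start
    else:
        # "Jump" to the resume label: restore the saved temps and use the
        # send value as the result of the yield expression.
        lhs = closure.temp_lhs
        closure.total = lhs + sendval

    # Loop body up to the yield: evaluate the left operand into a temp,
    # store it in the closure, remember where to resume, return the value.
    lhs = closure.total
    closure.temp_lhs = lhs
    closure.resume_label = LABEL_RESUME_FROM_YIELD
    return closure.total
```

Calling run(c, None), run(c, 5), run(c, 1) on a Closure(10) should produce
the same values as next(g), g.send(5), g.send(1) on acc(10). With several
yields there would be one label per yield and a dispatch on resume_label at
the top of the function.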

So the main work that's left to be done in 4) will be the closure extension
to include the temps and the yield/resume implementation.

Here's the (trivial) generic generator type:

    cdef class generator:
        cdef object _closure
        cdef meth_o_func* _run
        cdef object __weakref__

        def __iter__(self):
            return self

        def __next__(self):
            return self._run(self._closure, None)

        def send(self, value):
            return self._run(self._closure, value)

        def throw(self, type, value=None, traceback=None):
            EXC_RESET(type, value, traceback)
            return self._run(self._closure, NULL)

        def close(self):
            try:
                EXC_RESET(GeneratorExit, NULL, NULL)
                self._run(self._closure, NULL)
            except (GeneratorExit, StopIteration):
                pass
            else:
                raise RuntimeError('generator ignored GeneratorExit')
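
For playing with the protocol outside of Cython, the same type can be
modelled in plain Python. This is only a sketch under assumptions: the NULL
sentinel, the pending_exc attribute (standing in for what EXC_RESET would
set) and the CounterClosure/counter_run pair are invented for illustration,
and throw() takes a ready-made exception instance to keep it short.

```python
NULL = object()  # stands in for passing a C NULL as the send value


class Generator:
    """Plain-Python model of the generic generator type."""

    def __init__(self, closure, run):
        self._closure = closure
        self._run = run  # models the pointer to the generator C function

    def __iter__(self):
        return self

    def __next__(self):
        return self._run(self._closure, None)

    def send(self, value):
        return self._run(self._closure, value)

    def throw(self, exc):
        # Models EXC_RESET(): set the exception "externally", then resume
        # the body with a NULL send value.
        self._closure.pending_exc = exc
        return self._run(self._closure, NULL)

    def close(self):
        try:
            self.throw(GeneratorExit())
        except (GeneratorExit, StopIteration):
            pass
        else:
            raise RuntimeError('generator ignored GeneratorExit')


class CounterClosure:
    """Hypothetical closure for a generator that yields 0 .. limit-1."""

    def __init__(self, limit):
        self.limit = limit
        self.i = 0
        self.started = False
        self.pending_exc = None


def counter_run(closure, value):
    """Hypothetical body function matching the run(closure, value) shape."""
    if value is NULL:
        # Exception path: mark the generator as finished and propagate.
        exc, closure.pending_exc = closure.pending_exc, None
        closure.i = closure.limit
        raise exc
    if closure.started:
        closure.i += 1
    closure.started = True
    if closure.i >= closure.limit:
        raise StopIteration
    return closure.i
```

With this, list(Generator(CounterClosure(3), counter_run)) walks through the
values like a normal iterator, and close() turns a body that swallows
GeneratorExit into a RuntimeError.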

I wonder if there is a way to make it inherit from CPython's GeneratorType.
That would improve interoperability, but it would also add some unnecessary
instance size overhead, and we would have to prevent the base type from
doing anything, including initialisation and final cleanup.

The separation in 1a) has also been requested by Lisandro (and likely
others) a while ago to make the function setup code more readable.
Currently, the argument unpacking code takes so much space that it's easy
to get lost when trying to read the generated function code, especially in
short functions.
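
As a rough picture of the split in 1a), here is a plain-Python analogy. It
is not what Cython generates, just a sketch: _body_add plays the role of the
static C function with the signature as written in the code, and add plays
the role of the Python wrapper that does the argument unpacking. Both names
are made up for illustration.

```python
def _body_add(a, b):
    """Stands in for the static C function: only the original body."""
    return a + b


def add(*args, **kwargs):
    """Stands in for the Python wrapper: unpack arguments, call the body."""
    names = ("a", "b")
    if len(args) > len(names):
        raise TypeError("add() takes at most 2 positional arguments")

    # Positional arguments first, then keywords, rejecting duplicates.
    values = dict(zip(names, args))
    for key, val in kwargs.items():
        if key not in names:
            raise TypeError("add() got an unexpected keyword argument %r" % key)
        if key in values:
            raise TypeError("add() got multiple values for argument %r" % key)
        values[key] = val

    missing = [name for name in names if name not in values]
    if missing:
        raise TypeError("add() missing arguments: %s" % ", ".join(missing))

    # All the noisy setup lives above; the body stays short and readable.
    return _body_add(values["a"], values["b"])
```

Even in this toy version the unpacking code is several times longer than
the body, which is the readability problem described above.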

The refactoring for 1) actually conflicts a bit with cpdef functions, which
do the exact opposite: they create a DefNode for an existing C function. I
wonder if it makes sense to swap that while we're at it. That would reduce
some redundancy.

Ok, this is a rather lengthy e-mail that's a bit akin to a spec already.
Does this make sense to everybody? Any objections or ideas? Anyone happy to
give a hand? :)

Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
