Peter Alexander:

How does this handle recursive generators, e.g. like you would see with a depth-first tree traversal? The "state" in this case is a call stack. How is the memory allocated for the stack?

A Python 2.x test program:


class Node:
    def __init__(self, data, left, right):
        self.data = data
        self.left = left
        self.right = right

def in_order(node):
    if node is not None:
        for n in in_order(node.left):
            yield n
        yield node.data
        for n in in_order(node.right):
            yield n

def main():
    tree = Node(1,
                Node(2,
                     Node(4,
                          Node(7, None, None),
                          None),
                     Node(5, None, None)),
                Node(3,
                     Node(6,
                          Node(8, None, None),
                          Node(9, None, None)),
                     None))

    for n in in_order(tree):
        print n,
    print

main()


It prints:

7 4 2 5 1 8 6 9 3


ShedSkin translates it to some C++ code, here I show only the in_order function:



class __gen_in_order : public __iter<__ss_int> {
public:
    Node *node;
    __iter<__ss_int> *__0, *__1, *__4, *__5;
    __ss_int __2, __6, n;
    __iter<__ss_int>::for_in_loop __3, __7;

    int __last_yield;

    __gen_in_order(Node *node) {
        this->node = node;
        __last_yield = -1;
    }

    __ss_int __get_next() {
        switch(__last_yield) {
            case 0: goto __after_yield_0;
            case 1: goto __after_yield_1;
            case 2: goto __after_yield_2;
            default: break;
        }
        if ((node!=NULL)) {

            FOR_IN(n,in_order(node->left),0,2,3)
                __last_yield = 0;
                __result = n;
                return __result;
                __after_yield_0:;
            END_FOR

            __last_yield = 1;
            __result = node->data;
            return __result;
            __after_yield_1:;

            FOR_IN(n,in_order(node->right),4,6,7)
                __last_yield = 2;
                __result = n;
                return __result;
                __after_yield_2:;
            END_FOR

        }
        __stop_iteration = true;
    }

};

__iter<__ss_int> *in_order(Node *node) {
    return new __gen_in_order(node);
}



Where FOR_IN is a macro to some foreach-like code:

#define FOR_IN(e, iter, temp, i, t) \
    __ ## temp = iter; \
    __ ## i = -1; \
    __ ## t = __ ## temp->for_in_init(); \
    while(__ ## temp->for_in_has_next(__ ## t)) \
    { \
        __ ## i ++; \
        e = __ ## temp->for_in_next(__ ## t);


Note this is not efficient, because it could generate O(n^2) code, but this is not fault of ShedSkin, as the original Python code has the same problem.

They have recently (in Python 3.3) added to Python the syntax "yield from" that will allow to remove that performance problem (currently I think the performance is not improved):

http://legacy.python.org/dev/peps/pep-0380/

Look especially at the section regarding optimization:

http://legacy.python.org/dev/peps/pep-0380/#optimisations

With this improvement the in_order function becomes:

def in_order(node):
    if node is not None:
        yield from in_order(node.left)
        yield node.data
        yield from in_order(node.right)


The F# language has the same optimization, the two differnet kinds of yield are "yield" and "yield!":

http://theburningmonk.com/2011/09/fsharp-yield-vs-yield/

http://stackoverflow.com/questions/3500488/f-yield-operator-implementation-and-possible-c-sharp-equivalents

Bye,
bearophile

Reply via email to