Hi, I would like to get support for borrowed references in Cython.

My use case is a package I recently wrote, which uses an automaton (DFA) for multi-keyword text search in a unicode string. The automaton was originally modeled as a set of state objects that basically contain a (non-dict) map from characters to subsequent state objects. The transformation is currently written in Python.

The search engine itself, written in Cython, is rather straightforward. It starts with a reference to one state and jumps from one state to the next while reading the sequence of input characters. The engine doesn't create any new objects until a match is found. All it does is update a reference to the current state. If that reference were an unmanaged reference, the whole engine could run without requiring the GIL. However, since it's not, Cython requires the GIL to update the reference for each new input character. (Note that a PyObject* won't work here, as this would prevent the code from accessing the state's attributes.)

I ended up rewriting the engine to use a struct instead, which added quite a bit to the LOC count and also a bit to the memory footprint (due to duplicated pointers). The code was nice and simple before that. Now it's still somewhat short, but it clearly became less beautiful.

I'm not sure yet what would be needed to support borrowed references, but I don't think it's trivial. There was an older discussion about borrowed references on cython-dev:

http://comments.gmane.org/gmane.comp.python.cython.devel/6864

However, that only dealt with stolen function arguments and borrowed return values. My use case above makes me believe that it would be just as useful for local variables and (potentially) object attributes. So you could write

    cdef borrowed object borrowed_ref

and Cython would disable ref-counting for "borrowed_ref", i.e.

    borrowed_ref = some_value         # no incref
    some_normal_var = borrowed_ref    # normal incref

However, this becomes problematic when a new reference is (accidentally?) assigned to the variable, e.g.

    borrowed_ref = []

The above could raise an error at compile time, and I actually think that we could use the same mechanism as for e.g. bytes->char* conversions of temporary values to detect incorrect code. Also, functions could be required to declare their return values as "borrowed" to allow such an assignment. That would provide a reasonable level of safety IMO.

Comments?

Stefan
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev
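
For illustration only, the state-walking engine described in the message might look roughly like the pure-Python sketch below. This is not the code from the package. The names `State`, `build_dfa` and `search` are hypothetical, the transition map is a plain dict for brevity (the message says the real one is a non-dict map), and the Aho-Corasick style construction is just one assumed way to obtain such a DFA. The point is the single assignment in `search`: it is the only per-character work, and it is the step where a ref-counted reference would need the GIL in a Cython version.

```python
from collections import deque


class State:
    """One DFA state: a character -> State map plus the keywords ending here."""
    __slots__ = ("next", "matches")

    def __init__(self):
        self.next = {}     # dict for brevity; an assumption, not the real layout
        self.matches = ()  # keywords recognised when this state is reached


def build_dfa(keywords):
    """Build a complete DFA over the keywords' alphabet (Aho-Corasick style)."""
    start = State()
    # Step 1: build a trie of the keywords.
    for word in keywords:
        state = start
        for ch in word:
            state = state.next.setdefault(ch, State())
        state.matches += (word,)
    alphabet = {ch for word in keywords for ch in word}

    # Step 2: breadth-first pass that fills in every missing transition
    # by copying it from the state's fallback (longest proper suffix) state.
    fail = {start: start}
    queue = deque()
    for ch in alphabet:
        child = start.next.get(ch)
        if child is None:
            start.next[ch] = start
        else:
            fail[child] = start
            queue.append(child)
    while queue:
        state = queue.popleft()
        fallback = fail[state]
        state.matches += fallback.matches  # inherit matches of the suffix
        for ch in alphabet:
            child = state.next.get(ch)
            if child is None:
                state.next[ch] = fallback.next[ch]
            else:
                fail[child] = fallback.next[ch]
                queue.append(child)
    return start


def search(start, text):
    """Yield (end_index, keyword) for every keyword occurrence in text."""
    state = start
    for i, ch in enumerate(text):
        # Re-point `state` at the next state. In a Cython version with a
        # normal object reference, this assignment is what needs ref-counting;
        # with a borrowed reference it would be a plain pointer store.
        # Characters outside the keyword alphabet reset to the start state.
        state = state.next.get(ch, start)
        for word in state.matches:
            yield i + 1, word
```

Calling `list(search(build_dfa(["he", "she"]), some_text))` walks the text once, touching exactly one state per character, which is the access pattern the borrowed-reference proposal is aimed at.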
