On Wed, Feb 27, 2013 at 5:09 AM, Stefan Behnel <stefan...@behnel.de> wrote:
> Robert Bradshaw, 27.02.2013 09:54:
>> On Tue, Feb 26, 2013 at 11:24 PM, Stefan Behnel wrote:
>>> I imagine that the freelist could leave the initial vtable untouched in
>>> some cases, but that would mean that we need a freelist per actual type,
>>> instead of object struct size.
>>>
>>> Now, if we move the freelist handling into each subtype (as you and Mark
>>> proposed already), we'd get some of this for free, because the objects that
>>> get freed are already properly set up for the specific type, including
>>> vtable etc. All that remains to be done is to zero out the (known) C typed
>>> attributes, set the (known) object attributes to None, and call any
>>> __cinit__() methods in the super types to do the rest for us. We might have
>>> to do it in the right order, i.e. initialise some attributes, call the
>>> corresponding __cinit__() method, initialise some more attributes, ...
>>>
>>> So, basically, we'd manually inline the bottom-up aggregation of all tp_new
>>> functions into the current one, skipping those operations that we don't
>>> consider necessary in the freelist case, such as the vtable setup.
>>>
>>> Now, the only remaining issue is how to get at the __cinit__() functions if
>>> the base type isn't in the same module, but as Mark proposed, that could
>>> still be done if we require it to be exported in a C-API (and assume that
>>> it doesn't exist if not?). Would be better to know it at compile time,
>>> though...
>>
>> Yes, and that's still going to (potentially) be expensive. I'd rather
>> have a way of controlling what, if anything, gets zero'd out/set to
>> None, as most of that (in Sage's case at least) will still be valid
>> for the newly-reused type or instantly over-written (though perhaps
>> the default could be to call __dealloc__/__cinit__). With this we
>> could skip going up and down the type hierarchy at all.
>
> I don't think the zeroing is a problem. Just bursting out static data to
> memory should be plenty fast these days and not incur any wait cycles or
> pipeline stalls, as long as the compiler/processor can figure out that
> there are no interdependencies between the assignments. The None
> assignments may be a problem due to the INCREFs, but even in that case, the
> C compiler and processor should be able to detect that they are all just
> incrementing the same address in memory and may end up reducing a series of
> updates into one. The only real problem are the calls to __cinit__(), which
> run user code and can thus do anything. If they can't be inlined, the C
> compiler needs to lessen a lot of its assumptions.
>
> Would it make sense to require users to implement __cinit__() as an inline
> method in a .pxd file if they want to use a freelist on a subtype? Or would
> that be overly restrictive? It would prevent them from using module
> globals, for example. That's quite a restriction normally, but I'm not sure
> how much it hurts the "average" code in the specific case of __cinit__().

It would hurt in the couple of examples I've thought about (e.g. fast
Sage elements, where one wants to set the Parent field correctly).

- Robert
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
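
For illustration only, here is a rough C sketch of the per-type freelist idea discussed in this thread. It is not what Cython generates. The names Elem, ElemVTable, elem_new, elem_free and elem_cinit are all made up for the example, plain pointers stand in for Python object attributes, and reference counting is left out entirely. The point it tries to show is that an object taken from a freelist kept per actual type already has the right vtable pointer, so the reuse path only has to reset the known attributes and run the __cinit__() stand-in.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vtable of a cdef class. */
typedef struct ElemVTable {
    int (*describe)(void);
} ElemVTable;

/* Hypothetical object struct: vtable pointer, one "object" attribute
 * (a plain pointer here, no refcounting), one C-typed attribute. */
typedef struct Elem {
    const ElemVTable *vtab;
    const void *parent;
    long value;
} Elem;

static int elem_describe(void) { return 1; }
static const ElemVTable elem_vtable = { elem_describe };

/* One freelist per actual type, not per object struct size. */
#define ELEM_FREELIST_MAX 8
static Elem *elem_freelist[ELEM_FREELIST_MAX];
static int elem_freelist_len = 0;

/* Counts how often the full setup path (including vtable setup) ran. */
static int elem_vtable_setups = 0;

/* Stand-in for a user-written __cinit__(); in real code this may do
 * anything, which is the part that is hard to optimise. */
static void elem_cinit(Elem *self, const void *parent) {
    self->parent = parent;
}

Elem *elem_new(const void *parent) {
    Elem *self;
    if (elem_freelist_len > 0) {
        /* Reuse path: the vtable pointer is already correct for this type. */
        self = elem_freelist[--elem_freelist_len];
    } else {
        /* Fresh allocation: do the full setup, including the vtable. */
        self = malloc(sizeof *self);
        if (self == NULL) {
            return NULL;
        }
        self->vtab = &elem_vtable;
        elem_vtable_setups++;
    }
    /* Reset the known attributes, then run the __cinit__() stand-in. */
    self->parent = NULL;
    self->value = 0;
    elem_cinit(self, parent);
    return self;
}

void elem_free(Elem *self) {
    if (self == NULL) {
        return;
    }
    /* Only objects of exactly this type may land on this freelist. */
    assert(self->vtab == &elem_vtable);
    if (elem_freelist_len < ELEM_FREELIST_MAX) {
        elem_freelist[elem_freelist_len++] = self;
    } else {
        free(self);
    }
}
```

Allocating, freeing and allocating again hands back the same memory with its attributes reset, and the vtable setup runs only once. Skipping the two resets in elem_new would correspond to the "control what gets zeroed" option above, at the cost of stale values being visible inside __cinit__().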