On Wed, Feb 27, 2013 at 5:09 AM, Stefan Behnel <stefan...@behnel.de> wrote:
> Robert Bradshaw, 27.02.2013 09:54:
>> On Tue, Feb 26, 2013 at 11:24 PM, Stefan Behnel wrote:
>>> I imagine that the freelist could leave the initial vtable untouched in
>>> some cases, but that would mean that we need a freelist per actual type,
>>> instead of object struct size.
>>>
>>> Now, if we move the freelist handling into each subtype (as you and Mark
>>> proposed already), we'd get some of this for free, because the objects that
>>> get freed are already properly set up for the specific type, including
>>> vtable etc. All that remains to be done is to zero out the (known) C typed
>>> attributes, set the (known) object attributes to None, and call any
>>> __cinit__() methods in the super types to do the rest for us. We might have
>>> to do it in the right order, i.e. initialise some attributes, call the
>>> corresponding __cinit__() method, initialise some more attributes, ...
>>>
>>> So, basically, we'd manually inline the bottom-up aggregation of all tp_new
>>> functions into the current one, skipping those operations that we don't
>>> consider necessary in the freelist case, such as the vtable setup.
>>>
>>> Now, the only remaining issue is how to get at the __cinit__() functions if
>>> the base type isn't in the same module, but as Mark proposed, that could
>>> still be done if we require it to be exported in a C-API (and assume that
>>> it doesn't exist if not?). Would be better to know it at compile time,
>>> though...
>>
>> Yes, and that's still going to (potentially) be expensive. I'd rather
>> have a way of controlling what, if anything, gets zero'd out/set to
>> None, as most of that (in Sage's case at least) will still be valid
>> for the newly-reused type or instantly over-written (though perhaps
>> the default could be to call __dealloc__/__cinit__). With this we
>> could skip going up and down the type hierarchy at all.
>
> I don't think the zeroing is a problem. Just bursting out static data to
> memory should be plenty fast these days and not incur any wait cycles or
> pipeline stalls, as long as the compiler/processor can figure out that
> there are no interdependencies between the assignments. The None
> assignments may be a problem due to the INCREFs, but even in that case, the
> C compiler and processor should be able to detect that they are all just
> incrementing the same address in memory and may end up reducing a series of
> updates into one. The only real problem are the calls to __cinit__(), which
> run user code and can thus do anything. If they can't be inlined, the C
> compiler needs to lessen a lot of its assumptions.
>
> Would it make sense to require users to implement __cinit__() as an inline
> method in a .pxd file if they want to use a freelist on a subtype? Or would
> that be overly restrictive? It would prevent them from using module
> globals, for example. That's quite a restriction normally, but I'm not sure
> how much it hurts the "average" code in the specific case of __cinit__().

It would hurt in the couple of examples I've thought about (e.g. fast
Sage elements, where one wants to set the Parent field correctly).

- Robert
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel

Reply via email to