Hello,

I'm starting this thread to brainstorm about using vectorcall to speed up the creation of instances of Python classes.

Currently, the following happens when an instance of a Python class X is created with X(...), assuming that __new__ and __init__ are Python functions and that the metaclass of X is simply "type":

1. type_call (the tp_call wrapper for type) is invoked with arguments (X, args, kwargs).

2. type_call calls slot_tp_new with arguments (X, args, kwargs).

3. slot_tp_new calls X.__new__, prepending X to the args tuple. A new object obj is returned.

4. type_call calls slot_tp_init with arguments (obj, args, kwargs).

5. slot_tp_init calls type(obj).__init__, prepending obj to the args tuple. The return value must be None, and type_call then returns obj.
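
The five steps above can be approximated in pure Python. This is only a rough sketch of what type_call does in C, assuming the metaclass is plain "type"; the names type_call_sketch and Point are made up for illustration and do not exist in CPython.

```python
def type_call_sketch(cls, *args, **kwargs):
    """Rough Python model of type_call (steps 1-5); not the real C code."""
    # Steps 2-3: slot_tp_new looks up __new__ and prepends cls.
    obj = cls.__new__(cls, *args, **kwargs)
    # type_call skips __init__ when __new__ returned an unrelated object.
    if not isinstance(obj, cls):
        return obj
    # Steps 4-5: slot_tp_init looks up __init__ on type(obj) and prepends obj.
    result = type(obj).__init__(obj, *args, **kwargs)
    if result is not None:
        raise TypeError("__init__() should return None")
    return obj


class Point:  # hypothetical example class
    def __new__(cls, x, y=0):
        return super().__new__(cls)

    def __init__(self, x, y=0):
        self.x = x
        self.y = y


p = type_call_sketch(Point, 1, y=2)
q = Point(1, y=2)  # the real thing, for comparison
print((p.x, p.y) == (q.x, q.y))
```

Note how the same args and kwargs are unpacked and repacked once for __new__ and once more for __init__; that repacking is where the temporaries come from.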

In the worst case, no fewer than 6 temporary objects are needed just to pass arguments around (the numbers below refer to the steps above):

1. An args tuple and kwargs dict for tp_call

3. An args array with X prepended and a kwnames tuple for __new__

5. An args array with obj prepended and a kwnames tuple for __init__
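
To make the count concrete, here is a small Python model of the two argument layouts involved. The helper names and the class X are invented for this illustration; the real conversions happen in C, and the Python list below merely stands in for a C array.

```python
def pack_for_tp_call(positional, keywords):
    """tp_call convention: one tuple plus one dict (2 temporaries)."""
    return tuple(positional), dict(keywords)


def pack_for_python_function(first, args, kwargs):
    """Vectorcall-style layout: a flat array with `first` prepended and the
    keyword values appended, plus a tuple of keyword names (2 temporaries)."""
    argv = [first, *args, *kwargs.values()]
    kwnames = tuple(kwargs)
    return argv, kwnames


class X:  # hypothetical placeholder class
    pass


temporaries = []

# Step 1: the caller builds a tuple and a dict for tp_call.
args, kwargs = pack_for_tp_call([1, 2], {"z": 3})
temporaries += [args, kwargs]

# Step 3: X is prepended for __new__.
new_argv, new_kwnames = pack_for_python_function(X, args, kwargs)
temporaries += [new_argv, new_kwnames]

# Step 5: obj is prepended for __init__.
obj = object.__new__(X)
init_argv, init_kwnames = pack_for_python_function(obj, args, kwargs)
temporaries += [init_argv, init_kwnames]

print(len(temporaries))
```

The worst case assumes keyword arguments are present; a call with positional arguments only needs fewer of these objects.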

This is clearly not as efficient as it could be.

An obvious solution would be to introduce variants of tp_new and tp_init using the vectorcall protocol. Assuming PY_VECTORCALL_ARGUMENTS_OFFSET is used, all 6 temporary allocations could be dropped. The implementation could be in the form of two new slots tp_vector_new and tp_vector_init. Since we're just dealing with type slots here (as opposed to offsets in an object structure), this should be easier to implement than PEP 590 itself.
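
As a rough illustration of why PY_VECTORCALL_ARGUMENTS_OFFSET helps: with that flag, the caller allows the callee to temporarily overwrite the slot just before the first argument, as long as it is restored before returning. Prepending X or obj then needs no new array. The Python model below imitates this with a list and a start index. Keep in mind that tp_vector_new and tp_vector_init do not exist; vector_type_call and call_python_function are placeholder names, and the slicing inside call_python_function stands in for what C code could do without allocating.

```python
def call_python_function(func, buf, start, nargs, kwnames):
    """Stand-in for a vectorcall into a Python function.

    buf[start:start + nargs] holds the positional arguments, followed by
    one value per entry in kwnames.
    """
    positional = buf[start:start + nargs]
    values = buf[start + nargs:start + nargs + len(kwnames)]
    return func(*positional, **dict(zip(kwnames, values)))


def vector_type_call(cls, buf, start, nargs, kwnames):
    """Model of type_call on top of the two proposed (hypothetical) slots.

    buf[start - 1] is the scratch slot made available by the offset flag.
    """
    saved = buf[start - 1]
    try:
        buf[start - 1] = cls  # "prepend" X in place
        obj = call_python_function(
            cls.__new__, buf, start - 1, nargs + 1, kwnames)
        if isinstance(obj, cls):
            buf[start - 1] = obj  # reuse the same slot for obj
            call_python_function(
                type(obj).__init__, buf, start - 1, nargs + 1, kwnames)
        return obj
    finally:
        buf[start - 1] = saved  # restore the caller's value


class Point:  # hypothetical example class
    def __new__(cls, x, y=0):
        return super().__new__(cls)

    def __init__(self, x, y=0):
        self.x, self.y = x, y


# Scratch slot, then positional x=1, then the value for keyword y.
buf = [None, 1, 2]
p = vector_type_call(Point, buf, 1, 1, ("y",))
print(p.x, p.y)
```

The same buffer and the same kwnames tuple serve both calls, which is the saving the proposal is after.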


Jeroen.