I wanted to follow up a bit on the issue of speeding up function calls with some more specifics.
Currently, the function call mechanism has to do a lot of conversion work on the function arguments. Specifically, it takes the positional arguments, the keyword arguments, the varargs argument, and the kwargs argument, and combines them into a single tuple and a single dict. It then calls the target function via PyObject_Call.

On the receiving side, a different conversion takes place, depending on the type of the callable. For user-defined functions, it maps the arguments in the tuple and dict onto the function's parameter slots, taking default values and the like into account.

One way to speed this up would be to identify cases where the conversion steps can be skipped. There is already a method for this in the case where there are only positional arguments, with no varargs or keywords.

Ideally, we would like to be able to pass the caller's argument stack pretty much as-is to the receiver. At the moment, that stack consists of an array of positional arguments, followed by an array of key/value pairs for the keyword arguments, followed by the optional varargs and kwargs arguments.

Assume for a moment, then, that we had a variation of PyObject_Call whose function signature was exactly that:

    PyObject_CallFast(
        PyObject *func,
        PyObject **args, int numArgs,
        PyObject **kargs, int numKwArgs,
        PyObject *starargs,
        PyObject *kwarg );

The receiver would have to do extra work, because it now has to check two places for each argument. For a keyword argument, for example, it has to check both the key/value array and the kwargs dict argument.

However, this is far less work than building a new dictionary. In addition, we know that the keywords are interned strings, not arbitrary objects, so we can speed up the comparison by avoiding type tests. Also, because the keyword args are in a linear list, they will exhibit better cache behavior, and the lookup will be very fast: a simple loop and test.
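
To make the receiver-side lookup concrete, here is a rough sketch in plain C. It is only an illustration of the idea above, not CPython code: Name, Value, KwPair, KwDict, ArgStatus and find_keyword are all hypothetical stand-ins invented for this example, and the "dict" is just a small array. The sketch assumes that keys in the flat array are interned (so pointer comparison suffices), and it further assumes that keys coming from the kwargs dict might not be, so those are compared by content.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an interned string: equal names in the
 * caller's keyword array are assumed to be the same object. */
typedef struct { const char *text; } Name;

/* Hypothetical stand-in for an argument value. */
typedef struct { int payload; } Value;

/* One key/value pair from the caller's keyword array. */
typedef struct { const Name *key; Value *value; } KwPair;

/* Toy stand-in for the **kwargs dict (a real dict would hash). */
typedef struct { const KwPair *items; size_t count; } KwDict;

typedef enum { ARG_FOUND, ARG_MISSING, ARG_DUPLICATE } ArgStatus;

/* Look up one keyword parameter in both places the receiver must check:
 * the flat key/value array, then the optional kwargs dict. */
ArgStatus find_keyword(const Name *name,
                       const KwPair *kargs, size_t num_kwargs,
                       const KwDict *kwarg,
                       Value **out)
{
    Value *found = NULL;
    int hits = 0;

    assert(name != NULL && out != NULL);

    /* Place 1: linear scan of the flat array, comparing by pointer
     * because the keys are assumed to be interned. */
    for (size_t i = 0; i < num_kwargs; i++) {
        if (kargs[i].key == name) {
            found = kargs[i].value;
            hits++;
        }
    }

    /* Place 2: the optional kwargs dict. Assumption of this sketch:
     * these keys may not be interned, so compare by content. */
    if (kwarg != NULL) {
        for (size_t i = 0; i < kwarg->count; i++) {
            if (strcmp(kwarg->items[i].key->text, name->text) == 0) {
                found = kwarg->items[i].value;
                hits++;
            }
        }
    }

    if (hits == 0)
        return ARG_MISSING;    /* caller falls back to a default, if any */
    if (hits > 1)
        return ARG_DUPLICATE;  /* same keyword supplied more than once */

    *out = found;
    return ARG_FOUND;
}
```

A receiver would call find_keyword once per named parameter. The ARG_DUPLICATE case corresponds to the validation step of making sure a keyword argument is not assigned more than once.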

(If you want to write optimal code for today's processors, you pretty much have to throw out everything you learned about unrolling loops and such, and learn to think in terms of cache lines, not bytes and words.)

Now, it may be that in some cases the receiver really does require a tuple and a dict. In such a case, however, the onus should be on the receiver to convert the argument list into whatever data structure it requires. For receivers with relatively simple calling signatures, that may mean no conversion is done at all, and the caller's argument list can be used directly.

The most complex part would be validating the arguments: making sure that a keyword argument didn't get assigned more than once, and so on.

What about older code? Well, I suppose you could add an additional field to the receiving object, indicating whether it accepts the traditional PyObject_Call-style args or the newer format. Essentially you would have two function pointers in the object, one for new-style calling and one for old-style. If the new-style pointer is present, the interpreter uses it; otherwise it falls back to the older approach of transforming the arguments into a tuple and a dict.

In any case, the reason why this is "Py3K" material is that it involves busting the existing C API to some extent.

-- Talin
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000