STINNER Victor added the comment: Oops, I forgot a DECREF: fixed in the patch version 2.
-- Oh wait, I misunderstood how dict.update() is called. In fact, they are two bytecodes to call a function with keyword arguments. (1) Using **kwargs: >>> def f(): ... d.update(**d2) ... >>> dis.dis(f) 2 0 LOAD_GLOBAL 0 (d) 2 LOAD_ATTR 1 (update) 4 BUILD_TUPLE 0 6 LOAD_GLOBAL 2 (d2) 8 CALL_FUNCTION_EX 1 10 POP_TOP 12 LOAD_CONST 0 (None) 14 RETURN_VALUE (2) Using a list of key=value: >>> def g(): ... d.update(x=1, y=2) ... >>> dis.dis(g) 2 0 LOAD_GLOBAL 0 (d) 2 LOAD_ATTR 1 (update) 4 LOAD_CONST 1 (1) 6 LOAD_CONST 2 (2) 8 LOAD_CONST 3 (('x', 'y')) 10 CALL_FUNCTION_KW 2 12 POP_TOP 14 LOAD_CONST 0 (None) 16 RETURN_VALUE The problem is that the dict.update() method has a single implementation, the C dict_update() function. For (2), there is a speedup, but it's minor: $ ./python -m perf timeit -s 'd={"x": 1, "y": 2}' 'd.update(x=1, y=2)' -p10 --compare-to=../default-ref/python Median +- std dev: [ref] 185 ns +- 62 ns -> [patched] 177 ns +- 2 ns: 1.05x faster (-5%) For (1), I expected that **kwargs would be unpacked *before* calling dict.update(), but kwargs is passed unchanged to dict.update() directly! With my patch, CALL_FUNCTION_EX calls PyCFunction_Call() which uses _PyStack_UnpackDict() to create kwnames and then dict_update() rebuilds a new temporary dictionary. It's completely inefficient! As Raymond expected, it's much slower: haypo@smithers$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}; d2=dict(d)' 'd.update(**d2)' -p10 --compare-to=../default-ref/python Median +- std dev: [ref] 114 ns +- 1 ns -> [patched] 232 ns +- 21 ns: 2.04x slower (+104%) I expect that (1) dict.update(**kwargs) is more common than (2) dict.update(x=1, y=2). Moreover, the speedup for (2) is low (5%), so I prefer to reject this issue. -- Naoki: "So, when considering METH_FASTCALL, supporting **kwargs is lowest priority. (Off course, supporting it by AC with METH_KEYWORDS is nice to have)" Hum, dict.update() is the first function that I found that really wants a Python dict at the end. For dict1.update(**dict2), METH_VARARGS|METH_KEYWORDS is already optimal. So I don't think that it's worth it to micro-optimize the way to pass positional arguments. The common case is to call dict1.update(dict2) which requires to build a temporary tuple of 1 item. PyTuple_New() uses a free list for such small tuple, so it should be fast enough. I found a few functions which pass through keyword arguments, but they are "proxy". I'm converting all METH_VARARGS|METH_KEYWORDS to METH_FASTCALL, so most functions will expects a kwnames tuple at the end of the call for keyword arguments. In this case, using METH_FASTCALL for the proxy is optimum for func(x=1, y=2) (CALL_FUNCTION_KW), but slower for func(**kwargs) (CALL_FUNCTION_EX). With METH_FASTCALL, CALL_FUNCTION_EX requires to unpack the dictionary if I understood correctly. ---------- status: open -> closed Added file: http://bugs.python.org/file46334/dict_update_fastcall-2.patch _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue29312> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com