STINNER Victor added the comment:
Oops, I forgot a DECREF: fixed in the patch version 2.
--
Oh wait, I misunderstood how dict.update() is called. In fact, they are two
bytecodes to call a function with keyword arguments.
(1) Using **kwargs:
>>> def f():
... d.update(**d2)
...
>>> dis.dis(f)
2 0 LOAD_GLOBAL 0 (d)
2 LOAD_ATTR 1 (update)
4 BUILD_TUPLE 0
6 LOAD_GLOBAL 2 (d2)
8 CALL_FUNCTION_EX 1
10 POP_TOP
12 LOAD_CONST 0 (None)
14 RETURN_VALUE
(2) Using a list of key=value:
>>> def g():
... d.update(x=1, y=2)
...
>>> dis.dis(g)
2 0 LOAD_GLOBAL 0 (d)
2 LOAD_ATTR 1 (update)
4 LOAD_CONST 1 (1)
6 LOAD_CONST 2 (2)
8 LOAD_CONST 3 (('x', 'y'))
10 CALL_FUNCTION_KW 2
12 POP_TOP
14 LOAD_CONST 0 (None)
16 RETURN_VALUE
The problem is that the dict.update() method has a single implementation, the C
dict_update() function.
For (2), there is a speedup, but it's minor:
$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}' 'd.update(x=1, y=2)' -p10
--compare-to=../default-ref/python
Median +- std dev: [ref] 185 ns +- 62 ns -> [patched] 177 ns +- 2 ns: 1.05x
faster (-5%)
For (1), I expected that **kwargs would be unpacked *before* calling
dict.update(), but kwargs is passed unchanged to dict.update() directly! With
my patch, CALL_FUNCTION_EX calls PyCFunction_Call() which uses
_PyStack_UnpackDict() to create kwnames and then dict_update() rebuilds a new
temporary dictionary. It's completely inefficient! As Raymond expected, it's
much slower:
haypo@smithers$ ./python -m perf timeit -s 'd={"x": 1, "y": 2}; d2=dict(d)'
'd.update(**d2)' -p10 --compare-to=../default-ref/python
Median +- std dev: [ref] 114 ns +- 1 ns -> [patched] 232 ns +- 21 ns: 2.04x
slower (+104%)
I expect that (1) dict.update(**kwargs) is more common than (2)
dict.update(x=1, y=2). Moreover, the speedup for (2) is low (5%), so I prefer
to reject this issue.
--
Naoki: "So, when considering METH_FASTCALL, supporting **kwargs is lowest
priority. (Off course, supporting it by AC with METH_KEYWORDS is nice to have)"
Hum, dict.update() is the first function that I found that really wants a
Python dict at the end.
For dict1.update(**dict2), METH_VARARGS|METH_KEYWORDS is already optimal.
So I don't think that it's worth it to micro-optimize the way to pass
positional arguments. The common case is to call dict1.update(dict2) which
requires to build a temporary tuple of 1 item. PyTuple_New() uses a free list
for such small tuple, so it should be fast enough.
I found a few functions which pass through keyword arguments, but they are
"proxy". I'm converting all METH_VARARGS|METH_KEYWORDS to METH_FASTCALL, so
most functions will expects a kwnames tuple at the end of the call for keyword
arguments. In this case, using METH_FASTCALL for the proxy is optimum for
func(x=1, y=2) (CALL_FUNCTION_KW), but slower for func(**kwargs)
(CALL_FUNCTION_EX).
With METH_FASTCALL, CALL_FUNCTION_EX requires to unpack the dictionary if I
understood correctly.
----------
status: open -> closed
Added file: http://bugs.python.org/file46334/dict_update_fastcall-2.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue29312>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com