[issue23507] Tuple creation is too slow

2016-12-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a rebased patch (just for history). Caching argtuple is not faster (and 
may even be slower) than using fast call.

The regression in stack consumption is fixed in issue28858.

Thanks Victor!

--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed
Added file: http://bugs.python.org/file45953/reuse_argtuples_4.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23507] Tuple creation is too slow

2016-12-01 Thread STINNER Victor

STINNER Victor added the comment:

haypo: "status: open -> closed"

Oops, it's a mistake, sorry. I only wanted to ask the question, not to close 
the issue.

Serhiy Storchaka: "No, I'm referring to the crashing. The code that worked 
before your changes is now crashing."

Sorry, I didn't notice that you attached a script! I opened issue #28858.

--




[issue23507] Tuple creation is too slow

2016-12-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

No, I'm referring to the crashing. The code that worked before your changes is 
now crashing. Please revert your changes and find a way to optimize function 
calls without increasing stack consumption.

--
resolution: fixed -> 
status: closed -> open




[issue23507] Tuple creation is too slow

2016-12-01 Thread STINNER Victor

STINNER Victor added the comment:

>  I opened a new issue #28852 to track this performance regression.

So can I close this issue again?

--
resolution:  -> fixed
status: open -> closed




[issue23507] Tuple creation is too slow

2016-12-01 Thread STINNER Victor

STINNER Victor added the comment:

Serhiy Storchaka: "Victor, your changes introduced a regression."

I suppose that you are referring to sorted(list), which seems to be slower. Do 
you have a specific change in mind? As I wrote, it doesn't use a key function, 
and so should not be impacted by my work on fast calls. I opened a new issue 
#28852 to track this performance regression.

--




[issue23507] Tuple creation is too slow

2016-12-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Victor, your changes introduced a regression.

--
resolution: fixed -> 
status: closed -> open




[issue23507] Tuple creation is too slow

2016-12-01 Thread STINNER Victor

STINNER Victor added the comment:

Serhiy: "I don't propose to commit this complicated patch, but these results 
can be used as a guide to the optimization of tuple creating. It is surprising 
to me that this patch has any effect at all."

A new "fast call" calling convention was added to Python 3.6 to avoid the 
creation of temporary tuples to pass function arguments. Benchmarks confirm 
that Python 3.7 is faster than Python 3.5 for the functions that Serhiy 
suggested optimizing. I confirm that I was also surprised that fast call has a 
significant effect on performance!

Fast calls don't require complex code like reuse_argtuples_3.patch, which 
has to check the reference counter carefully. In the past, a similar 
optimization in property_descr_get() introduced complex bugs: see issue #24276 
and issue #26811. Fast calls don't have such issues.

The implementation of fast calls is currently incomplete: the tp_new, tp_init 
and tp_call slots still require a tuple and a dict for positional and keyword 
arguments. I'm working on an enhancement for Python 3.7 to also support fast 
calls for these slots. In the meantime, callbacks using tp_call don't yet 
benefit from fast calls.

That's why I had to keep the tuple/refcount optimization in 
property_descr_get(), to not introduce a performance regression.

I consider that this issue can now be closed as "fixed".

--
resolution:  -> fixed
status: open -> closed




[issue23507] Tuple creation is too slow

2016-12-01 Thread STINNER Victor

STINNER Victor added the comment:

I rewrote bench_builtins.py to use my new perf module.

Python 3.7 is between 1.27x and 1.42x faster than Python 3.5, but sorted(list) 
is slower. I don't know why sorted(list) is slower. It doesn't use a key 
function, and so should not be impacted by FASTCALL changes made since Python 
3.6.

Benchmark results of Python 3.7 (including the latest FASTCALL patches) compared 
to Python 3.5 (no FASTCALL):

filter(lambda x: x, range(1000)):
  Median +- std dev: [3.5] 126 us +- 7 us -> [3.7] 94.6 us +- 4.8 us: 1.34x faster (-25%)
map(lambda x, y: x+y, range(1000), range(1000)):
  Median +- std dev: [3.5] 163 us +- 10 us -> [3.7] 115 us +- 7 us: 1.42x faster (-30%)
map(lambda x: x, range(1000)):
  Median +- std dev: [3.5] 115 us +- 6 us -> [3.7] 90.9 us +- 5.4 us: 1.27x faster (-21%)
sorted(list):
  Median +- std dev: [3.5] 17.5 us +- 1.0 us -> [3.7] 19.7 us +- 1.1 us: 1.12x slower (+12%)
sorted(list, key=lambda x: x):
  Median +- std dev: [3.5] 122 us +- 10 us -> [3.7] 91.8 us +- 5.7 us: 1.33x faster (-25%)

Note: you need the development version of perf to run the script (future 
version 0.9.2).
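
For readers without the attached bench_builtins.py, here is a rough sketch of how the same five expressions could be timed. It is not the attached script: it uses the standard library's timeit instead of the perf module, the names DATA, BENCHMARKS and run_benchmarks are made up for illustration, and the 1000-element list passed to sorted() is an assumption, since the thread does not show the real input. Absolute numbers will not be comparable with the results above.

```python
import statistics
import timeit

# Assumption: the thread never shows the list passed to sorted(); a
# 1000-element list of ints stands in for it here.
DATA = list(range(1000))

# filter() and map() are lazy, so each case is wrapped in list() to force
# the per-item callback calls that the thread is measuring.
BENCHMARKS = {
    "filter(lambda x: x, range(1000))":
        lambda: list(filter(lambda x: x, range(1000))),
    "map(lambda x, y: x+y, range(1000), range(1000))":
        lambda: list(map(lambda x, y: x + y, range(1000), range(1000))),
    "map(lambda x: x, range(1000))":
        lambda: list(map(lambda x: x, range(1000))),
    "sorted(list)":
        lambda: sorted(DATA),
    "sorted(list, key=lambda x: x)":
        lambda: sorted(DATA, key=lambda x: x),
}


def run_benchmarks(repeat=5, number=200):
    """Return {name: (median, stdev)} in seconds per call."""
    results = {}
    for name, func in BENCHMARKS.items():
        # timeit.repeat() returns one total time per repetition.
        totals = timeit.repeat(func, repeat=repeat, number=number)
        per_call = [total / number for total in totals]
        results[name] = (statistics.median(per_call),
                         statistics.stdev(per_call))
    return results


if __name__ == "__main__":
    for name, (median, stdev) in run_benchmarks().items():
        print(f"{name}: {median * 1e6:.1f} us +- {stdev * 1e6:.1f} us")
```

Running the same file under two interpreter builds and comparing the medians by hand gives a crude version of the comparison above. The extra lambda and list() wrappers add a constant overhead to every case.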

--
Added file: http://bugs.python.org/file45726/bench_builtins.py




[issue23507] Tuple creation is too slow

2016-12-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is an example that shows increased stack consumption.

With the old code it counts to 74700; with the patch it counts only to 65400.

Please revert your changes.

--
Added file: http://bugs.python.org/file45725/stack_overflow.py




[issue23507] Tuple creation is too slow

2016-12-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I think it would be better to optimize PyObject_CallFunctionObjArgs().

--




[issue23507] Tuple creation is too slow

2016-12-01 Thread Roundup Robot

Roundup Robot added the comment:

New changeset b9c9691c72c5 by Victor Stinner in branch 'default':
Replace PyObject_CallFunctionObjArgs() with fastcall
https://hg.python.org/cpython/rev/b9c9691c72c5

--
nosy: +python-dev




[issue23507] Tuple creation is too slow

2016-05-09 Thread Jakub Stasiak

Changes by Jakub Stasiak :


--
nosy: +jstasiak




[issue23507] Tuple creation is too slow

2016-04-23 Thread Demur Rumed

Demur Rumed added the comment:

This code could be made to look a lot less hackish if the potential tuple reuse 
were abstracted into a PyTuple_UniqueOrNew(PyTuple) helper, which returns its 
argument if REFCNT == 1 and otherwise allocates a tuple of the same size and 
returns that, decrefing the original PyTuple when REFCNT != 1.

--
nosy: +Demur Rumed




[issue23507] Tuple creation is too slow

2016-04-22 Thread Yury Selivanov

Changes by Yury Selivanov :


--
nosy: +yselivanov




[issue23507] Tuple creation is too slow

2016-04-22 Thread STINNER Victor

STINNER Victor added the comment:

reuse_argtuples_3.patch: rebased reuse_argtuples_2.patch.

--
Added file: http://bugs.python.org/file42571/reuse_argtuples_3.patch




[issue23507] Tuple creation is too slow

2016-04-22 Thread STINNER Victor

STINNER Victor added the comment:

msg264009: Serhiy Storchaka "Could you compare filter(), map() and sorted() 
performance with your patch and with issue23507 patch?"

Here are the results of bench_builtins.py. The performance looks about the 
same, although I expect better performance with less trivial callbacks (where 
fastcall would avoid the creation of other temporary tuples).

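+-------------------------------------------------+-------------+---------------+
| Tests                                           | argtuples   | fastcall      |
+-------------------------------------------------+-------------+---------------+
| filter(lambda x: x, range(1000))                | 80.2 us (*) | 72.7 us (-9%) |
| map(lambda x: x, range(1000))                   | 79.5 us (*) | 79.9 us       |
| map(lambda x, y: x+y, range(1000), range(1000)) | 111 us (*)  | 109 us        |
| sorted(list, key=lambda x: x)                   | 82.6 us (*) | 75.7 us (-8%) |
| sorted(list)                                    | 15.7 us (*) | 15.6 us       |
+-------------------------------------------------+-------------+---------------+
| Total                                           | 369 us (*)  | 353 us        |
+-------------------------------------------------+-------------+---------------+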

Comparison to the original Python 3.6:

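+-------------------------------------------------+-------------+----------------+----------------+
| Tests                                           | original    | argtuples      | fastcall       |
+-------------------------------------------------+-------------+----------------+----------------+
| filter(lambda x: x, range(1000))                | 109 us (*)  | 80.2 us (-27%) | 72.7 us (-33%) |
| map(lambda x: x, range(1000))                   | 96.9 us (*) | 79.5 us (-18%) | 79.9 us (-18%) |
| map(lambda x, y: x+y, range(1000), range(1000)) | 126 us (*)  | 111 us (-12%)  | 109 us (-13%)  |
| sorted(list, key=lambda x: x)                   | 114 us (*)  | 82.6 us (-28%) | 75.7 us (-34%) |
| sorted(list)                                    | 16.1 us (*) | 15.7 us        | 15.6 us        |
+-------------------------------------------------+-------------+----------------+----------------+
| Total                                           | 463 us (*)  | 369 us (-20%)  | 353 us (-24%)  |
+-------------------------------------------------+-------------+----------------+----------------+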


The bigger difference is in the changes required in bltinmodule.c. Example with 
filter_next():

Using fastcall:

-    good = PyObject_CallFunctionObjArgs(lz->func, item, NULL);
+    good = PyObject_CallArg1(lz->func, item);

reuse_argtuples_2.patch:

-    good = PyObject_CallFunctionObjArgs(lz->func,
-                                        item, NULL);
+    PyObject *argtuple = lz->argtuple;
+    lz->argtuple = NULL;
+    if (argtuple == NULL) {
+        argtuple = PyTuple_New(1);
+        if (argtuple == NULL) {
+            Py_DECREF(item);
+            return NULL;
+        }
+        Py_INCREF(argtuple);
+    }
+    assert(Py_REFCNT(argtuple) == 2);
+    assert(Py_SIZE(argtuple) == 1);
+    assert(PyTuple_GET_ITEM(argtuple, 0) == NULL);
+    PyTuple_SET_ITEM(argtuple, 0, item);
+    good = PyObject_Call(lz->func, argtuple, NULL);
+    if (Py_REFCNT(argtuple) == 2) {
+        PyTuple_SET_ITEM(argtuple, 0, NULL);
+        if (lz->argtuple == NULL)
+            lz->argtuple = argtuple;
+        else {
+            Py_DECREF(argtuple);
+            Py_DECREF(argtuple);
+        }
+    }
+    else {
+        Py_INCREF(item);
+        Py_DECREF(argtuple);
+        Py_DECREF(argtuple);
+    }

(reuse_argtuples_2.patch requires a few other changes related to filter_next().)

--
Added file: http://bugs.python.org/file42569/bench_builtins.py




[issue23507] Tuple creation is too slow

2016-04-22 Thread STINNER Victor

STINNER Victor added the comment:

Hi, I started to work on a new top-down approach that completely avoids 
temporary tuples/dicts: "FASTCALL", issue #26814. As you may expect, the patch 
is much larger than Serhiy's patches attached to this issue, but IMHO the final 
code is much simpler: it doesn't have to use complex tricks on the reference 
count or set tuple items to NULL (which led to segfaults like issue #26811).


Antoine Pitrou: "Maybe we want a facility to create on-stack static-size 
tuples?"

My current implementation of FASTCALL uses a buffer allocated on the stack to 
handle calls with up to 10 parameters (10 positional parameters or 5 keyword 
parameters, since a keyword uses 2 PyObject*).


Antoine Pitrou: "How many functions can benefit from this approach, though?"

In my experimental FASTCALL branch, slot wrappers, PyObject_Call*() functions 
(except PyObject_Call() itself!), ceval.c, etc. benefit from FASTCALL. The 
question is not really how many functions benefit from FASTCALL, but rather how 
many changes are worth making to avoid temporary tuples/dicts. For example, 
right now I decided not to add a FASTCALL flavor of tp_new and tp_init 
(instantiate a class or create a new type) because it becomes much more complex 
when you have to handle inheritance.

Maybe my FASTCALL requires too many changes and is overkill. Maybe we need to 
find a compromise between FASTCALL (issue #26814) and Serhiy's changes which 
are limited to a few functions.

Today, it's too early to decide, but I'm having fun with my FASTCALL experiment 
;-) Slot wrappers are 40% faster, getting a namedtuple attribute is 25% faster 
(even though Serhiy already optimized this specific case!), etc.

--
nosy: +haypo




[issue23507] Tuple creation is too slow

2015-04-14 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

map doesn't use PyObject_CallFunctionObjArgs(). Comparing the results for map 
with the results for filter and list.sort, it seems that PyTuple_New() is 
responsible for 24% and PyObject_CallFunctionObjArgs() adds another 14%.

--
priority: normal -> low




[issue23507] Tuple creation is too slow

2015-04-13 Thread Raymond Hettinger

Raymond Hettinger added the comment:

Unassigning because I don't have an interest in going further down this path.  
I used this technique with zip() because tuple reuse nicely benefited the 
common case of "for a,b,c in zip(alist, blist, clist): ...".  

On another thread, I did see mention of removing the memset() from tuple 
creation and that seems like a promising direction.

--
assignee: rhettinger -> 




[issue23507] Tuple creation is too slow

2015-04-13 Thread Antoine Pitrou

Antoine Pitrou added the comment:

By the way, is PyTuple_New() really the culprit here? Or is it the va_list 
handling in PyObject_CallFunctionObjArgs()?

--




[issue23507] Tuple creation is too slow

2015-02-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

> Maybe we want a facility to create on-stack static-size tuples?

There is no guarantee that the called function doesn't save a reference to args.

> How many functions can benefit from this approach, though?

From hand-written caching? Any function that repeatedly calls other functions 
many times. In addition to sorted()/list.sort() with the key argument, 
filter() and map(), these are min() and max() with the key argument, and some 
iterators from itertools (groupby(), dropwhile(), takewhile(), accumulate(), 
filterfalse()). Maybe also some functions that are thin wrappers around a 
special method (e.g. round()).

But it is more interesting to investigate why PyTuple_New() is so slow in 
comparison with hand-written caching. If it can be sped up, then much more 
code can benefit from this.
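
To make the list of candidates in this message concrete, here is a small sketch that counts how often each of those builtins and itertools functions invokes its callback for a given input. The names CASES and count_callback_calls are made up for illustration, and the always-True callback is an assumption chosen so that one function works as a key, a predicate and a binary function. It only shows that each call site runs the callback roughly once per item, which is where a per-call argument tuple would be created; it says nothing about how much time that tuple costs.

```python
import itertools

# Each entry drives one callable named in the message with a callback `f`
# over the items `xs`; lazy iterators are drained with list().
CASES = {
    "sorted(key=f)": lambda f, xs: sorted(xs, key=f),
    "min(key=f)": lambda f, xs: min(xs, key=f),
    "max(key=f)": lambda f, xs: max(xs, key=f),
    "filter": lambda f, xs: list(filter(f, xs)),
    "map": lambda f, xs: list(map(f, xs)),
    "groupby": lambda f, xs: list(itertools.groupby(xs, key=f)),
    "dropwhile": lambda f, xs: list(itertools.dropwhile(f, xs)),
    "takewhile": lambda f, xs: list(itertools.takewhile(f, xs)),
    "accumulate": lambda f, xs: list(itertools.accumulate(xs, f)),
    "filterfalse": lambda f, xs: list(itertools.filterfalse(f, xs)),
}


def count_callback_calls(n=100):
    """Return {case name: number of callback invocations} for n items."""
    counts = {}
    for name, run in CASES.items():
        calls = 0

        def callback(*args):
            # Always returning True keeps every predicate and key valid.
            nonlocal calls
            calls += 1
            return True

        run(callback, list(range(n)))
        counts[name] = calls
    return counts


if __name__ == "__main__":
    for name, calls in count_callback_calls().items():
        print(f"{name}: {calls} callback calls")
```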

--




[issue23507] Tuple creation is too slow

2015-02-26 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Maybe we want a facility to create on-stack static-size tuples?

--
nosy: +pitrou




[issue23507] Tuple creation is too slow

2015-02-26 Thread Antoine Pitrou

Antoine Pitrou added the comment:

How many functions can benefit from this approach, though?

--




[issue23507] Tuple creation is too slow

2015-02-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Fixed bugs found by Victor. Thank you Victor.

Yes, zip() already uses a similar approach.

--
Added file: http://bugs.python.org/file38244/reuse_argtuples_2.patch




[issue23507] Tuple creation is too slow

2015-02-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Note the check for Py_REFCNT() == 1. Without it the code would be incorrect.
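
A toy Python model of why that check matters (this is not the C code from the patch; reuse_without_check is a made-up name, and a list stands in for the cached tuple because tuples cannot be mutated from Python). If a cached object is handed out again while the caller still holds a reference to it, earlier results are silently overwritten. The real zip() never shows this behaviour to the caller.

```python
def reuse_without_check(xs, ys):
    """Toy iterator that always hands out the same result object."""
    buf = [None, None]  # stand-in for a cached argument/result tuple
    for x, y in zip(xs, ys):
        buf[0], buf[1] = x, y
        yield buf  # reused even though the caller may still hold it


# The caller keeps every result alive, so unchecked reuse corrupts them:
# all three entries end up being the same object with the last values.
broken = list(reuse_without_check([1, 2, 3], "abc"))

# zip() returns distinct tuples whenever earlier ones are still referenced.
safe = list(zip([1, 2, 3], "abc"))

if __name__ == "__main__":
    print("without check:", broken)
    print("zip():        ", safe)
```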

--




[issue23507] Tuple creation is too slow

2015-02-26 Thread Raymond Hettinger

Raymond Hettinger added the comment:

I had done something like this for zip() long ago and had gotten a nice speed 
boost.  I remember looking at whether it should be applied elsewhere but don't 
remember why I decided against it.

--
assignee:  -> rhettinger
nosy: +rhettinger




[issue23507] Tuple creation is too slow

2015-02-24 Thread Stefan Behnel

Changes by Stefan Behnel :


--
nosy: +scoder




[issue23507] Tuple creation is too slow

2015-02-24 Thread Ezio Melotti

Changes by Ezio Melotti :


--
nosy: +ezio.melotti
