[issue32346] Speed up slot lookup for class creation

2018-01-15 Thread Antoine Pitrou

Change by Antoine Pitrou :


--
status: pending -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-13 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I think that Inada is right: there's too much impact on memory consumption and 
garbage collection to call this an optimization.

Serhiy suggested removing the cache.  But, once you remove the cache, you're 
going to iterate on all values of all parent classes' dicts to find which 
values are __dunder__ methods.  This risks being much more expensive for 
classes with tens or hundreds of namespace entries, making the optimization 
even less interesting.

So in the end I think I'm going to reject this issue.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> pending

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-12 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> Since benchgcclasses.py doesn't creates dunder methods,
cache doesn't have GC-tracked tuples, and ref cycles.

Hmm, you're right, thank you.  I can also reproduce your numbers here (using 
benchgcclasses2.py).  There's definitely an impact.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-12 Thread INADA Naoki

INADA Naoki  added the comment:

> Note this is really a worst-case benchmark: lots of classes, no methods, no 
> user data beside the classes.

Since benchgcclasses.py doesn't creates dunder methods,
cache doesn't have GC-tracked tuples, and ref cycles.

benchgcclasses2.py adds three dunder methods:

$ ./python benchgcclasses.py
GC time: 13.0 ms
gc: collecting generation 2...
gc: objects in each generation: 1 0 94942
gc: objects in permanent generation: 0
gc: done, 0.0131s elapsed
RSS:
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
inada-n  29604  0.0  0.1  47052 30692 pts/2S+   20:42   0:00 ./python 
benchgcclasses.py

$ ./python-patched benchgcclasses.py
GC time: 17.2 ms
gc: collecting generation 2...
gc: objects in each generation: 1 0 135220
gc: objects in permanent generation: 0
gc: done, 0.0173s elapsed
RSS:
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
inada-n  29626  0.0  0.2  49880 33464 pts/2S+   20:43   0:00 
./python-patched benchgcclasses.py

--
Added file: https://bugs.python.org/file47379/benchgcclasses2.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-12 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Serhiy:

> The relative speed up looks nice. But it is just few microseconds per class. 
> You have to create many thousands of classes to gain a significant fraction 
> of second.

This work started with your message in 
https://mail.python.org/pipermail/python-dev/2017-December/151277.html pointing 
out that we shouldn't add tp_ slots because it makes type creation and Python 
initialization slower (*), so I'm a bit surprised that you're now claiming 
speeding up type creation (by up to 3x!) is not important...

(*) To quote:

'''
2. Increased class initialization time. For every class for every slot 
we need to look up corresponding methods in dictionaries of the class 
itself and all its parents (caching doesn't work fine at this stage). 
Significant part of class initialization time is spent on initializing 
slots. This will increase the startup time and the time of creating 
local classes. The relative overhead is more significant in Cython.
'''

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-12 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Here is a simple script creating 1 classes (a large number, but perhaps not 
out of sight for a large application importing a lot of libraries (*)).

(*) see the experiment I did in 
https://mail.python.org/pipermail/python-dev/2017-December/151260.html

Before:
$ ./python-orig -I benchgcclasses.py 
GC time: 6.8 ms
RSS:
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
antoine  11248  0.0  0.3  41296 24576 pts/1S+   12:18   0:00 ./python-orig 
-I benchgcclasses.py

After:
$ ./python -I benchgcclasses.py 
GC time: 6.9 ms
RSS:
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
antoine  11097  0.0  0.3  41300 24740 pts/1S+   12:18   0:00 ./python -I 
benchgcclasses.py


RSS is a bit unstable from run to run, but roughly the patch seems to add 100 
to 200KB in this case.

As for full GC time, it is quite stable and there's a 0.1ms increase with the 
patch.

Note this is really a worst-case benchmark: lots of classes, no methods, no 
user data beside the classes.

--
Added file: https://bugs.python.org/file47378/benchgcclasses.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-12 Thread INADA Naoki

INADA Naoki  added the comment:

As my understand, this patch creates cache for all classe,
not only for parent classes.
Caches may has much tuples, and they are GC tracked because
they contains function descriptors.  And they actually creates
reference cycles.

Am I correct?

If so, I want estimate of GC overhead and memory overhead of cache
for large project.  Is it really negligible?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-11 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Le 11/01/2018 à 20:25, Serhiy Storchaka a écrit :
> 
> I'm not sure how this caching works when change the parent class after 
> creating the child class.

The caching is invalidated at the same place the method cache is
invalidated.

> Without caching the benefit is 20-50% smaller. Perhaps this result can be 
> improved. Actually we don't need to search in all dictionaries of all classes 
> in the MRO. We can check the correspondent slot in the parent class (the 
> offsets are constants) and look up in the class dict of the first class with 
> non-zero slot value.

I'm not sure that's always true.  The relationship between class dict
attributes and tp_ slots is not 100% bijective, there's more complicated
stuff going on for some of the descriptors.  My patch keeps the current
algorithm (ensuring no descriptor gets broken) but optimizes the lookups.

> There are more priority changes that should be made before the feature freeze 
> in 3.7.

I am not working on any of those changes, so deferring this PR will not
have any effect on those changes.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-11 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

The relative speed up looks nice. But it is just few microseconds per class. 
You have to create many thousands of classes to gain a significant fraction of 
second. And the complexity added by this patch is not tiny.

I'm not sure how this caching works when change the parent class after creating 
the child class.

Without caching the benefit is 20-50% smaller. Perhaps this result can be 
improved. Actually we don't need to search in all dictionaries of all classes 
in the MRO. We can check the correspondent slot in the parent class (the 
offsets are constants) and look up in the class dict of the first class with 
non-zero slot value.

I tried to simplify this change (even at the cost of smaller speedup). But the 
result still looks too complex.

Since the benefit in any case will be small, and seems there are no other 
issues that depend on this change, I suggest to defer it to 3.8. There are more 
priority changes that should be made before the feature freeze in 3.7.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-04 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> While I like faster startup time, I don't know 4% speed up is worth enough 
> for this complexity.

There's no magic bullet that I know of to reduce startup time by a large 
amount.  So we're left with optimizing progressively.  For comparison, 
https://github.com/python/cpython/pull/3279 had an even smaller effect on 
startup time.

This change has the nice property that it's a genuine speed-up on a generic 
operation, and it reduces the cost of adding slots as well as the cost of 
having long inheritance chains.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-04 Thread INADA Naoki

INADA Naoki  added the comment:

In my environment, python -X importtime -c 'import asyncio' speed up from 
53.2ms to 51ms.
While I like faster startup time, I don't know 4% speed up is worth enough for
this complexity.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-04 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I see, so I should be able to reuse tp_cache for this PR.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-03 Thread Guido van Rossum

Guido van Rossum  added the comment:

I don't recall what I meant to use tp_cache for, but everything I can find 
about it says it's unused, so let's kill it if we can. I guess the ABI requires 
that we keep the slot around until we find a new use for it?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-03 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

What exactly is tp_cache? Apparently it was added by Guido in 
https://github.com/python/cpython/commit/687ae00460da9cac04eb1ba8f6f5ab4db25fbfc2
 but never used (at least officially).

commit 687ae00460da9cac04eb1ba8f6f5ab4db25fbfc2
Author: Guido van Rossum 
Date:   Mon Oct 15 22:03:32 2001 +

Get rid of __defined__ and tp_defined -- there's no need to
distinguish __dict__ and __defined__ any more.  In the C structure,
tp_cache takes its place -- but this hasn't been implemented yet.

--
nosy: +gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-03 Thread INADA Naoki

INADA Naoki  added the comment:

How about reusing tp_cache instead of adding new slot?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2018-01-02 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

./python -m timeit -- "class A: pass"
- before: 6.63 usec per loop
- after:  5.41 usec per loop

./python -m timeit -s "class A: pass" -- "class B(A): pass"
- before: 7.04 usec per loop
- after:  4.91 usec per loop

./python -m timeit -s "class A: pass" -s "class B(A): pass" -- "class C(B): 
pass"
- before: 8.24 usec per loop
- after:  5.09 usec per loop

./python -m timeit -s "class A: pass" -s "class B(A): pass" -s "class C(B): 
pass" -- "class D(C): pass"
- before: 9.59 usec per loop
- after:  5.29 usec per loop

./python -m timeit -s "class A: pass" -s "class B(A): pass" -s "class C(B): 
pass" -s "class D(C): pass" -- "class E(D): pass"
- before: 10.9 usec per loop
- after:  5.45 usec per loop

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-21 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I don't really understand your proposal, so it's hard to answer :-) Perhaps you 
can try to implement it so that we can compare?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-21 Thread INADA Naoki

INADA Naoki  added the comment:

New slot is really required?

My idea is:

* Copy slots from MRO
* Iterate namespace dict and override slots if dunder is defined

Is it impossible or harder than my thought?

--
nosy: +inada.naoki

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-20 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Yes... The patch itself is not very complex, but you have to dive into the 
intricacies of typeobject.c to understand it :-/

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-20 Thread Serhiy Storchaka

Serhiy Storchaka  added the comment:

Due to its complexity it will take a time to make a review of this patch.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-20 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Updated benchmarks against git master (and using pyperf):

- before:

$ ./env-orig/bin/pyperf timeit -q "class Test: pass"
Mean +- std dev: 8.89 us +- 0.09 us
$ ./env-orig/bin/pyperf timeit -q -s "from logging import Logger" "class 
Test(Logger): pass"
Mean +- std dev: 12.0 us +- 0.2 us
$ ./env-orig/bin/pyperf timeit -q -s "from unittest.mock import MagicMock" 
"class Test(MagicMock): pass"
Mean +- std dev: 18.6 us +- 0.2 us

- after:

$ ./env/bin/pyperf timeit -q "class Test: pass"
Mean +- std dev: 6.90 us +- 0.11 us
$ ./env/bin/pyperf timeit -q -s "from logging import Logger" "class 
Test(Logger): pass"
Mean +- std dev: 5.86 us +- 0.08 us
$ ./env/bin/pyperf timeit -q -s "from unittest.mock import MagicMock" "class 
Test(MagicMock): pass"
Mean +- std dev: 6.13 us +- 0.08 us

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-16 Thread Serhiy Storchaka

Change by Serhiy Storchaka :


--
priority: normal -> low

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-16 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

Some micro-benchmarks:

$ ./python -m timeit "class Test: pass"
- before: 8.84 usec per loop
- after: 7.03 usec per loop

$ ./python -m timeit "class Test(tuple): pass"
- before: 10.1 usec per loop
- after: 8.4 usec per loop

$ ./python -m timeit -s "from logging import Logger" "class Test(Logger): pass"
- before: 12 usec per loop
- after: 6.25 usec per loop

$ ./python -m timeit -s "from logging.handlers import DatagramHandler" "class 
Test(DatagramHandler): pass"
- before: 15 usec per loop
- after: 6.68 usec per loop

$ ./python -m timeit -s "from unittest.mock import MagicMock" "class 
Test(MagicMock): pass"
- before: 18.2 usec per loop
- after: 6.56 usec per loop

$ ./python -m timeit -s "from shelve import Shelf" "class Test(Shelf): pass"
- before: 28.6 usec per loop
- after: 18.4 usec per loop

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-16 Thread Antoine Pitrou

Change by Antoine Pitrou :


--
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-16 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

I posted https://github.com/python/cpython/pull/4902 for this.

This approach has two drawbacks:
- uses an additional tp_ slot (and a corresponding TPFLAGS)
- adds a bit of code complexity (but quite localized)

--
stage: patch review -> 

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-16 Thread Antoine Pitrou

Change by Antoine Pitrou :


--
keywords: +patch
pull_requests: +4797
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32346] Speed up slot lookup for class creation

2017-12-16 Thread Antoine Pitrou

New submission from Antoine Pitrou :

As mentioned in 
https://mail.python.org/pipermail/python-dev/2017-December/151298.html, slot 
lookup can take 75 to 80% of the runtime when creating a new class.  This was 
slightly improved in https://bugs.python.org/issue31336 but we're still doing 
one lookup per possible slot name *and* per class along the MRO.

I propose to speed up this step by caching the known descriptors for each 
class.  This way, when instantiating a new class, you can simply iterate over 
the known descriptors of the MRO without wasting time on hundreds of dict 
lookups.

(and it is reasonably easy to find all slot descriptors in a dict: first select 
only __dunder__ names, then look them up in a dedicated mapping)

--
components: Interpreter Core
messages: 308474
nosy: pitrou, scoder, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Speed up slot lookup for class creation
type: performance
versions: Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com