Re: [Python-Dev] Possible performance regression
Raymond Hettinger writes:

> We're trying to keep performant the ones that people actually use.
> For the Mac, I think there are only four that matter:
>
> 1) The one we distribute on the python.org website at
>    https://www.python.org/ftp/python/3.8.0/python-3.8.0a2-macosx10.9.pkg
>
> 2) The one installed by homebrew
>
> 3) The way folks typically roll their own:
>    $ ./configure && make (or some variant of make install)
>
> 4) The one shipped by Apple and put in /usr/bin

I don't see the relevance of (4), since we're talking about the bleeding edge AFAICT. Not clear about Homebrew -- since I've been experimenting with it recently, I use the bottled versions, which aren't bleeding edge.

If prebuilt packages matter, I would add MacPorts (or substitute it for (4), since nothing seems to get Apple's attention) and Anaconda (which is what I recommend to my students). But I haven't looked at MacPorts' recent download stats, and maybe I'm just the odd one out.

Steve

--
Associate Professor              Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/     Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp                   University of Tsukuba
Tel: 029-853-5175         Tennodai 1-1-1, Tsukuba 305-8573 JAPAN

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Possible performance regression
On Feb 26, 2019, at 2:28 PM, Neil Schemenauer wrote:
>
> Are you compiling with --enable-optimizations (i.e. PGO)? In my
> experience, that is needed to get meaningful results.

I'm not, and I would worry that PGO would give less stable comparisons because it is highly sensitive to changes in its training set as well as to the actual CPython implementation (two moving targets instead of one).

That said, it doesn't really matter to the world how I build *my* Python. We're trying to keep performant the ones that people actually use. For the Mac, I think there are only four that matter:

1) The one we distribute on the python.org website at
   https://www.python.org/ftp/python/3.8.0/python-3.8.0a2-macosx10.9.pkg

2) The one installed by homebrew

3) The way folks typically roll their own:
   $ ./configure && make (or some variant of make install)

4) The one shipped by Apple and put in /usr/bin

Of the four, the ones I've been timing are #1 and #3.

I'm happy to drop this. I was looking for independent confirmation and didn't get it. We can't move forward unless someone else also observes a consistently measurable regression for a benchmark they care about on a build that they care about. If I'm the only one who notices, then it really doesn't matter.

Also, it was reassuring to not see the same effect on a GCC-8 build. Since the effect seems to be compiler specific, it may be that we knocked it out of a local minimum and that performance will return the next time someone touches the eval-loop.

Raymond
Re: [Python-Dev] Possible performance regression
Le mer. 27 févr. 2019 à 00:17, Victor Stinner a écrit :
> My sad story with code placement:
> https://vstinner.github.io/analysis-python-performance-issue.html
>
> tl; dr Use PGO.

Hum wait, this article isn't complete. You have to see the follow-up:
https://bugs.python.org/issue28618#msg286662

"""
Victor: "FYI I wrote an article about this issue:
https://haypo.github.io/analysis-python-performance-issue.html
Sadly, it seems like I was just lucky when adding __attribute__((hot)) fixed the issue, because call_method is slow again!"

I upgraded the speed-python server (which runs the benchmarks) to Ubuntu 16.04 LTS to support PGO compilation. I removed all old benchmark results and ran the benchmarks again with LTO+PGO. It seems like the benchmark results are much better now.

I'm no longer sure that _Py_HOT_FUNCTION is really useful for getting stable benchmarks, but it may help code placement a little bit. I don't think that it hurts, so I suggest keeping it. Since benchmarks were still unstable with _Py_HOT_FUNCTION, I'm not interested in continuing to tag more functions with _Py_HOT_FUNCTION.

I will now focus on LTO+PGO for stable benchmarks, and ignore small performance differences when PGO is not used. I close this issue now.
"""

Now I recall that I tried hard to avoid PGO: the server used by speed.python.org to run benchmarks didn't support PGO. I fixed the issue by upgrading Ubuntu :-) Now speed.python.org uses PGO. I stopped manually helping the compiler with code placement.

Victor
Re: [Python-Dev] Possible performance regression
Hi,

PGO compilation is very slow. I tried very hard to avoid it. I started to annotate the C code with various GCC attributes like "inline", "always_inline", "hot", etc. I also experimented with the likely/unlikely Linux macros, which use __builtin_expect(). In the end... my efforts were worthless. I still had a *major* issue (a benchmark *suddenly* 68% slower! WTF?) with code locality, and I decided to give up.

You can still find some macros like _Py_HOT_FUNCTION and _Py_NO_INLINE in Python ;-) (_Py_NO_INLINE is used to reduce stack memory usage; that's a different story.)

My sad story with code placement:
https://vstinner.github.io/analysis-python-performance-issue.html

tl; dr Use PGO.

--

Since that time, I removed call_method from pyperformance to fix the root issue: don't waste your time on micro-benchmarks ;-) ... But I kept these micro-benchmarks in a different project:
https://github.com/vstinner/pymicrobench

For some specific needs (taking a decision on a specific optimization), micro-benchmarks are sometimes still useful ;-)

Victor

Le mar. 26 févr. 2019 à 23:31, Neil Schemenauer a écrit :
>
> On 2019-02-26, Raymond Hettinger wrote:
> > That said, I'm only observing the effect when building with the
> > Mac default Clang (Apple LLVM version 10.0.0 (clang-1000.11.45.5).
> > When building GCC 8.3.0, there is no change in performance.
>
> My guess is that the code in _PyEval_EvalFrameDefault() got changed
> enough that Clang started emitting a bit different machine code. If
> the conditional jumps are a bit different, I understand that could
> have a significant difference on performance.
>
> Are you compiling with --enable-optimizations (i.e. PGO)? In my
> experience, that is needed to get meaningful results. Victor also
> mentions that on his "how-to-get-stable-benchmarks" page. Building
> with PGO is really (really) slow, so I suspect you are not doing it
> when bisecting. You can speed it up greatly by using a simpler
> command for PROFILE_TASK in Makefile.pre.in. E.g.
>
> PROFILE_TASK=$(srcdir)/my_benchmark.py
>
> Now that you have narrowed it down to a single commit, it would be
> worth doing the comparison with PGO builds (assuming Clang supports
> that).
>
> > That said, it seems to be compiler specific and only affects the
> > Mac builds, so maybe we can decide that we don't care.
>
> I think the key question is if the ceval loop got a bit slower due
> to logic changes or if Clang just happened to generate a bit worse
> code due to source code details. A PGO build could help answer
> that. I suppose trying to compare machine code is going to produce
> too large of a diff.
>
> Could you try hoisting the eval_breaker expression, as suggested by
> Antoine:
>
> https://discuss.python.org/t/profiling-cpython-with-perf/940/2
>
> If you think a slowdown affects most opcodes, I think the DISPATCH
> change looks like the only cause. Maybe I missed something though.
>
> Also, maybe there would be some value in marking key branches as
> likely/unlikely if it helps Clang generate better machine code.
> Then, even if you compile without PGO (as many people do), you still
> get the better machine code.
>
> Regards,
>
> Neil

--
Night gathers, and now my watch begins. It shall not end until my death.
Re: [Python-Dev] Possible performance regression
On 2019-02-26, Raymond Hettinger wrote:
> That said, I'm only observing the effect when building with the
> Mac default Clang (Apple LLVM version 10.0.0 (clang-1000.11.45.5).
> When building GCC 8.3.0, there is no change in performance.

My guess is that the code in _PyEval_EvalFrameDefault() got changed enough that Clang started emitting a bit different machine code. If the conditional jumps are a bit different, I understand that could have a significant difference on performance.

Are you compiling with --enable-optimizations (i.e. PGO)? In my experience, that is needed to get meaningful results. Victor also mentions that on his "how-to-get-stable-benchmarks" page. Building with PGO is really (really) slow, so I suspect you are not doing it when bisecting. You can speed it up greatly by using a simpler command for PROFILE_TASK in Makefile.pre.in. E.g.

PROFILE_TASK=$(srcdir)/my_benchmark.py

Now that you have narrowed it down to a single commit, it would be worth doing the comparison with PGO builds (assuming Clang supports that).

> That said, it seems to be compiler specific and only affects the
> Mac builds, so maybe we can decide that we don't care.

I think the key question is if the ceval loop got a bit slower due to logic changes or if Clang just happened to generate a bit worse code due to source code details. A PGO build could help answer that. I suppose trying to compare machine code is going to produce too large of a diff.

Could you try hoisting the eval_breaker expression, as suggested by Antoine:

https://discuss.python.org/t/profiling-cpython-with-perf/940/2

If you think a slowdown affects most opcodes, I think the DISPATCH change looks like the only cause. Maybe I missed something though.

Also, maybe there would be some value in marking key branches as likely/unlikely if it helps Clang generate better machine code. Then, even if you compile without PGO (as many people do), you still get the better machine code.

Regards,

Neil
Re: [Python-Dev] Possible performance regression
Le mar. 26 févr. 2019 à 22:45, Raymond Hettinger a écrit :
> Victor said he generally doesn't care about 5% regressions. That makes sense
> for odd corners of Python. The reason I was concerned about this one is that
> it hits the eval-loop and seems to affect every single op code. The
> regression applies somewhat broadly (increasing the cost of reading and
> writing local variables by about 20%).

I ignore changes smaller than 5% because they are usually what I call the "noise" of the benchmark. It means that testing 3 commits gives 3 different timings, even if the commits don't touch anything used in the benchmark. There are multiple explanations: PGO compilation is not deterministic, some benchmarks are too close to the performance of the CPU L1-instruction cache and so are heavily impacted by "code locality" (the exact addresses in memory), and many other things.

Hum, sometimes running the same benchmark on the same code on the same hardware with the same strict procedure gives different timings at each attempt. At some point, I decided to give up on these 5% so as not to lose my mind :-)

Victor
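[Editor's note] A minimal sketch (not from the thread) of why deltas under ~5% drown in benchmark noise: repeated timings of identical code routinely differ run to run, so a difference is only meaningful when it clearly exceeds the spread of the samples. The `significant` helper and its threshold are illustrative, not any tool's actual rule.

```python
import statistics
import timeit

def significant(sample_a, sample_b):
    """Rough check: does the difference between two timing samples
    exceed the combined spread (std dev) of the samples?"""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    spread = statistics.stdev(sample_a) + statistics.stdev(sample_b)
    return abs(mean_a - mean_b) > spread

# Time the exact same statement twice; the two runs typically differ
# slightly even though nothing in the code changed.
runs_1 = timeit.repeat("sum(range(100))", repeat=7, number=10_000)
runs_2 = timeit.repeat("sum(range(100))", repeat=7, number=10_000)
print(significant(runs_1, runs_2))
```

With this rule, a 3% difference whose samples overlap heavily is reported as noise, while a clearly separated 20% difference is not.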
Re: [Python-Dev] Possible performance regression
On Feb 25, 2019, at 8:23 PM, Eric Snow wrote:
>
> So it looks like commit ef4ac967 is not responsible for a performance
> regression.

I did narrow it down to that commit, and I can consistently reproduce the timing differences.

That said, I'm only observing the effect when building with the Mac default Clang (Apple LLVM version 10.0.0 (clang-1000.11.45.5)). When building with GCC 8.3.0, there is no change in performance. I conclude this is only an issue for Mac builds.

> I ran the "performance" suite (https://github.com/python/performance),
> which has 57 different benchmarks.

Many of those benchmarks don't measure eval-loop performance. Instead, they exercise json, pickle, sqlite, etc. So I would expect no change in many of those because they weren't touched.

Victor said he generally doesn't care about 5% regressions. That makes sense for odd corners of Python. The reason I was concerned about this one is that it hits the eval-loop and seems to affect every single op code; the regression applies somewhat broadly, increasing the cost of reading and writing local variables by about 20%.

That said, it seems to be compiler specific and only affects the Mac builds, so maybe we can decide that we don't care.

Raymond
Re: [Python-Dev] Possible performance regression
I made an attempt once and it was faster:
https://faster-cpython.readthedocs.io/registervm.html

But I had bugs and I didn't know how to correctly implement a compiler.

Victor

Le mardi 26 février 2019, Neil Schemenauer a écrit :
> On 2019-02-25, Eric Snow wrote:
>> So it looks like commit ef4ac967 is not responsible for a performance
>> regression.
>
> I did a bit of exploration myself and that was my conclusion as
> well. Perhaps others would be interested in how to use "perf", so I
> did a little write-up:
>
> https://discuss.python.org/t/profiling-cpython-with-perf/940
>
> To me, it looks like using a register-based VM could produce a
> pretty decent speedup. Research project for someone. ;-)
>
> Regards,
>
> Neil

--
Night gathers, and now my watch begins. It shall not end until my death.
Re: [Python-Dev] Possible performance regression
On 2019-02-25, Eric Snow wrote:
> So it looks like commit ef4ac967 is not responsible for a performance
> regression.

I did a bit of exploration myself and that was my conclusion as well. Perhaps others would be interested in how to use "perf", so I did a little write-up:

https://discuss.python.org/t/profiling-cpython-with-perf/940

To me, it looks like using a register-based VM could produce a pretty decent speedup. Research project for someone. ;-)

Regards,

Neil
Re: [Python-Dev] Possible performance regression
Hi,

Le mar. 26 févr. 2019 à 05:27, Eric Snow a écrit :
> I ran the "performance" suite (https://github.com/python/performance),
> which has 57 different benchmarks.

Ah yes, by the way: I also ran performance manually on speed.python.org yesterday: it added a new dot at Feb 25.

> In the results, 9 were marked as
> "significantly" different between the two commits. 2 of the
> benchmarks showed a marginal slowdown and 7 showed a marginal speedup:

I'm not surprised :-) Noise on micro-benchmarks is usually "ignored by the std dev" (the delta is included in the std dev). At speed.python.org, you can see that performance has basically been stable since last summer. I let you have a look at https://speed.python.org/timeline/

> | Benchmark       | speed.before | speed.after | Change       | Significance          |
> | django_template | 177 ms       | 172 ms      | 1.03x faster | Significant (t=3.66)  |
> | html5lib        | 126 ms       | 122 ms      | 1.03x faster | Significant (t=3.46)  |
> | json_dumps      | 17.6 ms      | 17.2 ms     | 1.02x faster | Significant (t=2.65)  |
> | nbody           | 157 ms       | 161 ms      | 1.03x slower | Significant (t=-3.85) |
(...)

Usually, I just ignore changes which are smaller than 5% ;-)

Victor

--
Night gathers, and now my watch begins. It shall not end until my death.
Re: [Python-Dev] Possible performance regression
On Mon, Feb 25, 2019 at 10:42 AM Eric Snow wrote:
> I'll look into it around then too.

See https://bugs.python.org/issue33608.

I ran the "performance" suite (https://github.com/python/performance), which has 57 different benchmarks. In the results, 9 were marked as "significantly" different between the two commits. 2 of the benchmarks showed a marginal slowdown and 7 showed a marginal speedup:

+-------------------------+--------------+-------------+--------------+-----------------------+
| Benchmark               | speed.before | speed.after | Change       | Significance          |
+=========================+==============+=============+==============+=======================+
| django_template         | 177 ms       | 172 ms      | 1.03x faster | Significant (t=3.66)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| html5lib                | 126 ms       | 122 ms      | 1.03x faster | Significant (t=3.46)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| json_dumps              | 17.6 ms      | 17.2 ms     | 1.02x faster | Significant (t=2.65)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| nbody                   | 157 ms       | 161 ms      | 1.03x slower | Significant (t=-3.85) |
+-------------------------+--------------+-------------+--------------+-----------------------+
| pickle_dict             | 29.5 us      | 30.5 us     | 1.03x slower | Significant (t=-6.37) |
+-------------------------+--------------+-------------+--------------+-----------------------+
| scimark_monte_carlo     | 144 ms       | 139 ms      | 1.04x faster | Significant (t=3.61)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| scimark_sparse_mat_mult | 5.41 ms      | 5.25 ms     | 1.03x faster | Significant (t=4.26)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| sqlite_synth            | 3.99 us      | 3.91 us     | 1.02x faster | Significant (t=2.49)  |
+-------------------------+--------------+-------------+--------------+-----------------------+
| unpickle_pure_python    | 497 us       | 481 us      | 1.03x faster | Significant (t=5.04)  |
+-------------------------+--------------+-------------+--------------+-----------------------+

(Issue #33608 has more detail.)

So it looks like commit ef4ac967 is not responsible for a performance regression.

-eric
Re: [Python-Dev] Possible performance regression
On Mon, Feb 25, 2019 at 10:32 AM Raymond Hettinger wrote:
> I got it down to two checkins before running out of time:
>
> Between
> git checkout 463572c8beb59fd9d6850440af48a5c5f4c0c0c9
>
> And:
> git checkout 3b0abb019662e42070f1d6f7e74440afb1808f03
>
> So the subinterpreter patch was likely the trigger.
>
> I can reproduce it over and over again on Clang, but not for a GCC-8 build,
> so it is compiler specific (and possibly macOS specific).
>
> Will look at it more after work this evening. I posted here to try to
> solicit independent confirmation.

I'll look into it around then too. See https://bugs.python.org/issue33608.

-eric
Re: [Python-Dev] Possible performance regression
> On Feb 25, 2019, at 2:54 AM, Antoine Pitrou wrote:
>
> Have you tried bisecting to find out the offending changeset, if there
> is any?

I got it down to two checkins before running out of time:

Between
git checkout 463572c8beb59fd9d6850440af48a5c5f4c0c0c9

And:
git checkout 3b0abb019662e42070f1d6f7e74440afb1808f03

So the subinterpreter patch was likely the trigger.

I can reproduce it over and over again on Clang, but not for a GCC-8 build, so it is compiler specific (and possibly macOS specific).

Will look at it more after work this evening. I posted here to try to solicit independent confirmation.

Raymond
Re: [Python-Dev] Possible performance regression
On Sun, 24 Feb 2019 20:54:02 -0800, Raymond Hettinger wrote:
> I've been running benchmarks that have been stable for a while. But between
> today and yesterday, there has been an almost across-the-board performance
> regression.

Have you tried bisecting to find out the offending changeset, if there is any?

Regards

Antoine.
Re: [Python-Dev] Possible performance regression
Hi,

Le lun. 25 févr. 2019 à 05:57, Raymond Hettinger a écrit :
> I've been running benchmarks that have been stable for a while. But between
> today and yesterday, there has been an almost across-the-board performance
> regression.

How do you run your benchmarks? If you use Linux, are you using CPU isolation?

> It's possible that this is a measurement error or something unique to my
> system (my Mac installed the 10.14.3 release today), so I'm hoping other
> folks can run checks as well.

Getting reproducible benchmark results on timings smaller than 1 ms is really hard. I wrote some advice on getting more stable results:
https://perf.readthedocs.io/en/latest/run_benchmark.html#how-to-get-reproductible-benchmark-results

> Variable and attribute read access:
>    4.0 ns   read_local

In my experience, for timings under 100 ns, *everything* impacts the benchmark, and the result is useless without the standard deviation. On such microbenchmarks, the hash function has a significant impact on performance. So you should run your benchmark in multiple different *processes* to get multiple different hash functions. Some people prefer to use PYTHONHASHSEED=0 (or another value), but I dislike doing that since it's less representative of performance "in production" (with a randomized hash function). For example, using 20 processes to test 20 randomized hash functions is enough to compute the average cost of the hash function.

My remark was more general; I didn't look at the specific case of var_access_benchmark.py. Maybe these benchmarks don't depend on the hash function.

For example, 4.0 ns +/- 10 ns and 4.0 ns +/- 0.1 ns are completely different when deciding whether "5.0 ns" is slower or faster.

The "perf compare" command of my perf module "determines whether two samples differ significantly using a Student's two-sample, two-tailed t-test with alpha equal to 0.95.":
https://en.wikipedia.org/wiki/Student's_t-test

I don't fully understand how these things work; I just copied the code from the old Python benchmark suite :-)

See also my articles from my journey to stable benchmarks:

* https://vstinner.github.io/journey-to-stable-benchmark-system.html   # noisy applications / CPU isolation
* https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html # PGO
* https://vstinner.github.io/journey-to-stable-benchmark-average.html  # randomized hash function

There are likely other parameters which impact benchmarks; that's why the std dev, and how you run the benchmark, matter so much.

Victor

--
Night gathers, and now my watch begins. It shall not end until my death.
Re: [Python-Dev] Possible performance regression
> On Feb 24, 2019, at 10:06 PM, Eric Snow wrote:
>
> I'll look into it in more depth tomorrow. FWIW, I have a few commits
> in the range you described, so I want to make sure I didn't slow
> things down for us. :)

Thanks for looking into it.

FWIW, I can consistently reproduce the results several times in a row. Here's the bash script I'm using:

#!/bin/bash
make clean
./configure
make         # Apple LLVM version 10.0.0 (clang-1000.11.45.5)
for i in `seq 1 3`; do
    git checkout d610116a2e48b55788b62e11f2e6956af06b3de0  # Go back to 2/23
    make                                                   # Rebuild
    sleep 30                                               # Let the system get quiet and cool
    echo ' baseline ---' >> results.txt                    # Label output
    ./python.exe Tools/scripts/var_access_benchmark.py >> results.txt   # Run benchmark

    git checkout 16323cb2c3d315e02637cebebdc5ff46be32ecdf  # Go to end-of-day 2/24
    make                                                   # Rebuild
    sleep 30                                               # Let the system get quiet and cool
    echo ' end of day ---' >> results.txt                  # Label output
    ./python.exe Tools/scripts/var_access_benchmark.py >> results.txt   # Run benchmark
done

> * commit 175421b58cc97a2555e474f479f30a6c5d2250b0 (HEAD)
> | Author: Pablo Galindo
> | Date: Sat Feb 23 03:02:06 2019 +
> |
> | bpo-36016: Add generation option to gc.getobjects() (GH-11909)
>
> $ ./python Tools/scripts/var_access_benchmark.py
> Variable and attribute read access:
>    18.1 ns   read_local
>    19.4 ns   read_nonlocal

These timings are several times larger than they should be. Perhaps you're running a debug build? Or perhaps 32-bit? Or on a VM or some such. Something looks way off, because I'm getting 4 and 5 ns on my 2013 Haswell laptop.

Raymond
Re: [Python-Dev] Possible performance regression
On Sun, Feb 24, 2019 at 10:04 PM Eric Snow wrote:
> I'll take a look tonight.

I made 2 successive runs of the script (on my laptop) for a commit from early Saturday, and 2 runs from a commit this afternoon (close to master). The output is below, with the earlier commit first. That one is a little faster in places and a little slower in others. However, I also saw quite a bit of variability in the results for the same commit. So I'm not sure what to make of it.

I'll look into it in more depth tomorrow. FWIW, I have a few commits in the range you described, so I want to make sure I didn't slow things down for us. :)

-eric

* commit 175421b58cc97a2555e474f479f30a6c5d2250b0 (HEAD)
| Author: Pablo Galindo
| Date: Sat Feb 23 03:02:06 2019 +
|
| bpo-36016: Add generation option to gc.getobjects() (GH-11909)

$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
    18.1 ns   read_local
    19.4 ns   read_nonlocal
    48.3 ns   read_global
    52.4 ns   read_builtin
    55.7 ns   read_classvar_from_class
    56.1 ns   read_classvar_from_instance
    78.6 ns   read_instancevar
    67.6 ns   read_instancevar_slots
    65.9 ns   read_namedtuple
   106.1 ns   read_boundmethod

Variable and attribute write access:
    25.1 ns   write_local
    26.9 ns   write_nonlocal
    78.0 ns   write_global
   154.1 ns   write_classvar
   132.0 ns   write_instancevar
    88.2 ns   write_instancevar_slots

Data structure read access:
    69.6 ns   read_list
    69.0 ns   read_deque
    68.4 ns   read_dict

Data structure write access:
    73.2 ns   write_list
    79.0 ns   write_deque
   103.5 ns   write_dict

Stack (or queue) operations:
   348.3 ns   list_append_pop
   169.0 ns   deque_append_pop
   170.8 ns   deque_append_popleft

Timing loop overhead:
     1.3 ns   loop_overhead

$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
    17.7 ns   read_local
    19.2 ns   read_nonlocal
    39.9 ns   read_global
    50.3 ns   read_builtin
    54.4 ns   read_classvar_from_class
    55.8 ns   read_classvar_from_instance
    80.3 ns   read_instancevar
    70.7 ns   read_instancevar_slots
    66.1 ns   read_namedtuple
   108.9 ns   read_boundmethod

Variable and attribute write access:
    25.1 ns   write_local
    25.6 ns   write_nonlocal
    70.0 ns   write_global
   151.5 ns   write_classvar
   133.9 ns   write_instancevar
    90.7 ns   write_instancevar_slots

Data structure read access:
   140.7 ns   read_list
    89.6 ns   read_deque
    86.6 ns   read_dict

Data structure write access:
    97.9 ns   write_list
   100.5 ns   write_deque
   120.0 ns   write_dict

Stack (or queue) operations:
   375.9 ns   list_append_pop
   179.3 ns   deque_append_pop
   179.4 ns   deque_append_popleft

Timing loop overhead:
     1.5 ns   loop_overhead

* commit 3b0abb019662e42070f1d6f7e74440afb1808f03 (HEAD)
| Author: Giampaolo Rodola
| Date: Sun Feb 24 15:46:40 2019 -0800
|
| bpo-33671: allow setting shutil.copyfile() bufsize globally (GH-12016)

$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
    20.2 ns   read_local
    20.0 ns   read_nonlocal
    41.9 ns   read_global
    52.9 ns   read_builtin
    56.3 ns   read_classvar_from_class
    56.9 ns   read_classvar_from_instance
    80.2 ns   read_instancevar
    70.6 ns   read_instancevar_slots
    69.5 ns   read_namedtuple
   114.5 ns   read_boundmethod

Variable and attribute write access:
    23.4 ns   write_local
    25.0 ns   write_nonlocal
    74.5 ns   write_global
   152.0 ns   write_classvar
   131.7 ns   write_instancevar
    90.1 ns   write_instancevar_slots

Data structure read access:
    69.9 ns   read_list
    73.4 ns   read_deque
    77.8 ns   read_dict

Data structure write access:
    83.3 ns   write_list
    94.9 ns   write_deque
   120.6 ns   write_dict

Stack (or queue) operations:
   383.4 ns   list_append_pop
   187.1 ns   deque_append_pop
   182.2 ns   deque_append_popleft

Timing loop overhead:
     1.4 ns   loop_overhead

$ ./python Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
    19.1 ns   read_local
    20.9 ns   read_nonlocal
    43.8 ns   read_global
    57.8 ns   read_builtin
    58.4 ns   read_classvar_from_class
    61.3 ns   read_classvar_from_instance
    84.7 ns   read_instancevar
    72.9 ns   read_instancevar_slots
    69.7 ns   read_namedtuple
   109.9 ns   read_boundmethod

Variable and attribute write access:
    23.1 ns   write_local
    23.7 ns   write_nonlocal
    72.8 ns   write_global
   149.9 ns   write_classvar
   133.3 ns   write_instancevar
    89.4 ns   write_instancevar_slots

Data structure read access:
    69.0 ns   read_list
    69.6 ns   read_deque
    69.1 ns   read_dict

Data structure write access:
Re: [Python-Dev] Possible performance regression
I'll take a look tonight.

-eric

On Sun, Feb 24, 2019, 21:54 Raymond Hettinger wrote:
> I've been running benchmarks that have been stable for a while. But
> between today and yesterday, there has been an almost across-the-board
> performance regression.
>
> It's possible that this is a measurement error or something unique to my
> system (my Mac installed the 10.14.3 release today), so I'm hoping other
> folks can run checks as well.
>
> Raymond
>
> -- Yesterday
>
> $ ./python.exe Tools/scripts/var_access_benchmark.py
> Variable and attribute read access:
>    4.0 ns   read_local
>    4.5 ns   read_nonlocal
>   13.1 ns   read_global
>   17.4 ns   read_builtin
>   17.4 ns   read_classvar_from_class
>   15.8 ns   read_classvar_from_instance
>   24.6 ns   read_instancevar
>   19.7 ns   read_instancevar_slots
>   18.5 ns   read_namedtuple
>   26.3 ns   read_boundmethod
>
> Variable and attribute write access:
>    4.6 ns   write_local
>    4.8 ns   write_nonlocal
>   17.5 ns   write_global
>   39.1 ns   write_classvar
>   34.4 ns   write_instancevar
>   25.3 ns   write_instancevar_slots
>
> Data structure read access:
>   17.5 ns   read_list
>   18.4 ns   read_deque
>   19.2 ns   read_dict
>
> Data structure write access:
>   19.0 ns   write_list
>   22.0 ns   write_deque
>   24.4 ns   write_dict
>
> Stack (or queue) operations:
>   55.5 ns   list_append_pop
>   46.3 ns   deque_append_pop
>   46.7 ns   deque_append_popleft
>
> Timing loop overhead:
>    0.3 ns   loop_overhead
>
> -- Today ---
>
> $ ./python.exe Tools/scripts/var_access_benchmark.py
> Variable and attribute read access:
>    5.0 ns   read_local
>    5.3 ns   read_nonlocal
>   14.7 ns   read_global
>   18.6 ns   read_builtin
>   19.9 ns   read_classvar_from_class
>   17.7 ns   read_classvar_from_instance
>   26.1 ns   read_instancevar
>   21.0 ns   read_instancevar_slots
>   21.7 ns   read_namedtuple
>   27.8 ns   read_boundmethod
>
> Variable and attribute write access:
>    6.1 ns   write_local
>    7.3 ns   write_nonlocal
>   18.9 ns   write_global
>   40.7 ns   write_classvar
>   36.2 ns   write_instancevar
>   26.1 ns   write_instancevar_slots
>
> Data structure read access:
>   19.1 ns   read_list
>   19.6 ns   read_deque
>   20.6 ns   read_dict
>
> Data structure write access:
>   22.8 ns   write_list
>   23.5 ns   write_deque
>   27.8 ns   write_dict
>
> Stack (or queue) operations:
>   54.8 ns   list_append_pop
>   49.5 ns   deque_append_pop
>   49.4 ns   deque_append_popleft
>
> Timing loop overhead:
>    0.3 ns   loop_overhead
[Python-Dev] Possible performance regression
I've been running benchmarks that have been stable for a while. But between today and yesterday, there has been an almost across-the-board performance regression.

It's possible that this is a measurement error or something unique to my system (my Mac installed the 10.14.3 release today), so I'm hoping other folks can run checks as well.

Raymond

-- Yesterday

$ ./python.exe Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
    4.0 ns   read_local
    4.5 ns   read_nonlocal
   13.1 ns   read_global
   17.4 ns   read_builtin
   17.4 ns   read_classvar_from_class
   15.8 ns   read_classvar_from_instance
   24.6 ns   read_instancevar
   19.7 ns   read_instancevar_slots
   18.5 ns   read_namedtuple
   26.3 ns   read_boundmethod

Variable and attribute write access:
    4.6 ns   write_local
    4.8 ns   write_nonlocal
   17.5 ns   write_global
   39.1 ns   write_classvar
   34.4 ns   write_instancevar
   25.3 ns   write_instancevar_slots

Data structure read access:
   17.5 ns   read_list
   18.4 ns   read_deque
   19.2 ns   read_dict

Data structure write access:
   19.0 ns   write_list
   22.0 ns   write_deque
   24.4 ns   write_dict

Stack (or queue) operations:
   55.5 ns   list_append_pop
   46.3 ns   deque_append_pop
   46.7 ns   deque_append_popleft

Timing loop overhead:
    0.3 ns   loop_overhead

-- Today ---

$ ./python.exe Tools/scripts/var_access_benchmark.py
Variable and attribute read access:
    5.0 ns   read_local
    5.3 ns   read_nonlocal
   14.7 ns   read_global
   18.6 ns   read_builtin
   19.9 ns   read_classvar_from_class
   17.7 ns   read_classvar_from_instance
   26.1 ns   read_instancevar
   21.0 ns   read_instancevar_slots
   21.7 ns   read_namedtuple
   27.8 ns   read_boundmethod

Variable and attribute write access:
    6.1 ns   write_local
    7.3 ns   write_nonlocal
   18.9 ns   write_global
   40.7 ns   write_classvar
   36.2 ns   write_instancevar
   26.1 ns   write_instancevar_slots

Data structure read access:
   19.1 ns   read_list
   19.6 ns   read_deque
   20.6 ns   read_dict

Data structure write access:
   22.8 ns   write_list
   23.5 ns   write_deque
   27.8 ns   write_dict

Stack (or queue) operations:
   54.8 ns   list_append_pop
   49.5 ns   deque_append_pop
   49.4 ns   deque_append_popleft

Timing loop overhead:
    0.3 ns   loop_overhead
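[Editor's note] Per-operation timings of the kind shown above can be approximated with a short timeit sketch. This is not the actual Tools/scripts/var_access_benchmark.py; the statements, the repeat counts, and the loop-overhead subtraction are illustrative.

```python
import timeit

def ns_per_op(stmt, setup="pass", number=200_000, repeats=5):
    """Best-of-N time for one statement, in nanoseconds per execution."""
    best = min(timeit.repeat(stmt, setup, number=number, repeat=repeats))
    return best / number * 1e9

# Measure the empty loop body first and subtract it from each result,
# mirroring the "loop_overhead" line in the output above.
overhead = ns_per_op("pass")
read_local = ns_per_op("x", setup="x = 1") - overhead
write_local = ns_per_op("x = 1") - overhead

print(f"{read_local:6.1f} ns   read_local")
print(f"{write_local:6.1f} ns   write_local")
```

Note that in timeit the setup statement runs inside the generated timing function, so `x` really is a local variable here; on a fast machine the subtracted values can be tiny and dominated by exactly the noise discussed later in the thread.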