Hi Victor,

Thanks for the great contribution to the unified benchmark development! In 
addition to the Outreachy program that we are currently supporting, let us 
know how else we can help out in this effort.

Other than microbenchmarks and benchmarking ideas, we'd also like to hear 
suggestions from the community on workload development around real-world use 
cases, especially in the enterprise world: cloud computing, data analytics, 
machine learning, high-performance computing, etc.

Thanks,

Peter
 

-----Original Message-----
From: Python-Dev 
[mailto:python-dev-bounces+peter.xihong.wang=intel....@python.org] On Behalf Of 
python-dev-requ...@python.org
Sent: Thursday, October 20, 2016 9:00 AM
To: python-dev@python.org
Subject: Python-Dev Digest, Vol 159, Issue 27

Send Python-Dev mailing list submissions to
        python-dev@python.org

To subscribe or unsubscribe via the World Wide Web, visit
        https://mail.python.org/mailman/listinfo/python-dev
or, via email, send a message with subject or body 'help' to
        python-dev-requ...@python.org

You can reach the person managing the list at
        python-dev-ow...@python.org

When replying, please edit your Subject line so it is more specific than "Re: 
Contents of Python-Dev digest..."


Today's Topics:

   1. Benchmarking Python and micro-optimizations (Victor Stinner)
   2. Have I got my hg dependencies correct? (Skip Montanaro)
   3. Re: Have I got my hg dependencies correct? (Skip Montanaro)
   4. Re: Have I got my hg dependencies correct? (Victor Stinner)
   5. Re: Benchmarking Python and micro-optimizations
      (Maciej Fijalkowski)
   6. Re: Benchmarking Python and micro-optimizations (Eric Snow)


----------------------------------------------------------------------

Message: 1
Date: Thu, 20 Oct 2016 12:56:06 +0200
From: Victor Stinner <victor.stin...@gmail.com>
To: Python Dev <Python-Dev@python.org>
Subject: [Python-Dev] Benchmarking Python and micro-optimizations
Message-ID:
        <campsgwysaxjghfjh_hdh6pj13phacmwybi_lexloxdcouo6...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Hi,

Over the last few months, I have worked a lot on benchmarks. I ran 
benchmarks, analyzed results in depth (down to the hardware and kernel 
drivers!), wrote new tools, and enhanced existing ones.

* I wrote a new perf module which runs benchmarks in a reliable way and 
contains a LOT of features: metadata collection, a JSON file format, and 
commands to compare results, render a histogram, etc.

* I rewrote the Python benchmark suite: the old benchmarks Mercurial repository 
moved to a new performance GitHub project which uses my perf module and 
contains more benchmarks.

* I also made minor enhancements to timeit in Python 3.7 -- some developers 
don't want major changes, so as not to "break backward compatibility".

For timeit, I suggest using my perf tool, which includes a reliable timeit 
command and has many more features, like --duplicate (repeat the statements 
to reduce the cost of the outer loop) and --compare-to (compare two versions 
of Python), as well as all the built-in perf features (JSON output, 
statistics, histograms, etc.).
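
To give a concrete idea, here is a minimal command-line sketch (the 
--duplicate and --compare-to flags are the ones described above; the other 
spellings are from memory, so check "python3 -m perf --help" for the exact 
names, and adjust the Python paths to your own builds):

    # benchmark a statement and save the results as JSON
    python3 -m perf timeit -s "x = 1" "x + x" -o ref.json

    # repeat the statement 100 times to reduce the cost of the outer loop
    python3 -m perf timeit --duplicate 100 -s "x = 1" "x + x"

    # run the benchmark on a reference Python and a patched one, and compare
    ./python-patched -m perf timeit --compare-to ./python-ref -s "x = 1" "x + x"

    # inspect saved results: statistics and histogram
    python3 -m perf show ref.json
    python3 -m perf hist ref.json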

I added benchmarks from the PyPy and Pyston benchmark suites to
performance: performance 0.3.1 contains 51 benchmark scripts which run a 
total of 121 benchmarks. Examples of tested Python modules:

* SQLAlchemy
* Dulwich (full Git implementation in Python)
* Mercurial (currently only the startup time)
* html5lib
* pyaes (AES crypto cipher in pure Python)
* sympy
* Tornado (HTTP client and server)
* Django (sadly, only the template engine right now; Pyston contains HTTP 
benchmarks)
* pathlib
* spambayes

More benchmarks will be added later. It would be nice to add benchmarks on 
numpy, for example: numpy matters to a large part of our community.

All these (new or updated) tools can now be used to make smarter decisions 
about optimizations. Please don't push any optimization anymore without 
providing reliable benchmark results!


My first major action was to close the latest attempt to micro-optimize 
int+int in Python/ceval.c,
http://bugs.python.org/issue21955 : I closed the issue as rejected, because 
there is no significant speedup on any benchmark other than two (tiny) 
microbenchmarks. To make sure that no one wastes their time trying to 
micro-optimize int+int, I even added a comment to Python/ceval.c :-)

   https://hg.python.org/cpython/rev/61fcb12a9873
   "Please don't try to micro-optimize int+int"


The perf and performance projects are now well tested: Travis CI runs the 
tests on new commits and pull requests, and the "tox" command can be used 
locally to test different Python versions, pep8, the docs, ... in a single 
command.
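
For example (the environment names below are hypothetical; the real ones are 
defined in each project's tox.ini):

    tox            # run all configured environments in one go
    tox -e py35    # hypothetical: run the test suite on Python 3.5 only
    tox -e pep8    # hypothetical: run only the style checks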


Next steps:

* Run performance 0.3.1 on speed.python.org: the benchmark runner is 
currently stopped (and still uses the old benchmarks project). The website 
part may be updated to allow downloading the full JSON files, which include 
*all* information (all timings, metadata and more).

* I plan to run performance on CPython 2.7, CPython 3.7, PyPy and PyPy 3. 
Maybe also CPython 3.5 and CPython 3.6 if they don't take too many 
resources.

* Later, we can consider adding more implementations of Python:
Jython, IronPython, MicroPython, Pyston, Pyjion, etc. All benchmarks should be 
run on the same hardware to be comparable.

* Later, we might also allow other projects to upload their own benchmark 
results, but we should find a way to group benchmark results per benchmark 
runner (e.g. at least by hostname; the perf JSON contains the hostname), so 
that two results from two different machines are never compared directly (a 
rough sketch of this grouping follows the list below).

* We should continue to add more benchmarks to the performance benchmark 
suite, especially benchmarks that are more representative of real 
applications (we have enough microbenchmarks!)
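
As a rough illustration of that grouping idea, here is a minimal sketch in 
Python, assuming each uploaded file is a perf JSON file storing the hostname 
under a top-level "metadata" mapping (the exact layout of the perf JSON 
format may differ):

    import json
    from collections import defaultdict

    def group_by_hostname(paths):
        """Group benchmark result files by the machine that produced them.

        Assumes each file is a perf JSON file with a top-level "metadata"
        mapping containing a "hostname" entry; adjust the keys if the real
        perf JSON layout differs.
        """
        groups = defaultdict(list)
        for path in paths:
            with open(path) as f:
                data = json.load(f)
            hostname = data.get("metadata", {}).get("hostname", "unknown")
            groups[hostname].append(path)
        return groups

    # Results are only comparable within a group: two results coming from
    # two different machines never end up side by side.
    for host, files in sorted(group_by_hostname(["run1.json", "run2.json"]).items()):
        print(host, files)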


Links:

* perf: http://perf.readthedocs.io/
* performance: https://github.com/python/performance
* Python Speed mailing list: https://mail.python.org/mailman/listinfo/speed
* https://speed.python.org/ (currently outdated, and doesn't use performance yet)

See https://pypi.python.org/pypi/performance, which contains even more links 
to Python benchmarks (PyPy, Pyston, Numba, Pythran, etc.).

Victor


------------------------------

Message: 2
Date: Thu, 20 Oct 2016 06:47:42 -0500
From: Skip Montanaro <skip.montan...@gmail.com>
To: python-dev Dev <python-dev@python.org>
Subject: [Python-Dev] Have I got my hg dependencies correct?
Message-ID:
        <CANc-5UwuWxYkN5LgOdDPEO5-7xYDCLK5Xv6dFP=slbaxbh1...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

I've recently run into a problem building the math and cmath modules for 
2.7. (I don't rebuild very often, so this problem might have been around for 
a while.) My hg repos look like this:

* My cpython repo pulls from https://hg.python.org/cpython

* My 2.7 repo (and other non-tip repos) pulls from my cpython repo

I think this setup was recommended way back in the day, when hg was new to 
the Python toolchain, to avoid unnecessary network traffic.

So, if I execute

hg pull
hg update

first in the cpython repo, then in the 2.7 repo, I should be up to date, 
correct? However, rebuilding in my 2.7 repo fails to build math and cmath. 
The compiler complains that Modules/_math.o doesn't exist. If I manually 
execute

make Modules/_math.o
make

after the failure, then the math and cmath modules build.

Looking on bugs.python.org I saw this closed issue:

http://bugs.python.org/issue24421

which seems related. Is it possible that the fix wasn't propagated to the 2.7 
branch? Or perhaps I've fouled up my hg repo relationships? My other repos 
which depend on cpython (3.5, 3.4, 3.3, and 3.2) all build the math module just 
fine.

I'm running on an ancient MacBook Pro with OS X 10.11.6 (El Capitan) and 
Xcode 8.0 installed.

Any suggestions?

Skip


------------------------------

Message: 3
Date: Thu, 20 Oct 2016 07:13:47 -0500
From: Skip Montanaro <skip.montan...@gmail.com>
To: python-dev Dev <python-dev@python.org>
Subject: Re: [Python-Dev] Have I got my hg dependencies correct?
Message-ID:
        <canc-5uy+xe3owavdny+eochcpbhgihmbetsbqxp9tlvgqo_...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Thu, Oct 20, 2016 at 6:47 AM, Skip Montanaro <skip.montan...@gmail.com> 
wrote:
> Is it possible that the fix wasn't propagated to the 2.7 branch? Or 
> perhaps I've fouled up my hg repo relationships?

Either way, I went ahead and opened a ticket:

http://bugs.python.org/issue28487

S


------------------------------

Message: 4
Date: Thu, 20 Oct 2016 14:35:10 +0200
From: Victor Stinner <victor.stin...@gmail.com>
To: Skip Montanaro <skip.montan...@gmail.com>
Cc: python-dev Dev <python-dev@python.org>
Subject: Re: [Python-Dev] Have I got my hg dependencies correct?
Message-ID:
        <campsgwa0ek0nma2pzk7pjrjgtvu1c5ybuzp0roh_dsawh4p...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Are you on the 2.7 branch or the default branch?

You might try cleaning up your checkout:

hg up -C -r 2.7
make distclean
hg purge      # WARNING! this removes *all* files not tracked by Mercurial
./configure && make

You should also paste the full error message.

Victor

2016-10-20 13:47 GMT+02:00 Skip Montanaro <skip.montan...@gmail.com>:
> [earlier message quoted in full; snipped]


------------------------------

Message: 5
Date: Thu, 20 Oct 2016 15:15:43 +0200
From: Maciej Fijalkowski <fij...@gmail.com>
To: Victor Stinner <victor.stin...@gmail.com>
Cc: Python Dev <Python-Dev@python.org>
Subject: Re: [Python-Dev] Benchmarking Python and micro-optimizations
Message-ID:
        <cak5idxrk1muf5x8owot2pkucvvag2sgu6d+ldknnaeuh3p7...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Hi Victor

Even though I haven't found the time to run your stuff yet, thanks for all 
the awesome work!

On Thu, Oct 20, 2016 at 12:56 PM, Victor Stinner <victor.stin...@gmail.com> 
wrote:
> [original message quoted in full; snipped]


------------------------------

Message: 6
Date: Thu, 20 Oct 2016 09:31:42 -0600
From: Eric Snow <ericsnowcurren...@gmail.com>
To: Victor Stinner <victor.stin...@gmail.com>
Cc: Python Dev <Python-Dev@python.org>
Subject: Re: [Python-Dev] Benchmarking Python and micro-optimizations
Message-ID:
        <calffu7azsqrazovfsdmmruqqtumuvxelwrijv08uasgnbrn...@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Thu, Oct 20, 2016 at 4:56 AM, Victor Stinner <victor.stin...@gmail.com> 
wrote:
> Hi,
>
> Over the last few months, I have worked a lot on benchmarks. I ran 
> benchmarks, analyzed results in depth (down to the hardware and kernel 
> drivers!), wrote new tools, and enhanced existing ones.

This is a massive contribution.  Thanks!

> All these (new or updated) tools can now be used to make smarter 
> decisions about optimizations. Please don't push any optimization anymore 
> without providing reliable benchmark results!

+1

-eric


------------------------------

Subject: Digest Footer

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev


------------------------------

End of Python-Dev Digest, Vol 159, Issue 27
*******************************************