I'm sorry, but we're just not going to turn this on by default without doing a trial period ourselves. Your (and Intel's) contribution is very welcome, but in order to establish trust in a feature like this, an optional trial period is absolutely required.
Regarding the training set, I agree that regrtest sounds better than pybench. If we make this an opt-in change, we can experiment with different training sets easily. (Also, I haven't seen the patch yet, but I presume it's easy to use a different training set? Experimentation should be encouraged.)

On Sat, Aug 22, 2015 at 9:40 AM, Patrascu, Alecsandru <alecsandru.patra...@intel.com> wrote:

> Hello and thank you for your feedback.
>
> We have measured PGO gain using other workloads also. Our initial choice
> for this optimization was pybench, but the speedup obtained was lower than
> using regrtest and it didn't cover a lot of Python scenarios. Instead,
> regrtest has a uniform distribution for the tests and the resulting binary
> is overall much faster than the default, or trained using other workloads,
> and thus covering a larger pool of Python loads. This optimization was also
> tested on a production environment running OpenStack Swift and got up to
> 9% improvements.
>
> The reason we proposed this target to be always on is that the obtained
> optimized binary is better out of the box for the general cases.
>
> Alecsandru
>
> From: gvanros...@gmail.com [mailto:gvanros...@gmail.com] On Behalf Of
> Guido van Rossum
> Sent: Saturday, August 22, 2015 7:15 PM
> To: Patrascu, Alecsandru
> Cc: python-dev@python.org
> Subject: Re: [Python-Dev] Profile Guided Optimization active by-default
>
> How about we first add a new Makefile target that enables PGO, without
> turning it on by default? Then later we can enable it by default.
> Also, I have my doubts about regrtest. How sure are we that it represents
> a typical Python load? Tests are often using a different mix of operations
> than production code.
>
> On Sat, Aug 22, 2015 at 7:46 AM, Patrascu, Alecsandru <
> alecsandru.patra...@intel.com> wrote:
> Hi All,
>
> This is Alecsandru from Server Scripting Languages Optimization team at
> Intel Corporation.
>
> I would like to submit a request to turn-on Profile Guided Optimization or
> PGO as the default build option for Python (both 2.7 and 3.6), given its
> performance benefits on a wide variety of workloads and hardware. For
> instance, as shown from attached sample performance results from the Grand
> Unified Python Benchmark, >20% speed up was observed. In addition, we are
> seeing 2-9% performance boost from OpenStack/Swift where more than 60% of
> the codes are in Python 2.7. Our analysis indicates the performance gain
> was mainly due to reduction of icache misses and CPU front-end stalls.
>
> Attached is the Makefile patches that modify the all build target and adds
> a new one called "disable-profile-opt". We built and tested this patch for
> Python 2.7 and 3.6 on our Linux machines (CentOS 7/Ubuntu Server 14.04,
> Intel Xeon Haswell/Broadwell with 18/8 cores). We use "regrtest" suite for
> training as it provides the best performance improvement. Some of the test
> programs in the suite may fail which leads to build fail. One solution is
> to disable the specific failed test using the "-x " flag (as shown in the
> patch)
>
> Steps to apply the patch:
> 1. hg clone https://hg.python.org/cpython cpython
> 2. cd cpython
> 3. hg update 2.7 (needed for 2.7 only)
> 4. Copy *.patch to the current directory
> 5. patch < python2.7-pgo.patch (or patch < python3.6-pgo.patch)
> 6. ./configure
> 7. make
>
> To disable PGO
> 7b. make disable-profile-opt
>
> In the following, please find our sample performance results from latest
> XEON machine, XEON Broadwell EP.
> Hardware (HW):     Intel XEON (Broadwell) 8 Cores
>
> BIOS settings:     Intel Turbo Boost Technology: false
>                    Hyper-Threading: false
>
> Operating System:  Ubuntu 14.04.3 LTS trusty
>
> OS configuration:  CPU freq set at fixed: 2.6GHz by
>                      echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
>                      echo 2600000 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
>                    Address Space Layout Randomization (ASLR) disabled (to
>                    reduce run to run variation) by
>                      echo 0 > /proc/sys/kernel/randomize_va_space
>
> GCC version:       gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)
>
> Benchmark:         Grand Unified Python Benchmark (GUPB)
> GUPB Source:       https://hg.python.org/benchmarks/
>
> Python2.7 results:
> Python source: hg clone https://hg.python.org/cpython cpython
> Python Source: hg update 2.7
> hg id: 0511b1165bb6 (2.7)
> hg id -r 'ancestors(.) and tag()': 15c95b7d81dc (2.7) v2.7.10
> hg --debug id -i: 0511b1165bb6cf40ada0768a7efc7ba89316f6a5
>
> Benchmarks            Speedup(%)
> simple_logging        20
> raytrace              20
> silent_logging        19
> richards              19
> chaos                 16
> formatted_logging     16
> json_dump             15
> hexiom2               13
> pidigits              12
> slowunpickle          12
> django_v2             12
> unpack_sequence       11
> float                 11
> mako                  11
> slowpickle            11
> fastpickle            11
> django                11
> go                    10
> json_dump_v2          10
> pathlib               10
> regex_compile         10
> pybench               9.9
> etree_process         9
> regex_v8              8
> bzr_startup           8
> 2to3                  8
> slowspitfire          8
> telco                 8
> pickle_list           8
> fannkuch              8
> etree_iterparse       8
> nqueens               8
> mako_v2               8
> etree_generate        8
> call_method_slots     7
> html5lib_warmup       7
> html5lib              7
> nbody                 7
> spectral_norm         7
> spambayes             7
> fastunpickle          6
> meteor_contest        6
> chameleon             6
> rietveld              6
> tornado_http          5
> unpickle_list         5
> pickle_dict           4
> regex_effbot          3
> normal_startup        3
> startup_nosite        3
> etree_parse           2
> call_method_unknown   2
> call_simple           1
> json_load             1
> call_method           1
>
> Python3.6 results
> Python source: hg clone https://hg.python.org/cpython cpython
> hg id: 96d016f78726 tip
> hg id -r 'ancestors(.) and tag()': 1a58b1227501 (3.5) v3.5.0rc1
> hg --debug id -i: 96d016f78726afbf66d396f084b291ea43792af1
>
> Benchmark             Speedup(%)
> fastunpickle          22.94
> fastpickle            21.67
> json_load             17.64
> simple_logging        17.49
> meteor_contest        16.67
> formatted_logging     15.33
> etree_process         14.61
> raytrace              13.57
> etree_generate        13.56
> chaos                 12.09
> hexiom2               12
> nbody                 11.88
> json_dump_v2          11.24
> richards              11.02
> nqueens               10.96
> fannkuch              10.79
> go                    10.77
> float                 10.26
> regex_compile         9.8
> silent_logging        9.63
> pidigits              9.58
> etree_iterparse       9.48
> 2to3                  8.44
> regex_v8              8.09
> regex_effbot          7.88
> call_simple           7.63
> tornado_http          7.38
> etree_parse           4.92
> spectral_norm         4.72
> normal_startup        4.39
> telco                 3.88
> startup_nosite        3.7
> call_method           3.63
> unpack_sequence       3.6
> call_method_slots     2.91
> call_method_unknown   2.59
> iterative_count       0.45
> threaded_count        -2.79
>
> Thank you,
> Alecsandru
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
> --
> --Guido van Rossum (python.org/~guido)

--
--Guido van Rossum (python.org/~guido)
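
As a reading aid for the Python 3.6 figures quoted in this thread, they can be summarized with a few lines of Python. This is only a sketch and not part of the original patch or messages: the dictionary is transcribed by hand from the quoted table, and the names SPEEDUPS_36 and summarize are made up for illustration.

```python
from statistics import mean, median

# Python 3.6 speedups (%), transcribed from the quoted results table.
# SPEEDUPS_36 is a hypothetical name used only in this sketch.
SPEEDUPS_36 = {
    "fastunpickle": 22.94, "fastpickle": 21.67, "json_load": 17.64,
    "simple_logging": 17.49, "meteor_contest": 16.67,
    "formatted_logging": 15.33, "etree_process": 14.61, "raytrace": 13.57,
    "etree_generate": 13.56, "chaos": 12.09, "hexiom2": 12,
    "nbody": 11.88, "json_dump_v2": 11.24, "richards": 11.02,
    "nqueens": 10.96, "fannkuch": 10.79, "go": 10.77, "float": 10.26,
    "regex_compile": 9.8, "silent_logging": 9.63, "pidigits": 9.58,
    "etree_iterparse": 9.48, "2to3": 8.44, "regex_v8": 8.09,
    "regex_effbot": 7.88, "call_simple": 7.63, "tornado_http": 7.38,
    "etree_parse": 4.92, "spectral_norm": 4.72, "normal_startup": 4.39,
    "telco": 3.88, "startup_nosite": 3.7, "call_method": 3.63,
    "unpack_sequence": 3.6, "call_method_slots": 2.91,
    "call_method_unknown": 2.59, "iterative_count": 0.45,
    "threaded_count": -2.79,
}


def summarize(speedups):
    """Return basic summary statistics for a {benchmark: speedup %} mapping."""
    values = list(speedups.values())
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        # Benchmark with the largest reported speedup.
        "best": max(speedups, key=speedups.get),
        # Benchmarks that got slower with PGO (negative speedup).
        "regressions": sorted(n for n, pct in speedups.items() if pct < 0),
    }


summary = summarize(SPEEDUPS_36)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The same helper could be pointed at the Python 2.7 table, or at results from a different training set, to compare runs during a trial period.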