Hello,

I have been working to track down the origin of the performance penalty
exposed by this bug.

All the tests I am running use a locally compiled build of Python 2.7.12
(from upstream sources, without any Ubuntu patches applied), built with
two different versions of GCC, 5.3.1 (current) and 4.8.0, both from the
Ubuntu archives.

As I mentioned in my previous comments (see the full comparison stats), I
can see significant performance differences just by switching the GCC
version. I decided to focus my investigation on the pickle module, since
it seems to be the most affected one, being approximately 1.17x slower
between the two GCC versions.

Given the number of changes introduced between 4.8.0 and 5.3.1, I decided
not to pursue a bisection to identify an offending commit yet. First we
need to identify which compile-time optimization or change is causing the
regression, so that we can focus our investigation on that specific area.

My understanding is that the performance penalty caused by the compiler
might be related to two factors: an important change in the linked libc,
or an optimization made by the compiler in the resulting object.

Since the resulting objects are linked against the same glibc version
(2.23), I will not consider that factor as part of the analysis. Instead,
I will focus on analyzing the performance of the objects generated by the
compiler.

To follow this approach I ran the pyperformance suite under a valgrind
session, excluding all the modules except the pickle module and using the
default suppressions to avoid missing any reference in the python runtime,
with the following arguments:

valgrind --tool=callgrind --instr-atstart=no --trace-children=yes
venv/cpython2.7-6ed9b6df9cd4/bin/python -m performance run --python
/usr/local/bin/python2.7 -b pickle --inside-venv

I ran this process multiple times with both GCC 4.8.0 and 5.3.1 to produce
a large set of callgrind files to analyze. Those callgrind files contain
the full execution tree, including all the relocations, jumps, and calls
into the libc and the python runtime itself, and of course the time spent
per function and the number of calls made to it.

I cleaned up the resulting set of callgrind files by removing the files
smaller than 100k and the ones that did not load the cPickle extension
(https://pastebin.canonical.com/175951/).
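The filtering step can be sketched roughly as follows. This is not the
script from the paste above, only a minimal illustration; it assumes that
"100k" means 100 KiB, that the files use valgrind's default
callgrind.out.<pid> naming, and that a profile which loaded the extension
mentions the string "cPickle" somewhere. The function names are
hypothetical.

```python
import os

# Assumption: "100k" is read as 100 KiB.
MIN_SIZE = 100 * 1024


def keep_profile(path, min_size=MIN_SIZE, marker=b"cPickle"):
    """Return True if the file is large enough and mentions the marker."""
    if os.path.getsize(path) < min_size:
        return False
    with open(path, "rb") as fh:
        for line in fh:
            if marker in line:
                return True
    return False


def select_profiles(directory, prefix="callgrind.out."):
    """List the callgrind output files in `directory` worth analyzing."""
    selected = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.startswith(prefix) and keep_profile(path):
            selected.append(path)
    return selected
```

Calling select_profiles() on the directory holding the callgrind output
returns only the paths that pass both checks.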

Over that set of files I ran callgrind_annotate to generate per-function
stats ordered by the exclusive cost of each function. Then, with this
script (http://paste.ubuntu.com/23795048/), I summed the costs per
function for each GCC version (4.8 and 5.3.1) and calculated the variance
in cost between them.
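For reference, the aggregation can be sketched as below. This is not the
script linked above, only an illustration under assumptions: each profile
is taken to be already parsed into a dict mapping function name to
exclusive cost, and the "variance" is taken to be the relative change
(gcc 5.3.1 cost - gcc 4.8 cost) / gcc 4.8 cost, which is consistent with
the sample values in this comment. All function names are hypothetical.

```python
from collections import defaultdict


def total_costs(runs):
    """Sum the cost of each function over several profiles.

    `runs` is an iterable of dicts mapping function name -> cost.
    """
    totals = defaultdict(float)
    for run in runs:
        for func, cost in run.items():
            totals[func] += cost
    return dict(totals)


def relative_change(old, new):
    """Return (new - old) / old for functions present in both totals."""
    return {
        func: (new[func] - old[func]) / old[func]
        for func in old
        if func in new and old[func]
    }


def report(old, new):
    """Format one line per function, largest relative change first."""
    change = relative_change(old, new)
    ordered = sorted(change, key=change.get, reverse=True)
    return [
        "%s %f %f (variance: %f)" % (f, old[f], new[f], change[f])
        for f in ordered
    ]
```

Functions that appear in only one of the two totals are skipped here,
since no relative change can be computed for them; whether the original
script does the same is unknown.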

Each line of the resulting file has the following format:

function name - gcc 4.8 cost - gcc 5.3.1 cost - variance (relative change, as a fraction of the gcc 4.8 cost)

As an example:

/home/ubuntu/python/cpython/Objects/tupleobject.c:tupleiter_dealloc 258068.000000 445009.000000 (variance: 0.724387)
/home/ubuntu/python/cpython/Objects/object.c:try_3way_compare 984860.000000 1676351.000000 (variance: 0.702121)
/home/ubuntu/python/cpython/Python/marshal.c:r_object 183524.000000 27742.000000 (variance: -0.848837)

The full results, sorted by variance in descending order, can be found
here: http://paste.ubuntu.com/23795023/
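To pick candidates out of that results file, a small parser along these
lines could be used. It is only a sketch: it assumes each record sits on a
single line in the format described in this comment, and the names
parse_result_line and worst_regressions are hypothetical.

```python
import re

# One record: "<file:function> <gcc 4.8 cost> <gcc 5.3.1 cost> (variance: <v>)"
RESULT_RE = re.compile(
    r"^(?P<func>\S+)\s+(?P<old>-?\d+(?:\.\d+)?)\s+(?P<new>-?\d+(?:\.\d+)?)"
    r"\s+\(variance:\s*(?P<var>-?\d+(?:\.\d+)?)\)\s*$"
)


def parse_result_line(line):
    """Return (function, old_cost, new_cost, variance), or None if no match."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        return None
    return (
        match.group("func"),
        float(match.group("old")),
        float(match.group("new")),
        float(match.group("var")),
    )


def worst_regressions(lines, threshold=0.5):
    """Return records whose variance is at least `threshold`, largest first."""
    records = [r for r in map(parse_result_line, lines) if r is not None]
    records = [r for r in records if r[3] >= threshold]
    return sorted(records, key=lambda r: r[3], reverse=True)
```

The threshold of 0.5 is an arbitrary illustrative cut-off, not a value
used in the analysis.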

Now that we have these results, we can move forward by comparing the
generated code for the functions with the largest variance and tracking
down which GCC optimization might be altering the resulting objects.

I will update this case after further investigation.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1638695

Title:
  Python 2.7.12 performance regression

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/python2.7/+bug/1638695/+subscriptions
