Author: Antonio Cuni <[email protected]>
Branch: extradoc
Changeset: r5905:a3f4e0ff4d99
Date: 2018-09-20 12:10 +0200
http://bitbucket.org/pypy/extradoc/changeset/a3f4e0ff4d99/

Log: write sections about benchmarks and next steps

diff --git a/blog/draft/2018-09-cpyext/cpyext.rst b/blog/draft/2018-09-cpyext/cpyext.rst
--- a/blog/draft/2018-09-cpyext/cpyext.rst
+++ b/blog/draft/2018-09-cpyext/cpyext.rst
@@ -21,6 +21,10 @@
 poor, meaning that a Python program which makes heavy use of cpyext extensions
 is likely to be slower on PyPy than on CPython.
 
+Note: in this blog post we are talking about Python 2.7 because it is still
+the default version of PyPy: however most of the implementation of cpyext is
+shared with PyPy3, so everything applies to that as well.
+
 .. _`official C API`: https://docs.python.org/2/c-api/index.html
 
 
@@ -207,7 +211,7 @@
 well.
 
 This means that in theory, passing an arbitrary Python object to C is
-potentially costly, because it involves doing a dictionary lookup. I assume
+potentially costly, because it involves doing a dictionary lookup. We assume
 that this cost will eventually show up in the profiler: however, at the time of
 writing there are other parts of cpyext which are even more costly (as we will
 show later), so the cost of the dict lookup is never evident in the
@@ -301,7 +305,7 @@
 assumptions, usually pointing at the cost of conversions between ``W_Root`` and
 ``PyObject*``, but we never actually measured it.
 
-So, I decided to write a set of `cpyext microbenchmarks`_ to measure the
+So, we decided to write a set of `cpyext microbenchmarks`_ to measure the
 performance of various operations. The result was somewhat surprising: the
 theory suggests that when you do a cpyext C call, you should pay the
 border-crossing costs only once, but what the profiler told us was that we
@@ -520,3 +524,87 @@
 .. _`list strategies`: https://morepypy.blogspot.com/2011/10/more-compact-lists-with-list-strategies.html
 .. _convert: https://bitbucket.org/pypy/pypy/src/b9bbd6c0933349cbdbfe2b884a68a16ad16c3a8a/pypy/module/cpyext/listobject.py#lines-28
 .. _`design a better C API`: https://pythoncapi.readthedocs.io/
+
+
+Current performance
+--------------------
+
+During the whole blog post we kept talking about the slowness of cpyext: how
+slow is it, exactly?
+
+We decided to concentrate on microbenchmarks_ for now: as should be evident
+by now, there are simply too many issues which can slow down a cpyext
+benchmark, and microbenchmarks help us to concentrate on one (or a few) at a
+time.
+
+The microbenchmarks measure very simple stuff, like calling functions and
+methods with the various calling conventions (no arguments, one argument,
+multiple arguments), passing various types as arguments (to measure conversion
+costs), allocating objects from C, and so on.
+
+This was the performance of PyPy 5.8, normalized to CPython 2.7 (lower is
+better):
+
+.. image:: pypy58.png
+
+PyPy was horribly slow everywhere, ranging from 2.5x to 10x slower. It is
+particularly interesting to compare ``simple.noargs``, which measures the cost
+of calling an empty function with no arguments, and ``simple.onearg(i)``,
+which measures the cost of calling an empty function passing an integer argument:
+the latter is ~2x slower than the former, indicating that the conversion cost
+of integers is huge.
+
+PyPy 5.8 was the last release before the famous Cape Town sprint, when we
+started to look at cpyext performance seriously. These are the results for
+PyPy 6.0, the latest release at the time of writing:
+
+.. image:: pypy60.png
+
+The results are amazing! PyPy is now massively faster than before, and for
+most benchmarks it is even faster than CPython: yes, you read it correctly:
+PyPy is faster than CPython at doing CPython's job, even considering all the
+extra work it has to do to emulate the C API. This happens thanks to the JIT,
+which produces speedups high enough to counterbalance the slowdown caused by
+cpyext.
+
+There are two microbenchmarks which are still slower though: ``allocate_int``
+and ``allocate_tuple``, for the reasons explained in the section about
+`Conversion costs`_.
+
+.. _microbenchmarks: https://github.com/antocuni/cpyext-benchmarks
+
+
+Next steps
+-----------
+
+Despite the spectacular results we got so far, cpyext is still slow enough to
+kill performance in most real-world code which uses C extensions extensively
+(e.g., the omnipresent numpy).
+
+Our current approach is something along these lines:
+
+  1. run a small real-world benchmark which exercises cpyext
+
+  2. measure and find the bottleneck
+
+  3. write a corresponding microbenchmark
+
+  4. optimize it
+
+  5. repeat
+
+On one hand, this is a daunting task because the C API is huge and we need to
+tackle functions one by one. On the other hand, not all the functions are
+equally important, and it is enough to optimize a relatively small subset to
+improve lots of different use cases.
+
+The biggest result is that we now have a clear picture of what the problems
+are, and we have developed some technical solutions to fix them. It is "only"
+a matter of tackling them, one by one. Moreover, keep in mind that most of
+the work was done during two sprints, for a total of 2-3 man-months.
+
+XXX: find a conclusion
+
+
+
+
diff --git a/blog/draft/2018-09-cpyext/plot.py b/blog/draft/2018-09-cpyext/plot.py
new file mode 100644
--- /dev/null
+++ b/blog/draft/2018-09-cpyext/plot.py
@@ -0,0 +1,45 @@
+def plot_benchmarks(filename, *pythons):
+    import numpy as np
+    import matplotlib
+    import matplotlib.pyplot as plt
+
+    matplotlib.rcParams['figure.figsize'] = (20,15)
+
+    data = {"CPython": {"simple.noargs": 0.43, "simple.onearg(None)": 0.45, "simple.onearg(i)": 0.44, "simple.varargs": 0.6, "simple.allocate_int": 0.46, "simple.allocate_tuple": 0.81, "obj.noargs": 0.44, "obj.onearg(None)": 0.48, "obj.onearg(i)": 0.47, "obj.varargs": 0.63, "len(obj)": 0.34, "obj[0]": 0.25},
+            "PyPy 5.8": {"simple.noargs": 1.09, "simple.onearg(None)": 1.34, "simple.onearg(i)": 2.6, "simple.varargs": 2.74, "simple.allocate_int": 2.49, "simple.allocate_tuple": 8.21, "obj.noargs": 1.27, "obj.onearg(None)": 1.55, "obj.onearg(i)": 2.85, "obj.varargs": 3.06, "len(obj)": 1.36, "obj[0]": 1.53},
+            "PyPy 5.9": {"simple.noargs": 0.16, "simple.onearg(None)": 0.2, "simple.onearg(i)": 1.61, "simple.varargs": 3.08, "simple.allocate_int": 1.69, "simple.allocate_tuple": 6.39, "obj.noargs": 1.17, "obj.onearg(None)": 1.74, "obj.onearg(i)": 3.03, "obj.varargs": 2.95, "len(obj)": 1.24, "obj[0]": 1.37},
+            "PyPy 5.10": {"simple.noargs": 0.18, "simple.onearg(None)": 0.21, "simple.onearg(i)": 1.52, "simple.varargs": 2.59, "simple.allocate_int": 1.67, "simple.allocate_tuple": 6.44, "obj.noargs": 1.12, "obj.onearg(None)": 1.41, "obj.onearg(i)": 2.62, "obj.varargs": 2.89, "len(obj)": 1.21, "obj[0]": 1.32},
+            "PyPy 6.0": {"simple.noargs": 0.18, "simple.onearg(None)": 0.2, "simple.onearg(i)": 0.22, "simple.varargs": 0.42, "simple.allocate_int": 0.89, "simple.allocate_tuple": 5.02, "obj.noargs": 0.19, "obj.onearg(None)": 0.22, "obj.onearg(i)": 0.24, "obj.varargs": 0.45, "len(obj)": 0.15, "obj[0]": 0.28}}
+
+
+
+    #pythons = data.keys()
+    #pythons = ["CPython", "PyPy 5.10", "PyPy 6.0"]
+    benchmarks = sorted(data[pythons[0]].keys())
+
+    # create plot
+    fig, ax = plt.subplots()
+    index = np.arange(len(benchmarks))
+    bar_width = 0.20
+    opacity = 0.8
+
+    colors = ('blue', 'orange', 'red') #'bgryk'
+
+    for i, python in enumerate(pythons):
+        values = [data[python][bench] for bench in benchmarks]
+        normalized = [v/data['CPython'][bench] for (v, bench) in zip(values, benchmarks)]
+        #print python, values
+        rects1 = plt.bar(index + bar_width*i, normalized, bar_width,
+                         label=python,
+                         color=colors[i])
+
+    plt.xlabel('Benchmark')
+    plt.ylabel('Time (normalized)')
+    plt.title('cpyext microbenchmarks')
+    plt.xticks(index + bar_width, benchmarks, rotation=45)
+    plt.legend()
+
+    plt.savefig(filename)
+
+plot_benchmarks("pypy58.png", "CPython", "PyPy 5.8")
+plot_benchmarks("pypy60.png", "CPython", "PyPy 5.8", "PyPy 6.0")
diff --git a/blog/draft/2018-09-cpyext/pypy58.png b/blog/draft/2018-09-cpyext/pypy58.png
new file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..e7636a1d81e6dd169c1abd67b283a93870191cf9
GIT binary patch

[cut]

diff --git a/blog/draft/2018-09-cpyext/pypy60.png b/blog/draft/2018-09-cpyext/pypy60.png
new file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..f4bc43a7c400fe0dff54421716bb150cb5ab4458
GIT binary patch

[cut]

_______________________________________________
pypy-commit mailing list
[email protected]
https://mail.python.org/mailman/listinfo/pypy-commit
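
Reader's note on the normalization used in plot.py: the script divides each timing by the CPython timing for the same benchmark, so a value above 1.0 means "slower than CPython". The sketch below redoes that arithmetic without matplotlib, on a subset of the numbers copied from the data dict in the diff. The names TIMINGS, normalize and slower_than_baseline are hypothetical helpers introduced here for illustration; they are not part of the commit.

```python
# Timings copied from the "data" dict in plot.py (a subset of the benchmarks).
TIMINGS = {
    "CPython": {"simple.noargs": 0.43, "simple.onearg(i)": 0.44,
                "simple.allocate_int": 0.46, "simple.allocate_tuple": 0.81,
                "obj[0]": 0.25},
    "PyPy 5.8": {"simple.noargs": 1.09, "simple.onearg(i)": 2.6,
                 "simple.allocate_int": 2.49, "simple.allocate_tuple": 8.21,
                 "obj[0]": 1.53},
    "PyPy 6.0": {"simple.noargs": 0.18, "simple.onearg(i)": 0.22,
                 "simple.allocate_int": 0.89, "simple.allocate_tuple": 5.02,
                 "obj[0]": 0.28},
}


def normalize(timings, python, baseline="CPython"):
    """Return {benchmark: time / baseline time}; above 1.0 means slower."""
    base = timings[baseline]
    return {bench: t / base[bench] for bench, t in timings[python].items()}


def slower_than_baseline(timings, python, baseline="CPython"):
    """Return the sorted names of benchmarks whose ratio exceeds 1.0."""
    ratios = normalize(timings, python, baseline)
    return sorted(bench for bench, ratio in ratios.items() if ratio > 1.0)


if __name__ == "__main__":
    for python in ("PyPy 5.8", "PyPy 6.0"):
        for bench, ratio in sorted(normalize(TIMINGS, python).items()):
            print("%-10s %-24s %.2fx" % (python, bench, ratio))
```

On this subset, the PyPy 5.8 ratios run from roughly 2.5 (simple.noargs) to roughly 10 (simple.allocate_tuple), in line with the range quoted in the draft. For PyPy 6.0, allocate_int and allocate_tuple stay well above 1.0, and obj[0] comes out at about 1.12 with these particular numbers, which is marginally above 1.0 as well.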
