Author: fijal
Branch: extradoc
Changeset: r5951:0dd699c2378b
Date: 2019-07-24 13:54 +0200
http://bitbucket.org/pypy/extradoc/changeset/0dd699c2378b/

Log:    add a draft

diff --git a/blog/draft/2019-07-arm64.rst b/blog/draft/2019-07-arm64.rst
new file mode 100644
--- /dev/null
+++ b/blog/draft/2019-07-arm64.rst
@@ -0,0 +1,136 @@
+Hello everyone,
+
+We are pleased to announce that we have successfully ported PyPy
+to the Aarch64 (also known as 64-bit ARM) platform, thanks to funding
+provided by ARM Holdings Ltd. and Crossbar.io.
+
+We present here a benchmark run done on a Graviton A1 machine
+from AWS. A serious word of warning: Graviton A1 instances are
+virtual machines and, as such, are not suitable for benchmarking. If someone
+has access to a beefy enough (16G) ARM64 server and is willing to give
+us access to it, we are happy to redo the benchmarks on a real machine.
+My main concern is that while a vCPU maps 1-1 to a real CPU, it is not
+clear to me how caches are shared and how they cross CPU boundaries.
+
+We are not interested in comparing machines, so what we show is the
+speedup of PyPy (hg id 2417f925ce94) relative to CPython (2.7.15) on each platform.
+
+The last column is a comparison: how much we speed up on arm64 versus
+how much we speed up on x86_64 (the Aarch64 figure divided by the x86_64 one).
+Note that this is by no means a representative enough benchmark set that we
+can average anything. Read the numbers per benchmark.
+
++------------------------------+----------+----------+----------+
+|*Benchmark name*              |x86_64    |Aarch64   |relative  |
++------------------------------+----------+----------+----------+
+|ai                            |5.66      |5.34      |0.94      |
++------------------------------+----------+----------+----------+
+|bm_chameleon                  |2.85      |6.57      |2.30      |
++------------------------------+----------+----------+----------+
+|bm_dulwich_log                |1.98      |1.34      |0.68      |
++------------------------------+----------+----------+----------+
+|bm_krakatau                   |1.20      |0.69      |0.58      |
++------------------------------+----------+----------+----------+
+|bm_mako                       |4.88      |6.38      |1.31      |
++------------------------------+----------+----------+----------+
+|bm_mdp                        |0.82      |0.74      |0.90      |
++------------------------------+----------+----------+----------+
+|chaos                         |25.40     |25.52     |1.00      |
++------------------------------+----------+----------+----------+
+|crypto_pyaes                  |32.35     |31.92     |0.99      |
++------------------------------+----------+----------+----------+
+|deltablue                     |1.60      |1.48      |0.93      |
++------------------------------+----------+----------+----------+
+|django                        |14.15     |13.71     |0.97      |
++------------------------------+----------+----------+----------+
+|eparse                        |1.43      |1.12      |0.78      |
++------------------------------+----------+----------+----------+
+|fannkuch                      |4.83      |6.53      |1.35      |
++------------------------------+----------+----------+----------+
+|float                         |8.43      |8.16      |0.97      |
++------------------------------+----------+----------+----------+
+|genshi_text                   |3.70      |3.61      |0.98      |
++------------------------------+----------+----------+----------+
+|genshi_xml                    |2.97      |1.64      |0.55      |
++------------------------------+----------+----------+----------+
+|go                            |2.77      |2.47      |0.89      |
++------------------------------+----------+----------+----------+
+|hexiom2                       |9.35      |8.03      |0.86      |
++------------------------------+----------+----------+----------+
+|html5lib                      |2.88      |1.93      |0.67      |
++------------------------------+----------+----------+----------+
+|json_bench                    |2.85      |2.81      |0.99      |
++------------------------------+----------+----------+----------+
+|meteor-contest                |2.21      |2.27      |1.03      |
++------------------------------+----------+----------+----------+
+|nbody_modified                |9.86      |8.59      |0.87      |
++------------------------------+----------+----------+----------+
+|nqueens                       |1.12      |1.02      |0.91      |
++------------------------------+----------+----------+----------+
+|pidigits                      |0.99      |0.62      |0.63      |
++------------------------------+----------+----------+----------+
+|pyflate-fast                  |3.86      |4.62      |1.20      |
++------------------------------+----------+----------+----------+
+|pypy_interp                   |2.12      |2.03      |0.95      |
++------------------------------+----------+----------+----------+
+|pyxl_bench                    |1.72      |1.37      |0.80      |
++------------------------------+----------+----------+----------+
+|raytrace-simple               |58.86     |44.21     |0.75      |
++------------------------------+----------+----------+----------+
+|richards                      |52.68     |44.90     |0.85      |
++------------------------------+----------+----------+----------+
+|rietveld                      |1.52      |1.28      |0.84      |
++------------------------------+----------+----------+----------+
+|spambayes                     |1.87      |1.58      |0.85      |
++------------------------------+----------+----------+----------+
+|spectral-norm                 |21.38     |20.28     |0.95      |
++------------------------------+----------+----------+----------+
+|spitfire                      |1.28      |2.77      |2.17      |
++------------------------------+----------+----------+----------+
+|spitfire_cstringio            |7.84      |7.42      |0.95      |
++------------------------------+----------+----------+----------+
+|sqlalchemy_declarative        |1.76      |1.05      |0.60      |
++------------------------------+----------+----------+----------+
+|sqlalchemy_imperative         |0.63      |0.60      |0.95      |
++------------------------------+----------+----------+----------+
+|sqlitesynth                   |1.17      |1.00      |0.86      |
++------------------------------+----------+----------+----------+
+|sympy_expand                  |1.32      |1.25      |0.95      |
++------------------------------+----------+----------+----------+
+|sympy_integrate               |1.10      |1.01      |0.91      |
++------------------------------+----------+----------+----------+
+|sympy_str                     |0.65      |0.62      |0.95      |
++------------------------------+----------+----------+----------+
+|sympy_sum                     |1.87      |1.79      |0.96      |
++------------------------------+----------+----------+----------+
+|telco                         |30.38     |19.09     |0.63      |
++------------------------------+----------+----------+----------+
+|twisted_iteration             |13.24     |8.95      |0.68      |
++------------------------------+----------+----------+----------+
+|twisted_names                 |5.27      |3.31      |0.63      |
++------------------------------+----------+----------+----------+
+|twisted_pb                    |5.85      |2.90      |0.50      |
++------------------------------+----------+----------+----------+
+|twisted_tcp                   |3.03      |2.08      |0.68      |
++------------------------------+----------+----------+----------+
+
+Note that we see a wide variance. There are generally three groups of
+benchmarks: those with more or less the same speedup as on x86_64, those
+with about 2x the speedup, and those with about 0.5x the speedup of x86_64.
+
+This can be related to a variety of issues, mostly differences
+in architecture. What *is* interesting, however, is that compared to older
+ARM boards, the branch predictor got a lot better, which means the speedups
+will be smaller, since "sophisticated" branchy code, like the source code of
+CPython, just runs a lot faster.
+
+One takeaway here is that there is a lot of room for improvement in PyPy.
+This is true for both of the above platforms, but probably more so for Aarch64,
+which comes with a lot of registers. The PyPy backend was written with
+x86 (the 32-bit variant) in mind, which is very register-poor. We think we can
+improve somewhat in the area of emitting more modern code, and it will probably
+make somewhat more difference on Aarch64 than on x86_64, where running old
+crappy code efficiently has been a massive focus.
+
+Best,
+Maciej Fijalkowski, Armin Rigo and the PyPy team