Author: Antonio Cuni <anto.c...@gmail.com>
Branch: extradoc
Changeset: r5968:1634e069aabb
Date: 2019-12-17 18:43 +0100
http://bitbucket.org/pypy/extradoc/changeset/1634e069aabb/

Log:    finish the draft

diff --git a/blog/draft/2019-12-hpy-sprint.rst b/blog/draft/2019-12-hpy-sprint.rst
--- a/blog/draft/2019-12-hpy-sprint.rst
+++ b/blog/draft/2019-12-hpy-sprint.rst
@@ -50,12 +50,12 @@
 .. _`HPy repository`: https://github.com/pyhandle/hpy
 
 
-CPython and universal target ABI
----------------------------------
+Target ABI
+-----------
 
 When compiling an HPy extension you can choose two different target ABI:
 
-  - **CPython ABI**: in this case, ``hpy.h`` contains a set of macros and
+  - **HPy/CPython ABI**: in this case, ``hpy.h`` contains a set of macros and
     static inline functions which translates at compilation time the HPy API
     into the standard C-API: the compiled module will have no performance
     penalty and it will have an filename like
@@ -71,7 +71,7 @@
 Universal modules can be loaded **also** on CPython, thanks to the
 ``hpy_universal`` module which is under development: because of an extra layer
 of indirection, extensions compiled with the universal ABI will face a small
-performance penalty compared to the ones using the CPython ABI.
+performance penalty compared to the ones using the HPy/CPython ABI.
 
 This setup gives several benefits:
 
@@ -149,7 +149,7 @@
     similar to the new `METH_FASTCALL` which was introduced in CPython.
 
   - HPy relies a lot on C macros, which most of the time are needed to support
-    the CPython ABI compilation mode. For example, ``HPy_DEF_METH_VARARGS``
+    the HPy/CPython ABI compilation mode. For example, ``HPy_DEF_METH_VARARGS``
     expands into a trampoline which has the correct C signature that CPython
     expects (i.e., ``PyObject (*)(PyObject *self, *PyObject *args)``) and
     which calls ``add_ints_impl``.
@@ -162,4 +162,157 @@
 Sprint report and current status
 ---------------------------------
 
-XXX finish me
+After this long preamble, here is a rough list of what we accomplished during
+the week-long sprint and the days immediately after.
+
+On the HPy side, we kicked off the code in the repo: at the moment of writing
+the layout of the directories is a bit messy because we moved things around
+several times, but we identified several main sections:
+
+  1. A specification of the API which serves both as documentation and as an
+     input for parts of the project which are automatically
+     generated. Currently, this lives in `public_api.h`_.
+
+  2. A set of header files which can be used to compile extension modules:
+     depending on whether the flag ``-DHPY_UNIVERSAL_ABI`` is passed to the
+     compiler, the extension can target the `HPy/CPython ABI`_ or the `HPy
+     Universal ABI`_
+
+  3. A `CPython extension module`_ called ``hpy_universal`` which makes it
+     possible to import universal modules on CPython
+
+  4. A set of tests_ which are independent of the implementation and are meant
+     to be an "executable specification" of the semantics.  Currently, these
+     tests are run against three different implementations of the HPy API:
+
+       - the headers which implement the "HPy/CPython ABI"
+
+       - the ``hpy_universal`` module for CPython
+
+       - the ``hpy_universal`` module for PyPy (these tests are run in the PyPy repo)
+
+Moreover, we started a `PyPy branch`_ in which to implement the
+``hpy_universal`` module: at the moment of writing PyPy can pass all the HPy
+tests apart from the ones which convert to and from ``PyObject *``.
+Among other things, this means that it is already possible to load the
+very same binary module in both CPython and PyPy, which is impressive on its
+own :).
+
+Finally, we wanted a real-life use case to show how to port a module to HPy
+and to do benchmarks.  After some searching, we chose ultrajson_, for the
+following reasons:
+
+  - it is a real-world extension module which was written with performance in
+    mind
+
+  - when parsing a JSON file it does a lot of calls to the Python API to
+    construct the various parts of the result message
+
+  - it uses only a small subset of the Python API
+
+This repo contains the `HPy port of ultrajson`_. This commit_ shows an example
+of what the porting looks like.
+
+``ujson_hpy`` is also a very good example of incremental migration: so far
+only ``ujson.loads`` is implemented using the HPy API, while ``ujson.dumps``
+is still implemented using the old C-API, and both can coexist nicely in the
+same compiled module.
+
+
+.. _`public_api.h`: https://github.com/pyhandle/hpy/blob/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/tools/public_api.h
+.. _`CPython extension module`: https://github.com/pyhandle/hpy/tree/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/cpython-universal/src
+.. _`HPy/CPython ABI`: https://github.com/pyhandle/hpy/blob/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/hpy-api/hpy_devel/include/cpython/hpy.h
+.. _`HPy Universal ABI`: https://github.com/pyhandle/hpy/blob/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/hpy-api/hpy_devel/include/universal/hpy.h
+.. _tests: https://github.com/pyhandle/hpy/tree/9aa8a2738af3fd2eda69d4773b319d10a9a5373f/test
+
+.. _`PyPy branch`: https://bitbucket.org/pypy/pypy/src/hpy/pypy/module/hpy_universal/
+
+.. _ultrajson: https://github.com/esnme/ultrajson
+.. _`HPy port of ultrajson`: https://github.com/pyhandle/ultrajson-hpy
+.. _commit: https://github.com/pyhandle/ultrajson-hpy/commit/efb35807afa8cf57db5df6a3dfd4b64c289fe907
+
+
+Benchmarks
+-----------
+
+Once we had a fully working ``ujson_hpy`` module, we could finally run
+benchmarks!  We tested several different versions of the module:
+
+  - ``ujson``: this is the vanilla implementation of ultrajson using the
+    C-API. On PyPy this is executed by the infamous ``cpyext`` compatibility
+    layer, so we expect it to be much slower than on CPython
+
+  - ``ujson_hpy``: our HPy port compiled to target the HPy/CPython ABI. We
+    expect it to be as fast as ``ujson``
+
+  - ``ujson_hpy_universal``: same as above but compiled to target the
+    Universal HPy ABI. We expect it to be slightly slower than ``ujson`` on
+    CPython, and much faster on PyPy.
+
+Finally, we also ran the benchmark using the builtin ``json`` module. This is
+not really relevant to HPy, but it might still be interesting as a
+reference data point.
+
+The benchmark_ is very simple and consists of parsing a `big JSON file`_ 100
+times. Here is the average time per iteration (in milliseconds) using the
+various versions of the module, CPython 3.7 and the latest version of the hpy
+PyPy branch:
+
++---------------------+---------+--------+
+|                     | CPython | PyPy   |
++---------------------+---------+--------+
+| ujson               | 154.32  | 633.97 |
++---------------------+---------+--------+
+| ujson_hpy           | 152.19  |        |
++---------------------+---------+--------+
+| ujson_hpy_universal | 168.78  | 207.68 |
++---------------------+---------+--------+
+| json                | 224.59  | 135.43 |
++---------------------+---------+--------+
+
+As expected, the benchmark shows that when targeting the HPy/CPython ABI, HPy
+doesn't impose any performance penalty on CPython. The universal version is
+~10% slower on CPython, but gives an impressive 3x speedup on PyPy! It is
+worth noting that the PyPy hpy module is not fully optimized yet, and we
+expect to be able to reach the same performance as CPython for this particular
+example (or even better, thanks to our better GC).
+
+All in all, not a bad result for two weeks of intense hacking :)
+
+It is also worth noting that PyPy's builtin ``json`` module does **really**
+well in this benchmark, thanks to the recent optimizations that were described
+in an `earlier blog post`_.
+
+
+.. _benchmark: https://github.com/pyhandle/ultrajson-hpy/blob/hpy/benchmark/main.py
+.. _`big JSON file`: https://github.com/pyhandle/ultrajson-hpy/blob/hpy/benchmark/download_data.sh
+.. _`earlier blog post`: https://morepypy.blogspot.com/2019/10/pypys-new-json-parser.html
+
+
+Conclusion and future directions
+---------------------------------
+
+We think we can be very satisfied with what we have achieved so far. The
+development of HPy has just started but these early results seem to indicate that
+we are on the right track to bring Python extensions into the future.
+
+At the moment, we can anticipate some of the next steps in the development of
+HPy:
+
+  - think about a proper API design: what we have done so far has
+    been a "dumb" translation of the API we needed to run ``ujson``. However,
+    one of the declared goals of HPy is to improve the design of the API. There
+    will be a trade-off between the desire to have a clean, fresh new API
+    and the need to be not too different from the old one, to make porting
+    easier.  Finding the sweet spot will not be easy!
+
+  - implement the "debug" mode, which will help developers to find
+    bugs such as leaking handles or using invalid handles
+
+  - instruct Cython to emit HPy code on request
+
+  - eventually, we will also want to try to port parts of ``numpy`` to HPy to
+    finally solve the long-standing problem of sub-optimal ``numpy``
+    performance in PyPy
+
+Stay tuned!
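
The relative figures quoted in the Benchmarks section of the draft (the
universal ABI being roughly 10% slower on CPython and roughly 3x faster than
``cpyext`` on PyPy) can be rechecked from the table. The following is only a
minimal sketch: the timings are copied verbatim from the table, while the
helper names ``slowdown_percent`` and ``speedup`` are made up for illustration
and are not part of the benchmark script. The draft does not say which CPython
baseline the "~10%" refers to, so both candidates are computed.

```python
# Average time per iteration in milliseconds, copied from the table in the draft.
# The ujson_hpy / PyPy cell is empty in the table, so it is omitted here.
TIMINGS_MS = {
    "ujson": {"CPython": 154.32, "PyPy": 633.97},
    "ujson_hpy": {"CPython": 152.19},
    "ujson_hpy_universal": {"CPython": 168.78, "PyPy": 207.68},
    "json": {"CPython": 224.59, "PyPy": 135.43},
}


def slowdown_percent(candidate_ms, baseline_ms):
    """Return how much slower candidate is than baseline, in percent.

    Hypothetical helper, introduced only for this sketch.
    """
    return (candidate_ms / baseline_ms - 1.0) * 100.0


def speedup(old_ms, new_ms):
    """Return how many times faster new is than old (hypothetical helper)."""
    return old_ms / new_ms


universal_cpython = TIMINGS_MS["ujson_hpy_universal"]["CPython"]

# Universal ABI on CPython, compared against the plain C-API build.
universal_vs_capi = slowdown_percent(universal_cpython, TIMINGS_MS["ujson"]["CPython"])

# Universal ABI on CPython, compared against the HPy/CPython ABI build.
universal_vs_hpy = slowdown_percent(universal_cpython, TIMINGS_MS["ujson_hpy"]["CPython"])

# On PyPy: C-API build running through cpyext versus the universal ABI build.
pypy_speedup = speedup(
    TIMINGS_MS["ujson"]["PyPy"],
    TIMINGS_MS["ujson_hpy_universal"]["PyPy"],
)

print(f"universal vs ujson on CPython:     {universal_vs_capi:.1f}% slower")
print(f"universal vs ujson_hpy on CPython: {universal_vs_hpy:.1f}% slower")
print(f"universal vs ujson on PyPy:        {pypy_speedup:.2f}x faster")
```

Both CPython comparisons land close to the "~10%" quoted in the text, and the
PyPy ratio is slightly above 3.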