On 27/07/2016 3:35 AM, Papa, Florin wrote:
I am sorry, I mistakenly switched the table headers; the middle column is
actually the result for PyPy NumPyPy. The correct table is:
Benchmark      CPython NumPy   PyPy NumPyPy   PyPy NumPy
cauchy         1               5.838852812    4.866947551
pointbypoint   1               4.922654347    0.981008211
numrand        1               2.478997019    1.082185897
rowmean        1               2.512893263    1.062233015
dsums          1               33.58240465    1.013388981
vectsum        1               1.738446611    0.771660704
cauchy         1               2.168377906    0.887388291
polarcoords    1               1.030962402    0.500905427
vectsort       1               2.214586698    0.973727924
arange         1               2.045342386    0.69941044
vectoradd      1               5.447667037    1.513217941
extractint     1               1.655717606    2.671712185
float2int      1               3.1688         0.905406988
insertzeros    1               2.375043445    1.037504453

(Values are relative runtimes, normalized to CPython NumPy = 1; lower is
better.)
The results were gathered without vectorization; I will provide the
results with vectorization as soon as I have them.
Sorry again for the mistake.
Regards,
Florin
Thanks for taking the time to test this. You asked in the first message
"Is there an official benchmark suite for NumPy or a more relevant
workload to compare against CPython? What is NumPyPy's maturity /
adoption rate from your knowledge?"
There is no official numpy benchmark, since there is really no "typical"
numpy workload. Numpy is used as a common container for data processing,
and each field has its own workloads of interest: for instance, a
workload done by Caffe for neural network processing is very different
from one done by OpenCV for image processing, which is in turn different
from the natural language processing done in NLTK, even though for the
most part all three of these use numpy. There are a few numpy benchmark
suites available:
- https://github.com/serge-sans-paille/numpy-benchmarks (needs to be
  adapted to account for PyPy's slow warmup time)
- http://yarikoptic.github.io/numpy-vbench (also, AFAICT, never run on
  PyPy)
- https://bitbucket.org/mikefc/numpy-benchmark.git
I would expect numpypy to shine in cases where there is heavy use of
Python together with numpy. Your benchmarks are at the other extreme:
they demonstrate that our reimplementation of the numpy looping ufuncs
is slower than C, but they test neither the Python-numpy interaction nor
how well the JIT can optimize Python code that uses numpy. For your
tests, Richard's suggestion of turning on vectorization may show a large
improvement, since it brings numpypy's optimizations closer to the ones
done by a good C compiler. But even so, it is impressive that without
vectorization we are only 2-4 times slower than the heavily vectorized C
implementation, and that the cpyext emulation layer seems not to matter
that much in your benchmarks.
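To make the distinction concrete, here is a minimal sketch of the two
styles (hypothetical code, not taken from your suite):

    import numpy as np

    N = 1000000
    a = np.random.rand(N)

    # Ufunc-bound: a single call, with all the time spent inside the
    # looping ufunc. This is the style your benchmarks measure, so our
    # ufunc loops compete head-on with NumPy's C kernels.
    def ufunc_bound(a):
        return np.sqrt(a * a + 1.0)

    # Interaction-heavy: Python-level control flow around many small
    # numpy operations. Here the JIT can trace through the Python code,
    # which is where numpypy should have an advantage.
    def interaction_heavy(a, chunk=100):
        total = 0.0
        for i in range(0, len(a), chunk):
            c = a[i:i + chunk]
            if c[0] > 0.5:  # Python-level branch per chunk
                total += float(np.sqrt(c).sum())
            else:
                total += float(c.sum())
        return total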
In general, timeit does a bad job for pypy benchmarks, since it does not
allow for warmup time and is geared toward measuring a minimum. Your
data demonstrates some of the pitfalls of benchmarking: note that you
show two very different results for your "cauchy" benchmark. You may
want to check out the perf module (http://perf.readthedocs.io) for a
more sophisticated way of running benchmarks, or read
https://arxiv.org/abs/1602.00602, which summarizes the problems of
benchmarking.
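For example, a perf-driven version of one benchmark would look roughly
like this (a sketch; the cauchy body below is a stand-in, not your
actual kernel):

    import numpy as np
    import perf  # pip install perf

    def cauchy(n=1000):
        # Stand-in kernel building an n x n matrix from pairwise
        # differences; replace with the real benchmark body.
        x = np.arange(n, dtype=np.float64)
        return 1.0 / (np.subtract.outer(x, x) ** 2 + 1.0)

    # Runner spawns worker processes, performs warmup runs (important
    # for PyPy's JIT) and reports a mean with a standard deviation
    # instead of a bare minimum.
    runner = perf.Runner()
    runner.bench_func('cauchy', cauchy)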
In order to continue this discussion, could you create a repository with
these benchmarks and a set of instructions for reproducing them? You do
not say what platform you used, what machine you ran the tests on,
whether you used MKL/BLAS, what versions of pypy and cpython you used,
... Once we have a conveniently reproducible way to run the comparison
we may be able to make progress toward some operative conclusions, but
I'm not sure a mailing list is the best place for that these days.
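As a starting point, a small script along these lines (a sketch using
only standard introspection calls; numpypy may not implement all of
them) could live in the repository to record the environment:

    import platform
    import sys

    import numpy as np

    # Record interpreter, OS and numpy build details so results can
    # be compared across machines and across CPython/PyPy runs.
    print('interpreter:', platform.python_implementation(), sys.version)
    print('platform:   ', platform.platform())
    print('numpy:      ', np.__version__)
    # show_config() reports which BLAS/LAPACK (e.g. MKL, OpenBLAS)
    # numpy was linked against, if any.
    np.show_config()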
Matti