Is there some community interest in developing fusion-based high-performance array 
programming? Something like Accelerate 
(https://github.com/AccelerateHS/accelerate#an-embedded-language-for-accelerated-array-computations), 
but that embedded DSL is far less pleasant than Python as the surface language 
for optimized NumPy code in C. 

I imagine we might be able to transpile a NumPy program into fused LLVM IR, 
then deploy part of it as host code on CPUs and part as CUDA code on GPUs.
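For concreteness, here is a minimal sketch of what "fusion" buys (the function names are mine, and the fused version is written as an explicit loop purely to show the memory-access pattern a fusing compiler would generate; in pure Python such a loop is of course slow):

```python
import numpy as np

def unfused(a, b):
    # Plain NumPy: each intermediate (a*a, b*b, their sum, the sqrt)
    # is materialized as a full temporary array -- four passes over memory.
    return np.sqrt(a * a + b * b)

def fused_by_hand(a, b):
    # What a fusing compiler would emit: a single loop, one pass over
    # memory, no intermediate arrays.
    out = np.empty_like(a)
    for i in range(a.size):
        out[i] = (a[i] * a[i] + b[i] * b[i]) ** 0.5
    return out

a = np.array([3.0, 5.0])
b = np.array([4.0, 12.0])
print(unfused(a, b))        # [ 5. 13.]
print(fused_by_hand(a, b))  # [ 5. 13.]
```

Tools like Numba and Pythran already do this kind of fusion for array code; the question is whether it can be extended to richer data structures and to heterogeneous CPU/GPU deployment.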

I know Numba already covers the array part, but it is too limited for more 
complex non-array data structures. I was working with ~20K separate data 
series, each with some intermediate variables; compilation consumed over 
30 GB of RAM and produced no result after more than 10 hours.



> On 2020-11-24, at 23:47, PIERRE AUGIER <pierre.aug...@univ-grenoble-alpes.fr> 
> wrote:
> 
> Hi,
> 
> I recently took a bit of time to study the comment "The ecological impact of 
> high-performance computing in astrophysics" published in Nature Astronomy 
> (Zwart, 2020, https://www.nature.com/articles/s41550-020-1208-y, 
> https://arxiv.org/pdf/2009.11295.pdf), where it is stated that "Best however, 
> for the environment is to abandon Python for a more environmentally friendly 
> (compiled) programming language.".
> 
> I wrote a simple Python-Numpy implementation of the problem used for this 
> study (https://www.nbabel.org) and, accelerated by Transonic-Pythran, it's 
> very efficient. Here are some numbers (elapsed times in s, smaller is better):
> 
> | # particles |  Py | C++ | Fortran | Julia |
> |-------------|-----|-----|---------|-------|
> |     1024    |  29 |  55 |   41    |   45  |
> |     2048    | 123 | 231 |  166    |  173  |
> 
> The code and a modified figure are here: https://github.com/paugier/nbabel 
> (There is no check on the results for https://www.nbabel.org, so one still 
> has to be very careful.)
> 
> I think that the Numpy community should spend a bit of energy to show what 
> can be done with the existing tools to get very high performance (and low CO2 
> production) with Python. This work could be the basis of a serious reply to 
> the comment by Zwart (2020).
> 
> Unfortunately the Python solution in https://www.nbabel.org is very bad in 
> terms of performance (and therefore CO2 production). It is also true for most 
> of the Python solutions for the Computer Language Benchmarks Game in 
> https://benchmarksgame-team.pages.debian.net/benchmarksgame/ (codes here 
> https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else).
> 
> We could try to fix this so that people see that in many cases, it is not 
> necessary to "abandon Python for a more environmentally friendly (compiled) 
> programming language". One of the longest and hardest tasks would be to 
> implement the different cases of the Computer Language Benchmarks Game in 
> standard and modern Python-Numpy. Then, optimizing and accelerating such code 
> should be doable and we should be able to get very good performance at least 
> for some cases. Good news for this project, (i) the first point can be done 
> by anyone with good knowledge in Python-Numpy (many potential workers), (ii) 
> for some cases, there are already good Python implementations and (iii) the 
> work can easily be parallelized.
> 
> It is not a criticism, but the (beautiful and very nice) new Numpy website 
> https://numpy.org/ is not very convincing in terms of performance. It's 
> written "Performant The core of NumPy is well-optimized C code. Enjoy the 
> flexibility of Python with the speed of compiled code." It's true that the 
> core of Numpy is well-optimized C code but to seriously compete with C++, 
> Fortran or Julia in terms of numerical performance, one needs to use other 
> tools to move the compiled-interpreted boundary outside the hot loops. So it 
> could be reasonable to mention such tools (in particular Numba, Pythran, 
> Cython and Transonic).
> 
> Is there already something planned to reply to Zwart (2020)?
> 
> Any opinions or suggestions on this potential project?
> 
> Pierre
> 
> PS: Of course, alternative Python interpreters (PyPy, GraalPython, Pyjion, 
> Pyston, etc.) could also be used, especially if HPy 
> (https://github.com/hpyproject/hpy) is successful (C core of Numpy written in 
> HPy, Cython able to produce HPy code, etc.). However, I tend to be a bit 
> skeptical about the ability of such technologies to reach very high performance 
> for low-level Numpy code (performance that can be reached by replacing whole 
> Python functions with optimized compiled code). Of course, I hope I'm wrong! 
> IMHO, it does not remove the need for a successful HPy!
> 
> --
> Pierre Augier - CR CNRS                 http://www.legi.grenoble-inp.fr
> LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels
> BP53, 38041 Grenoble Cedex, France                tel:+33.4.56.52.86.16
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
