On Wed, 6 Jan 2021, PIERRE AUGIER wrote:
> A big issue IMHO with Cython is that Cython code is not compatible with Python and can't be interpreted, so we lose the advantage of an interpreted language in terms of development. One small change in this big extension and one needs to recompile everything.
That's a valid point, to a certain extent. However, in my experience I was always somehow able to extract individual small functions into mini-modules, and then I wrote some Makefile / setuptools glue to automate chained recompilation of whatever changed every time I ran the unit tests or the command line interface, so recompilation only kept annoying me until I got the magic to work :-)
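For the curious, the glue boils down to something like the sketch below; the package name and layout are made up, but cythonize() really does compare modification times and only recompiles the .pyx files that changed:

    # setup.py -- minimal sketch; "mypkg" and its layout are hypothetical
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        name="mypkg",
        # cythonize() tracks modification times, so only the .pyx
        # files that actually changed get regenerated and rebuilt
        ext_modules=cythonize("mypkg/*.pyx", language_level=3),
    )

Then you just put "python setup.py build_ext --inplace" in front of the test runner in your Makefile, and the rebuilds happen automatically and incrementally.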
> For me, debugging is really harder (maybe because I'm not good at debugging native code). Moreover, one actually needs to know (a bit of) C to write efficient Cython code, so it's difficult for some contributors to understand/develop Cython extensions.
I must admit that I never needed to debug anything because I was doing TDD in the first place, but you are probably right - debugging the generated monster code must be quite scary compared to pure Python code with full IDE support like PyCharm.
Anyways, call me a chauvinist, but I'd say it's just a sad fact of life that you need to know a thing or two to write correct, performance-oriented low-level numeric code.
I assume you know it anyway, and I'm sure that your worked-up summation example below was just to make a completely different point, but as a matter of fact, in your code the worst-case error grows proportionally to the number of elements in the vector (N), and the RMS error grows proportionally to the square root of N for random inputs, so in the general case the results of your computations are going to be accordingly pretty random ;-)
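To make that concrete, here is a toy demonstration (not your code, just the generic naive accumulation pattern); math.fsum() returns the correctly rounded sum, so the difference is exactly the accumulated error:

    # toy illustration of naive float accumulation error
    import math, random

    xs = [random.uniform(-1.0, 1.0) for _ in range(10_000_000)]

    naive = 0.0
    for x in xs:
        naive += x          # rounding errors accumulate, ~O(N*eps) worst case

    exact = math.fsum(xs)   # correctly rounded sum of the same values
    print("accumulated error: %.3e" % abs(naive - exact))

Kahan (compensated) summation makes the error essentially independent of N, and pairwise summation, which as far as I remember NumPy uses internally for sum(), brings the growth down to O(log N).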
Where I'm going with this is that the people who do this kind of stuff are somehow not bothered by Cython's problems, while the people who don't are rightfully bothered by valid issues - but even if they get help with those issues, will it actually help their cause :-) ? Who knows...
On top of that, again, there is the whole MPI story. I used to write Python stuff that scaled to hundreds of thousands of cores. I still did SIMD inside OpenMP threads on the local nodes on top of that, just for kicks, but actually I could have achieved a 4x speedup simply by scheduling my jobs overnight with 4x the cores instead, and saved myself the trouble. But I wanted trouble, because it was fun :-)
Cython and mpi4py make MPI almost criminally easy in Python, so once you get this far, the question becomes: does 2x or 4x on the local node actually matter at all?
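Just to show what I mean by criminally easy, a complete data-parallel sum fits in a dozen lines (the script name and sizes are made up; run with something like "mpiexec -n 4 python sum_mpi.py"):

    # each rank sums its own chunk; one reduction combines the partials
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    local = np.random.rand(10_000_000 // size)   # this rank's chunk
    partial = np.array([local.sum()])

    total = np.zeros(1)
    comm.Reduce(partial, total, op=MPI.SUM, root=0)

    if rank == 0:
        print("global sum:", total[0])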
> So my questions are: Is it technically possible to extend Python and PyPy to develop such extensions and make them very efficient? Which tools should be used? How should it be written?
It is absolutely technically possible, and it is a good idea as far as I'm concerned, but I think that the challenge lies in developing conventions for the semantics and getting people to accept them. The zoo of various accelerators / compilers / boosters for Python only proves the point that this must be the hard part.
As for a backing buffer access mechanism, cffi is definitely the right tool - PyPy can already "see through" it, as you've proven with your small example.
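In case anybody wants to play with it, the allocation pattern is as simple as this (just the generic pattern, not your example); on PyPy, the JIT turns the indexed accesses into direct memory reads and writes:

    # allocate a raw C double buffer and poke at it directly
    import cffi

    ffi = cffi.FFI()
    n = 1000
    buf = ffi.new("double[]", n)   # zero-initialized, garbage-collected

    for i in range(n):
        buf[i] = i * 0.5

    mem = ffi.buffer(buf)          # the same memory as a Python buffer,
    print(len(mem), buf[10])       # e.g. for numpy.frombuffer()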
--
Sincerely yours,
Yury V. Zaytsev