Hi fellow Pythonistas!
A colleague of mine recently got me interested in the Julia language,
and while I was really impressed by the performance, the complete lack
of OOP, and the poor discoverability that comes with it (not to mention
all the unicode function names), leaves a lot to be desired for me.
TL;DR:
Julia uses a type-based multi-method system (multiple dispatch) which,
thanks to type stability, allows JIT compilation of very efficient code.
Since Python's single-dispatch classes are a subset of that system, it
should in principle be possible to transform code from Class.method to
method(obj: Class) *before compilation*, essentially allowing the same
optimizations. For the other arguments, the existing type annotations
could be used. The idea would be to enable this via decorators etc. for
code where it makes sense (e.g. high-volume number crunching), while
keeping the rest of the language free from JIT compiler issues and other
nuisances; see the sketch below.
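
To make that concrete, here is a minimal sketch of what such an opt-in
decorator could look like. The decorator name and behaviour are entirely
hypothetical; today it is just a no-op placeholder:

# Hypothetical opt-in marker for hot code paths. A real implementation
# would read the annotations and compile one specialization per
# concrete argument type; this placeholder does nothing.
def specialize(func):
    return func

@specialize
def norm_squared(xs: list[float]) -> float:
    # High-volume numeric loop: the kind of code worth specializing.
    total = 0.0
    for x in xs:
        total += x * x
    return total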
Very long version:
Julia is a high-level, dynamically typed language that often approaches
or matches C performance. This is due to type-based multiple dispatch
and type stability through specialization. However, the coolest part is
that the type system is set-theoretic, which makes it unnecessary to
repeat functions for related types. This page
<https://ucidatascienceinitiative.github.io/IntroToJulia/Html/WhyJulia>
and this blog post
<http://www.stochasticlifestyle.com/like-julia-scales-productive-insights-julia-developer/>
give a good explanation of how this works. *However*, since with their
multiple dispatch system everything can be written without classes, they
have opted not to include any OOP. This leads to a very large number of
functions being accessible in the global namespace and keeps you from
exploring packages in the usual, playful way that probably all of us are
used to. So the usual resort is reading documentation. What a shame ;)
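
For contrast, Python's standard library already ships the
single-dispatch subset mentioned above: functools.singledispatch picks
an implementation based on the type of the first argument only. A small
self-contained example:

from functools import singledispatch

@singledispatch
def describe(x) -> str:
    # Fallback for any type without a registered implementation.
    return f"something of type {type(x).__name__}"

@describe.register
def _(x: int) -> str:
    return f"the integer {x}"

@describe.register
def _(x: list) -> str:
    return f"a list of {len(x)} items"

print(describe(3))       # the integer 3
print(describe([1, 2]))  # a list of 2 items
print(describe(2.5))     # something of type float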
Since all of Julia needs to be JIT compiled, importing packages usually
takes a lot of time (it does some pre-compilation on first import, which
takes a lot longer, but depending on the system it's still quite a while
even afterwards). A quick search for "time-to-first-plot" or "latency"
will show that this is quite a nuisance. I believe it's perfectly fine
for some code to be interpreted, as the Python community has frequently
shown, so compiling everything might not be the best approach. Only
performance-critical code needs to be fast. In our world this is usually
done by offloading the quick stuff into C, but in that case it gets
really hard to extend those libraries (e.g. numpy has its own dispatch
system, so PyPy cannot help it as much as it could). Also, extending
numerical types for arrays is basically impossible.
So how can we try to retain usability and still gain the performance for
the code that needs it? In general I'd argue that multiple dispatch and
JIT compilation are fine, as our performance-critical code usually runs
for a long time, so the added compilation overhead is minimal. If the
code doesn't take a large amount of time, you probably don't need to
optimize it anyway. I'd argue that one way to get there would be to
"transpile" our Python code by inserting a step before compilation that
utilizes the type annotations that already exist and pulls typed
versions of the methods out of the classes. So e.g. Class.method(x: int)
would be transformed into the typed free function
method(obj: Class, x: int), from where on multiple dispatch could take
over.
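
A rough sketch of that rewrite on a toy class (the names are made up,
and the "after" function is only what a hypothetical pre-compilation
pass might emit):

class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def scaled(self, factor: int) -> "Point":
        return Point(self.x * factor, self.y * factor)

# After the hypothetical pass, the method becomes a free, fully
# annotated function that a multiple-dispatch JIT could specialize
# on the concrete signature (Point, int):
def scaled(obj: Point, factor: int) -> Point:
    return Point(obj.x * factor, obj.y * factor)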
This would have a further advantage: in Julia it can be very hard to see
which attributes a type should have and which methods necessarily have
to be defined in order to guarantee library compatibility. In OOP, we
can easily express that by defining the type as an abstract base class.
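
A minimal example using the standard abc module (the class and method
names are only illustrative):

from abc import ABC, abstractmethod

class AbstractArray(ABC):
    # Any compatible type must provide at least these two methods.
    @abstractmethod
    def __len__(self) -> int: ...

    @abstractmethod
    def __getitem__(self, index: int) -> float: ...

Instantiating a subclass that forgets one of these methods raises a
TypeError right away, instead of failing deep inside some library call.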
A lot of the statements about Julia here come from the PhD thesis of
Jeff Bezanson (one of the main Julia devs)
<https://dspace.mit.edu/handle/1721.1/99811>. It's a quick read and I
would generally recommend it.
On the one hand, I think Julia is in general more compiler-friendly than
it is user- or tooling-friendly (since methods for your type can be
defined outside of the scope of the class, it's very hard for your IDE
to tell you that the function you just misspelled doesn't exist). On the
other hand, it's quite nice to be able to create a custom array type and
just push it through all the numerical libraries in a very efficient
way, which works because you have defined all the required methods. I
think this way of speeding things up would be cooler than enabling the
static typing that many people are asking for, as we would still retain
the flexibility (or might even increase it in the case of arrays), while
greatly speeding up code.
Since I only understand all of this at a high level, I'm sure I've
greatly simplified or misrepresented something. The real hope here is
that someone smarter than me is inspired by this idea and manages to
build something cool. I'd be happy to help in any way I can, though.
Thanks for reading this far, you are a good person :*
Cheers,
Marvin