Certainly a fix for the "two-language problem", if it works, would be a 
great thing. However, the SPy documentation I have seen doesn't address what I 
see as some of the main issues that lead to this problem.

The fact that Python doesn't run like C (or Rust, or Fortran) is only partly 
due to dynamic typing. It also has to do with the programmer's ability to 
reason about aspects of the algorithm such as memory layout (including things 
like row-major vs. column-major ordering and how many levels of pointer 
indirection exist in an object), cache "friendliness", the cost of individual 
operations, etc.
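As a concrete illustration (a minimal sketch of my own, not anything from the 
SPy docs): a plain Python list of floats involves an extra level of pointer 
indirection per element, while a Numpy array stores the raw values in one 
contiguous buffer:

```python
import sys
import numpy as np

# A Python list of floats: the list holds pointers to boxed float
# objects, so every element access follows at least one extra pointer.
xs = [float(i) for i in range(1000)]

# A Numpy array stores the raw doubles contiguously in a single buffer.
a = np.arange(1000, dtype=np.float64)

# The array's data buffer is exactly 1000 * 8 bytes...
print(a.nbytes)  # 8000

# ...while the list's pointer table alone is already larger than that,
# before even counting the boxed float objects it points to.
print(sys.getsizeof(xs) > a.nbytes)
```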

Most Python code (at least, non-"toy" Python code, as opposed to people using 
Python rather like BASIC was used in the early days of computing) is built 
heavily on high-level, vectorized constructs that seem to actively fight 
reasoning about exactly these sorts of algorithmic details, which end up 
accounting for the majority of performance optimizations in the "fast 
languages". I do recall that Numpy has functionality for specifying or 
querying the memory layout of an array, but I'd be confident in saying that 
few Python programmers, in fact even few Numpy programmers, ever learn about 
these features, and even fewer use them. For a language like SPy that aims to 
close the gap between Python and compiled languages, these need to be 
prominent, "first class" aspects of the language that are routinely covered in 
tutorials.
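To be concrete about the Numpy functionality I mean (this is ordinary Numpy, 
nothing SPy-specific): the layout of an array can be both specified and 
queried, via order=, .flags, and .strides:

```python
import numpy as np

# order='C' requests row-major layout; np.asfortranarray (or order='F')
# gives column-major.
c = np.zeros((3, 4), order='C')
f = np.asfortranarray(c)

# .flags reports which layout an array actually has.
print(c.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])  # True True

# .strides gives the byte step per axis: row-major steps 8 bytes along
# the last axis, column-major steps 8 bytes along the first.
print(c.strides)  # (32, 8)
print(f.strides)  # (8, 24)
```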

It's not JUST numerical algorithms this applies to. For instance, in Python 
code it's very common to pass what are essentially structs as dictionaries, 
with the field names as keys. To make them C-like in performance (and memory 
usage), these should decay to actual structs, which of course requires that 
the keys used to index these dicts are known at compile time at every use 
site, i.e. are not the values of variables. As I understand it, Python 
implementations already try to do something like this (e.g. CPython's 
key-sharing dicts, or MicroPython's "qstr" interning), but when it does or 
doesn't happen is not something a typical Python programmer is given the tools 
to anticipate, let alone control.
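A rough sketch of the difference in today's CPython, using __slots__ as a 
stand-in for what "decaying to a struct" could look like (the Point class here 
is just my example):

```python
import sys

# A "struct" passed as a dict: the field names are runtime string keys,
# so every access is a hash lookup and the layout is a full hash table.
point_dict = {"x": 1.0, "y": 2.0}

# With __slots__, the field names are fixed when the class is defined,
# so CPython stores the values at fixed offsets and drops the
# per-instance dict entirely -- much closer to a C struct.
class Point:
    __slots__ = ("x", "y")
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)

# The slotted instance is considerably smaller than the dict.
print(sys.getsizeof(p) < sys.getsizeof(point_dict))  # True
```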

Also, as something of a flip side to this, since many existing Python 
programmers DO think in terms of high-level, vectorized constructs, sensible 
default inferences in the VM for mapping those constructs to decently fast 
machine code are important for programmers who do NOT want to reason about or 
specify the algorithmic details. This includes things like minimizing the 
overhead of bounds checks on containers when using iterators or assigning to 
slices. Some of this probably requires passing high-level aspects of the code 
through the syntax tree down to the machine-code compiler, so it can tell when 
data access patterns allow it to emit vector instructions or elide bounds 
checks.
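For example (ordinary Numpy, just to show the kind of construct I mean): the 
slice-assignment form below states the whole access pattern up front, which is 
exactly the information a backend could use to check bounds once and emit 
vector instructions, instead of re-checking on every iteration:

```python
import numpy as np

a = np.zeros(1_000_000)
b = np.ones(1_000_000)

# Element-by-element assignment: every iteration re-checks bounds and
# boxes/unboxes a Python float.
# for i in range(len(a)):
#     a[i] = b[i] * 2.0

# The vectorized form declares the whole access pattern at once: both
# operands are contiguous and equal-length, so bounds can be checked
# once and the body compiled to SIMD.
a[:] = b * 2.0
print(a[0], a[-1])  # 2.0 2.0
```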

None of this has to do directly with static vs. dynamic typing. It's true 
that dynamic typing is one of the main aspects of Python that forces its 
interpreter to be complex and slow, which in turn is why higher-level, 
vectorized code is often a must in order to offload as much of the heavy 
lifting as possible to C code paths within the interpreter itself or to C 
libraries like Numpy. But static typing by itself doesn't magically solve all 
of these issues, and I don't see much in the roadmap about how SPy attempts 
to address them.
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]
