Certainly a means of fixing the "two-language problem", if it works, would be a great thing. However, the SPy documentation I've seen doesn't address what I see as some of the main issues that lead to this problem.
The fact that Python doesn't run like C (or Rust, or Fortran) is only partly due to dynamic typing. It also has to do with the programmer's ability to reason well about aspects of the algorithm like memory layout (including things like row-major vs. column-major ordering and how many levels of pointer indirection exist in an object), cache "friendliness", the cost of operations, etc. Most Python code (at least, non-"toy" Python code, as opposed to people using Python sort of like BASIC was used in the early days of computing) is built heavily on high-level, vectorized constructs that seem to actively fight reasoning about the sorts of algorithmic details that account for the majority of performance optimizations in the "fast languages".

I do recall that NumPy has functionality for specifying and querying the memory layout of an array, but I'd be confident in saying that few Python programmers, indeed even few NumPy programmers, ever learn about it, and fewer still use it. For a language like SPy that aims to close the gap between Python and compiled languages, these features need to be prominent, "first-class" aspects of the language that are routinely covered in tutorials.

It's not JUST numerical algorithms that this applies to. For instance, in Python code it's very common to pass what are essentially structs as dictionaries, with the field names as keys. To make them C-like in performance (and memory usage), these should decay to actual structs, which of course requires that the keys used to index these dicts are known at compile time at every use site, i.e. are not the values of variables. As I understand, Python already tries to do this somewhat with "qstr", but when it does this is not something a typical Python programmer is given the tools to anticipate, let alone control.
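To make the memory-layout point concrete, here's a minimal sketch using NumPy's existing (if little-known) layout controls; the array shape and the reduction axis are arbitrary choices for illustration:

```python
import numpy as np

# The same logical 2-D array in row-major (C) and
# column-major (Fortran) order.
a_c = np.zeros((1000, 1000), order="C")
a_f = np.zeros((1000, 1000), order="F")

# The layout is queryable via the flags attribute.
print(a_c.flags["C_CONTIGUOUS"])  # True
print(a_f.flags["F_CONTIGUOUS"])  # True

# Summing along axis 1 walks memory contiguously for the C-ordered
# array but strides across whole rows for the Fortran-ordered one;
# the results are identical, but the access patterns (and typically
# the cache behavior) are not.
row_sums_c = a_c.sum(axis=1)
row_sums_f = a_f.sum(axis=1)
```

These are exactly the operations where layout matters for performance, yet order= and .flags rarely show up outside of advanced material.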
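For the dict-as-struct pattern, the closest thing today's Python offers is a __slots__ class, which fixes the field set at class-creation time, roughly the property a compiler would need to prove before decaying a dict to a struct. A rough sketch (the Point type and its fields are made up for illustration):

```python
import sys

# Dict-as-struct: field names are runtime dictionary keys, so every
# access is a hash lookup and every instance carries a hash table.
point_dict = {"x": 1.0, "y": 2.0, "z": 3.0}

# A __slots__ class fixes the field names up front: no per-instance
# dict, attributes at fixed offsets -- much closer to a C struct.
class Point:
    __slots__ = ("x", "y", "z")

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

point_struct = Point(1.0, 2.0, 3.0)

# The slotted instance is smaller than the equivalent dict
# (neither size includes the float objects themselves).
print(sys.getsizeof(point_dict), sys.getsizeof(point_struct))
```

Even so, the programmer has to opt into this by hand; nothing tells you when a plain dict could have decayed to a struct and when it couldn't.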
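And as a baseline for why vectorized constructs dominate in the first place, compare an element-by-element Python loop with the equivalent slice assignment, which pushes the whole loop, bounds checks included, into NumPy's C code; the array size and repetition count are arbitrary:

```python
import numpy as np
import timeit

n = 100_000
a = np.zeros(n)

def python_loop():
    # Every iteration goes through the interpreter: index check,
    # float boxing, attribute lookups, etc.
    for i in range(n):
        a[i] = i * 2.0

def vectorized():
    # One slice assignment: the loop and its bounds checks run
    # inside NumPy's C implementation.
    a[:] = np.arange(n) * 2.0

t_loop = timeit.timeit(python_loop, number=10)
t_vec = timeit.timeit(vectorized, number=10)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```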
Also, as something of a flip side to this: since many existing Python programmers DO think in terms of high-level, vectorized constructs, having sensible default inferences in the VM for mapping them to decently fast machine code is important for programmers who do NOT want to reason about or specify the algorithmic details. This includes things like minimizing the overhead of bounds checks on containers when using iterators or assigning to slices. Some of this probably requires passing high-level aspects of the code through the syntax tree down to the machine-code compiler, so that it knows when data access patterns allow it to emit vector instructions or elide bounds checks.

None of this has directly to do with static vs. dynamic typing. It's true that dynamic typing is one of the main aspects of Python that forces its interpreter to be complex and slow, which in turn is why higher-level, vectorized code is often a must, in order to offload as much of the heavy lifting as possible to C code paths within the interpreter itself or C libraries like NumPy. But static typing by itself doesn't magically solve all of these issues, and I don't see much in the roadmap about how SPy attempts to address them.

_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]
