On 2009-11-13, at 18:02, Robert Brown wrote:

> Common Lisp and Scheme were designed by people who wanted to write complicated
> systems on machines with a tiny fraction of the horsepower of current
> workstations. They were carefully designed to be compiled efficiently, which
> is not the case with Python. There really is a difference here. Python the
> language has features that make fast implementations extremely difficult.
Not true. Common Lisp was designed primarily by throwing together all of the features of every Lisp implementation the design committee was interested in. Although the committee members were familiar with high-performance compilation, the primary impetus was to produce a standardized language that would be acceptable to the Lisp community. At the time Common Lisp was started, there was still some sentiment that Lisp machines were the way to go for performance.

As for Scheme, it was designed primarily to satisfy an aesthetic of minimalism. Even though Guy Steele's thesis project, Rabbit, was a Scheme compiler, the point there was that relatively simple compilation techniques could produce moderately reasonable object programs. Chez Scheme was indeed first run on machines that we would nowadays consider tiny, but so too was C++. Oh, wait, so was Python!

I would agree that features such as exec and eval hurt the speed of Python programs, but the same features exact the same cost in CL and in Scheme. There is a mystique about method dispatch, but again, the Smalltalk literature dealt with this issue long ago; a simple method cache removes most of the lookup cost (a toy version appears in the first postscript below).

Using Python 3 annotations, one can imagine a Python compiler that does the appropriate thing (shown in the comments) with the following code.

    import my_module                   # static linking

    __private_functions__ = ['my_fn']  # my_fn doesn't appear in the module dictionary.

    def my_fn(x: python.int32):        # Keeps x in a register.
        def inner(z):                  # Lambda-lifts the function: no nonlocal vars,
            return z // 2              # so it does not construct a closure.
        y = x + 17                     # Via flow analysis, concludes that y can be registerized;
        return inner(2 * y)            # uses inline integer arithmetic instructions.

    def blarf(a: python.int32):
        return my_fn(a // 2)           # Because my_fn isn't exported, it can be inlined.

A new pragma statement (which I am EXPLICITLY not proposing; I respect and support the moratorium) might be useful in telling the implementation that you don't mind integer overflow. Similarly, new library classes might be created to hold arrays of int32s or doubles (the second postscript notes how close the standard library already comes).

Obviously, no Python system does any of these things today. But there is really nothing stopping a Python system from doing any of them, and the technology is well understood from implementations of other languages.

I am not claiming that this is _better_ than JIT. I like JIT and other runtime techniques such as method caches better, because you don't have to know very much about the implementation in order to take advantage of them. But there may be some benefit in allowing programmers concerned with speed to relax some of Python's dynamism without ruining it for the people who need a truly dynamic language.

If I want to think about scalability seriously, I'm more concerned about problems that Python shares with almost every modern language: if you have lots of processors accessing a large shared memory, there is a real GC efficiency problem as the number of processors goes up. On the other hand, if you have a lot of processors, each with some private memory, sharing a common bus (think the Cell processor), how do we build an efficient implementation of ANY language for that kind of environment? Somehow, the issues peculiar to Python seem largely orthogonal to performance scalability.

-- v
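
P.S. Since I appealed to the Smalltalk literature above, here is a toy sketch of the monomorphic inline cache described there (Deutsch and Schiffman's technique). All the names are mine, purely for illustration; a real implementation would live inside the interpreter or compiler, not in user code.

    class CallSiteCache:
        # One cache per call site: remember the last receiver class seen
        # and the method resolved for it; reuse that lookup for as long
        # as the receiver's class matches.
        def __init__(self, name):
            self.name = name      # method name this call site invokes
            self.klass = None     # last receiver class seen
            self.method = None    # method resolved for that class

        def call(self, receiver, *args):
            cls = type(receiver)
            if cls is not self.klass:             # miss: do the full lookup once
                self.klass = cls
                self.method = getattr(cls, self.name)
            return self.method(receiver, *args)   # hit: no dictionary search

    class Square:
        def __init__(self, side):
            self.side = side
        def area(self):
            return self.side * self.side

    site = CallSiteCache('area')                  # one cache per call site
    print([site.call(Square(n)) for n in (2, 3)]) # second call hits the cache

The attraction, as with JIT, is that the programmer never has to see any of this.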
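
P.P.S. On "new library classes ... to hold arrays of int32s or doubles": the standard library's array module already stores numbers unboxed, though today it mostly buys memory rather than speed; a compiler that knew about such a class could do better. A minimal example:

    from array import array

    ints = array('i', range(10))    # packed machine ints (C int, typically 32 bits)
    dbls = array('d', [0.0] * 10)   # packed C doubles

    ints[3] = 42                    # stored unboxed; boxed only when read back out
    print(ints[3] + 1, dbls[0])     # -> 43 0.0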