On 2009-11-13, at 18:02, Robert Brown wrote:

> Common Lisp and Scheme were designed by people who wanted to write complicated
> systems on machines with a tiny fraction of the horsepower of current
> workstations.  They were carefully designed to be compiled efficiently, which
> is not the case with Python.  There really is a difference here.  Python the
> language has features that make fast implementations extremely difficult.

Not true. Common Lisp was designed primarily by throwing together all of the
features in every Lisp implementation the design committee was interested in.
Although the committee members were familiar with high-performance compilation,
the primary impetus was to achieve a standardized language that would be
acceptable to the Lisp community. At the time that Common Lisp was started,
there was still some sentiment that Lisp machines were the way to go for
performance.

As for Scheme, it was designed primarily to satisfy an aesthetic of minimalism.
Even though Guy Steele's thesis project, Rabbit, was a Scheme compiler, the
point here was that relatively simple compilation techniques could produce
moderately reasonable object programs. Chez Scheme was indeed first run on
machines that we would nowadays consider tiny, but so too was C++. Oh, wait,
so was Python!

I would agree that features such as exec and eval hurt the speed of Python
programs, but the same features exact the same cost in CL and in Scheme. There
is a mystique about method dispatch, but again, the Smalltalk literature dealt
with this issue long ago: a per-call-site cache removes most of the lookup
cost.
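
As a minimal sketch (plain Python standing in for machinery that would really
live inside the interpreter; the CallSite class here is invented purely for
illustration), a Smalltalk-style monomorphic inline cache looks like this:

  class CallSite:
    # Remembers the method resolved for the last receiver type seen.
    def __init__(self, name):
      self.name = name
      self.cached_type = None
      self.cached_method = None

    def invoke(self, receiver, *args):
      tp = type(receiver)
      if tp is not self.cached_type:  # Miss: do the full lookup once.
        self.cached_method = getattr(tp, self.name)
        self.cached_type = tp
      return self.cached_method(receiver, *args)

  site = CallSite('upper')
  print(site.invoke('hello'))         # Full lookup; method is cached.
  print(site.invoke('world'))         # Hit: same receiver type, no lookup.

A JIT builds such caches transparently at run time; the point is only that
dispatch cost is an implementation problem, not a language problem.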

Using Python 3 annotations, one can imagine a Python compiler that does the
appropriate thing (shown in the comments) with the following code.

  import my_module                    # static linking

  __private_functions__ = ['my_fn']   # my_fn doesn't appear in the module dictionary.

  def my_fn(x: python.int32):         # Keeps x in a register.
    def inner(z):                     # Lambda-lifts the function (no nonlocal vars),
      return z // 2                   #   so it does not construct a closure.
    y = x + 17                        # Via flow analysis, concludes that y can be registerized.
    return inner(2 * y)               # Uses inline integer arithmetic instructions.

  def blarf(a: python.int32):
    return my_fn(a // 2)              # Because my_fn isn't exported, it can be inlined.

A new pragma statement (which I am EXPLICITLY not proposing; I respect and
support the moratorium) might be useful in telling the implementation that you
don't mind integer overflow.
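
Purely as illustration (again, not a proposal: the decorator below is a no-op
with an invented name that no implementation recognizes), such a pragma could
even be spelled without new syntax, as a marker a compiler is free to notice:

  def wrapping_arithmetic(fn):
    # Hypothetical marker: promises that int32 overflow inside fn may
    # silently wrap, so a compiler could emit raw machine arithmetic.
    return fn

  @wrapping_arithmetic
  def mix(a: int, b: int) -> int:
    return a * 31 + b                 # Eligible for inline int32 instructions.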

Similarly, new library classes might be created to hold arrays of int32s or
doubles.
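
The standard library's array module already has roughly this shape
(homogeneous, unboxed storage), though no current implementation turns element
access into raw machine loads:

  from array import array

  ints = array('i', [1, 2, 3])        # Signed C ints, stored unboxed.
  dbls = array('d', [0.5, 1.5, 2.5])  # C doubles, stored unboxed.
  ints.append(4)                      # Appending a non-integer raises TypeError.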

Obviously, no Python system does any of these things today. But there really is 
nothing stopping a Python system from doing any of these things, and the 
technology 
is well-understood in implementations of other languages. 

I am not claiming that this is _better_ than JIT. I like JIT and other runtime
mechanisms such as method caches better than these, because you don't have to
know very much about the implementation in order to take advantage of them.
But there may be some benefit in allowing programmers concerned with speed to
relax some of Python's dynamism without ruining it for the people who need a
truly dynamic language.

If I want to think about scalability seriously, I'm more concerned about
problems that Python shares with almost every modern language: if you have
lots of processors accessing a large shared memory, there is a real GC
efficiency problem as the number of processors goes up. On the other hand, if
you have a lot of processors with some degree of private memory sharing a
common bus (think the Cell processor), how do we build an efficient
implementation of ANY language for that kind of environment?

Somehow, the issues of Python seem very orthogonal to performance scalability. 

-- v

