Re: python simply not scaleable enough for google?

2009-11-23 Thread Robin Becker

sturlamolden wrote:

On 20 Nov, 11:12, Robin Becker ro...@reportlab.com wrote:


Presumably that means they could potentially run in parallel on the 10 cpu
machines of the future.

I'm not so clear on whether the threadless tasklets will run on separate cpus.


You can make a user-space scheduler and run 10 tasklets on a
threadpool. But there is a GIL in Stackless as well.

Nobody wants 10 OS threads, not with Python, not with Go, not with
C.

Also note that Windows has native support for tasklets, regardless
of language. They are called fibers (as opposed to threads) and
are created using the CreateFiber system call. I would not be
surprised if Unixes have this as well. We do not need Stackless for
light-weight threads. We can just take the Python threading module's C
code and replace CreateThread with CreateFiber.


...

I'm not really sure about all the parallelism that will actually be achievable, but
apparently the goroutines are multiplexed onto native threads by the runtime.
Each real thread is run until it blocks, and then another goroutine is
allowed to make use of the thread. Apparently the gccgo runtime has one goroutine
per thread and differs from the fast compilers in this respect.

--
Robin Becker
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-21 Thread Nobody
On Fri, 20 Nov 2009 09:51:49 -0800, sturlamolden wrote:

 You can make a user-space scheduler and run 10 tasklets on a
 threadpool. But there is a GIL in Stackless as well.
 
 Nobody wants 10 OS threads, not with Python, not with Go, not with
 C.
 
 Also note that Windows has native support for tasklets, regardless
 of language. They are called fibers (as opposed to threads) and
 are created using the CreateFiber system call. I would not be
 surprised if Unixes have this as well. We do not need Stackless for
 light-weight threads. We can just take the Python threading module's C
 code and replace CreateThread with CreateFiber.

POSIX.1-2001 and POSIX.1-2004 have makecontext(), setcontext(),
getcontext() and swapcontext(), but they were obsoleted by POSIX.1-2008.

They are available on Linux; I don't know about other Unices.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-20 Thread Aahz
In article mailman.224.1257933469.2873.python-l...@python.org,
Robert P. J. Day rpj...@crashcourse.ca wrote:

http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1

  thoughts?

Haven't seen this elsewhere in the thread:

http://dalkescientific.com/writings/diary/archive/2009/11/15/10_tasklets.html
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.  --Brian W. Kernighan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-20 Thread Robin Becker

Aahz wrote:

In article mailman.224.1257933469.2873.python-l...@python.org,
Robert P. J. Day rpj...@crashcourse.ca wrote:

http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1

 thoughts?


Haven't seen this elsewhere in the thread:

http://dalkescientific.com/writings/diary/archive/2009/11/15/10_tasklets.html



I looked at this and it looks very good, in that Stackless appears twice as fast
as go(lang). (I used to be in the Department of Computing at Imperial, so I
suppose I have to side with McCabe.)


Anyhow, my reading of why Pike was so proud of his set up and tear down of the 
tasks example was that these were real threads.


Presumably that means they could potentially run in parallel on the 10 cpu 
machines of the future.


I'm not so clear on whether the threadless tasklets will run on separate cpus.
--
Robin Becker

--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-20 Thread sturlamolden
On 20 Nov, 11:12, Robin Becker ro...@reportlab.com wrote:

 Presumably that means they could potentially run in parallel on the 10 cpu
 machines of the future.

 I'm not so clear on whether the threadless tasklets will run on separate cpus.

You can make a user-space scheduler and run 10 tasklets on a
threadpool. But there is a GIL in Stackless as well.

Nobody wants 10 OS threads, not with Python, not with Go, not with
C.

Also note that Windows has native support for tasklets, regardless
of language. They are called fibers (as opposed to threads) and
are created using the CreateFiber system call. I would not be
surprised if Unixes have this as well. We do not need Stackless for
light-weight threads. We can just take the Python threading module's C
code and replace CreateThread with CreateFiber.
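[Editor's illustration, not part of the original post: the user-space scheduler idea above can be sketched in plain CPython using generators as stand-in tasklets, with no Stackless or fibers required. The names `tasklet` and `run` are invented for this sketch, and everything here still runs on one OS thread under the GIL.]

```python
def tasklet(name, steps):
    """A tasklet as a generator: each yield is a cooperative switch point."""
    for i in range(steps):
        yield f"{name}:{i}"

def run(tasklets):
    """Round-robin user-space scheduler: one OS thread, many tasklets."""
    trace = []
    ready = list(tasklets)
    while ready:
        task = ready.pop(0)
        try:
            trace.append(next(task))  # run the tasklet to its next yield
            ready.append(task)        # re-queue it until exhausted
        except StopIteration:
            pass                      # tasklet finished
    return trace

print(run([tasklet("a", 2), tasklet("b", 3)]))
# ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Scheduling happens entirely in user space, which is exactly why such tasklets are cheap to create but cannot, by themselves, use more than one CPU.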

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-20 Thread Aahz
In article de86d30e-c9c1-4d30-9e62-d043b78ea...@a31g2000yqn.googlegroups.com,
sturlamolden  sturlamol...@yahoo.no wrote:

Also note that Windows has native support for tasklets, regardless
of language. They are called fibers (as opposed to threads) and are
created using the CreateFiber system call. I would not be surprised if
Unixes have this as well. We do not need Stackless for light-weight
threads. We can just take the Python threading module's C code and
replace CreateThread with CreateFiber.

Are you advocating a high-fiber diet?
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.  --Brian W. Kernighan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-20 Thread Aaron Watters

 Because `language is slow' is meaningless.

Yes.  Everyone knows Java is faster than Python, right?
But look here:

http://gaejava.appspot.com/

(you might want to run it a couple times to see what
it does when it is 'warm').  I don't think this is
a biased test -- I think the author expected to see
Java faster.

In my runs Python is consistently a bit faster on a majority
of these metrics.  Why?  My guess is that
Python is less bloated than Java, so
more of it can stay resident on a shared machine,
whereas big hunks of Java have to be swapped in for
every access -- but it's just a guess.

By the way: I see this all the time -- for web use
Python always seems to be faster than Java in my
experience.  (With programs that I write: I tend
to avoid some of the larger Python web tools available
out there and code close to the protocol.)

Comparing language platforms using small
numeric benchmarks often completely misses the
point.

   -- Aaron Watters
  http://whiffdoc.appspot.com
  http://listtree.appspot.com

===
an apple every 8 hours will keep 3 doctors
away.  - kliban
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-20 Thread sturlamolden
On 20 Nov, 22:45, a...@pythoncraft.com (Aahz) wrote:

 Are you advocating a high-fiber diet?

Only if you are a ruminant.

No really...

Windows has user-space threads natively. But you must reserve some
stack space for them (from virtual memory), which mainly makes them
useful on 64-bit systems.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-18 Thread sturlamolden
On 18 Nov, 00:31, Terry Reedy tjre...@udel.edu wrote:

 The
 problem for the future is the switch to multiple cores for further speedups.

The GIL is not a big problem for scientists. Scientists are not as
dependent on threads as the Java/web-developer crowd:

- We are used to running multiple processes with MPI.

- Numerical libraries written in C/Fortran/assembler will often release
the GIL. Python threads are then fine for multicores.

- Numerical libraries can be written or compiled for multicores, e.g.
using OpenMP or special compilers. If FFTW is compiled for multiple
cores, it does not matter that Python has a GIL. LAPACK will use
multiple cores if you use MKL or GotoBLAS, regardless of the GIL.
Etc.

- A scientist used to MATLAB will think MEX function (i.e. C or
Fortran) if something is too slow. A web developer used to Java will
think multithreading.
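[Editor's illustration, not part of the original post: the multiple-processes point can be shown with the standard-library multiprocessing module rather than MPI itself. Each worker is a separate process with its own interpreter and its own GIL. The `partial_sum` function and four-way chunking are invented for this sketch.]

```python
from multiprocessing import Pool

def partial_sum(bounds):
    """Sum of squares over [lo, hi): pure CPU work, one chunk per process."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n = 1_000_000
    step = n // 4
    chunks = [(i, i + step) for i in range(0, n, step)]
    with Pool(4) as pool:  # four independent interpreters, no shared GIL
        total = sum(pool.map(partial_sum, chunks))
    assert total == sum(i * i for i in range(n))
    print("ok")
```

Because nothing is shared, this scales across cores exactly the way MPI jobs do, at the cost of explicit data partitioning.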


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-18 Thread sturlamolden
On 18 Nov, 00:24, greg g...@cosc.canterbury.ac.nz wrote:

 NumPy, for example, is *extremely* flexible. Someone put
 in the effort, once, to write it and make it fast -- and
 now an endless variety of programs can be written very easily
 in Python to make use of it.

I'm quite sure David Cournapeau knows about NumPy...

By the way, NumPy is not particularly fast because of the way it is
written. Its performance is hampered by the creation of temporary
arrays. But NumPy provides a flexible way of managing memory in
scientific programs.
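[Editor's illustration, not part of the original post: to show the temporary-array point without depending on NumPy, here is a toy vector class -- entirely hypothetical, not NumPy's actual implementation -- that counts allocations. An expression like a + b + c materializes an intermediate result for each binary operation, while in-place updates reuse the existing buffer.]

```python
class Vec:
    """Toy vector that counts allocations, mimicking how expressions
    like a + b + c materialize intermediate arrays in NumPy."""
    allocations = 0

    def __init__(self, data):
        Vec.allocations += 1
        self.data = list(data)

    def __add__(self, other):
        # Each binary + builds a brand-new vector: a temporary.
        return Vec(x + y for x, y in zip(self.data, other.data))

    def __iadd__(self, other):
        # In-place += reuses the existing buffer: no temporary.
        for i, y in enumerate(other.data):
            self.data[i] += y
        return self

a, b, c = Vec([1, 2]), Vec([3, 4]), Vec([5, 6])
Vec.allocations = 0
d = a + b + c            # two temporaries: (a+b), then (a+b)+c
print(Vec.allocations)   # 2

Vec.allocations = 0
a += b; a += c           # zero new vectors
print(Vec.allocations)   # 0
```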

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Aaron Watters

 I don't think Python and Go address the same set of programmer
 desires.  For example, Go has a static type system.  Some programmers
 find static type systems to be useless or undesirable.  Others find
 them extremely helpful and want to use them.  If you're a
 programmer who wants a static type system, you'll probably prefer Go
 to Python, and vice versa.  That has nothing to do with implementation
 speed or development expenditures.  If Google spent a million dollars
 adding static types to Python, it wouldn't be Python any more.

... and I still have an issue with the whole "Python is slow"
meme.  The reason NASA doesn't build a faster Python is because
Python *when augmented with FORTRAN libraries that have been
tested and optimized for decades and are worth billions of dollars
and don't need to be rewritten* is very fast.

The reason they don't replace the Python drivers with Java is
because that would be very difficult and just stupid and I'd be
willing to bet that when they were done the result would actually
be *slower* especially when you consider things like process
start-up time.

And when someone implements a Mercurial replacement in GO (or C#
or Java) which is faster and more useful than Mercurial, I'll
be very impressed.  Let me know when it happens (but I'm not
holding my breath).

By the way if it hasn't happened and if he isn't afraid
of public speaking someone should invite Matt Mackall
to give a Python conference keynote.  Or how about
Bram Cohen for that matter...

   -- Aaron Watters http://listtree.appspot.com/

===
if you want a friend, get a dog.  -Truman

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread David Cournapeau
On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters aaron.watt...@gmail.com wrote:

 I don't think Python and Go address the same set of programmer
 desires.  For example, Go has a static type system.  Some programmers
 find static type systems to be useless or undesirable.  Others find
 them extremely helpful and want to use them.  If you're a
 programmer who wants a static type system, you'll probably prefer Go
 to Python, and vice versa.  That has nothing to do with implementation
 speed or development expenditures.  If Google spent a million dollars
 adding static types to Python, it wouldn't be Python any more.

 ... and I still have an issue with the whole Python is slow
 meme.  The reason NASA doesn't build a faster Python is because
 Python *when augmented with FORTRAN libraries that have been
 tested and optimized for decades and are worth billions of dollars
 and don't need to be rewritten* is very fast.

It is a bit odd to dismiss "python is slow" by saying that you can
extend it with fortran. One of the most significant points of python,
IMO, is its readability, even for people not familiar with it, and
that's important when doing scientific work. Relying on a lot of
compiled libraries goes against it.

I think that python with its scientific extensions is a fantastic
tool, but I would certainly not mind if it were ten times faster. In
particular, the significant cost of function calls makes it quickly
unusable for code which cannot be easily vectorized - we have to
resort to using C, etc... to circumvent this ATM.
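[Editor's illustration, not part of the original post: the function-call cost David mentions is easy to demonstrate with the standard library alone. The measured ratio varies by machine and interpreter, so the sketch below asserts only correctness and prints whatever ratio it observes.]

```python
import timeit

def f(x):
    return x + 1

def with_calls(n):
    total = 0
    for _ in range(n):
        total = f(total)   # one Python-level function call per element
    return total

def inlined(n):
    total = 0
    for _ in range(n):
        total = total + 1  # identical work, no call
    return total

n = 100_000
assert with_calls(n) == inlined(n) == n
t_call = timeit.timeit(lambda: with_calls(n), number=10)
t_inline = timeit.timeit(lambda: inlined(n), number=10)
print(f"call overhead factor: {t_call / t_inline:.1f}x")
```

It is exactly this per-call overhead that vectorized numpy code amortizes over whole arrays, and that bites when the computation cannot be vectorized.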

Another point which has not been mentioned much, maybe because it is
obvious: it seems that it is possible to make high-level languages
quite fast, but doing so while keeping memory usage low is very
difficult. Incidentally, the same tradeoff appears when working with
vectorized code in numpy/scipy.

David
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Paul Boddie
On 17 Nov, 14:48, Aaron Watters aaron.watt...@gmail.com wrote:

 ... and I still have an issue with the whole Python is slow
 meme.  The reason NASA doesn't build a faster Python is because
 Python *when augmented with FORTRAN libraries that have been
 tested and optimized for decades and are worth billions of dollars
 and don't need to be rewritten* is very fast.

That's why I wrote that Python's extensibility using C, C++ and
Fortran has helped adoption of the language considerably. Python was
particularly attractive to early adopters precisely because of the
scripting functionality it could give to existing applications. But
although there are some reasonable solutions for writing the
bottlenecks of a system in lower-level programming languages, it can
be awkward if those bottlenecks aren't self-contained components, or
if the performance issues permeate the entire system.

[...]

 And when someone implements a Mercurial replacement in GO (or C#
 or Java) which is faster and more useful than Mercurial, I'll
 be very impressed.  Let me know when it happens (but I'm not
 holding my breath).

Mercurial is a great example of a Python-based tool with good
performance. However, it's still interesting to consider why the
implementers chose to rewrite precisely those parts that are now
implemented in C. I'm sure many people have had the experience of
looking at a piece of code, being quite certain of what that code
does, and yet wondering why it's so inefficient in vanilla Python.
It's exactly this kind of issue that has never really been answered
convincingly, other than by claims that Python must be "that dynamic
and no less" or that "it's doing so much more than you think", leaving
people to try to mitigate the design issues using clever implementation
techniques as best they can.

 By the way if it hasn't happened and if he isn't afraid
 of public speaking someone should invite Matt Mackall
 to give a Python conference keynote.  Or how about
 Bram Cohen for that matter...

Bryan O'Sullivan gave a talk on Mercurial at EuroPython 2006, and
although I missed that talk for various reasons beyond my control, I
did catch his video lightning talk which emphasized performance.
That's not to say that we couldn't do with more talks of this nature
at Python conferences, however.

Paul
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Rustom Mody
Language L is (in)efficient. No! Only implementations are (in)efficient

I am reminded of a personal anecdote.  It happened about 20 years ago
but is still fresh and this thread reminds me of it.

I was attending some workshop on theoretical computer science.
I gave a talk on Haskell.

I showed off all the good stuff -- pattern matching, lazy lists,
infinite data structures, etc.
Somebody asked me: "Isn't all this very inefficient?"
Now at that time I was a strong adherent of the Dijkstra religion, and
this viewpoint (efficiency has nothing to do with languages, only
implementations) traces to him. So I quoted that.

Slowly the venerable P S Thiagarajan got up and asked me:
"Let's say that I have a language with a type 'Proposition'.
And I have an operation on propositions called sat [sat(p) returns
true if p is satisfiable]..."

I won't complete the tale other than to say that I've never had the
wind taken out of my sails so completely!

So Vincent? I wonder what you would have said in my place?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread J Kenneth King
David Cournapeau courn...@gmail.com writes:

 On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters aaron.watt...@gmail.com 
 wrote:

 I don't think Python and Go address the same set of programmer
 desires.  For example, Go has a static type system.  Some programmers
 find static type systems to be useless or undesirable.  Others find
 them extremely helpful and want to use them.  If you're a
 programmer who wants a static type system, you'll probably prefer Go
 to Python, and vice versa.  That has nothing to do with implementation
 speed or development expenditures.  If Google spent a million dollars
 adding static types to Python, it wouldn't be Python any more.

 ... and I still have an issue with the whole Python is slow
 meme.  The reason NASA doesn't build a faster Python is because
 Python *when augmented with FORTRAN libraries that have been
 tested and optimized for decades and are worth billions of dollars
 and don't need to be rewritten* is very fast.

 It is a bit odd to dismiss python is slow by saying that you can
 extend it with fortran. One of the most significant points of python
 IMO is its readability, even for people not familiar with it, and
 that's important when doing scientific work. Relying on a lot of
 compiled libraries goes against it.

 I think that python with its scientific extensions is a fantastic
 tool, but I would certainly not mind if it were ten times faster. In
 particular, the significant cost of function calls makes it quickly
 unusable for code which cannot be easily vectorized - we have to
 resort to using C, etc... to circumvent this ATM.

 Another point which has not been mentioned much, maybe because it is
 obvious: it seems that it is possible to makes high level languages
 quite fast, but doing so while keeping memory usage low is very
 difficult. Incidentally, the same tradeoff appears when working with
 vectorized code in numpy/scipy.

I think this is the only interesting point in the whole conversation so
far.

It is possible for highly dynamic languages to be optimized, compiled,
and run really fast.

The recent versions of SBCL can compile Common Lisp into really fast and
efficient binaries.  And Lisp could be considered even more dynamic than
Python (but that is debatable and I have very little evidence... so
take that statement with a grain of salt).  It's possible, it just
hasn't been done yet.

PyPy is getting there, but development is slow and they could probably
use a hand.  Instead of waiting on the sidelines for a company to back
PyPy development, the passionate Python programmers worth their salt who
care about Python development should contribute at least a patch or two.

The bigger problem, though, is probably attention span.  A lot of
developers today are more apt to simply try the next new language than
to roll up their sleeves and think deeply enough to improve the tools
they're already invested in.


 David
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread David Cournapeau
On Wed, Nov 18, 2009 at 5:48 AM, Paul Rubin
http://phr...@nospam.invalid wrote:


 What about Git?  Some people prefer it.

Git is an interesting example, because it both really pushes
performance into its core structure and has reasonably complete
implementations in other languages. In particular, jgit is
implemented in Java by one of the core git developers; here is what he
has to say:

http://marc.info/?l=gitm=124111702609723w=2

I found the comment on optimizing 5% here and 5% there interesting.
It is often claimed that optimization should be done after having
found the hotspot, but that does not always apply, and I think git is
a good example of that.

In those cases, using python as the main language does not work well,
at least in my experience. Rewriting the slow parts in a compiled
language only works if you can identify the slow parts, and even in
numerical code, that's not always possible (this tends to happen when
you need to deal with many objects interacting together, for example).

David
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread greg

David Cournapeau wrote:


It is a bit odd to dismiss python is slow by saying that you can
extend it with fortran. One of the most significant points of python
IMO is its readability, even for people not familiar with it, and
that's important when doing scientific work. Relying on a lot of
compiled libraries goes against it.


If it were necessary to write a new compiled library every
time you wanted to solve a new problem, that would be true.
But it's not like that if you pick the right libraries.

NumPy, for example, is *extremely* flexible. Someone put
in the effort, once, to write it and make it fast -- and
now an endless variety of programs can be written very easily
in Python to make use of it.

--
Greg
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Terry Reedy

David Cournapeau wrote:

On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters aaron.watt...@gmail.com wrote:

I don't think Python and Go address the same set of programmer
desires.  For example, Go has a static type system.  Some programmers
find static type systems to be useless or undesirable.  Others find
them extremely helpful and want to use them.  If you're a
programmer who wants a static type system, you'll probably prefer Go
to Python, and vice versa.  That has nothing to do with implementation
speed or development expenditures.  If Google spent a million dollars
adding static types to Python, it wouldn't be Python any more.

... and I still have an issue with the whole Python is slow
meme.  The reason NASA doesn't build a faster Python is because
Python *when augmented with FORTRAN libraries that have been
tested and optimized for decades and are worth billions of dollars
and don't need to be rewritten* is very fast.


It is a bit odd to dismiss python is slow by saying that you can
extend it with fortran.


I find it a bit odd that people are so resistant to evaluating Python as
it was designed to be. As Guido designed the language, he designed the
implementation to be open and easily extended by assembler, Fortran, and
C. No one carps about the fact that dictionary key lookup, say, is written
in (optimized) C rather than pretty Python. Why should Basic Linear
Algebra Subroutines (BLAS) be any different?



One of the most significant points of python
IMO is its readability, even for people not familiar with it, and
that's important when doing scientific work.


It is readable by humans because it was designed for that purpose.

 Relying on a lot of compiled libraries goes against it.

On the contrary, Python could be optimized for human readability because
it was expected that heavy computation would be delegated to other code.
There is no need for scientists to read the optimized code in BLAS,
LINPACK, and FFTPACK, written in assembler, Fortran, and/or C and
incorporated in Numpy.


It is unfortunate that there is not yet a version of Numpy for Python
3.1. That is what 3.1 most needs in order to run as fast as intended.



I think that python with its scientific extensions is a fantastic
tool, but I would certainly not mind if it were ten times faster.


Python today is at least 100x as fast as 1.4 (my first version) was in 
its time. Which is to say, Python today is as fast as C was then. The 
problem for the future is the switch to multiple cores for further speedups.


Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread greg

David Cournapeau wrote:


It is often claimed that optimization should be done after having
found the hotspot, but that does not always apply


It's more that if you *do* have a hotspot, you had better
find it and direct your efforts there first. E.g. if there is
a hotspot taking 99% of the time, then optimising elsewhere
can't possibly improve the overall time by more than 1% at
the very most.

Once there are no hotspots left, then there may be further
spread-out savings to be made.
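[Editor's illustration, not part of the original post: Greg's point is Amdahl's law, and a one-line Python sketch makes the arithmetic concrete. The 99%/1% figures are taken from the paragraph above.]

```python
def max_speedup(hotspot_fraction, hotspot_speedup):
    """Amdahl's law: overall speedup when only a fraction of the
    runtime is accelerated by the given factor."""
    return 1.0 / ((1.0 - hotspot_fraction) + hotspot_fraction / hotspot_speedup)

# Optimizing a 99% hotspot, even infinitely, caps the win at 100x...
print(round(max_speedup(0.99, 1e9)))      # 100
# ...while optimizing the remaining 1% gains at most about 1%.
print(round(max_speedup(0.01, 1e9), 3))   # 1.01
```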

--
Greg
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Wolfgang Rohdewald
On Wednesday 18 November 2009, Terry Reedy wrote:
 Python today is at least 100x as fast as 1.4 (my first version) was
  in  its time. Which is to say, Python today is as fast as C was
  then

on the same hardware? That must have been a very buggy C compiler.
Or was it a C interpreter?


-- 
Wolfgang
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread David Cournapeau
On Wed, Nov 18, 2009 at 8:31 AM, Terry Reedy tjre...@udel.edu wrote:
 David Cournapeau wrote:

 On Tue, Nov 17, 2009 at 10:48 PM, Aaron Watters aaron.watt...@gmail.com
 wrote:

 I don't think Python and Go address the same set of programmer
 desires.  For example, Go has a static type system.  Some programmers
 find static type systems to be useless or undesirable.  Others find
 them extremely helpful and want to use them.  If you're a
 programmer who wants a static type system, you'll probably prefer Go
 to Python, and vice versa.  That has nothing to do with implementation
 speed or development expenditures.  If Google spent a million dollars
 adding static types to Python, it wouldn't be Python any more.

 ... and I still have an issue with the whole Python is slow
 meme.  The reason NASA doesn't build a faster Python is because
 Python *when augmented with FORTRAN libraries that have been
 tested and optimized for decades and are worth billions of dollars
 and don't need to be rewritten* is very fast.

 It is a bit odd to dismiss python is slow by saying that you can
 extend it with fortran.

 I find it a bit odd that people are so resistant to evaluating Python as it
 was designed to be. As Guido designed the language, he designed the
 implementation to be open and easily extended by assembler, Fortran, and C.

I am well aware of that fact - that's one of the major reason why I
decided to go the python route a few years ago instead of matlab,
because matlab C api is so limited.

 No one carps about the fact that dictionary key lookup, say, is written in
 (optimized) C rather than pretty Python. Why should Basic Linear Algebra
 Subroutines (BLAS) be any different?

BLAS/LAPACK explicitly contains stuff that can easily be factored out
into a library. Linear algebra in general works well because the basic
data structures are well understood. You can deal with those as black
boxes most of the time (I, for example, have no idea how most of the
LAPACK algorithms work, except for the simple ones). But that's not
always the case for numerical computations. Sometimes, you need to be
able to go inside the black box, and that's where Python is sometimes
too limited for me because of its cost.

To be more concrete, one of my area is speech processing/speech
recognition. Most of current engines are based on Hidden Markov
Models, and there are a few well known libraries to deal with those,
most of the time written in C/C++. You can wrap those in python (and
people do), but you cannot really use those unless you deal with them
at a high level. If you want to change some core algorithms (to deal
with new topology, etc), you cannot do it without going into C. It
would be great to write my own HMM library in python, but I cannot do
it because it would be way too slow. There is no easy black-box which
I could wrap so that I keep enough flexibility without sacrificing too
much speed.

Said differently, I would be willing to be one order of magnitude
slower than, say, C, but not two to three orders of magnitude, as is
currently the case in Python when you cannot leverage existing
libraries. When the code can be vectorized, numpy and scipy give me
this.

 Relying on a lot of compiled libraries goes against it.

 On the contrary, Python could be optimized for human readability because it
 was expected that heavy computation would be delegated to other code. There
 is no need for scientists to read the optimized code in BLAS, LINPACK, and
 FFTPACK, in assembler, Fortran, and/or C, which are incorporated in Numpy.

I know all that (I am one of the main numpy developers nowadays), and
indeed, writing BLAS/LAPACK in Python does not make much sense. I am
talking about libraries *I* would write. Scipy, for example, contains
more Fortran and C code than Python, not counting the libraries we
wrap, and a lot of that is because of speed/memory concerns.

David
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Chris Rebert
On Tue, Nov 17, 2009 at 8:41 AM, Rustom Mody rustompm...@gmail.com wrote:
 Language L is (in)efficient. No! Only implementations are (in)efficient

 I am reminded of a personal anecdote.  It happened about 20 years ago
 but is still fresh and this thread reminds me of it.

 I was attending some workshop on theoretical computer science.
 I gave a talk on Haskell.

 I showed off all the good-stuff -- pattern matching, lazy lists,
 infinite data structures, etc etc.
 Somebody asked me: "Isn't all this very inefficient?"
 Now at that time I was a strong adherent of the Dijkstra religion, and
 this viewpoint (efficiency has nothing to do with languages, only
 implementations) traces to him. So I quoted that.

 Slowly the venerable P S Thiagarajan got up and asked me:
 "Let's say that I have a language with a type 'Proposition'.
 And I have an operation on propositions called sat [sat(p) returns
 true if p is satisfiable]..."

 I won't complete the tale other than to say that I've never had the
 wind taken out of my sails so completely!

 So Vincent? I wonder what you would have said in my place?

I'm not Vincent, but: the sat() operation is by definition
inefficient, regardless of language?

Cheers,
Chris
--
http://blog.rebertia.com
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-17 Thread Steven D'Aprano
On Tue, 17 Nov 2009 22:11:42 +0530, Rustom Mody wrote:

 Language L is (in)efficient. No! Only implementations are
 (in)efficient
 
 I am reminded of a personal anecdote.  It happened about 20 years ago
 but is still fresh and this thread reminds me of it.
 
 I was attending some workshop on theoretical computer science. I gave a
 talk on Haskell.
 
 I showed off all the good-stuff -- pattern matching, lazy lists,
 infinite data structures, etc etc.
 Somebody asked me: "Isn't all this very inefficient?" Now at that time I
 was a strong adherent of the Dijkstra religion, and this viewpoint
 (efficiency has nothing to do with languages, only implementations)
 traces to him. So I quoted that.
 
 Slowly the venerable P S Thiagarajan got up and asked me: "Let's say that
 I have a language with a type 'Proposition'. And I have an operation on
 propositions called sat [sat(p) returns true if p is satisfiable]..."

I assume you're referring to this:

http://en.wikipedia.org/wiki/Boolean_satisfiability_problem

which is NP-complete and O(2**N) (although many such problems can be 
solved rapidly in polynomial time).
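The brute-force approach that makes sat() expensive can be sketched in a few
lines of Python. This is a hypothetical illustration (the function name `sat`
and the integer-literal CNF encoding are my own), not anyone's production
solver; it simply tries all 2**N assignments:

```python
from itertools import product

def sat(clauses, n_vars):
    """Brute-force CNF satisfiability check.

    Each clause is a list of non-zero ints: +i means variable i,
    -i its negation.  Trying all 2**n_vars assignments is exactly
    what makes the worst case O(2**N)."""
    for assignment in product([False, True], repeat=n_vars):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x2): satisfiable with x2 = True
print(sat([[1, 2], [-1, 2]], 2))   # True
# x1 and (not x1): unsatisfiable
print(sat([[1], [-1]], 1))         # False
```

No implementation cleverness in any language changes the exponential shape of
this search in the worst case, which is the point of the anecdote.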

 
 I won't complete the tale other than to say that I've never had the wind
 in my sails taken out so completely!
 
 So Vincent? I wonder what you would have said in my place?

I won't answer for Vincent, but I would have made five points:

(1) The existence of one inherently slow function in a language does not 
mean that the language itself is slow "overall". It's not clear exactly 
what "overall" means in the context of a language, but one function out 
of potentially thousands obviously isn't it.

(2) Obviously the quality of implementation for the sat function will 
make a major difference as far as speed goes, so the speed of the 
function is dependent on the implementation.

(3) Since the language definition doesn't specify an implementation, no 
prediction of the time needed to execute the function can be made. At 
most we know how many algorithmic steps the function will take, given 
many assumptions, but we have no idea of the constant term. The language 
definition would be satisfied by having an omniscient, omnipotent deity 
perform the O(2**N) steps required by the algorithm infinitely fast, i.e. 
in constant (zero) time, which would make it pretty fast. The fact that 
we don't have access to such deities to do our calculations for us is an 
implementation issue, not a language issue.

(4) In order to justify the claim that the language is slow, you have to 
define what you are comparing it against and how you are measuring the 
speed. Hence different benchmarks give different relative ordering 
between language implementations. You must have a valid benchmark, and 
not stack the deck against one language: compared to (say) integer 
addition in C, yes the sat function is slow, but that's an invalid 
comparison, as invalid as comparing the sat function against factorizing 
a one million digit number. (Time to solve sat(P) -- sixty milliseconds. 
Time to factorize N -- sixty million years.) You have to compare similar 
functionality, not two arbitrary operations.

Can you write a sat function in (say) C that does better than the one in 
your language? If you can't, then you have no justification for saying 
that C is faster than your language, for the amount of work your language 
does. If you can write a faster implementation of sat, then you can 
improve the implementation of your language by using that C function, 
thus demonstrating that speed depends on the implementation, not the 
language.

(5) There's no need for such hypothetical examples. Let's use a more 
realistic example... disk IO is expensive and slow. I believe that disk 
IO is three orders of magnitude slower than memory access, and heaven 
help you if you're reading from tape instead of a hard drive!

Would anyone like to argue that every language which supports disk IO 
(including C, Lisp, Fortran and, yes, Python) are therefore slow? Since 
the speed of the hard drive dominates the time taken, we might even be 
justified as saying that all languages are equally slow!

Obviously this conclusion is nonsense. Since the conclusion is nonsense, 
we have to question the premise, and the weakest premise is the idea that 
talking about the speed of a *language* is even meaningful (except as a 
short-hand for state of the art implementations of that language).



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-16 Thread sturlamolden
On 14 Nov, 02:42, Robert Brown bbr...@speakeasy.net wrote:

 If you want to know why Python *the language* is slow, look at the Lisp code
 CLPython generates and at the code implementing the run time.  Simple
 operations end up being very expensive.

You can also see this by looking at the C that Cython or Pyrex
generates.

You can also see the dramatic effect by a handful of strategically
placed type declarations.


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-16 Thread Paul Boddie
On 16 Nov, 05:51, sturlamolden sturlamol...@yahoo.no wrote:

 NASA can find money to build a space telescope and put it in orbit.
 They don't find money to create a faster Python, which they use for
 analyzing the data.

Is the analysis in Python really what slows it all down?

 Google is a multi-billion dollar business. They are using Python
 extensively. Yes I know about Unladen Swallow, but why can't they put
 1 mill dollar into making a fast Python?

Isn't this where we need those Ohloh figures on how much Unladen
Swallow is worth? ;-) I think Google is one of those organisations
where that Steve Jobs mentality of shaving time off a once-per-day
activity actually pays off. A few more cycles here and there is
arguably nothing to us, but it's a few kW when running on thousands of
Google nodes.

 And then there is IBM and Cern's Blue Brain project. They can set up
 the fastest supercomputer known to man, but finance a faster Python?
 No...

Businesses and organisations generally don't spend any more money than
they need to. And if choosing another technology is cheaper for future
work then they'll just do that instead. In a sense, Python's
extensibility using C, C++ and Fortran has helped adoption of the
language considerably, but it hasn't necessarily encouraged a focus on
performance.

Paul
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-16 Thread Paul Rubin
sturlamolden sturlamol...@yahoo.no writes:
        Python is a very clean language held back from widespread use by slow
  implementations.  If Python ran faster, Go would be unnecessary.
 
 Google is a multi-billion dollar business. They are using Python
 extensively. Yes I know about Unladen Swallow, but why can't they put
 1 mill dollar into making a fast Python?

I don't think Python and Go address the same set of programmer
desires.  For example, Go has a static type system.  Some programmers
find static type systems to be useless or undesirable.  Others find
them extremely helpful and want to use them.  If you're a
programmer who wants a static type system, you'll probably prefer Go
to Python, and vice versa.  That has nothing to do with implementation
speed or development expenditures.  If Google spent a million dollars
adding static types to Python, it wouldn't be Python any more.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread Terry Reedy

greg wrote:

John Nagle wrote:

   Take a good look at Shed Skin.  ...
You give up some flexibility; a variable can have only one primitive type
in its life, or it can be a class object.  That's enough to simplify the
type analysis to the point that most types can be nailed down before the
program is run.


These restrictions mean that it isn't really quite
Python, though.


Python code that only uses a subset of features very much *is* Python 
code. The author of ShedSkin makes no claim that it compiles all Python 
code.


--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread Paul Boddie
On 15 Nov, 09:30, Terry Reedy tjre...@udel.edu wrote:
 greg wrote:


[Shed Skin]

  These restrictions mean that it isn't really quite
  Python, though.

 Python code that only uses a subset of features very much *is* Python
 code. The author of ShedSkin makes no claim that it compiles all Python
 code.

Of course, Shed Skin doesn't support all the usual CPython features,
but the code you would write for Shed Skin's benefit should be Python
code that runs under CPython. It's fair to say that Shed Skin isn't a
complete implementation of what CPython defines as being the full
Python, but you're still writing Python. One can argue that the
restrictions imposed by Shed Skin inhibit the code from being proper
Python, but every software project has restrictions in the form of
styles, patterns and conventions.

This is where the Lesser Python crowd usually step in and say that
they won't look at anything which doesn't support the full Python,
but I think it's informative to evaluate which features of Python give
the most value and which we could do without. The Lesser Python
attitude is to say, "No! We want it all! It's all necessary for
everything!" That doesn't really help the people implementing proper
implementations or those trying to deliver better-performing
implementations.

In fact, the mentality that claims that "it's perfect, or it will be
if we keep adding features" could drive Python into a diminishing
niche over time. In contrast, considering variations of Python as some
kind of Greater Python ecosystem could help Python (the language)
adapt to the changing demands on programming languages to which Go
(the Google language, not Go! which existed already) is supposedly a
response.

Paul

P.S. And PyPy is hardly a dud: they're only just getting started
delivering the performance benefits, and it looks rather promising.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread Edward A. Falk
In article m2d43kemvs@roger-vivier.bibliotech.com,
Robert Brown  bbr...@speakeasy.net wrote:

It's hard to refute your assertion.  You're claiming that some future
hypothetical Python implementation will have excellent performance via a JIT.
On top of that you say that you're willing to change the definition of the
Python language, say by adding type declarations, if an implementation with a
JIT doesn't pan out.  If you change the Python language to address the
semantic problems Willem lists in his post and also add optional type
declarations, then Python becomes closer to Common Lisp, which we know can be
executed efficiently, within the same ballpark as C and Java.

Ya know; without looking at Go, I'd bet that this was some of the thought
process that was behind it.

-- 
-Ed Falk, f...@despams.r.us.com
http://thespamdiaries.blogspot.com/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread Paul Rubin
f...@mauve.rahul.net (Edward A. Falk) writes:
 If you change the Python language to address the semantic problems
 Willem lists in his post and also add optional type declarations,
 then Python becomes closer to Common Lisp, which we know can be
 executed efficiently, within the same ballpark as C and Java.
 
 Ya know; without looking at Go, I'd bet that this was some of the thought
 process that was behind it.

I don't have the slightest impression that Python had any significant
influence on Go.  Go has C-like syntax, static typing with mandatory
declarations, and concurrency inspired by Occam.  It seems to be a
descendant of Oberon and Newsqueak (Pike's earlier language used in
Plan 9).  It also seems to be decades behind the times in some ways.
Its creators are great programmers and system designers, but I wish
they had gotten some PL theorists involved in designing Go.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread John Nagle

Paul Boddie wrote:

On 15 Nov, 09:30, Terry Reedy tjre...@udel.edu wrote:

greg wrote:



[Shed Skin]


These restrictions mean that it isn't really quite
Python, though.

Python code that only uses a subset of features very much *is* Python
code. The author of ShedSkin makes no claim that it compiles all Python
code.


Of course, Shed Skin doesn't support all the usual CPython features,
but the code you would write for Shed Skin's benefit should be Python
code that runs under CPython. It's fair to say that Shed Skin isn't a
complete implementation of what CPython defines as being the full
Python, but you're still writing Python. One can argue that the
restrictions imposed by Shed Skin inhibit the code from being proper
Python, but every software project has restrictions in the form of
styles, patterns and conventions.

This is where the Lesser Python crowd usually step in and say that
they won't look at anything which doesn't support the full Python,
but I think it's informative to evaluate which features of Python give
the most value and which we could do without. The Lesser Python
attitude is to say, "No! We want it all! It's all necessary for
everything!" That doesn't really help the people implementing proper
implementations or those trying to deliver better-performing
implementations.

In fact, the mentality that claims that "it's perfect, or it will be
if we keep adding features" could drive Python into a diminishing
niche over time. In contrast, considering variations of Python as some
kind of Greater Python ecosystem could help Python (the language)
adapt to the changing demands on programming languages to which Go
(the Google language, not Go! which existed already) is supposedly a
response.


Yes.  Niklaus Wirth, who designed Pascal, Modula, and Oberon, had
that happen to his languages.  He's old and bitter now; a friend of
mine knows him.

 The problem is that Greater Python is to some extent the set of
features that are easy to implement if we look up everything at run time.
You can insert a variable into a running function of
another thread.  This feature of very marginal utility is free in a
naive lookup-based interpreter, and horribly expensive in anything that
really compiles.  Obsession with the CPython implementation as the language
definition tends to overemphasize such features.

 The big headache from a compiler perspective is hidden dynamism -
use of dynamic features that isn't obvious from examining the source code.
(Hidden dynamism is a big headache to maintenance programmers, too.)
For example, if you had the rule that you can't use getattr and setattr
on an object from the outside unless the class itself implements or uses getattr
and setattr, then you know at compile time if the machinery for dynamic
attributes needs to be provided for that class.  This allows the slots
optimization, and direct compilation into struct-type code.
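As a rough illustration of the optimization Nagle describes, CPython's own
__slots__ already gives a taste of it: declaring the attribute set up front
rules out dynamic attribute creation and lets the implementation drop the
per-instance dict. A minimal sketch (the `Point` class is invented for the
example):

```python
class Point:
    # Declaring the attribute set up front means no per-instance
    # __dict__ and no dynamic attribute creation: an implementation
    # can lay instances out like a C struct.
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
print(hasattr(p, '__dict__'))   # False: no dict machinery needed
try:
    p.z = 3.0                   # dynamic attribute creation is refused
except AttributeError as exc:
    print('refused:', exc)
```

The trade is exactly the one discussed above: give up "hidden dynamism" for a
class, and the attribute layout becomes knowable before the program runs.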

 Python is a very clean language held back from widespread use by slow
implementations.  If Python ran faster, Go would be unnecessary.

 And yes, performance matters when you buy servers in bulk.

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread sturlamolden
On 16 Nov, 05:09, John Nagle na...@animats.com wrote:

       Python is a very clean language held back from widespread use by slow
 implementations.  If Python ran faster, Go would be unnecessary.

That boggles me.

NASA can find money to build a space telescope and put it in orbit.
They don't find money to create a faster Python, which they use for
analyzing the data.

Google is a multi-billion dollar business. They are using Python
extensively. Yes I know about Unladen Swallow, but why can't they put
1 mill dollar into making a fast Python?

And then there is IBM and Cern's Blue Brain project. They can set up
the fastest supercomputer known to man, but finance a faster Python?
No...

I saw this myself. At work I could get money to buy a € 30,000
recording equipment. I could not get money for a MATLAB license.

It seems software and software development is heavily underfinanced.
The big bucks goes into fancy hardware. But fancy hardware is not so
fancy without equally fancy software.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-15 Thread sturlamolden
On 16 Nov, 05:09, John Nagle na...@animats.com wrote:

       Python is a very clean language held back from widespread use by slow
 implementations.

Python is clean, minimalistic, and beautiful.

Python doesn't have bloat like special syntax for XML or SQL databases
(cf. C#) or queues (Go).

Most of all, it is easier to express ideas in Python than in any computer
language I know.

Python's major drawback is slow implementations. I always find myself
resorting to Cython (or C, C++, Fortran 95) here and there.

But truth be told, I wrote an awful lot of C MEX files when using
MATLAB as well. MATLAB can easily be slower than Python by orders of
magnitude, but that has not prevented its widespread adoption.
What's keeping it back is an expensive license.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Vincent Manis
On 2009-11-13, at 23:20, Robert Brown wrote, quoting me:
 On 2009-11-13, at 17:42, Robert Brown wrote, quoting me: 
 
 ... Python *the language* is specified in a way that
 makes executing Python programs quickly very very difficult.  
 
 That is untrue. I have mentioned before that optional declarations integrate
 well with dynamic languages. Apart from CL and Scheme, which I have
 mentioned several times, you might check out Strongtalk (typed Smalltalk),
 and Dylan, which was designed for high-performance compilation, though to my
 knowledge no Dylan compilers ever really achieved it.
 
 You are not making an argument, just mentioning random facts.  You claim I've
 made a false statement, then talk about optional type declarations, which
 Python doesn't have.  Then you mention Smalltalk and Dylan.  What's your
 point?  To prove me wrong you have to demonstrate that it's not very difficult
 to produce a high performance Python system, given current Python semantics.
The false statement you made is that `... Python *the language* is specified
in a way that makes executing Python programs quickly very very difficult.'
I refuted it by citing several systems that implement languages with semantics
similar to those of Python, and do so efficiently.

 I've never looked at CLPython. Did it use a method cache (see Peter
 Deutsch's paper on Smalltalk performance in the unfortunately out-of-print
 `Smalltalk-80: Bits of History, Words of Advice')? That technique is 30 years
 old now.
 
 Please look at CLPython.  The complexity of some Python operations will make
 you weep.  CLPython uses Common Lisp's CLOS method dispatch in various places,
 so yes, those method lookups are definitely cached.
Ah, that does explain it. CLOS is most definitely the wrong vehicle for
implementing Python method dispatch. CLOS is focused around generic functions
that themselves do method dispatch, and do so in a way that is different from
Python's. If I were building a Python implementation in CL, I would definitely
NOT use CLOS, but do my own dispatch using funcall (the CL equivalent of the
now-vanished Python function apply).
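A toy sketch of the method-cache idea mentioned above (this illustrates the
caching principle in plain Python; it is not how Smalltalk-80 or any real VM
implements it, and the class name and API are invented):

```python
class CallSiteCache:
    """Toy per-call-site method cache: remember the receiver's class
    and the method found on it; redo the (slow) lookup only when a
    receiver of a different class shows up at this call site."""

    def __init__(self, name):
        self.name = name
        self.klass = None
        self.method = None

    def invoke(self, receiver, *args):
        cls = type(receiver)
        if cls is not self.klass:              # cache miss: slow lookup
            self.klass = cls
            self.method = getattr(cls, self.name)
        return self.method(receiver, *args)    # cache hit: one compare

site = CallSiteCache('upper')
print(site.invoke('hello'))   # 'HELLO'; later str receivers skip the lookup
```

Since most call sites see receivers of the same class over and over, the slow
lookup path is amortized away, which is why the technique pays off.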

 Method lookup is just the tip of the iceberg.  How about comparison?  Here are
 some comments from CLPython's implementation of compare.  There's a lot going
 on.  It's complex and SLOW.
Re comparison: Python 3 has cleaned comparison up a fair bit. In particular,
you can no longer compare objects of different types using default comparisons.
However, it could well be that there are nasty little crannies of inefficiency
there; they could be the subject of PEPs after the moratorium is over.

quoting from the CLPython code
   ;; The CPython logic is a bit complicated; hopefully the following
   ;; is a correct translation.
I can see why CLPython has such troubles. The author has endeavoured to copy
CPython faithfully, using an implementation language (CLOS) that is hostile 
to Python method dispatch. 

OK, let me try this again. My assertion is that with some combination of
JITting, reorganization of the Python runtime, and optional static
declarations, Python can be made acceptably fast, which I define as program
runtimes on the same order of magnitude as those of the same programs in C
(Java and other languages have established a similar goal). I am not pushing
optional declarations, as it's worth seeing what we can get out of JITting.
If you wish to refute this assertion, citing behavior in CPython or another
implementation is not enough. You have to show that the stated feature
*cannot* be made to run in an acceptable time.

For example, if method dispatch required an exponential-time algorithm, I
would agree with you. But a hypothetical implementation that did method
dispatch in exponential time for no reason would not be a proof, but would
rather just be a poor implementation.

-- v



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread sturlamolden
On 12 Nov, 18:33, J Kenneth King ja...@agentultra.com wrote:

 Where Python might get hit *as a language* is that the Python programmer
 has to drop into C to implement optimized data-structures for dealing
 with the kind of IO that would slow down the Python interpreter.  That's
 why we have numpy, scipy, etc.

That's not a Python-specific issue. We drop to SciPy/NumPy for certain
compute-bound tasks that operate on vectors. If that does not help,
we drop further down to Cython, C or Fortran. If that does not help,
we can use assembly. In fact, if we use SciPy linked against GotoBLAS,
a lot of compute-intensive work solving linear algebra is delegated to
hand-optimized assembly.

With Python we can stop at the level of abstraction that gives
acceptable performance. When using C, we start out at a much lower
level. The principle that premature optimization is the root of all
evil applies here: Python code that is fast enough is fast enough. It
does not matter that hand-tuned assembly will be 1000 times faster. We
can direct our optimization effort to the parts of the code that needs
it.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Paul Rubin
sturlamolden sturlamol...@yahoo.no writes:
 With Cython we can get Python to run at the speed of C just by
 adding in optional type declarations for critical variables (most need
 not be declared).

I think there are other semantic differences too.  For general
thoughts on such differences (Cython is not mentioned though), see:

 http://dirtsimple.org/2005/10/children-of-lesser-python.html
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Vincent Manis
On 2009-11-13, at 23:39, Robert Brown wrote, quoting me:
 Common Lisp blends together features of previous Lisps, which were designed to
 be executed efficiently.  Operating systems were written in these variants.
 Execution speed was important.  The Common Lisp standardization committee
 included people who were concerned about performance on C-optimized hardware.
Guy L Steele, Jr., `Common Lisp The Language' 1/e (1984). p. 1 `COMMON LISP 
is intended to meet these goals: Commonality [...] Portability [...] Consistency
[...] Expressiveness [...] Compatibility [...] Efficiency [...] Power [...]
Stability [...]' The elided text amplifies each of the points. I repeat: the 
purpose of Common Lisp was to have a standard Lisp dialect; efficiency was 
less of an issue for those investigators. 

As for C-optimized hardware, well, the dialects it aims to be compatible with
are ZetaLisp (Symbolics Lisp Machine), MacLisp (PDP-10), and Interlisp (PDP-10, 
originally). 

CLtL mentions S-1 Lisp as its exemplar of high numerical performance.
Unfortunately, S-1 Lisp, written by Richard Gabriel and Rod Brooks, was never
finished. MacLisp was a highly efficient implementation, as I've mentioned.
I worked at BBN at the time Interlisp flourished; it was many things, some of
them quite wonderful, but efficiency was NOT its goal.

 The Scheme standard has gone through many revisions.  I think we're up to
 version 6 at this point.  The people working on it are concerned about
 performance.  
Yes, they are. You should see a's rants about how b specified certain 
features so they'd be efficient on his implementation. I had real people's 
names there, but I deleted them in the interests of not fanning flamewar 
flames. 

 For instance, see the discussions about whether the order of
 evaluating function arguments should be specified.  
That was a long time ago, and had as much if not more to do with making
arguments work the same as let forms as it had to do with efficiency. 
But I'll point out that the big feature of Scheme is continuations, and 
it took quite a few years after the first Scheme implementations came out
to make continuations stop being horrendously *IN*efficient. 

 You can't point to Rabbit (1978 ?) as
 representative of the Scheme programming community over the last few decades.
I didn't. I used it to buttress YOUR argument that Schemers have always been
concerned with performance. 

 Using Python 3 annotations, one can imagine a Python compiler that does the
 appropriate thing (shown in the comments) with the following code.
 I can imagine a lot too, but we're talking about Python as it's specified
 *today*.  The Python language as it's specified today is hard to execute
 quickly.  Not impossible, but very hard, which is why we don't see fast Python
 systems.

Python 3 annotations exist. Check the Python 3 Language Reference. 
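For example, a minimal sketch of what such annotations look like (CPython
records them but does not act on them; the `scale` function is invented for
illustration):

```python
# Annotations are stored on the function object; CPython itself ignores
# them at run time, but a hypothetical optimizing compiler could treat
# them as the optional declarations discussed above.
def scale(v: float, factor: float = 2.0) -> float:
    return v * factor

print(scale.__annotations__['return'])   # <class 'float'>
print(scale(3.0))                        # 6.0
```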

I notice you've weakened your claim. Now we're down to `hard to execute 
quickly'. That I would agree with you on, in that building an efficient 
Python system would be a lot of work. However, my claim is that that work 
is engineering, not research: most of the bits and pieces of how to implement
Python reasonably efficiently are known and in the public literature. And 
that has been my claim since the beginning.

-- v 

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Alf P. Steinbach

* Vincent Manis:

On 2009-11-13, at 22:51, Alf P. Steinbach wrote:

It's sort of hilarious. <g>
It really is, see below. 


So no, it's not a language that is slow, it's of course only concrete 
implementations that may have slowness flavoring. And no, not really, they 
don't, because it's just particular aspects of any given implementation that 
may exhibit slowness in certain contexts. And expanding on that trend, later in 
the thread the observation was made that no, not really that either, it's just 
(if it is at all) at this particular point in time, what about the future? 
Let's be precise! Can't have that vague touchy-feely impression about a 
/language/ being slow corrupting the souls of readers.
Because `language is slow' is meaningless. 

An earlier post of mine listed four examples where the common wisdom was `XXX is slow' and yet where that 
turned out not to be the case.


Some others. 

1. I once owned a Commodore 64. I got Waterloo Pascal for it. I timed the execution of some program 
(this was 25 years ago, I forget what the program did) at 1 second per statement. Therefore: `Pascal 
is slow'. 

2. Bell Labs produced a fine programming language called Snobol 4. It was slow. But some folks at 
IIT in Chicago did their own implementation, Spitbol, which was fast and completely compatible. 
Presto: Snobol 4 was slow, but then it became fast. 


3. If you write the following statements in Fortran IV (the last version of 
Fortran I learned)

      DO 10 I=1, 100
        DO 10 J=1, 100
          A(I, J) = 0.0
   10 CONTINUE

you would paralyze early virtual memory systems, because Fortran IV defined arrays to be stored 
in column major order, and the result was extreme thrashing. Many programmers did not realize 
this, and would naturally write code like that. Fortran cognoscenti would interchange the two 
DO statements and thus convert Fortran from being a slow language to being a fast one. 

4. When Sun released the original Java system, programs ran very slowly, and everybody said 
`I will not use Java, it is a slow language'. Then Sun improved their JVM, and other organizations 
wrote their own JVMs which were fast. Therefore Java became a fast language. 


Actually, although C++ has the potential for being really really fast (and some C++ programs are), the amount 
of work you have to add to realize the potential can be staggering. This is most clearly evidenced by C++'s 
standard iostreams, which have the potential of being much much faster than C FILE i/o (in particular Dietmar 
Kuhl made such an implementation), but where the complexity of and the guidance offered by the 
design is such that nearly all extant implementations are painfully slow, even compared to C 
FILE. So, we generally talk about iostreams being slow, knowing full well what we mean and that fast 
implementations are theoretically possible (as evidenced by Dietmar's)  --  but fast and 
slow are in-practice terms, and so what matters is in-practice, like, how does your compiler's 
iostreams implementation hold up.
OK, let me work this one out. Because most iostreams implementations are very slow, C++ is a slow 
language. But since Kuhl did a high-performance implementation, he made C++ into a fast language. 
But since most people don't use his iostreams implementation, C++ is a slow language again, except
for organizations that have banned iostreams (as my previous employers did) because it's too slow, 
therefore C++ is a fast language. 

Being imprecise is so much fun! I should write my programs this imprecisely. 

More seriously, when someone says `xxx is a slow language', the only thing they can possibly mean 
is `there is no implementation in existence, and no likelihood of an implementation being possible, 
that is efficient enough to solve my problem in the required time' or perhaps
`I must write peculiar code in order to get programs to run in the specified
time; writing code in the way the language seems to encourage produces
programs that are too slow'. This is a very sweeping statement, and at the
very least ought to be accompanied by some kind of proof. If Python is indeed
a slow language, then Unladen
Swallow and pypy, and many other projects, are wastes of time, and should not be continued. 

Again, this doesn't have anything to do with features of an implementation that are slow or fast. 
The only criterion that makes sense is `do programs run with the required performance if written 
in the way the language's inventors encourage'. Most implementations of every language have a nook 
or two where things get embarrassingly slow; the question is `are most programs unacceptably slow'. 


But, hey, if we are ok with being imprecise, let's go for it. Instead of
saying `slow' and `fast', why not say `good' and `bad'?


:-)

You're piling up so extremely many fallacies in one go that I just quoted it 
all.

Anyways, it's a good example of focusing on irrelevant and meaningless precision 
plus at the same time 

Re: python simply not scaleable enough for google?

2009-11-14 Thread Vincent Manis
On 2009-11-14, at 00:22, Alf P. Steinbach wrote, in response to my earlier post.

 Anyways, it's a good example of focusing on irrelevant and meaningless 
 precision plus at the same time utilizing imprecision, higgledy-piggledy as 
 it suits one's argument. Mixing hard precise logic with imprecise concepts 
 and confounding e.g. universal quantification with existential quantification, 
 for best effect several times in the same sentence. Like the old Very Hard 
 Logic + imprecision adage: we must do something. this is something. ergo, we 
 must do this.
OK, now we've reached a total breakdown in communication, Alf. You appear to
take exception to distinguishing between a language and its implementation.
My academic work, before I became a computer science/software engineering
instructor, was in programming language specification and implementation, so
I *DO* know what I'm talking about here. However, you and I apparently are
speaking on different wavelengths.

 It's just idiocy.
Regretfully, I must agree.

 But fun.
Not so much, from my viewpoint.

-- v

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread sturlamolden
On 12 Nov, 18:32, Alf P. Steinbach al...@start.no wrote:

 Of course Python is slow: if you want speed, pay for it by complexity.

Python is slow is really a misconception. Python is used for
scientific computing at HPC centres around the world. NumPy's
predecessor numarray was made by NASA for the Hubble space telescope.
Python is slow for certain types of tasks, particularly iterative
compute-bound work. But who says you have to use Python for this?  It
can easily be delegated to libraries written in C or Fortran.

I can easily demonstrate Python being faster than C. For example, I
could compare the speed of appending strings to a list and .join
(strlist) with multiple strcats in C. I can easily demonstrate C being
faster than Python as well.
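The comparison described above can be sketched directly (a minimal, illustrative benchmark; the exact ratio depends on the CPython version, since recent interpreters special-case in-place string concatenation):

```python
# Repeated concatenation may reallocate and copy the growing string on
# each iteration (quadratic in the worst case), while str.join builds
# the result in a single pass over the parts.
import timeit

def concat(parts):
    s = ""
    for p in parts:
        s += p               # potential reallocation and copy each time
    return s

def join(parts):
    return "".join(parts)    # single allocation pass

parts = ["x"] * 10_000
assert concat(parts) == join(parts)

t_concat = timeit.timeit(lambda: concat(parts), number=50)
t_join = timeit.timeit(lambda: join(parts), number=50)
print(f"concat: {t_concat:.4f}s  join: {t_join:.4f}s")
```

The point of the original post stands either way: the idiomatic high-level construct (`join` over a list) is the fair thing to compare against C, not a line-by-line transliteration of `strcat`.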

To get speed from a high-level language like Python you have to
leverage on high-level data types. But then you cannot compare
algorithms in C and Python directly.

Also consider that most program today are not CPU-bound: They are i/o
bound or memory-bound. Using C does not give you faster disk access,
faster ethernet connection, or faster RAM... It does not matter that
computation is slow if the CPU is starved anyway. We have to consider
what actually limits the speed of a program.

Most of all I don't care that computation is slow if slow is fast
enough. For example, I have a Python script that parses OpenGL headers
and writes a declaration file for Cython. It takes a fraction of a
second to complete. Should I migrate it to C to make it 20 times
faster? Or do you really think I care if it takes 20 ms or just 1 ms
to complete? The only harm the extra CPU cycles did was a minor
contribution to global warming.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Alf P. Steinbach

* sturlamolden:

On 12 Nov, 18:32, Alf P. Steinbach al...@start.no wrote:


Of course Python is slow: if you want speed, pay for it by complexity.


Python is slow is really a misconception.


Sorry, no, I don't think so.

But we can't know that without ESP powers.

Which seem to be in short supply.



Python is used for
scientific computing at HPC centres around the world. NumPy's
predecessor numarray was made by NASA for the Hubble space telescope.
Python is slow for certain types of tasks, particularly iterative
compute-bound work. But who says you have to use Python for this?  It
can easily be delegated to libraries written in C or Fortran.


Yes, that's what I wrote immediately following what you quoted.



I can easily demonstrate Python being faster than C. For example, I
could compare the speed of appending strings to a list and .join
(strlist) with multiple strcats in C. I can easily demonstrate C being
faster than Python as well.


That is a straw man argument (which is one of the classic fallacies), that is, 
attacking a position that nobody's argued for.




To get speed from a high-level language like Python you have to
leverage on high-level data types. But then you cannot compare
algorithms in C and Python directly.

Also consider that most program today are not CPU-bound: They are i/o
bound or memory-bound. Using C does not give you faster disk access,
faster ethernet connection, or faster RAM... It does not matter that
computation is slow if the CPU is starved anyway. We have to consider
what actually limits the speed of a program.

Most of all I don't care that computation is slow if slow is fast
enough. For example, I have a Python script that parses OpenGL headers
and writes a declaration file for Cython. It takes a fraction of a
second to complete. Should I migrate it to C to make it 20 times
faster? Or do you really think I care if it takes 20 ms or just 1 ms
to complete? The only harm the extra CPU cycles did was a minor
contribution to global warming.


Yeah, that's what I wrote immediately following what you quoted.

So, except for the straw man arg and to what degree there is a misconception, 
which we can't know without ESP, it seems we /completely agree/ on this :-) )



Cheers  hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread sturlamolden
On 12 Nov, 18:32, Alf P. Steinbach al...@start.no wrote:

 Hm, this seems religious.

 Of course Python is slow: if you want speed, pay for it by complexity.

Not really. The speed problems of Python can to a large extent be
attributed to a sub-optimal VM.

Perl tends to be much faster than Python.

Certain Common Lisp and Scheme implementations can often perform
comparable to C++.

There are JIT-compiled JavaScript which are very efficient.

Java's Hotspot JIT comes from StrongTalk, a fast version of SmallTalk.
It's not the static typing that makes Java run fast. It is a JIT
originally developed for a dynamic language. Without Hotspot, Java can
be just as bad as Python.

Even more remarkable: Lua with LuaJIT performs at about 80% of GCC speed
on the Debian benchmarks. Question: Why is Lua so fast and Python so slow?
Here we have two very similar dynamic scripting languages. One beats
JIT-compiled Java and almost competes with C. The other is the slowest
there is. Why? A lot of it has to do with the simple fact that Python's
VM is stack-based whereas Lua's VM is register-based. Stack-based VMs
are bad for branch prediction and work against modern CPUs. Python
has reference counting, which is bad for cache. Lua has a tracing GC.
But these are all implementation details totally orthogonal to the
languages. Python on a better VM (LuaJIT, Parrot, LLVM, several
JavaScript engines) will easily outperform CPython by orders of magnitude.

Sure, Google can brag about Go running at 80% of C speed, after
introducing static typing. But LuaJIT does the same without any typing
at all.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Alf P. Steinbach

* Vincent Manis:

On 2009-11-14, at 00:22, Alf P. Steinbach wrote, in response to my earlier post.


Anyways, it's a good example of focusing on irrelevant and meaningless
precision plus at the same time utilizing imprecision, higgledy-piggledy
as it suits one's argument. Mixing hard precise logic with imprecise
concepts and confound e.g. universal quantification with existential
quantification, for best effect several times in the same sentence. Like
the old Very Hard Logic + imprecision adage: we must do something. this
is something. ergo, we must do this.


OK, now we've reached a total breakdown in communication, Alf. You appear
to take exception to distinguishing between a language and its implementation.


Not at all.

But that doesn't mean that making that distinction is always meaningful.

It's not like there exists a context where making the distinction is not 
meaningful means that in all contexts making the distinction is meaningful.


So considering that, my quoted comment about confounding universal 
quantification with existential quantification was spot on... :-)


In some contexts, such as here, it is meaningless and just misleading to add the 
extra precision of the distinction between language and implementation. 
Academically it's there. But it doesn't influence anything (see below).


Providing a counter example, a really fast Python implementation for the kind of 
processing mix that Google does, available for the relevant environments, would 
be relevant.


Bringing in the hypothethical possibility of a future existence of such an 
implementation is, OTOH., only hot air.


If someone were to apply the irrelevantly-precise kind of argument to that, then 
one could say that future hypotheticals don't have anything to do with what 
Python is, today. Now, there's a fine word-splitting distinction... ;-)




My academic work, before I became a computer science/software engineering
instructor, was in programming language specification and implementation, 
so I *DO* know what I'm talking about here. However, you and I apparently
are speaking on different wavelengths.  


Granted that you haven't related incorrect facts, and I don't think anyone here 
has, IMO the conclusions and implied conclusions still don't follow.



Cheers  hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Alf P. Steinbach

* sturlamolden:

On 12 Nov, 18:32, Alf P. Steinbach al...@start.no wrote:


Hm, this seems religious.

Of course Python is slow: if you want speed, pay for it by complexity.


Not really. The speed problems of Python can to a large extent be
attributed to a sub-optimal VM.

Perl tends to be much faster than Python.

Certain Common Lisp and Scheme implementations can often perform
comparable to C++.

There are JIT-compiled JavaScript which are very efficient.

Java's Hotspot JIT comes from StrongTalk, a fast version of SmallTalk.
It's not the static typing that makes Java run fast. It is a JIT
originally developed for a dynamic language. Without Hotspot, Java can
be just as bad as Python.

Even more remarkable: Lua with LuaJIT performs at about 80% of GCC speed
on the Debian benchmarks. Question: Why is Lua so fast and Python so slow?
Here we have two very similar dynamic scripting languages. One beats
JIT-compiled Java and almost competes with C. The other is the slowest
there is. Why? A lot of it has to do with the simple fact that Python's
VM is stack-based whereas Lua's VM is register-based. Stack-based VMs
are bad for branch prediction and work against modern CPUs. Python
has reference counting, which is bad for cache. Lua has a tracing GC.
But these are all implementation details totally orthogonal to the
languages. Python on a better VM (LuaJIT, Parrot, LLVM, several
JavaScript engines) will easily outperform CPython by orders of magnitude.

Sure, Google can brag about Go running at 80% of C speed, after
introducing static typing. But LuaJIT does the same without any typing
at all.


Good points and good facts.

And you dispensed with the word-splitting terminology discussion, writing just 
The other [language] is the slowest. Currently. He he. :-)


And it is, as you imply, totally in the in-practice domain.


Cheers,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Vincent Manis
On 2009-11-14, at 01:11, Alf P. Steinbach wrote:
 OK, now we've reached a total breakdown in communication, Alf. You appear
 to take exception to distinguishing between a language and its 
 implementation.
 
 Not at all.
 
 But that doesn't mean that making that distinction is always meaningful.
It certainly is. A language is a (normally) infinite set of strings with a way 
of ascribing 
a meaning to each string. 

A language implementation is a computer program of some sort, which is a finite 
set of bits 
representing a program in some language, with the effect that the observed 
behavior of the 
implementation is that strings in the language are accepted, and the computer 
performs the 
operations defined by the semantics. 

These are always different things. 

 It's not like there exists a context where making the distinction is not 
 meaningful means that in all contexts making the distinction is meaningful.
Because they are different things, in all cases the distinction is meaningful. 
 
 So considering that, my quoted comment about confounding universal 
 quantification with existential quantification was spot on... :-)
It was not spot on. The examples I provided were just that, examples to help 
people see the 
difference. They were not presented as proof. The proof comes from the 
definitions above. 

 In some contexts, such as here, it is meaningless and just misleading to add 
 the extra precision of the distinction between language and implementation. 
 Academically it's there. But it doesn't influence anything (see below).

Your assertion that this distinction is meaningless must be based upon YOUR 
definitions of words
like `language' and `implementation'. Since I don't know your definitions, I 
cannot respond to this
charge. 

 Providing a counter example, a really fast Python implementation for the kind 
 of processing mix that Google does, available for the relevant environments, 
 would be relevant.
I have presented arguments that the technologies for preparing such an 
implementation are 
basically known, and in fact there are projects that aim to do exactly that. 
 
 Bringing in the hypothethical possibility of a future existence of such an 
 implementation is, OTOH., only hot air.
Hmm...in every programming project I have ever worked on, the goal was to write 
code that 
didn't already exist. 

 If someone were to apply the irrelevantly-precise kind of argument to that, 
 then one could say that future hypotheticals don't have anything to do with 
 what Python is, today. Now, there's a fine word-splitting distinction... ;-)
Python is a set of strings, with a somewhat sloppily-defined semantics that 
ascribes meaning to the legal strings in the language. It was thus before any 
implementation existed, although I imagine that the original Python before GvR 
wrote any code had many differences from what Python is today. 

It is quite common for language designers to specify a language completely 
without regard to an implementation, or only a `reference' implementation that 
is not designed for performance or 
robustness. The `good' implementation comes after the language has been defined 
(though again
languages and consequently implementations are almost always modified after the 
original release). 
If you like, a language is part of (but not all of) the set of requirements for 
the implementation.

Alf, if you want to say that this is a difference that makes no difference, 
don't let me 
stop you. You are, however, completely out of step with the definitions of 
these terms as used
in the field of programming languages. 

 My academic work, before I became a computer science/software engineering
 instructor, was in programming language specification and implementation, so 
 I *DO* know what I'm talking about here. However, you and I apparently
 are speaking on different wavelengths.  
 
 Granted that you haven't related incorrect facts, and I don't think anyone 
 here has, IMO the conclusions and implied conclusions still don't follow.
The fact that you see the situation that way is a consequence of the fact that 
we're on different 
wavelengths. 

-- v
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread sturlamolden
On 14 Nov, 09:47, Alf P. Steinbach al...@start.no wrote:

  Python is slow is really a misconception.

 Sorry, no, I don't think so.

No, I really think a lot of the perceived slowness in Python comes
from bad programming practices. Sure, we can demonstrate that C or
LuaJIT is faster by orders of magnitude for CPU-bound tasks like
comparing DNA sequences or calculating the value of pi.

But let me give an example to the opposite from graphics programming,
one that we often run into when using OpenGL. This is not a toy
benchmark problem but one that is frequently encountered in real
programs.

We all know that calling functions in Python has a big overhead. There
is a dictionary lookup for the attribute name, and arguments are
packed into a tuple (and sometimes a dictionary). Thus calling
glVertex* repeatedly from Python will hurt. Doing it from C or Fortran
might still be ok (albeit not always recommended). So should we
conclude that Python is too slow and use C instead?

No!

What if we use glVertexArray or a display list instead? In case of a
vertex array (e.g. using NumPy ndarray for storage), there is
practically no difference in performance of C and Python. With a
display list, there is a difference on creation, but not on
invocation. So slowness from calling glVertex* multiple times is
really slowness from bad Python programming. I use NumPy ndarrays to
store vertices, and pass them to OpenGL as vertex arrays, instead of
hammering on glVertex* in a tight loop. And speed-wise, it does not
really matter if I use C or Python.
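The data-layout idea above can be sketched as follows. The gl* names in the comments are illustrative (PyOpenGL-style) and only there to show where the array would be handed off; the part that matters, building all vertices in one NumPy buffer instead of one Python call per vertex, is pure NumPy:

```python
# Instead of ~10,000 Python-level glVertex3f calls, build one (n, 3)
# float32 array and pass a single pointer to the driver.
import numpy as np

def make_circle_vertices(n, radius=1.0):
    """Build an (n, 3) float32 vertex array for a circle in the z=0 plane."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    verts = np.zeros((n, 3), dtype=np.float32)
    verts[:, 0] = radius * np.cos(theta)
    verts[:, 1] = radius * np.sin(theta)
    return verts

verts = make_circle_vertices(10_000)
print(verts.shape, verts.dtype)

# One call then hands the whole buffer to OpenGL, e.g.:
#   glVertexPointer(3, GL_FLOAT, 0, verts)
#   glDrawArrays(GL_LINE_LOOP, 0, len(verts))
```

With this layout the per-call Python overhead is paid once per frame, not once per vertex, which is the whole point of the post.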

But what if we need some computation in the graphics program as well?
We might use OpenCL, DirectCompute or OpenGL vertex shaders to control
the GPU. Will C be better than Python for this? Most likely not. A
program for the GPU is compiled by the graphics driver at run-time
from a text string passed to it. It is much better to use Python than
C to generate these. Will C on the CPU be better than OpenCL or a
vertex shader on the GPU? Most likely not.

So we might perhaps conclude that Python (with numpy) is better than C
for high-performance graphics? Even though Python is slower than C, we
can do just as well as C programmers by not falling into a few stupid
pitfalls. Is Python really slower than C for practical programming
like this? Superficially, perhaps yes. In practice, only if you use it
badly. But that's not Python's fault.

But if you make a CPU-bound benchmark like the Debian shootout, or time thousands
of calls to glVertex*, yes it will look like C is much better. But it
does not directly translate to the performance of a real program. The
slower can be the faster, it all depends on the programmer.


Two related issues:

- For the few cases where a graphics program really need C, we can
always resort to using ctypes, f2py or Cython. Gluing Python with C or
Fortran is very easy using these tools. That is much better than
keeping it all in C++.

- I mostly find myself using Cython instead of Python for OpenGL. That
is because I am unhappy with PyOpenGL. It was easier to expose the
whole of OpenGL to Cython than create a full or partial wrapper for
Python. With Cython there is no extra overhead from calling glVertex*
in a tight loop, so we get the same performance as C in this case.
But because I store vertices in NumPy arrays on the Python side, I
mostly end up using glVertexArray anyway.



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Roel Schroeven
Vincent Manis schreef:
 On 2009-11-14, at 01:11, Alf P. Steinbach wrote:
 OK, now we've reached a total breakdown in communication, Alf. You appear
 to take exception to distinguishing between a language and its 
 implementation.
 Not at all.

 But that doesn't mean that making that distinction is always meaningful.
 It certainly is. A language is a (normally) infinite set of strings with a 
 way of ascribing 
 a meaning to each string. 

That's true, for sure.

But when people in the Python community use the word Python, the word is
not used in the strict sense of Python the language. They use it to
refer to both the language and one or more of implementations, mostly
one of the existing and working implementations, and in most cases
CPython (and sometimes it can also include the documentation, the
website or the community).

Example: go to http://python.org. Click Download. That page says
Download Python
The current product versions are Python 2.6.4 and Python 3.1.1
...
You can't download a language, but you can download an implementation.
Clearly, even the project's website itself uses Python not only to refer
to the language, but also to it's main implementation (and in a few
places to other implementations).

From that point of view, your distinction between languages and
implementations is correct but irrelevant. What is relevant is that all
currently usable Python implementations are slow, and it's not incorrect
to say that Python is slow.

If and when a fast Python implementation gets to a usable state and
gains traction (in the hopefully not too distant future), that changes.
We'll have to say that Python can be fast if you use the right
implementation. And once the most commonly used implementation is a fast
one, we'll say that Python is fast, unless you happen to use a slow
implementation for one reason or another.

-- 
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
  -- Isaac Asimov

Roel Schroeven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Alf P. Steinbach

* Vincent Manis:

On 2009-11-14, at 01:11, Alf P. Steinbach wrote:

OK, now we've reached a total breakdown in communication, Alf. You appear
to take exception to distinguishing between a language and its implementation.

Not at all.

But that doesn't mean that making that distinction is always meaningful.
It certainly is. A language is a (normally) infinite set of strings with a way of ascribing 
a meaning to each string. 

A language implementation is a computer program of some sort, which is a finite set of bits 
representing a program in some language, with the effect that the observed behavior of the 
implementation is that strings in the language are accepted, and the computer performs the 
operations defined by the semantics. 

These are always different things. 


Well, there you have it, your basic misconception.

Sometimes, when that's practically meaningful, people use the name of a language 
to refer to both, as whoever it was did up-thread.


Or, they might mean just the latter. :-)

Apply some intelligence and it's all clear.

Stick boneheadedly to preconceived distinctions and absolute context independent 
meanings, and statements using other meanings appear to be meaningless or very 
unclear.


[snippety]


Cheers  hth.,

- Alf

PS: You might, or might not, benefit from looking up Usenet discussions on the 
meaning of character code, which is a classic case of the confusion you have 
here. There's even a discussion of that in some RFC somewhere, I think it was 
MIME-related. Terms mean different things in different *contexts*.

--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Roel Schroeven
Vincent Manis schreef:
 I notice you've weakened your claim. Now we're down to `hard to execute 
 quickly'. That I would agree with you on, in that building an efficient 
 Python system would be a lot of work. However, my claim is that that work 
 is engineering, not research: most of the bits and pieces of how to implement
 Python reasonably efficiently are known and in the public literature. And 
 that has been my claim since the beginning.

You talk about what can be and what might be. We talk about what is.

The future is an interesting place, but it's not here.

-- 
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
  -- Isaac Asimov

Roel Schroeven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Paul Rubin
sturlamolden sturlamol...@yahoo.no writes:
 Python on a better VM (LuaJIT, Parrot, LLVM, several
 JavaScript) will easily outperform CPython by orders of magnitide.


Maybe Python semantics make it more difficult to optimize than those
other languages.  For example, in
  a = foo.bar(1)
  b = muggle()
  c = foo.bar(2)
it is not ok to cache the value of foo.bar after the first assignment.
Maybe the second one goes and modifies it through foo.__dict__ .
See Children of a Lesser Python (linked in another post, or websearch)
for discussion.
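The constraint described above can be made concrete with a small sketch (hypothetical class names, just to mirror the snippet):

```python
# Why a compiler cannot cache foo.bar across the intervening call:
# arbitrary code in between may rebind the attribute.
class Foo:
    def bar(self, n):
        return n * 2

foo = Foo()

def muggle():
    # Mutates foo behind the compiler's back: the instance attribute
    # now shadows the method defined on the class.
    foo.bar = lambda n: n * 100

a = foo.bar(1)   # 2   (class method)
muggle()
c = foo.bar(2)   # 200 (instance attribute installed by muggle)
print(a, c)      # 2 200
```

So both lookups really must happen at run time; proving that `muggle()` cannot touch `foo` is exactly the kind of analysis a JIT with guards (as in the tracing VMs mentioned elsewhere in this thread) exists to approximate.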
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Grant Edwards
On 2009-11-14, David Robinow drobi...@gmail.com wrote:
 On Fri, Nov 13, 2009 at 3:32 PM, Paul Rubin
http://phr...@nospam.invalid wrote:
 ... This is Usenet so
 please stick with Usenet practices. If you want a web forum there are
 plenty of them out there.
  Actually this is python-list@python.org

Actually this is comp.lang.python

 I don't use usenet and I have no intention to stick with Usenet practices.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Terry Reedy

sturlamolden wrote:


- For the few cases where a graphics program really need C, we can
always resort to using ctypes, f2py or Cython. Gluing Python with C or
Fortran is very easy using these tools. That is much better than
keeping it all in C++.


In case anyone thinks resorting to C or Fortran is cheating, they should 
know that CPython, the implementation, was designed for this. That is 
why there is a documented C-API and why the CPython devs are slow to 
change it. Numerical Python dates back to at least 1.3 and probably 
earlier. The people who wrote it were some of the first production users 
of Python.


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Edward A. Falk
In article mailman.270.1257970526.2873.python-l...@python.org,
Terry Reedy  tjre...@udel.edu wrote:

I can imagine a day when code compiled from Python is routinely 
time-competitive with hand-written C.

I can't.  Too much about the language is dynamic.  The untyped variables
alone are a killer.

int a,b,c;
...
a = b + c;

In C, this compiles down to just a few machine instructions.  In Python,
the values in the variables need to be examined *at run time* to determine
how to add them or if they can even be added at all.  You'll never in
a million years get that down to just two or three machine cycles.
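The dispatch being described is visible in the bytecode itself; a minimal sketch (the exact instruction name varies across CPython versions, e.g. BINARY_ADD vs BINARY_OP):

```python
# The same generic add instruction serves ints, floats, and strings,
# so the operand types must be inspected at run time on every execution,
# unlike the typed add instruction a C compiler emits.
import dis

def add(b, c):
    return b + c

dis.dis(add)                 # one generic binary-add, no int32 add
print(add(2, 3))             # 5
print(add("py", "thon"))     # python
```

This is exactly the flexibility that makes the naive translation slow, and also exactly what type hints or a specializing JIT try to compile away.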

Yes, technically, the speed of a language depends on its implementation,
but the nature of the language constrains what you can do in an
implementation.  Python the language is inherently slower than C the
language, no matter how much effort you put into the implementation.  This
is generally true for all languages without strongly typed variables.

-- 
-Ed Falk, f...@despams.r.us.com
http://thespamdiaries.blogspot.com/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Terry Reedy

Willem Broekema wrote:


It might have gotten a bit better, but the central message still
stands: Python has made design choices that make efficient compilation
hard.


OK, let me try this again. My assertion is that with some combination of 
JITting,
reorganization of the Python runtime, and optional static declarations, Python
can be made acceptably fast,


That does not contradict that, had other language design choices been
made, it could be much easier to get better performance. Python may in
general be about as dynamic as Common Lisp from a _user_ perspective,
but from an implementator's point of view Python is harder to make it
run efficiently.


I think you are right about the design choices. The reason for those 
design choices is that Guido intended from the beginning that Python 
implementations be part of open computational systems, and not islands 
to themselves like Smalltalk and some Lisps. While the public CPython 
C-API is *not* part of the Python language, Python was and has been 
designed with the knowledge that there *would be* such an interface, and 
that speed-critical code would be written in C or Fortran, or that 
Python programs would interface with and use such code already written.


So: Python the language was designed for human readability, with the 
knowledge that CPython the implementation (originally and still today 
just called python.exe) would exist in a world where intensive 
computation could be pushed onto C or Fortran when necessary.


So: to talk about the 'speed of Python', one should talk about the speed 
of human reading and writing. On this score, Python, I believe, beats 
most other algorithm languages, as intended. It certainly does for me. 
To talk about the speed of CPython, one must, to be fair, talk about the 
speed of CPython + extensions compiled to native code.


In the scale of human readability, I believe Google go is a step 
backwards from Python.


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Robert Brown

Vincent Manis vma...@telus.net writes:
 The false statement you made is that `... Python *the language* is specified
 in a way that makes executing Python programs quickly very very difficult.
 I refuted it by citing several systems that implement languages with
 semantics similar to those of Python, and do so efficiently.

The semantic details matter.  Please read Willem's reply to your post.  It
contains a long list of specific differences between Python (CPython) language
semantics and Common Lisp language semantics that cause Python performance to
suffer.

 OK, let me try this again. My assertion is that with some combination of
 JITting, reorganization of the Python runtime, and optional static
 declarations, Python can be made acceptably fast, which I define as program
 runtimes on the same order of magnitude as those of the same programs in C
 (Java and other languages have established a similar goal). I am not pushing
 optional declarations, as it's worth seeing what we can get out of
 JITting. If you wish to refute this assertion, citing behavior in CPython or
 another implementation is not enough. You have to show that the stated
 feature *cannot* be made to run in an acceptable time.

It's hard to refute your assertion.  You're claiming that some future
hypothetical Python implementation will have excellent performance via a JIT.
On top of that you say that you're willing to change the definition of the
Python language, say by adding type declarations, if an implementation with a
JIT doesn't pan out.  If you change the Python language to address the
semantic problems Willem lists in his post and also add optional type
declarations, then Python becomes closer to Common Lisp, which we know can be
executed efficiently, within the same ballpark as C and Java.

bob
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Vincent Manis
This whole thread has now proceeded to bore me senseless. I'm going to respond 
once with a restatement of what I originally said. Then I'm going to drop it, 
and
never respond to the thread again. Much of what's below has been said by others 
as well; I'm taking no credit for it, just trying to put it together into a 
coherent
framework. 

1. The original question is `Is Python scalable enough for Google' (or, I 
assume 
any other huge application). That's what I was responding to.

2. `Scalable' can mean performance or productivity/reliability/maintenance 
quality.
A number of posters conflated those. I'll deal with p/r/m by saying I'm not 
familiar 
with any study that has taken real enterprise-type programs and compared, e.g., 
Java, Python, and C++ on the p/r/m criteria. Let's leave that issue by saying 
that 
we all enjoy programming in Python, and Python has pretty much the same feature 
set (notably modules) as any other enterprise language. This just leaves us with
performance. 

3. Very clearly CPython can be improved. I don't take most benchmarks very 
seriously, 
but we know that CPython interprets bytecode, and thus suffers relative to 
systems 
that compile into native code, and likely to some other interpretative systems. 
(Lua
has been mentioned, and I recall looking at a presentation by the Lua guys on 
why they
chose a register- rather than a stack-based approach.)

4. Extensions such as numpy can produce tremendous improvements in productivity 
AND
performance. One answer to `is Python scalable' is to rephrase it as `is 
Python+C 
scalable'. 

5. There are a number of JIT projects being considered, and one or more of 
these might 
well hold promise. 

6. Following Scott Meyers' outstanding advice (from his Effective C++ books), 
one should
prefer compile time to runtime wherever possible, if one is concerned about 
performance. 
An implementation that takes hints from programmers, e.g., that a certain 
variable is 
not to be changed, or that a given argument is always an int32, can generate 
special-case
code that is at least in the same ballpark as C, if not as fast. 

This in no way detracts from Python's dynamic nature: these hints would be 
completely
optional, and would not change the semantics of correct programs. (They might 
cause
programs running on incorrect data to crash, but if you want performance, you 
are kind of 
stuck). These hints would `turn off' features that are difficult to compile 
into efficient
code, but would do so only in those parts of a program where, for example, it 
was known that
a given variable contains an int32. Dynamic (hint-free) and somewhat 
less-dynamic (hinted)
code would coexist. This has been done for other languages, and is not a 
radically new 
concept. 

Such hints already exist in the language; __slots__ is an example. 

The language, at least as far as Python 3 is concerned, has pretty much all the 
machinery 
needed to provide such hints. Mechanisms that are recognized specially by a 
high-performance
implementation (imported from a special module, for example) could include: 
annotations, 
decorators, metaclasses, and assignment to special variables like __slots__.
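To make the idea concrete, here is a minimal sketch of hint-bearing
Python that runs unchanged on today's CPython, which simply records the
annotations without acting on them (only __slots__ has any effect now):

```python
class Point:
    # An existing hint: __slots__ fixes the attribute set up front,
    # trading away dynamic attribute creation for a compact layout.
    __slots__ = ("x", "y")

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

def dot(a: Point, b: Point) -> int:
    # The annotations are visible to any tool via __annotations__, but
    # CPython ignores them at run time -- exactly the "optional,
    # semantics-preserving hint" model described above.
    return a.x * b.x + a.y * b.y

print(dot(Point(1, 2), Point(3, 4)))  # 11
```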

7. No implementation of Python at present incorporates JITting and hints fully. 
Therefore, 
the answer to `is CPython performance-scalable' is likely `NO'. Another 
implementation that 
exploited all of the features described here might well have satisfactory 
performance for 
a range of computation-intensive problems. Therefore, the answer to `is the 
Python language 
performance-scalable' might be `we don't know, but there are a number of 
promising implementation
techniques that have been proven to work well in other languages, and may well 
have tremendous
payoff for Python'. 

-- v




-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Steven D'Aprano
On Fri, 13 Nov 2009 18:25:59 -0800, Vincent Manis wrote:

 On 2009-11-13, at 15:32, Paul Rubin wrote:
   This is Usenet so
 please stick with Usenet practices.
 Er, this is NOT Usenet.

Actually it is. I'm posting to comp.lang.python.


 1. I haven't, to the best of my recollection, made a Usenet post in this
 millennium.

Actually you have, you just didn't know it.


 2. I haven't fired up a copy of rn or any other news reader in at least
 2 decades.
 
 3. I'm on the python-list mailing list, reading this with Apple's Mail
 application, which actually doesn't have convenient ways of enforcing
 `Usenet practices' regarding message format.

Nevertheless, the standards for line length for email and Usenet are 
compatible.


 4. If we're going to adhere to tried-and-true message format rules, I
 want my IBM 2260 circa 1970, with its upper-case-only display and weird
 little end-of-line symbols.

No you don't, you're just taking the piss.


 Stephen asked me to wrap my posts. I'm happy to do it. Can we please
 finish this thread off and dispose of it?

My name is actually Steven, but thank you for wrapping your posts.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread John Nagle

Steven D'Aprano wrote:

On Wed, 11 Nov 2009 16:38:50 -0800, Vincent Manis wrote:


I'm having some trouble understanding this thread. My comments aren't
directed at Terry's or Alain's comments, but at the thread overall.

1. The statement `Python is slow' doesn't make any sense to me. Python
is a programming language; it is implementations that have speed or lack
thereof.


Of course you are right, but in common usage, Python refers to CPython, 
and in fact since all the common (and possibly uncommon) implementations 
of Python are as slow or slower than CPython, it's not an unreasonable 
short-hand.


   Take a good look at Shed Skin.  One guy has been able to build a system
that compiles Python to C++, without requiring the user to add annotations
about types.  The system uses type inference to figure it out itself.
You give up some flexibility; a variable can have only one primitive type
in its life, or it can be a class object.  That's enough to simplify the
type analysis to the point that most types can be nailed down before the
program is run.  (Note, though, that the entire program may have to
be analyzed as a whole.  Separate compilation may not work; you need
to see the callers to figure out how to compile the callees.)

   It's 10 to 60x faster than CPython.

   It's the implementation, not the language.  Just because PyPy was a
dud doesn't mean it's impossible. There are Javascript JIT systems
far faster than Python.

   Nor do you really need a JIT system.  (Neither does Java; GCC has
a hard-code Java compiler.  Java is JIT-oriented for historical reasons.
Remember browser applets?)  If you're doing server-side work, the
program's structure and form have usually been fully determined by
the time the program begins execution.

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Rami Chowdhury
On Saturday 14 November 2009 18:42:07 Vincent Manis wrote:
 
 3. Very clearly CPython can be improved. I don't take most benchmarks
  very seriously, but we know that CPython interprets bytecode, and
  thus suffers relative to systems that compile into native code, and
  likely to some other interpretative systems. (Lua has been
  mentioned, and I recall looking at a presentation by the Lua guys on
  why they chose a register rather than stack-based approach.)
 

For those interested in exploring the possible performance benefits of 
Python on a register-based VM, there's Pynie 
(http://code.google.com/p/pynie/)... and there's even a JIT in the works 
for that (http://docs.parrot.org/parrot/1.0.0/html/docs/jit.pod.html)...



Rami Chowdhury
A man with a watch knows what time it is. A man with two watches is 
never sure. -- Segal's Law
408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread Terry Reedy

John Nagle wrote:

Steven D'Aprano wrote:

   Take a good look at Shed Skin.  One guy has been able to build a system
that compiles Python to C++, without requiring the user to add 
annotations about types.


It *only* compiles a subset of Python, as does Cython. Neither can 
(currently) do generators, but that can be done, and probably will be 
eventually, at least for Cython. Much as I love them, they can be 
rewritten by hand as iterator classes and even then are not needed for a 
lot of computational code.


I think both are good pieces of work so far.

  The system uses type inference to figure it out itself.

You give up some flexibility; a variable can have only one primitive type
in its life, or it can be a class object.  That's enough to simplify the
type analysis to the point that most types can be nailed down before the
program is run.  (Note, though, that the entire program may have to
be analyzed as a whole.  Separate compilation may not work; you need
to see the callers to figure out how to compile the callees.)

   It's 10 to 60x faster than CPython.

   It's the implementation, not the language.  Just because PyPy was a
dud doesn't mean it's impossible. There are Javascript JIT systems
far faster than Python.

   Nor do you really need a JIT system.  (Neither does Java; GCC has
a hard-code Java compiler.  Java is JIT-oriented for historical reasons.
Remember browser applets?)  If you're doing server-side work, the
program's structure and form have usually been fully determined by
the time the program begins execution.

John Nagle


--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-14 Thread greg

John Nagle wrote:

   Take a good look at Shed Skin.  ...
You give up some flexibility; a variable can have only one primitive type
in its life, or it can be a class object.  That's enough to simplify the
type analysis to the point that most types can be nailed down before the
program is run.


These restrictions mean that it isn't really quite
Python, though.

--
Greg
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Tim Chase

Steven D'Aprano wrote:
Vincent, could you please fix your mail client, or news 
client, so that it follows the standard for mail and news 
(that is, it has a hard-break after 68 or 72 characters)?


This seems an awfully curmudgeonly reply, given that
word-wrapping is also client-controllable.  Every MUA I've used
has afforded word-wrap including the venerable command-line
mail, mutt, Thunderbird/Seamonkey, pine, Outlook & Outlook
Express...the list goes on.  If you're reading via web-based
portal, if the web-reader doesn't support wrapped lines, (1) that
sounds like a lousy reader and (2) if you absolutely must use
such a borked web-interface, you can always hack it in a good
browser with a greasemonkey-ish script or a user-level CSS
!important attribute to ensure that the div or p in question
wraps even if the site tries to specify otherwise.

There might be some stand-alone news-readers that aren't smart
enough to support word-wrapping/line-breaking, in which case,
join the 80's and upgrade to one that does.  Or even just pipe to
your text editor of choice:  vi, emacs, ed, cat, and even Notepad
has a wrap long lines sort of setting or does the right thing
by default (okay, so cat relies on your console to do the
wrapping, but it does wrap).

I can see complaining about HTML content since not all MUA's
support it.  I can see complaining about top-posting vs. inline
responses because that effects readability.  But when the issue
is entirely controllable on your end, it sounds like a personal
issue.

-tkc


--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Steven D'Aprano
On Fri, 13 Nov 2009 04:48:59 -0600, Tim Chase wrote:

 There might be some stand-alone news-readers that aren't smart enough to
 support word-wrapping/line-breaking, in which case, join the 80's and
 upgrade to one that does.

Of course I can change my software. That fixes the problem for me. Or the 
poster can get a clue and follow the standard -- which may be as simple 
as clicking a checkbox, probably called Wrap text, under Settings 
somewhere -- and fix the problem for EVERYBODY, regardless of what mail 
client or newsreader they're using.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Aaron Watters
On Nov 11, 3:15 pm, Terry Reedy tjre...@udel.edu wrote:
 Robert P. J. Day wrote:
 I can imagine a day when code compiled from Python is routinely
 time-competitive with hand-written C.

That time is now, in many cases.

I still stand by my strategy published in Unix World
ages ago: get it working in Python, profile it, optimize
it, if you need to do it faster code the inner loops in
C.

Of course on google app engine, the last step is not possible,
but I don't think it is needed for 90% of applications
or more.

My own favorite app on google app engine/appspot is

http://listtree.appspot.com/

implemented using whiff
   http://whiff.sourceforge.net
as described in this tutorial
   http://aaron.oirt.rutgers.edu/myapp/docs/W1100_2300.GAEDeploy

not as fast as I would like sporadically.  But that's
certainly not Python's problem because the same application
running on my laptop is *much* faster.

By the way: the GO language smells like Rob Pike,
and I certainly hope it is more successful than
Limbo was.  Of course, if Google decides to really
push it then it's gonna be successful regardless
of all other considerations, just like Sun
did to Java...

   -- Aaron Watters

===
Limbo: how low can you go?


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Brian J Mingus
On Fri, Nov 13, 2009 at 12:19 AM, Steven D'Aprano 
st...@remove-this-cybersource.com.au wrote:

 On Thu, 12 Nov 2009 22:20:11 -0800, Vincent Manis wrote:

  When I was approximately 5, everybody knew that higher level languages
 were too slow for high-speed numeric computation (I actually didn't know
 that then, I was too busy watching Bill and Ben the Flowerpot Men), and
 therefore assembly languages were mandatory. Then IBM developed Fortran, and
 higher-level languages were not too slow for numeric computation.

 Vincent, could you please fix your mail client, or news client, so
 that it follows the standard for mail and news (that is, it has a
 hard-break after 68 or 72 characters)?

 Having to scroll horizontally to read your posts is a real pain.


You're joking, right? Try purchasing a computer manufactured in this
millennium. Monitors are much wider than 72 characters nowadays, old timer.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Terry Reedy

Aaron Watters wrote:

On Nov 11, 3:15 pm, Terry Reedy tjre...@udel.edu wrote:

Robert P. J. Day wrote:
I can imagine a day when code compiled from Python is routinely
time-competitive with hand-written C.


That time is now, in many cases.


By routinely, I meant ***ROUTINELY***, as in:
C becomes the province of specialized tool coders, much like assembly is 
now, while most programmers use Python (or similar languages) because 
they cannot (easily) beat it with hand-coded C.  We are not at 
*that* place yet.



I still stand by my strategy published in Unix World
ages ago: get it working in Python, profile it, optimize
it, if you need to do it faster code the inner loops in
C.


Agreed

 By the way: the GO language smells like Rob Pike,
 and I certainly hope it is more successful than

It still has the stupid, unnecessary, redundant C brackets, given that 
all their example code is nicely indented like Python. That alone is a 
deal killer for me.


Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Paul Rubin
Tim Chase python.l...@tim.thechases.com writes:
 Or even just pipe to
 your text editor of choice:  vi, emacs, ed, cat, and even Notepad
 has a wrap long lines sort of setting or does the right thing
 by default (okay, so cat relies on your console to do the
 wrapping, but it does wrap).

No, auto wrapping long lines looks like crap.  It's better to keep the
line length reasonable when you write the posts.  This is Usenet so
please stick with Usenet practices.  If you want a web forum there are
plenty of them out there.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Robert Brown
Vincent Manis vma...@telus.net writes:

 On 2009-11-11, at 14:31, Alain Ketterlin wrote:
 I'm having some trouble understanding this thread. My comments aren't
 directed at Terry's or Alain's comments, but at the thread overall.

 1. The statement `Python is slow' doesn't make any sense to me. Python is a
 programming language; it is implementations that have speed or lack thereof.

This is generally true, but Python *the language* is specified in a way that
makes executing Python programs quickly very very difficult.  I'm tempted to
say it's impossible, but great strides have been made recently with JITs, so
we'll see.

 2. A skilled programmer could build an implementation that compiled Python
 code into Common Lisp or Scheme code, and then used a high-performance
 Common Lisp compiler such as SBCL, or a high-performance Scheme compiler
 such as Chez Scheme, to produce quite fast code ...

A skilled programmer has done this for Common Lisp.  The CLPython
implementation converts Python source code to Common Lisp code at read time,
which is then compiled.  With SBCL you get native machine code for every
Python expression.

  http://github.com/franzinc/cl-python/
  http://common-lisp.net/project/clpython/

If you want to know why Python *the language* is slow, look at the Lisp code
CLPython generates and at the code implementing the run time.  Simple
operations end up being very expensive.  Does the object on the left side of a
comparison implement compare?  No, then does the right side implement it?  No,
then try something else 
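That left-then-right dispatch is visible from pure Python: a comparison
first asks the left operand, and only falls back to the right operand's
reflected method when the left one declines (standard Python semantics;
the classes here are invented purely for illustration):

```python
class Left:
    def __lt__(self, other):
        # Decline: the interpreter then tries the reflected method
        # on the right-hand operand.
        return NotImplemented

class Right:
    def __gt__(self, other):
        return True

# Left.__lt__ returns NotImplemented, so Right.__gt__ answers.
print(Left() < Right())  # True
```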

I'm sure someone can come up with a faster Python implementation, but it will
have to be very clever.

 This whole approach would be a bad idea, because the compile times would be
 dreadful, but I use this example as an existence proof that Python
 implementations can generate reasonably efficient executable programs.

The compile times are fine, not dreadful.  Give it a try.

 3. It is certainly true that CPython doesn't scale up to environments where
 there are a significant number of processors with shared memory.

Even on one processor, CPython has problems.

I last seriously used CPython to analyze OCRed books.  The code read in the
OCR results for one book at a time, which included the position of every word
on every page.  My books were long, 2000 pages, and dense and I was constantly
fighting address space limitations and CPython slowness related to memory
usage.  I had to resort to packing and unpacking data into Python integers in
order to fit all the OCR data into RAM.
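Bob doesn't show his packing scheme, but the general technique looks
something like the following sketch (the field layout and 20-bit widths
are invented here, not taken from his code):

```python
# Pack a word's (page, x, y) position into one int to avoid the
# per-object overhead of a tuple or a class instance per word.
FIELD = 20                  # arbitrary width for this illustration
MASK = (1 << FIELD) - 1

def pack(page, x, y):
    return (page << (2 * FIELD)) | (x << FIELD) | y

def unpack(packed):
    return (packed >> (2 * FIELD),
            (packed >> FIELD) & MASK,
            packed & MASK)

print(unpack(pack(1500, 812, 3)))  # (1500, 812, 3)
```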

bob
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Robert Brown

Vincent Manis vma...@telus.net writes:
 My point in the earlier post about translating Python into Common Lisp or
 Scheme was essentially saying `look, there's more than 30 years experience
 building high-performance implementations of Lisp languages, and Python
 isn't really that different from Lisp, so we ought to be able to do it too'.

Common Lisp and Scheme were designed by people who wanted to write complicated
systems on machines with a tiny fraction of the horsepower of current
workstations.  They were carefully designed to be compiled efficiently, which
is not the case with Python.  There really is a difference here.  Python the
language has features that make fast implementations extremely difficult.

bob
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Robert Brown

J Kenneth King ja...@agentultra.com writes:

 mcherm mch...@gmail.com writes:
 I think you have a fundamental misunderstanding of the reasons why Python
 is slow. Most of the slowness does NOT come from poor implementations: the
 CPython implementation is extremely well-optimized; the Jython and Iron
 Python implementations use best-in-the-world JIT runtimes. Most of the
 speed issues come from fundamental features of the LANGUAGE itself, mostly
 ways in which it is highly dynamic.

 -- Michael Chermside

 You might be right for the wrong reasons in a way.  Python isn't slow
 because it's a dynamic language.  All the lookups you're citing are highly
 optimized hash lookups.  It executes really fast.

Sorry, but Michael is right for the right reason.  Python the *language* is
slow because it's too dynamic.  All those hash table lookups are unnecessary
in some other dynamic languages and they slow down Python.  A fast
implementation is going to have to be very clever about memoizing method
lookups and invalidating assumptions when methods are dynamically redefined.

 As an implementation though, the sky really is the limit and Python is
 only getting started.

Yes, but Python is starting in the basement.

bob
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Vincent Manis
On 2009-11-13, at 12:46, Brian J Mingus wrote:
 You're joking, right? Try purchasing a computer manufactured in this 
 millennium. Monitors are much wider than 72 characters nowadays, old timer.
I have already agreed to make my postings VT100-friendly. Oh, wait, the VT-100, 
or at least some models of it, had a mode where you could have a line width of 
132 characters. 

And what does this have to do with Python? About as much as an exploding 
penguin 
on your television. 

-- v
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Vincent Manis

On 2009-11-13, at 15:32, Paul Rubin wrote:
   This is Usenet so
 please stick with Usenet practices.  
Er, this is NOT Usenet. 

1. I haven't, to the best of my recollection, made a Usenet post in this 
millennium. 

2. I haven't fired up a copy of rn or any other news reader in at least 2 
decades. 

3. I'm on the python-list mailing list, reading this with Apple's Mail 
application, 
which actually doesn't have convenient ways of enforcing `Usenet practices' 
regarding
message format. 

4. If we're going to adhere to tried-and-true message format rules, I want my 
IBM
2260 circa 1970, with its upper-case-only display and weird little end-of-line 
symbols. 

Stephen asked me to wrap my posts. I'm happy to do it. Can we please finish 
this thread
off and dispose of it? 

-- v

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Vincent Manis

On 2009-11-13, at 17:42, Robert Brown wrote, quoting me: 

 ... Python *the language* is specified in a way that
 makes executing Python programs quickly very very difficult.  
That is untrue. I have mentioned before that optional declarations integrate 
well with dynamic languages. Apart from CL and Scheme, which I have mentioned 
several times, you might check out Strongtalk (typed Smalltalk), and Dylan, 
which was designed for high-performance compilation, though to my knowledge
no Dylan compilers ever really achieved it. 

 I'm tempted to
 say it's impossible, but great strides have been made recently with JITs, so
 we'll see.

 If you want to know why Python *the language* is slow, look at the Lisp code
 CLPython generates and at the code implementing the run time.  Simple
 operations end up being very expensive.  Does the object on the left side of a
 comparison implement compare?  No, then does the right side implement it?  No,
 then try something else 
I've never looked at CLPython. Did it use a method cache (see Peter Deutsch's 
paper on Smalltalk performance in the unfortunately out-of-print `Smalltalk-80:
Bits of History, Words of Advice'? That technique is 30 years old now.

I have more to say, but I'll do that in responding to Bob's next post.

-- v
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread David Robinow
On Fri, Nov 13, 2009 at 3:32 PM, Paul Rubin
http://phr...@nospam.invalid wrote:
 ...  This is Usenet so
 please stick with Usenet practices.  If you want a web forum there are
 plenty of them out there.
 Actually this is python-list@python.org
I don't use usenet and I have no intention to stick with Usenet practices.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Vincent Manis
On 2009-11-13, at 18:02, Robert Brown wrote:

 Common Lisp and Scheme were designed by people who wanted to write complicated
 systems on machines with a tiny fraction of the horsepower of current
 workstations.  They were carefully designed to be compiled efficiently, which
 is not the case with Python.  There really is a difference here.  Python the
 language has features that make fast implementations extremely difficult.

Not true. Common Lisp was designed primarily by throwing together all of the 
features in every Lisp implementation the design committee was interested in. 
Although the committee members were familiar with high-performance compilation, 
the primary impetus was to achieve a standardized language that would be 
acceptable
to the Lisp community. At the time that Common Lisp was started, there was still
some sentiment that Lisp machines were the way to go for performance.  

As for Scheme, it was designed primarily to satisfy an aesthetic of minimalism. 
Even
though Guy Steele's thesis project, Rabbit, was a Scheme compiler, the point 
here was
that relatively simple compilation techniques could produce moderately 
reasonable 
object programs. Chez Scheme was indeed first run on machines that we would 
nowadays
consider tiny, but so too was C++. Oh, wait, so was Python!

I would agree that features such as exec and eval hurt the speed of Python 
programs, 
but the same things do the same thing in CL and in Scheme. There is a mystique 
about
method dispatch, but again, the Smalltalk literature has dealt with this issue 
in the 
past. 

Using Python 3 annotations, one can imagine a Python compiler that does the 
appropriate
thing (shown in the comments) with the following code. 

  import my_module                    # static linking

  __private_functions__ = ['my_fn']   # my_fn doesn't appear in the module dictionary.

  def my_fn(x: python.int32):         # Keeps x in a register
    def inner(z):                     # Lambda-lifts the function; no nonlocal vars,
      return z // 2                   #   so it does not construct a closure
    y = x + 17                        # Via flow analysis, concludes that y can be registerized;
    return inner(2 * y)               # Uses inline integer arithmetic instructions.

  def blarf(a: python.int32):
    return my_fn(a // 2)              # Because my_fn isn't exported, it can be inlined.

A new pragma statement (which I am EXPLICITLY not proposing; I respect and 
support
the moratorium) might be useful in telling the implementation that you don't 
mind
integer overflow. 

Similarly, new library classes might be created to hold arrays of int32s or 
doubles. 

Obviously, no Python system does any of these things today. But there really is 
nothing stopping a Python system from doing any of these things, and the 
technology 
is well-understood in implementations of other languages. 

I am not claiming that this is _better_ than JIT. I like JIT and other runtime 
things
such as method caches better than these because you don't have to know very 
much about 
the implementation in order to take advantage of them. But there may be some 
benefit
in allowing programmers concerned with speed to relax some of Python's dynamism 
without ruining it for the people who need a truly dynamic language. 

If I want to think about scalability seriously, I'm more concerned about 
problems that
Python shares with almost every modern language: if you have lots of processors 
accessing 
a large shared memory, there is a real GC efficiency problem as the number of 
processors 
goes up. On the other hand, if you have a lot of processors with some degree of 
private 
memory sharing a common bus (think the Cell processor), how do we build an 
efficient 
implementation of ANY language for that kind of environment?

Somehow, the issues of Python seem very orthogonal to performance scalability. 

-- v


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Paul Rubin
Vincent Manis vma...@telus.net writes:
 3. I'm on the python-list mailing list, reading this with Apple's
 Mail application, which actually doesn't have convenient ways of
 enforcing `Usenet practices' regarding message format.

Oh, I see.  Damn gateway.

 Stephen asked me to wrap my posts. I'm happy to do it. Can we please
 finish this thread off and dispose of it?

Please wrap to 72 columns or less.  It's easier to read that way.  (I
actually don't care if you do it or not.  If you don't, I'll just
stop responding to you, which might even suit your wishes.)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Paul Rubin
Robert P. J. Day rpj...@crashcourse.ca writes:
 http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1
   thoughts?

I'd bet it's not just about multicore scaling and general efficiency,
but also the suitability of the language itself for large, complex
projects.  It's just not possible to be everything for everybody.
Python is beginner-friendly, has a very fast learning curve for
experienced programmers in other languages, and is highly productive
for throwing small and medium sized scripts together, that are
debugged through iterated testing.  One might say it's optimized for
those purposes.  I use it all the time because a lot of my programming
fits the pattern.  The car analogy is the no-frills electric commuter
car, just hop in and aim it where you want to go; if you crash it,
brush yourself off and restart.  But there are times (large production
applications) when you really want the Airbus A380 with the 100's of
automatic monitoring systems and checkout procedures to follow before
you take off, even if the skill level needed to use it is much higher
than the commuter car.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Vincent Manis
On 2009-11-13, at 19:53, Paul Rubin wrote:

 Robert P. J. Day rpj...@crashcourse.ca writes:
 http://groups.google.com/group/unladen-swallow/browse_thread/thread/4edbc406f544643e?pli=1
  thoughts?
 
 I'd bet it's not just about multicore scaling and general efficiency,
 but also the suitability of the language itself for large, complex
 projects.  It's just not possible to be everything for everybody.
 Python is beginner-friendly, has a very fast learning curve for
 experienced programmers in other languages, and is highly productive
 for throwing small and medium sized scripts together, that are
 debugged through iterated testing.  One might say it's optimized for
 those purposes.  I use it all the time because a lot of my programming
 fits the pattern.  The car analogy is the no-frills electric commuter
 car, just hop in and aim it where you want to go; if you crash it,
 brush yourself off and restart.  But there are times (large production
 applications) when you really want the Airbus A380 with the 100's of
 automatic monitoring systems and checkout procedures to follow before
 you take off, even if the skill level needed to use it is much higher
 than the commuter car.

OK. The quoted link deals with Unladen Swallow, which is an attempt to deal 
with the 
very real performance limitations of current Python systems. The remarks above 
deal with
productivity scalability, which is a totally different matter. So...

People can and do write large programs in Python, not just `throwing...medium 
sized 
scripts together'. Unlike, say, Javascript, it has the necessary machinery to 
build very
large programs that are highly maintainable. One can reasonably compare it with 
Java, C#, 
and Smalltalk; the facilities are comparable, and all of those (as well as 
Python) are 
used for building enterprise systems. 

I believe that the A380's control software is largely written in Ada, which is 
a 
perfectly fine programming language that I would prefer not to write code in. 
For 
approximately 10 years, US DOD pretty much required the use of Ada in military 
(and 
aerospace) software (though a a couple of years ago I discovered that there is 
still 
one remaining source of Jovial compilers that still sells to DOD). According to 
a 
presentation by Lt. Colonel J. A. Hamilton, `Programming Language Policy in the 
DOD:
After The Ada Mandate', given in 1999, `We are unlikely to see a return of a 
programming
language mandate' (www.drew-hamilton.com/stc99/stcAda_99.pdf). As I understand 
it, 
the removal of the Ada mandate came from the realization (20 years after many 
computer
scientists *told* DOD this) that software engineering processes contribute more 
to 
reliability than do programming language structures (c.f. Fred Brooks, `No 
Silver 
Bullet').

So: to sum up, there are lots of large systems where Python might be totally 
appropriate, 
especially if complemented with processes that feature careful specification 
and strong 
automated testing. There are some large systems where Python would definitely 
NOT be 
the language of choice, or even usable at all, because different engineering 
processes 
were in place. 

From a productivity viewpoint, there is no data to say that Python is more, 
less, or equally
scalable than Language X in that it produces correctly-tested, 
highly-maintainable programs
at a lower, higher, or equal cost. I would appreciate it if people who wanted 
to comment on
Python's scalability or lack thereof would give another programming language 
that they would 
compare it with. 

-- v



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Alf P. Steinbach

* Rami Chowdhury:
On Thu, 12 Nov 2009 12:02:11 -0800, Alf P. Steinbach al...@start.no 
wrote:
I think that was in the part you *snipped* here. Just fill in the 
mentioned qualifications and weasel words.


OK, sure. I don't think they're weasel words, because I find them 
useful, but I think I see where you're coming from.


Specifically, I reacted to the statement that it is sheer nonsense 
to talk about the speed of an implementation, made in response to 
someone upthread, in the context of Google finding CPython overall too 
slow.


IIRC it was the speed of a language that was asserted to be nonsense, 
wasn't it?


Yes, upthread.

It's sort of hilarious. g

  Alain Ketterlin:
  slide/page 22 explains why python is so slow

  Vincent Manis (response):
  Python is a programming language; it is implementations that have speed
  or lack thereof

This was step 1 of trying to be more precise than the concept warranted.

Then Steven D'Aprano chimed in, adding even more precision:

  Steven D'Aprano (further down response stack):
  it is sheer nonsense to talk about the speed of an implementation

So no, it's not a language that is slow, it's of course only concrete 
implementations that may have slowness flavoring. And no, not really, they 
don't, because it's just particular aspects of any given implementation that may 
exhibit slowness in certain contexts. And expanding on that trend, later in the 
thread the observation was made that no, not really that either, it's just (if 
it is at all) at this particular point in time, what about the future? Let's be 
precise! Can't have that vague touchy-feely impression about a /language/ being 
slow corrupting the souls of readers.


Hip hurray, Google's observation annulled by the injections of /precision/. :-)


Which IMO is fair -- a physicist friend of mine works with a 
C++ interpreter which is relatively sluggish, but that doesn't mean C++ 
is slow...


Actually, although C++ has the potential for being really really fast (and some 
C++ programs are), the amount of work you have to add to realize the potential 
can be staggering. This is most clearly evidenced by C++'s standard iostreams, 
which have the potential of being much much faster than C FILE i/o (in 
particular Dietmar Kuhl made such an implementation), but where the complexity 
of and the guidance offered by the design is such that nearly all extant 
implementations are painfully slow, even compared to C FILE. So, we generally 
talk about iostreams being slow, knowing full well what we mean and that fast 
implementations are theoretically possible (as evidenced by Dietmar's)  --  but 
fast and slow are in-practice terms, and so what matters is in-practice, 
like, how does your compiler's iostreams implementation hold up.



Cheers,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Robert Brown

Vincent Manis vma...@telus.net writes:

 On 2009-11-13, at 17:42, Robert Brown wrote, quoting me: 

 ... Python *the language* is specified in a way that
 makes executing Python programs quickly very very difficult.  

 That is untrue. I have mentioned before that optional declarations integrate
 well with dynamic languages. Apart from CL and Scheme, which I have
 mentioned several times, you might check out Strongtalk (typed Smalltalk),
 and Dylan, which was designed for high-performance compilation, though to my
 knowledge no Dylan compilers ever really achieved it.

You are not making an argument, just mentioning random facts.  You claim I've
made a false statement, then talk about optional type declarations, which
Python doesn't have.  Then you mention Smalltalk and Dylan.  What's your
point?  To prove me wrong you have to demonstrate that it's not very difficult
to produce a high performance Python system, given current Python semantics.

 I'm tempted to say it's impossible, but great strides have been made
 recently with JITs, so we'll see.

 If you want to know why Python *the language* is slow, look at the Lisp
 code CLPython generates and at the code implementing the run time.  Simple
 operations end up being very expensive.  Does the object on the left side
 of a comparison implement compare?  No, then does the right side implement
 it?  No, then try something else 

 I've never looked at CLPython. Did it use a method cache (see Peter
 Deutsch's paper on Smalltalk performance in the unfortunately out-of-print
 `Smalltalk-80: Bits of History, Words of Advice'? That technique is 30 years
 old now.

Please look at CLPython.  The complexity of some Python operations will make
you weep.  CLPython uses Common Lisp's CLOS method dispatch in various places,
so yes, those method lookups are definitely cached.

Method lookup is just the tip of the iceberg.  How about comparison?  Here are
some comments from CLPython's implementation of compare.  There's a lot going
on.  It's complex and SLOW.

   ;; This function is used in comparisons like , =, ==.
   ;; 
   ;; The CPython logic is a bit complicated; hopefully the following
   ;; is a correct translation.

   ;; If the class is equal and it defines __cmp__, use that.

   ;; The rich comparison operations __lt__, __eq__, __gt__ are
   ;; now called before __cmp__ is called.
   ;; 
   ;; Normally, we take these methods of X.  However, if class(Y)
   ;; is a subclass of class(X), the first look at Y's magic
   ;; methods.  This allows the subclass to override its parent's
   ;; comparison operations.
   ;; 
   ;; It is assumed that the subclass overrides all of
   ;; __{eq,lt,gt}__. For example, if sub.__eq__ is not defined,
   ;; first super.__eq__ is called, and after that __sub__.__lt__
   ;; (or super.__lt__).
   ;; 
   ;; object.c - try_rich_compare_bool(v,w,op) / try_rich_compare(v,w,op)

   ;; Try each `meth'; if the outcome it True, return `res-value'.

   ;; So the rich comparison operations didn't lead to a result.
   ;; 
   ;; object.c - try_3way_compare(v,w)
   ;; 
   ;; Now, first try X.__cmp__ (even if y.class is a subclass of
   ;; x.class) and Y.__cmp__ after that.

   ;; CPython now does some number coercion attempts that we don't
   ;; have to do because we have first-class numbers. (I think.)

   ;; object.c - default_3way_compare(v,w)
   ;; 
   ;; Two instances of same class without any comparison operator,
   ;; are compared by pointer value. Our function `py-id' fakes
   ;; that.

   ;; None is smaller than everything (excluding itself, but that
   ;; is catched above already, when testing for same class;
   ;; NoneType is not subclassable).

   ;; Instances of different class are compared by class name, but
   ;; numbers are always smaller.

   ;; Probably, when we arrive here, there is a bug in the logic
   ;; above. Therefore print a warning.
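One of the rules those comments describe can be observed directly from Python: when the right operand's class is a subclass of the left's and overrides the reflected comparison method, the subclass's method is tried first. A minimal sketch (class names invented for the example):

```python
# Demonstrates subclass priority in rich comparison dispatch:
# for Base() < Sub(), CPython tries Sub.__gt__ (the reflected
# counterpart of __lt__) before Base.__lt__.
calls = []

class Base:
    def __lt__(self, other):
        calls.append("Base.__lt__")
        return NotImplemented

class Sub(Base):
    def __gt__(self, other):   # reflected counterpart of __lt__
        calls.append("Sub.__gt__")
        return True

result = Base() < Sub()   # Sub.__gt__ is consulted first
print(calls)              # ['Sub.__gt__']
print(result)             # True
```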
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Vincent Manis
On 2009-11-13, at 22:51, Alf P. Steinbach wrote:
 It's sort of hilarious. g
It really is, see below. 

 So no, it's not a language that is slow, it's of course only concrete 
 implementations that may have slowness flavoring. And no, not really, they 
 don't, because it's just particular aspects of any given implementation that 
 may exhibit slowness in certain contexts. And expanding on that trend, later 
 in the thread the observation was made that no, not really that either, it's 
 just (if it is at all) at this particular point in time, what about the 
 future? Let's be precise! Can't have that vague touchy-feely impression about 
 a /language/ being slow corrupting the souls of readers.
Because `language is slow' is meaningless. 

An earlier post of mine listed four examples where the common wisdom was `XXX 
is slow' and yet where that 
turned out not to be the case.

Some others. 

1. I once owned a Commodore 64. I got Waterloo Pascal for it. I timed the 
execution of some program 
(this was 25 years ago, I forget what the program did) at 1 second per 
statement. Therefore: `Pascal 
is slow'. 

2. Bell Labs produced a fine programming language called Snobol 4. It was slow. 
But some folks at 
IIT in Chicago did their own implementation, Spitbol, which was fast and 
completely compatible. 
Presto: Snobol 4 was slow, but then it became fast. 

3. If you write the following statements in Fortran IV (the last version of 
Fortran I learned)

      DO 10 I=1, 100
        DO 10 J=1, 100
          A(I, J) = 0.0
   10 CONTINUE

you would paralyze early virtual memory systems, because Fortran IV defined 
arrays to be stored 
in column major order, and the result was extreme thrashing. Many programmers 
did not realize 
this, and would naturally write code like that. Fortran cognoscenti would 
interchange the two 
DO statements and thus convert Fortran from being a slow language to being a 
fast one. 
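The same interchange can be sketched in Python, though with the roles reversed: a list of lists is effectively row major, so here it is the *column-wise* inner loop that strides across storage (illustrative only, not a benchmark):

```python
# Loop interchange sketch: both functions zero the same array, but
# the inner loop's traversal order determines memory locality.
N = 200
a = [[1.0] * N for _ in range(N)]

def zero_row_major(a):
    # inner loop walks along one row: contiguous, cache/VM friendly
    for i in range(len(a)):
        row = a[i]
        for j in range(len(row)):
            row[j] = 0.0

def zero_column_major(a):
    # inner loop jumps between rows: the strided access pattern
    # that thrashed early virtual-memory systems in Fortran IV
    for j in range(len(a[0])):
        for i in range(len(a)):
            a[i][j] = 0.0

zero_row_major(a)
zero_column_major(a)  # same result, very different access pattern
```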

4. When Sun released the original Java system, programs ran very slowly, and 
everybody said 
`I will not use Java, it is a slow language'. Then Sun improved their JVM, and 
other organizations 
wrote their own JVMs which were fast. Therefore Java became a fast language. 

 Actually, although C++ has the potential for being really really fast (and 
 some C++ programs are), the amount of work you have to add to realize the 
 potential can be staggering. This is most clearly evidenced by C++'s standard 
 iostreams, which have the potential of being much much faster than C FILE i/o 
 (in particular Dietmar Kuhl made such an implementation), but where the 
 complexity of and the guidance offered by the design is such that nearly 
 all extant implementations are painfully slow, even compared to C FILE. So, 
 we generally talk about iostreams being slow, knowing full well what we mean 
 and that fast implementations are theoretically possible (as evidenced by 
 Dietmar's)  --  but fast and slow are in-practice terms, and so what 
 matters is in-practice, like, how does your compiler's iostreams 
 implementation hold up.
OK, let me work this one out. Because most iostreams implementations are very 
slow, C++ is a slow 
language. But since Kuhl did a high-performance implementation, he made C++ 
into a fast language. 
But since most people don't use his iostreams implementation, C++ is a slow 
language again, except
for organizations that have banned iostreams (as my previous employers did) 
because it's too slow, 
therefore C++ is a fast language. 

Being imprecise is so much fun! I should write my programs this imprecisely. 

More seriously, when someone says `xxx is a slow language', the only thing they 
can possibly mean 
is `there is no implementation in existence, and no likelihood of an 
implementation being possible, 
that is efficient enough to solve my problem in the required time' or perhaps 
`I must write peculiar
code in order to get programs to run in the specified time; writing code in the 
way the language seems
to encourage produces programs that are too slow'. This is a very sweeping 
statement, and at the very
least ought to be accompanied by some kind of proof. If Python is indeed a slow 
language, then Unladen 
Swallow and pypy, and many other projects, are wastes of time, and should not 
be continued. 

Again, this doesn't have anything to do with features of an implementation that 
are slow or fast. 
The only criterion that makes sense is `do programs run with the required 
performance if written 
in the way the language's inventors encourage'. Most implementations of every 
language have a nook 
or two where things get embarrassingly slow; the question is `are most programs 
unacceptably slow'. 

But, hey, if we are ok with being imprecise, let's go for it. Instead of saying 
`slow' and `fast',
why not say `good' and `bad'?

-- v
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread Robert Brown

Vincent Manis vma...@telus.net writes:

 On 2009-11-13, at 18:02, Robert Brown wrote:

 Common Lisp and Scheme were designed by people who wanted to write
 complicated systems on machines with a tiny fraction of the horsepower of
 current workstations.  They were carefully designed to be compiled
 efficiently, which is not the case with Python.  There really is a
 difference here.  Python the language has features that make fast
 implementations extremely difficult.

 Not true. Common Lisp was designed primarily by throwing together all of the
 features in every Lisp implementation the design committee was interested
 in.  Although the committee members were familiar with high-performance
 compilation, the primary impetus was to achieve a standardized language that
 would be acceptable to the Lisp community. At the time that Common Lisp was
 started, there was still some sentiment that Lisp machines were the way to
 go for performance.

Common Lisp blends together features of previous Lisps, which were designed to
be executed efficiently.  Operating systems were written in these variants.
Execution speed was important.  The Common Lisp standardization committee
included people who were concerned about performance on C-optimized hardware.

 As for Scheme, it was designed primarily to satisfy an aesthetic of
 minimalism. Even though Guy Steele's thesis project, Rabbit, was a Scheme
 compiler, the point here was that relatively simple compilation techniques
 could produce moderately reasonable object programs. Chez Scheme was indeed
 first run on machines that we would nowadays consider tiny, but so too was
 C++. Oh, wait, so was Python!

The Scheme standard has gone through many revisions.  I think we're up to
version 6 at this point.  The people working on it are concerned about
performance.  For instance, see the discussions about whether the order of
evaluating function arguments should be specified.  Common Lisp evaluates
arguments left to right, but Scheme leaves the order unspecified so the
compiler can better optimize.  You can't point to Rabbit (1978 ?) as
representative of the Scheme programming community over the last few decades.

 Using Python 3 annotations, one can imagine a Python compiler that does the
 appropriate thing (shown in the comments) with the following code.

I can imagine a lot too, but we're talking about Python as it's specified
*today*.  The Python language as it's specified today is hard to execute
quickly.  Not impossible, but very hard, which is why we don't see fast Python
systems.

bob
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-13 Thread sturlamolden
On 14 Nov, 08:39, Robert Brown bbr...@speakeasy.net wrote:

  Using Python 3 annotations, one can imagine a Python compiler that does the
  appropriate thing (shown in the comments) with the following code.

 I can imagine a lot too, but we're talking about Python as it's specified
 *today*.  The Python language as it's specified today is hard to execute
 quickly.  Not impossible, but very hard, which is why we don't see fast Python
 systems.

It would not be too difficult to have a compiler like Cython recognize
those annotations instead of current cdefs.

With Cython we can get Python to run at the speed of C just by
adding in optional type declarations for critical variables (most need
not be declared).

With CMUCL and SBCL we can make Common Lisp perform at the speed of
C, for the same reason.

Also a Cython program will usually out-perform most C code. It
combines the strengths of C, Fortran 90 and Python.
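As a sketch of what annotation-driven declarations could look like: the code below is plain Python 3 with invented annotation strings; no current compiler consumes them this way, and CPython itself merely records them in `__annotations__`:

```python
# Hypothetical use of Python 3 annotations as optional type
# declarations. A Cython-like compiler *could* read these to emit
# typed C code; plain CPython treats them as inert metadata.
def dot(xs: "double[]", ys: "double[]") -> "double":
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

print(dot.__annotations__)
# {'xs': 'double[]', 'ys': 'double[]', 'return': 'double'}
print(dot([1.0, 2.0], [3.0, 4.0]))  # 11.0
```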

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread samwyse
On Nov 11, 3:57 am, Robert P. J. Day rpj...@crashcourse.ca wrote:
 http://groups.google.com/group/unladen-swallow/browse_thread/thread/4...

   thoughts?

Google's already given us its thoughts:
http://developers.slashdot.org/story/09/11/11/0210212/Go-Googles-New-Open-Source-Programming-Language
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread mcherm
On Nov 11, 7:38 pm, Vincent Manis vma...@telus.net wrote:
 1. The statement `Python is slow' doesn't make any sense to me.
 Python is a programming language; it is implementations that have
 speed or lack thereof.
   [...]
 2. A skilled programmer could build an implementation that compiled
 Python code into Common Lisp or Scheme code, and then used a
 high-performance Common Lisp compiler...

I think you have a fundamental misunderstanding of the reasons why
Python is
slow. Most of the slowness does NOT come from poor implementations:
the CPython
implementation is extremely well-optimized; the Jython and Iron Python
implementations use best-in-the-world JIT runtimes. Most of the speed
issues
come from fundamental features of the LANGUAGE itself, mostly ways in
which
it is highly dynamic.

In Python, a piece of code like this:
len(x)
needs to watch out for the following:
* Perhaps x is a list OR
  * Perhaps x is a dict OR
  * Perhaps x is a user-defined type that declares a __len__
method OR
  * Perhaps a superclass of x declares __len__ OR
* Perhaps we are running the built-in len() function OR
  * Perhaps there is a global variable 'len' which shadows the
built-in OR
  * Perhaps there is a local variable 'len' which shadows the
built-in OR
  * Perhaps someone has modified __builtins__
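A few of those possibilities are easy to demonstrate (the `Bag` class is invented for the example):

```python
# Shadowing the builtin len() and dispatching to a user-defined
# __len__ -- the dynamism that keeps a compiler from assuming
# len(x) means one fixed operation.
import builtins

class Bag:
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):                 # user-defined __len__
        return sum(1 for _ in self.items)

x = Bag("abc")
print(len(x))                          # 3: dispatches to Bag.__len__

def len(seq):                          # module global shadows the builtin
    return -1

print(len(x))                          # -1: the shadow wins
print(builtins.len(x))                 # 3: the real builtin is still there
del len                                # drop the shadow
print(len(x))                          # 3 again
```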

In Python it is possible for other code, outside your module to go in
and
modify or replace some methods from your module (a feature called
monkey-patching which is SOMETIMES useful for certain kinds of
testing).
There are just so many things that can be dynamic (even if 99% of the
time
they are NOT dynamic) that there is very little that the compiler can
assume.
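A minimal illustration of monkey-patching (the `Greeter` class is invented for the example):

```python
# Code outside a class can replace its methods at run time, so even
# a.greet() cannot be compiled to a fixed call target.
class Greeter:
    def greet(self):
        return "hello"

a = Greeter()
print(a.greet())                 # hello

def sarcastic_greet(self):       # patched in from "outside"
    return "oh, hi"

Greeter.greet = sarcastic_greet  # every instance is affected
print(a.greet())                 # oh, hi
```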

So whether you implement it in C, compile to CLR bytecode, or
translate into
Lisp, the computer is still going to have to to a whole bunch of
lookups to
make certain that there isn't some monkey business going on, rather
than
simply reading a single memory location that contains the length of
the list.
Brett Cannon's thesis is an example: he attempted desperate measures
to
perform some inferences that would allow performing these
optimizations
safely and, although a few of them could work in special cases, most
of the
hoped-for improvements were impossible because of the dynamic nature
of the
language.

I have seen a number of attempts to address this, either by placing
some
restrictions on the dynamic nature of the code (but that would change
the
nature of the Python language) or by having some sort of a JIT
optimize the
common path where we don't monkey around. Unladen Swallow and PyPy are
two
such efforts that I find particularly promising.

But it isn't NEARLY as simple as you make it out to be.

-- Michael Chermside
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Joel Davis
On Nov 12, 10:07 am, mcherm mch...@gmail.com wrote:
 On Nov 11, 7:38 pm, Vincent Manis vma...@telus.net wrote:
 
   1. The statement `Python is slow' doesn't make any sense to me.
     [...]
 
  I think you have a fundamental misunderstanding of the reasons why
  Python is slow. Most of the slowness does NOT come from poor
  implementations: the CPython implementation is extremely
  well-optimized; the Jython and Iron Python implementations use
  best-in-the-world JIT runtimes. Most of the speed issues come from
  fundamental features of the LANGUAGE itself, mostly ways in which it
  is highly dynamic.
 
  [...]
 
  But it isn't NEARLY as simple as you make it out to be.
 
  -- Michael Chermside

obviously the GIL is a major reason it's so slow. That's one of the
_stated_ reasons why Google has decided to forgo CPython code. As far
as how sweeping the directive is, I don't know, since the situation
would sort of resolve itself if one committed to building Jython
applications or just waited until Unladen Swallow is finished.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Steven D'Aprano
On Thu, 12 Nov 2009 08:35:23 -0800, Joel Davis wrote:

 obviously the GIL is a major reason it's so slow. 

No such obviously about it.

There have been attempts to remove the GIL, and they lead to CPython 
becoming *slower*, not faster, for the still common case of single-core 
processors.
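A quick sketch of the single-core trade-off: under the GIL, threading a CPU-bound loop buys nothing, which is exactly the common case a GIL-free CPython must not slow down (illustrative only; actual timings vary by machine):

```python
# CPU-bound bytecode does not run in parallel under CPython's GIL,
# so splitting this loop across two threads typically gives no
# speedup -- it just adds thread-switching overhead.
import threading

def count_down(n, out, i):
    while n:
        n -= 1
    out[i] = True

N = 1_000_000
done = [False, False]

count_down(N, done, 0)  # sequential baseline: one big loop

done = [False, False]
t1 = threading.Thread(target=count_down, args=(N // 2, done, 0))
t2 = threading.Thread(target=count_down, args=(N // 2, done, 1))
t1.start(); t2.start()
t1.join(); t2.join()
print(all(done))  # True: same work done, just not in parallel
```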

And neither Jython nor IronPython have the GIL. Jython appears to scale 
slightly better than CPython, but for small data sets, is slower than 
CPython. IronPython varies greatly in performance, ranging from nearly 
twice as fast as CPython on some benchmarks to up to 6000 times slower!

http://www.smallshire.org.uk/sufficientlysmall/2009/05/22/ironpython-2-0-and-jython-2-5-performance-compared-to-python-2-5/

http://ironpython-urls.blogspot.com/2009/05/python-jython-and-ironpython.html


Blaming CPython's supposed slowness on the GIL is superficially plausible 
but doesn't stand up to scrutiny. The speed of an implementation depends 
on many factors, and it also depends on *what you measure* -- it is sheer 
nonsense to talk about the speed of an implementation. Different tasks 
run at different speeds, and there is no universal benchmark.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Alf P. Steinbach

* Steven D'Aprano:

On Thu, 12 Nov 2009 08:35:23 -0800, Joel Davis wrote:

obviously the GIL is a major reason it's so slow. 



http://en.wikipedia.org/wiki/Global_Interpreter_Lock

Uh oh...



No such obviously about it.

There have been attempts to remove the GIL, and they lead to CPython 
becoming *slower*, not faster, for the still common case of single-core 
processors.


And neither Jython nor IronPython have the GIL. Jython appears to scale 
slightly better than CPython, but for small data sets, is slower than 
CPython. IronPython varies greatly in performance, ranging from nearly 
twice as fast as CPython on some benchmarks to up to 6000 times slower!


http://www.smallshire.org.uk/sufficientlysmall/2009/05/22/ironpython-2-0-and-jython-2-5-performance-compared-to-python-2-5/

http://ironpython-urls.blogspot.com/2009/05/python-jython-and-ironpython.html


Blaming CPython's supposed slowness


Hm, this seems religious.

Of course Python is slow: if you want speed, pay for it by complexity.

It so happens that I think CPython could have been significantly faster, but (1) 
doing that would amount to creating a new implementation, say, C++Python g, 
and (2) what for, really?, since CPU-intensive things should be offloaded to 
other language code anyway.



on the GIL is superficially plausible 
but doesn't stand up to scrutiny. The speed of an implementation depends 
on many factors, and it also depends on *what you measure* -- it is sheer 
nonsense to talk about the speed of an implementation. Different tasks 
run at different speeds, and there is no universal benchmark.


This also seems religious. It's like in Norway it became illegal to market lemon 
soda, since umpteen years ago it's soda with lemon flavoring. This has to do 
with the *origin* of the citric acid, whether natural or chemist's concoction, 
no matter that it's the same chemical. So, some people think that it's wrong to 
talk about interpreted languages, hey, it should be a language designed for 
interpretation, or better yet, dynamic language, or bestest, language with 
dynamic flavor. And slow language, oh no, should be language whose current 
implementations are perceived as somewhat slow by some (well, all) people, but 
of course, that's just silly.



Cheers,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread J Kenneth King
mcherm mch...@gmail.com writes:

 On Nov 11, 7:38 pm, Vincent Manis vma...@telus.net wrote:
 1. The statement `Python is slow' doesn't make any sense to me.
   [...]

 I think you have a fundamental misunderstanding of the reasons why
 Python is slow. Most of the slowness does NOT come from poor
 implementations: the CPython implementation is extremely
 well-optimized; the Jython and Iron Python implementations use
 best-in-the-world JIT runtimes. Most of the speed issues come from
 fundamental features of the LANGUAGE itself, mostly ways in which it
 is highly dynamic.

 [...]

 But it isn't NEARLY as simple as you make it out to be.

 -- Michael Chermside

You might be right for the wrong reasons in a way.

Python isn't slow because it's a dynamic language.  All the lookups
you're citing are highly optimized hash lookups.  They execute really
fast.

The OP is talking about scale.  Some people say Python is slow at a
certain scale.  I say that's about true for any language.  Large amounts
of IO is a tough problem.

Where Python might get hit *as a language* is that the Python programmer
has to drop into C to implement optimized data-structures for dealing
with the kind of IO that would slow down the Python interpreter.  That's
why we have numpy, scipy, etc.  The special cases it takes to solve
problems with custom types wasn't special enough to alter the language.
Scale is a special case believe it or not.
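One stdlib-only illustration of that drop-into-C pattern (no numpy needed): the builtin `sum` runs its loop inside CPython's C code, so the reduction below never executes per-iteration bytecode. A sketch, not a benchmark:

```python
# Same reduction two ways: interpreted bytecode per iteration vs.
# delegated to a C-implemented builtin -- the usual first step
# before reaching for numpy/scipy.
data = list(range(100_000))

def total_pure(xs):
    acc = 0
    for x in xs:       # every iteration is interpreted bytecode
        acc += x
    return acc

def total_c(xs):
    return sum(xs)     # the loop runs inside CPython's C code

assert total_pure(data) == total_c(data) == 4_999_950_000
```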

As an implementation though, the sky really is the limit and Python is
only getting started.  Give it another 40 years and it'll probably
realize that it's just another Lisp. ;)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Rami Chowdhury
On Thu, 12 Nov 2009 09:32:28 -0800, Alf P. Steinbach al...@start.no  
wrote:


This also seems religious. It's like in Norway it became illegal to  
market lemon soda, since umpteen years ago it's soda with lemon  
flavoring. This has to do with the *origin* of the citric acid, whether  
natural or chemist's concoction, no matter that it's the same chemical.  
So, some people think that it's wrong to talk about interpreted  
languages, hey, it should be a language designed for interpretation,  
or better yet, dynamic language, or bestest, language with dynamic  
flavor. And slow language, oh no, should be language whose current  
implementations are perceived as somewhat slow by some (well, all)  
people, but of course, that's just silly.


Perhaps I'm missing the point of what you're saying but I don't see why  
you're conflating interpreted and dynamic here? Javascript is unarguably a  
dynamic language, yet Chrome / Safari 4 / Firefox 3.5 all typically JIT  
it. Does that make Javascript non-dynamic, because it's compiled? What  
about Common Lisp, which is a compiled language when it's run with CMUCL  
or SBCL?



--
Rami Chowdhury
Never attribute to malice that which can be attributed to stupidity --  
Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Alf P. Steinbach

* Rami Chowdhury:
On Thu, 12 Nov 2009 09:32:28 -0800, Alf P. Steinbach al...@start.no 
wrote:


This also seems religious. It's like in Norway it became illegal to 
market lemon soda, since umpteen years ago it's soda with lemon 
flavoring. This has to do with the *origin* of the citric acid, 
whether natural or chemist's concoction, no matter that it's the same 
chemical. So, some people think that it's wrong to talk about 
interpreted languages, hey, it should be a language designed for 
interpretation, or better yet, dynamic language, or bestest, 
language with dynamic flavor. And slow language, oh no, should be 
language whose current implementations are perceived as somewhat slow 
by some (well, all) people, but of course, that's just silly.


Perhaps I'm missing the point of what you're saying but I don't see why 
you're conflating interpreted and dynamic here? Javascript is unarguably 
a dynamic language, yet Chrome / Safari 4 / Firefox 3.5 all typically 
JIT it. Does that make Javascript non-dynamic, because it's compiled? 
What about Common Lisp, which is a compiled language when it's run with 
CMUCL or SBCL?


Yeah, you missed it.

Blurring and coloring and downright hiding reality by insisting on misleading 
but apparently more precise terminology for some vague concept is a popular 
sport, and chiding others for using more practical and real-world oriented 
terms can be effective in politics and some other arenas.


But in a technical context it's silly. Or dumb. Whatever.

E.g. you'll find it impossible to define interpretation rigorously in the sense 
that you're apparently thinking of. It's not that kind of term or concept. The 
nearest you can get is in a different direction, something like a program whose 
actions are determined by data external to the program (+ x qualifications and 
weasel words), which works in-practice, conceptually, but try that on as a 
rigorous definition and you'll see that when you get formal about it then it's 
completely meaningless: either anything qualifies or nothing qualifies.


You'll also find it impossible to rigorously define dynamic language in a 
general way so that that definition excludes C++. <g>


So, to anyone who understands what one is talking about, interpreted, or e.g. 
slow language (as was the case here), conveys the essence.


And to anyone who doesn't understand it trying to be more precise is an exercise 
in futility and pure silliness  --  except for the purpose of misleading.



Cheers  hth.,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Rami Chowdhury
On Thu, 12 Nov 2009 11:24:18 -0800, Alf P. Steinbach al...@start.no  
wrote:



* Rami Chowdhury:
On Thu, 12 Nov 2009 09:32:28 -0800, Alf P. Steinbach al...@start.no  
wrote:


This also seems religious. It's like in Norway it became illegal to  
market lemon soda, since umpteen years ago it's soda with lemon  
flavoring. This has to do with the *origin* of the citric acid,  
whether natural or chemist's concoction, no matter that it's the same  
chemical. So, some people think that it's wrong to talk about  
interpreted languages, hey, it should be a language designed for  
interpretation, or better yet, dynamic language, or bestest,  
language with dynamic flavor. And slow language, oh no, should be  
language whose current implementations are perceived as somewhat slow  
by some (well, all) people, but of course, that's just silly.
 Perhaps I'm missing the point of what you're saying but I don't see  
why you're conflating interpreted and dynamic here? Javascript is  
unarguably a dynamic language, yet Chrome / Safari 4 / Firefox 3.5 all  
typically JIT it. Does that make Javascript non-dynamic, because it's  
compiled? What about Common Lisp, which is a compiled language when  
it's run with CMUCL or SBCL?


Yeah, you missed it.

Blurring and coloring and downright hiding reality by insisting on  
misleading but apparently more precise terminology for some vague  
concept is a popular sport, and chiding others for using more practical  
and real-world oriented terms, can be effective in politics and some  
other arenas.





But in a technical context it's silly. Or dumb. Whatever.

E.g. you'll find it impossible to define interpretation rigorously in  
the sense that you're apparently thinking of.


Well, sure. Can you explain, then, what sense you meant it in?

You'll also find it impossible to rigorously define dynamic language  
in a general way so that that definition excludes C++. <g>


Or, for that matter, suitably clever assembler. I'm not arguing with you  
there.


So, to anyone who understands what one is talking about, interpreted,  
or e.g. slow language (as was the case here), conveys the essence.


Not when the context isn't clear, it doesn't.

And to anyone who doesn't understand it trying to be more precise is an  
exercise in futility and pure silliness  --  except for the purpose of  
misleading.


Or for the purpose of greater understanding, surely - and isn't that the  
point?



--
Rami Chowdhury
Never attribute to malice that which can be attributed to stupidity --  
Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Alf P. Steinbach

* Rami Chowdhury:
On Thu, 12 Nov 2009 11:24:18 -0800, Alf P. Steinbach al...@start.no 
wrote:



* Rami Chowdhury:
On Thu, 12 Nov 2009 09:32:28 -0800, Alf P. Steinbach al...@start.no 
wrote:


This also seems religious. It's like in Norway it became illegal to 
market lemon soda, since umpteen years ago it's soda with lemon 
flavoring. This has to do with the *origin* of the citric acid, 
whether natural or chemist's concoction, no matter that it's the 
same chemical. So, some people think that it's wrong to talk about 
interpreted languages, hey, it should be a language designed for 
interpretation, or better yet, dynamic language, or bestest, 
language with dynamic flavor. And slow language, oh no, should be 
language whose current implementations are perceived as somewhat 
slow by some (well, all) people, but of course, that's just silly.
 Perhaps I'm missing the point of what you're saying but I don't see 
why you're conflating interpreted and dynamic here? Javascript is 
unarguably a dynamic language, yet Chrome / Safari 4 / Firefox 3.5 
all typically JIT it. Does that make Javascript non-dynamic, because 
it's compiled? What about Common Lisp, which is a compiled language 
when it's run with CMUCL or SBCL?


Yeah, you missed it.

Blurring and coloring and downright hiding reality by insisting on 
misleading but apparently more precise terminology for some vague 
concept is a popular sport, and chiding others for using more 
practical and real-world oriented terms, can be effective in politics 
and some other arenas.





But in a technical context it's silly. Or dumb. Whatever.

E.g. you'll find it impossible to define interpretation rigorously in 
the sense that you're apparently thinking of.


Well, sure. Can you explain, then, what sense you meant it in?


I think that was in the part you *snipped* here. Just fill in the mentioned 
qualifications and weasel words. And considering that a routine might be an 
interpreter of data produced elsewhere in the program, it needs some fixing...



You'll also find it impossible to rigorously define dynamic language 
in a general way so that that definition excludes C++. <g>


Or, for that matter, suitably clever assembler. I'm not arguing with you 
there.


So, to anyone who understands what one is talking about, 
interpreted, or e.g. slow language (as was the case here), conveys 
the essence.


Not when the context isn't clear, it doesn't.

And to anyone who doesn't understand it trying to be more precise is 
an exercise in futility and pure silliness  --  except for the purpose 
of misleading.


Or for the purpose of greater understanding, surely - and isn't that the 
point?


I don't think that was the point.

Specifically, I reacted to the statement that it is sheer nonsense to talk 
about the speed of an implementation, made in response to someone upthread, 
in the context of Google finding CPython overall too slow.


It is quite slow. ;-)


Cheers,

- Alf
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Rami Chowdhury
On Thu, 12 Nov 2009 12:02:11 -0800, Alf P. Steinbach al...@start.no  
wrote:
I think that was in the part you *snipped* here. Just fill in the  
mentioned qualifications and weasel words.


OK, sure. I don't think they're weasel words, because I find them useful,  
but I think I see where you're coming from.


Specifically, I reacted to the statement that it is sheer nonsense to  
talk about the speed of an implementation, made in response to  
someone upthread, in the context of Google finding CPython overall too  
slow.


IIRC it was the speed of a language that was asserted to be nonsense,  
wasn't it? Which IMO is fair -- a physicist friend of mine works with a  
C++ interpreter which is relatively sluggish, but that doesn't mean C++ is  
slow...


--
Rami Chowdhury
Never attribute to malice that which can be attributed to stupidity --  
Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Benjamin Kaplan
On Thu, Nov 12, 2009 at 2:24 PM, Alf P. Steinbach al...@start.no wrote:

 You'll also find it impossible to rigorously define dynamic language in a
 general way so that that definition excludes C++. <g>

 So, to anyone who understands what one is talking about, interpreted, or
 e.g. slow language (as was the case here), conveys the essence.

 And to anyone who doesn't understand it trying to be more precise is an
 exercise in futility and pure silliness  --  except for the purpose of
 misleading.

You just made Rami's point. You can't define a language as <insert
word here>. You can, however, describe what features it has - static vs.
dynamic typing, duck-typing, dynamic dispatch, and so on. Those are
features of the language. Other things, like interpreted vs.
compiled, are features of the implementation. C++, for instance, is
considered a language that gets compiled to machine code. However,
Visual Studio can compile C++ programs to run on the .NET framework,
which makes them JIT compiled. Someone could even write an
interpreter for C++ if they wanted to.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Rami Chowdhury
On Thu, 12 Nov 2009 12:44:00 -0800, Benjamin Kaplan  
benjamin.kap...@case.edu wrote:



Someone could even write an
interpreter for C++ if they wanted to.


Someone has (http://root.cern.ch/drupal/content/cint)!

--
Rami Chowdhury
Never attribute to malice that which can be attributed to stupidity --  
Hanlon's Razor

408-597-7068 (US) / 07875-841-046 (UK) / 0189-245544 (BD)
--
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Steven D'Aprano
On Thu, 12 Nov 2009 21:02:11 +0100, Alf P. Steinbach wrote:

 Specifically, I reacted to the statement that it is sheer nonsense to
 talk about the speed of an implementation, made in response to
 someone upthread, in the context of Google finding CPython overall too
 slow.
 
 It is quite slow. ;-)

Quite slow to do what? Quite slow compared to what?

I think you'll find using CPython to sort a list of ten million integers 
will be quite a bit faster than using bubblesort written in C, no matter 
how efficient the C compiler.
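The point can be sketched in a few lines (an illustrative timing, not from
the original post, and using a pure-Python bubblesort rather than a C one --
which only strengthens the comparison at scale): the algorithm dominates
long before the language does.

```python
import random
import time

def bubble_sort(a):
    # O(n**2) exchange sort -- the algorithm, not the language, is the
    # bottleneck here.
    a = a[:]
    n = len(a)
    for i in range(n):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

data = [random.randrange(10**6) for _ in range(2000)]

t0 = time.perf_counter()
builtin = sorted(data)          # CPython's C-implemented timsort
t1 = time.perf_counter()

t2 = time.perf_counter()
bubbled = bubble_sort(data)     # quadratic, even at this modest size
t3 = time.perf_counter()

assert builtin == bubbled
# Even at n = 2000 the built-in sort wins by orders of magnitude; a C
# bubblesort would still lose badly to timsort at n = 10**7.
assert (t1 - t0) < (t3 - t2)
```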

And why are we limiting ourselves to integers representable by the native 
C int? What if the items in the list were of the order of 2**100? Or 
if they were mixed integers, fractions, fixed-point decimals, and 
floating-point binaries? How fast is your C code going to be now? That's 
going to depend on the C library you use, isn't it? In other words, it is 
an *implementation* issue, not a *language* issue.

Okay, let's keep it simple. Stick to numbers representable by native C 
ints. Around this point, people start complaining that it's not fair: I'm 
not comparing apples with apples. Why am I comparing a highly-optimized, 
incredibly fast sort method in CPython with a lousy O(N**2) algorithm in 
C? To make meaningful comparisons, you have to make sure the algorithms 
are the same, so the two language implementations do the same amount of 
work. (Funnily enough, it's unfair to play to Python's strengths, and 
fair to play to C's strengths.)

Then people invariably try to compare (say) something in C involving low-
level bit-twiddling or pointer arithmetic with something in CPython 
involving high-level object-oriented programming. Of course CPython is 
slow if you use it to do hundreds of times more work in every operation 
-- that's comparing apples with oranges again, but somehow people think 
that's okay when your intention is to prove Python is slow.

An apples-to-apples comparison would be to use a framework in C which 
offered the equivalent features as Python: readable syntax (executable 
pseudo-code), memory management, garbage disposal, high-level objects, 
message passing, exception handling, dynamic strong typing, and no core 
dumps ever.

If you did that, you'd get something that runs much closer to the speed 
of CPython, because that's exactly what CPython is: a framework written 
in C that provides all those extra features.

(That's not to say that Python-like high-level languages can't, in 
theory, be significantly faster than CPython, or that they can't have JIT 
compilers that emit highly efficient -- in space or time -- machine code. 
That's what Psyco does, now, and that's the aim of PyPy.)

However, there is one sense that Python *the language* is slower than 
(say) C the language. Python requires that an implementation treat the 
built-in function (say) int as an object subject to modification by the 
caller, while C requires that it is a reserved word. So when a C compiler 
sees int, it can optimize the call to a known low-level routine, while 
a Python compiler can't make this optimization. It *must* search the 
entire scope looking for the first object called 'int' it finds, then 
search the object's scope for a method called '__call__', then execute 
that. That's the rules for Python, and an implementation that does 
something else isn't Python. Even though the searching is highly 
optimized, if you call int() one million times, any Python implementation 
*must* perform that search one million times, which adds up. Merely 
identifying what function to call is O(N) at runtime for Python and O(1) 
at compile time for C.
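A minimal sketch of that rule (not from the original post): an unchanged
function must honour whatever the name `int` means at call time, which is
exactly the optimization a C compiler is free to make and a conforming
Python implementation is not.

```python
def parse():
    # 'int' is resolved at *call* time: module globals first, then
    # builtins. Nothing here can be bound at compile time.
    return int("42")

assert parse() == 42

# Shadow the builtin at module scope: the same bytecode in parse() now
# finds a different object, so its behaviour changes without being edited.
int = lambda s: "shadowed"
assert parse() == "shadowed"

# Deleting the global restores the fallback to the builtin.
del int
assert parse() == 42
```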

Note though that JIT compilers like Psyco can often take shortcuts and 
speed up code by a factor of 2, or up to 100 in the best cases, which 
brings the combination of CPython + Psyco within shouting distance of the 
speed of the machine code generated by good optimizing C compilers. Or 
you can pass the work onto an optimized library or function call that 
avoids the extra work. Like I said, there is no reason for Python 
*applications* to be slow.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Vincent Manis
When I was approximately 5, everybody knew that higher level languages were too 
slow for high-speed numeric computation (I actually didn't know that then, I 
was too busy watching Bill and Ben the Flowerpot Men), and therefore assembly 
languages were mandatory. Then IBM developed Fortran, and higher-level 
languages were not too slow for numeric computation. 

When I was in university, IBM released a perfectly horrible implementation of 
PL/I, which dynamically allocated and freed stack frames for each procedure 
entry and exit (`Do Not Use Procedures: They Are Inefficient': section heading 
from the IBM PL/I (G) Programmer's Guide, circa 1968). Everyone knew PL/I was 
an abomination of a language, which could not be implemented efficiently. Then 
MIT/Bell Labs/GE/Honeywell wrote Multics in a PL/I subset, and (eventually) it 
ran quite efficiently. 

When Bell Labs pulled out of the Multics effort, some of their researchers 
wrote the first version of Unix in assembly language, but a few years later 
rewrote the kernel in C. Their paper reporting this included a sentence that 
said in effect, `yes, the C version is bigger and slower than the assembler 
version, but it has more functionality, so C isn't so bad'. Everybody knew that 
high-level languages were too inefficient to write an operating system in (in 
spite of the fact that Los Alamos had already written an OS in a Fortran 
dialect). Nobody knew that at about that time, IBM had started writing new OS 
modules in a company-confidential PL/I subset. 

When I was in grad school, everybody knew that an absolute defence to a student 
project running slowly was `I wrote it in Lisp'; we only had a Lisp interpreter 
running on our system. We didn't have MacLisp, which had been demonstrated to 
compile carefully-written numerical programs into code that ran more 
efficiently than comparable programs compiled by DEC's PDP-10 Fortran compiler 
in optimizing mode. 

In an earlier post, I mentioned SBCL and Chez Scheme, highly optimizing 
compiler-based implementations of Common Lisp and Scheme, respectively. I don't 
have numbers for SBCL, but I know that (again with carefully-written Scheme 
code) Chez Scheme can produce code that runs in the same order of magnitude as 
optimized C code. These are both very old systems that, at least in the case of 
Chez Scheme, use techniques that have been reported in the academic literature. 
My point in the earlier post about translating Python into Common Lisp or 
Scheme was essentially saying `look, there's more than 30 years experience 
building high-performance implementations of Lisp languages, and Python isn't 
really that different from Lisp, so we ought to be able to do it too'. 

All of which leads me to summarize the current state of things. 

1. Current Python implementations may or may not be performance-scalable in 
ways we need. 

2. Reorganized interpreters may give us a substantial improvement in 
performance. More significant improvements would require a JIT compiler, and 
there are good projects such as Unladen Swallow that may well deliver a 
substantial improvement. 

3. We might also get improvements from good use of Python 3 annotations, or 
other pragma style constructs that might be added to the language after the 
moratorium, which would give a compiler additional information about the 
programmer's intent. (For example, Scheme has a set of functions that 
essentially allow a programmer to say, `I am doing integer arithmetic with 
values that are limited in range to what can be stored in a machine word'.) 
These annotations wouldn't destroy the dynamic nature of Python, because they 
are purely optional. This type of language feature would allow a programmer to 
exploit the high-performance compilation technologies that are common in the 
Lisp world. 
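A small sketch of what point (3) looks like today (illustrative only, and
`dot` is a made-up example, not something from the thread): Python 3
annotations are purely optional metadata that a hypothetical optimizing
compiler could consult, while the interpreter itself ignores them.

```python
def dot(xs: list, ys: list) -> int:
    # The annotations are pure metadata in CPython: optional, inspectable
    # by tools, and ignored at execution time -- so the dynamic nature of
    # the language is untouched.
    return sum(x * y for x, y in zip(xs, ys))

assert dot([1, 2, 3], [4, 5, 6]) == 32

# The metadata is available for an optimizer (or anything else) to read.
assert dot.__annotations__["return"] is int
```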

Even though points (2) and (3) between them offer a great deal of hope for 
future Python implementations, there is much that can be done with our current 
implementations. Just ask the programmer who writes a loop that laboriously 
does what could be done much more quickly with a list comprehension or with 
map. 
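That last point can be measured directly (an illustrative sketch, not part
of the original post): the comprehension moves the loop-and-append machinery
into the interpreter's C core, and typically runs about twice as fast as the
explicit loop.

```python
import timeit

setup = "data = list(range(100000))"

# The laborious explicit loop...
loop_stmt = """
squares = []
for x in data:
    squares.append(x * x)
"""

# ...versus the equivalent list comprehension.
comp_stmt = "squares = [x * x for x in data]"

t_loop = timeit.timeit(loop_stmt, setup=setup, number=20)
t_comp = timeit.timeit(comp_stmt, setup=setup, number=20)

# Both produce the same result; the comprehension just gets there faster.
assert [x * x for x in range(5)] == [0, 1, 4, 9, 16]
assert t_comp < t_loop
```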

-- v


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Steven D'Aprano
On Thu, 12 Nov 2009 22:20:11 -0800, Vincent Manis wrote:

 When I was approximately 5, everybody knew that higher level languages were 
 too slow for high-speed numeric computation (I actually didn't know that 
 then, I was too busy watching Bill and Ben the Flowerpot Men), and therefore 
 assembly languages were mandatory. Then IBM developed Fortran, and 
 higher-level languages were not too slow for numeric computation.

Vincent, could you please fix your mail client, or news client, so 
that it follows the standard for mail and news (that is, it has a 
hard-break after 68 or 72 characters)?

Having to scroll horizontally to read your posts is a real pain.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python simply not scaleable enough for google?

2009-11-12 Thread Vincent Manis
On 2009-11-12, at 23:19, Steven D'Aprano wrote:
 On Thu, 12 Nov 2009 22:20:11 -0800, Vincent Manis wrote:
 
 Vincent, could you please fix your mail client, or news client, so 
 that it follows the standard for mail and news (that is, it has a 
 hard-break after 68 or 72 characters)?
My apologies. Will do. 

 Having to scroll horizontally to read your posts is a real pain.
At least you're reading them. :)

-- v
-- 
http://mail.python.org/mailman/listinfo/python-list

