Previous version posted at http://lambda-the-ultimate.org/node/531#comment-23457 on 2006-12-25.
This is a partial rebuttal to Alan Kay's occasional assertion that computers aren't nearly as much faster at executing late-bound things like Smalltalk as you would expect from Moore's Law. In an interview with ACM Queue, Kay says [7]:

    Just as an aside, to give you an interesting benchmark --- on roughly the same system, roughly optimized the same way, a benchmark from 1979 at Xerox PARC runs only 50 times faster today. Moore’s law has given us somewhere between 40,000 and 60,000 times improvement in that time. So there’s approximately a factor of 1,000 in efficiency that has been lost by bad CPU architectures.

But Moore's Law is about price-performance, not absolute performance; here I estimate that the actual loss of price-performance attributable to bad CPU architectures is perhaps a factor of 10 to 50, and it is plausible that better compilers can remedy this.

Guesswork
=========

"Resuna" writes [6]:

    The [VAX] 11/780 was 3.6 MHz, 32-bit words. I don't know how fast the Alto or Dorado were, but with the Dorado being the archetypical "3M" machine I assume its performance was comparable to a nominally 1-MIPS 11/780.

According to Wikipedia [0], the Dorado was an all-ECL machine. The abstract of Lampson and Pier's paper on the Dorado [1], which I haven't read, says it ran at 20MHz, had 16 hardware threads to provide zero-context task switching, and was built out of "approximately 3000 MSI [ECL] components". So it was considerably faster than a VAX. Maybe one of the older D-machines is "the archetypal 3M-machine". Apparently the Dorado could run 200k-400k Smalltalk bytecodes per second [2].

I'm guessing that the Dorado is the machine Kay's benchmark ran on, since it was introduced in 1979, and the context of the conversation is how machines designed to be efficient at high-level language execution were worthwhile. I don't think it was ever sold commercially (or even mass-produced in-house), which makes per-unit costs difficult to calculate.
However, if we assume that each of the 3000 chips in the thing cost $20 (unfortunately I have no real idea how much ECL chips cost in 1980), that's a $60 000 bill-of-materials cost. So it might have cost $100 000 per machine if it had been mass-produced, and since it was ECL, the electrical power cost of running it would likely be higher per chip as well.

According to the squeak-dev thread on the subject [3], modern 600MHz uniprocessors are about 20x the speed of the Dorado when running Squeak, or 35 million bytecodes per second (which, measured against the Dorado's 200k-400k bytecodes per second, actually sounds more like 100x the speed of the Dorado). However, the uniprocessors in question cost US$150 or so, which is inflation-equivalent to maybe US$75 in 1980 dollars. (They also include hundreds of megabytes of RAM, instead of the 8MB on the Dorado.)

If you were going to spend $100 000 today (or when Kay gave this interview) on a computer to run Smalltalk on, you would probably get a Beowulf of 50 nodes, each node of which could run bytecodes at 50 to 200 times the speed of a Dorado, and that's running Squeak, which is not designed to be a particularly high-performance Smalltalk.

But Moore's Law has still given us, by my rough estimates, a factor of 2500 to 10 000 in price/performance in this case: 50 nodes times 50 to 200 Dorados' worth of speed per node. (That's not counting the difference between 8 megs of RAM and 50 000 megs of RAM, or the advantage of having 10TB of disk, etc.)

Even the upper end of that range is still noticeably less than the 131 072x (2^17) improvement that you might predict from a naive application of Moore's law, but the remaining factor of 10-50 is probably explicable in terms of Kay's explanation: the architecture is not optimized for Smalltalk bytecode execution, so you get a 10-50x slowdown when you use it as if it were a Dorado. (You might be able to get a Beowulf of 300 nodes at that price, depending on other circumstances.)

How much faster are other Smalltalk implementations than Squeak?
Various microbenchmarks seem to peg Strongtalk at 3x-10x faster than Squeak (Avi Bryant's [4], David Griswold/Klaus Witzel's [5]), which would nicely compensate for the remainder of Kay's complaint.

References
==========

[0] Wikipedia article "Xerox Alto", section "Diffusion and Evolution", as of 2006-12-25
    > http://en.wikipedia.org/wiki/Xerox_Alto#Diffusion_and_evolution

[1] "A Processor for a High-Performance Personal Computer", from Butler W. Lampson and Kenneth A. Pier, Xerox PARC, 1980, IEEE "CH1494-4/80/0000-0146" (whatever that means), 15 pp.; mentions, among other things, that the first machine "came up in the spring of 1979".
    > http://research.microsoft.com/Lampson/24-DoradoProcessor/Acrobat.pdf

[2] Squeak-dev post "Dorado bytecodes per second", from Bruce ONeel (edoneel at sdf.lonestar.org), 2005-05-28T16:41:49 CEST, quoting previous post from Jecel Assumpcao Jr (jecel at merlintec.com):

        By running the benchmarks for the "green book" and doing a lot of rough extrapolations, my guess is that the Dorado would get between 200K and 400K bytecodes/sec.

    And followup from Tim Rowledge (tim at rowledge.org):

        That is pretty much what I remember as the claim for Dorados.

    > http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-April/091211.html

[3] Squeak-dev post "Dorado bytecodes per second", from Jecel Assumpcao Jr (jecel at merlintec.com), 2005-05-28T22:38:19 CEST --- he's talking about 600MHz ARMs.
    > http://lists.squeakfoundation.org/pipermail/squeak-dev/2005-April/091215.html

[4] Blog post "Ruby and Strongtalk II", by Avi Bryant, on his blog "HREF Considered Harmful"; the microbenchmark in question did a billion accesses of a thousand-element array of small integers, took 0.7 seconds in Java, 7 seconds in Strongtalk, 70 seconds in Squeak, or 16 if you use Array instead of ByteArray.
    > http://smallthought.com/avi/?p=17

[5] Squeak-dev thread "Thue-Morse and performance: Squeak v.s. Strongtalk v.s. VisualWorks", started by Klaus D. Witzel 2006-12-17; several people, including David Griswold, point out flaws in Witzel's initial benchmark, and the results are interesting.
    > http://www.nabble.com/Thue-Morse-and-performance:-Squeak-v.s.-Strongtalk-v.s.-VisualWorks-t2834773.html

[6] Comment "I still want to see Kay's benchmark...", from "Resuna", 2005-07-22
    > http://lambda-the-ultimate.org/node/531#comment-7895

[7] ACM Queue article "A Conversation with Alan Kay: Big Talk with the creator of Smalltalk --- and much more.", by Stuart Feldman and Alan Kay, vol. 2, no. 9, Dec/Jan 2004-2005, is the origin of this quote.
    > http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=273&page=3
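
Appendix: checking the arithmetic
=================================

The estimates in the Guesswork section are easy to get wrong by a factor of ten, so here is a minimal Python sketch that recomputes them. Every input is one of the guesses stated in the text, not a measurement; the function and constant names are my own and purely illustrative, and treating 131072 as 2^17 is just the observation that the naive figure is a power of two.

```python
# Back-of-the-envelope check of the guesses in the "Guesswork" section.
# Every number here is an estimate taken from the text, not a measurement;
# the names are hypothetical and exist only for this sketch.

DORADO_CHIPS = 3000               # "approximately 3000 MSI components" [1]
COST_PER_CHIP = 20                # guessed 1980 dollars per ECL chip
DORADO_BYTECODES = (200e3, 400e3) # bytecodes/sec range from [2]
MODERN_BYTECODES = 35e6           # 600MHz uniprocessor running Squeak [3]
NODES = 50                        # Beowulf nodes guessed for about $100 000
NODE_SPEEDUP = (50, 200)          # guessed per-node speed, in Dorados
NAIVE_MOORE = 2 ** 17             # the 131072x naive Moore's Law figure


def bill_of_materials(chips, cost_per_chip):
    """Parts cost of one machine, ignoring assembly and everything else."""
    return chips * cost_per_chip


def speed_ratio(modern, dorado_range):
    """(low, high) ratio of modern to Dorado bytecode rates."""
    slow, fast = dorado_range
    return modern / fast, modern / slow


def price_performance_gain(nodes, speedup_range):
    """(low, high) Dorado-equivalents bought for one Dorado's price."""
    low, high = speedup_range
    return nodes * low, nodes * high


def unexplained_factor(naive, gain_range):
    """(low, high) gap between the naive prediction and the estimated gain."""
    low_gain, high_gain = gain_range
    return naive / high_gain, naive / low_gain


if __name__ == "__main__":
    gain = price_performance_gain(NODES, NODE_SPEEDUP)
    print("bill of materials:", bill_of_materials(DORADO_CHIPS, COST_PER_CHIP))
    print("single-CPU speed ratio:", speed_ratio(MODERN_BYTECODES, DORADO_BYTECODES))
    print("price/performance gain:", gain)
    print("unexplained factor:", unexplained_factor(NAIVE_MOORE, gain))
```

With these inputs the gain comes out at 2500 to 10 000 and the unexplained factor at roughly 13 to 52, which is where the "factor of 10 to 50" comes from. Changing NODES to 300 shows how sensitive the conclusion is to the guess about cluster size.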