There's a persistent myth among programmers that interpreters are slow.
I disagree that this is a myth. Interpreted code typically runs anywhere from
5x to 100x slower than compiled code, depending on the features of the
interpreter and the quality of the compiled code. The real issue is that
developers are notoriously bad at predicting where performance matters in
their code, and in particular at predicting the impact of the interpreter's
“slowness” on the overall performance of the application. If the interpreted
code mainly orders the execution of rather long-running operations that are
well implemented in C, then the slowness of the interpreter will not be
observable by the user (for example, typical shell scripts executed by an
interpreter-based shell).
Your numbers sound correct. The myth is not about the slowdown factor;
it's the belief that the slowdown matters to users. Much (perhaps most) code
is I/O bound, so users don't perceive a difference. On the other hand, we easily
perceive a difference in how fast the program loads when you click its
icon or type its name in the shell (fancy runtimes have slower startup).
Also, programmers notice a big difference in how fast they can build and
run the program after each change they make. Programs that are easy and
fast to build are fun to tinker with, which leads to more improvements.
There are still a lot of programmers who use languages like C++, Java,
Haskell or Rust by default because they are perceived as "real
languages" that run fast (even when their program is 1000 lines and I/O
bound). These come with heavy toolchains which are difficult and slow to
install and operate.
Emacs is a good example of a huge, mostly interpreted application whose
interpreter is not stellar. And still almost none of the day-to-day
slowdown when using Emacs is due to the interpreter; mostly you're
waiting for some external program that is not responding.
Even an old version of Microsoft Excel was interpreted. The team had
their own C compiler that generated bytecode instead of machine code.
Joel Spolsky has a blog post about it.
But the performance of code execution by the embedded language does matter for
some applications, and the developer usually only learns this late in the
development process (after the application’s goals have evolved), which is
after the embedded language implementation has been selected and much code has
been written around it. At that point it is usually too expensive to change the
embedded language implementation, so instead more and more functionality gets
implemented in the low-level language.
So code execution performance should not be overlooked when selecting an
embedded language implementation; otherwise the development benefits of the
high-level language may eventually be lost.
These are very good points. Indeed, a good experience with interpreters
requires moving performance-sensitive parts to C. Emacs is harmed by
this as well.
Then again, using an exotic compiled language is certainly far from
problem-free. Using C for the fast parts and a self-contained
interpreter for the slow parts offers a nice mix of portability and
expressive power. For example, it might make sense to rewrite Emacs in
Common Lisp (since Emacs Lisp is basically a subset of CL already), but the
result would probably be less portable, and CL implementations would bring
problems of their own that C doesn't have. It's a trade-off.
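
As a concrete illustration of that split, here is a minimal sketch of keeping
a hot loop in C while an embedded Lua interpreter handles the glue. It assumes
Lua 5.3 or later with its standard C API; the sum_bytes function is just a
made-up stand-in for whatever the real fast part would be:

    #include <stdio.h>
    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>

    /* The "fast part": a hot loop kept in C. */
    static int l_sum_bytes(lua_State *L) {
        size_t len;
        const char *s = luaL_checklstring(L, 1, &len);
        lua_Integer sum = 0;
        for (size_t i = 0; i < len; i++)
            sum += (unsigned char)s[i];
        lua_pushinteger(L, sum);
        return 1;  /* one result on the Lua stack */
    }

    int main(void) {
        lua_State *L = luaL_newstate();
        luaL_openlibs(L);
        lua_register(L, "sum_bytes", l_sum_bytes);

        /* The "slow part": glue code that merely orders calls into C. */
        if (luaL_dostring(L, "print(sum_bytes('hello, world'))") != LUA_OK)
            fprintf(stderr, "%s\n", lua_tostring(L, -1));

        lua_close(L);
        return 0;
    }

On many systems this builds with something like cc host.c -llua -lm, though
the exact library name varies by platform and Lua version.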
The best of both worlds is a standardized language supported by many
interpreters and compilers of different levels of sophistication - for
example, a certain long-lived language in the Lisp family ;-) Then one
can start with something simple and move up with a minimum of friction.
Lua is still a big inspiration to me as an interpreter that is actually
as easy to drop into a C codebase as they advertise. TinyScheme and S7
are currently the closest things to Lua in the Scheme world. Other
Schemes run code faster with more features and with ready-to-use
libraries, but at a cost in simplicity.
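
For what it's worth, the advertised ease really does come down to a handful of
calls. Here is a minimal sketch of a C host that runs a script passed on the
command line, again assuming Lua 5.3 or later (TinyScheme and S7 offer a
similarly small embedding surface, though the function names differ):

    #include <stdio.h>
    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>

    int main(int argc, char **argv) {
        lua_State *L = luaL_newstate();   /* create an interpreter state */
        luaL_openlibs(L);                 /* load the standard libraries */
        /* Run the script named on the command line, e.g. ./host init.lua */
        if (argc > 1 && luaL_dofile(L, argv[1]) != LUA_OK)
            fprintf(stderr, "lua error: %s\n", lua_tostring(L, -1));
        lua_close(L);
        return 0;
    }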