On Tue, May 7, 2013 at 6:06 PM, Dave Smith <[email protected]> wrote:
> 1. Python programs can be "frozen" into a compiled binary so that it can run
> natively on a target OS. I've done this for Windows before. However, the
> binary is just a clever packaging of the Python interpreter with your code,
> still in Python form (well, probably byte code, which leads me to #2). So
> it's not really native executable code.
>
> 2. When the Python interpreter runs your code, by default it will generate a
> .pyc file next to your .py file that is compiled Python byte code. This is
> similar in concept to Java byte code, and this byte code is actually what the
> Python interpreter executes. Still not native executable code, but a little
> bit faster than interpreting Python source.

Even if it didn't create the pyc file, I'm pretty sure it (like Perl,
and most other interpreted languages) compiles your code into some sort
of lower-level representation before interpreting. This technique has a
long history, going back to BCPL (an ancestor of C) in the late 60s that
compiled to a defined intermediate language called O-code that captured
the low-level operational semantics of the language in the form of a
virtual machine and corresponding assembly language. This was primarily
to aid in porting BCPL to other computers (there was a lot more variety
in computer architecture in those days) but it was sometimes directly
interpreted by a program that simulated the O-code machine.

The technique was popularized, however, by Pascal and P-code. UCSD
Pascal had a standard virtual machine, the p-machine, that Pascal was
compiled to. They built an entire operating system, the p-System, around
it. Thanks to the virtual machine-based implementation, it was very
portable. Thanks to the portability, it became widely used and very
popular. Western Digital even implemented a CPU in which the microcode
interpreted p-code.

I would venture to say that the majority of languages since then have
had at least one implementation as a virtual machine code interpreter.
Most of these have implemented a stack-based bytecode (which follows the
Pascal p-Code tradition), though some implement a register-based machine
(Lua is a notable example of this kind). A couple of exceptions in the
world of interpreters would be Perl and Oberon (a follow-on to Pascal
and Modula by Wirth), which operate on a representation of the program
AST, and Forth, which typically uses a method called threaded code.

>
> 3. The PyPy interpreter actually does JIT compilation of your Python code
> into native machine code while it's running. This is not a translation into
> another language like C. PyPy translates Python byte code into native machine
> code. However, like most JITs, it only does this to the portions of your
> program that get executed a lot (the criteria are configurable). In my
> testing, certain kinds of programs can get a 20x speedup from the PyPy JIT
> (e.g., most Project Euler solutions). However, lots of programs see no
> speedup at all (e.g., programs that are I/O bound).
>
> I personally believe that the future of Python is PyPy.

It's hard to say for sure how this will turn out. PyPy is awesome, but a
number of Python implementations are gaining solid niches and CPython
doesn't appear to be likely to step down as the "canonical" form of
Python anytime soon. There's a bit of a similar situation with Lua and
LuaJIT; both are excellent implementations, but the primary
implementation of Lua is pretty tightly controlled by its original
authors and so despite some great performance benefits of LuaJIT, the
canonical implementation remains the most widely used.

In my experience, languages defined by a canonical implementation
instead of a specification tend to stick to that canonical
implementation despite the existence of alternative implementations. So
I'd say PyPy has a bright future, but I wouldn't say the future of
Python is PyPy.

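Going back to the byte code point for a moment: here is a small sketch
of what "compiles to a lower-level representation before interpreting"
looks like in CPython, even when no .pyc file is involved. The source
string, the "<example>" filename, and the add() function are made up for
illustration; compile(), exec(), and the dis module are standard. I'm
deliberately not showing the expected listing, because the opcode names
differ between CPython versions.

```python
import dis

# Hypothetical source snippet, used only for illustration.
SOURCE = "def add(a, b):\n    return a + b\n"

# compile() turns source text into a code object entirely in memory;
# this call does not write a .pyc file.
module_code = compile(SOURCE, "<example>", "exec")

# Running the module-level code object defines add() in our namespace.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The function's code object carries the raw byte code as a bytes string.
raw_bytecode = add.__code__.co_code

# dis.get_instructions() decodes those bytes into named instructions.
# Exact opcode names vary by CPython version, so none are assumed here.
opnames = [ins.opname for ins in dis.get_instructions(add)]

if __name__ == "__main__":
    print(len(raw_bytecode), "bytes of byte code")
    dis.dis(add)  # human-readable listing of the instructions
```

If you run it, the listing should show the arguments being loaded and
then an add-style instruction that names no operand locations of its
own, which is the stack-machine style described above. Other Python
implementations are free to do this differently.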
>
> A somewhat related note:
>
> Another noteworthy example of a (potentially) really good toolchain is Google
> Go. Its toolchain only runs in cross-compile mode, so it's easy to support
> multiple target output binary formats (last I checked, only x86 and ARM were
> supported). And of course, Google Go compiles to native code as well, and yet
> there are also interactive shells for Go too (e.g., go-eval).

Go is a pretty interesting language. It's definitely a bit of a
"throwback" language, which goes to show how truly old-school C is when
a decidedly conservative language like Go looks like a huge advance
compared to it. I don't mean to be disparaging by that characterization
of Go--I think a lot of important lessons in language implementation
have been forgotten in the frenzy to make cool new things all the time,
and Go definitely takes many of those important lessons to heart while
providing a few key features aimed at real problems in today's software
systems. And even those features, like goroutines, tend to be old and
well-tested techniques that fell out of favor for a while.

And to take another little digression down history lane, Go's
concurrency features such as goroutines are ultimately derived from
early work on specifying/formally describing concurrent processes,
namely the model of Communicating Sequential Processes as first
described in a paper by C.A.R. Hoare. He later published a book
describing the system, which is now available for free here:

http://www.usingcsp.com/cspbook.pdf

CSP was very influential (the PDF above is one of the most cited works
in computer science) and spawned a lot of other formalisms for
describing concurrent processes as well as more practical programming
languages. Rob Pike, one of the Go authors, has worked on a series of
previous languages that fuse C-like syntax and semantics with the CSP
model of concurrency.
These include Newsqueak, Limbo (a bytecode-interpreted language that was
the foundation of Inferno, a follow-on to the Plan 9 OS), and finally
Go.

CSP was also the basis for the Inmos Transputer, a highly concurrent
computer architecture that wasn't a huge commercial success, but ended
up being highly influential to future CPU designs. Hoare himself
consulted on the design of a new language to run on the transputer,
called Occam, which was the first practical implementation of a
CSP-based programming language.

Anyway, goroutines are a significant thing but not the only nice feature
of Go. For one thing, they finally ditched C's terrible type declaration
syntax! That's reason enough to switch to it right there.

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
