Begin forwarded message:

> From: Andreas Raab <[email protected]>
> Date: December 17, 2008 4:30:30 AM CEST
> To: The general-purpose Squeak developers list <[email protected]>
> Subject: [squeak-dev] Cog VM status update
> Reply-To: The general-purpose Squeak developers list <[email protected]>
>
> Folks -
>
> I just read Eliot's most recent blog post about the Cog VM and it  
> reminded me how difficult it must be for others to see where this  
> project stands. So here is a bit of an update on the current status:
>
> As you may recall, we started this project in spring this year by
> hiring Eliot for the express purpose of building us a new VM that
> would speed up execution of our products. We decided to structure
> the work into stages, at the end of each of which there would be a
> tangible deliverable, i.e., a new VM that could be run and benchmarked.
>
> The first stage in this process is what we call the "Closure VM". It
> is nothing more (and nothing less) than a Squeak VM with closures
> and the required support (compiler, decompiler, debugger etc). Given
> past experience, we had originally expected this stage to cost us
> some speed (up to 20% was estimated), since closure support has a
> cost which at that stage is hard to offset with other improvements.
> However, thanks to a truly ingenious bit of engineering by Eliot in
> the design of the closure implementation, the resulting speed
> difference was negligible. Since there was no speed penalty we
> decided to make the switch earlier than we had originally
> anticipated, and the Closure VM has been the regular shipping VM
> with Qwaq products since September this year.
>
> The second stage in the process is the "Stack VM". It is a Closure
> VM that executes on the native stack and transparently maps contexts
> to and from stack frames as required. The VM itself is still an
> interpreter, so any speed improvements come purely from the more
> efficient organization of the stack layout (no allocations,
> overlapping frames etc). For those of you who have been around long
> enough, it is equivalent to what Anthony Hannan did a few years ago,
> except that it hides the existence of the native stack entirely and
> gives the programmer the naive view of just dealing with linked
> frames (contexts). Eliot's original expectations for the resulting
> speedups were a little higher than what we've seen in practice, but
> the results are in line with what Anthony got: approx. 30%
> improvement across the board on macro benchmarks. The work on the
> Stack VM was completed last month; we are currently rolling it out
> internally, and the next product release will ship with the Stack VM.
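>
> To give a rough idea of the scheme, here is a minimal C sketch of
> how such lazy context-to-frame mapping could look. This is purely
> illustrative; the layout and the names (Frame, contextFor,
> allocateContextMappedTo) are made up for the example and are not
> the actual VM code:
>
>     #include <stddef.h>
>
>     /* A hypothetical stack frame layout; ordinary sends push one of
>        these on the native stack, with no heap allocation per call. */
>     typedef struct Frame {
>         struct Frame *callerFrame;  /* frames overlap with their caller */
>         void *method;               /* the executing compiled method */
>         void *context;              /* lazily created context, or NULL */
>         void *receiver;
>         /* arguments and temporaries live on the stack below this */
>     } Frame;
>
>     /* Hypothetical helper assumed to exist in the VM. */
>     extern void *allocateContextMappedTo(Frame *f);
>
>     /* Reflective access (thisContext, the debugger) materializes a
>        context object for a frame on demand; plain execution never
>        allocates one, which is where the speedup comes from. */
>     void *contextFor(Frame *f) {
>         if (f->context == NULL)
>             f->context = allocateContextMappedTo(f);
>         return f->context;
>     }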
>
> The third stage, which has just begun, is what we call the "Simple
> JIT VM" (well, really it doesn't have a name yet, I just made it up
> ;-) Its focus is send performance, which we see as the single
> biggest current bottleneck. It will sport a very simple JIT with
> inline caches, the idea being to bring send performance up to the
> point where it's no longer the single biggest bottleneck, then
> measure again and figure out what the next best target is. I am not
> going to speculate on performance (we have been wrong every single
> step of the way ;-) but both Eliot and I do think that we'll see
> some nice improvements in application performance here.
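>
> For those curious what an inline cache buys you, here is a minimal
> C sketch of the idea behind a monomorphic inline cache. The names
> (SendSite, dispatch, classOf, lookupMethod) are hypothetical, not
> Cog's actual code: each send site remembers the receiver class it
> last dispatched to and the method it found, so the full method
> lookup only runs when the receiver's class changes.
>
>     /* Hypothetical helpers assumed to exist in the VM. */
>     extern void *classOf(void *receiver);
>     extern void *lookupMethod(void *cls, void *selector);
>
>     /* One send site's cache: the last receiver class seen here and
>        the method that lookup produced for it. */
>     typedef struct SendSite {
>         void *cachedClass;
>         void *cachedMethod;
>     } SendSite;
>
>     void *dispatch(SendSite *site, void *receiver, void *selector) {
>         void *cls = classOf(receiver);
>         if (cls == site->cachedClass)
>             return site->cachedMethod;   /* hit: no lookup at all */
>         site->cachedClass = cls;         /* miss: refill the cache */
>         site->cachedMethod = lookupMethod(cls, selector);
>         return site->cachedMethod;
>     }
>
> In the JIT the check is of course inlined into the generated machine
> code at each send site rather than going through a C call, but the
> cached-class comparison is the essence of it.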
>
> The fourth stage is a bit more speculative at this point because the
> concrete direction depends on what the results of stage 3 show the
> new bottleneck to be. We have various candidates lined up: very high
> on the list is a delayed code generator, which can dramatically
> improve code quality. Next to it are changes in the object format,
> moving to a unified 32/64-bit header model, which would dramatically
> simplify some of the tests needed for inline caching, primitives,
> etc. However, since this work is driven by product performance, it
> is possible (albeit unlikely at this point) that the focus might
> shift towards FFI speed or float inlining. There is no shortage of
> possible directions; the main issue will be to figure out what the
> bottlenecks at that point are and how to address them most
> efficiently.
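>
> To illustrate why the header model matters for those tests, here is
> a C sketch of a class fetch under an assumed unified header with
> low-bit integer tagging. Everything here (oop, headerOf, classTable,
> the mask) is hypothetical and is not the actual Squeak object
> format:
>
>     #include <stdint.h>
>
>     typedef uintptr_t oop;              /* object pointer or tagged int */
>     #define CLASS_INDEX_MASK 0x3FFFFF   /* assumed class-index field */
>
>     extern void *classTable[];          /* assumed VM class table */
>     extern void *smallIntegerClass;     /* class of tagged integers */
>     extern oop headerOf(oop obj);       /* assumed header fetch */
>
>     /* With a unified header, "what is this object's class?" becomes
>        one tag test plus one masked load, identical on 32 and 64
>        bits; that is exactly the test an inline cache has to run on
>        every send. */
>     void *classOf(oop obj) {
>         if (obj & 1)                    /* assumed SmallInteger tag */
>             return smallIntegerClass;
>         return classTable[headerOf(obj) & CLASS_INDEX_MASK];
>     }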
>
> Stage four won't be the end of it, but this is as far as we've
> planned at this point. And if you want to know all the gory details
> about the stuff that Eliot's working on, please do check out his
> blog at:
>
>  http://cogblog.mirandabanda.org/
>
> Cheers,
>  - Andreas