On Fri, Mar 25, 2011 at 12:54 AM, Stéphane Ducasse <[email protected]> wrote:
>> On Mar 25, 2011, at 2:51 AM, Toon Verwaest wrote:
>>
>> Ok, I will do so. (Read the f-ing paper.) I have only read the blog post until now.
>
> Which paper? Is there something more than the blog? I read the old VW5 paper, but Eliot told me that it is old and no longer accurate for Cog.

The http://www.esug.org/data/Articles/misc/oopsla99-contexts.pdf paper describes the problem with an implementation of closures using contexts such as the one Toon describes, since the pre-5i implementation is isomorphic to Toon's. It also describes the 5i solution, which is essentially isomorphic to my Squeak closure implementation. In detail, however, the paper clearly does not describe my Squeak closure implementation; the bytecodes have different names, etc.

>> I just realized that I actually made a mistake in my mental model of your model. See! It's complex!
>>
>> So I realized that getting to the remotes is exactly as fast as going to the parent or outer context.
>>
>> This makes it as fast as having a method context with at most 2 nested contexts (3 nested blocks), and faster than deeper nestings. How often does deeper nesting occur in Pharo? Is it worthwhile to create the remote arrays just for those cases?
>>
>> Is the copying really worthwhile to make those cases faster?
>>
>> My biggest problem until now is: why wouldn't you be able to do everything you do with the remote arrays directly with the context frames? Why limit it to only the part that is being closed over? The naive implementation that just extends Squeak with proper closure links will obviously be slow. I agree that you need a stack. Now I'd just like to read why you chose to take only part of the frame (the remote array) rather than the whole frame. That would avoid the copyTemps thing...
>>
>> But then, I guess I should go off and read the f-ing paper.
>> I hope that particular point is described there, since it's basically the piece I'm missing.
>>
>> Also, I don't know exactly what Peter Deutsch did, but if it was the straightforward implementation then it seems obvious that you get such a speedup. Implementing it is less obvious, naturally ;)
>>
>> These responses are exactly why I posed the question here... I'd like to understand why. No offense.
>>
>> cheers,
>> Toon
>>
>> On 03/25/2011 02:22 AM, Eliot Miranda wrote:
>>> Toon,
>>>
>>> What you describe is how Peter Deutsch designed closures for ObjectWorks 2.4 & ObjectWorks 2.5, whose virtual machine and bytecode set served all the way through VisualWorks 3.0. If you read the context-management paper you'll understand why this is a really slow design for a JIT. When I replaced that scheme with one essentially isomorphic to the Squeak one, the VM became substantially faster; for example, by factors of two and three in exception-delivery performance. The description of the problem and the performance numbers are all in the paper. There are two main optimizations I performed on the VisualWorks VM: one is the closure scheme and the other is PICs. Together they sped up what was then the fastest commercial Smalltalk implementation by a factor of two on most platforms and a factor of three on Windows.
>>>
>>> I'm sorry it's complex, but if one wants good performance it's a price well worth paying. After all, I was able to implement the compiler and decompiler within a month, and Jorge proved at INRIA-Lille that I'm far from the only person on the planet who understands it. Lispers have understood the scheme for a long time now.
>>>
>>> best,
>>> Eliot
>>>
>>> On Thu, Mar 24, 2011 at 6:01 PM, Toon Verwaest <[email protected]> wrote:
>>>
>>>>> I can't say that I clearly understood your concept. But if it will simplify the implementation without, seemingly, any speed loss, I am all ears :)
>>>>
>>>> test
>>>>     | b |
>>>>     [ | a |
>>>>       a + b ]
>>>>
>>>> Suppose you can't compile anything away; then you get
>>>>
>>>> |==============
>>>> |MethodContext
>>>> |
>>>> |b := ...
>>>> |==============
>>>>        ^
>>>>        |
>>>> |==============
>>>> |BlockContext
>>>> |
>>>> |a := ...
>>>> |==============
>>>>
>>>> And you just look up starting at the current context and go up, except when the variable comes from the homeContext, in which case you directly follow the home-context pointer. Since all contexts link to the home context, that makes it one pointer indirection to get to the method's context and one for the parent context. So that makes only two indirections starting from the third nested block (i.e. when you have [ ... [ ... [ ... ] ... ] ... ]), where all of the blocks are required for storing captured data. ifTrue:ifFalse: etc. blocks obviously don't count, and blocks without shared locals could be left out (although we might not do that, for debugging reasons).
>>>>
>>>> Hope that helps.
>>>>
>>>> cheers,
>>>> Toon
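Toon's lookup rule in the diagram above can be sketched as a toy model. This is a hedged illustration in Python rather than Smalltalk, and the class and field names (`Context`, `outer`, `home`, `lookup`) are invented for the sketch: every context holds a pointer to its lexically enclosing context and a pointer to the method's home context, so a home variable costs a single indirection while other outer variables walk the chain.

```python
# Toy model of context-chain variable lookup (illustrative names, not the VM's).
# Each context links to its lexical parent (`outer`) and to the method's
# home context (`home`); home variables are reached in one hop.

class Context:
    def __init__(self, temps, outer=None):
        self.temps = temps                          # name -> value in this frame
        self.outer = outer                          # enclosing (parent) context
        self.home = outer.home if outer else self   # method context at the root

    def lookup(self, name):
        if name in self.home.temps:                 # home var: one indirection
            return self.home.temps[name]
        ctx = self
        while ctx is not None:                      # otherwise walk the chain
            if name in ctx.temps:
                return ctx.temps[name]
            ctx = ctx.outer
        raise NameError(name)

home = Context({'b': 2})                  # method context: |b|
block1 = Context({'x': 10}, outer=home)   # [ |x| ... ]
block2 = Context({'a': 1}, outer=block1)  # nested [ |a| a + b ]

assert block2.lookup('b') == 2    # one hop via the home pointer
assert block2.lookup('x') == 10   # one hop via the outer pointer
assert block2.lookup('a') == 1    # found locally
```

The point of the sketch matches Toon's claim: from the third nested block, a home variable is still one hop away, and only non-home outer variables pay per nesting level.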

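For contrast, the remote-array scheme Eliot describes (and the copyTemps split Toon asks about) can be simulated as well. This is only a hedged sketch with invented names (`make_counter`, `remote`, `bump`), and note that Python's real closures capture by reference through cells, so this models the scheme's data layout rather than Python's own mechanics: temps that are both captured and mutated move into a small heap-allocated vector shared by method and block, while captured temps that are never written can simply be copied into the closure, leaving the rest of the frame on the stack.

```python
# Hypothetical simulation of the remote-temp-vector scheme (names invented).
# Only temps written after being closed over live in a shared heap vector;
# captured temps that are never written are safe to copy into the closure.

def make_counter(start):
    remote = [start]        # remote temp vector: one slot for the mutated temp
    step = 1                # captured but never written: safe to copy

    def bump():             # the "block": shares `remote`, copies `step`
        remote[0] += step   # writes are indirected through the vector
        return remote[0]

    return bump

counter = make_counter(41)
assert counter() == 42      # method's frame may be gone; the vector survives
assert counter() == 43
```

The design point under discussion in the thread is exactly this split: by heap-allocating only the mutated shared temps instead of the whole frame, the common case (frames with no such temps) never leaves the stack.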