Wow. Why didn't I think of asking for this before :-)
On May 4, 2011, at 7:11 PM, Ryan Culpepper wrote: > On 05/04/2011 01:57 PM, Tony Garnock-Jones wrote: >> On 2011-05-04 12:04 PM, Matthias Felleisen wrote: >>> I still believe that the Java implementation (just under 1s without >>> their 'Google' contract) benefits from typed dispatches. >> >> Maybe it does, but it's almost certain that it is benefiting from inline >> caching at send sites (i.e. dynamic type information) much more than it >> will be benefiting from static type information. >> >> A quick-and-dirty comparison of raw send performance on my Mac: >> >> Language Virtual machine Nanoseconds per send >> ------------------------------------------------------------ >> Java Hotspot 64-bit 1.6.0_24 1.4 >> Smalltalk Cog r2382 21 >> Smalltalk SqueakVM 4.2.4beta1U 122 >> Racket Racket v5.1 ~350 >> >> Note that Cog is a JITting VM and SqueakVM is a plain (but very well >> optimised) interpreter. Both Cog and SqueakVM use a couple of levels of >> method lookup cache. >> >> A simple experiment I just performed suggests that a monomorphic inline >> cache hit can reduce the time needed for a send in Racket from 350ns to >> around 60ns, which is a massive win. I've attached the program I used to >> measure this, FWIW. (Run it using command-line Racket, not DrRacket: I >> got some *very* weird timing artifacts out of DrRacket during this >> experiment!) >> >> The question, then, is: how do we implement MICs or PICs using Racket's >> macro system? Each send site needs to expand into >> >> - a piece of global state >> - a test involving that global state >> - a possible update to that global state >> >> Hypothesising some kind of (let-static) form that introduces a >> lexically-scoped piece of global state, > > Here's 'let-static': > > (define-syntax (let-static stx) > (syntax-case stx () > [(let-static ([var rhs] ...) . body) > (with-syntax ([(gvar ...) > (syntax-local-lift-values-expression > (length (syntax->list #'(var ...))) > #'(values rhs ...))]) > #'(let-syntax ([var (make-rename-transformer #'gvar)] ...) > . body))])) > > > this kind of thing might Just >> Work to provide a speedup of almost six-fold on almost-monomorphic >> send-heavy code: > >> (define-syntax cached-send >> (syntax-rules () >> ((_ obj msg arg ...) >> (let-static ((bc (box #f)) >> (bm (box #f))) >> (let* ((tmp obj) >> (cls (object-ref tmp))) >> (if (eq? (unbox bc) cls) >> ((unbox bm) tmp arg ...) >> (let-values (((method _) >> (find-method/who 'send tmp 'msg))) >> (set-box! bc cls) >> (set-box! bm method) >> (method tmp arg ...)))))))) > > That code has a space-safety problem: the caches might keep around classes > that should be garbage collected. It also has a race condition: there's a > period of time when the class cache has been updated but the method cache > hasn't. > > The safe-for-space issue can be fixed by using weak boxes. The race condition > could be fixed by using one instead of two... but there's no structure that > combines a class and a particular method implementation that's strongly held > elsewhere. Creating a new pair for the weak box to hold works, but it means > that every GC is likely to flush the cache, but maybe that's okay. It also > means that every cache miss triggers heap allocation, but I don't know any > way around that. > > I've attached an updated benchmark script. Still looks like a win on speed, > but slightly less than before. > > Ryan > <time-method-send-v2.rkt>_________________________________________________ > For list-related administrative tasks: > http://lists.racket-lang.org/listinfo/dev _________________________________________________ For list-related administrative tasks: http://lists.racket-lang.org/listinfo/dev