One more thought: Do you get to pick whether you use 16-bit integers or 64-bit floating-point numbers? The `flvector-' and `f64vector-' operations are inlined by the JIT and recognized for unboxing, so using flonum vectors and operations could be much faster than using raw pointers and 16-bit integers.
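A rough sketch of what the flonum-vector version of the quoted loop might look like, assuming the samples can be stored as 64-bit floats. The helper names `make-filled-buffer` and `add-into!` are made up for illustration, and the buffer is shortened (10 where the quoted program uses 200) so that the sketch runs quickly and uses little memory; no timing is claimed here.

```racket
#lang racket/base
;; Sketch only: `make-filled-buffer` and `add-into!` are hypothetical helper
;; names, and the fill value 73.0 mirrors the 73 used in the quoted program.
(require racket/flonum)

;; Allocate a flonum vector of `len` elements, each initialized to 73.0.
(define (make-filled-buffer len)
  (make-flvector len 73.0))

;; Destructively add b2 into b1, element by element, using flonum-specific
;; operations so that the JIT has a chance to inline and unbox them.
(define (add-into! b1 b2)
  (for ([i (in-range (flvector-length b1))])
    (flvector-set! b1 i
                   (fl+ (flvector-ref b1 i)
                        (flvector-ref b2 i)))))

;; Assumption: shortened from (* 44100 2 200) in the quoted program.
(define buf-len (* 44100 2 10))

(define b1 (make-filled-buffer buf-len))
(define b2 (make-filled-buffer buf-len))

(time (add-into! b1 b2))
```

To compare against the pointer-based version, set `buf-len` back to `(* 44100 2 200)`; note that each flonum takes 8 bytes rather than 2, so the buffers are four times larger.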
At Thu, 23 Sep 2010 19:42:15 -0700, John Clements wrote:
> I'm trying to add together big buffers. The following code creates two big fat
> buffers of 16-bit integers, and adds them together destructively. It looks to
> me like this code *could* run really fast, but it doesn't; this takes about
> 8.5 seconds. Changing + to unsafe-fx+ has no detectable effect. Is there
> allocation going on in the inner loop? I'd hoped that since an _sint16 fits
> safely in 31 bits, that no memory would be allocated in the inner loop. Grr!
> Any suggestions? (I ran a similar test on floats, and C ran about 64x faster,
> about a tenth of a second).
>
> Doc pointers appreciated as always,
>
> John
>
> #lang racket
>
> (require ffi/unsafe)
>
> (define (make-buffer-of-small-random-ints len)
>   (let ([buf (malloc _sint16 len)])
>     (for ([i (in-range len)])
>       (ptr-set! buf _sint16 i 73))
>     buf))
>
> (define buf-len (* 44100 2 200))
>
> (define b1 (make-buffer-of-small-random-ints buf-len))
> (define b2 (make-buffer-of-small-random-ints buf-len))
>
> (time
>  (for ([i (in-range buf-len)])
>    (ptr-set! b1 _sint16 i
>              (+ (ptr-ref b1 _sint16 i)
>                 (ptr-ref b2 _sint16 i)))))
_________________________________________________
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev