Re: [racket-dev] can we write these four lines of C in performant racket?

2011-07-25 Thread Matthew Flatt
At Sat, 23 Jul 2011 14:42:15 -0400, John Clements wrote:
 This C code adds the content of one buffer to another one, with no checking.  
 The corresponding racket code runs about 10x slower. Do you folks think that 
 it 
 should be possible to do better? (One salient fact: these are 
 shorts--16-bit-ints--not 32-bit ints.)

My sources say no --- at least not a lot better with the current JIT.

 void addOn(short *dst, int dstOffset, short *src, int srcOffset, int len) {
   int i;
   for (i = 0; ilen; i++) {
 dst[i+dstOffset] += src[i+srcOffset];
   }}

One problem is that a `s16-vector' in Racket is a struct that wraps an
FFI wrapper around the array pointer, instead of just an array pointer.
So, a C version of the Racket loop is more like

 void addOn(short_ptr *dst, int dstOffset, 
short_ptr *src, int srcOffset, int len) {
   int i;
   for (i = 0; ilen; i++) {
 dst-a-a[i+dstOffset] += src-a-a[i+srcOffset];
   }
 }

Another problem is that the JIT doesn't use registers well enough, so
its code is closer in this case to unoptimized gcc output. Finally,
there's some extra overhead for the fixnum encoding.

Here are some timings for 1000 iterations on 2^20-element inputs
(32-bit mode, Mac Book Pro 2.53 GHz):

  C as above, gcc -02  :  1409
  C with indirections, gcc -O2 :  4041
  C as above, gcc -O0  :  6425
  C with indirections, gcc -O0 :  8480
  Racket fxvector (more direct):  8883
  Racket   : 11248

I can tweak the JIT in small ways to make a small improvement:

  Racket with JIT tweaks   : 10670


The Racket code I tried is below (ugly to run as fast as I could make
it).



#lang racket/base
(require racket/unsafe/ops
 ffi/vector)

(define SIZE 1048576)

(define addOn #f) ; defeat inlining of `addOn'
(set! addOn
  (lambda (dst dstOffset src srcOffset len)
(let loop ([i 0])
  (unless (eq? i len)
(let ([d (unsafe-fx+ dstOffset i)])
  (unsafe-s16vector-set! 
   dst 
   d
   (unsafe-fx+
(unsafe-s16vector-ref dst d)
(unsafe-s16vector-ref src (unsafe-fx+ srcOffset i)
(loop (unsafe-fx+ i 1))

(let ([a (make-s16vector SIZE)]
  [b (make-s16vector SIZE)])
  (time
   (for ([i (in-range 1000)])
 (addOn a 0 b 0 SIZE

_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev


[racket-dev] mouse cursor changes cause frequent collections?

2011-07-25 Thread John Clements
I have the log showing right now, and I see that moving the mouse in a way that 
causes mouse cursor changes (for instance, back and forth between the 
definitions  interactions windows) causes collections of about 32K on the 
order of every eight to ten mouse cursor changes, something like 2K of garbage 
per cursor change.

There's no observed slowdown or penalty for this, so please ignore this message 
unless you're right now thinking hey! I bet I can reuse those cursors!

John

p.s.: Concord trailways is awesome: nice wireless, comfy ride.

p.p.s.: many thanks as always for RacketCon. One of the highlights of my year.





smime.p7s
Description: S/MIME cryptographic signature
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev