On May 7, 2010, at 7:23 AM, Douglas Burke wrote:
>
> Can I suggest that before any sort of optimization work that we get a
> solid set of performance benchmarks (a la Daniel's micro benchmarks
> perhaps?) so that we can be sure that any such work actually does
> improve things (and, just as importantly, doesn't slow things down
> elsewhere).

Er, sorry, I was noodling around and may have jumped the gun.  I just
checked in a small speed improvement for matmult.  It evaluates the
terms in the matrix product in tiled order (multiplying 32x32 tiles)
rather than in direct threading order; in the double-precision case
that fits each tile into 16k, which is small enough to fit in the L1
cache of most performance CPUs.  Unsurprisingly, it helps.
Surprisingly, not by very much.  On my PowerBook:

        perldl> $a = random(2000,2000);
        perldl> $b = random(2000,2000);
        perldl> {$t0=time; $c = $a->dummy(1)->inner($b->xchg(0,1)->dummy(2));
        ..{    > $t1=time; print $t1-$t0,"\n";}
        82

        perldl> {$t0=time; $d = $a x $b;
        ..{    > $t1=time; print $t1-$t0,"\n";}
        70
                
        perldl> print all($d==$c)
        1

I am a bit puzzled as to how these other packages manage to go so much
faster...

Cheers,
Craig

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
