On May 7, 2010, at 7:23 AM, Douglas Burke wrote:
>
> Can I suggest that before any sort of optimization work that we get a
> solid set of performance benchmarks (a la Daniel's micro benchmarks
> perhaps?) so that we can be sure that any such work actually does
> improve things (and, just as importantly, doesn't slow things down
> elsewhere).
Er, sorry, I was noodling around and may have jumped the gun. I just
checked in a small speed improvement for matmult. It evaluates the
terms in the matrix product in tiled order (multiplying 32x32 tiles)
rather than in direct threading order; in the double-precision case a
pair of 32x32 tiles occupies 16 kB, which is small enough to stay in
the L1 cache of most performance CPUs. Unsurprisingly, it helps.
Surprisingly, not by very much. On my PowerBook:
perldl> $a = random(2000,2000);
perldl> $b = random(2000,2000);
perldl> {$t0=time; $c = $a->dummy(1)->inner($b->xchg(0,1)->dummy(2));
..{ > $t1=time; print $t1-$t0,"\n";}
82
perldl> {$t0=time; $d = $a x $b;
..{ > $t1=time; print $t1-$t0,"\n";}
70
perldl> print all($d==$c)
1
I am a bit puzzled how these other packages manage to go so much
faster...
Cheers,
Craig
_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl