Here are a couple of nearest-scaling MMX paths I wrote a long time ago
for Loongson and other platforms that use the MMX code.
I've got a few more patches for the MMX code that I'll send out as I
benchmark them.
I don't really expect any reviews, so barring objections I'll plan to
commit them in a few
lowlevel-blt-bench -n, over__, 15 iterations on Loongson 2f:
           Before           After
        Mean  StdDev     Mean  StdDev    Change
L1      15.8    0.02     24.0    0.06    +52.0%
L2      14.8    0.15     23.3    0.13    +56.9%
M       10.3    0.01     13.8    0.03    +33.6%
HT
lowlevel-blt-bench -n, over__n_, 15 iterations on Loongson 2f:
           Before           After
        Mean  StdDev     Mean  StdDev    Change
L1       9.7    0.01     19.2    0.02    +98.2%
L2       9.6    0.11     19.2    0.16    +99.5%
M        7.3    0.02     12.5    0.01    +72.0%
HT