Is the slowdown for the PIII significant enough for you to consider splitting the code for them and other non-SSE2 686 machines?
The slowdown is very significant for many FFT sizes. I recommend that users of P2s, P3s, old Celerons, old Xeons, and Pentium-Ms benchmark version 23 against version 24 and use whichever is faster for them.
It is a non-trivial project to merge the old and new code into a single executable.
The FFTs in v24 were completely rewritten to do discrete weighted transforms
on numbers of the form k*2^n+c (the old code only did 2^n+/-1). This has been
a boon to other math projects like PFGW, SoB, LLR, Riesel search, Proth search, etc.
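
For readers unfamiliar with the term, here is a minimal sketch of the weighting idea behind a discrete weighted transform. It is not the v24 code. It covers only the simplest textbook case, multiplication modulo 2^(n*bits)+1, and uses a slow O(n^2) DFT for brevity. The function names (dft, negacyclic_convolve, mul_mod_fermat) and the parameters n and bits are made up for this illustration; handling general k*2^n+c takes considerably more work than is shown here.

```python
import cmath


def dft(xs, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(xs)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for j, x in enumerate(xs))
        out.append(s / n if inverse else s)
    return out


def negacyclic_convolve(a, b):
    """Negacyclic convolution of two equal-length integer lists.

    Each input is multiplied by weights w**j, where w**n == -1, before a
    plain cyclic convolution. Dividing the weights back out turns the
    wrap-around terms into subtractions, i.e. arithmetic modulo x**n + 1.
    """
    n = len(a)
    w = [cmath.exp(1j * cmath.pi * j / n) for j in range(n)]
    fa = dft([x * wj for x, wj in zip(a, w)])
    fb = dft([x * wj for x, wj in zip(b, w)])
    c = dft([x * y for x, y in zip(fa, fb)], inverse=True)
    return [round((cj / wj).real) for cj, wj in zip(c, w)]


def mul_mod_fermat(x, y, n=4, bits=4):
    """Return x*y mod (2**(n*bits) + 1) for 0 <= x, y < 2**(n*bits)."""
    modulus = (1 << (n * bits)) + 1
    mask = (1 << bits) - 1

    def digits(v):
        # Split v into n little-endian digits of `bits` bits each.
        return [(v >> (bits * i)) & mask for i in range(n)]

    c = negacyclic_convolve(digits(x), digits(y))
    # Recombine the (possibly negative) coefficients and reduce.
    return sum(cm * (1 << (bits * m)) for m, cm in enumerate(c)) % modulus
```

With the default n=4 and bits=4 the modulus is 2^16+1 = 65537, so mul_mod_fermat(12345, 54321) should agree with (12345 * 54321) % 65537. The weights are what remove the need to zero-pad the inputs and do a separate modular reduction afterward.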
During the rewrite, whenever I had to make a performance choice I optimized for
the Athlon instead of the P3. The biggest difference was in the memory layout. The
Athlon hated the rather convoluted layout that the P3 liked, and the P3 does not like the
more linear new layout.
With the P3 architecture becoming obsolete and the high cost of merging two different
code bases into one executable, I'll just let the knowledgeable user choose v23
or v24 to suit their needs.
_______________________________________________ Prime mailing list [email protected] http://hogranch.com/mailman/listinfo/prime
