Philippe Trottier wrote:

>         I don't know how the PrimeNet software works, but when I wanted really
> fast execution I did it like so (the last time I coded was in 1991)

*snap*

Sounds like loop unrolling is what you're talking about. Most modern compilers (try
to) do this automatically already. However, I've experimented with different
variations of this on the Linux source, up to v16 or so I think, and it seemed
possible to get small benefits from various kinds of loop unrolling. The biggest
problem is that the number of iterations isn't divisible by any fixed amount, so the
last few iterations have to be done "manually" outside the unrolled block. The main
advantage of such unrolling comes from not having to check the various timed events
in Prime95/mprime between every iteration. Because of cache considerations, actually
copying the whole FFT code out as many times as needed, instead of just calling it,
would probably make things worse rather than better.
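
To make the idea concrete, here is a rough sketch in C. It is not the Prime95/mprime
code: the names (ll_step, check_timed_events, UNROLL) are made up for illustration, the
unroll factor of 4 is arbitrary, and a plain 64-bit Lucas-Lehmer step stands in for the
real FFT squaring, so it only works for small exponents. It shows the squaring being
called several times per block with one event check per block, and the leftover
iterations being handled after the unrolled loop.

```c
#include <assert.h>
#include <stdint.h>

#define UNROLL 4  /* assumed unroll factor; must match the loop body below */

/* Counts event checks so the saving can be observed (illustration only). */
static unsigned long event_checks = 0;

/* Hypothetical stand-in for the timed-event checks done between iterations. */
static void check_timed_events(void) { event_checks++; }

/* One Lucas-Lehmer step s -> s*s - 2 (mod m); stands in for the FFT squaring.
   Adding m before subtracting 2 avoids unsigned underflow when s is 0 or 1. */
static uint64_t ll_step(uint64_t s, uint64_t m) {
    return (s * s + m - 2) % m;
}

/* Plain loop: one event check after every iteration. */
static uint64_t ll_residue_plain(unsigned p) {
    assert(p >= 3 && p <= 31);  /* keeps s*s within 64 bits */
    uint64_t m = (UINT64_C(1) << p) - 1;
    uint64_t s = 4;
    for (unsigned i = 0; i < p - 2; i++) {
        s = ll_step(s, m);
        check_timed_events();
    }
    return s;
}

/* Unrolled loop: one event check per block of UNROLL calls. */
static uint64_t ll_residue_unrolled(unsigned p) {
    assert(p >= 3 && p <= 31);
    uint64_t m = (UINT64_C(1) << p) - 1;
    uint64_t s = 4;
    unsigned n = p - 2;  /* iteration count, generally not a multiple of UNROLL */
    unsigned i = 0;

    for (; i + UNROLL <= n; i += UNROLL) {
        /* The squaring is called, not copied inline, to keep the code small. */
        s = ll_step(s, m);
        s = ll_step(s, m);
        s = ll_step(s, m);
        s = ll_step(s, m);
        check_timed_events();
    }

    /* The last n % UNROLL iterations are done "manually" outside the block. */
    for (; i < n; i++) {
        s = ll_step(s, m);
    }
    check_timed_events();  /* one final check after the leftovers */
    return s;
}
```

Both functions return the final residue, which is 0 exactly when 2^p - 1 is prime, so
ll_residue_unrolled(7) gives 0 just like the plain version. The difference is only in
how often the event check runs. Whether that is worth anything in practice depends on
how expensive the real checks are compared to one squaring.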

I've posted about this suggestion on this list before, so I hope the possible
optimizations have already been taken into consideration in v19, although as the
exponents increase the FFT code takes up more and more of the time and optimizing
the rest of the code becomes less important. I also seem to have forgotten the rest
of the optimizations I've toyed with ;>

 -Donwulff


_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
