Mersenne: Update on using Prime95 as a windows shell

1999-04-30 Thread Foghorn Leghorn

Thanks for the responses to my suggestion on this topic. Brian Beesley's 
ReCache code was very helpful. It doesn't lower the best possible iteration 
time attainable on my machine, but it does provide a reliable way to get it. 
In fact, I found that there is only a marginal difference between the 
performance of ReCache+Prime95 under the Explorer GUI versus using Prime95 
as a system shell. With the former, my pII-400 gets 0.188sec/iteration on an 
exponent around 6.39 million (320K FFT); with the latter, it gets either
0.188 or 0.187 sec/iter.

I made one small but useful enhancement to Brian's code: at the beginning of 
pass 2b, I added a call to the system() function using argv[2] as the 
argument. This saves the user from having to manually start Prime95 at the 
precise moment that Brian specified. On my desktop I have a shortcut with 
the command line
   c:\stuff\recache 64 c:\stuff\prime\prime95.exe
which starts prime95 using the modified ReCache program. Now when I leave my 
computer for the day or go to bed, I can close Prime95 and then restart it 
with this shortcut in order to ensure optimal performance while I'm gone.
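The change amounts to launching the program named by the second command-line argument at the start of pass 2b. A minimal sketch of that idea in Python (the real ReCache is a C program, so the function name and structure here are purely illustrative):

```python
import os

def recache(cache_kb, program):
    """Hypothetical sketch of the modified ReCache flow (illustrative only)."""
    # ... pass 1 and pass 2a of the cache-priming loop would run here ...

    # Start of pass 2b: launch the worker (argv[2] in the real C code,
    # e.g. c:\stuff\prime\prime95.exe) so the user doesn't have to start
    # it manually at exactly the right moment.
    status = os.system(program)

    # ... remainder of pass 2b ...
    return status
```

In the actual C code this is just the one-line call system(argv[2]); at the top of pass 2b.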



Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm



Mersenne: JavaPrime update... (Lotsa Questions)

1999-04-30 Thread Blosser, Jeremy

Okay, I've gone through and figured out a lot about FFTs and have done a ton
of reading... So of course, now I have some questions...

I've rewritten JavaPrime so it is no longer based on Lucas.c, and it now runs at
approximately the same speed as Lucas.c (3 sec/iteration @ 256k FFT) on my
P2-412 (overclocked).

But of course, I have some questions for the great GIMPS gurus out there:

a) I'm doing multi-radix FFTs in multiple passes. For example the 256k FFT
does six Radix-8 FFT passes. Is this a good way of doing it, or is there
more optimization to be had? For example, are there more geometric patterns
I can take advantage of? So far, I've only seen 4, 5, 8 and 10.
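For what it's worth, the pass count follows directly from the radix: a 256k-point FFT is 2^18 points, and each radix-8 pass resolves 3 of those 18 bits, which is where the six passes come from. A quick check of that arithmetic:

```python
# 256k points = 2^18; a radix-2^r pass resolves r bits, so a full
# transform takes 18 / r passes. Radix-8 (r = 3) gives exactly six.
n = 256 * 1024
assert n == 2 ** 18 == 8 ** 6

for r in (1, 2, 3):                       # radix-2, radix-4, radix-8
    print("radix-%d: %d passes" % (2 ** r, 18 // r))
```

Larger radices trade fewer passes (fewer trips over the data) for more work and more registers per butterfly, which is usually the real constraint.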

b) Would it be more optimal to use a properly sized FFT for the
calculation than to use the 256k or 512k FFT? For example, a 64k FFT is
subsecond, so is it feasible to initialize several different FFT engines for
properly sized numbers? Example:
N                                   Min FFT size for N^2, in bytes
4                                   2    (1 bit)
14                                  2    ...
194                                 2    ...
37634                               2    ...
1416317954                          4    (2 bits)
2005956546822746114                 8    (3 bits)
4.0238616677410360228256356561e+36  16   (4 bits)
1.6191462721115671781777559070e+73  32   (5 bits)
2.621634650492785145260593695e+146  64   (6 bits)
6.872968240664427723883748623e+292  128  (7 bits)
4.72376924371818749845167e+585      256  (8 bits)
...                                 ...
2^6466417                           512k (19 bits)

So, I'm thinking you could optimize the multiplication time by using the
proper-size FFT for the job, right? I guess I could always try it. :P
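The table's doubling pattern is easy to regenerate: starting from 4, each integer row is the square (minus 2) of the previous one, so the bit length of N roughly doubles per row, and with it the minimum FFT size needed to square it. A quick sketch (the "min FFT size" column itself is taken from the table above, not computed here):

```python
# Regenerate the integer rows of the table: N -> N^2 - 2, starting at 4.
seq = [4]
for _ in range(5):
    seq.append(seq[-1] ** 2 - 2)

for n in seq:
    print(n, "->", n.bit_length(), "bits")
# The bit length roughly doubles at each step, so the FFT needed to
# square N must also double in size from row to row.
```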

Also, I'm wondering if any pattern begins to occur in the N = N^2 - 2 (mod 2^P - 1)
sequence... Do you think N ever repeats itself? Has anyone checked this?
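On the repetition question: each term is completely determined by the previous one, and there are only finitely many residues mod 2^P - 1, so by the pigeonhole principle the sequence must eventually repeat, i.e. fall into a cycle. A small sketch that finds the first repeat for a toy exponent (ll_cycle is a made-up name, not anything from the GIMPS code):

```python
def ll_cycle(p):
    """Iterate n -> n^2 - 2 (mod 2^p - 1) from 4 until a value recurs.

    Returns (i, j): the first index i whose value shows up again at index j.
    """
    m = (1 << p) - 1
    seen = {}
    n, i = 4, 0
    while n not in seen:
        seen[n] = i
        n = (n * n - 2) % m
        i += 1
    return seen[n], i

# For a toy modulus the cycle appears quickly, e.g.:
print(ll_cycle(11))   # first repeated index pair for M11 = 2047
```

How quickly it cycles for full-size exponents is a different question; the guarantee above is only that a cycle exists.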

c) At some point, doing the math via FFT will become more expensive than a
disk read (especially if we plan on doing the 1,000,000,000-digit prime). So
wouldn't a DB type lookup be more efficient at that point? In which case,
multithreading and multiple computers working on the prime could really
start to come into play.

d) Has anyone looked into NTTs? It would seem that processors with weak FP
would benefit from an NTT, plus we'll eventually have to go there for REALLY
big numbers due to loss of precision... I think that doubles only get us to
a certain precision.
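On the precision point: an IEEE-754 double carries a 53-bit significand, so integers above 2^53 can no longer all be represented exactly, and that is ultimately what bounds how many bits per element a floating-point convolution can carry before round-off eats the result. A one-line illustration:

```python
# A double's 53-bit significand: 2^53 is representable, 2^53 + 1 is not,
# so adding 1 at that magnitude is silently lost.
assert float(2 ** 53) + 1.0 == float(2 ** 53)
assert float(2 ** 52) + 1.0 != float(2 ** 52)
print("doubles lose exact integers above 2^53")
```

An NTT works over a finite field instead, so it has no round-off at all; the trade is that modular arithmetic replaces the FP hardware the transform would otherwise ride on.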

e) Can N=N^2-2 be represented as a series? If so, wouldn't that be faster?
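On (e): the recurrence does have a known closed form, s_k = (2+sqrt(3))^(2^k) + (2-sqrt(3))^(2^k) (easy to check: squaring the sum makes the cross term 2*((2+sqrt(3))(2-sqrt(3)))^(2^k) = 2, which is exactly the -2). It doesn't obviously help, though: evaluating that power by repeated squaring costs about the same as just running the recurrence. A quick exact check, doing the arithmetic in Z[sqrt(3)] as (x, y) pairs meaning x + y*sqrt(3):

```python
def closed_form(k):
    """s_k = (2+sqrt(3))^(2^k) + (2-sqrt(3))^(2^k), computed exactly.

    Square (x, y) = x + y*sqrt(3) k times to get (2+sqrt(3))^(2^k);
    the sum with its conjugate is then just 2*x.
    """
    x, y = 2, 1                               # 2 + 1*sqrt(3)
    for _ in range(k):
        x, y = x * x + 3 * y * y, 2 * x * y   # (x + y*sqrt(3))^2
    return 2 * x

print([closed_form(k) for k in range(4)])     # -> [4, 14, 194, 37634]
```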

f) Can the subtraction be done without coming back into "real space"? The
FFTs kill ya. ;P
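On (f): at least the "-2" can stay in the frequency domain. The DFT is linear, and the transform of an impulse of height 2 at position 0 is the constant 2 in every bin, so subtracting 2 from every frequency coefficient equals subtracting 2 from the least-significant word in real space. (You still have to come back to real space for carry propagation and the mod, though.) A tiny pure-Python check of the linearity argument, using a naive O(n^2) DFT on a toy vector:

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]

x = [5.0, 1.0, 4.0, 1.0]          # toy "digit" vector
Y = [c - 2 for c in dft(x)]       # subtract 2 from EVERY frequency bin
y = [round(v.real) for v in idft(Y)]
print(y)                          # -> [3, 1, 4, 1]: only element 0 changed
```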

I forget the rest...

-Jeremy
