Mersenne: Re: Question on repeated squarings in the frequency domain
Brendan Younger [EMAIL PROTECTED] wrote: I know there has been some discussion on this list about the theoretical possibility of staying in frequency space and doing a bunch of point-wise squarings in there instead of computing the inverse DWT after every iteration. Now, I'm not suggesting this as a feasible avenue for GIMPS as I've read some of the arguments against it in the archives. I'm just curious as to how it might proceed if you could assume infinite precision, etc. For instance, would it be as simple as: DWT(x) - X^n - IDWT(X^n) == x^n (mod 2^q - 1) (where "n" is now only 2?) Yes. Of course since you have just one element in your input vector, the DWT multiplier is just unity, FFT(x) = x, and you're really just doing a scalar multiply and subsequent modular reduction (that's the carry step). Of course I'm simplifying by not including the subtraction by 2 For any length-n vector with just a single nonzero term in the first slot, i.e. x = (c, 0, 0, ... , 0), the FFT is trivial: FFT(x) = (c, c, c, ... , c), so to do the subtract-2 in the frequency space, just subtract 2 from *every* term after doing the dyadic squaring (there might be a factor 1/n in there, depending on where the divide-by-n occurs in your FFT implementation). Also, if it *is* this simple, could there be any modular reduction of the X^n before doing the IDWT? AFAIK, no, at least not in a floating-point-FFT-based scheme. If you were doing a number-theoretic transform modulo 2^q-1 you'd have your mod, but then things again reduce to the one-term scheme described above. Another question one might ask is, can one do any part of the normalize-outputs-and-carry step in frequency space? The problem here is that the forward-FFTed and dyadically- squared terms contain a mixture (linear superposition, to be precise) of the actual inverse FFT outputs one wants. Each term contains a different mix of the n outputs, and in the Crandall-Fagin DWT scheme, these outputs are w.r.to a mixture of bases, so even if one had a clever way to effect the normalization by a constant base in frequency space, it wouldn't work with DWT. Cheers, -Ernst
Mersenne: Re: Question on repeated squarings in the frequency domain
On Tuesday, September 10, 2002, at 01:00 PM, [EMAIL PROTECTED] wrote: Brendan Younger [EMAIL PROTECTED]> wrote: >I know there has been some discussion on this list about the >theoretical possibility of staying in frequency space and doing a bunch >of point-wise squarings in there instead of computing the inverse DWT >after every iteration. Now, I'm not suggesting this as a feasible >avenue for GIMPS as I've read some of the arguments against it in the >archives. I'm just curious as to how it might proceed if you could >assume infinite precision, etc. For instance, would it be as simple as: > >DWT(x) -> X^n -> IDWT(X^n) == x^n (mod 2^q - 1) (where "n" is now only 2?) Yes. Of course since you have just one element in your input vector, the DWT multiplier is just unity, FFT(x) = x, and you're really just doing a scalar multiply and subsequent modular reduction (that's the carry step). I'm sorry, I was talking about a sequence of x's, not just a single one. (i.e. the x_i for the base in the DWT) >Also, if it *is* this simple, could there be any >modular reduction of the X^n before doing the IDWT? AFAIK, no, at least not in a floating-point-FFT-based scheme. If you were doing a number-theoretic transform modulo 2^q-1 you'd have your mod, but then things again reduce to the one-term scheme described above. Another question one might ask is, can one do any part of the normalize-outputs-and-carry step in frequency space? The problem here is that the forward-FFTed and dyadically- squared terms contain a mixture (linear superposition, to be precise) of the actual inverse FFT outputs one wants. Each term contains a different mix of the n outputs, and in the Crandall-Fagin DWT scheme, these outputs are w.r.to a mixture of bases, so even if one had a clever way to effect the normalization by a constant base in frequency space, it wouldn't work with DWT. Let me see if I understand correctly. Most of the literature I've seen dealing with NTTs for large-integer multiplication has always said to pick a field with size > (input values)^2 so that you get the exact answer after the INTT. It seems to me that allowing the calculation to overflow in the dyadic squaring in an NTT wouldn't disrupt the final answer. Therefore, my question is whether the LL test can proceed as such: NTT(x_i) -> X_i = (X_i * X_i - 2) (q - 2 dyadic squarings modulo the NTT modulus and the necessary subtraction by two) -> INTT(X_i) == (0, or some other number) (mod 2^q - 1) My apologies for not making this more explicit earlier, but I didn't want to be shouted down with a "That's already been considered and discarded" response. Brendan Younger
Mersenne: WinXP SP1 slows prime95
Yesterday I went from Windows XP home to service pack 1. The speed of prime95 went down by over 2%. Has anyone else seen this? Any ideas on what caused it or how it can be fixed? +-+ | Jud McCranie | | | | Programming Achieved with Structure, Clarity, And Logic | +-+ _ Unsubscribe list info -- http://www.ndatech.com/mersenne/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers
Mersenne: Re: Question on repeated squarings in the frequency domain
In a message dated 9/10/2002 11:47:24 AM Pacific Daylight Time, [EMAIL PROTECTED] writes: Let me see if I understand correctly. Most of the literature I've seen dealing with NTTs for large-integer multiplication has always said to pick a field with size (input values)^2 so that you get the exact answer after the INTT. It seems to me that allowing the calculation to overflow in the dyadic squaring in an NTT wouldn't disrupt the final answer. Therefore, my question is whether the LL test can proceed as such: The literature is correct. The problem with modding in the intermediate steps is that you lose information needed to exactly reconstruct the outputs - you only know what the ouptuts are modulo your NTT prime, and don't know what the carries from the exact output digits into the higher-order coefficients are. In that sense, floating-point FFT and NTT are similar - in both cases you need enough bits to accomodate the full convolution output digits. The only advantage NTT has here is that you don't sacrifice any bits of your computer words to roundoff error, nor to the exponent field floats are required to carry around. For (say) IEEE doubles vs. 64-bit ints, that might translate into an NTT needing only 80% or so of the vector length of an FFT. However, the fact that most CPUs can do floating-point arithmetic (especially multiply) quite quickly tends to more than make up for this modest precision difference in practice - I have yet to hear of an NTT implementation running on a popular general-purpose processor that is less than 2X slower than a well-tuned FFT implementation. OTOH, on hardware like FPGAs, where floating-point is awkward but multiple integer execution units can be implemented easily, NTT is very often hands-down a winner for doing this kind of thing (though generally not with the vector lengths and precision requirements we have here at GIMPS.) -Ernst
Re: Mersenne: WinXP SP1 slows prime95
Yesterday I went from Windows XP home to service pack 1. The speed of prime95 went down by over 2%. Has anyone else seen this? Any ideas on what caused it or how it can be fixed? Code performance can vary more than that just based on cache and page boundries of where its loaded. _ Unsubscribe list info -- http://www.ndatech.com/mersenne/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers
Mersenne: Some problem about reporting a TF
I lost a TF I know my machine Torbenskværn was ready to report the TF of M19.875.7* today. It now has started M19.917.917. It never wrote the prime.log and it never was credited upon my account, and I don't have prime.spl for that result. What happened? Don't tell me the æ matters. I have results for that machine during the past 2,5 years - no problems. Of course I can back up to immediately before, I guess at 85% of 66 bit; and I will. This starts an another discussion: What happens to that amount of work done to a factor that I some time gets after someone else has brought it from 58 bits up to 64 or 65 where I get the assignment? It seems to me this is just forgotten work. Why don't we donate it to the challenge account? have a very niceday tsc PS.: Though I don't like TeamPrimeRib because they are so fast I can't catch up, I'm still amazed of their site. Go visit http://www.teamprimerib.com. The graphs are wonderful! And so the 7-days producers and the 30-days producers lists. Very good work, guys! Besides you smatch me out by a factor of at least 11. :-( _ Unsubscribe list info -- http://www.ndatech.com/mersenne/signup.htm Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers