In a message dated 9/10/2002 11:47:24 AM Pacific Daylight Time, [EMAIL PROTECTED] writes:


Let me see if I understand correctly.  Most of the literature I've seen
dealing with NTTs for large-integer multiplication has always said to
pick a field with size > (input values)^2 so that you get the exact
answer after the INTT.  It seems to me that allowing the calculation to
overflow in the dyadic squaring in an NTT wouldn't disrupt the final
answer.  Therefore, my question is whether the LL test can proceed as
such:

The literature is correct. The problem with modding in the intermediate steps
is that you lose the information needed to exactly reconstruct the outputs -
you only know what the outputs are modulo your NTT prime, and don't
know what the carries from the exact output digits into the higher-order
coefficients are. In that sense, floating-point FFT and NTT are similar -
in both cases you need enough bits to accommodate the full convolution
output digits. The only advantage NTT has here is that you don't sacrifice
any bits of your computer words to roundoff error, nor to the exponent
field that floats are required to carry around. For (say) IEEE doubles vs.
64-bit ints, that might translate into an NTT needing only 80% or so of
the vector length of an FFT. However, the fact that most CPUs can do
floating-point arithmetic (especially multiply) quite quickly tends to more
than make up for this modest precision advantage in practice - I have yet
to hear of an NTT implementation running on a popular general-purpose
processor that is less than 2X slower than a well-tuned FFT implementation.
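
To make the carry-loss point concrete, here is a minimal Python sketch
(all names are mine, purely illustrative, not from any GIMPS code).
convolve_mod stands in for what an NTT mod p actually hands back: the
convolution of the input digit vectors, with every output coefficient
reduced mod p. A direct O(n^2) convolution is used for clarity, and p
need not be prime for the demonstration. When p exceeds the largest true
coefficient, the reduction is vacuous and carrying recovers the exact
product; when it doesn't, at least one coefficient wraps and the product
comes out wrong:

def digits(x, base):
    """Little-endian digits of x in the given base."""
    out = []
    while x:
        out.append(x % base)
        x //= base
    return out or [0]

def convolve_mod(a, b, p):
    """Acyclic convolution of digit vectors a and b, with each output
    coefficient reduced mod p - i.e. what a (zero-padded) NTT mod p
    gives you back after the inverse transform."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % p
    return c

def release_carries(coeffs, base):
    """Turn convolution coefficients back into an integer."""
    x, scale = 0, 1
    for c in coeffs:
        x += c * scale
        scale *= base
    return x

base = 1 << 16                      # 16-bit input digits
x = 123456789123456789
y = 987654321987654321
a, b = digits(x, base), digits(y, base)

# Each output coefficient is a sum of at most n products of two digits,
# so it needs up to 2*16 + lg(n) bits - the "full convolution output
# digits" referred to above.
n = len(a)
p_big   = n * (base - 1)**2 + 1     # exceeds every true coefficient
p_small = 1 << 31                   # smaller than some true coefficients

print(release_carries(convolve_mod(a, b, p_big),   base) == x * y)  # True
print(release_carries(convolve_mod(a, b, p_small), base) == x * y)  # False

As for the 80% figure, one rough accounting (ignoring rounding-safety
margins, which real FFT code needs) goes like this: a 53-bit double
mantissa supports input digits of about (53 - lg n)/2 bits, while a prime
near 2^64 supports about (64 - lg n)/2 bits. At n around 2^20 that is
roughly 16 vs. 22 bits per word, so the NTT vector can be on the order of
three-quarters to four-fifths the length of the FFT vector.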

OTOH, on hardware like FPGAs, where floating-point is awkward but
multiple integer execution units can be implemented easily, NTT is
very often a hands-down winner for doing this kind of thing (though
generally not with the vector lengths and precision requirements we
have here at GIMPS).

-Ernst
