2010/1/3 David Harvey <[email protected]>: > > > On Jan 3, 9:03 am, Bill Hart <[email protected]> wrote: > >> I figured I probably did misunderstand. What I think they are trying >> to get at is that one can use coefficients modulo 2^N+1 except at >> certain levels because the roots of unity are actually the same. But >> then the discussion becomes one of combining a Fermat and Mersenne >> transform, which is the basic strategy that the implementation uses. I >> got confused. I still am. > > I think they are saying that to multiply two integers whose product > has N bits, choose a and b (as in Lemma 1 of their paper) such that a > + b ~ N and then multiply modulo 2^a - 1 (use Mersenne transform) and > modulo 2^b + 1 (use Fermat transform). The smoothness comes from the > freedom to choose a and b. >
OK, I think I see. Their big N corresponds to an FFT for the original integers, which are multiplied mod 2^N+1 in the case of a Fermat transform or 2^N-1 in the case of a Mersenne transform. I think I would call these negacyclic and cyclic convolutions respectively. But then a Mersenne transform is nothing but an ordinary FFT, no? The idea of combining with a negacyclic FFT seems to be one of the strategies I discussed and discarded. I still don't understand what they mean by "power-of-2 roots of unity are needed only at the "lower level"". I understand that by the "lower level" they seem to mean the pointwise mults. But what is a power of 2 root of unity? Aren't they all powers of 2, except when using the sqrt2 trick. Maybe I don't understand the Schonhage-Strassen algorithm correctly. Perhaps they not only worked with coefficients mod 2^B+1 but also did the convolution mod x^m+1. They were worried about asymptotic complexity, not speed, so they wanted their algorithm to recurse so that the pointwise mults could be done using the same algorithm. Gosh that is exactly what the authors of this paper say! So my misunderstanding came from the fact that I assumed SS advocated a Mersenne transform, i.e. mod x^m-1, not a Fermat transform, mod x^m+1. Well, just to be clear. I worked mod x^m-1 because it is more efficient, and in a ring mod p where p = 2^B+1, for which one must perform a negacyclic convolution if one wishes to recurse, which involves twiddling to get an ordinary cyclic convolution. Anyhow, this is all a non-strategy once you have truncation. All that is required is to implement a single transform which works with coefficients mod 2^B-1, which I was formerly calling a Mersenne ring, when the large integers being multiplied are "small". Then you don't want to recurse for doing the pointwise mults and so the pointwise mults can be done mod 2^B/2-1 and 2^B/2+1 using our mpn_mulmod_2expp1 and mpn_mulmod_2expm1, which can do whatever is most efficient. Bill. -- You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en.
