# Mersenne: NTT faster than FFT ?

`I might have found a way of speeding up NTT that are usually slower than FFT.`
```
Firts, as some people may not know Number Theoritic Transfoms ( NTT ) , I
am going to remember few ideas about them. NTT are also called FFT modulo a
prime. Instead of using complex numbers, NTT compute wxith integer modulo a
prime p. Thus, all operations are done modulo p ( that is to say, in the
ring of integers modulo p) .

There are restrictions for the choice of p :
- the length l of then NTT must divide (p-1) ;
- the radix b must be lower than p ;
- p must be prime and (p-1)highly composite.
With 32 bit integers , a common choice is : p=13*2^28+1 ; b=2^28.
So, the number M represented by an array t[0..(l-1)] of l elements is :
M=t+t*b+...+t[l-1]*b^(l-1)=t+t*2^28+...+t[l-1]*(2^28)^(l-1) .

Usually, we do several NTT modulo different primes and then use the Chinese
Remainder Theorem to get t ;...; t[i] ;...; t[l-1] modulo p, because the
terms in the central loop of the transform can exceed the largest integer
that fits in a word machine (for instance, 2^32-1) . That's one of the
reasons why NTT are slower than FTT.

Why don't we compute these terms modulo p at each step of the central loop ?
Consequently, it would be useless to do several NTT and the Chinese
Remainder Theorem would not be required.

Even if I am wrong, the speed of NTT can be improved by using 64 bit
integers ( shorter NTT length ) supported by both new processors and new C
compilers, and by using Intel MMX instructions.

_________________________________________________________________________
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
```