from:"Peter\-Lawrence . Montgomery"

Mersenne: Penultimate Lucas-Lehmer step

2003-12-11 Thread Peter-Lawrence . Montgomery


 Let p > 2 be prime and  Mp = 2^p - 1.
The familiar Lucas-Lehmer test sets a[0] = 4
and a[j+1] = a[j]^2 - 2 for j >= 0.
Mp is prime if and only if a[p-1] == 0 mod Mp.

When Mp is prime, then 

a[p-2]^2 == 2 == 2*Mp + 2 = 2^(p+1)  (mod Mp).

Taking square roots, either

   a[p-2] ==  2^((p+1)/2) mod Mp
or
   a[p-2] == -2^((p+1)/2) mod Mp.


Around 20 years ago I heard that nobody could predict
which of these would occur.  For example,

  p = 3a[1] = 4 == 2^2 (mod 7)
  p = 5a[3] = 194 == 2^3 (mod 31)
  p = 7a[5] = 1416317954 == -2^4 (mod 127).

Now that 40 Mersenne primes are known, can anyone
extend this table further?  That will let us test
heuristics, such as whether both  +- 2^((p+1)/2) 
seem to occur 50% of the time, and
provide data to support or disprove conjectures.

  Peter Montgomery


_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Re: M#40 verification run

2003-06-12 Thread Peter-Lawrence . Montgomery

Nathan Russell <[EMAIL PROTECTED]> comments

--On Wednesday, June 11, 2003 10:58 PM -0400 George Woltman 
<[EMAIL PROTECTED]> wrote:

> 2)  I'll change prime95 to NOT delete the save files when a new prime is
> found. When a prime is reported, I'll ask for the save file and rerun the
> last 30 minutes of the test.  I think this would have helped in this
> case.  This measure also deters a hacker.   It might be easy for a hacker
> to spoof a prime report - but he'll have difficulty generating a save
> file that reports the number as prime after the final 30 minutes of LL
> testing.

That is a collosal understatement.  Every modulo operation destroys 
information, and I'm not sure whether one COULD construct such a file.

Nathan

 Suppose we are able to go back 50 iterations.  That is,
suppose the checkpoint gives a value x[0] such that if we iterate

   x[i+1] = x[i]^2 - 2 (mod Mp)

for i = 0, 1, ..., 49, then we encounter x[50] = 0.

 Let q be a prime factor of Mp.  Solving the quadratic

   x[0] = alpha + 1/alpha(mod q)

for alpha gives either a solution in GF(q) (with alpha^q = alpha)
or a solution in GF(q^2) (with alpha^q = 1/alpha).  In both cases

 x[i] = alpha^(2^i) + alpha^(-2^i) (mod q)

for all i.  The assumption x[50] = 0 implies alpha^(2^51) = -1 (mod q).
The multiplicative order of alpha must be 2^52.  But we know this
order divides one of q +- 1.  Hence q == +- 1 (mod 2^52).

   This argument applies to all prime factors q of Mp.
It seems very unlikely that all prime factors will be +- 1 (mod 2^52)
unless Mp is itself prime.

_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Why is trial factoring of small exponents slower than large ones?

2003-02-06 Thread Peter-Lawrence . Montgomery

= From [EMAIL PROTECTED] Fri Feb  7 05:15:26 2003

= I am using mprime 22.12 on a pentium 166 MMX to do trial factoring. For the
= exponents currently being assigned from primenet it takes this machine about
= 12 minutes to factor from 2^57 to 2^58.
= 
= I thought I would try factoring some small exponents (under 1,000,000) from
= the nofactors.zip file. I put FactorOverride=64 into prime.ini and started
= mprime as usual but progress is _much_ slower, it will take about 8 hours to
= factor from 2^57 to 2^58.
= 
= Can someone tell me why the time difference is so great?
= 
= 
= Geoff.

 If you are looking for divisors q of Mp, where p and q are prime, then
q must be 1 mod p.  This eliminates all but (2^58 - 2^57)/p = 2^57/p
potential values of q in [2^57, 2^58].  Other tests
(e.g., skip any potential q which is 3 or 5 mod 8, skip multiples of 3)
eliminate a fixed percentage of the potential q.  
The number of survivors is inversely proportional to p.

Checking whether an individual q divides 2^p - 1 takes about log2(p) 
squarings modulo q.  The overall cost of O(1/p) such tests is O(log(p)/p)

 p  p/log(p)
 2^1926000
 2^21   10
 2^23   36
 2^25  130 

A test with p near 2^19 is estimated to take 130/26000 = 50
times as long as one with p near 2^25  (64 times as many candidates,
19/25 times as long per candidate).
_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Mersenne: Problem of the year(s)

2003-01-03 Thread Peter-Lawrence . Montgomery

 A Mersenne number M_p = 2^p - 1, where p is prime and p < 1000,
has a prime divisor q with q == 1 (mod 2002) and q == -1 (mod 2001)
[== denotes congruent].  Find q mod 2003.

Peter Montgomery
_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Request for help (especially ECM)

2002-11-10 Thread Peter-Lawrence . Montgomery

"Brian J. Beesley" <[EMAIL PROTECTED]> wrote
- From [EMAIL PROTECTED] Sun Nov 10 20:22:33 2002

- On Saturday 09 November 2002 04:45, George Woltman wrote:
- > A harder problem is finding some smooth ECM curves to test.  I do not
- > have tools to compute group orders.  If someone can help by finding a
- > couple of dozen smooth ECM test cases for exponents between 1000
- > and 50, I would be most grateful.
- 
- Here's one example:
- 
- With sigma=1459848859275459, Prime95 v22.12 finds the factor 
- 777288435261989969 of M1123:
- 
- in stage 1 with B1 >= 535489
- in stage 2 with B1 >= 38917 & B2 >= 534241
- 
- I'm not entirely sure why the B2 required to find the factor at the end of 
- stage 2 is smaller than the B1 required to find it in stage 1. One of the 
- improvements I guess.
- 
- Regards
- Brian Beesley

   Probably the 534241 entry is compared against several others, 
including 1248 (= 535489 - 534241).

   How does one map sigma to a curve (and initial point)?  
What is the range of sigma (it seems to go beyond 2^32)?

   The ECM tests should include a case where two primes are found at the
same time during step 2, because the largest primes dividing the
two group orders are equal.  [That is, the GCD will be composite.]
This test may be hard to construct, however.

  Either the ECM tests or the p-1 tests should include a case where
the group order (or p-1) is divisible by a power of a moderate prime,
such as 61^3 or 757^2 .

Peter Montgomery[EMAIL PROTECTED]

_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Ernst Mayer's GIMPS pages

2002-11-08 Thread Peter-Lawrence . Montgomery

- From [EMAIL PROTECTED] Sat Nov  9 00:21:04 2002

Anurag Garg <[EMAIL PROTECTED]> writes

- I have been unable to reach his pages using both the domain name
- hogranch.com - which gets a non-existant domain name error - or the IP
- address 209.133.33.182 .
- 
- What gives?
- Anurag
- 
How many times did you try, and over how long a period? 
Last I heard, Ernst Mayer lived near San Jose, CA, and kept the 
web pages on a home machine.  One headline at www.sjmercury.com reads

 Wet and Wild
 
 Storm knocks out power, closes schools, and causes flooding.

Even if the power has remained up, it would be prudent for Ernst to 
keep the machines off.


_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: re: EE Times reports Faster Multiplication

2002-07-30 Thread Peter-Lawrence . Montgomery


"Griffith, Shaun" <[EMAIL PROTECTED]> asks

> Luke Welsh wrote:
> 
> > The EE times (www.eet.com) reports:
> > 
> > "Research by a team of U.K. mathematicians has produced a different 
> > approach to two of the most fundamental aspects of electronic computation:
> > hardware addition and multiplication."
> 
> Can anyone tell me how to find the online version? I spent a few minutes
> trying, but didn't get anywhere. I don't mind registering either.
> 
> -Shaun
> _

The following is taken from newsgroup comp.arch.arithmetic .

- From: [EMAIL PROTECTED] (Jim Battle)
- Newsgroups: comp.arch.arithmetic
- Subject: Re: UK startup claims 2x speedup for multipliers
- Date: 28 Jul 2002 13:09:43 -0700
- Organization: http://groups.google.com/
- Lines: 20
- Message-ID: <[EMAIL PROTECTED]>
- References: 
- NNTP-Posting-Host: 66.125.237.190
- Content-Type: text/plain; charset=ISO-8859-1
- Content-Transfer-Encoding: 8bit
- X-Trace: posting.google.com 1027886983 17224 127.0.0.1 (28 Jul 2002 20:09:43 GMT)
- X-Complaints-To: [EMAIL PROTECTED]
- NNTP-Posting-Date: 28 Jul 2002 20:09:43 GMT
- 
- "Norbert Juffa" <[EMAIL PROTECTED]> wrote in message 
news:...
- > I read in EE Times of July 22,2002,  page 20, that there is a company in the
- > UK that claims they have a ways of building full adders and 4:2 compressors
- > that are much faster than those built using traditional approaches. They claim
- > this enables them to implement multipliers that have 2x the speed of traditional
- > multiplier approaches. Of course, no details are revealed. The company's web
- > site is at www.autopd.com. I looked at it but no details are given either, as one
- > might expect.
- 
- Actually, they do give a reason:
- 
- http://www.autopd.com/ax_multiplier.html
- 
- It says that they have a logarithmic-type compression tree structure,
- but that has only nearest-neighbor connections (typically,
- wallace-tree and such have very irregular layout).  The resulting
- short wires and low capacitance is what gives them their claimed
- 50%-100% speedup.
- 
- Now whether it is true or just reinvented a wheel, time will tell...
- 
- 
_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Re: SF Bay GIMPS party (was Mersenne #39)

2001-11-18 Thread Peter-Lawrence . Montgomery



"John R Pierce" <[EMAIL PROTECTED]> writes

- > Can anyone come?  Announce it on the list if you do it.
- > Where abouts are they held?
- 
--  I will if I can.  Depends on the date and family conflicts and where in the
- rather sprawling SFBA it is... I'm in Santa Cruz, so naturally would prefer
- points towards the south end of the Bay Area...

   The Western Number Theory Conference 

http://home.earthlink.net/~bartgoddard/mainpage.html

meets in Monterey (acutally Asilomar) Sunday-Thursday December 16-20.  
It is four weeks away, so some out-of-town attendees may not 
have bought air tickets yet and might be able to 
stop between the SFO/OAK/SJC airports and Monterey.

If the Mersenne gathering is in Silicon Valley around 4 pm on Thursday 
December 20, then attendees will have time to get from Monterey to 
the event, and still have time to get to SFO two hours before
a red-eye flight to the east coast.  The meeting location will need storage 
for luggage.

If the gathering is in the Santa Cruz/Monterey area, it should be the
afternoon of Tuesday December 18, which is a free period for conference 
attendees.  Point Lobos and the Monterey Aquarium are the most common
destinations on these afternoons.

I reside in Marin County, north of San Francisco.  I rarely attend
evening (as opposed to multi-day) events in Silicon Valley, since it
is too hard to get home after the last evening commute buses leave 
San Francisco.  Plus, if I eat near Mountain View, I'll need to use the
toilet when I reach San Francisco, but the Transbay Terminal toilets
are closed at night.

   Peter Montgomery
_
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Extracting digit of real number

2001-10-09 Thread Peter-Lawrence . Montgomery


To: [EMAIL PROTECTED]
Subject: Mersenne: n-th decimal digit in real number
Date: Tue, 09 Oct 2001 15:09:08 -0700
Mime-Version: 1.0
Content-Type: text/plain; format=flowed
Message-ID: <[EMAIL PROTECTED]>
X-OriginalArrivalTime: 09 Oct 2001 22:09:08.0593 (UTC) FILETIME=[048F1610:01C1510F]
Sender: [EMAIL PROTECTED]
Precedence: bulk
Status: R


"Robert Braunwart" <[EMAIL PROTECTED]> wrote


>What do you think the best is the best approach for this problem?
>
>Calculate the n th. (any number) decimal digit of the expansion of a
>regular rational 1/n (not necessarily the same n), with n a prime number,
>without calculating the preceding digits.
>

   Let x be a positive real number.  
If d is the n-th decimal digit of x, then

 d is an integer
 0 <= d <= 9
 FLOOR(10^n * x) == d (mod 10)

For example, if x = sqrt(2) = 1.4142135 ..., and n = 4, then

  FLOOR(10^4 * x) = FLOOR(14142.135 ...) = 14142.

Reduce this modulo 10, to find the fourth decimal digit
(to the right of the decimal point) is a 2.


>Please give a example using substitution into formula.


  If r is a rational number, say r = r1/r2, 
with r1, r2 positive integers, then

FLOOR(10^n * r) mod 10 
  = FLOOR(10*(r1*10^(n-1) mod r2)/r2)

For example, to get the fourth decimal digit of 3/7, evaluate
3 * 10^3 mod 7 = 3000 mod 7 = 4.
FLOOR(10*4/7) = FLOOR(40/7) gives 5, which agrees with 3/7 = .428571 ...


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: M727 factored!

2001-09-07 Thread Peter-Lawrence . Montgomery


"Daran" <[EMAIL PROTECTED]> asks

> Is M751 now the smallest unfactorised composite Mersenne?  What is the
> smallest Mersenne not completely factorised?

M751 and M809 are the first Mersennes with no known factor.

The first holes in the 2^n - 1 table are 2,673- c151 and 2,683- c203.
This means, for example, that 2^673 - 1 is partially factored, 
but it has a 151-digit composite cofactor.

The first holes in the 2^n + 1 table are 2,647+ c169 and 2,653+ c154.

The first holes in the 2LM table are 2,1238L c160 and 2,1238M c145.
These denote cofactors of 2^619 - 2^310 + 1 and 2^619 + 2^310 + 1,
both of which divide 2^1238 + 1.

Below are ten recently found factorizations.

Algebraic factors, such as the factors 23 and 89 of 2^671 - 1
(these divide 2^11 - 1, which in turn divides 2^671 - 1) do not appear.
 
Factors above the * lines were previously known.

Peter Montgomery
[EMAIL PROTECTED]
September, 2001

C(2,619+)
* c186 = p91.p96
 
1257388159910804265763446600825278256318012249697661907431303034629811311740864601663325811
 
576735513593459498091224888775105070536100466874272552892992673568274909599171018374973329229033

C(2,632+)
 286297736737
* c177 = p66.p111
 47121166891830156151548920824621951426679953187468148445645729
 
514032034228747931945571962470228019101894473686825197910636442532520433593399910044880487939432413264840902977

C(2,641+)
 1283  32051  139739  353833  1078163
* c169 = p59.p110
 73819843823154749726309925820314356063695778208135055585507
 
18795947089943685289042850201861326689719641302766796883477502913493254840480457593719603600834006204492522841

C(2,641-)
 35897  4  1173835097  2401258891949526685926151441
* c148 = p69.p79
 745276300734440606226386924312213175677903182797334854064486587296999
 2420161564200739329410254310444778820196576654139080232429544162649795567983079

C(2,643-)
 3189281
* c188 = p71.p117
 22532429052605670225026391054393428833168207234802434915090881303620353
 
507909591297683949138862971271266635431758872031092542127980551589004038646657157217329569167343063743426799521984799

C(2,671-)
 116356769  33491655209  64110547427930873
* c145 = p68.p78
 13646560594525825890627182668772241639702837721889959372317451952089
 608833519146176962786346063898868909094632504100539398786357475514441579020823

C(2,727-)
* c219 = p98.p122
 
17606291711815434037934881872331611670777491166445300472749449436575622328171096762265466521858927
 
40099499726183758517891939428601665707063794593443940689888526556802581529262728143398959743444150539520890742947533452401

C(2,1202L)
 7213
* c178 = p87.p91
 
322191336498946329196503049475322564558677154558189688098324958943433876457693205092709
 
3571063752373727959434120513220011301363863161567383982128636656862221716975343445451820353

C(2,1222L)
 1363753
* c161 = p69.p92
 390941529316414655423854492083019690148253103500590794058313678419233
 
70215922956051621713037377195110638684576079957295845924735328290270267776117780526372201309

C(2,1234M)
 86381  7367588575848411802768597653205046693
* c144 = p56.p88
 25030363534817185101957125006047751030454874478028239473
 
6828519750766734356393222104746037438762841641726469993132504189166732581463128027874013



_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Mersenne: All Cunningham Wanted Numbers Factored

2001-09-07 Thread Peter-Lawrence . Montgomery


 All Cunningham Wanted Numbers Factored
  Peter Montgomery
  [EMAIL PROTECTED]
  September 7, 2001

The Cunningham tables have factorizations of b^n +- 1
for small b and n.  The full reference is

   John Brillhart, D.H. Lehmer, J.L. Selfridge,
   Bryant Tuckerman, and S.S. Wagstaff, Jr.,
   "Factorizations of b^n +- 1, b = 2, 3, 5, 6, 7, 10, 11, 12
up to high powers", Contemporary Mathematics,
   Volume 22, American Mathematical Society.

   http://www.cerias.purdue.edu/homes/ssw/cun/index.html

The first edition appeared in 1983 and the second in 1988.
Sam Wagstaff, the principal maintainer of the tables,
is sending a third edition to the publisher this month.

TRADITION OF WANTED NUMBERS

One Cunningham project tradition has been lists of wanted numbers,
usually chosen by author John Selfridge.  These have served as challenge
numbers -- somebody who claims a superior factoring
algorithm will typically demonstrate it on these numbers,
which have resisted repeated attempts by others.

Pages lviii-lix of the 1983 edition list 10 "most wanted" numbers
and 15 "more wanted" numbers.
The largest of these is the 71-digit repunit (10^71 - 1)/9.
All were factored within about two years, usually by the just-discovered
quadratic sieve or by P-1.

Page lxxxvi of the 1988 edition lists
10 "most wanted" numbers and 22 "more wanted" numbers,
the smallest being the 86-digit number (3^199 - 1)/(2 * 1360648969)
and the largest being the 148-digit cofactor (2^512 + 1)/2424833
of the Fermat number F9.  All of these were later factored, usually by the
elliptic curve method (ECM), multiple-polynomial quadratic sieve (MPQS), or
number field sieve (NFS).  The F9 factorization made international headlines.

The wanted lists have been revised many times between printed editions,
approximately annually, as existing entries are factored.  The largest
number to ever appear, a 291-digit cofactor of F10 = 2^1024 + 1, was
factored in October, 1995, when Richard Brent found a 40-digit factor by ECM.
The May 31, 2001 list has 10 "most wanted" numbers and 16 "more wanted"
numbers, ranging from 144-digit cofactors of 2^631 + 1 and
2^617 + 2^309 + 1 to the 193-digit number (11^191 - 1)/195867 .
[There had been 34 wanted entries in July, 2000 --
the other 8 were factored before May 31.]

ENTIRELY NEW LISTS

The publication of the 3rd edition starts a new era.
For the first time since the 1983 edition appeared,
the old lists have been completed en masse.
All 26 numbers on the May 31 lists were factored by September 7, or
99 days later.  The average pace was one number completed every four days.

The following individuals contributed to this effort:

  Bruce Dodson, Lehigh University, Bethlehem, PA, USA
  Jens Franke, Institute for Pure Mathematics, Bonn, Germany
  Torbjorn Granlund, Swox, Stockholm, Sweden
  Thorsten Kleinjung, Institute for Pure Mathematics, Bonn, Germany
  John Klos
  Arjen Lenstra, Citicorp, NY, USA (home in NJ)
  Paul Leyland, Microsoft Research, Cambridge, UK
  Peter Montgomery, Microsoft Research, Redmond, WA, USA (home in CA)
  Robert Silverman, Bedford, MA, USA
  Herman teRiele, CWI, Amsterdam, The Netherlands

We used computer facilities at

  Centrum voor Wiskunde en Informatica (CWI), Amsterdam, The Netherlands
  High Performance Computing Center North, Stockholm, Sweden
  Institute for Applied Mathematics, Bonn, Germany
  Institute for Pure Mathematics, Bonn, Germany
  Lehigh University, Bethlehem, PA, USA
  Microsoft Research, Cambridge, UK
  RSA Security, Bedford, MA, USA
  SARA, Amsterdam, The Netherlands
  Stockholm University, Department of Mathematics, Stockholm, Sweden
  Swedish Institute of Computer Science, Stockholm, Sweden
  Swox, Stockholm, Sweden

plus some home PCs (including a laptop during intercontinental flights!).

Only one of the 26 wanted numbers had a prime factor under 47 digits.
That number, a c168 cofactor of 3^382+, left a c127 when its 42-digit
factor was found by ECM.  All other factors were found by SNFS.

One wanted number had two factors over 90 digits:

 2,619+ = 3 * p91 * p96
   p91 = 125738815991080426576344660082527825631801224969 \
 7661907431303034629811311740864601663325811
   p96 = 576735513593459498091224888775105070536100466874 \
 272552892992673568274909599171018374973329229033

This is just below the previous size record 10^211 - 1 = 3*3*p93*p118.
Both are dwarfed, however, by the new factorization

 2^727 - 1 = p98 * p122
   p98 = 1760629171181543403793488187233161167077749116644 \
 5300472749449436575622328171096762265466521858927
   p122 = 400994997261837585178919394286016657070637945934 \
 4394068988852655680258152926272814339895974344415 \
 0539520890742947533452401

Re: Mersenne: M(2) - composite?!

2001-07-31 Thread Peter-Lawrence . Montgomery


> From: Nathan Russell <[EMAIL PROTECTED]>

> Hi all,
> 
> I'm a bit puzzled.  The other day, I donated blood and kept my mind
> busy by doing LL tests on a few exponents mentally.  I kept getting
> the result that the LL test for M(2) ends up resulting in a repeating
> value of -1, and certainly cannot ever become zero.  Am I missing
> something really obvious?  I confirmed it on paper later to make sure
> I didn't make a mistake in the mental math.  

 
One proof of the LL test notes that if we define

a[0] = 4
a[j+1] = a[j]^2 - 2(j >= 0)
then 
a[j] = (2 + sqrt(3))^(2^j) + (2 - sqrt(3))^(2^j) (j >= 0)

This is a simple induction argument.  Next, if

   alpha = (1 + sqrt(3))/sqrt(2)
   beta = (1 - sqrt(3))/sqrt(2)
then
   alpha^2 = 2 + sqrt(3)
   beta^2  = 2 - sqrt(3)
   alpha*beta = -1
so
   a[j-1] = alpha^(2^j) + beta^(2^j)(j >= 1)

If N = 2^p - 1 is prime, and p >= 3 is odd, then N == 7 (mod 24).
Therefore 2 is a quadratic residue (indeed, 2^((p+1)/2) is a square
root of 2 modulo N) but 3 is a quadratic nonresidue.  
alpha and beta are algebraic conjugates in GF(N^2).
So alpha^N = beta and beta^N = alpha (mod N).  
Since alpha*beta = -1, this implies

   alpha^(N+1) == beta^(N+1) == -1 (mod N)

and a[p-1] == alpha^(N+1) + beta^(N+1) == -2 (mod N) .
Therefore a[p-2] == 0 (mod N).

When p = 2 and N = 3, we can neither claim that 2 is a 
quadratic residue nor that 3 is a non-residue.
Both alpha and beta reduce to 1/sqrt(2) mod 3.
As in the case where p is odd, alpha and beta
are in the extension field GF(N^2) and not the base field GF(N).
Unlike that case, alpha = beta rather than alpha = beta^N and beta = alpha^N.
The proof falls apart.

   Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Re: [PrimeNumbers] RSA

2001-07-30 Thread Peter-Lawrence . Montgomery

> From: Nathan Russell <[EMAIL PROTECTED]> writes

> On Mon, 30 Jul 2001 11:45:26 -0700 (PDT), "Pavlos S."
> <[EMAIL PROTECTED]> wrote:

> >It has been a while since RSA labs published new
> >factoring challenges..Does anyone know if there is any
> >project going on at the moment for these challenges?
> >(probably with the help of the Number Field Sieve)

 Some of us plan to start RSA-576 later this year.  
Bruce Dodson ([EMAIL PROTECTED]) is the organizer.

> On a related subject, has anyone seen signs of life at the NFSnet
> project in the last year or so?  I've emailed the leader about five
> times asking to reserve work, with no response one way or the other.  

Many of us share your frustrations.  
Some of us are presently trying to finish the Cunningham wanted list, 
and have factored half of the 26 composite entries on the May 31, 2001 list.  
The NFSNET web page has long shown NFSNET to be 92% done sieving 2,641-, 
one of the remaining 13, but we get no response from its leader.
Contact Bruce Dodson (above) if you want to assist on 5,283+. 
[BTW, 5^283 is about 7 times 2^641.]

  Peter Montgomery

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: P-1

2001-07-25 Thread Peter-Lawrence . Montgomery


>  From: "CARLETON GARRISON" <[EMAIL PROTECTED]>

> I honestly thought that the long term goal (maybe not by this panel but for
> others) was to factor all these numbers and that we were setting/recording a
> lower boundary for that effort.
> 
> Carleton Garrison

 It will be a _long_ time before we are completely factoring those
values of 2^n +- 1 which are now being subject to LL testing.

In the 1983 Cunningham edition, (2^211 - 1)/15173 was the 
first incomplete 2^n +- 1 factorization. 
In the 1988 edition it was (2^311 - 1)/5344847.  
Now, in 2001, the exponent has risen to 641.
This threshold exponent advanced 20 per year between 1983 and 1988,
due primarily to the MPQS and ECM algorithms.
Since 1988, it has risen about 25 per year, due 
primarily to the Number Field Sieve.
Unless there are major algorithmic advances, don't
expect to pass 2^2000 - 1 in our lifetimes.






_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

RE: Mersenne: Proth observations

2001-06-26 Thread Peter-Lawrence . Montgomery



Andy Hedges <[EMAIL PROTECTED]> asks:

> Anyone have any idea why for k = 659 there are very little primes? In fact
> for k up to 20 there are none (I haven't found any in this range yet!).

Let k = 659.

If n == 1 (mod 2) then k*2^n == 1 (mod 3)
If n == 2 (mod 4) then k*2^n == 1 (mod 5)
If n == 0 (mod 3) then k*2^n == 1 (mod 7)
If n == 4 (mod 12) then k*2^n == 1 (mod 13)
If n == 8 (mod 9) then k*2^n == 1 (mod 73)

Therefore, if k*2^n - 1 is prime, then n == 20 or 32 (mod 36).
Other useful congruences include

If n == 2 (mod 5) then k*2^n == 1 (mod 31)
if n == 0 (mod 23) then k*2^n == 1 (mod 47)

This doesm't explain the total lack of primes, but
shows that many potential n can be eliminated early.


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Proth observations

2001-06-22 Thread Peter-Lawrence . Montgomery




 Gordon Bower <[EMAIL PROTECTED]> observes


> After seeing a post on this list a few weeks ago I decided to branch out
> and try a few ranges from Michael Hartley's page looking for k*2^n-1
> primes. I must say there is a bit of a thrill in actually discovering a
> new prime every day I run the program instead of proving two numbers a
> month composite. :)
 

> I assumed that one value of k was pretty much the same as any other as far
> as execution time and the chance of finding primes. To my surprise this
> turned out not to be so: On the P3-500, for "most" 650 about 5 hours for 16000 k=701 it took less than 2 and just over 6 hours, respectively. The
> phenomenon is reproducible, doesn't seem to be an artifact of other
> programs or reboots or suchlike. Any number theorists care to explain what
> is special about k=701 that makes it easy to check for primality?
> 

  Fix k = 701.  We check that

If n == 1 (mod 2) then k*2^n == 1 (mod 3)
If n == 0 (mod 4) then k*2^n == 1 (mod 5)
If n == 6 (mod 8) then k*2^n == 1 (mod 17)
If n == 0 (mod 3) then k*2^n == 1 (mod 7)

Therefore k*2^n - 1 can be prime only if n == 2 or 10 (mod 24).
We can eliminate more potential values of n using

If n == 8  (mod 18) then k*2^n == 1 (mod 19)
If n == 18 (mod 20) then k*2^n == 1 (mod 41)
If n == 6  (mod 28) then k*2^n == 1 (mod 29)

Some congruences are redundant; for example

If n == 6 (mod 12) then k*2^n == 1 (mod 13)

eliminates nothing new.  k = 701 has less such redundancy than 
the typical k.




_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Re: Mersenne Digest V1 #843

2001-04-29 Thread Peter-Lawrence . Montgomery



"Brian J. Beesley" <[EMAIL PROTECTED]> wrote,
in response to my earlier message

>  The point of course is that there is a formal proof that, if a prime 
>  p is congruent to 3 modulo 4 and 2p+1 is also prime, then 2^p-1 is 
>  divisible by 2p+1 - which makes searching for a factor of M(p) by 
>  trying to factor M(M(p)) somewhat pointless!
>  > 
>  > With c = 2^727 - 1, 2^751 - 1, 2^809 - 1, 2^997 - 1, 2^1051 - 1, I
>  > looked for factors 2*k*c + 1 of 2^c - 1, but found none with k <=
>  > 2.  I invite those with a special search program for M_(M127)
>  > factors to search these further.
>  
>  None of 727, 751, 809, 997 & 1051 are "3 mod 4" S-G primes (else they 
>  wouldn't be on the "factors wanted" lists!)
  
 Then I challenge readers to show how to discover the factor 
11447 of 2^97 - 1 given the factor 1800*(2^97 - 1) + 1 of M(M97)
97 is not 3 mod 4.

 Let c be a large positive integer.   
Given k with 1 <= k <= 2, the chance that 2*k*c + 1 
is a prime divisor of 2^c - 1 is about

  (1/(2*k)) * (2/ln(2*k*c)) = 1/(k * ln(2*k*c))

Integrate this from k = 1 to k = 2.
An antiderivative is ln(ln(2*k*c)).  The integral is

  ln(ln(4*c)) - ln(ln(2*c))
= ln( ln(4*c) / ln(2*c) )
= ln(1 + ln(2) / ln(2*c) )

For c as large as 10^8, the quotient ln(2) / ln(2*c) exceeds 0.5, 
giving an estimated ln(1.5) ~= 0.4 factors with 1 <= k <= 2.
For c = 2^e - 1 with large e, we estimate 14.2/(e + 1) factors
with k in this range.  This is only 0.02 factors when e = 727,
so it should not be surprising that my search was unsuccessful.
Extending the search to k < 10^8 will approximately double
our chance of success.


  Peter Montgomery
  [EMAIL PROTECTED]



--JAA22954.988529496/hera.cwi.nl--




_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Re: Mersenne Digest V1 #843

2001-04-27 Thread Peter-Lawrence . Montgomery


On 26 Apr 2001, Brian J. Beasley wrote

> On 26 Apr 2001, at 6:34, Hans Riesel wrote:
> 
> >  Hi everybody,
> > 
> >   If 2^p-1 is known to be composite with no factor known, then so is
> > 2^(2^p-1)-1.
> 
> Much as I hate to nitpick a far better mathematician than myself, 
> this is seems to be an overstatement.
 
 (two paragraphs deleted)

> Question (deep) - if we did discover a factor of 2^(2^727-1)-1, would 
> that help us to find a factor of 2^727-1 ?

   I am skeptical too.  Show us how the factors

   131009 of M_(M11) = 2^(2^11 - 1) - 1
   724639 of M_(M11)
285212639 of M_(M23)

lead to factorizations of M11 and M23.   Why don't the factors

231733529 of M_(M17)
 62914441 of M_(M19)

lead to similar factorizations of M17 and M19?

With c = 2^727 - 1, 2^751 - 1, 2^809 - 1, 2^997 - 1, 2^1051 - 1, 
I looked for factors 2*k*c + 1 of 2^c - 1, but found none
with k <= 2.  I invite those with a special search program
for M_(M127) factors to search these further.

Peter L. Montgomery
[EMAIL PROTECTED]


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: [slightly OT] Web discussion about distributed computing

2001-04-22 Thread Peter-Lawrence . Montgomery



Spike Jones <[EMAIL PROTECTED]> comments;
--953E7B41754C35108B97B346

> Nathan I liked your comment about the largest genuine composite:
> a number known to be composite but for which none of the factors
> are known.  I suppose we could set up a computer to arbitrarily
> generate a few million 20 digit primes by factoring, then multiply
> them all together to get the largest known composite number for
> which none of the factors would be "known", eh?  spike

Knowing that the product is composed of 20-digit
primes, it can be broken by ECM.   
If the product has 100 million decimal digits (330 million bits), 
then we need 41 megabytes per residue.  
ECM step 1 with homogeneous coordinates needs about 6-10 such residues,
so 1 Gb is more than adequate even with temporaries for FFT code.

The first GCD will reveal several of the factors --
how many depends upon the step 1 limit.  
Perhaps one finds an 8-million-digit factor and a 92-million-digit cofactor. 
Both of these can be fed back into the algorithm
(recursively) until the original number is completely factored.

This computation can be partially parallelized.  Perhaps 30 machines
each get two factors, around 8 million digits and 92 million digits.
Using results from two machines, we get four factors, about
64, 736, 736, 8464 digits.
After incorporating results from all 30 machines, 
the largest factor (where all curves were unsuccessful) 
will be about 10^8 * (23/25)^20 ~- 8.2 million digits.

Spike's idea, with 50-digit or 100-digit primes rather than
20 digits, will give hard-to-factor numbers using today's 

Are there any large known composite numbers c
for which we know a (not necessarily prime) factor of 2^c - 1 but not of c?


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: arithmetic progression of consecutive primes

2001-04-07 Thread Peter-Lawrence . Montgomery


> From: [EMAIL PROTECTED]
> In a message dated 05/04/2001 05:19:38 GMT Daylight Time, "Gary Untermeyer" 
> <[EMAIL PROTECTED]> writes:
> 
> > Let x be a prime number.  Consider the series of  numbers that take the
> > following form:
> > 
> > x,  x + n,  x + 2n,  x + 3n,  x + 4n,  x + 5n,  x + 6n,  where n is an
> > even positive whole number.
> > 
> > In this series of seven numbers, can anyone tell me why, if ALL of these
> > numbers are prime, that the minimum value of n is 210 if all the terms
> > are _consecutive_ prime numbers?
> 
> Inverting the argument, if the sequence has N terms, then n must be a 
> multiple of all the primes <= N. So in your example (N = 7), n must be a 
> multiple of 2, 3, 5 and 7, i.e. of 210. This is still the case for N = 8 
> (your next case), 9, and 10. For N=11, n must be a multiple of 210*11=2310.
> 
> Note that this is true for _all_ arithmetic progressions of primes, not just 
> for  progressions of _consecutive_ primes.

  We need the additional hypothesis that the least prime exceeds N.
The primes 5, 53, 101, 149, 197 are in arithmetic progression, 
but the difference (48) is not divisible by 5.
Likewise for 7, 157, 307, 457, 607, 757, 907.

Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: arithmetic progression of consecutive primes

2001-04-04 Thread Peter-Lawrence . Montgomery


> From: "Gary Untermeyer" <[EMAIL PROTECTED]>

> Greetings,

> Although this is not a question regarding mersenne primes, I thought I'd
> throw this out to the readers here.
 
> Let x be a prime number.  Consider the series of  numbers that take the
> following form:
 
> x,  x + n,  x + 2n,  x + 3n,  x + 4n,  x + 5n,  x + 6n,  where n is an
> even positive whole number.
 
> In this series of seven numbers, can anyone tell me why, if ALL of these
> numbers are prime, that the minimum value of n is 210 if all the terms
> are _consecutive_ prime numbers?

The _consecutive_ hypothesis is not required, 
other than requiring x > 7 and n > 0.  If n is not divisible by 7,
then one of the seven numbers x, x + n, ..., x + 6n
will be divisible by 7, hence not prime.
[No two of the seven will be congruent modulo 7,
so all possible remainders modulo 7 must be represented.]
Likewise n must be divisible by 2, 3, 5.





_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

RE: Mersenne: LL question

2001-03-28 Thread Peter-Lawrence . Montgomery


- From: "Hoogendoorn, Sander" <[EMAIL PROTECTED]>

-> But what if the mod M comes out to 1 on one of
-> the intermediate steps?  Then 1^2 - 2 = -1
-> Then what?  spike

-Then you will be stuck in a loop, same thing happens when the outcome is
- -2, -1, 0 and 2

   This case never occurs.  If M = 2^p - 1 where p is prime and p > 2,
then M == 7 (mod 12).  By quadratic reciprocity, the 
Jacobi symbol (3 over M) is -1, so 3 is not a square modulo M.
Therefore x^2 - 2 == 1 (mod M) has no solution.



_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Primitive roots for Galois transforms?

2001-03-13 Thread Peter-Lawrence . Montgomery



> From [EMAIL PROTECTED] Tue Mar 13 16:06:33 2001

> I wonder if there is possible to find primitive roots in GF(q^2), say a
> + ib,  with the condition 
> 
> a^2 + b^2 = 1 mod q
> 
> If this were possible, then  
> 
> (a + ib)*(a - ib) = 1 + 0i (mod q)
> 
> i, e. the inverse of the root would be the conjugate, and the
> transformed of a real integer signal should have the same symmetries
> than in trigonometric complex case. Could it resolve the problem for a
> non-power-of-two FFT length ?. If the answer is yes: any trick to find
> the root other than brute force?
> 
> Regards.
> 
> Guillermo.

 No.  Given alpha = a + ib in GF(q^2), where q == 3 (mod 4), observe that

   alpha^q = (a + ib)^q = a^q + (ib)^q = a^q + i^q b^q = a - ib

is the conjugate of alpha.  Therefore 

   alpha^(q+1) = alpha * alpha^q = (a + ib)*(a - ib) = a^2 + b^2.

If alpha is a primitive root in GF(q^2) (order q^2 - 1), 
then alpha^(q + 1) is a primitive root in GF(q) (order q - 1).
In particular, a^2 + b^2 <> 1.

(I deleted names other than [EMAIL PROTECTED] from the To: and Cc:
lines, so we don't receive duplicates).

  Peter


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Primitive roots for Galois transforms?

2001-03-11 Thread Peter-Lawrence . Montgomery


Jason Stratos Papadopoulos <[EMAIL PROTECTED]> writes


> Hello again. After working out how to do integer FFTs on the Alpha, I'm
> considering additional code that uses Fast Galois Transforms (FGTs),
> e.g. for complex integers in GF(p^2) for p a prime congruent to 3 mod
> 4. Bill Daly mentioned this idea before on the list (end of 1998) but
> unfortunately didn't give enough details for me to work out the rest.
> I'm considering a 64-bit version for the Alpha and maybe a 32-bit version
> using MMX and two primes on the Pentium. As usual, though, I need some
> pointers with the theory.
> 
> The first big question is how you find a primitive root for transforms of
> this type. Richard Crandall gives a closed-form expression when p is a
> Mersenne prime, but if I'm not using a Mersenne prime for p I would still
> need primitive roots.

 Assuming the prime p is fixed at compile time, you can specify 
a primitive root g (of prder p-1) in the binary.  You can try g = 3, 5, 7, ...
until you succeed.   You will need the prime factorization of p-1
when you test whether g is really a primitive root, but that is
easy for 64-bit values of p.  A symbolic algebra program 
such as Maple can assist you in the calculation.

 Later, if you want to do an FFT of length n where n divides p-1, 
your primitive root (of order n) can be g^((p-1)/n) (mod p).
 
> The next question involves eighth-roots of 1 modulo p. For FFTs in complex
> numbers, these eighth roots have a special form that need less arithmetic
> than a complex multiply. Is this also the case in GF(p^2) for general p? 
> Or do you need a special primitive root to get these savings? In
> principle, an eighth-root of this form could mean a radix-8 FGT is
> possible that uses no integer multiplies.

Over GF(p^2) the primitive root g will have order p^2 - 1
rather than p-1.  Again a small search will suffice.
You raise this to the power (p^2 - 1)/n where n divides p^2 - 1.

If p == 7 (mod 8), then there exists a value sqrt(1/2) in GF(p) 
such that 2*sqrt(1/2)^2 == 1 (mod p).  
The primitive 8th roots of 1 modulo p are

 +- sqrt(1/2) +- i*sqrt(1/2),

just as in the complex case.  You can optimize

   (a + b*i) * (sqrt(1/2) + i*sqrt(1/2))

to
(a - b)*sqrt(1/2) + i*(a + b)*sqrt(1/2)

(with two multiplications modulo p), just as in the complex case.

 
> Finally, how in blazes do you find a root of two with these strange
> complex integer things? I managed to answer this halfway by myself for
> more conventional integer FFTs, but I have no clue where to begin here.

 Assume p == 7 (mod 8).  Then 2 is a square mod p, but -1 is not.
The two values of sqrt(2) will be negatives of each other.
Exactly one of these will itself be a square; select that
value for sqrt(2).  Then choose sqrt(sqrt(2)) to itself be a square.
Continue until you have the root you want.

 In other words, if n is a power of 2, you are looking for
a value x mod p such that

   x^((p-1)/2) == 1 (mod p)
   x^n == 2 (mod p)

The last argument shows that a common solution exists.
The exponents n and (p-1)/2 are coprime.  
If n*n' == 1 (mod (p-1)/2), then x == 2^(n') (mod p).

 When n is not a power of 2, such as if p was chosen with
p == 1 (mod 105) to allow transforms of lengths 3, 5, 7,
then you will need to check (when p is chosen at compile time)
whether 2 is an appropriate power.

 Ernst Mayer and I exchanged many mailings about 
using GF(p^2) when p = 2^61 - 1.  
I thought he had implemented it, as part of a
mixed integer/complex FFT.

   Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Understanding the Propagate Carry Step in the LL Test

2001-01-05 Thread Peter-Lawrence . Montgomery


Fix a length n.  If (a0, a1, ..., a_(n-1)) and
(b0, b1, ..., b_(n-1)) are sequences of length n
(in some ring or field), then their circular convolution
is (c0, c1, ..., c_(n-1)) where

  c_i = sum_{i == j + k mod n}  a_j * b_k

For example, when n = 5, the circular convolution of
(a0, a1, a2, a3, a4) and (b0, b1, b2, b3, b4) is
(c0, c1, c2, c3, c4) where

  c0 = a0*b0 + a1*b4 + a2*b3 + a3*b2 + a4*b1
  c1 = a0*b1 + a1*b0 + a2*b4 + a3*b3 + a4*b2
  c2 = a0*b2 + a1*b1 + a2*b0 + a3*b4 + a4*b3
  c3 = a0*b3 + a1*b2 + a2*b1 + a3*b0 + a4*b4
  c4 = a0*b4 + a1*b3 + a2*b2 + a3*b1 + a4*b0

The circular convolution is useful for polynomial
multiplication.  Here, if
  
  A(X) = a0 + a1*X + a2*X^2 + a3*X^3 + a4*X^4
  B(X) = b0 + b1*X + b2*X^2 + b3*X^3 + b4*X^4
  C(X) = c0 + c1*X + c2*X^2 + c3*X^3 + c4*X^4

then 

  A(X)*B(X) = C(X) + (X^5 -   1)(a1*b4 + a2*b3 + a3*b2 + a4*b1)
   + (X^6 -   X)(a2*b4 + a3*b3 + a4*b2)
   + (X^7 - X^2)(a3*b4 + a4*b3)
   + (X^8 - X^3)(a4*b4)
== C(X)   (mod X^5 - 1)

In general C(X) == A(X)*B(X) mod (X^n - 1).
By choosing n > deg(A) + deg(B) (i.e., padding the input 
polynomials with leading zero coefficients) 
one can get their product exactly.

At first glance the computation of the ci's seems
to require n^2 multiplications and n(n-1) additions.
Techniques which I won't describe reduce this cost to O(n*log n) for large n.  

If all ai and bi are integers, then so are the ci.
An upper on the ci is easily derived from upper bounds
on the ai and bi.  Convolution algorithms which use 
floating point (i.e., approximate) arithmetic 
can estimate the largest potential error in the ci.
If this potential error is under 0.5, then one can
round the approximate ci to the nearest integers,
ensured one will get the correct results.

- Large integer multiplication

When p is prime, the Lucas-Lehmer test for Mp = 2^p - 1
needs many squarings modulo Mp.  How do we use the
circular convolution to assist us?

The task is to multiply two integers a, b in [0, Mp)
(or the closed interval [0, Mp] if we allow two 
representations for 0 mod Mp), getting another such result.  
We know the binary representations of a and b.

The textbook strategy first computes the double-length product a*b.
later reducing this modulo Mp.  Choose a radix R (a power of 2)
and a length n such that R^n > Mp.  Write the inputs in radix R:

   a = a0 + a1*R + ... + a_(n-1)*R^(n-1)
   b = b0 + b1*R + ... + b_(n-1)*R^(n-1)

where all ai and bi are in [0, R-1). 
[Or use signed digits in [-R/2, R/2].
Set ai = 0 and bi = 0 for n <= i < 2n.
Use a length-(2n) convolution to multiply the 
corresponding polynomials A(X) and B(X) modulo X^(2n) - 1.
This will give the exact polynomial product since
deg(A) + deg(B) <= (n-1) + (n-1) < 2n.
The coefficients of this product are in [0, n*(R-1)^2] --
the convolution algorithm must produce accurate outputs
up to about n*R^2.

Let the output coefficients be 
(c0, c1, ..., c_(2n-2), c_(2n-1)) where c_(2n-1) = 0.
To get the integer product c = a*b we set X = R
in the polynomial identity C(X) = A(X)*B(X).  

Alas, the digits of C can exceed R-1.
We circumvent this by adding an appropriate multiple
of X-R to C(X).  The loop resembles

carry := 0
for j from 0 to 2n-2 do
sum := carry + c_j
carry := truncate(sum / R)
c_j := sum - carry*R
end for
c_(2n-1) := carry

Now one can read off the product a*b, in radix R.

 Multiplication modulo Mp

Given a, b in [0, Mp-1], represented in radix R,
the last section describes how to get their product a*b in radix R.
Reduction modulo Mp is easy when R is a power of 2.

This algorithm uses a length-(2n) convolution where R^n > Mp.  
It seems wasteful to insert all those leading zeros into the 
(a0, ... a_(n-1)) and (b0, ..., b_(n-1)) sequences?  Can we avoid this?

The answer is yes if R^n - 1 = Mp, i.e., if R = 2^(p/n).
Let's look at this case in detail.  We write a and b in radix R,
with digits in [0, n-1].  Take the polynomial product
modulo X^n - 1 (rather than modulo X^(2n) - 1).  Then let X = R.  
The full polynomial product is not obtained
(only the product modulo X^n - 1).  
When we substitute X = R, the resulting integer may be off 
by a multiple of R^n - 1 = 2^p - 1 = Mp, but that is OK.
The coefficients in the polynomial product modulo X^n - 1
will not exceed n*(R-1)^2.  

This seems to require that R be an integer, 
which is possible only if p/n be an integer.
When p is prime, this allows only the cases R = 2
and R = 2^p.  Neither is convenient to use.

The Crandall-Fagin trick allows this to work even when 
p does not divide p, using an irrational radix R = 2^(p/n).  
We illustrate with n = 5 and p = 5k + 2.
Given an integer a in [0, 2^p - 1], expand it as

   a = a0' + a1'*2^(k+1) + a2'*2^(2k+

Re: Mersenne: 2000 - first calender year without a Mersenne

2000-12-31 Thread Peter-Lawrence . Montgomery



Nathan Russell observes:
> Well, unless there is an announcement within the next few hours, 2000
> will be the first calender year in the history of GIMPS without a
> Mersenne prime.  
> 
> Is the number of searchers, and the power of their hardware, increasing
> enough to keep up with the increased runtimes and (slightly) decreased
> chance of a given exponent being prime?  Or can we expect only one prime
> every several years, as was the case before the start of the GIMPS
> effort?  
 
  The chance that a given Mp is prime is O(1/log(Mp)) = O(1/p).
The LL test for Mp needs O(p) iterations.  
The time per iteration (using an FFT) is about O(p*log(p)).  
So the estimated time per LL test is O(p^2 * log(p)) 
whereas the chance of success on one LL test is O(1/p).

  We need a few adjustments to the formula since (for example)
the time to trial divide by primes < N is approximately 
O(log(p) /p) for fixed N.  [We test O(N/p) divisors below N, 
with a cost of O(log(p)) per divisor.]  This trial division
time decreases with p, whereas LL time increases, which
affects our strategy.

  Nonetheless, to test all O(2^b / b) primes with b bits,
we estimate O(2^b / b) * O(2^(2b) * b) = O(8^b) operations
with O(1) expected successes.  Doing all 24-bit p's 
takes about eight times as long as doing all 23-bit p's.
Even with more contributors and faster hardware, 
we soon get diminishing returns.

   Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: n_th roots of 2 in a finite field?

2000-12-17 Thread Peter-Lawrence . Montgomery



 If p is a prime which for which 2^22-th roots of unity exist,
then p == 1 (mod 2^22).  All such primes have 2 as a square, but 2 will 
be a 2^22 power only one time in 2^21.  
Of the 2^61/ln(2^61) primes between 2^61 and 2^62, 
only one in 2^42 will meet both requirements.  
You should find about 2^19/(61*ln(2)) ~= 52/ 42 ~= 12000 cases.

To test whether 2 is a 2^22-the power, check whether
2^((p-1)/2^22) == 1 (mod p).  

Once a p is found, you can take 22 successive square roots,
starting with sqrt(2).  One algorithm for sqrt(y0) mod p where y0 <> 0
picks a random nonzero x0 and tries to solve the simultaneous equations

 x^2 == y0   (mod p)
 (x + x0)^((p-1)/2) == 1 (mod p)

The hope is that one of sqrt(y0) +  x0 and -sqrt(y0) + x0
will be a quadratic residue, the other a non-residue.
This will happen for about half of the possible x0
values (precisely those for which x0^2 - y0 is a non-residue).
The left side of the degree-((p-1)/2) equation is 
computed modulo the quadratic x^2 - y0 (and modulo p),
until one gets a linear equation  ax == b (mod p).
If a is nonzero, one solves for x.  
If a is zero, try another x0.

 Getting a (2^22)-th root of 1 is easier. 
Try x0^((p-1)/2^22) mod p, where x0 is a quadratic non-residue.

   Peter Montgomery


> Jason Stratos Papadopoulos <[EMAIL PROTECTED]> asked


Hello. I'm putting the finishing touches on a large-integer convolution
library that's optimized for the Alpha ev6, and I want to build support
for Mersenne-mod convolution right into the library. However, the
library is integer-only and works with integers modulo a 62-bit prime
(eventually several 62-bit primes, once I code up Chinese remainder
convolution). This means that in order to perform DWT arithmetic I have to
find n_th roots of 2 in a finite field modulo a given prime, where n is a
large power of two.

I have no clue how to do this. Nick Craig-Wood's page gives some hints for 
the special case of 2^64-2^32+1, but not enough for me to apply the theory
to other primes. It looks like you need to find primes p so that
(p-1)/(order of 2 mod p) has a large power of 2 as a factor. 62-bit primes 
where this is the case seem quite rare; I've found only 14 primes of this
size that allow a Mersenne DWT of size 2^22 or more (out of a billion
candidates). 

But now I need some way to figure out the actual root of two. Can anybody
point me in the right direction?

jasonp

PS: This code promises to be the fastest integer-only LL test available. 
It's already producing runtimes in the ballpark of floating point code,
and will improve more when I add CRT code and optimize more heavily.

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Factoring

2000-11-30 Thread Peter-Lawrence . Montgomery



Chris Nash proposed:

> >We could let prime95 decide the next election .
> >Give everybody a different prime number. Multiply the primes for 
> candidate A together, likewise for B.
 
> And like in this election, unless we can completely factor the results, 
> we wouldn't know who won, either.
 
No need to factor during primery elections.




_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: P4

2000-11-26 Thread Peter-Lawrence . Montgomery



> Hi all,
> 
> The mailing list has been quiet.  I hope everyone enjoyed
> a happy Thanksgiving (or at least a good weekend for non-U.S. readers).

The focus has been on double-checking, by a different mechanism
than that which made the original count, alas in a non-Mersenne context.

>   I've received 2 queries about the recently released Pentium 4
> and prime95.  I have no timings at this point, but I figured some folks
> would like to know how the architecture helps or hurts our cause.  I've
> downloaded the manuals and have the following observations:

 (stuff deleted)

> 2)  The P4 introduces SSE2 instructions.  Intel hopes new programs
> stop using the old FPU instructions and start using these new instructions.
> The SSE2 instructions work on 2 floating point values at the same time!
> An ADD takes 4 clocks, but can only issue every other clock cycle.  A
> MUL takes 6 clocks and also can be issued every other clock cycle.
> 
> The theoretical maximum throughput for SSE2 is one ADD *AND* one
> MUL every clock cycle.  The average latency is 2 for a ADD and 3 for
> a MUL.
> 
> Summary:  If a program can be effectively recoded to use SSE2,
> then it can have greater throughput than even the Athlon.  Of course,
> months ago I had hoped that the P4 would be able to get a throughput
> of 2 ADDs and 2 MULs per clock cycle.  Maybe in a few years, a
> future P4 or AMD chip will do this.

 I understand that the SSE2 instructions operate only on
64-bit (and 32-bit) floating point data, whereas the 
FPU registers support 80-bit intermediate results.
How will the loss of precision affect the FFT length?

Vector processors such as the Cray typically support both

  vector op vector -> vector
  vector op scalar -> vector

opcodes, so one can (for example) form all b[i]^2 - 4.0*a[i]*c[i]
when solving several quadratic equations.
[We need two vector*vector multiplications, 
one vector*scalar multiplication, 
one vector-vector subtraction.]
I find it strange that the MMX and XMM and SSE2 instruction sets
lack vector*scalar operations and also lack a way to make multiple
copies of the constant 4.0, other than to store multiple copies
in memory.  While data replication in memory 
(or adding a[i]*c[i] to itself twice) may be acceptable here, 
we don't want multiple copies of the table of roots of unity,
for example.

Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.exu.ilstu.edu/mersenne/faq-mers.txt

Re: Mersenne: Likelihood Of Small Factors

2000-08-04 Thread Peter-Lawrence . Montgomery



 If you are trying to factor Mp = 2^p - 1 where p is prime and
p > 2^58, then Mp can have no 59-bit factors
[All factors are == 1 (mod 2*p).]

 But when factoring a random integer n, we estimate the probability 
of a factor between 2^51 and 2^59 as


 1 - product (1 - 1/q)
2^51 < q < 2^59
q prime


That is, we model this with a separate event for each 
potential prime divisor.  You can approximate this
probability by first approximating the logarithm of the product
by an integral.  The probability does not depend on n.

If we want to restrict the product to primes q == 1 (mod p),
as when factoring Mp, then (1 - 1/q) becomes (1 - p/(q-1)),
the probability that 2 is a ((q-1)/p)-th residue mod q
This is approximately (1 - 1/q)^p.  
There are only 1/(p-1) times as many potential choices for q,
but each such q has approximately p times as much impact.
To a first approximation, the success probability remains 
stable (i.e., reaches a limit other than 0 or 1)
as the exponent increases.

 Peter Montgomery





Question:

>From a number theoretic point of view, does the
likelihood of small (52 through 59 bits) factors increase
with exponent size?

Regards,
Stefanovic


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.exu.ilstu.edu/mersenne/faq-mers.txt

Re: Mersenne: Another ECM question

2000-04-09 Thread Peter-Lawrence . Montgomery


> Nathan Russell asks

> To start with, thanks to all those who answered my first question.
> 
> What I want to know is as follows: If no factors are found for an exponent 
> after running the needed curves for a 55 digit factor, what is done?  Are 
> additional curves run, or is that exponent put aside to await improvements 
> in software?

The exponent would probably be put aside unless another
algorithm is known for completing the factorization.  

Nathan's earlier question was about M727, the first Mp 
with no known prime factors.  I anticipate somebody 
(perhaps NFSNET) will finish this by ECM or SNFS within two years.

Peter Montgomery




_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Another Conjecture?

2000-03-22 Thread Peter-Lawrence . Montgomery


"Nathan Russell" <[EMAIL PROTECTED]>
Nathan Russell observes:

> This is something I've been toying with the last few days.
> 
> I can prove that all even positive integers except 2, 4 and 8 can be written 
> as the sum of Mersenne primes.
> 
> All above 21 are either 0, 1 or 2 mod 3, and are therefore the sum of either 
> 1, 2 or 3 sevens with a sufficient number of 3's thrown on top.
> I am sure this can (and should) be stated far more formerly, but my real 
> question is this: Is it possible to strengthen this conjecture, say by 
> putting a ceiling on the number of times that any one prime need be 
> repeated?
> 
No.  Let e1 < e2 < ... be the exponents of the Mersenne primes. 
If this sequence is finite, say ending at ek, then only finitely 
many numbers are representable if we bound the number of summands.

 Next suppose there are infinitely many Mersenne primes.  
We can bound the partial sum

 M(e1) + M(e2) + ... + M(ek) 
   < 2^e1 + 2^e2 + ... + 2^ek
   < 1 + 2 + 4 + ... + 2^(ek) = 2^(ek+1) - 1.

To represent M(e(k+1)) as a sum of these, some summand must be repeated
at least 2^(e(k+1) - ek - 1) times.  Both e(k+1) and ek are primes.
It is an easy exercise to show that there can be arbitrarily
large gaps between adjacent primes and hence 
between adjacent Mersenne exponents.

Another proof starts with a bound m on the number of times each
summand is used.  Using M(e1) through M(ek) we can form at most
(m + 1)^k different sums.  Show that 

   M(e(k+1)) > (m+1)^k

for large k, whatever the choice of m.



_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Mersenne: Summer research opportunities for undergraduates

2000-02-12 Thread Peter-Lawrence . Montgomery


The deadlines are approaching for many summer research programs 
for undergraduates.  See

   http://www.nsf.gov/mps/dms/reulist.htm

for a partial list of programs in the USA.  
For a hardcopy, see p. 20 of the January/February 2000 SIAM News.


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Fibonacci series..

1999-12-20 Thread Peter-Lawrence . Montgomery


"Ian L McLoughlin" <[EMAIL PROTECTED]> asks

> Since the list is quiet...
> Does a Fibonnacci series contain a finite or an infinite number of primes?
> From what I understand..
> In a gen.F sequence if the first two numbers are divisible by a prime all
> its numbers are divisible by the same prime, if the first two numbers are
> co-prime is there a generalised sequence that contains NO PRIMES

  This is part of problem A3 in Richard K. Guy's
`Unsolved Problems in Number Theory', Second Edition,
Springer-Verlag, 1994.  This book could make a good
Christmas gift to your number theory friends.
Richard Guy gives solutions starting with

1786 772701  928802  632268  715130  455793
1059 683225  053915  111058  165141  686995   (Graham)

and

 49463  435743  205665
 62638  280004  239857  (Knuth)


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Re: Mersenne Digest V1 #663; Norwegian Abel contest problem

1999-11-24 Thread Peter-Lawrence . Montgomery


[EMAIL PROTECTED] gives a UBASIC solution to Steinar's problem:
<<  
 > First, perhaps I should explain some notation :-) a_11 is the letter `a',
 > followed by 11 in subscript. x^2 is the letter `x', followed by the letter
 > 2 in superscript (ie. `x^2' would be mathematically the same as `x*x').
 > OK, here goes:
 > 
 > If (3x^2 - x - 2)^6 = (a_12)x^12 + (a_11)x^11 + ... + (a_1)x + a_0, what
 > is a_0 + a_2 + a_4 + ... + a_12?
 > 
 > The answer is an integer from 0 to 999, inclusive.
 > 
  >>
  A Maple solution needs only one input line (plus the command line):

|\^/| Maple V Release 5 (WMI Campus Wide License)
._|\|   |/|_. Copyright (c) 1981-1997 by Waterloo Maple Inc. All rights
 \  MAPLE  /  reserved. Maple and Maple V are registered trademarks of
 < >  Waterloo Maple Inc.
  |   Type ? for help.
> add(coeff((3*x^2 - x - 2)^6, x, 2*j), j = 0..6); quit;
   32


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Mersennes are square free?

1999-11-24 Thread Peter-Lawrence . Montgomery


Daniel Grace <[EMAIL PROTECTED]> asks

>  I looked at Chris Caldwells page on Wieferich (1909)
> primes but I could not see exaclty how p^2|2^(p-1)-1 relates to
> Mersennes with square factors?  I can see that Mp=3(2^(p-1)-1).
> So my question is this "How does one derive Wieferich's result,
> from the statement: let p be a prime and n be an integer such
> that p^2|2^n-1?"
 
> I assume that n must be a prime otherwise:
> Is it always true that if q|2^p-1 where p & q are primes
> then q^2|2^(pq)-1? eg. 23^2|2^(23.11)-1.
 
 Write 2^p = 1 + k*q, since you assume q | (2^p - 1).
You want to show that q^2 divides (1 + k*q)^q - 1.
Use the binomial theorem.  Or show by induction on m that

(1 + k*q)^m == 1 + k*q*m (mod q^2)

for all integers m >= 0.  Then set m = q.


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Modular arithemtic

1999-11-13 Thread Peter-Lawrence . Montgomery


Wojciech Florek <[EMAIL PROTECTED]> notes:

> Hi all!
> 
> S.A. Fulling from Texas A&M Univ. posted two articles at
> the Los Alamos Nat. Lab. archive. The first is on
> quantum computers and refers to the second one `abstracted' as
> 
> "This is a pedagogical article cited in the foregoing research note"
>  
> 
> I'd be obliged if some of our number-theory/programming wizards
> will look at these articles (esp. the second).
> 
> [To look visit  http://xxx.lanl.gov/abs/quant-ph/9911050
> and http://xxx.lanl.gov/abs/quant-ph/9911051 ]
> 
> 
> Is there something new?
> Is there something interesting for the GIMPS and other math projects?
   
I looked only at the second article.  It begins with the recurrence

  a[n+1] = a[n]^2 + (n + 3)*n*a[n]  a[0] = 1

We can grind out a[1] = 1, a[2] = 5, a[3] = 75, 
a[4] = 6975, a[5] = 48845925, ...

 Author S.A. Fulling suggests computing the sequence by modular
arithmetic (using primes below 2^16) to avoid big-integer computations.
Then use the Chinese remainder theorem to get the final result.

 This technique is well-known, but it requires a good _upper bound_
on the final answer.  Here we see a[2] = 5 > 2^2, and it is easy to prove

a[n] > 2^(2^(n-1))(n >= 2)

The product of the primes below 65536 is obviously below
65536^65536 = 2^(2^20).  So

   a[21] > 2^(2^20) > (product of primes below 65536)

We can't determine a[21] uniquely from its remainder modulo these primes.

An example where the CRT can be useful is determinant evaluation.
Consider an n x n matrix of integers, all bounded by B in absolute value.
The determinant of this matrix is a sum of n! products, each at most B^n,
so the determinant is bounded by n! * B^n.  The Hadamard inequality gives
a better bound:  n^(n/2) * B^n.  A 100 x 100 matrix with integer 
entries up to 10^6 will have determinant bounded by 

100^50 * (10^6)^100 = 10^700.

We can evaluate this determinant modulo 150 primes near 6,
after which we know it uniquely.   The modular evaluations can
use Gaussian elimination.  A direct approach using Gaussian elimination
would involve many rational numbers with huge numerator and denominator, 
and soon become unwieldly.  Direct evalation by minors avoids the fractions 
but needs 100! arithmetic operations.

 Peter Montgomery





_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: LL and Pollard-rho in one?

1999-10-28 Thread Peter-Lawrence . Montgomery


Alexander Kruppa <[EMAIL PROTECTED]> observes:

> Hi,
> 
> the Lucas-Lehmer iteration
> 
> L_0 = 4
> L_{n+1} = L_n ^2 -2
> 
> looks suspiciously like an iteration used in Pollard-Rho:
> 
> x_0 = a
> x_{n+1} = x_n ^2 +b
> 
> Can this be used to do a LL test and Pollard-rho factoring attempt at
> once?
> I remember faintly having read something that b=-2 is not a good choice
> for Pollard-rho, but I don't remember any explanation why.. would it
> work here?


 If L_0 = alpha + 1/alpha (where alpha = 2 + sqrt(3) is found
by solving a quadratic) then

L_n = alpha^(2^n) + alpha^(-2^n)

L_m - L_n = alpha^(2^m) + alpha^(-2^m) - alpha^(2^n) - alpha^(-2^n)
  = (alpha^(2^m) - alpha^(2^n)) + (1 - alpha^(-2^m  - 2^n))

If p is a factor of the number we are trying to factor, then 
the order of alpha will divide one of p+1 and p-1
(depending upon whether alpha is in the base field GF(p) 
or the extension field GF(p^2)).
The size of this order will be O(p).  
m and n will usually need to have size O(p) before we achieve

alpha^(2^m) == alpha^(+- 2^n)   (mod p) 

(which is the same as   2^m == +- 2^n   (mod (order of alpha)) ).

When b <> 0, -2, we know no such formula for the iterates of Pollard
Rho.   x -> x^2 + b (mod p) seems to behave like a pseudorandom function 
modulo p, and Pollard Rho takes O(sqrt(p)) iterations rather than O(p).

For Lucas-Lehmer, when p = 2^q - 1 is a Mersenne prime, p+1 is
divisible by a very high :-) power of 2.  The L_n values collapse to 
0, -2, 2, 2, ... as alpha^(2^n) becomes sqrt(-1), -1, 1 in GF(p^2).




_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Smallest unfactored composite (was types of work to request...)

1999-10-16 Thread Peter-Lawrence . Montgomery

On Fri, Oct 15, 1999 at 01:03:57AM -0800, Gordon Bower wrote:
> Here's another argument - suppose the largest unfactored composite was C. How
> long did it take to determine the factorization (or primality) of C-2? (C-1
> would be even.) Then to factor C would only take a marginally longer amount of
> time than it took for C-2. There is no reason you could not complete the
> factorization of C.

Alas, if Gordon's argument is valid, then every positive integer would be
factored.  This is the principle of mathematical induction.

A computer may be in a loop.  It has factored C-2 and C-1, and is now
working on C.  The task of factoring C may indeed take only marginally longer 
than it took for C-2, but the extra time is nonetheless positive.
The next number may be factored as you read this paragraph,
so a journal article saying "Every number below this C has been factored, 
but C itself has not been" would become outdated almost 
immediately, even if true when submitted. 

We can use the same argument on how much water our bodies will hold, 
or how much pollution to allow.  At any given time, adding a single
molecule may seem safe.  But the capacities are finite.

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: targeting multiply/add (Was: K7 vs. x86)

1999-08-24 Thread Peter-Lawrence . Montgomery


Ernst Mayer writes:

- Hi, Jason:
- 
- >Or, if you just want to cheapen a complex multiply, turn the standard
- >form 
- >
- >   xr,xi * yr,ri ->   xr*yr-xi*yi, xr*yi+xi*yr
- >
- >into
- >
- >   xr,xi * yr,ri ->   xr*(yr - xi/xr * yi), xr*(yi + xi/xr * yr)
- >
- >and precompute xr and xi/xr. Presto! 2 multiplies and 2 multiply-adds.
- 
- An excellent suggestion - why rely on the compiler perhaps being smarter
- than it really is, when you drop a broad hint with very little modification
- of the normal code?
- 
- I tried the above in just the forward radix-16 FFT loop of my code, replacing
- sines with tangents in the precomputation of the FFT sincos data. No timing
- change on the MIPS R10K, indicating that the MIPSPro f90 compiler was likely
- already doing such a replacement for me. But on the Alpha 21064 (which has
- no MADD instruction) my times for large FFT lengths dropped about 5%!
- (I expect another 5% when I modify the inverse FFT similarly).
- Weird, but welcome. Any ideas how the above replacement might improve
- pipelineability of a twiddle-multiply/add/subtract sequence, assuming just
- FMUL and FADD are available?


Compared to the straightforward coding, Jason's formula
uses the same instruction count but in a different order.

Assume yr, yi, and xr, xi (or xr, xi/xr) are in registers, 
and we want the result to go into registers (zr, zi).
On a machine with multiply and multiply-(add/subtract/reverse subtract)
available, the translations might be

Standard  Jason
temp1 = xi*yi temp1 = yr - (xi/xr)*yi
temp2 = xi*yr temp2 = yi + (xi/xr)*yr
zr = xr*xr - temp1zr = xr*temp1
zi = xr*yi + temp2zi = xr*temp2


Jason's method needs fewer temporaries on a machine with separate
add and multiply instructions.  The same register 
might be used for (xi/xr)*yi, temp1, and zr.  
The standard method needs separate temporaries for xr*yr and xi*yi.

Jason's method lets the add instructions start earlier, during the
twiddle computation -- the standard method saturates the multipliers.
Jason's method may let the multiplies needed for zr and zi be combined 
with additions later in the code, on architectures with multiply-add.

When (xr, xi) denotes   cos(theta) + i*sin(theta), 
Jason's method needs tables with cos(theta) and tan(theta).
The standard method lets one use sin(theta) = cos(pi/2 - theta)
to reduce table space for trigonometric functions.

Jason's method needs special casing if xr might be zero, in which
case (xr, xi) is pure imaginary.

I am not sure which method has more potential round-off error.

Peter Montgomery
[EMAIL PROTECTED]


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Simple question about P-1 factoring method

1999-08-17 Thread Peter-Lawrence . Montgomery

Will Edgingtom writes:

> If I understand P-1 factoring correctly, then using it to a stage one
> bound of k to try to factor M(p) will find all possible factors less
> than or equal to 2*k*p + 1.  I'm assuming that p is less than k (or p
> is always used in the powering) and the convention several of us
> agreed on a while back that all prime powers less than the stage one
> bound are used in the powering, not just the primes themselves.  That
> is, trying to factor M(97), say, to a stage one bound of 10 would use
> 8, 9, 5, 7, and 97, not just 2, 3, 5, and 7.

Yes, p itself should be used in the powering.  
So should all prime powers below (or up to and including) k.

> Am I correct?  Or could a factor smaller than 2*k*p + 1 be missed in
> some cases?

In the last example a factor 16*97 + 1 could be missed.
Otherwise all factors below 2*k*p + 1 should be found.  
One extra squaring will achieve the 2*k*p + 1 bound.

   The power of the P-1 algorithm is that it can potentially find 
many larger factors, such as 252*p +1 using a stage 1 bound of 10.  
Observe 252 = 4 * 9 * 7 is a product of prime powers each <= 10.
If you want to check only for prime factors below 2*k*p + 1,
P-1 is not the way.  [P-1 coupled with ECM might be the way if k is
large, although some uncertainty remains even after repeated ECM runs.]

> Does it matter whether p is prime or not?  I don't think so, but ...

Not if you always include an exponentiation by p, and repeat it
if necessary as you do primes <= k.

Peter Montgomery

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Mersenne: UltraSPARC and Mfactor (was: new accounts)

1999-08-09 Thread Peter-Lawrence . Montgomery


From: Lucas Wiman  <[EMAIL PROTECTED]> writes (in response to someone else)

> > The last time I did timings like this - admittedly, probably over a
> > year ago, but the mers package hasn't changed much since, especially
> > in terms of performance - this is wrong.  SPARC LL testing -
> > especially with MacLucasUNIX - is much closer to matching Prime95 LL
> > testing than SPARC trial factoring - with mersfacgmp, say - is to
> > matching Prime95's trial factoring, by, as I recall, a factor of about
> > 12.
> 
> Though I don't have specific timings, I imagine this would be the case.
> I was referring to the Mfactor program by Peter Montgomery.  I have heard
> this performs considerably better than a GMP based program (written by
> Alex Kruppa) on RISC architectures.  I suppose that I could/should check
> up on this with my SPARC-owning friend.
> 
> >   The UltraSPARC would probably significantly outperform a similar
> >   megahertz PC, if we had similarly optimized code running on each.
> > Perhaps.
> 
> Again, I have no timings for this, but I would think that if you tried
> MacLucasUNIX on both a SPARC, and a PC, the SPARC would be the overall
> winner because of the massive amount of I/O that runs on LL tests.  In
> factoring, I would imagine that the difference would be smaller, (using
> the same program).
 
UltraSPARC is a 64-bit architecture.  The ARITHMETIC_INT64 option of
Mfactor.c is intended for 64-bit architectures, but requires both halves
of a 64 x 64 -> 128-bit integer product, preferably unsigned.
Alpha and MIPS (and soon-to-be Merced) provide both halves, but
UltraSPARC provides only the lower half, I believe.

You can use the ARITHMETIC_INT32 option on SPARCs, but this restricts
your search to primes below 2^60 rather than 2^63.

Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

RE: Mersenne: Multiple residues - enhancing double-checking

1999-08-07 Thread Peter-Lawrence . Montgomery

Paul Leyland (whom I'll meet for the second time later this month) writes:

> > From: Brian J. Beesley [mailto:[EMAIL PROTECTED]]
> 
> > A couple of points here:
> > (1) Can anyone honestly commit to a project for a whole decade?

> Yes, they can and do.   In the field of computational number theory, several
> participants in the Cunningham project have been active much longer than my
> decade.  Sam Wagstaff has been co-ordinating it longer than that.  To be
> honest, I don't know when he started.  Perhaps Peter Mon tgomery (another
> long-time participant) can inform us, as I know he's also on the Mersenne
> list.

> In other fields, it's not unusual to commit to a decade or more.  Consider
> the deep space missions such as Voyager, or the astronomers who monitor a
> particular class of objects for most of their working lives.  I know some
> amateurs who have been observing particular variable stars since the 1960's;
> George Alcock has been searching for comets and novae since the late
> fifties.

Ten years for one computation seems a long time.  Yes, I have been 
contributing to the Cunningham project for a long time, using machines
at multiple institutions, but we can hardly commit for ten years.  
Jobs change today, at least in the USA.
A home machine may be stolen or damaged by fire or floods.  
My family wouldn't know what to do if I died.

For the very long LL computations, it is reasonable for two sites
A and B to start the computation, and do a checkpoint perhaps 
every 1000 iterations.  When A reaches its 7000-th iteration 
(or other multiple of 1000), it sends the checkpoint file to a central site, 
and can continue for another 1000 iterations if B has submitted a
consistent checkpoint for the 6000-th iteration, with different 
programs used for iterations 5001-6000.
Otherwise A is blocked (i.e., told to do something else until B catches up).
If A and B differ at the 6000-th iteration checkpoint, 
or if one times out (say six months with no activity),
a third site can restart from the (agreed) 5000-th iteration checkpoint 

It is important that the different programs (e.g., Prime95,
MacLucas, LucasMayer) agree on a common checkpoint format.  Not only must the
central server be able to compare them, but a contributor might migrate from
a Pentium in 1996 to an Alpha today to a Merced in 2002.  

Of course this data at the central site will be voluminous,
if each exponent around 100 M averages 2.5 13 Mb checkpoint files
(for iterations 5000 and 6000 and possibly 7000 in the example).  
If there are 10 LL tests in progress, then 
the central site will need 3250 Gb.  A tetrabyte is huge today but
may be common in 15 years.  In the 1970's much data was on
seven-track tapes.  I recall each held 1.9 M 60-bit words (14 Mb)
Today the NFS data for M619 is being processed on a filesystem with 16 Gb,
a 1100-fold increase over 25 years.

Optimizations can reduce the size stored.  For example, store the full
5000-th iteration checkpoint, but only hash(6000-th iteration) and
hash(7000-th iteration) where the hash function might return only 1024
bits.  When the second site sends a 6000-th iteration checkpoint, 
verify that it gives the same hash value.  If yes, then we assume
B's full file agrees with A's, and store the 6000-th iteration checkpoint
while deleting the 5000-th.  If no, the full 5000-th remains available 
for a third site to use.

  Peter

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: Worth it?

1999-07-21 Thread Peter-Lawrence . Montgomery


FAQ author Lucas Wiman  <[EMAIL PROTECTED]> writes

> Well, this might not be a problem.  If P-1 tries more than one base, 
> all that is required (per Mn) is *one* gcd.  We just multiply the
> remainders from the bases mod Mn, and then take the gcd on the final 
> remainders.  We could have the program prompt the user for the time when 
> the computer is on but not being used when they ask for a P-1 assignment.
> Or, we could have it run so that the program stops functioning when the 
> computer is being used.  It could just save the intermediate files to
> disk, to free up memory for the user who (graciously) gives us their
> CPU cycles.  
>
You might rerun P-1 with larger limits, but don't bother rerunning
using a new base.  If q is a prime factor of Mp, and r is the largest
prime factor of q-1, then a fraction 1 - 1/r of all possible bases 
will require us to include exponent r in step 1 or step 2 of P-1.
If r > 10, this is 99.999% of all potential bases.

The base change can be useful if P-1 finds a composite GCD.
For example 2717 = 11*13*19.  If we start with base b = 7, 
using exponents 4 and 3, then

   b^4 = 2401 == -316 (mod 2717)

  (b^4)^3 == -31554496 == 742 (mod 2717)

Here GCD(b^12 - 1, 2717) = 247 is composite (13*19).
This is because 13-1 divides the exponent 4*3 and because
our choice b = 7 is a cube (also a square) mod 19,
with (19-1)/3 dividing 4*3.  Choosing a b which is not 
a cube modulo 19 lets P-1 factor 247.

Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: A slight divergence...

1999-07-19 Thread Peter-Lawrence . Montgomery


Chris Jefferson <[EMAIL PROTECTED]> writes

> It is well known that n is prime if for all prime factors of n-1, a^(n-1)
> = 1 mod n and a^((n-1)/q) is not 1 mod n.

 What are a and q?  You want to say
`if for each prime factor q of n-1, there exists a base a such that ...' .

> For example, take a=2, and 2^p+1 (p is prime, yes, that IS a plus)
> 
> Then we need to check if 2^(2^p)=1 mod (2^p -1) and 2^((2^p)/2) is not 1.

 What is n here?  Your modulus is 2^p - 1, suggesting you intend
n = 2^p - 1.  But the exponent is 2^p, suggesting you intend n = 2^p + 1.

 If we try p = 3, then 2^p - 1 = 7 is prime.
But 2^(2^p) = 2^8 = 256 is not one modulo 7.

Peter Montgomery


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

Re: Mersenne: The $100,000 award for 10,000,000 digit prime

1999-07-18 Thread Peter-Lawrence . Montgomery


> >7)  Anyone that makes a mathematical or algorithmic breakthrough that
> >speeds up the search process.  I'm talking about a doubling in search speed
> >not a 1% speedup in assembly code.
>
  > I think that this would be great -- but I seriously doubt that any
> improvement will be found.  We can't get any better with FP FFTs, since we
> don't have any zero-padding any more, and you've specifically disallowed
> implementational improvements.  A switch to integer FFTs might be better
> for huge FFT lengths, but integer arithmetic is currently very slow
> compared to FP arithmetic, so I doubt that will end up helping.
  > Really, the only hope I can see for a significant improvement is a really
> fast implementation of Schonhage-Strasen multiplication, but
> Schonhage-Strassn is almost infamously slow.

  I can dream of some possibilities.  For example, somebody
might find a way to easily detect whether Mp has an odd or an even
number of prime factors, without revealing the factors themselves.
We could reject all Mp with an even number of prime factors.
If the test runs quickly, this could almost double the search time.

  Other potential improvements are borderline.  For example,

   *)  Somebody finds how to parallelize the FFT using little
   communication.  The wall-clock time might be reduced 10-fold,
   but the CPU time increased 16-fold.  This could be
   great for verifying a new Mersenne prime, but
   does it qualify for the money?

   *)  Somebody finds a novel way to choose elliptic
   curves modulo Mp, using the information that
   its prime factors are == 1 (mod p) and that 2, -p
   are quadratic residues modulo any factor of Mp.
   This lets the trial division phase search 10 bits higher, 
   such as searching to 2^74 rather than 2^64.  
   Does its finder get any money?

   *)  Somebody finds a way to verify the output of the LL test
   without a complete rerun (cf. verification of digital signatures).
   If this eliminates the need for double-checks, does it qualify?

 Peter Montgomery



_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

RE: Mersenne: Hyperbola

1999-07-14 Thread Peter-Lawrence . Montgomery


Brian Beesley comments:

> Just out of interest, did anyone ever try Fermat's Method on some of 
> the "tougher" numbers in the Cunningham tables ... Fermat's Method 
> can be speeded up by a large factor since we _know_ the form of the 
> factors - of course, it's still going to fail to find factors in a 
> reasonable time unless a pair of factors of very similar size do, in 
> fact, exist. Also, any factors found would not neccessarily be prime, 
> though cracking the two "halves" ought to be easier than tackling the 
> whole.

D.H. Lehmer did some cofactors this way.  The largest seems to be
the 33-digit number (2^109 - 2^55 + 1)/5, otherwise known as 2,218L.
See the history pages in the book `Factorizations of b^n +- 1,
b = 2, 3, 5, 6, 7, 10, 11, 12 up to high powers'
by John Brilhart et al.

 The Continued Fraction method, introduced in the late 1960's,
generalized Fermat's method and made the difference of squares method obsolete.
The P-1 method accelerated this gap.  Today it takes under a second
to factor 2,218L by P-1.

 Montgomery factorization program.  Compiled Tue Jun  3 21:25:54 MET DST 1997.
 Allows inputs up to about 6300 decimal digits.
 C(2,218L)
 Composite cofactor has33 digits:
 129807421463370683507503004437709
 RAND_PRINT - Current random number seed is  198181203 271851921 233382925
 C(2,218L)p-1 method found divisor near p=  7603
 74323515777853
 CHEK - Nontrivial GCD p-1
 74323515777853
 The first number below is the product of the second and the third, as found
 by p-1 after  18285 multiplies and GCDs
 in   0.05 CP seconds at Wed Jul 14 16:52:44 1999
 129807421463370683507503004437709
 74323515777853
 1746518852140345553
 Probable prime cofactor has 19 digits -- terminating.

 One instance of Fermat's method remains important  -- 
factoring p^2 where p is prime.

 Peter Montgomery



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: re: Mersenne prime exponent binary representations and 1's frequency (incl M38)

1999-07-11 Thread Peter-Lawrence . Montgomery


Ken Kriesel <[EMAIL PROTECTED]> writes:

> George asks, what about the bit second from the left?
 
> In total, 26 0 bits vs only 12 1 bits. (0 26 of 38 times or ~68.4%;
> 26/12=2.1...)
> Excluding p=2 and p= 3, to look only at interior bits, 25 0 bits vs 
> only 11 1 bits; 0 25 of 36 times, ~69.4%; 25/11=2.2727...)
 
> Now why would that be, that a 0 bit is more than twice as likely as a 1 bit? 
 
> (Note that Mp#38, M6972593, has the less likely 1 bit in the second most
> significant position; searching biased for these probabilities would have
> delayed finding it.  But the 4 previous Mersenne primes all had a 0 bit
> in that position.)

  The law of leading digits predicts that, for decimal numbers,
log10(2) ~= 30.1% will have leading digit 1.  More precisely,
the fractional parts of the base-10 logarithms are assumed
to be uniformly distributed.  This distribution is invariant
under many transformations, such as converting a physical constant
from miles/hour to meters/second.

  For binary numbers, this predicts the leading bit is 1 with probability 
log2(2) = 1, a not surprising result.  The two leading bits are
10 with probability log2(3) - log2(2) = ln(3/2)/ln(2) ~= 58.5%.
This prediction is smaller than the observed 68.4%, but above 50%.

Peter Montgomery



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Why can't we...

1999-07-09 Thread Peter-Lawrence . Montgomery


Peter Foster asks:

> When doing an LL test we are always calculating the 
> same series of numbers. The modulus is different, so 
> we can't use the result of one test to help with 
> another. I'm wondering why we don't do the following:
> 
> Take 2 Mersenne numbers, m1 and m2 (m1 < m2).
> Do the usual LL series, but use as the modulus m1*m2.
> At the appropriate step, check if the remainder is 
> divisible by m1. If so, then m1 is prime.
> At the end, check if the remainder is divisible by 
> m2. If so, then m2 is prime.
> 
> This allows us to do two (or more) tests for the price 
> of one. What is the obvious reason we don't do this?
 
 In theory this is fine.  But it is likely to cost more
than making two separate tests.


 Suppose m1 = 2^p1 - 1 and m2 = 2^p2 - 1.
Residues modulo m1 have p1 bits, and those modulo m2 have p2 bits.
The time to square modulo m1 is estimated as O(p1*log(p1))
The time to square modulo m2 is estimated as O(p2*log(p2)).
The time to square modulo m1*m2 is estimated as O((p1+p2)*log(p1+p2)).

We'll assume p1 and p2 are large but have the same magnitude.
For example, each might be between 6332000 and 6333000.
Then log(p2) - log(p1) = log(p2/p1) is tiny compared to
log(p2) or log(p1), so we approximate log(p2) = log(p1) and
log(p1+p2) = log(2*p1) = log(2) + log(p1).  

The combined time to separately square modulo m1 and modulo m2
becomes O((p1+p2)*log(p1)).  A single computation modulo m1*m2 takes
time O((p1+p2)*(log(2) + log(p1))).  For p1, p2 near 6 million ~= 2^22,
the latter time is about 4.5% longer.

This estimate is slightly optimistic.  The techniques used
for reduction modulo 2^p1 - 1 and 2^p2 - 1 don't immediately generalize
to reduction modulo (2^p1 - 1)*(2^p2 - 1).

Despite this, it may be beneficial writing a program which
does several nearby exponents at once.  Consider a SIMD
(single instruction, multiple data) machine with n processors.
Each of the n processors can be assigned a different exponent.
All use the same FFT length, operating on local data unique
to the processor's exponent.  The larger exponents will need a few
more LL iterations than the smaller exponents, but this gap 
in amount of work per processor will typically be under 1%.
This scheme would require considerable local memory on each processor.

 Peter Montgomery




Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: question

1999-07-06 Thread Peter-Lawrence . Montgomery

Jud McCranie <[EMAIL PROTECTED]> writes:

> At 10:47 AM 7/6/99 +0200, Benny.VanHoudt wrote:
> >Now lets only focus on the set 2^p - 1 with p prime, i.e., the set
> >of numbers that we are checking out at GIMPS. Has anyone proven that
> >an infinite number these are NOT prime (this is VERY likely to be 
> >true)? If so, how can one prove this easily (it is probably not  
> >possible to indentify a subset that is NOT prime as above
^
> >because then we would not consider all of them for GIMPS)?
> >
> If 2p+1 is prime then it divides 2^p-1.  If it has been proven that there are
> in infinite number of prime pairs p and 2p+1 then this proves that there are an
> infinite number of 2^p-1 that is not prime when p is prime.  These are called
> Sophie Germain primes, and it has been proven that there are an infinite number
> of them, therefore there are an infinite number of composites of the form 2^p-1
> when p is prime.

 Please leave adequate white space in your right margin.
Benny's twice-indented (indentified?) text twice still reads well, but
three of Jud's indented lines wrap around on my 80-character screen.

  It is _conjectured_ that p and 2p+1 are simultaneously 
prime infinitely often, but I have seen no proof.
This is related to the twin prime conjecture, in which p and p+2
are simultaneously prime.  More generally, if f1(x) and f2(x) are 
irreducible polynomials with integer coefficients such that

   i) f1(x) and f2(x) approach +infinity as x -> +infinity
  [This excludes constant polynomials, and 3 - 7*x.];

  ii) For each prime q there exists an integer n such that 
  q does not divide the product f1(n)*f2(n)
  [This excludes f1(x) = x and f2(x) = x + 1.];

then the conjecture predicts infinitely many integers n
for which f1(n) and f2(n) are simultaneously prime.
A variation of this conjecture extends the result to more 
than two polynomials.  Even the one-polynomial result
is unproven when its degree exceeds 1: are there 
infinitely many primes of the form x^2 + 1?

 Peter Montgomery

Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Mersenne: Lehmer question

1999-07-04 Thread Peter-Lawrence . Montgomery


Problem A3 in Richard Guy's `Unsolved Problems in Number Theory'
includes this question, by D.H. Lehmer:

Let Mp = 2^p - 1 be a Mersenne prime, where p > 2.
Denote S[1] = 4 and  S[k+1] = S[k]^2 - 2 for k >= 1.
Then S[p-2] == +- 2^((p+1)/2) mod Mp.
Predict which congruence occurs.

For example, when p = 3, S[1] = 4 == 2^2 (mod 7).
When p = 5, S[3] = 194 == 2^3 (mod 31).
When p = 7, S[5] = 1416317954 == -2^4 (mod 127).

The sign is + for p = 3 and p = 5 but - for p = 7.
Do we have the pattern through M38?

Peter Montgomery



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Hmmm.

1999-07-01 Thread Peter-Lawrence . Montgomery


[EMAIL PROTECTED] writes

> Seeing as how anyone with even the most rudimentary of Internet searching 
> skills (i.e. me) can find a publicly available Internet page with a certain 
> highly important number on it, I ask why it is there. I thought that "those 
> in the know" were *absolutely not* supposed to reveal it to anyone until it 
> had been offically disclosed. Publishing an Internet page seems a little odd, 
> you see, because entropia.com/ips and www.mersenne.org still say nothing 
> specific. What do Mr. Woltman and Mr. Kurowski have to say about this?

> Even though I know M38 and the location of said page, I will not discuss it 
> with anyone until the _official_ announcement has been made. If the general 
> public gets wind of this, it probably will be Not Good (TM), so I'd ask the 
> other members of this list not to say anything until we hear from Woltman or 
> Kurowski.

 Have you double-checked the exponent you found?
Perhaps a hacker broke into the web site and is publishing 
an incorrect exponent.



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Mersenne: SF Examiner story

1999-06-30 Thread Peter-Lawrence . Montgomery


Page A-12 of the June 30, 1999 San Francisco Examiner
has a story `Math expert confirms a prime discovery'
repeating the Oregonian story abut M38.  
The report does not give the precise exponent.
The report mentions only George Woltman by name.

Peter



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: LL and factoring & quitting

1999-06-14 Thread Peter-Lawrence . Montgomery


lrwiman <[EMAIL PROTECTED]> wrote:

> 
> It has been mentioned several times recently that factoring is all integer
> work, and LL testing is nearly all floating point.
> 
> It is my understanding that on intel CPU's, these are done on separate parts of
> the CPU.  Would it increase net performance to do factoring and LL assignments 
> at the same time?
> 
If we can execute the two codes together, the functional units
will be better utilized.  But we will need the combined memory bandwidth
for both codes, and enough cache for both working sets.  
On Pentiums, a bigger concern is the number of registers -- 
most floating point intensive algorithms need several integer registers
for loop counters and address computations, so the integer code will
lack these.  The two algorithms must be designed so the
control (i.e., branch) logic is identical -- we don't
want the integer code to be calling a subroutine 
which executes the Euclidean GCD algorithm and repeatedly tests whether
the latest remainder is zero, while the floating point code 
loops over the FFT output to square each element, for example.

 When the control logic is identical for two algorithms, and
the subscripting (i.e., memory access) patterns are similar, 
it may be feasible to merge integer and floating point algorithms
for improved functional unit utilization.  For example, Ernst Mayer
is writing code which will do an integer FFT and a floating point 
FFT of the same length concurrently.




Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Back to the maths of primes for a sec....

1999-05-18 Thread Peter-Lawrence . Montgomery


"Nicolau C. Saldanha" <[EMAIL PROTECTED]> writes

> Do you have more precise information, or do you know where we can find it?
> (the help function inside maple gives only a very superficial explanation)
> Strong pseudoprime wrt which base? Probably not random, or the search
> for a counterexample becomes meaningless. And which Lucas sequence?
> 
 Assuming you have maple installed, you can see
its procedures (sans comments):

interface(verboseproc=3);
eval(isprime);
;quit;



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Repeating LL remainders

1999-05-18 Thread Peter-Lawrence . Montgomery


Chris Nash <[EMAIL PROTECTED]> wrote:

> Define L(n)=(2+sqrt(3))^n+(2-sqrt(3))^n. It turns out that what we usually
> call *the* Lucas sequence is just
>
> S(n)=L(2^n).

  Yes.  We can check S(0) = L(1) = 4 and S(n+1) = S(n)^2 - 2
from the definitions.  So S defines the familiar Lucas sequence.

> The whole point of the proof is to show that the set of elements a+b.sqrt(3)
> (a, b mod N) that form a closed group under multiplication has the maximal
> order, N^2-1, and thus N is prime. The Lucas sequence does this with a
> little jiggery-pokery, because it is sufficient to show that L(N+1) is zero,
> while L((N+1)/q) isn't for any prime factor q of N+1. For Mersenne numbers
> the only factor to worry about is 2, so the test collapses into its briefer
> form.

> So the question becomes one of solving (2+sqrt(3))^n has zero surd part, and
> in fact prove that the group does not have order N^2-1. The sequence L(n)
> "will" recur in that case. However, it is not difficult to prove that the
> "first" recurrence, ie the point where L(x)=L(1), will generally occur quite
> late in the sequence unless N has all small factors - in which case, we
> would have eliminated this N by trial factoring.

> Remember too, this is late in the "L" sequence, not in the S sequence!
> Suppose for instance it occurred between L(2^(n-3)) and L(2^(n-2)) - which,
> even assuming the "first" recurrence is equally likely anywhere, would occur
> with probability 50%. The S-sequence would not even see it.

We can illustrate working modulo M23 = 8388607 = 47 * 178481.
3 is a quadratic reside modulo 47 (e.g, 12^2 == 3 mod 47),
so 2 + sqrt(3) is in the base field of order 47.
The multiplicative order of 2 + sqrt(3) divides 47-1, and turns out to be 23.

3 is a quadratic nonresidue modulo q = 178481.  The order of 2 + sqrt(3)
divides q + 1 = 178482 = 2 * 3 * 151 * 197 and turns out to be 178482.

The least common multiple of these orders is
2 * 3 * 23 * 151 * 197.  So L(2 * 3 * 23 * 151 * 197) == L(0) mod M23.

For the L sequence, we need two powers of 2 whose difference is
divisible by 2 * 3 * 23 * 151 * 197.
The orders of 2 modulo 3, 23, 151, 197 are 2, 11, 15, 196, respectively.
The order of 2 modulo 3 * 23 * 151 * 197 is the least common multiple
of these orders, namely 2^2 * 3 * 5 * 7^2 * 11 = 32340.
To include a factor of 2, we need L(2^32341) == L(2^1) mod M23.
That is, S(32341) == S(1) mod M23.
This is far more than 23 LL steps before the sequence repeats.

EXERCISE: Repeat this analysis modulo M11 = 23 * 89.
  Find the period modulo M29 = 233 * 1103 * 2089, after getting
  the individual periods for each prime factor.



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: P587 factored?

1999-05-09 Thread Peter-Lawrence . Montgomery


Observant "Foghorn Leghorn" <[EMAIL PROTECTED]> asks:

> I noticed that P587 is no longer on Geroge's ECM factoring page, but there 
> is no factor listed. Did someone else find a factor?

 Robert Silverman sieved it by SNFS.  CWI did the linear algebra.

C(2,587+)
  2769467  13119770765051463547
* C151 = prp71 * prp80.   SNFS Silverman/CWI
  58304599029582814346174509784805811323612591021863318902938306546239131
  79700481573792089991933884278502015980406487362166583968208410611454698961420697


Conrad Curry is recruiting volunteers for M617 by SNFS.
CWI is doing 2,1186L and 2,1198L.
[2,1198L denotes 2^599 - 2^300 + 1, a divisor of 2^1198 + 1.]
These will complete 2,n+- for n <= 600 and 2,nLM for n <= 1200.
Under ten composite b^n +- 1 with b <= 12 and b^n <= 10^180
remain in the Cunningham table.



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: ECM question

1999-05-06 Thread Peter-Lawrence . Montgomery


"Foghorn Leghorn" <[EMAIL PROTECTED]> observes:

> At Paul Zimmerman's ECM page,
   > http://www.loria.fr/~zimmerma/records/ecmnet.html
> the optimal B1 value listed for finding 50-digit factors is 4300, but 
> George's ECM factoring page uses 4400 for the same purpose. Is one of 
> them wrong, or is there a reason for the difference?

If I count the zeros right, these are 43 million and 44 million.
The function being minimized, namely

probability of finding a 50-digit factor on one curve, given that such exists
-
 time per curve

is flat near its minimum.  Implementation and platform differences 
can obviously affect the denominator (time per curve).
The stage-2 strategy affects the numerator.  The two optimal B1's
are close enough to be considered the same.



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Mersenne Twin Primes

1999-04-05 Thread Peter-Lawrence . Montgomery


Foghorn Leghorn ask:

> As you can see, there are only two numbers (2^4253-3 and 2^11213-3) in 
> the table that I have not been able to factor.  Both fail the probable 
> prime test to base 2. Both of these numbers appear to be worthy 
> challenges for any factoring effort. If someone knows of any factors of 
> these numbers or an easy way to find them, then I'd be interested in 
> hearing about it. Are there any rules for these numbers analagous to the 
> rules for prime factors of Mersenne numbers?

   If a prime q divides 2^(odd exponent) - 3, then 6 is a quadratic
residue modulo q.  q must be congruent to 1, 5, 19, or 23 (mod 24).




Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: LL-testing

1999-03-22 Thread Peter-Lawrence . Montgomery


"Benny.VanHoudt" <[EMAIL PROTECTED]> asks

> I was looking at the LL-test and thought of the following:
> To succeed a LL-test the final value (mod 2^p - 1) has to be zero.
> To be able to get this there has to be a number x (between 0 and 2^p - 2)
> such that x^2 = 2. This because in the final step of the
> LL test the value is squared and substracted by two.
> 
> In some cases for example 15 = 2^4 - 1 (this cannot be prime because
> four isn't but that's not the point) such a value does not exist:
> 
> 0^2 = 0, 1^2 = 1, 2^2 = 4, 3^2 = 9, 4^2 = 1, 5^2 = 10, 6^2 = 6, 7^2 = 4,
> 8^2 = 4, 9^2 = 6, 10^2 = 10, 11^2 = 1, 12^2 = 9, 13^2 = 4 and 14^2 = 1.
> 
> Of course in this case p = 4 is not prime so we would never do a LL-test
> anyway. My question is whether anyone knows if such an x always exists if p
> is indeed a prime ? Perhaps this is only true for a subset of all primes,
> and this would allow us to focus on this subset only (if it can be
> determined more easily) ?
>  
If p = 2k + 1 is odd, then x = 2^(k+1) satisfies x^2 = 2 + 2*(2^p - 1).

As far as I know, we don't know how to predict whether the next-to-last
term will be 2^((p+1)/2) or -2^((p+1)/2) in the LL-sequence.  For example,
modulo 31, the backwards sequence might go

  0
 /  \
/ \
8   23
  / ||\
/   ||  \
  1417   5   26
  /|| \  |\\
 / ||  \ | \|\
4  27   9   22  10  21 11 20

The left upward sequence (4 -> 14 -> 8 -> 0) is of course the
proper sequence.  But with everything having two ancestors,
how do we predict that?  Modulo 127 the sequence
4 -> 14 -> 67 -> 42 -> 111 -> 0 has 111  == -16 preceding the zero.





Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: A few more iterations? (was Re: Mersenne: Moo?)

1999-03-07 Thread Peter-Lawrence . Montgomery


[EMAIL PROTECTED] suggests

> Suppose we take a number of exponents that all need checking (or double-checking) 
> - say 3, which I'll call p, q and r. (The number doesn't matter to the theory). 
> Now use the standard LL test algorithm to calculate the residues for each 
> exponent, working modulo M(p)*M(q)*M(r), taking note of the values of the residues 
> (the WHOLE value, not just the low n bits) S(p) at iteration p-2, S(q) at 
> iteration q-2 and S(r) at iteration r-2. In other words, do the "few extra 
> iterations" suggested!
> 
> If we know S(p) mod (k*M(p)) for any integer k, then obtaining S(p) mod M(p) is 
> essentially trivial - one operation, and M(p) being just a string of 1 bits, it's 
> very easy to do with a few (extended precision) shift & add operations.
> 
> The problem is that, superficially, this procedure doesn't seem to actually save 
> any time - but it shouldn't take significantly longer than running the three 
> independent tests. Practical implementation using DWT / FFT / NTT would obviously 
> require a different transform size than would be used (optimally) to do the three 
> tests independently. Also the data worked on would be different enough to ward off 
> risks of data-dependent processor bugs.
> 
> In practice, there may be some small time saving possible. It turns out that the 
> fast multiplication algorithms are most efficiently implemented when the transform 
> size is a power of 2. So, e.g., numbers around 3 million are relatively expensive 
> to check (using Prime95) because they're too large for a 128K transform size but 
> too small to need a 256K transform size. However, checking 3 at once would fit a 
> 512K transform size reasonably well. There is, of course, a converse trend, in so 
> far as larger transform sizes tend to cause more cache misses, therefore cache 
> tuning for large transform size may become a significant issue.
> 
> My suggested "trick" applies especially to other programs e.g. MacLucasUNIX which 
> have no intermediate transform sizes between powers-of-two. A couple of days ago 
> my Alpha system finished double-checking M(2361701) using a 128K transform; the 
> next exponent I had lined up for it was M(2361721), and it's decided that a 256K 
> transform is needed, thus doubling the run time 8-( (Don't forget this is 
> DIFFERENT CODE to Prime95 - in fact, the Alpha FPU works to only 53 bits precision 
> (IEEE G-floating), rather than the 64 bits precision which the Intel x86 FPU is 
> capable of)
> 
> Can anyone fault my logic? If not, then my suggested approach may possibly be more 
> effective than writing extra code into MacLucasUNIX (and other LL testers) to cope 
> with transform sizes intermediate between powers of 2. Also saves debugging a 
> whole load of extra FFT code 8-)
> 
> Regards
> Brian Beesley

Try to keep your lines below 75 characters so they still fit
on an 80-character screen after we copy and indent them.

It is valid to compute a residue modulo M(p) * M(q) * M(r)
and later reduce it modulo M(p), M(q), M(r).  But how will the DWT
(Discrete Weighted Transform) modulo this product be done?
Presently, if we want a product modulo 2^p - 1 using
a convolution of length n, we scale the input using powers
of 2^(1/n), and the precise powers depend upon p.
The individual digits of the multiple-precision residue
do not all have the same base -- the pattern depends upon p.
How to you resolve these?

  Peter



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: mersennes for our civilization

1999-03-07 Thread Peter-Lawrence . Montgomery


Spike Jones <[EMAIL PROTECTED]> asks:

> Thanks for the site Dr. Bob!  Thats a cool one.  Yesterday a coworker
> saw my six page printout of the Great Number on the wall of my office,
> asked, and I explained GIMPS and the LL algorithm, how the program
> checks to see if M(n) divides S(n).  He is a clever chap and asked why
> prime95 starts from scratch calculating S(n) instead of getting it from
> a previous check: if you calculated S(601) and your needed
> S(631) for your next check, for instance, why would you not take
> the previously calced S(601) then square and subtract 2 thirty times.
 
> I pointed at the Great Number and said: Because S(3021377) has this
> many bits.  Knock off a dozen digits, and it would take that many years
> just to send the data from one computer to another.
 
> I felt pretty smart.  Then he asked: if S(n) has so many bits, how does a
> desktop computer handle it?

If the printout has 80 characters per line and 60 lines per page, 
you'll need 190 pages for 91 decimal digits.  You must have 
big pages.  Do you use four walls, floor, and ceiling?

When checking whether M(n) divides S(n), the program successively
computes
   remainder of S(1) modulo M(n)
   remainder of S(2) modulo M(n)
 
   remainder of S(n-1) modulo M(n)
   remainder of S(n) modulo M(n)

It checks whether the last remainder is zero.

Suppose S(j) = quot(j) * M(n) + rem(j) and 
S(j+1) = quot(j+1) * M(n) + rem(j+1), where 0 <= rem(j), rem(j+1) < M(n).
We claim that it is easy to get rem(j+1) from rem(j) without knowing quot(j).
More precisely, if

   rem(j)^2 - 2 = q*M(n) + r

with 0 <= r < M(n), then we claim rem(j+1) = r.  Indeed

rem(j+1) - r = S(j+1) - quot(j+1)*M(n) - rem(j)^2 + 2 + q*M(n)
 = S(j)^2 - rem(j)^2 + (q - quot(j+1))*M(n).
 = (quot(j) * M(n) + rem(j)^2 - rem(j))^2
 + (q - quot(j+1))*M(n)

The right side is divisible by M(n) since the rem(j)^2 terms cancel
and all other terms are divisible by M(n).  This proves 
rem(j+1) - r is a multiple of M(n).  The assumptions
0 <= r, rem(j+1) < M(n) imply rem(j+1) - r = 0.

   In other words, we can compute

   remainder of S(j+1) modulo M(n)

directly from

   remainder of S(j) modulo M(n)

We do not need to know S(j) itself for this computation.

Your coworker's proposal calls for computing


  remainder of S(631) modulo M(631)
from
  remainder of S(601) modulo M(601)


If we simply do 30 more iterations, we will have
S(631) modulo M(601).  To change the divisor from
M(601) to M(631) we know no better method than starting afresh.




Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Completely Factored

1999-02-18 Thread Peter-Lawrence . Montgomery


"Sander Hoogendoorn" <[EMAIL PROTECTED]> asks

> Can sombody please tell me when a number is completely factored?

  The prime factors of 2^29 - 1 are 233, 1103, and 2089.
The equation

2^29 - 1 = 233 * 1103 * 2089

gives the _factorization_ or _complete factorization_ of 2^29 - 1.
Whereas

2^29 - 1 = 233 * 2304167  (where 2304167 = 1103 * 2089)
2^29 - 1 = 1103 * 486737  (where 486737 = 233 * 2089)
2^29 - 1 = 2089 * 256999  (where 256999 = 233 * 1103)

are _partial factorizations_ of 2^29 - 1.  A complete factorization
factors a number into prime factors.  A partial factorization
allows composite cofactors (which are smaller than the original number).

  If someone finds the factor 233 of 2^29 - 1, he/she can compute
the cofactor 2304167, and subject the cofactor to a probable prime test.  
If the cofactor is definitely composite, we have
only a partial factorization, and more work is needed.
If instead the cofactor passes the probable prime test, 
we try to prove it prime so we can be sure the factorization is complete.



Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm

Re: Mersenne: Factoring

1999-02-15 Thread Peter-Lawrence . Montgomery


"Sander Hoogendoorn" <[EMAIL PROTECTED]> asks

> Last weekend when i was testing Prime 95 i noticed that factoring low 
> numbers took much longer as high numbers.
> Factoring M17XXX from 2^52 to 2^54 took minutes per pass while factoring 
> M9XX from 2^52 to 2^54 took about one minute to complete the whole 
> factoring. How is this possible?
> 
   Only primes in [2^52, 2^54] which are congruent to
1 modulo 17XXX (or modulo 9XX) need be checked.  [There are other
restrictions, such as the prime being == +- 1 modulo 8.]
For the larger exponent, the density of candidate divisors
is 17XXX / 9XX ~= 1/50 times as large.  
For the larger exponent, the cost of testing each divisor 
(using the binary method of exponentiation) is 
log_2(9XX) / log_2(17XXX) ~= 23/14 times as long.
Overall, when the interval [2^52, 2^54] is fixed,
the checking for the larger exponent proceeds
50 * (14/23) ~= 30 times as fast.  

   On the other hand LL testing for the larger
exponent takes 50 times as many iterations, and each 
iteration takes about 50 * (23/14) ~= 80 times as long,
so this cost grows by a factor of 4000.  
Keeping the (LL costs)/(factoring costs) ratio constant,
the factoring could search an interval 
30 * 4000 ~= 2^17 times as long.  Alas, [2^69, 2^71]
needs more than 64 bits of precision to store candidate
divisors, so you probably stop factoring at 2^64 or switch to 
other algorithms.

Re: Mersenne: Questions on Crandall algorithm

1999-01-07 Thread Peter-Lawrence . Montgomery


Olivier Langlois comments (in response to others):

> First,
> let me thank you all for your helpful explanations.
> 
> > > Is there someone that could be able to explain me the reasons that are
> > > behind the 2 conditions (q/N < 32 and maximum convolution error is < 0.5
> > > for every Lucas-Lehmer iteration ) (BTW, sorry for my ignorance but what
> > > does maximum convolution error mean ?)
> >
> > The fastest way to multiply numbers is to use FFTs, and the fastest way to
> > do FFTs is with floating point math.
> 
> Is this affirmation 100% true ?
> Does someone have tried to perform FFT with MMX and fixed point variables?
> The only reason I can see that wouldn't make MMX a viable solution is that
> Crandall algorithm necessitate 'double' (or 'float', I don't remember which one
> is currently used) precision.
> 
> Any comments ?

Floating point arithmetic is faster than integer arithmetic
on many machines today, if one does comparable numbers of adds/subtracts
and multiplies, as in matrix multiplication and FFT algorithms.

The FFT of order n uses a primitive n-th root of unity.
The Crandall trick needs an n-th root of 2.  Both of these
are available if we use complex numbers with floating point math.

Computer manufacturers are paying more attention to
fast integer multiplication now that RSA and other cryptographic 
algorithms are gaining prominence.  On 64-bit machines with fast 
integer arithmetic, modular arithmetic is a good alternative choice, 
if the modulus is chosen so the requisite roots of unity and of 2 exist.  

Ernst Mayer and I are (slowly) writing a code 
which will use _both_ floating point and modular arithmetic.
If the floating point arithmetic provides 40 bits of precision
in an output coefficient (using 53-bit mantissa) and 
a 63-bit prime yields 62 bits of precision in these coefficients,
then we can accurately recover output coefficients up to 102 bits,
and won't need as long a convolution.  We expect the code to 
be very fast on the new DEC Alpha 21264, a superscalar machine
which can start one integer multiply, one floating add/subtract, 
and one floating multiply every clock cycle.

Re: Mersenne: Re: Mayer's Counter-Challenge

1998-12-04 Thread Peter-Lawrence . Montgomery


Dienst writes (in response to Ernst Mayer)

> > If one instead does a fast polynomial-evaluation stage 2, even one
> > with modest circular convolution length (to keep memory requirements
> > reasonable), the stage 2 depth for a given amount of work should rise
> > dramatically, perhaps 2-3 orders of magnitude. One needs to do an
> > extended GCD at the start of stage 2 to generate the modular inverse
> > of the stage 1 residue, but this costs no more than an additional
> > regular GCD - in fact, one can do the regular and EGCD at the end
> > of stage 1 together.
> 
> Errm, 2-3 orders of magnitude seems far too high an
> improvement. You are not going to want to do too much work in stage 2
> and with short convolutions (at 2 MB per number the convolutions will
> be mighty short) the gains will not be near that great.

 Using a two-dimensional FFT, the time to multiply two bivariate 
polynomials (degree nx-1 in X, degree ny-1 in Y) modulo X^nx - 1 and Y^ny - 1 
is O(nx*ny*log(nx*ny)) = O(nx*ny*(log(nx) + log(ny))).

   Suppose the Mersenne multiplications are length 262144.
We want to multiply two polynomials of degree 31 with
Mersenne coefficients, reducing the product modulo X^32 - 1
(32 is the `short' convolution length).
The time is about 262144*32*(18 + 5).  
The time for a single Mersenne multiplication is about 262144*18.
The time for a convolution of length 32 is about 
32*(18 + 5)/18 ~= 41 times that for a single multiplication.
The classical method for polynomial multiplication would
have a ratio of 1024, and Karatsuba a ratio of 243.

   Yes, the convolution size is limited by our memory space.
But memories are growing, and the code may not be ready for general
use until 2000.

Re: Mersenne: How does this transform work

1998-11-05 Thread Peter-Lawrence . Montgomery


> Can someone tell me the details of how the discreet weighted transform
> works, or tell me where to find information on this subject.

It is described in an article by Crandall et al in Mathematics
of Computation earlier this decade.

 We can use the FFT to multiply two polynomials of
degree at most n-1, getting their product modulo X^n - 1.
If the inputs are the digits of a multiple-precision integer
in radix R, then we get a product modulo R^n - 1
after carry propagation.
But we want a product modulo 2^p - 1.
The usual workaround is to force R^n - 1 > (2^p - 2)^2,
so the product will not exceed R^n - 2.

Instead the discrete weighted transform (DWT)
uses an irrational radix R = 2^(p/n).  The coefficients
of the polynomials being multiplied are no longer integers.
Rather the coefficient of X^i is an integer multiple of
2^(k/n) where 0 <= k < n and k == -i*p (mod n).
This trick lets us make n half as large as we would without the DWT.

  For example, if n = 4 and p = 13, then R = 2^(13/4),
R^2 = 2^(13/2),   R^3 = 2^(39/4).  The integer
1998 = 1024 + 7*128 + 4*16 + 14 could be represented as
2^(3/4)*R^3 + 7*2^(1/2)*R^2 + 4*2^(1/4)*R + 14 
with [14,  4*2^(1/4),   7*2^(1/2),   2^(3/4) ]
being the input coefficients to the FFT.

Re: Mersenne: A missunderstandment

1998-11-02 Thread Peter-Lawrence . Montgomery

Bojan Antonovic writes:

> >  (Somebody else wrote)
> >
> > Bah!  128 bit floats are quite useful.   Think about percision.
> > There are segments within the 64 bit range where you get terrible decmil
> > precision.   (such as with any large number)   This becomes a problem with
> > scaling large numbers to small, working math, and then rescaling them to
> > large numbers.   I have seen this occur with many programs.   128 bit
> > floats would be very nice for large numbers with decmils.
>
> True and not true. To avoid to have very high precision there are reules how to
> compute results of functions. For example if you want to add a serie of numers
> first begin with the smallest and then raise up to the highest.

 Suppose we regard the original inputs to a computation
as being exact, and a long (floating point) computation is
run with d significant bits in each mantissa, with no exponent
underflow or overflow.

As d varies, the number of significant output bits is
typically d - c, where the constant c depends upon the depth of
the computation (and how carefully the code avoids round-off errors).
For example, a computation x^16 = (((x^2)^2)^2)^2 has four squarings.
The first squaring loses about 0.5 bit of significance.
The second squaring doubles the old error (to 1.0 bit)
and adds some round-off error of its own,
for a net error between 0.5 and 1.5 bits (average 1 bit).
Two more squarings raise the average error to 4 bits.

  How does this relate to Mersenne?
Let's assume that the FFT computation loses 12 bits
of significance for FFT lengths in the range of interest.
Now consider an LL test on Mp where p ~= 5 million:

   Mantissa Output   Radix FFT
 bitsbits length

  53  41  2^11 524288 (Alpha)
  64  52  2^16 327680 (Pentium)
 110  98  2^40 131072 (Proposed)

The last two columns five a possible FFT radix and length.
The radix is 2^r where

2r + log_2(FFT length) <= (output bits)
 r * (FFT length) >= p ~= 5 million

Going from 53 mantissa bits (in a 64-bit word)
to 110 mantissa bits (in a 128-bit word), we have reduced
the required FFT length by a factor of 4.
That means approximately 25% as floating point operations
are needed for the FFT.  If our hardware can do 128-bit
floating point operations in twice the time needed for similar 64-bit
operations, then our overall time has improved by a factor of 2.

68 matches

Mail list logo