Mersenne: Re: fast GCD

1999-11-06 Thread EWMAYER

George Woltman writes:

Since F24 is 5 million digits your computer would take 5 minutes * 5^2,
or 2 hours.  Given your machine is faster, my code is 32-bit, my code
does an extended GCD, the Alpha is a better architecture, and your code
is probably better - you come up with the factor of 12 difference.

Was your code doing a true Fermat-mod DWT, or were you attempting to
factor instead the Mersenne number M(2^25-1) = (2^(2^24)-1)*(2^(2^24)+1)?
In the latter case your GCD would involve vectors twice as long, which
would mean a factor-of-4 runlength increase for an O(n^2) algorithm.

Now the good news.  Crandall's giants code has an implementation of
the O (n * log n * log n) GCD in it.  It was painfully slow.  After two
days of optimizing, I will be able to do that 5 million digit
GCD in an estimated 30 minutes.

Impressive - can you give us a brief summary of the most time-impacting
changes you made, and were these to the C source or in ASM?

BTW, I know of no other publicly available implemetation of the fast GCD.
It was written by J. P. Buhler.

I know little about it, except that Crandall mentioned it took them
the better part of 6 months to get the thing written and debugged.
Better them than you, right? :)

Jason Papadopoulos writes:

I've heard lots of complaints that IA64 has no fully-integer multiply
(i.e. you have to load your numbers into an FPU register and pull the
result back). Actually, there have been a lot of gripes about IA64;
Intel's developers' guide is great reading but quite difficult at times.

Respectfully, I believe this is a point on which the purists may be plain
wrong. One thing that impresses me about the IA64 (at least judging from
the hardware guide) is the flexibility regarding loading integers into
either a 64-bit integer (a.k.a. general-purpose) register or into a
floating register mantissa - sure, the IA64 has no separate integer
multiply unit, but if it can quickly load two 64-bit ints into floating
registers and then use the floating multiply unit (which is fully pipe-
lined and very fast) to get either the low or high 64 bits of the 128-bit
(exact integer) product, then that's a real gain.

One thing about the above operation (specifically the upper 64-bit
calculation) that intrigues me is how the IA64 effects error correction
of the floating product. On an IEEE-compliant machine, one can use an
FMUL to get the upper 52 bits of an integer product of up to 52+64=116
bits in length, a standard 64-bit integer MUL to get the low 64 bits,
and one uses the remaining bit (the lowest-order bit of the 53-bit
floating mantissa) to compare to the corresponding bit in the 64-bit
integer product to effect error correction. Perhaps the IA-64 uses a 
65-th mantissa bit in the floating multiplier to do this?

Cheers,
-Ernst

ftp://209.133.33.182/pub/mayer/home.html

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne: Re: Mlucas on SPARC

1999-11-06 Thread EWMAYER

Bill Rea writes:

At 256K FFT I was seeing 0.11 secs/iter with
Mlucas against 0.15 secs/iter for MLU, at 512K FFT the figures were
0.25 secs/iter and 0.29 secs/iter respectively.

...and don't forget the added benefits due to Mlucas being able to
test significantly larger exponents at the same FFT length, as well
as having intermediate runlengths between powers of two. 0.11 seconds
at 256K is impressive - that's faster than on my 400MHz Alpha 21164
(0.16 sec) and Prime95 on a 400MHz PII (0.13sec), though, to be fair,
both of the latter systems have a smaller (512KB) L2 cache.

Mlucas runs significantly faster if you can compile and run it
on a 64-bit Solaris 7 system.

This is the first I've heard of this - roughly how much of a speedup
do you see?

I would like to upgrade the E450 to Solaris 7 and see what Mayer's
code can do in 64-bit mode, but the owners won't let me :-(

Luddites. :)

Brian Beesley writes:

Bear in mind that Mlucas is still a bit "experimental". One problem 
has come to light - if you're using the PrimeNet Manual Testing page 
to submit results, you _must_ remove the space preceding the M at the 
beginning of the result line, else PrimeNet won't accept the result. 
Actually it says it has got the result but doesn't remove the 
assignment from the current assignments report or add it to the 
completed assignments report unless the leading space is removed.
(This seems to be a program bug, I see the extra space in the source 
code so I'd assume the problem afflicts Mlucas 2.7y on all platforms. 
Certainly the binary executable I built for Alpha systems running 
linux is also afflicted.)

Actually, I had good reason to insert the leading space - many Fortran
compilers (especially older ones) interpret the leading character of a
literal output string as a linefeed character (a holdover from the days
of line printers) and will print gibberish if it's nonblank. I quote
from Metcalf  Reid, "Fortran 90/95 Explained," p.212:

"Fortran's formatted output statements were originally designed for
line-printers, with their concept of lines and pages of output. On such
a device, the first character of each output record must be of default
kind. It is not printed but interpreted as a _carriage control character_.
If it is a blank, no action is taken, and it is good practice to insert
a blank as the first character of each record..."

It seems this is something that is better dealt with on the server end -
seems a bit silly for PrimeNet to get discombobulated by an extra blank
space here and there. Scott, any chance you could apply a patch to solve
this problem in the near future? (In between diaper-changing, of course :)

For those of you who own/administrate Alphas, SGIs and SPARCs (or know
someone who does), please get the word out - it would be nice to start
increasing the percentage of non-x86 GIMPS clients, especially now that
there's Mersenne testing code roughly as fast as Prime95 for all these
systems.

The one thing we still need to make it easy to administer multiple
machines (e.g. a lab full of them) is a PrimeNet interface. Scott
suggested to me that the easiest way to do this is to start with the
Mprime (Prime95) for Linux interface routines. Does any want to look
at those, and see if they can get something working for at least one
of the above Unix/Linux clients? If I don't hear from anyone I suppose
I'll have to do it myself, but I honestly prefer to work on the under-
lying "engine" - every day I spend on other things is a day I don't
have to work, e.g. on the Mlucas P-1 factoring code.

Also, I've had no responses to my query about the Absoft f90 compiler
for the Macintosh - is the Mac portion of GIMPS dead, or is the name
"MacLucas" leading all the Mac users to believe that MLU is the only
code that will run on their hardware?

Happy hunting,
Ernst

ftp://209.133.33.182/pub/mayer/home.html

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Re: fast GCD

1999-11-06 Thread George Woltman

Hi,

At 04:07 PM 11/6/99 EST, [EMAIL PROTECTED] wrote:
Was your code doing a true Fermat-mod DWT, or were you attempting to
factor instead the Mersenne number M(2^25-1) = (2^(2^24)-1)*(2^(2^24)+1)?

Prime95 uses a true Fermat-mod DWT.

Impressive - can you give us a brief summary of the most time-impacting
changes you made, and were these to the C source or in ASM?

There were a few things in giants that were not done efficiently.
For example, dividing one huge number by another was done using 
reciprocals and then multiplying.  Well, 99.9% of the time in a GCD
the quotient will be less than 2^32 and can be derived by one floating
point divide on just the high order bits.  It was also a bit of a
memory hog.

I also upgraded giants from 16-bit to 32-bit, made use of some
ASM helper routines, cleaned up memory management, etc.

The changes I've made have been forwarded to Dr. Crandall for
possible incorporation into a future release of giants. 

BTW, I know of no other publicly available implemetation of the fast GCD.
It was written by J. P. Buhler.

I know little about it, except that Crandall mentioned it took them
the better part of 6 months to get the thing written and debugged.
Better them than you, right? :)

Absolutely!  And it is quite elegant.  You should be able to incorporate
it into your code with little trouble.  I had little trouble upgrading
it to do modular inverses too.

Regards,
George

_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Version 19 for FreeBSD?

1999-11-06 Thread Bryan Fullerton

On Sat, Nov 06, 1999 at 10:47:07PM +0100, Francois E Jaccard [EMAIL PROTECTED] 
wrote:
 Will a version 19 for FreeBSD be available? I would like to transfer a 33M 
 exponent to a FreeBSD 4.0-current machine.

I'm also waiting for the FreeBSD v19 client.

Bryan

-- 
Bryan Fullertonhttp://www.samurai.com/
Core Competency
Samurai Consulting
Can you feel the Ohmu call?
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne: Re: Mersenne Digest V1 #656

1999-11-06 Thread STL137

Who's sending mail to the list that makes the digests appear with a red 
background in my AOL mailreader? Arrgh... afterimage... AFTERIMAGE!

S.T.L.
_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne Digest V1 #656

1999-11-06 Thread Mersenne Digest


Mersenne Digest   Saturday, November 6 1999   Volume 01 : Number 656




--

Date: Wed, 03 Nov 1999 18:35:22 -0800
From: Eric Hahn [EMAIL PROTECTED]
Subject: Re: Mersenne: Trial-factorers

Brian Beesley wrote:
On 27 Oct 99, at 17:23, Eric Hahn wrote:
 
 I'm looking for program(s) capable of trial-factoring 
 prime exponent Mersenne numbers (using 2kp+1) meeting
 the following requirements:
 
 [...requirements...]

Well, I'm prepared to have a go. Could we tighten up the spec a bit?

Wow!!  I was going to post a message with regard to the fact
that it looked like I was going to have write some code to 
produce my own program.  While the programs that were suggested
are well written, perform their designed functions, etc., they
were either not capable of the tasks required or too slow to
be useful.  

Now, look.  I even have a volunteer to write some code!!  :)

(a) There's also been some interest in something else that Prime95 
doesn't do - trial factoring 2^p+1.

(b) I assume we're only interested in 2kp+1 factors. This means that 
we will miss any factors which are not of this form. (Applies to 
Mersenne numbers with composite exponents, and all 2^p+1 numbers - 
though I believe that the "missed" exponents are easy to derive 
analytically.)

I'm not opposed or take exception to any possible additional
capabilities...  It just might require a little extra effort
for the coding.

(c) I presume we're looking for a program optimized for IA32 
architecture. The mersfac* programs are available but are unlikely to 
be optimally efficient on any particular hardware platform.

Optimized for IA32 would be beneficial (which processor
architechure runs 80% of PCs?).  If possible, the ability
to modify so as to optimize for other architectures would
be a plus, however.  One big concern is speed though!

Given that, I suggest limits on exponent  2^62 and on factor  2^95
(these are convenient for the architecture).

After waking up several nights ago with some pretty *scary*
thoughts, I realized a couple of things.

As such, exponents through 2^62 should do fine.  Anything
that might be necessary above 4.6 x 10^18 could probably
be extrapolated (not that I can think of any reason, currently,
that would cause it to be necessary).

Factors, however, would be slightly different.  I suppose 2^95
would be acceptable for a base level (or default), especially
if testing an entire bit depth (or a large range of factors).
However, if testing a small and specific range of K, say
K=2^143 to K=2^143+500, it might need to go considerable
higher.  I'm willing to make a few sacrifices for this to
be possible...  

Admittedly, I'm not "in the loop" regarding the division
of the massive numbers for which I'm talking.  I'm sure
somebody is, however (and maybe could explain it).

It's probably sensible to go for an application which runs in a "DOS 
box" rather than a proper windowed application. This makes it a bit 
easier (for me) to write  also makes deriving a linux variant almost 
trivial. (Does anyone know for sure whether or not there's a DOS box 
in "Millenium"? I heard a nasty rumour...)

I heard the same nasty rumor...  Actually, I've heard several,
including ones about the floppy and Win16 support, among others.


Eric


_
Unsubscribe  list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers

--

Date: Wed, 3 Nov 1999 22:46:39 -0500 (EST)
From: Lucas Wiman  [EMAIL PROTECTED]
Subject: Re: Mersenne: Meganet Corp.

   I may have missed this, but was there any final judgment on the Meganet
 Corp. claim that they had a "deterministic and polynomial-time" prime
 test? There were some discussions on the list early this year when they
 first made their claim. But I don't recall reading about any resolution. 
 Did this just fade away? Are they still holding to their claim? And if
 they are, does it stand up to examination (if they are revealing enough
 for examination)?
 
 These guys are snake-oil vendors. I don't know what type of prime test
 they are claiming to have or not have but from my exposure to their crypto
 claims I wouldn't trust anything from them  without proof.

I'd say this is true in this case as well.
On their site they say things like:
Those results are unheard of. The 1,000 bits test on a Sparc
   II workstation takes 5 Minutes and it is still only
   PROBABALISTIC. The gap in time is much greater for larger
   bit sizes.

Eh?  The probable prime test on 10^999+7 took ~20 seconds on my PII/233.  That's
a thousand digits, ~3000 bits.
(w/primeform)

The major breakthrough is solving a 400 year old
   mathematical problem – how to positively identify a prime
   number without spending exponential time in dividing