Mersenne Digest       Saturday, August 21 1999       Volume 01 : Number 617




----------------------------------------------------------------------

Date: Thu, 19 Aug 1999 12:58:21 -0600
From: "Aaron Blosser" <[EMAIL PROTECTED]>
Subject: RE: Merced (was Re: Mersenne: Re: Alpha DS20 timings.)

> > I imagine you could significantly speed up the code by keeping
> much of the
> > data in register.  REG-REG operations take a lot less time than
> a REG-MEM
> > operation.  Should be delicious.
>
> From what I understand of Merced, compiler technology is going to be the
> problem.  It's probably not unreasonable to expect large performance
> increases as the intelligence of compilers (especially the "free"
> compilers like gcc and egcs) catches up to the theoretical performance
> of the CPU.

Well, I hope George can do some assembly for the Merced.  It'll be a pain
though, I'm sure, having to do all those new optimizations by hand.

As for the compilers, remember that RISC-type architectures are nothing
really new...and EPIC-type stuff has been around a while.  There are
already compilers for other systems that contain much of the brains to do
these optimizations; they just need to get those smarts moved over to
IA64's particular needs.

And what with Intel making IA64 "simulators" available way before samples of
Merced ever ship, the software vendors have had ample time to work with
it...meaning that when Merced does ship in volume, the software *should*
hopefully be ready.

Aaron

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Thu, 19 Aug 1999 17:39:59 -0700
From: "John R Pierce" <[EMAIL PROTECTED]>
Subject: Re: Merced (was Re: Mersenne: Re: Alpha DS20 timings.)

> As for the compilers, remember that RISC-type architectures are nothing
> really new...and EPIC-type stuff has been around a while.  There are
> already compilers for other systems that contain much of the brains to do
> these optimizations; they just need to get those smarts moved over to
> IA64's particular needs.

IA64 is really VLIW (very long instruction word), which is quite different
from traditional sequential RISC.  It requires the compiler to do a LOT of
massively parallel pipeline scheduling to achieve optimal results.  HP has a
leg up on this compiler technology as IA64 is based on their existing
PA-RISC, and is sharing their compiler back-end optimization technology with
Intel and Microsoft.

Having once programmed a VLIW machine in 'assembler', I would not wish that
task on ANYONE.  The machine I worked on had 8 parallel asymmetrical
execution units, and a 288-bit-wide opcode which launched 8 different
instructions in parallel every cycle.   The assembler (micro?) coder had to
keep track of which parts of what execution unit would take how long to do
each instruction, and not rely on results before they were ready.  Keeping
the machine actually humming along at even close to half its theoretical
performance bordered on nightmarish.

-jrp



------------------------------

Date: Fri, 20 Aug 1999 10:03:49 -0500
From: "Willmore, David" <[EMAIL PROTECTED]>
Subject: RE: Merced (was Re: Mersenne: Re: Alpha DS20 timings.)

> IA64 is really VLIW (very long instruction word), which is quite different
> from traditional sequential RISC.  It requires the compiler to do a LOT of
> massively parallel pipeline scheduling to achieve optimal results.  HP has
> a leg up on this compiler technology as IA64 is based on their existing
> PA-RISC, and is sharing their compiler back-end optimization technology
> with Intel and Microsoft.
> 
Intel has some experience with some degree of parallelism dating back to the
i860--which had visible pipelines. 

> Having once programmed a VLIW machine in 'assembler', I would not wish
> that task on ANYONE.  The machine I worked on had 8 parallel asymmetrical
> execution units, and a 288-bit-wide opcode which launched 8 different
> instructions in parallel every cycle.   The assembler (micro?) coder had
> to keep track of which parts of what execution unit would take how long
> to do each instruction, and not rely on results before they were ready.
> Keeping the machine actually humming along at even close to half its
> theoretical performance bordered on nightmarish.
> 
Yes, it's not fun at all.  I've programmed on the new TI VLIW DSPs and
they're certainly a trip.  The thing that makes this more practical is that
IA64 doesn't have exposed pipelines--*you* don't have to code in pipeline
delays to ensure correct behavior, just decent performance.

You are quite right, though: getting anywhere near even half of theoretical
performance on general-purpose VLIW machines for general-purpose code is a
monstrous task.  DSP is a bit easier, as that class of algorithms has more
self-similarity.  *But* this is where we're on the good side of this
debate.  An FFT is one of those algorithms--it has been extensively
researched WRT implementations like this.  Intel, I believe, will be
providing some 'machine speed' FFT code--if the recent press release is to
be believed.
Maybe LL testing code for IA64 will be easier? :)

Cheers,
David

------------------------------

Date: Fri, 20 Aug 1999 19:16:40 +0200
From: "Steinar H. Gunderson" <[EMAIL PROTECTED]>
Subject: Re: Re: Merced (was Re: Mersenne: Re: Alpha DS20 timings.)

On Thu, Aug 19, 1999 at 05:39:59PM -0700, John R Pierce wrote:
>The assembler (micro?) coder had to
>keep track of which parts of what execution unit would take how long to do
>each instruction, and not rely on results before they were ready.  To keep
>the machine actually humming along at even close to half its theoretical
>performance levels bordered on nightmarish.

Seems like assemblers will have to get smarter in the future? Just the
`simple' feature of being able to edit those 8 streams separately would
apparently help a lot, and some visual cue on when an operation is
finished would also be an idea. (Wonder why the last one hasn't been
implemented in any fancy assembler GUI?)

/* Steinar */
-- 
Homepage: http://members.xoom.com/sneeze/

------------------------------

Date: Fri, 20 Aug 1999 21:00:52 +0100
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
Subject: Mersenne: MacLucasUNIX for Sun Solaris

Hi,

For anyone interested, I've placed some more versions of MacLucasUNIX 
on my ftp server (ftp://lettuce.edsc.ulst.ac.uk/gimps/software)

There are now additional versions compiled using gcc for Ultra CPUs 
(the original file also worked on SuperSparcs) and three executables 
compiled by Bill Rea using Sun's own compiler, optimized for 
different configurations using Ultra CPUs.

The speed increase is quite marked: on my Ultra 10 (300 MHz Ultra IIi 
CPU, 128MB RAM, Solaris 2.6) the generic Sparc code takes 33 mins, 
against 25 mins using the Ultra code generated by gcc and only 19 
mins against the best of Bill Rea's executables - the time being that 
required to execute 5000 iterations using a 256K FFT.

Regards
Brian Beesley

------------------------------

Date: Fri, 20 Aug 1999 16:45:36 -0400
From: Marc Getty <[EMAIL PROTECTED]>
Subject: Mersenne: Merced Assemblers

> Seems like assemblers will have to get smarter in the future? Just the
> `simple' feature of being able to edit those 8 streams separate would
> appearently help a lot, and some visual cue on when an operation is
> finished would also be an idea. (Wonder why the last one hasn't been
> implemented in any fancy assembler GUI?)

I can see it now: Microsoft Visual Assember++! It will require 2048 MB of
RAM, 16 GB of hard disk space, and an 8 GHz Merced processor. You know it's
coming, you know you don't want it, and you know you will still buy it.

-Marc

Marc Getty                           [EMAIL PROTECTED]
Department of Dental Informatics, Temple University
http://www.temple.edu/dentistry/di/    215-707-8192

------------------------------

Date: Sat, 21 Aug 1999 01:14:55 +0200
From: "Steinar H. Gunderson" <[EMAIL PROTECTED]>
Subject: Mersenne: Re: Merced Assemblers

On Fri, Aug 20, 1999 at 04:45:36PM -0400, Marc Getty wrote:
>I can see it now! Microsoft Visual Assember++ it will require 2048 MB of RAM, 16
>GB of hard disk space, and an 8 GHz Merced processor. You know it's coming, you
>know you don't want it, and you know you will still buy it.

And gas, taking the bloatedness to new heights, will require 8 bytes of
memory, 32 kB of hard disk space (making it impossible to compile Merced
code on a C64 -- the gas team has already issued a public apology for this)
and a toaster. 

(No, I won't buy an assember. An assembLer, on the other hand ;-) )

Still, this might be useful. It would already have been useful today,
though... (Hardcore asm programmers would probably not need it, though,
and I guess most asm programmers who've survived are hardcore enough...)

/* Steinar */
-- 
Homepage: http://members.xoom.com/sneeze/

------------------------------

Date: Fri, 20 Aug 1999 17:48:54 -0700
From: "Scott Kurowski" <[EMAIL PROTECTED]>
Subject: Mersenne: v19 network connection tests

Hi all,

I need your help.  It takes just a few moments on your Windows computer.

I need test data from a broad variety of network connection and ISP types to
see if a new Internet connection detection method can replace or supplement
the RAS method used now in v16-v18.

There's a tiny Win32 console app executable to run that outputs 2 or 3 lines of
text, at http://entropia.com/primenet/online.ex_ (28k).  Download and rename it
from 'online.ex_' to 'online.exe'.

Copy and paste the results into an email directly to me, indicating:
  - type of Internet connection (dialup modem, LAN, DSL, ISDN, etc.)
  - output of program test run(s)
  - if the connection was really open or not
  - ISP connection method & if applicable, version
      (direct, Windows dialup networking [DUN], AOL 4.0, Compuserve 3.0, etc.)
  - web browsers & versions installed (Netscape 3, IE4.0, IE5, etc.)
  - Operating system (Win95, Win98, NT 4.0 workstation or server)

For dial-up users, run it twice, once with the connection closed, and again with
your Internet connection established.

If you have a LAN and/or proxy connection, please also run it.

Thanks, everyone.  Hopefully we'll cover enough network connection types to
determine its viability.

Regards,
scott



------------------------------

Date: Fri, 20 Aug 1999 21:45:19 EDT
From: [EMAIL PROTECTED]
Subject: Mersenne: Re: DS-20 timings

Dear All: I'm catching up on lots of postings, so forgive me if this
is long-winded.

Simon Burge writes:

>Compaq have a DS20 Alpha with 2 500MHz 21264 CPUs on the internet
>for people to try out.

This is a dual-CPU version of the same kind of machine (a.k.a. ev6
or 21264 - the naming profusion keeps increasing with each new
generation, apparently) David Willmore used to verify M#38 using a
beta of my Mlucas v2.6 code (formerly lucas_mayer - Michael Taylor,
who used to maintain the GIMPS source code page, picked that name,
whereas I prefer just Mlucas).
We got timings competitive with Prime95 on the fastest available
Pentia, and that was using HLL compiled code with no architecture-
specific tunings. The 21264 is a sweet piece of hardware, indeed.

>Ernst - since nigel is no more, where can I get the latest f90
>code? I've got 2.5b, and it's giving me some errors:

I'm not sure why it's error-exiting, but since you compiled locally
I suspect overly aggressive compile options.  In particular, if you
use -fast, you must also use -assume accuracy_sensitive, to keep
the compiler from eliminating the (x+rnd)-rnd operations used to
effect a fast NINT in the carry phase.  Also, you might try both
-O4 and -O5: the latter sometimes gives slower executables, in a
platform-dependent way, and should be used with caution.

In any event, you can get the new improved version 2.6b via

ftp://209.133.33.182/pub/mayer/Mlucas_2.6b.f90.gz

(David Willmore is the only person who got the short-lived v2.6a;
David, 2.6b is about 10% faster, so you may want to grab it now.)

If you don't have an F90 compiler or don't want to compile locally,
binary executables for Alpha Unix (formerly OSF/1) and SGI are also
available in the /bin directory of the above site. Please see the
README file for more info.

I am waiting for a Linux executable, hopefully by middle of next
week. Alex Kruppa has several different F90 compilers on his SPARC
which he said he'd try out on the code, but as his father passed
away a few days ago, he surely has more pressing concerns.

One can only hope there's now/soon a better F90 compiler for SPARC
than their dismal V1 effort - that won't even do 64-bit loads and
stores, even when one specifies such in the options!

Aaron Blosser writes:

>ps - Is anyone else besides me happy that Compaq is ditching
>that boring old "computer beige" in favor of the "opal" (basically
>white) color for their servers?  I think they look snazzy.

My DEC 21164 server is "Top Gun blue" - is that faster?

Brian Beesley writes (about MacLucasUnix):

>I find, running MLU on a Alpha 21164-533, 128K FFT works up to about 
>exponent 2.35 million, & pro rata. MLU on a Sparc seems to be able to 
>run a bit higher, somewhere around 2.45 million seems to be OK for a 
>128K FFT.

Hmm, those upper limits seem a bit low. Does MacLucasUnix tell you when
the exponent is too large? Some related postings in the last digest
seem to show people using exponents much too large for a given FFT size
but getting no error messages - that would be bad. For comparison,
Mlucas, on machines that support real*16 sincos inits (Alpha and SGI),
can go up to the following p's (I omit 160, 192 and 224K for brevity):

size: 128K   256K   320K   384K   448K   512K   640K   768K   896K  1024K
pmax: 2.62M  5.20M  6.46M  7.71M  8.96M  10.2M  12.6M  15.1M  17.5M  20M

On strictly real*8-type hardware, the upper limits are about 1% less.
The only cost of real*16 inits is a greater initialization time.

OK, now down to the nuts and bolts of per-iteration timings: I'll list
only my own timings of Mlucas 2.6b here - since I'll be maintaining the
GIMPS timings page, I hope people will keep sending/posting timings for
various platforms. In particular I'd like MacLucasUnix timings for SGIs
as similar as possible to the ones below.

SUMMARY: in general it appears that MacLucasUnix is somewhat faster than
Mlucas at a given power-of-2 length. Most of the difference appears to be
due to the fact that MacLucasUnix uses an in-place transform strategy
like Prime95, whereas Mlucas uses an out-of-place transform, thus needing
about double the memory and spilling over into RAM for FFT lengths where
MacLucasUnix still just fits into the L2 cache.

On the other hand, I found that using an out-of-place transform scheme
made the code much easier to debug, and also eased the adding of non-
power-of-2 FFT lengths, which MacLucasUnix lacks. Thus, in the lower half
of each power-of-2 interval, Mlucas will be faster, the two codes should be
roughly equal for the third quarter of the interval, and MacLucasUnix
is faster for the uppermost quarter.

                       Platform/per-iteration time (sec)
            200MHz 21064   400MHz 21164   195MHz R10000   250MHz R10000
            cache sizes    8kB D-cache    32kB D-cache    32kB D-cache
            unknown        96kB mixed I/D                 
                           512kB L2       4MB L2          1MB L2
FFT length: ------------   ------------   -------------   -------------
 128K       0.32           0.12           0.096           0.095
 160K       0.37           0.17           0.14            0.14
 192K       0.48           0.22           0.17            0.17
 224K       0.58           0.26           0.21            0.20
 256K       0.63           0.29           0.25            0.23
 320K       0.87           0.39           0.33            0.29
 384K       1.06           0.49           0.40            0.35
 448K       1.29           0.58           0.49            0.42
 512K       1.39           0.65           0.56            0.47
 640K       1.88           0.84           0.70            0.60
 768K       2.35           1.15           0.96            0.80
 896K       2.73           1.22           1.04            0.86
1024K       2.96           1.36           1.17            0.96

David Willmore's timings of the beta of Mlucas 2.6 indicate that the
code runs about 3 times faster on a 500MHz 21264 than on a 400MHz 21164,
so dividing the numbers in the 21164 column by 3 should yield a decent
estimate of 21264 timings until I can get some actual timing data.

Now, if assembly coding could squeeze out gains anywhere near those
Jason Papadopoulos obtained on the SPARC, that would be impressive,
indeed.

Cheers,
Ernst

------------------------------

Date: Sat, 21 Aug 1999 00:15:57 -0400
From: Pierre Abbat <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: Merced Assemblers

>And gas, taking the bloatedness to new heights, will require 8 bytes of
>memory, 32 kB of hard disk space (making it impossible to compile Merced
>code on a C64 -- the gas team has already issued a public apology for this)
>and a toaster. 

Gasp!

>(No, I won't buy an assember. An assembLer, on the other hand ;-) )

What month comes after Assember?

phma

------------------------------

Date: Sat, 21 Aug 1999 05:32:10 GMT
From: [EMAIL PROTECTED] (Foghorn Leghorn)
Subject: Re: Mersenne: Re: Merced Assemblers

On Sat, 21 Aug 1999 00:15:57 -0400, you wrote:
>>(No, I won't buy an assember. An assembLer, on the other hand ;-) )
>What month comes after Assember?

I think it's Dectembruary. (Or is that just on the Julian calendar?)

Foghorn Leghorn
[EMAIL PROTECTED]

------------------------------

Date: Sat, 21 Aug 1999 21:00:22 +1000
From: Simon Burge <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Re: DS-20 timings 

[EMAIL PROTECTED] wrote:

> Dear All: I'm catching up on lots of postings, so forgive me if this
> is long-winded.

I'll give you a long-winded reply then :-)

> Simon Burge writes:
> 
> >Ernst - since nigel is no more, where can I get the latest f90
> >code? I've got 2.5b, and it's giving me some errors:
> 
> I'm not sure why it's error-exiting, but since you compiled locally
> I suspect overly aggressive compile options.  In particular, if you
> use -fast, you must also use -assume accuracy_sensitive, to keep
> the compiler from eliminating the (x+rnd)-rnd operations used to
> effect a fast NINT in the carry phase. Also, you might try both
> -O4 and -O5: the latter sometimes gives slower executables, in a
> platform-dependent way, and should be used with caution.

It looks like the DEC C compiler has the same problem - adding "-assume
accuracy_sensitive" to the compiler command line fixed the problem where
MacLucasUnix thought it could do 33219281 with a 1M FFT - it does detect
an error and bumps up to a 2M FFT pretty quickly (within the first ten
iterations).

> In any event, you can get the new improved version 2.6b via
> 
> ftp://209.133.33.182/pub/mayer/Mlucas_2.6b.f90.gz
> 
> (David Willmore is the only person who got the short-lived v2.6a;
> David, 2.6b is about 10% faster, so you may want to grab it now.)

Have you got 2.6x and 2.6a mixed up?  From what I understand, David used
2.6x for the double check, and 2.6a is currently on your ftp site.

> Brian Beesley writes (about MacLucasUnix):
> 
> >I find, running MLU on a Alpha 21164-533, 128K FFT works up to about 
> >exponent 2.35 million, & pro rata. MLU on a Sparc seems to be able to 
> >run a bit higher, somewhere around 2.45 million seems to be OK for a 
> >128K FFT.
> 
> Hmm, those upper limits seem a bit low. Does MacLucasUnix tell you when
> the exponent is too large? Some related postings in the last digest
> seem to show people using exponents much too large for a given FFT size
> but getting no error messages - that would be bad.

It does, when (as I mentioned above) compiled with the right options...

> For comparison,
> Mlucas, on machines that support real*16 sincos inits (Alpha and SGI),
> can go up to the following p's (I omit 160, 192 and 224K for brevity):
> 
> size: 128K   256K   320K   384K   448K   512K   640K   768K   896K  1024K
> pmax: 2.62M  5.20M  6.46M  7.71M  8.96M  10.2M  12.6M  15.1M  17.5M  20M

Here are some _very_ rough figures for MacLucasUnix on the DS20:

size: 128K   256K   320K   384K   448K   512K   640K   768K   896K  1024K
pmax: 2.38M  4.98M                       9.3M                       18.8M 

> David Willmore's timings of the beta of Mlucas 2.6 indicate that the
> code runs about 3 times faster on a 500MHz 21264 than on a 400MHz 21164,
> so dividing the numbers in the 21164 column by 3 should yield a decent
> estimate of 21264 timings until I can get some actual timing data.

Slightly more than three times - here's your table with a column for the
21264 added:

                                Platform/per-iteration time (sec)
            200MHz 21064   400MHz 21164   195MHz R10000   250MHz R10000   500MHz 21264
            cache sizes    8kB D-cache    32kB D-cache    32kB D-cache    64kB I-cache
            unknown        96kB mixed I/D                                 64kB D-cache
                           512kB L2       4MB L2          1MB L2          4MB L2
FFT length: ------------   ------------   -------------   -------------   -------------
 128K       0.32           0.12           0.096           0.095           0.043
 160K       0.37           0.17           0.14            0.14            0.051
 192K       0.48           0.22           0.17            0.17            0.062
 224K       0.58           0.26           0.21            0.20            0.081
 256K       0.63           0.29           0.25            0.23            0.10
 320K       0.87           0.39           0.33            0.29            0.12
 384K       1.06           0.49           0.40            0.35            0.16
 448K       1.29           0.58           0.49            0.42            0.19
 512K       1.39           0.65           0.56            0.47            0.21
 640K       1.88           0.84           0.70            0.60            0.28
 768K       2.35           1.15           0.96            0.80            0.38
 896K       2.73           1.22           1.04            0.86            0.40
1024K       2.96           1.36           1.17            0.96            0.46

These results are for a copy of Mlucas_2.6a.f90 I compiled with:

        f90 -o lm -tune ev6 -O5 lucas_mayer_V2.5b.f90

The Mlucas_2.6a.exe from your FTP site gives the same results but is
slightly slower (around 5%) on the DS20:

        % cat foo
        15000017,0
        y
        750

        % time ./Mlucas_2.6a.exe < foo
          no restart file found...looking for range file...
          no range file found...switching to interactive mode.
         Enter p,n (set n=0 for default FFT length) >
         Enter 'y' to run a self-test, <return> for a full LL test >
          Enter number of iterations for timing test>
          p is prime...proceeding with Lucas-Lehmer test...
         M( 15000017 ): using an FFT length of  786432
          this gives an average    19.0735079447428      bits per digit
             750 iterations of M15000017 with FFT length  786432
         Res64: 545ACAF7C5DB12F5. Program: E2.6a
         Clocks = 00:04:59.369
        298.72u 0.09s 4:59 99% 0+368k 0+7io 0pf+0w

        % time ./lm26a < foo
          no restart file found...looking for range file...
          no range file found...switching to interactive mode.
         Enter p,n (set n=0 for default FFT length) >
         Enter 'y' to run a self-test, <return> for a full LL test >
          Enter number of iterations for timing test>
          p is prime...proceeding with Lucas-Lehmer test...
         M( 15000017 ): using an FFT length of  786432
          this gives an average    19.0735079447428      bits per digit
             750 iterations of M15000017 with FFT length  786432
         Res64: 545ACAF7C5DB12F5. Program: E2.6a
         Clocks = 00:04:42.171
        281.59u 0.11s 4:42 99% 0+368k 0+6io 0pf+0w

I can give you the raw data for my additions to the table (including
exponents, iterations and residues) if you want.

Simon.

------------------------------

Date: Sat, 21 Aug 1999 13:18:43 +0200
From: Johan Winge <[EMAIL PROTECTED]>
Subject: Mersenne: Basic divisibility(?) question.

Hello!
A fairly simple question maybe, but bear with me please...
 
Am I correct in my assumption that if e.g. p=2*k*p(1)*p(2) divides M(p(1)),
then p only divides other mersenne numbers with an exponent which is a
multiple of p(1), (and thus cannot divide M(p(2)) )? Is there a proof for
this or am I wrong?

Regards,
Johan Winge

------------------------------

Date: Sat, 21 Aug 1999 05:12:29 -0700
From: "Joth Tupper" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Basic divisibility(?) question.

Can you restate the question so that it is clear what divides what?

As is, p is even, so p cannot divide a Mersenne number.

Thanks,

Joth

----- Original Message -----
From: Johan Winge <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, August 21, 1999 4:18 AM
Subject: Mersenne: Basic divisibility(?) question.


> Hello!
> A fairly simple question maybe, but bear with me please...
>
> Am I correct in my assumption that if e.g. p=2*k*p(1)*p(2) divides
M(p(1)),
> then p only divides other mersenne numbers with an exponent which is a
> multiple of p(1), (and thus cannot divide M(p(2)) )? Is there a proof for
> this or am I wrong?
>
> Regards,
> Johan Winge


------------------------------

Date: Sat, 21 Aug 1999 22:11:09 +1000
From: Simon Burge <[EMAIL PROTECTED]>
Subject: Mersenne: Updated Alpha MacLucasUNIX binaries

Folks,

I've updated the Alpha MacLucasUNIX binaries at:

        ftp://melanoma.cs.rmit.edu.au/pub/simonb/MLU-ALPHA.tar.gz

to include the "-assume accuracy_sensitive" option as discussed in
another message I sent to the mersenne list a few minutes ago.  THE OLD
BINARIES WILL NOT PRODUCE CORRECT RESULTS!

I've tested this version against M380267 (the smallest result I had
lying around), and the residue was correct.  It took 2,590 seconds to
check that result, compared with 531,978 seconds on the DECsystem
5000/150 I first got the result from!
The actual command line used to build these binaries was:

        cc -DMERS_PACKAGE -DUSE_RUSAGE -fast -arch host -tune host -O4 
            -assume accuracy_sensitive -ifo -o MacLucasUNIX
            MacLucasUNIX.c balance.c rw.c setup.c zero.c -lm 

Using "-ifo" and listing all the .c files on the same line enables
inter-file optimisations.  Whether anything was actually gained I
don't know.  The new version is approx 1% slower than the old one, but
should produce correct results :-)

Simon.

------------------------------

Date: Sat, 21 Aug 1999 09:00:23 -0500
From: Herb Savage <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Basic divisibility(?) question.

Johan Winge wrote:

> Hello!
> A fairly simple question maybe, but bear with me please...
>
> Am I correct in my assumption that if e.g. p=2*k*p(1)*p(2) divides M(p(1)),
> then p only divides other mersenne numbers with an exponent which is a
> multiple of p(1), (and thus cannot divide M(p(2)) )? Is there a proof for
> this or am I wrong?

You are correct.

This question has come up several times in the past.
It's implied by the fact that GCD(a^m-1,a^n-1) = a^GCD(m,n)-1.  This is
proven in an exercise in Knuth, in the section on GCDs.

Regards,

Herb Savage

> Regards,
> Johan Winge




------------------------------

Date: Sat, 21 Aug 1999 23:29:38 +0100
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
Subject: Mersenne: LL Test Program verification data

Hi,

I've posted on my ftp server a file which may be of interest to 
anyone developing or verifying the operation of LL testing programs.

The URL of the file is 
ftp://lettuce.edsc.ulst.ac.uk/gimps/PrimeQA/QADATA.TXT

The format is
exponent,# iterations,residual,program id,max error

The residual listed is the hexadecimal representation of the low 
order 64 bits of the complete residual. The program id is always 
"lucdwt"; the max error can be ignored (basically it's there as a 
self-check to make sure that a sensible FFT size was used to generate 
the data).

The exponents are selected values between 8,000 and 80,000,000. The 
low 64 bits of the residual after 400 iterations are listed for 
every exponent in the file. (After one iteration the residual should 
be 000000000000000E for every exponent.) For exponents less than 
4,000,000, the residual after 1,000 iterations is also listed. For 
exponents greater than 20,000,000, the residual after 100 iterations 
is also listed.

The results were generated on a 533 MHz Alpha 21164LX system; the 
program used is a version of lucdwt (included in Richard Crandall's 
giantint package), modified to produce output in a suitable format.

I hope that this data assists anyone developing or modifying LL 
testing software to ensure that their code is operating correctly. 
PLEASE INFORM ME IMMEDIATELY if you should detect what appears to be 
a genuine mistake in the data.

In the near future I intend to enhance this file to include the 
residual after 1000 and after (n-2) iterations for all odd n between 
500 and 8000. If anyone has any suggestions as to any further data 
which may usefully be included, please contact me. I'm afraid I am 
unable to proceed beyond exponent 80 million due to system 
limitations; my Alpha system has "only" 320 MB of RAM.

Regards
Brian Beesley

------------------------------

End of Mersenne Digest V1 #617
******************************
