> > 
> > Eduard Kostolansky writes:
> > 
> >    Oh, GOD!  I think, this will be the real death of us - UNIX
> >    double-checkers.
> > 
> > Possible, I suppose, but for people that only have UNIX boxes and want
> > to contribute ...  Of course, there's always ECM factoring, NFSNet,
> > Ernst Mayer's F90 code, and others.
> > 
> There is still a lot of work in double-checking, since only exponents
> between 1,500,000 and 1,700,000 are available on Primenet (and there are a
> lot of exponents below 1,500,000).
> And don't forget that Prime95 can have two different residues for some
> exponents, so we'll need a third LL-test.

And we still need REAL independent checking. I believe 
George's v17 code is useful, but I find it hard to accept that 
matching residues from two runs with different offsets, both using 
Prime95 v17.x on Intel CPUs, is totally satisfactory. We still need 
to double (triple) check at least a proportion of exponents 
independently to be sure that something nasty isn't slipping through, 
like some weird & wonderful flaw in the CPU chip or a 
non-offset-dependent bug in George's program.
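
To make the "different offsets" idea concrete, here is a minimal sketch 
in Python. It is NOT George's code, only my illustration of the 
principle: the LL value is kept rotated left by some number of bits, 
the rotation doubles at each squaring, and it is undone at the end. 
The function name ll_residue and its shift parameter are my own 
inventions, and plain big-integer arithmetic stands in for the FFT 
multiply.

```python
def ll_residue(p, shift=0):
    """Final Lucas-Lehmer value for M_p = 2**p - 1 (p an odd prime).

    The running value is stored multiplied by 2**shift mod M_p.
    Hypothetical helper for illustration only, not Prime95's routine.
    """
    m = (1 << p) - 1
    shift %= p
    s = (4 << shift) % m              # starting value 4, rotated by `shift` bits
    for _ in range(p - 2):
        shift = (2 * shift) % p       # squaring doubles the rotation (2**p == 1 mod M_p)
        s = (s * s - (2 << shift)) % m  # the "- 2" must be rotated the same way
    # Undo the rotation: multiplying by 2**(p - shift) is dividing by 2**shift.
    return (s << ((p - shift) % p)) % m


if __name__ == "__main__":
    # Two runs with different offsets should agree on the final residue.
    print(ll_residue(11) == ll_residue(11, shift=5))
    # A zero residue means M_p is prime.
    print(ll_residue(13, shift=7) == 0)
```

Both runs go through exactly the same squaring code, so a bug that 
does not depend on the offset would produce the same wrong residue 
twice. That is the case this sketch cannot catch either.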

Having said that, Prime95 v17 should help us find most of the errors 
that are in the current data. The estimate of 1 in 200 results being 
"bad" sounds moderately frightening to me; if nothing else, Prime95 
v17 should enable us to refine that estimate.

> >    Note for Will Edgington: Will, I don't wanna die!! It's time to
> >    upgrade MacLucasUnix! Or is it impossible??
> > 
> > I'm sure it's possible, but I certainly don't know enough to improve
> > the speed.  And have not even had time to keep up with bug reports,
> > recently, though that should change over the next few weeks.
> > 
> Can anybody take on the role of George Woltman for MacLucasUnix?

The main problem with MacLucasUNIX is that, whilst the code itself is 
pretty portable, the required optimization depends heavily on the 
hardware it's going to run on. I believe that it was originally 
written with the organization of the PowerPC CPU used in Macintoshes 
in mind. For that reason, MacLucasUNIX runs very well on PPC-based 
Unix systems, but less efficiently on Alpha-based systems. It looks 
to me like either the Sun processors are very poor performers given 
their register width & CPU speed, or there is something in the 
pipelining and/or cache organization of Sun CPUs which is upset by 
the code in MacLucasUNIX. (Mind you, fft and mersenne1/2 don't run 
any better on the Sun platform.)

Even within a particular family of processors, changes in cache 
structure & pipelining can make a big difference to the effectiveness 
of hand-optimized code.

The work involved in hand optimizing for a particular CPU
architecture would be considerable & would benefit relatively few 
users. I'm not saying it shouldn't be done, but there may be more 
effective ways to use programming effort.

Like, for instance, implementing non-power-of-2 FFTs in 
MacLucasUNIX - throughput when testing exponents around 1.4M would be 
much better if we could use an FFT size of 80K, like Prime95, instead 
of being forced up to 128K because 64K isn't enough.
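
A rough sketch of the size selection, in Python, to show where the 
saving comes from. The limit of 19 bits of the exponent per FFT 
element is an assumption I picked so that the numbers above come out 
(64K too small, 80K enough for 1.4M); the real limit depends on the 
program and on floating-point precision. The function names and the 
set of multipliers (1, 3, 5, 7) are likewise hypothetical.

```python
MAX_BITS_PER_ELEMENT = 19.0   # assumed limit, for illustration only


def candidate_lengths(max_len, multipliers):
    """All lengths m * 2**k up to max_len, for m in multipliers, sorted."""
    lengths = set()
    for m in multipliers:
        n = m
        while n <= max_len:
            lengths.add(n)
            n *= 2
    return sorted(lengths)


def smallest_fft_length(exponent, multipliers,
                        max_bits=MAX_BITS_PER_ELEMENT, max_len=1 << 24):
    """Smallest candidate length that keeps exponent/length within max_bits."""
    for n in candidate_lengths(max_len, multipliers):
        if exponent / n <= max_bits:
            return n
    raise ValueError("no candidate length is large enough")


if __name__ == "__main__":
    p = 1_400_000
    pow2 = smallest_fft_length(p, (1,))            # powers of 2 only
    mixed = smallest_fft_length(p, (1, 3, 5, 7))   # also 3, 5, 7 times a power of 2
    print(pow2 // 1024, "K versus", mixed // 1024, "K")
```

Under that assumed limit the power-of-2 search lands on 128K and the 
mixed search on 80K (5 x 16K), so each iteration works on well under 
two thirds as many elements.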

This would be a distinctly non-trivial project, but at least it would 
help all non-Intel implementations, not just those users with a 
particular CPU architecture.

Brian Beesley
