> >
> > Eduard Kostolansky writes:
> >
> > Oh, GOD! I think this will be the real death of us - UNIX
> > double-checkers.
> >
> > Possible, I suppose, but for people that only have UNIX boxes and want
> > to contribute ... Of course, there's always ECM factoring, NFSNet,
> > Ernst Mayer's F90 code, and others.
> >
> There is still a lot of work in double-checking, since only exponents
> between 1,500,000 and 1,700,000 are available on Primenet (and there are a
> lot of exponents below 1,500,000).
> And don't forget that Prime95 can have two different residues for some
> exponents, so we'll need a third LL-test.
And we still have a need for REAL independent checking. I believe
George's v17 code is useful but I find it hard to accept that
matching residues obtained with different offsets, but both from
Prime95 v17.x on Intel CPUs, are totally satisfactory. We still need to continue to
double (triple) check at least a proportion of exponents to be sure
that something nasty isn't slipping through. Like some weird &
wonderful flaw in the CPU chip, or a non-offset-dependent bug in
George's program.
Having said that, Prime95 v17 should help us to find at least most of
the errors that are in the current data - the estimate of 1 in 200
results being "bad" sounds moderately frightening to me - if nothing
else, Prime95 v17 should enable us to refine that estimate.
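For anyone unsure what the offsets buy us, here is a minimal sketch in
Python of a Lucas-Lehmer test run on a shifted value. It is only an
illustration of the idea as I understand it, not George's code: the name
ll_residue and its shift parameter are made up for the example, and it
uses Python's big integers rather than an FFT. The value held is
s * 2^k mod (2^p - 1), so each squaring doubles k (mod p) and the "-2"
has to be shifted to match.

```python
def ll_residue(p, shift=0):
    """Final Lucas-Lehmer residue for M_p = 2^p - 1, computed on a shifted value.

    Hypothetical helper for illustration only; assumes p is an odd prime.
    """
    m = (1 << p) - 1
    k = shift % p                 # current shift count
    x = (4 << k) % m              # s_0 = 4, stored as 4 * 2^k mod M_p
    for _ in range(p - 2):
        k = (2 * k) % p           # squaring doubles the shift (2^p == 1 mod M_p)
        x = (x * x - (1 << (k + 1))) % m   # subtract 2 * 2^k, i.e. the shifted "2"
    # Undo the shift: multiplying by 2^(p - k) is dividing by 2^k mod M_p.
    return (x << ((p - k) % p)) % m


# Different shifts must agree on the unshifted residue.
print(ll_residue(13), ll_residue(13, 5))                # both 0, since M_13 is prime
print(ll_residue(11, 0) == ll_residue(11, 3))           # same non-zero residue, M_11 is composite
```

Two runs with different shifts push different bit patterns through the
multiply, so a data-dependent error is unlikely to hit both runs the same
way. A bug that doesn't depend on the alignment of the data would still
hit both, which is exactly the worry above.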
> > Note for Will Edgington: Will, I don't wanna die!! It's time to
> > upgrade MacLucasUnix! Or is it impossible??
> >
> > I'm sure it's possible, but I certainly don't know enough to improve
> > the speed. And have not even had time to keep up with bug reports,
> > recently, though that should change over the next few weeks.
> >
> Can anybody take on the role of George Woltman for MacLucasUnix?
The main problem with MacLucasUNIX is that, whilst the code itself is
pretty portable, the optimization it needs depends heavily on the
hardware it's going to be run on. I believe that
it was written originally with a view to the way the PowerPC CPU used
in Macintoshes is organized. For that reason, MacLucasUNIX runs very
well on PPC-based Unix systems, but less efficiently on Alpha-based
systems. It looks to me like either the Sun processors are very poor
performers for their register width & CPU speed, or there is
something in the pipelining and/or cache organization in Sun CPUs
which is upset by the code in MacLucasUNIX. (Mind you, fft and
mersenne1/2 don't run any better on the Sun platform.)
Even within a particular family of processors, changes in cache
structure & pipelining can make a big difference to the effectiveness
of hand-optimized code.
The work involved in hand optimizing for a particular CPU
architecture would be considerable & would benefit relatively few
users. I'm not saying it shouldn't be done, but there may be more
effective ways to use programming effort.
Like, for instance, implementing non-power-of-2 FFTs in
MacLucasUNIX - throughput when testing exponents around 1.4M would be
much better if we could use an FFT size of 80K, like Prime95, instead
of being forced up to 128K because 64K isn't enough.
This would be a distinctly non-trivial project, but at least it would
help all non-Intel implementations, not just those users with a
particular CPU architecture.
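To put rough numbers on the FFT size point, here is a back-of-envelope
sketch in Python. The helper names are mine, and the N * log2(N) cost
model is only a crude assumption; real timings depend on the
implementation and the hardware.

```python
import math


def bits_per_word(p, n):
    """Average number of bits of the exponent-p residue held in each of n FFT words."""
    return p / n


def rough_cost(n):
    """Crude relative cost of a length-n FFT, assuming cost ~ n * log2(n)."""
    return n * math.log2(n)


p = 1_400_000
for label, n in (("64K", 64 * 1024), ("80K", 80 * 1024), ("128K", 128 * 1024)):
    print(f"{label:>4}: {bits_per_word(p, n):5.2f} bits/word, "
          f"cost relative to 128K = {rough_cost(n) / rough_cost(128 * 1024):.2f}")
```

At 1.4M the 128K transform carries under 11 bits per word, well below what
64K would have to carry, and on this crude model an 80K transform costs
roughly 60% of a 128K one.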
Brian Beesley