Dual CPU usage (was: Re: Mersenne: Best chance to make a "real" contribution?)

Brian J. Beesley Sun, 23 Jan 2000 03:25:04 -0800
On 22 Jan 00, at 15:44, St. Dee wrote:

> This brings up something I've been wondering about.  I have a dual Celeron
> setup running 2 instances of mprime under Linux.  With both processors
> crunching on LL tests, I get iteration times for each processor of around
> .263 for exponents around 8990000 (where they are presently cranking).
> However, if one of them factors while the other does LL testing, the
> processor doing the LL testing takes about .220 seconds per iteration,
> while the one factoring also shows a factoring speed more consistent with
> the speed I would expect for the processor speed.

The "problem" here is that the two processors are sharing access to 
the memory bus. Trial factoring makes very few memory accesses (it 
runs more or less from the L1 cache!) whereas LL testing really does 
thrash the memory subsystem (unless you have a Xeon with 2MB L2 
cache, and even then only if you're testing an exponent less than 
10,320,000).

I presume you have a BX chipset and PC100 memory. Even so, asking for 
twice the 66 MHz bandwidth of each Celeron CPU is going to congest 
the memory subsystem, which explains the slowdown.
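
As a rough illustration of that arithmetic, here is a minimal Python 
sketch. It assumes a 64-bit (8 byte) data path on both the CPU bus and 
the SDRAM, and it only compares theoretical peak figures; it is not a 
model of how the chipset actually arbitrates the bus. The function and 
variable names are made up for the example.

```python
# Peak-bandwidth back-of-envelope figures (theoretical maxima only).
# ASSUMPTION: both the CPU bus and the SDRAM bus are 64 bits (8 bytes) wide.
BUS_WIDTH_BYTES = 8


def peak_mb_per_s(clock_mhz, width_bytes=BUS_WIDTH_BYTES):
    """Peak transfer rate in MB/s for a bus moving one word per clock."""
    return clock_mhz * width_bytes


# Two Celerons, each able to ask for a full 66 MHz bus worth of data.
cpu_demand = 2 * peak_mb_per_s(66)

# PC100 SDRAM behind the BX chipset.
memory_supply = peak_mb_per_s(100)

# A ratio above 1.0 means the CPUs can ask for more than memory can deliver.
oversubscription = cpu_demand / memory_supply

print(f"combined CPU demand : {cpu_demand} MB/s")
print(f"PC100 peak supply   : {memory_supply} MB/s")
print(f"oversubscription    : {oversubscription:.2f}x")
```

Under these assumptions the two CPUs together can request more than 
PC100 memory can supply at peak, which is consistent with both LL 
tests slowing down when run side by side.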

However, the problem is not unique to dual Celeron systems. Unless 
you have a multi-processor system with a separate memory bus for each 
processor - and some means of intercommunication e.g. multiporting - 
the same problem is going to strike. To the best of my knowledge, 
only very expensive server motherboards use this technology. (Compaq 
did have a dual CPU workstation MB based on this technology but they 
seem to have dropped it when 100 MHz systems came in.)

For systems using large clock multipliers, increasing the throughput 
of the memory bus will help mprime/Prime95 performance substantially 
- even for uniprocessor systems. In my experience, systems running 
the VIA chipset using PC133 memory are very disappointing - my 
PIII-533 is turning in times very similar to the PII-400 benchmarks. 
I get the impression that there is a 100 MHz bottleneck inside the 
chipset, even when CPU and memory buses are both set to 133 MHz. 
Alternatively, the board may actually be driving the CPU at 100 MHz 
irrespective of the jumpers and setup.

Intel's Rambus technology looks interesting from this point of view - 
and boards which use it are starting to filter onto the market. A 
practical problem is that Rambus memory is extortionately expensive - 
at least 4 times the price of SDRAM, and you can't even get modules 
smaller than 64 MB. Intel recently released a version of some boards 
using the 820 chipset (designed for Rambus) which can take 133 MHz 
SDRAM, but there is a "gearbox" between the SDRAM and the chipset 
which reduces the memory bus throughput compared with the native 
Rambus version - though whether it's any worse than you'd expect 
projecting from the benchmarks based on a 100 MHz BX chipset is 
unknown.
> 
> My question is:  Do I get more "work" done by having both doing LL
> testing, or would this box contribute more to the effort by having one CPU
> performing factoring while the other does LL testing?

Personally I think you get better value by running 1 LL test & 1 
"something else" which doesn't make big claims on the memory bus. ECM 
factoring is quite light on the memory bus (though "larger" exponents 
i.e. above about 2,000 might have difficulty fitting well into the 
Celeron L2 cache, which is only 128K) and might make an interesting 
alternative to trial factoring.

There are a considerable number of exponents which have been 
double-checked but have not been trial-factored to the full depth 
suggested by v19. Running some of these would find a number of "first 
factors", though I don't think PrimeNet would credit you with the CPU 
time.

Factoring (of any kind) will not find primes; trial factoring larger 
exponents which will be LL tested later will eliminate some 
candidates, and be credited by PrimeNet (if you use the "automatic" 
method of reporting assignment results); other types of factoring 
would be done out of pure interest, to find new or first factors 
only.

Also, the PrimeNet rankings for LL testing & factoring are separate 
rather than cumulative. If you split your work, your LL testing 
effort will be reduced compared with what it would be if you ran LL 
tests on both CPUs. As a compensation, you would zoom up the 
factoring rankings!
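
To put a number on how much LL effort is given up, here is a small 
Python sketch using only the iteration times quoted above (0.263 s per 
iteration with both CPUs on LL tests, 0.220 s when the other CPU is 
factoring). The variable names are invented for the example, and the 
timings apply only to that machine and exponent range.

```python
# Iteration times quoted for the dual Celeron, exponents near 8,990,000.
T_BOTH_LL = 0.263  # seconds per LL iteration when both CPUs run LL tests
T_SPLIT_LL = 0.220  # seconds per LL iteration when the other CPU factors

# Combined LL iterations per second in each configuration.
both_ll_rate = 2 / T_BOTH_LL   # two CPUs, each slowed by bus contention
split_ll_rate = 1 / T_SPLIT_LL  # one CPU on LL, running at full speed

# How much slower each LL test runs when it has to share the memory bus.
contention_slowdown = T_BOTH_LL / T_SPLIT_LL

# Fraction of the two-LL throughput kept after moving one CPU to factoring.
ll_fraction_kept = split_ll_rate / both_ll_rate

print(f"both CPUs on LL : {both_ll_rate:.2f} iterations/s in total")
print(f"one CPU on LL   : {split_ll_rate:.2f} iterations/s")
print(f"per-test slowdown from sharing the bus: {contention_slowdown:.3f}x")
print(f"LL throughput kept when splitting     : {ll_fraction_kept:.1%}")
```

On these figures, splitting the work keeps roughly 60% of the LL 
throughput rather than the 50% you might expect, and the second CPU 
does a full CPU's worth of factoring on top of that.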

Regards
Brian Beesley
_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers