> As we can see, multiprocessor machines seems to be faster than ones
> with one CPU (at least if they use Glucas program, because AFAIK
> mprime is single threaded). I don't know how Glucas exactly works
> (is there huge amount of data transferred between processors in
> short time or not), but there might be a way to split it somehow
> with reasonable improvement.

I don't know exactly how GLucas does that either. I only
contributed fixes for some problems in the multi-threading.
In a multi-threaded application like GLucas, all threads share
the same memory, so sharing data is easy. On a CC-NUMA machine
like the Bull NovaScale, each block of 4 CPUs has its own local
block of memory. For a thread running on a block, accessing
local memory is (quite) fast, but accessing the memory of
another block is about 2 times slower. So allocating the memory
needed by each thread on the block where that thread runs would
improve the performance of GLucas. But that requires knowing in
detail which part of memory is used by each thread, and
modifying GLucas.
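
To make the local-allocation idea concrete, here is a minimal sketch
in C with POSIX threads. It is not GLucas code: the names
alloc_first_touch, touch_slice and NTHREADS are hypothetical, and it
assumes an OS that places a page on the node of the CPU that first
writes to it (a "first-touch" policy, the usual default on Linux).
Pinning each thread to a CPU is left out; without it the placement
is only a tendency.

```c
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 4   /* hypothetical: one thread per CPU of a block */

typedef struct {
    double *base;    /* start of the shared array */
    size_t  start;   /* first element owned by this thread */
    size_t  count;   /* number of elements owned by this thread */
} slice_t;

/* Each worker writes its own slice first, so under a first-touch
   policy the pages backing that slice tend to be placed close to
   the CPU that runs the worker. */
static void *touch_slice(void *arg)
{
    slice_t *s = arg;
    for (size_t i = 0; i < s->count; i++)
        s->base[s->start + i] = 0.0;
    return NULL;
}

/* Allocate n doubles and let NTHREADS workers initialise one slice
   each. Returns NULL on failure; the caller frees the result. */
double *alloc_first_touch(size_t n)
{
    pthread_t tid[NTHREADS];
    slice_t   sl[NTHREADS];
    int       started = 0;
    int       failed  = 0;

    /* The allocating thread must not write to the array itself,
       otherwise it would be the one doing the "first touch". */
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    size_t chunk = n / NTHREADS;
    for (int t = 0; t < NTHREADS; t++) {
        sl[t].base  = a;
        sl[t].start = (size_t)t * chunk;
        /* the last thread also takes the remainder */
        sl[t].count = (t == NTHREADS - 1) ? n - sl[t].start : chunk;
        if (pthread_create(&tid[t], NULL, touch_slice, &sl[t]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* Wait for every worker that was actually started. */
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);

    if (failed) {
        free(a);
        return NULL;
    }
    return a;
}
```

The hard part described above stays the same: the slices only help
if each thread later works mostly on the slice it initialised, and
that is exactly the knowledge about GLucas that is missing.
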
It could then be possible to run ONE exponent with GLucas on
several machines by using MPI; that just requires the previous
work to be done first. Running GLucas on several machines would
add another performance factor slowing GLucas down. But I guess
checking M42 on 8 machines with 16 ia64 processors each would
take only ONE day ...
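
As an illustration of the bookkeeping an MPI version would need,
the sketch below splits n elements into nearly equal contiguous
blocks, one per worker. It is plain C with no MPI calls, and the
name block_range is hypothetical; how GLucas would really have to
partition its data is not known here, so a simple block layout is
an assumption.

```c
#include <stddef.h>

/* Balanced block decomposition: split n elements over `parts`
   workers so that block sizes differ by at most one. Worker `idx`
   (0-based) owns the half-open range [*start, *start + *count). */
void block_range(size_t n, size_t parts, size_t idx,
                 size_t *start, size_t *count)
{
    size_t base  = n / parts;   /* minimum elements per worker */
    size_t extra = n % parts;   /* first `extra` workers get one more */

    *count = base + (idx < extra ? 1 : 0);
    *start = idx * base + (idx < extra ? idx : extra);
}
```

In an MPI program, idx and parts would typically come from
MPI_Comm_rank and MPI_Comm_size; with 8 machines of 16 processors
each, parts would be 128 if one process ran per processor.
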

Tony

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
