Matt,

So that means the run on 10 processes is merely 1.8 times faster than the run on 1 process? This is quite difficult to digest!

Okay, so if memory bandwidth is the controlling factor here, how will forming a cluster of the same machines solve this problem? My CPU has a maximum memory bandwidth of 59 GB/s.
Apologies if the questions are too silly!

*Siddhesh M Godbole*

5th year Dual Degree,
Civil Eng & Applied Mech.
IIT Madras

On Thu, Apr 23, 2015 at 4:22 PM, Matthew Knepley <[email protected]> wrote:

> On Thu, Apr 23, 2015 at 5:47 AM, siddhesh godbole <
> [email protected]> wrote:
>
>> Hello,
>>
>> I want to know about the test which is conducted just after the PETSC is
>> configured on the system to assess the possible speedup by MPI processes. I
>> have saved the result file which says:
>> *Number of MPI processes 10*
>> *Process 0 iitm*
>> *Process 1 iitm*
>> *Process 2 iitm*
>> *Process 3 iitm*
>> *Process 4 iitm*
>> *Process 5 iitm*
>> *Process 6 iitm*
>> *Process 7 iitm*
>> *Process 8 iitm*
>> *Process 9 iitm*
>> *Function Rate (MB/s) *
>> *Copy: 24186.8271*
>> *Scale: 23914.0401*
>> *Add: 27271.7149*
>> *Triad: 27787.1630*
>> *------------------------------------------------*
>> *np speedup*
>> *1 1.0*
>> *2 1.75*
>> *3 1.86*
>> *4 1.84*
>> *5 1.85*
>> *6 1.83*
>> *7 1.76*
>> *8 1.79*
>> *9 1.8*
>> *10 1.8*
>> *Estimation of possible speedup of MPI programs based on Streams
>> benchmark.*
>>
>> 1) What parameters the speedup depends on?
>>
>
> I am not sure what you are asking here. Speedup is defined as the time T
> on 1 process divided by the time T_p on p processes:
>
>   S = T/T_p
>
>
>> 2) what are the hardware requirements for higher speedup? ( i was
>> expecting atleast 5 times speedup after generating 10 processes.
>>
>
> STREAMS measures the speedup of vectors operations, which are very similar
> to sparse matrix operations. Both are limited by memory bandwidth.
>
>
>> 3) what could possibly be done to improve this ?
>>
>
>   1) You could buy more nodes, since each node has a path to memory
>
>   2) You could change algorithms, but this has proven very difficult
>
>   Thanks,
>
>     Matt
>
>
>> i have intel® Core™ i7-4930K CPU @ 3.40GHz × 12 with 32 GB of RAM and 1
>> TB disk space.
>>
>>
>> Thanks
>> *Siddhesh M Godbole*
>>
>> 5th year Dual Degree,
>> Civil Eng & Applied Mech.
>> IIT Madras
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
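
For reference, the arithmetic behind the numbers in this thread can be sketched in a few lines of Python. This is only an illustration, not part of the PETSc streams test: the function names are made up here, the table and Triad rate are copied from the output quoted above, the 59 GB/s peak is the figure given in the reply, 1 GB is taken as 1000 MB, and the multi-node estimate assumes each node contributes its full measured bandwidth with no interconnect cost.

```python
# Speedup table copied from the quoted benchmark output (np -> speedup S = T/T_p).
speedup = {1: 1.0, 2: 1.75, 3: 1.86, 4: 1.84, 5: 1.85,
           6: 1.83, 7: 1.76, 8: 1.79, 9: 1.8, 10: 1.8}

def efficiency(table):
    """Parallel efficiency E_p = S_p / p for each process count p."""
    return {p: s / p for p, s in table.items()}

def bandwidth_fraction(measured_mb_s, peak_gb_s):
    """Measured rate as a fraction of peak (assumes 1 GB = 1000 MB)."""
    return measured_mb_s / (peak_gb_s * 1000.0)

def ideal_aggregate_mb_s(nodes, per_node_mb_s):
    """Hypothetical best case: every node adds its own path to memory,
    and communication between nodes is ignored."""
    return nodes * per_node_mb_s

triad_mb_s = 27787.1630  # Triad rate from the quoted output
peak_gb_s = 59.0         # peak memory bandwidth quoted in the reply

eff = efficiency(speedup)
frac = bandwidth_fraction(triad_mb_s, peak_gb_s)

for p in sorted(speedup):
    print(f"np={p:2d}  speedup={speedup[p]:.2f}  efficiency={eff[p]:.2f}")
print(f"Triad reaches {frac:.0%} of the quoted peak bandwidth")
print(f"Ideal Triad rate on 4 such nodes: {ideal_aggregate_mb_s(4, triad_mb_s):.0f} MB/s")
```

The speedup stops growing after 2 to 3 processes, which is what one would expect once the single node's memory bandwidth is saturated. Under the idealized assumption above, a cluster helps because each added node brings its own memory bandwidth, not because it brings more cores.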
