Oops, wrong list. Never mind this.

2010/9/21 Leo van Kampenhout <lvankampenhout at gmail.com>
> Dear all,
>
> in order to calculate speedup (Sp = T1/Tp) I need an accurate measurement
> of T1, the time to solve on 1 processor. I will be using the parallel
> algorithm for that, but there seems to be a hiccup.
>
> On the cluster I am currently working on, each node is made up of 12 PEs
> that share memory. If I reserve just 1 PE for my job, the other 11
> processors are given to other users, which puts a dynamic load on the
> memory system and results in inaccurate timings. The solve times I get
> range between 1 and 5 minutes. For me, this is not very scientific either.
>
> The second idea was to reserve all 12 PEs on the node and let just 1 PE
> run the job. However, this way the single CPU gets all the memory
> bandwidth and has no waiting time, which gives very fast results. If I
> calculate speedup from this baseline, the algorithm does not appear to
> scale very well.
>
> Another idea would be to spawn 12 identical jobs on 12 PEs and take the
> average runtime. Unfortunately, there is only one PETSC_COMM_WORLD, so I
> think this is impossible to do from within one program (MPI_COMM_WORLD).
>
> Do you fellow PETSc users have any ideas on the subject? It would be much
> appreciated.
>
> regards,
>
> Leo van Kampenhout
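For reference, the arithmetic behind the third idea (average the runtime of several identical single-process runs, then use that average as T1 in Sp = T1/Tp) can be sketched as follows. This is only an illustration: the function name `speedup`, the variable names, and all timing values are hypothetical placeholders, not measurements from the cluster.

```python
from statistics import mean


def speedup(t1, tp):
    """Return the speedup Sp = T1 / Tp for positive timings in seconds."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("timings must be positive")
    return t1 / tp


# Hypothetical placeholder timings (seconds) of identical 1-process runs
# that were executed at the same time on one node. Not real measurements.
t1_runs = [100.0, 120.0, 110.0]

# Use the average of the identical runs as the baseline T1.
t1 = mean(t1_runs)

# Hypothetical placeholder timing (seconds) of the same solve on p processes.
tp = 11.0

print(f"T1 = {t1:.1f} s, Tp = {tp:.1f} s, Sp = {speedup(t1, tp):.2f}")
```

Averaging over runs that share the node keeps the memory system loaded in roughly the same way as in the parallel run, which is the point of the third idea; it does not by itself address how to launch those runs from one program.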
