On Feb 24, 2012, at 2:43 PM, Francis Poulin wrote:
> Hello,
>
> I wanted to thank everyone for the help and say that I managed to get it
> running on the cluster and I have done some efficiency calculations. To run
> the code I used,
>
> mpirun -np # ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine 6
> -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson -mg_levels_pc_type
> sor -log_summary
>
> as suggested and found the following
>
> p (#cpu)   Tp (parallel)   T1 (serial)   Efficiency [ = T1/(p*Tp) ]
> ----------------------------------------------------------------------
>  1         904             904           1.00
>  2         553             904           0.82
>  4         274             904           0.83
>  8         138             904           0.82
> 16          70             904           0.81
> 32          36             904           0.78
>
> It seems to scale beautifully starting at 2, but there is a big drop from 1
> to 2. I suspect there's a very good reason for this; I just don't know what.
You need to understand the issues of memory bandwidth shared between cores
and between processors, as well as memory affinity and thread affinity
("binding"); see
http://www.mcs.anl.gov/petsc/documentation/faq.html#computers
Barry
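
For reference, the efficiency column in the quoted table can be recomputed
with a few lines of Python. This is only a sketch: the timings are copied
from the table as posted, and the names `timings` and `efficiency` are
illustrative, not part of PETSc. It also computes efficiency relative to the
2-process run, which is one way to quantify "scales beautifully starting
at 2".

```python
# Run times as reported in the quoted table, keyed by number of processes.
timings = {1: 904, 2: 553, 4: 274, 8: 138, 16: 70, 32: 36}


def efficiency(timings, base=1):
    """Parallel efficiency relative to the run with `base` processes.

    E(p) = (base * T_base) / (p * T_p); with base=1 this is T1 / (p * Tp).
    Runs with fewer than `base` processes are skipped.
    """
    t_base = timings[base]
    return {
        p: (base * t_base) / (p * t)
        for p, t in sorted(timings.items())
        if p >= base
    }


rel1 = efficiency(timings, base=1)  # relative to the serial run
rel2 = efficiency(timings, base=2)  # relative to the 2-process run

print(f"{'p':>3} {'Tp':>5} {'E vs 1':>7} {'E vs 2':>7}")
for p, t in sorted(timings.items()):
    e2 = f"{rel2[p]:.2f}" if p in rel2 else "-"
    print(f"{p:>3} {t:>5} {rel1[p]:>7.2f} {e2:>7}")
```

Measured against the 2-process run, efficiency stays at roughly 0.96 or
higher all the way to 32 processes, so essentially all of the loss occurs
in the step from 1 to 2 processes.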
>
> Thanks again for all of your help,
> Francis
>
> On 2012-02-24, at 12:59 PM, Jed Brown wrote:
>
>> On Fri, Feb 24, 2012 at 11:49, Francis Poulin <fpoulin at uwaterloo.ca>
>> wrote:
>> I don't seem to have the hydra in my bin folder but I do have petscmpiexec
>> that I've been using.
>>
>> That's just the name of my mpiexec. Use whichever one is used with your
>> build of PETSc.
>>
>>
>>>
>>> Use these, it will run the same method and sizes as the options I gave for
>>> ex22 before.
>>>
>>> mpiexec.hydra -n 2 ./ex45 -da_grid_x 5 -da_grid_y 5 -da_grid_z 5 -da_refine
>>> 5 -ksp_monitor -pc_type mg -mg_levels_ksp_type richardson
>>> -mg_levels_pc_type sor -log_summary
>>>
>>
>> Also, I want to install PETSc on an SGI machine that I have access to. I
>> have been told that using MPT would give better performance compared to
>> mpich2. When I configure PETSc on this server I don't suppose
>>
>> -with-mpi-dir=/opt/sgi/mpt
>>
>> the above would work because of the different name. Do you have a
>> suggestion as to what I could try?
>>
>> It's just the way you launch parallel jobs.
>