Hi Liam,

On 07/12/2016 07:46, Liam Doult wrote:
> I am executing PyFR 1.5.0 on dual xeon 2630v4 10c per node with 2 nodes.
> As I scale the mpirun -np up to 40 the performance is drastically reduced.
>
> It also appears to be capped at 50% of the maximum cpu performance.
>
> Any insight is appreciated

The performance and scalability of PyFR are heavily determined by the type of problem you are running. If the mesh has relatively few elements, it is not surprising that performance begins to regress as the number of ranks is increased. Additionally, for some cases PyFR can be bound by memory bandwidth, so once you have enough ranks inside a node to saturate the memory bus, no improvement should be expected from adding further ranks.

Further, you do not state which compiler/BLAS library you are using or what the interconnect between the nodes is. Again, if this is gigabit Ethernet then poor scalability is not unexpected.

Moreover, PyFR is a hybrid MPI/OpenMP code, so you are almost always better off with one MPI rank per NUMA zone.

Regards,
Freddie.
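
To put numbers on the one-rank-per-NUMA-zone advice for the hardware described in the question, here is a minimal sketch. It assumes each socket is a single NUMA zone, which has not been confirmed for these nodes, and the function name `hybrid_layout` is purely illustrative rather than anything provided by PyFR.

```python
def hybrid_layout(nodes, sockets_per_node, cores_per_socket):
    """Suggest a hybrid MPI/OpenMP layout with one MPI rank per NUMA zone.

    Assumption: one NUMA zone per socket. Check the real topology on the
    nodes before relying on these numbers.
    """
    ranks = nodes * sockets_per_node      # one MPI rank per socket
    threads_per_rank = cores_per_socket   # OpenMP threads fill the socket
    return ranks, threads_per_rank


# 2 nodes, each with two 10-core Xeons, as in the question
ranks, threads = hybrid_layout(nodes=2, sockets_per_node=2, cores_per_socket=10)
print(f"MPI ranks: {ranks}, OMP_NUM_THREADS per rank: {threads}")
```

Under that assumption the job would use 4 ranks with 10 OpenMP threads each, rather than 40 single-threaded ranks. How the ranks are bound to sockets and how OMP_NUM_THREADS is exported depends on the MPI launcher in use, so that part is left out here.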
