Dear all,
I want to analyze the strong scaling of our in-house FEM code.
The test problem has about 20M DoFs. I ran it with various
combinations of node count and cores per node. The speedups of the
assembly and solving procedures, relative to the single-core run,
are as follows:
NProcessors  NNodes  CoresPerNode   Assembly     Solving
          1       1             1   1.0          1.0
          2       1             2   1.995246     1.898756
          2       2             1   2.121401     2.436149
          4       1             4   4.658187     6.004539
          4       2             2   4.666667     5.942085
          4       4             1   4.65272      6.101214
          8       2             4   9.380985     16.581135
          8       4             2   9.308575     17.258891
          8       8             1   9.314449     17.380612
         16       2             8   18.575953    34.483058
         16       4             4   18.745129    34.854409
         16       8             2   18.828393    36.45509
         32       4             8   37.140626    70.175879
         32       8             4   37.166421    71.533865
I don't quite understand this result. Why can we achieve a speedup of
about 70 in the solving step using only 32 processors? Could you please
help me understand this?
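
To make the question concrete, here is the parallel efficiency (speedup
divided by NProcessors) computed from the table above. This is only a
minimal Python sketch: the variable and function names are my own, and
the numbers are copied from the table.

```python
# Rows: (NProcessors, NNodes, CoresPerNode, assembly speedup, solving speedup),
# copied from the table in this post.
rows = [
    (1, 1, 1, 1.0, 1.0),
    (2, 1, 2, 1.995246, 1.898756),
    (2, 2, 1, 2.121401, 2.436149),
    (4, 1, 4, 4.658187, 6.004539),
    (4, 2, 2, 4.666667, 5.942085),
    (4, 4, 1, 4.65272, 6.101214),
    (8, 2, 4, 9.380985, 16.581135),
    (8, 4, 2, 9.308575, 17.258891),
    (8, 8, 1, 9.314449, 17.380612),
    (16, 2, 8, 18.575953, 34.483058),
    (16, 4, 4, 18.745129, 34.854409),
    (16, 8, 2, 18.828393, 36.45509),
    (32, 4, 8, 37.140626, 70.175879),
    (32, 8, 4, 37.166421, 71.533865),
]


def efficiency(speedup, nprocs):
    """Parallel efficiency: measured speedup divided by the process count."""
    return speedup / nprocs


# Print one line per run; an efficiency above 1.0 means superlinear speedup.
print("NProc NNodes Cores  EffAssembly  EffSolving")
for nprocs, nnodes, cores, asm, sol in rows:
    print(f"{nprocs:5d} {nnodes:6d} {cores:5d}  "
          f"{efficiency(asm, nprocs):11.3f}  {efficiency(sol, nprocs):10.3f}")
```

Any efficiency above 1.0 is superlinear. For the 32-process runs the
solving efficiency comes out above 2, which is the part I cannot explain.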
Thank you in advance.
Best,
Ce