On Mon, 18 Mar 2019, Boris Boutkov wrote:

> Out of some curiosity I recently rebased my GMG implementation onto the
> upcoming NBX changes in PR #1965 to do some weak scaling analysis.
>
> I ran an np 256 Poisson problem using GMG with ~10k dofs/proc, and in short,
> it seems like the NBX changes provide some solid improvement, bringing my
> total runtime from something like ~19s to ~16s, so a nice step on the road
> to weak GMG scaling. Pre-NBX I had a good chunk (30% total w/sub) of my
> time being spent in the alltoall(), which post-NBX is down to 2%! This came
> with a fairly large number of calls to possible_receive() and now 15% of my
> total time being spent in there, but the overall timing seems to be a win, so
> thanks much for this work!

Thanks for the update! Greedy question: could you try the same timings at,
say, np 16? I was pretty confident np 256 would be a big win, since the
asymptotic scaling is improved, but it'd be nice to have data points at lower
processor counts too.

> Despite these improvements, the weak scaling for the GMG implementation is
> still a bit lacking unfortunately, as np1=~1s. I ran these tests through gperf
> in order to gain some more insight, and it looks to me that the major
> components slowing down the setup time are still
> refining/coarsening/distributing_dofs, which in turn do a lot of nodal
> parallel consistency adjusting and setting nonlocal_dof_objects, and I am
> wondering if there is maybe some low-hanging fruit to improve on around
> those calls.

There almost certainly is. Could I get comparable results from your new
fem_system_ex1 settings (with more coarse refinements, I mean) to test with?
---
Roy

_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel