On Mon, 18 Mar 2019, Boris Boutkov wrote:

> Out of curiosity, I recently rebased my GMG implementation onto the 
> upcoming NBX changes in PR #1965 to do some weak scaling analysis.
>
> I ran an np 256 Poisson problem using GMG with ~10k dofs/proc, and in short,
> it seems like the NBX changes provide a solid improvement, bringing my
> total runtime from something like ~19s to ~16s, so a nice step on the road
> to weak GMG scaling.  Pre-NBX I had a good chunk (30% total w/sub) of my
> time being spent in the alltoall(), which post-NBX is down to 2%! This came
> with a fairly large number of calls to possible_receive(), which now accounts
> for 15% of my total time, but the overall timing seems to be a win, so
> thanks much for this work!

Thanks for the update!

Greedy question: could you try the same timings at, say, np 16?  I was
pretty confident np 256 would be a big win, since the asymptotic
scaling is improved, but it'd be nice to have data points at lower
processor counts too.

> Despite these improvements, the weak scaling for the GMG implementation is
> unfortunately still a bit lacking, as np1=~1s. I ran these tests through gperf
> to gain some more insight, and it looks to me like the major components
> slowing down the setup time are still refining/coarsening/distributing_dofs,
> which in turn do a lot of nodal parallel consistency adjusting and setting of
> nonlocal_dof_objects. I am wondering whether there is some low-hanging
> fruit to improve on around those calls.
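
For scale, here is a rough sketch of what the quoted timings imply for
weak-scaling efficiency, taken as T(np=1) / T(np=N) at fixed work per process.
The function and constant names are made up for illustration, and the inputs
are only the approximate totals quoted above (~1s, ~19s, ~16s), so treat the
result as a ballpark rather than a measurement.

```python
def weak_scaling_efficiency(t_base, t_scaled):
    """Runtime on one process divided by runtime on N processes, with the
    work per process held fixed. 1.0 would be ideal weak scaling."""
    if t_base <= 0 or t_scaled <= 0:
        raise ValueError("runtimes must be positive")
    return t_base / t_scaled


# Approximate total runtimes from the message above, in seconds.
# These are rounded figures, not precise measurements.
T_NP1 = 1.0
T_NP256_PRE_NBX = 19.0
T_NP256_POST_NBX = 16.0

eff_pre = weak_scaling_efficiency(T_NP1, T_NP256_PRE_NBX)
eff_post = weak_scaling_efficiency(T_NP1, T_NP256_POST_NBX)

print(f"pre-NBX  efficiency at np 256: {eff_pre:.3f}")
print(f"post-NBX efficiency at np 256: {eff_post:.3f}")
```

Both values come out well under 10%, which is consistent with the remark
that weak scaling is still lacking even after the NBX improvement.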

There almost certainly is.  Could I get comparable results from your
new fem_system_ex1 settings (with more coarse refinements, I mean) to
test with?
---
Roy
_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
