On Thu, 4 Apr 2013, Manav Bhatia wrote:

> Attached are outputs (with profiling data at the end) from a run
> with 288 processors. The case with Parallel mesh is taking about 10
> times as long to run.

Thanks for this!

Two things jump out at me:

Ben, why is MeshCommunication::find_global_indices() being run
O(N_proc) times in the Parmetis setup?  On big runs that's sure to
leave most processors twiddling their thumbs most of the time.

Manav, your FEMSystem assembly of residuals is 10-15% slower in the
ParallelMesh case (which shouldn't happen, but might just be
measurement noise), but your FEMSystem assembly of Jacobians is 6500%
slower, which *definitely* shouldn't happen and isn't measurement
noise.  The only thing in FEMSystem::assembly that should be slower in
the ParallelMesh case is the traversal of the mapvector container...
but that exact same traversal is undertaken in both residual and
Jacobian cases; it shouldn't add a fraction of a second in one case
and two minutes in another.  So I'm stumped.  Would you litter
fem_system.C with some more START_LOG/STOP_LOG calls and see if you
can pinpoint where the Jacobians' slowdown is happening?
---
Roy
_______________________________________________
Libmesh-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-users
