Sure, I will try to probe a bit more. Will keep you posted.

Manav
On Apr 4, 2013, at 6:03 PM, Roy Stogner <[email protected]> wrote:

>
> On Thu, 4 Apr 2013, Manav Bhatia wrote:
>
>> Attached are outputs (with profiling data at the end) from a run
>> with 288 processors. The case with Parallel mesh is taking about 10
>> times as long to run.
>
> Thanks for this!
>
> Two things jump out at me:
>
> Ben, why is MeshCommunication::find_global_indices() being run
> O(N_proc) times in the Parmetis setup? On big runs that's sure to
> leave most processors twiddling their thumbs most of the time.
>
> Manav, your FEMSystem assembly of residuals is 10-15% slower in the
> ParallelMesh case (which shouldn't happen, but might just be
> measurement noise), but your FEMSystem assembly of jacobians is 6500%
> slower, which *definitely* shouldn't happen and isn't measurement
> noise. The only thing in FEMSystem::assembly that should be slower in
> the ParallelMesh case is the traversal of the mapvector container...
> but that exact same traversal is undertaken in both residual and
> Jacobian cases; it shouldn't add a fraction of a second in one case
> and two minutes in another. So I'm stumped. Would you litter
> fem_system.C with some more START_LOG/STOP_LOG calls and see if you
> can pinpoint where the Jacobians' slowdown is happening?
> ---
> Roy

_______________________________________________
Libmesh-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-users
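For anyone following the thread, the idea behind the extra START_LOG/STOP_LOG calls can be sketched as below. This is only an illustration and does not use libMesh: `SectionTimer`, `assemble_sketch`, and the section labels are hypothetical names made up for the example, and the real macros record into libMesh's own performance log rather than into a class like this one. The point is simply to bracket each stage of the element loop so that the residual and Jacobian paths can be compared stage by stage.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for a performance logger; NOT libMesh's PerfLog.
class SectionTimer {
public:
  using Clock = std::chrono::steady_clock;

  // Mark the start of a named section (similar in spirit to START_LOG).
  void start(const std::string& label) {
    assert(open_.find(label) == open_.end() && "section already started");
    open_[label] = Clock::now();
  }

  // Mark the end of a named section and accumulate its elapsed time
  // (similar in spirit to STOP_LOG).
  void stop(const std::string& label) {
    auto it = open_.find(label);
    assert(it != open_.end() && "stop() without matching start()");
    std::chrono::duration<double> dt = Clock::now() - it->second;
    totals_[label] += dt.count();
    ++calls_[label];
    open_.erase(it);
  }

  // Number of completed start/stop pairs for a label (0 if never used).
  int calls(const std::string& label) const {
    auto it = calls_.find(label);
    return it == calls_.end() ? 0 : it->second;
  }

  // Accumulated wall time in seconds for a label (0 if never used).
  double total_seconds(const std::string& label) const {
    auto it = totals_.find(label);
    return it == totals_.end() ? 0.0 : it->second;
  }

  // Print one line per section: label, call count, total seconds.
  void report(std::ostream& out) const {
    for (const auto& kv : totals_)
      out << kv.first << ": " << calls(kv.first) << " calls, "
          << kv.second << " s\n";
  }

private:
  std::map<std::string, Clock::time_point> open_;
  std::map<std::string, double> totals_;
  std::map<std::string, int> calls_;
};

// Hypothetical element loop with dummy arithmetic in place of real work.
// Each stage is bracketed separately so a slowdown shows up under the
// label of the stage that causes it.
double assemble_sketch(SectionTimer& timer, int n_elem, bool want_jacobian) {
  double checksum = 0.0;
  for (int e = 0; e < n_elem; ++e) {
    timer.start("element residual");
    checksum += 0.5 * e;
    timer.stop("element residual");

    if (want_jacobian) {
      timer.start("element jacobian");
      checksum += 2.0 * e;
      timer.stop("element jacobian");

      timer.start("matrix insertion");
      checksum += 1.0;
      timer.stop("matrix insertion");
    }
  }
  return checksum;
}
```

Running `assemble_sketch` once with `want_jacobian` set to false and once with it set to true, then calling `report(std::cout)` on each timer, gives a per-stage breakdown; whichever label dominates in the slow case is the place to add finer-grained brackets next.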
