On Jan 23, 2013, at 9:41 PM, "Cody Permann" <[email protected]> wrote:




On Wed, Jan 23, 2013 at 11:05 AM, Kirk, Benjamin (JSC-EG311) 
<[email protected]> wrote:
Are these ghosted vectors?

Can't imagine how it could happen, but if the ghost indices are not symmetric 
where they should be you could have processor m waiting on a message from 
processor n that is not coming...

Yes, ghosted vectors.  Well, I guess that's somewhere to start looking.  I found 
the location of the branch down inside PETSc where the paths diverge 
(wait_all vs wait_any), but I admit I have no idea what's happening at that 
level. I haven't been able to get the code to hang on my local workstation with 
8-10 processor jobs, and sadly it runs for a long time before hanging on the 
cluster-sized jobs.

Also, I haven't tried a full debug build yet because of the size of the 
problem, but I'll put that on the "to do" list too.  If we're lucky, perhaps 
we'll hit an assert if we ever get there.  I'll keep you posted.

Any chance you can dump restart files and meshes along the way, and then 
restart prior to the problem in at least devel mode?

This does seem to me like a send_list problem, and as Roy mentioned we've seen 
similar behavior elsewhere.

A test for that would be to replace the localize with a full serialization of 
the vector, which is obviously a memory scalability problem but could be 
instructive if it fixes this problem.

If that is the case then it means we've somehow missed a proper predictive 
capability when building the send_list, so that'll take some thinking...
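
To make the symmetry condition concrete, here is a minimal standalone C++ 
sketch. It is not libMesh or PETSc code. It assumes each rank's planned sends 
and expected receives have already been gathered in one place (for example 
from debug output); CommPlan and find_unmatched_receives are hypothetical 
names made up for illustration. It reports every pair where rank m expects a 
message from rank n but n has no matching send planned.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical per-rank communication plan, gathered offline.
// Not a libMesh or PETSc type.
struct CommPlan {
  std::set<std::size_t> recv_from; // ranks this rank will wait on
  std::set<std::size_t> send_to;   // ranks this rank will send to
};

// Returns (waiting_rank, silent_rank) pairs: the first rank expects a
// message from the second, but the second does not plan to send one.
std::vector<std::pair<std::size_t, std::size_t>>
find_unmatched_receives(const std::vector<CommPlan>& plans)
{
  std::vector<std::pair<std::size_t, std::size_t>> unmatched;

  for (std::size_t m = 0; m < plans.size(); ++m)
    for (std::size_t n : plans[m].recv_from)
      {
        // A receive from a rank that does not exist can never be matched.
        const bool n_exists = n < plans.size();

        if (!n_exists || plans[n].send_to.count(m) == 0)
          unmatched.emplace_back(m, n);
      }

  return unmatched;
}
```

An empty result means every expected receive has a matching send. Any pair it 
returns is a candidate for the kind of wait that never completes.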

-Ben
