Guys,

I just committed a change to System::update() that I wanted to explain a bit.

First, what I did: if you have ghosted vectors enabled, it will now just use 
assignment to set current_local_solution, i.e. *current_local_solution = 
*solution.
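
To make the idea concrete, here is a minimal toy sketch (plain C++, no MPI, no PETSc, and not the actual libMesh code). It assumes a ghosted vector already knows which off-processor indices it ghosts, so assignment can refresh them without being handed an index list, while the older path is given an explicit list. GlobalVector, GhostedView, assign_from, localize_from and same_values are all hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy types for illustration only; these are not libMesh classes.

// The "global" parallel vector, flattened into one array for the sketch.
using GlobalVector = std::vector<double>;

// One rank's ghosted view: its owned range [first, last) plus ghost entries.
struct GhostedView {
  std::size_t first = 0, last = 0;       // owned index range [first, last)
  std::vector<double> owned;             // values for owned indices
  std::map<std::size_t, double> ghosts;  // off-processor index -> value

  GhostedView(std::size_t f, std::size_t l,
              const std::vector<std::size_t>& ghost_ids)
    : first(f), last(l), owned(l - f, 0.0) {
    for (std::size_t g : ghost_ids) ghosts[g] = 0.0;
  }

  // Stand-in for "*current_local_solution = *solution": the view already
  // knows its own ghost indices, so no index list is passed in.
  void assign_from(const GlobalVector& global) {
    for (std::size_t i = first; i < last; ++i) owned[i - first] = global[i];
    for (auto& g : ghosts) g.second = global[g.first];
  }

  // Stand-in for the older explicit path: the caller supplies the list of
  // indices to scatter into the local copy.
  void localize_from(const GlobalVector& global,
                     const std::vector<std::size_t>& send_list) {
    for (std::size_t i = first; i < last; ++i) owned[i - first] = global[i];
    for (std::size_t idx : send_list) {
      auto it = ghosts.find(idx);
      if (it != ghosts.end()) it->second = global[idx];
    }
  }

  // Read an owned or ghosted entry by global index.
  double operator()(std::size_t i) const {
    if (i >= first && i < last) return owned[i - first];
    return ghosts.at(i);
  }
};

// True when both views hold identical owned and ghost values.
bool same_values(const GhostedView& a, const GhostedView& b) {
  return a.owned == b.owned && a.ghosts == b.ghosts;
}
```

In this toy model both paths end up with the same local values as long as the send list covers the ghost indices. It says nothing about what actually happens inside PETSc.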

Second, why: we were getting segfaults on some of our larger runs recently 
(millions of DoFs on thousands of procs).  It took us a while to track it down, 
but ultimately core dumps led us to a stack consisting of 
System::update()->localize()->VecAssemblyBegin()!  What's weird is that this 
would just happen randomly during our simulation (it would happen sooner 
with more DoFs though: sometimes within the first timestep, sometimes after 
30 or 40!).

I'm still not entirely sure why, and I hope to do more investigation in the 
future, but I went with a hunch that we didn't need to do the VecScatter 
manually anymore with ghosted vectors, and it works well.  We were able to 
run up to ~240 million DoFs on 12,000 procs with no problems.  Debugging at 
this size is a chore, which is why I haven't investigated further, but for 
now I just wanted to get this fix back into libMesh in case anyone else is 
doing huge runs.

Just a hunch: I think that PETSc is giving us a vector that hasn't been 
properly closed as the solution vector.  Note that I had to call 
solution->close() before using the assignment operator; that's because PETSc 
complained that the solution vector wasn't in the right state!  So when we went 
to do the VecScatter, something was going wrong internally because we 
were trying to scatter a non-closed vector.  I don't quite have enough evidence 
for this yet though.
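
As a rough illustration of the close-before-copy pattern (again a toy, not libMesh or PETSc code): the sketch below assumes a vector that caches writes until close() is called and refuses to be copied from while it is still open. ToyVector, copy_from and this free-standing update() are hypothetical names, and the thrown exception merely stands in for whatever error the real library reports.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical toy type; it mimics the "assemble before use" idea only.
class ToyVector {
public:
  explicit ToyVector(std::size_t n) : values_(n, 0.0) {}

  // Writes are cached and leave the vector "open" until close() is called.
  void set(std::size_t i, double v) {
    pending_.emplace_back(i, v);
    closed_ = false;
  }

  // Flush cached writes; loosely analogous to an assembly begin/end pair.
  void close() {
    for (const auto& p : pending_) values_.at(p.first) = p.second;
    pending_.clear();
    closed_ = true;
  }

  bool closed() const { return closed_; }

  // Copy that refuses to read from a source with unflushed writes.
  void copy_from(const ToyVector& src) {
    if (!src.closed_) throw std::logic_error("source vector is not closed");
    values_ = src.values_;
    closed_ = true;
  }

  double operator()(std::size_t i) const { return values_.at(i); }

private:
  std::vector<double> values_;
  std::vector<std::pair<std::size_t, double>> pending_;
  bool closed_ = true;
};

// The pattern from the post: close the source first, then assign.
inline void update(ToyVector& solution, ToyVector& current_local_solution) {
  solution.close();
  current_local_solution.copy_from(solution);
}
```

In the toy, copying from an open vector fails loudly, and calling close() first makes the copy succeed with the flushed values. Whether the real segfault comes from the same kind of state problem is still just my hunch.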

I'm thinking about doing some timing to figure out which of these two methods 
is more efficient, but that will have to wait a bit as well.

I'd like to hear thoughts on this from everyone else.

Derek
_______________________________________________
Libmesh-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
