On Wed, 11 Mar 2009, Tim Kroeger wrote:

> It happens in System::project_vector() for the case of a ghosted vector. 
> This calls DofMap::enforce_constraints_exactly(), which in line 680 of 
> dof_map_constraints.C calls NumericVector::operator().  It hits the 
> libmesh_assert() near the end of PetscVector::map_global_to_local_index(), 
> i.e. the index supplied to operator() is neither a local nor a ghost index.

Hmm... The index is pulled from a constraint-equation-expanded
local_dof_indices.  The in-element local indices should certainly be
local or ghost indices, but is it possible that we're not inserting
constraint equation dependencies into DofMap::_send_list?  That would
explain why the problem is so hard to replicate: even when doing a lot
of adaptation, it would be common for all nonlocal constraint equation
dependencies to end up in the send_list anyway, because the DofMap
finds them in immediately neighboring elements.
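
To make the suspected failure mode concrete, here is a minimal sketch
using a toy model rather than libMesh's real classes. ToyDofMap,
missing_dependencies() and add_constraint_dependencies_to_send_list()
are hypothetical names invented for this illustration; the real DofMap
internals may be organized quite differently. The idea is only that a
ghosted vector can resolve an index if it is locally owned or listed in
the send list, so any constraint dependency that is neither would trip
the kind of assertion described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Toy stand-ins for illustration only; these are NOT libMesh types.
using DofId = std::size_t;
using ConstraintRow = std::map<DofId, double>;        // dependency dof -> weight
using ConstraintMap = std::map<DofId, ConstraintRow>; // constrained dof -> row

struct ToyDofMap
{
  DofId first_local;             // this processor owns [first_local, end_local)
  DofId end_local;
  std::vector<DofId> send_list;  // ghost indices; kept sorted and unique
  ConstraintMap constraints;     // constraint rows known to this processor

  bool is_local(DofId i) const { return i >= first_local && i < end_local; }

  bool is_ghost(DofId i) const
  {
    return std::binary_search(send_list.begin(), send_list.end(), i);
  }

  // Dependency dofs that a ghosted vector built from send_list could not
  // resolve: neither locally owned nor ghosted.
  std::vector<DofId> missing_dependencies() const
  {
    std::vector<DofId> missing;
    for (const auto & constraint : constraints)
      for (const auto & dependency : constraint.second)
        if (!is_local(dependency.first) && !is_ghost(dependency.first))
          missing.push_back(dependency.first);
    std::sort(missing.begin(), missing.end());
    missing.erase(std::unique(missing.begin(), missing.end()), missing.end());
    return missing;
  }

  // The "possibly redundant" fix: append every unresolved dependency,
  // then restore the sorted, unique invariant of the send list.
  void add_constraint_dependencies_to_send_list()
  {
    const std::vector<DofId> missing = missing_dependencies();
    send_list.insert(send_list.end(), missing.begin(), missing.end());
    std::sort(send_list.begin(), send_list.end());
    send_list.erase(std::unique(send_list.begin(), send_list.end()),
                    send_list.end());
  }
};
```

In this toy model, a processor owning dofs 10 through 19 with a send
list of {5, 25} and constraint rows depending on dofs 25 and 31 would
report dof 31 as missing until the dependencies are added. If
neighboring elements had already put 31 in the send list, nothing would
be reported, which is consistent with the bug being hard to trigger.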

> There are 7 systems in the application (all on the same grid).  Two 
> ExplicitSystem's and two LinearImplicitSystem's are projected successfully, 
> but the first TransientLinearImplicitSystem triggers the crash, when 
> projecting its _transient_old_local_solution.  That system contains two 
> variables, both of the same FE-type.

Could you list the variables and FE types in each of the systems?  One
problem with my theory above (other than that I'll have to pore
through all of dof_map*.C to check it...) is that I'd have expected
the crash to occur earlier.

> To reproduce it more easily, I wrote the grid to an .xdr file directly before 
> the call to MeshRefinement::refine_and_coarsen_elements(), plus a file 
> containing the refinement flags of all active elements (stored in the native 
> order, i.e. the order that the iterator steps through the elements).  Then, I 
> wrote a test program that reads in the grid, initializes a 
> TransientLinearImplicitSystem with two variables on that mesh, reads the 
> refinement flags, and calls MeshRefinement::refine_and_coarsen_elements() and 
> EquationSystems::reinit().  I ran this program on the same number of 
> processors as the main application (that is 8) -- but that program does *not* 
> crash.  I think some very odd things must be going on here. Do you have any 
> idea how to track this down further?

Well, there's the hard way:  Compile the crashing test case in devel
mode and run with -start_in_debugger (this will be easier if you can
replicate the problem on fewer than 8 CPUs).  Then pore through the
data structures at the crash by hand to find out what DofObject the
bad index is coming from.

But let's try the either-easy-or-futile way first: I'll write a
(possibly redundant) patch to make sure we're properly getting
constraint dependency dofs into the send list, and you can try running
with that.  We can at least verify or rule out my first guess before
we need more information to come up with a second guess.
---
Roy
