Changing that call to this->make_node_proc_ids_parallel_consistent(mesh)
did not work; I get the exact same error. Is this strange?


Miguel

________________________________
From: Roy Stogner <royst...@ices.utexas.edu>
Sent: Monday, March 19, 2018 12:28:26 PM
To: Salazar De Troya, Miguel
Cc: libmesh-users@lists.sourceforge.net
Subject: Re: [Libmesh-users] Assertion `min_id == node->processor_id()' failed.;


On Mon, 19 Mar 2018, Salazar De Troya, Miguel wrote:

> I found a slight difference between the trace files:
>
> The traceout_8_142118.txt contains
>
> libMesh::MeshTools::libmesh_assert_parallel_consistent_procids<libMesh::Node> 
> (mesh=...) at src/mesh/mesh_tools.C:1608
>
> whereas traceout_57_85461.txt  and traceout_11_104555.txt :
>
> libMesh::MeshTools::libmesh_assert_parallel_consistent_procids<libMesh::Node> 
> (mesh=...) at src/mesh/mesh_tools.C:1609
>
> Not sure if this helps.

No; I'm afraid that's expected from that stack trace: processors who
think the node should be on processor 57 are screaming that 57 doesn't
match the minimum proc_id of 11, but processors who think it should be
on processor 11 are screaming that 11 doesn't match the maximum
proc_id of 57.
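
A minimal serial sketch of that min/max logic, for illustration only. It
uses no MPI and is not the actual libMesh code: the function name
check_procids and the sample ids are hypothetical, and the real check
takes the min and max through parallel communication rather than over a
local vector.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the consistency check (not libMesh's code).
// Each entry of `beliefs` is the processor_id that one rank currently
// holds for the same node. For every rank, report which comparison
// would trip first: "min", "max", or "ok" if both agree.
std::vector<std::string> check_procids(const std::vector<int>& beliefs)
{
  assert(!beliefs.empty());

  // In a real parallel run these would be global reductions.
  const int min_id = *std::min_element(beliefs.begin(), beliefs.end());
  const int max_id = *std::max_element(beliefs.begin(), beliefs.end());

  std::vector<std::string> result;
  for (int local_id : beliefs)
    {
      if (min_id != local_id)
        result.push_back("min");   // analogue of `min_id == node->processor_id()'
      else if (max_id != local_id)
        result.push_back("max");   // the matching max comparison
      else
        result.push_back("ok");
    }
  return result;
}
```

Calling check_procids({57, 11, 57}) reports "min" for the ranks holding
57 and "max" for the rank holding 11, which is the pattern behind the
two adjacent line numbers in the traces.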

>    #7  0x00002aaaaebe174e in 
> libMesh::MeshTools::libmesh_assert_parallel_consistent_procids<libMesh::Node> 
> (mesh=...) at src/mesh/mesh_tools.C:1608
>    #8  0x00002aaaaeba931e in libMesh::MeshTools::correct_node_proc_ids 
> (mesh=...) at src/mesh/mesh_tools.C:1844
>    #9  0x00002aaaae69a0ce in 
> libMesh::MeshCommunication::make_new_nodes_parallel_consistent (this=0x2320a, 
> mesh=...) at src/mesh/mesh_communication.C:1776
>    #10 0x00002aaaaea95919 in libMesh::MeshRefinement::_refine_elements 
> (this=0x2320a) at src/mesh/mesh_refinement.C:1601
>    #11 0x00002aaaaea6a4d1 in 
> libMesh::MeshRefinement::refine_and_coarsen_elements (this=0x2320a) at 
> src/mesh/mesh_refinement.C:578
>    #12 0x00002aaab9d69dcd in OptiProblem::solve (this=0x7fffffffabd8) at 
> /g/g92/miguel/code/topsm/src/opti_problem.C:370
>    #13 0x00000000004371b8 in main (argc=4, argv=0x7fffffffb798) at 
> /g/g92/miguel/code/topsm/test/3D_stress_constraint/linear_stress_opti.C:196
>
>    Are there other things I can do to debug this?

One possible fix you could try first: in mesh_communication.C:1767,
where it says

   this->make_new_node_proc_ids_parallel_consistent(mesh);

Try changing it to

   this->make_node_proc_ids_parallel_consistent(mesh);

It could be that you're in some corner case I didn't imagine, which
causes a processor to fail to identify and correct a new
potentially-inconsistent processor_id, and if so then maybe telling
the code to sync up *all* node processor_id() values will fix that.

Let me know whether or not that works?

This is a frighteningly tricky part of the code; in fact, you can gawk
at the current state of my failed attempts to improve load balancing of
processor ids in https://github.com/libMesh/libmesh/pull/1621.

The good news about that PR is it has me digging into corner cases
here myself, so hopefully when I'm finished it will fix your code too
if my suggested fix above doesn't.  The bad news is that there's also
a chance of me immediately re-*breaking* your code even if my
suggested fix above works - if you wouldn't mind, I'll let you know
when the PR is ready so you can run your own tests, just in case they
catch something that our own CI misses.
---
Roy