Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-13 Thread Roy Stogner
On Tue, 12 Sep 2017, Michael Povolotskyi wrote: Hi Roy, here is the progress. It turns out that openmpi 1.10.0 does not work as it should. Our admins tried to rebuild it with different options, but without success. So we switched to openmpi 2.1.0, and it works. Good to hear; thanks for the updat

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-12 Thread Michael Povolotskyi
Hi Roy, here is the progress. It turns out that openmpi 1.10.0 does not work as it should. Our admins tried to rebuild it with different options, but without success. So we switched to openmpi 2.1.0, and it works. Michael. On 09/11/2017 10:42 AM, Roy Stogner wrote: On Mon, 11 Sep 2017, Michael Pov

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-11 Thread Roy Stogner
On Mon, 11 Sep 2017, Michael Povolotskyi wrote: This is exactly what had happened: the test runs till the end with MPICH and hangs here with openMPI. Great! (I realize now, after the fact, that I'd forgotten about an MPI_Get_count in the middle of all that; although I don't see how that woul

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-11 Thread Michael Povolotskyi
Thank you, Roy. This is exactly what had happened: the test runs till the end with MPICH and hangs here with openMPI. I'm going to inform our admins about this issue. Michael. On 09/11/2017 09:49 AM, Roy Stogner wrote: On Sun, 10 Sep 2017, Michael Povolotskyi wrote: I checked that there is

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-11 Thread Roy Stogner
On Sun, 10 Sep 2017, Michael Povolotskyi wrote: I checked that there is no infinite loop. Both ranks pass this->send (dest_processor_id, sendvec, type1, req, send_tag); and hang on this->receive (source_processor_id, recv, type2, recv_tag); Both processes are sending zero elements, is th

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-10 Thread Michael Povolotskyi
Hello Roy, I checked that there is no infinite loop. Both ranks pass this->send (dest_processor_id, sendvec, type1, req, send_tag); and hang on this->receive (source_processor_id, recv, type2, recv_tag); Both processes are sending zero elements, is this correct? Could you, please, suggest a

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-10 Thread Roy Stogner
On Sat, 9 Sep 2017, Michael Povolotskyi wrote: (gdb) p recv_tag $2 = (const libMesh::Parallel::MessageTag &) @0x2ac9d6d3d980: { static invalid_tag = -2147483648, _tagvalue = -1, _comm = 0x0} (gdb) p recv_tag $2 = (const libMesh::Parallel::MessageTag &) @0x2b9331482980: { static invalid_tag

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-09 Thread Michael Povolotskyi
Hello Roy, following your request, I have checked recv_tag on both MPI processes. Here is what I got: process #1 (I do not know the MPI rank, so I just call it "1") #9 0x2ac9d4852a19 in libMesh::Parallel::Communicator::send_receiveint> (this=0x7ffc0e5e18b8, dest_processor_id=1, se

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-08 Thread Roy Stogner
On Fri, 8 Sep 2017, Michael Povolotskyi wrote: let me still share the stack trace with you. It works for me now with mpich, but if I can help to improve libmesh portability I would be glad to do it. I originally thought this looked a little like a local linkage mixup, in which case there'd b

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-08 Thread Michael Povolotskyi
Dear Roy, let me still share the stack trace with you. It works for me now with mpich, but if I can help to improve libmesh portability I would be glad to do it. Here is the trace from the 1st MPI process. On the 2nd it looks the same. (gdb) c Continuing. ^C Program received signal SIGINT, Inter

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-06 Thread Roy Stogner
On Wed, 6 Sep 2017, Michael Povolotskyi wrote: I found that if I rebuild everything with MPICH, instead of using the installed openmpi, then everything works perfectly. Is libmesh supposed to work with openmpi? If so, I can recompile it again and produce the stack trace. libMesh does work w

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-06 Thread Michael Povolotskyi
On 09/06/2017 10:57 AM, Roy Stogner wrote: On Mon, 4 Sep 2017, Michael Povolotskyi wrote: using namespace libMesh; int main (int argc, char ** argv) { { LibMeshInit init (argc, argv); Mesh mesh(init.comm()); MeshTools::Generation::build_cube (mesh, 5, 5, 5); mesh.print_info

Re: [Libmesh-users] problems with running libmesh in parallel

2017-09-06 Thread Roy Stogner
On Mon, 4 Sep 2017, Michael Povolotskyi wrote: using namespace libMesh; int main (int argc, char ** argv) { { LibMeshInit init (argc, argv); Mesh mesh(init.comm()); MeshTools::Generation::build_cube (mesh, 5, 5, 5); mesh.print_info(); TecplotIO mesh_output(mesh); me