Johan, I haven't tested extensively with and without these changes. I was running for a while with the popen version and getting random hang-ups and some crashes, which I think are gone now. However, the warning is still there.
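For context, the two getstatusoutput variants being discussed below differ roughly as in this sketch (assuming Python 2, which Instant runs on; the helper names are mine, not Instant's). Both ultimately fork a shell, so it is not obvious from the code alone why the commands-based one behaves better on the OpenIB nodes:

    import subprocess

    def getstatusoutput_subprocess(cmd):
        """Run cmd through the shell and return (status, combined output),
        roughly mimicking commands.getstatusoutput() on top of subprocess."""
        pipe = subprocess.Popen(cmd, shell=True,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output, _ = pipe.communicate()
        if output.endswith("\n"):   # commands strips one trailing newline
            output = output[:-1]
        return pipe.returncode, output

    def getstatusoutput_commands(cmd):
        """The variant Johannes' patch switches to: the old commands module
        (Python 2 only), which drives os.popen() internally."""
        import commands
        return commands.getstatusoutput(cmd)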
I also still experience segfaults and other problems with certain preconditioners, so I've been using a lot of ilu recently...

Martin

On 19 June 2013 09:51, Johan Hake <[email protected]> wrote:
> Does it work without the hard-coded SWIG path if you keep the switch to
> commands?
>
> I wonder why commands works and not subprocess...
>
> On 06/19/2013 09:29 AM, Martin Sandve Alnæs wrote:
>> Johannes did some hacks to instant to get it running on the Abel cluster
>> at UiO, patch attached. The SWIG paths and version are hardcoded, so no
>> check is actually performed. Also, the getstatusoutput implementation
>> uses commands instead of Popen.
>>
>> Try this patch, then edit the changes manually to make the paths and
>> versions match your system.
>>
>> Martin
>>
>> On 17 June 2013 23:44, Jan Blechta <[email protected]> wrote:
>>> On Mon, 17 Jun 2013 21:48:09 +0200 Jan Blechta <[email protected]> wrote:
>>>> On Mon, 17 Jun 2013 20:06:06 +0100 "Garth N. Wells" <[email protected]> wrote:
>>>>> On 17 June 2013 19:16, Jan Blechta <[email protected]> wrote:
>>>>>> On Mon, 17 Jun 2013 15:03:18 +0200 Johan Hake <[email protected]> wrote:
>>>>>>> On 06/17/2013 03:01 PM, Jan Blechta wrote:
>>>>>>>> On Mon, 17 Jun 2013 09:39:16 +0200 Johan Hake <[email protected]> wrote:
>>>>>>>>> I have now made changes to both instant and dolfin, so the swig
>>>>>>>>> path and version are checked at import time instead of each time
>>>>>>>>> a module is JIT compiled.
>>>>>>>>>
>>>>>>>>> dolfin: johanhake/swig-version-check
>>>>>>>>> instant: johanhake/add_version_cache
>>>>>>>>>
>>>>>>>>> Could Martin and/or Jan check if these fixes get rid of the fork
>>>>>>>>> warning?
>>>>>>>>
>>>>>>>> I'll try how it behaves. But note that currently I'm getting not
>>>>>>>> only the warning but a seg fault.
>>>>>>>
>>>>>>> Ok.
>>>>>>
>>>>>> The whole thing is pretty twisted and I'm not sure if these commits
>>>>>> made some progress. The fork warning remains.
>>>>>>
>>>>>> As nobody else (from the HPC community) reports related problems, one
>>>>>> could deduce that the Python/DOLFIN popen calls are probably safe.
>>>>>>
>>>>>>>> But it may also happen because of my subprocess.call(mesh_generator).
>>>>>>>
>>>>>>> It would be nice to figure out what process triggers the segfault.
>>>>>>
>>>>>> I tried with the C++ Poisson demo and the problems on a 12-core OpenIB
>>>>>> node remain. It can segfault, throw various PETSc or MPI errors, hang
>>>>>> or pass. This resembles some memory corruption, but it probably has
>>>>>> nothing to do with the OpenIB/fork issue, as the warnings are not
>>>>>> present. The problem seems to happen more frequently with a higher
>>>>>> number of processes. I suspect old, buggy OpenMPI, but I can do
>>>>>> nothing but beg the cluster admin for a new version.
>>>>>
>>>>> Make sure that you're using the MPI wrappers - by default CMake uses
>>>>> the C++ compiler plus the MPI lib flags. On my local HPC system,
>>>>> failing to use the wrappers leads to hangs when computing across nodes.
>>>> Running cmake -DCMAKE_CXX_COMPILER=mpicxx when configuring
>>>> demo_poisson does not help. Does it apply also to the compilation of
>>>> DOLFIN?
>>>
>>> Recompiling UFC, DOLFIN and demo_poisson with mpicxx does not help. I
>>> think there is some problem on the machine - possibly outdated OpenMPI.
>>>
>>> Jan
>>>
>>>> Jan
>>>>
>>>>> Garth
>>>>>
>>>>>> Jan
>>>>>>
>>>>>>> Johan
>>>>>>>
>>>>>>>> Jan
>>>>>>>>
>>>>>>>>> I would be surprised if it does, as we eventually call popen to
>>>>>>>>> compile the JIT generated module. That call would be difficult to
>>>>>>>>> get rid of.
>>>>>>>>>
>>>>>>>>> Johan
>>>>>>>>>
>>>>>>>>> On 06/17/2013 08:47 AM, Martin Sandve Alnæs wrote:
>>>>>>>>>> Registers are touched on basically every operation the CPU does :)
>>>>>>>>>> But it didn't say "registers", but "registered memory".
>>>>>>>>>> http://blogs.cisco.com/performance/registered-memory-rma-rdma-and-mpi-implementations/
>>>>>>>>>>
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>> On 17 June 2013 08:36, Johan Hake <[email protected]> wrote:
>>>>>>>>>>> On 06/16/2013 11:40 PM, Jan Blechta wrote:
>>>>>>>>>>>> On Sun, 16 Jun 2013 22:40:43 +0200 Johan Hake <[email protected]> wrote:
>>>>>>>>>>>>> There are still fork calls when the swig version is checked.
>>>>>>>>>>>>> Would it be ok to check it only when dolfin is imported? That
>>>>>>>>>>>>> would be an easy fix.
>>>>>>>>>>>>
>>>>>>>>>>>> I've no idea. There are two aspects of the issue:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. forks may not be supported.
>>>>>>>>>>>
>>>>>>>>>>> Following [1] below, it looks like they should be supported by
>>>>>>>>>>> more recent OpenMPI, and it also says that:
>>>>>>>>>>>
>>>>>>>>>>>    "In general, if your application calls system() or popen(),
>>>>>>>>>>>    it will likely be safe."
>>>>>>>>>>>
>>>>>>>>>>>> 2. even if forks are supported by a given installation, it may
>>>>>>>>>>>> not be safe. Citing from [1]:
>>>>>>>>>>>>
>>>>>>>>>>>> "If you use fork() in your application, you must not touch any
>>>>>>>>>>>> registered memory before calling some form of exec() to launch
>>>>>>>>>>>> another process. Doing so will cause an immediate seg fault /
>>>>>>>>>>>> program crash."
>>>>>>>>>>>>
>>>>>>>>>>>> Is this condition met with the present state, and would it be
>>>>>>>>>>>> met after the suggested change?
>>>>>>>>>>>
>>>>>>>>>>> I have no clue if we touch any register before we call the fork,
>>>>>>>>>>> and I have no clue whether the suggested fix would do that.
>>>>>>>>>>> Aren't registers touched on a low level quite often?
>>>>>>>>>>>
>>>>>>>>>>> Do you experience occasional segfaults?
>>>>>>>>>>>
>>>>>>>>>>> Also [2] suggests that the warning might be the problem. Have you
>>>>>>>>>>> tried running with:
>>>>>>>>>>>
>>>>>>>>>>>     mpirun --mca mpi_warn_on_fork 0 ...
>>>>>>>>>>> Johan
>>>>>>>>>>>
>>>>>>>>>>>> Jan
>>>>>>>>>>>>
>>>>>>>>>>>>> Johan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Jun 16, 2013 12:47 AM, "Jan Blechta" <[email protected]> wrote:
>>>>>>>>>>>>>> What is the current status of the presence of fork() calls in
>>>>>>>>>>>>>> the FEniCS codebase? These calls are not friendly with OpenIB
>>>>>>>>>>>>>> InfiniBand clusters [1, 2].
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The issue with popen() calls for locating the SWIG library was
>>>>>>>>>>>>>> discussed at the end of [3]. I'm still experiencing this sort
>>>>>>>>>>>>>> of trouble when running on InfiniBand nodes (even when using
>>>>>>>>>>>>>> only one node), so was the cleaning of popen() calls finished,
>>>>>>>>>>>>>> or are there other harmful fork() calls in the FEniCS codebase?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1] http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
>>>>>>>>>>>>>> [2] http://www.open-mpi.org/faq/?category=tuning#fork-warning
>>>>>>>>>>>>>> [3] https://answers.launchpad.net/dolfin/+question/219270
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jan
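For what it's worth, the import-time check discussed above amounts to something like the sketch below: call swig once when the module is imported, cache the result, and let every subsequent JIT compilation reuse it instead of forking again. The names are illustrative only, not the actual instant/dolfin API.

    import re
    import subprocess

    def _query_swig_version():
        """Run 'swig -version' once and parse the version out of its output."""
        try:
            output = subprocess.check_output(["swig", "-version"],
                                             stderr=subprocess.STDOUT,
                                             universal_newlines=True)
        except (OSError, subprocess.CalledProcessError):
            return None
        match = re.search(r"SWIG Version\s+([\d.]+)", output)
        return match.group(1) if match else None

    # Evaluated exactly once, at import time; later JIT compilations read
    # the cached value instead of forking a new process.
    SWIG_VERSION = _query_swig_version()

    def check_swig_version(required="2.0.0"):
        """Naive comparison of the cached version against a required minimum."""
        if SWIG_VERSION is None:
            return False
        have = [int(x) for x in SWIG_VERSION.split(".")]
        want = [int(x) for x in required.split(".")]
        return have >= want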
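And to make the FAQ passage quoted above concrete: the dangerous pattern is a fork() whose child keeps executing in the parent's address space (where it may touch RDMA-registered memory), whereas system()/popen()-style calls fork and almost immediately exec a new program. A rough illustration, not FEniCS code:

    import os
    import subprocess

    def risky_fork():
        """Pattern the OpenFabrics FAQ warns about: the child keeps running
        in a copy of the parent's address space, so any memory it touches
        may be memory the MPI layer registered for RDMA."""
        pid = os.fork()
        if pid == 0:
            buf = bytearray(1024)  # any access in the child can hit registered pages
            buf[0] = 1
            os._exit(0)
        os.waitpid(pid, 0)

    def likely_safe_compile(cmd):
        """Pattern the FAQ calls "likely safe": fork followed almost
        immediately by exec of a new program, which is what popen(),
        system() and subprocess do when the JIT-compiled module is built."""
        return subprocess.call(cmd, shell=True)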
