Johan, I haven't tested extensively with and without these changes. I was
running for a while with the popen version and getting random hangups and
some crashes, which I think are gone now. However, the warning is still
there.
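
For what it's worth, here is a minimal sketch (not the actual instant code;
the function names are made up for illustration) of the two getstatusoutput
flavours being compared, assuming Python 2 on a POSIX system:

    import commands
    import subprocess

    def getstatusoutput_via_commands(cmd):
        # The variant from Johannes' Abel patch: commands.getstatusoutput
        # runs cmd through a shell and returns (status, output).
        return commands.getstatusoutput(cmd)

    def getstatusoutput_via_popen(cmd):
        # A roughly equivalent Popen-based variant. Note that both flavours
        # fork a shell under the hood, which is what the OpenIB fork
        # warning complains about.
        p = subprocess.Popen(cmd, shell=True,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT)
        output, _ = p.communicate()
        return p.returncode, output.rstrip("\n")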

I also still experience segfaults and other problems with certain
preconditioners, so I've been using a lot of ilu recently...
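
For anyone hitting the same thing, this is roughly how I force ilu from the
Python interface (a hypothetical Poisson-style example, just to show the
solver_parameters I mean, not code from any of the branches discussed below):

    from dolfin import *

    # Hypothetical minimal problem, only to illustrate pinning the
    # Krylov method and preconditioner explicitly.
    mesh = UnitSquareMesh(8, 8)
    V = FunctionSpace(mesh, "CG", 1)
    u, v = TrialFunction(V), TestFunction(V)
    a = inner(grad(u), grad(v))*dx
    L = Constant(1.0)*v*dx
    bc = DirichletBC(V, 0.0, "on_boundary")

    w = Function(V)
    solve(a == L, w, bc,
          solver_parameters={"linear_solver": "gmres",
                             "preconditioner": "ilu"})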

Martin


On 19 June 2013 09:51, Johan Hake <[email protected]> wrote:

> Does it work without the hardcoded SWIG path if you keep the switch to
> commands?
>
> I wonder why commands works and not subprocess...
>
> On 06/19/2013 09:29 AM, Martin Sandve Alnæs wrote:
> > Johannes did some hacks to instant to get it running on the Abel cluster
> > at UiO, patch attached. The swig paths and version are hardcoded, so no
> > check is actually performed. Also, the getstatusoutput implementation
> > uses commands instead of Popen.
> >
> > Try this patch, then edit the changes manually to make the paths and
> > versions match your system.
> >
> > Martin
> >
> >
> > On 17 June 2013 23:44, Jan Blechta <[email protected]> wrote:
> >
> >     On Mon, 17 Jun 2013 21:48:09 +0200
> >     Jan Blechta <[email protected]> wrote:
> >     > On Mon, 17 Jun 2013 20:06:06 +0100
> >     > "Garth N. Wells" <[email protected]> wrote:
> >     > > On 17 June 2013 19:16, Jan Blechta <[email protected]> wrote:
> >     > > > On Mon, 17 Jun 2013 15:03:18 +0200
> >     > > > Johan Hake <[email protected]> wrote:
> >     > > >> On 06/17/2013 03:01 PM, Jan Blechta wrote:
> >     > > >> > On Mon, 17 Jun 2013 09:39:16 +0200
> >     > > >> > Johan Hake <[email protected]> wrote:
> >     > > >> >> I have now made changes to both instant and dolfin, so the
> >     > > >> >> swig path and version are checked at import time instead of
> >     > > >> >> each time a module is JIT compiled.
> >     > > >> >>
> >     > > >> >>   dolfin:
> >     > > >> >>   johanhake/swig-version-check
> >     > > >> >>
> >     > > >> >>   instant:
> >     > > >> >>   johanhake/add_version_cache
> >     > > >> >>
> >     > > >> >> Could Martin and/or Jan check if these fixes get rid of the
> >     > > >> >> fork warning?
> >     > > >> >
> >     > > >> > I'll try it and see how it behaves. But note that currently
> >     > > >> > I'm getting not only the warning but also a seg fault.
> >     > > >>
> >     > > >> Ok.
> >     > > >
> >     > > > The whole thing is pretty twisted and I'm not sure whether
> >     > > > these commits made any progress. The fork warning remains.
> >     > > >
> >     > > > As nobody else (from the HPC community) reports related
> >     > > > problems, one could deduce that the python/DOLFIN popen calls
> >     > > > are probably safe.
> >     > > >
> >     > > >>
> >     > > >> > But it may also happen because of my
> >     > > >> > subprocess.call(mesh_generator).
> >     > > >>
> >     > > >> It would be nice to figure out what process triggers the
> >     > > >> segfault.
> >     > > >
> >     > > > I tried with the C++ Poisson demo and the problems on the
> >     > > > 12-core OpenIB node remain. It can segfault, throw various
> >     > > > PETSc or MPI errors, hang, or pass. This resembles some memory
> >     > > > corruption, but it probably has nothing to do with the
> >     > > > OpenIB/fork issue as the warnings are not present. The problem
> >     > > > seems to happen more frequently with a higher number of
> >     > > > processes. I suspect an old, buggy OpenMPI, but I can do
> >     > > > nothing but beg the cluster admin for a new version.
> >     > > >
> >     > >
> >     > > Make sure that you're using the MPI wrappers - by default CMake
> >     > > uses the C++ compiler plus the MPI lib flags. On my local HPC
> >     > > system, failing to use the wrappers leads to hangs when computing
> >     > > across nodes.
> >     >
> >     > Running cmake -DCMAKE_CXX_COMPILER=mpicxx when configuring
> >     > demo_poisson does not help. Does this apply also to the compilation
> >     > of DOLFIN?
> >
> >     Recompiling UFC, DOLFIN and demo_poisson with mpicxx does not help.
> >     I think there is some problem on the machine - possibly an outdated
> >     OpenMPI.
> >
> >     Jan
> >
> >     >
> >     > Jan
> >     >
> >     > >
> >     > > Garth
> >     > >
> >     > > > Jan
> >     > > >
> >     > > >>
> >     > > >> Johan
> >     > > >>
> >     > > >> >
> >     > > >> > Jan
> >     > > >> >
> >     > > >> >>
> >     > > >> >> I would be surprised if it does, as we eventually call
> >     > > >> >> popen to compile the JIT generated module. That call would
> >     > > >> >> be difficult to get rid of.
> >     > > >> >>
> >     > > >> >> Johan
> >     > > >> >>
> >     > > >> >> On 06/17/2013 08:47 AM, Martin Sandve Alnæs wrote:
> >     > > >> >>> Registers are touched on basically every operation the CPU
> >     > > >> >>> does :) But it didn't say "registers", but "registered
> >     > > >> >>> memory".
> >     > > >> >>>
> >     > > >> >>>
> >     > > >> >>> http://blogs.cisco.com/performance/registered-memory-rma-rdma-and-mpi-implementations/
> >     > > >> >>>
> >     > > >> >>> Martin
> >     > > >> >>>
> >     > > >> >>>
> >     > > >> >>> On 17 June 2013 08:36, Johan Hake <[email protected]> wrote:
> >     > > >> >>>
> >     > > >> >>>     On 06/16/2013 11:40 PM, Jan Blechta wrote:
> >     > > >> >>>     > On Sun, 16 Jun 2013 22:40:43 +0200
> >     > > >> >>>     > Johan Hake <[email protected]> wrote:
> >     > > >> >>>     >> There are still fork calls when the swig version is
> >     > > >> >>>     >> checked. Would it be ok to check it only when dolfin
> >     > > >> >>>     >> is imported? That would be an easy fix.
> >     > > >> >>>     >
> >     > > >> >>>     > I've no idea. There are two aspects of the issue:
> >     > > >> >>>     >
> >     > > >> >>>     > 1. forks may not be supported.
> >     > > >> >>>
> >     > > >> >>>     Following [1] below it looks like they should be
> >     > > >> >>>     supported by the more recent openmpi, and it also says
> >     > > >> >>>     that:
> >     > > >> >>>
> >     > > >> >>>       In general, if your application calls system() or
> >     > > >> >>>       popen(), it will likely be safe.
> >     > > >> >>>
> >     > > >> >>>     > 2. even if forks are supported by a given
> >     > > >> >>>     > installation, it may not be safe. Citing from [1]:
> >     > > >> >>>     >
> >     > > >> >>>     >    "If you use fork() in your application, you must
> >     > > >> >>>     > not touch any registered memory before calling some
> >     > > >> >>>     > form of exec() to launch another process. Doing so
> >     > > >> >>>     > will cause an immediate seg fault / program crash."
> >     > > >> >>>     >
> >     > > >> >>>     >    Is this condition met in the present state, and
> >     > > >> >>>     > would it be met after the suggested change?
> >     > > >> >>>
> >     > > >> >>>     I have no clue if we do touch any register before we
> >     > > >> >>>     call the fork, and I have no clue whether the suggested
> >     > > >> >>>     fix would do that. Aren't registers touched at a low
> >     > > >> >>>     level quite often?
> >     > > >> >>>
> >     > > >> >>>     Do you experience occasional segfaults?
> >     > > >> >>>
> >     > > >> >>>     Also, [2] suggests that the warning might be the
> >     > > >> >>>     problem. Have you tried running with:
> >     > > >> >>>
> >     > > >> >>>       mpirun --mca mpi_warn_on_fork 0 ...
> >     > > >> >>>
> >     > > >> >>>     Johan
> >     > > >> >>>
> >     > > >> >>>     >
> >     > > >> >>>     > Jan
> >     > > >> >>>     >
> >     > > >> >>>     >>
> >     > > >> >>>     >> Johan
> >     > > >> >>>     >> On Jun 16, 2013 12:47 AM, "Jan Blechta"
> >     > > >> >>>     >> <[email protected]> wrote:
> >     > > >> >>>     >>
> >     > > >> >>>     >>> What is the current status of the presence of
> >     > > >> >>>     >>> fork() calls in the FEniCS codebase? These calls
> >     > > >> >>>     >>> are not friendly with openib infiniband clusters
> >     > > >> >>>     >>> [1, 2].
> >     > > >> >>>     >>>
> >     > > >> >>>     >>> The issue with popen() calls for searching for the
> >     > > >> >>>     >>> swig library was discussed at the end of [3]. I'm
> >     > > >> >>>     >>> still experiencing this sort of trouble when
> >     > > >> >>>     >>> running on infiniband nodes (even when using only
> >     > > >> >>>     >>> one node), so was the cleaning of popen() finished,
> >     > > >> >>>     >>> or are there any other harmful fork() calls in the
> >     > > >> >>>     >>> FEniCS codebase?
> >     > > >> >>>     >>>
> >     > > >> >>>     >>> [1] http://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
> >     > > >> >>>     >>> [2] http://www.open-mpi.org/faq/?category=tuning#fork-warning
> >     > > >> >>>     >>> [3] https://answers.launchpad.net/dolfin/+question/219270
> >     > > >> >>>     >>>
> >     > > >> >>>     >>> Jan
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
