After some digging, it seems that the segfaults observed on OFED/InfiniBand
clusters (see [1], [2], [3]) are caused by the implementation of

  subprocess.Popen

Check _execute_child in your local subprocess.py. The parent does not seem
to keep its hands off memory between fork() and exec() (as required by [4],
[5]), especially as it fiddles with the garbage collector.
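
To make the concern concrete, here is a simplified sketch of the pattern I
mean. It is my paraphrase of what the Python 2.7 _execute_child does around
fork(), not the actual stdlib code (the real one also sets up pipes and an
error channel). The function name fork_exec_like_subprocess is made up for
illustration.

```python
import gc
import os

def fork_exec_like_subprocess(argv):
    """Hypothetical, simplified paraphrase of the fork/exec pattern."""
    # The parent toggles the garbage collector around fork() ...
    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        pid = os.fork()
    except BaseException:
        if gc_was_enabled:
            gc.enable()
        raise

    if pid == 0:
        # Child: replace the process image; bail out hard if exec fails.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(255)

    # ... and keeps running Python code here, possibly before the child
    # has reached exec(). This is the window the OFED notes warn about.
    if gc_was_enabled:
        gc.enable()
    _, status = os.waitpid(pid, 0)
    return status
```

Whether this parent-side activity is really what triggers the segfaults is
still my guess and needs testing on an affected cluster.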

I tried switching to os.system (POSIX implementation in [6]) instead of
subprocess.Popen and it looks promising. See [7] or the enclosed
patch.
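
For reference, a minimal standalone sketch of the os.system approach,
assuming a POSIX shell. It is not the code from the patch: the name
run_with_os_system is hypothetical, stderr is merged into the captured
output (the patch discards it instead), and the raw wait status that
os.system returns on POSIX is decoded into a plain exit code.

```python
import os
import shlex
import tempfile

def run_with_os_system(cmd):
    """Run a shell command string; return (exit_code, output)."""
    # Create a scratch file for the shell redirection to write into.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Assumption: POSIX sh redirection syntax.
        raw = os.system("%s > %s 2>&1" % (cmd, shlex.quote(path)))
        # On POSIX, os.system returns a wait status, not an exit code.
        if os.WIFEXITED(raw):
            code = os.WEXITSTATUS(raw)
        else:
            code = -os.WTERMSIG(raw)
        with open(path) as f:
            output = f.read()
    finally:
        os.remove(path)
    return code, output
```

Note that callers comparing the status against 0 are fine either way, but
anything inspecting a specific exit code needs the decoding step.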

Jan

[1] https://answers.launchpad.net/dolfin/+question/219270
[2] https://answers.launchpad.net/dolfin/+question/225946
[3] http://fenicsproject.org/pipermail/fenics/2013-June/000398.html
[4] https://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
[5] http://www.openfabrics.org/downloads/OFED/release_notes/OFED_3.12_rc1_release_notes#3.03
[6] http://svn.python.org/projects/python/trunk/Modules/posixmodule.c
[7] https://bitbucket.org/blechta/instant/branch/blechta/ofed-fork
commit 4cca8225fd786fd984f1321ee6ec918deba37822
Author: Jan Blechta <[email protected]>
Date:   Tue Mar 25 14:15:31 2014 +0100

    Switch from subprocess.Popen to os.system for OFED fork safety.

diff --git a/instant/output.py b/instant/output.py
index f44622c..cdd643f 100644
--- a/instant/output.py
+++ b/instant/output.py
@@ -74,21 +74,53 @@ def write_file(filename, text):
         instant_error("Can't open '%s': %s" % (filename, e))
 
 from subprocess import Popen, PIPE, STDOUT
-def get_status_output(cmd, input=None, cwd=None, env=None):
+def _get_status_output(cmd, input=None, cwd=None, env=None):
     "Replacement for commands.getstatusoutput which does not work on Windows."
     if isinstance(cmd, str):
         cmd = cmd.strip().split()
     instant_debug("Running: " + str(cmd))
+ 
+    # NOTE: Is not OFED-fork-safe! Check subprocess.py,
+    #       http://bugs.python.org/issue1336#msg146685
+    #       OFED-fork-safety means that parent should not
+    #       touch anything between fork() and exec(),
+    #       which is not met in subprocess module. See
+    #       https://www.open-mpi.org/faq/?category=openfabrics#ofa-fork
+    #       http://www.openfabrics.org/downloads/OFED/release_notes/OFED_3.12_rc1_release_notes#3.03
     pipe = Popen(cmd, shell=False, cwd=cwd, env=env, stdout=PIPE, stderr=STDOUT)
 
     (output, errout) = pipe.communicate(input=input)
     assert not errout
 
     status = pipe.returncode
+    return (status, output)
+
+import os, tempfile
+def get_status_output(cmd, input=None, cwd=None, env=None):
+    # TODO: We don't need function with such a generality.
+    #       We only need output and return code.
+    if not isinstance(cmd, str) or input is not None or \
+        cwd is not None or env is not None:
+        raise NotImplementedError
+
+    # TODO: Writing to tempfile and reading back is unnecessary and
+    #       prone to not being supported under different platforms.
+    #       In fact, output is usually written back to logfile
+    #       in instant, so it can be done directly.
+    f = tempfile.NamedTemporaryFile(delete=True)
+
+    # TODO: Is this redirection platform independent?
+    cmd += ' > ' + f.name + ' 2> ' + os.devnull
+    
+    # NOTE: Possibly OFED-fork-safe, tests needed!!!
+    status = os.system(cmd)
 
+    output = f.read()
+    f.close()
     return (status, output)
 
-def get_output(cmd):
+def _get_output(cmd):
+    # TODO: can be removed, not used in instant
     "Replacement for commands.getoutput which does not work on Windows."
     if isinstance(cmd, str):
         cmd = cmd.strip().split()