Hi,

We have a code that fails when it runs this call:

    V = VectorFunctionSpace(mesh, "CG", 2)

on more than one compute node. It fails at the FFC stage:

    [0]PETSC ERROR: ------------------------------------------------------------------------
    [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
    [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
    [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
    [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
    [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
    [0]PETSC ERROR: to get more information on the crash.
    Rank 0 [Fri Oct 9 16:42:35 2015] [c3-0c0s2n1] application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
    In instant.recompile: The module did not compile with command 'cmake -DDEBUG=TRUE .', see '/work/z01/z01/adrianj/.instant/ffc_form_e1514c0a3fffa4b6c0eda574739266f451a578aa/compile.log'
    Traceback (most recent call last):
      File "multiple_solitons.py", line 184, in <module>
        V = VectorFunctionSpace(mesh, "CG", 2)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 628, in __init__
        constrained_domain=constrained_domain)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/dolfin/functions/functionspace.py", line 153, in __init__
        ufc_element, ufc_dofmap = jit(self._ufl_element, mpi_comm=mesh.mpi_comm())
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 68, in mpi_jit
        output = local_jit(*args, **kwargs)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/dolfin/compilemodules/jit.py", line 128, in jit
        return form_compiler.jit(form, parameters=p)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/ffc/jitcompiler.py", line 72, in jit
        return jit_element(ufl_object, parameters)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/ffc/jitcompiler.py", line 177, in jit_element
        compiled_form, module, prefix = jit_form(form, parameters)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/ffc/jitcompiler.py", line 148, in jit_form
        cache_dir = cache_dir)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/ffc/backends/ufc/build.py", line 73, in build_ufc_module
        **kwargs)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/instant/build.py", line 563, in build_module
        recompile(modulename, module_path, new_compilation_checksum, build_system)
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/instant/build.py", line 152, in recompile
        instant_error(msg % (cmd, compile_log_filename_dest))
      File "/work/y07/y07/cse/fenics/1.5.0/lib/python2.7/site-packages/instant/output.py", line 85, in instant_error
        raise RuntimeError(text)
    RuntimeError: In instant.recompile: The module did not compile with command 'cmake -DDEBUG=TRUE .', see '/work/z01/z01/adrianj/.instant/ffc_form etc....

However, it works when I run within a single compute node, and only fails when I use more than one. Our compute nodes have 24 cores, so the above works on up to 24 cores inside one node, but fails if, for instance, I use 12 cores on each of 2 nodes (24 altogether).

Any ideas what could be causing this? Is it a PETSc/petsc4py issue? I cleaned out the Instant cache before the runs.

Thanks,
adrianj

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
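
Since the error message points at a compile.log inside the Instant cache, a first diagnostic step could be to look at the end of that file from the node where the failure occurs. The following is only a minimal sketch using the Python standard library. The helper names `tail` and `describe_log` are hypothetical and are not part of Instant, FFC or DOLFIN, and the idea that the log might not be visible from every node is an assumption to check, not a diagnosis.

```python
import io
import os
import socket
from collections import deque


def tail(path, n=40):
    """Return the last n lines of a text file as a list of strings."""
    with io.open(path, "r", errors="replace") as f:
        # A bounded deque keeps only the final n lines while streaming the file.
        return list(deque(f, maxlen=n))


def describe_log(path, n=40):
    """Return the log tail, or a note if this host cannot see the file."""
    host = socket.gethostname()
    if not os.path.exists(path):
        # Assumption: on a multi-node run the cache directory may not be
        # reachable from every node, so report that case explicitly.
        return "[%s] compile.log not visible: %s" % (host, path)
    return "[%s] last %d lines of %s:\n%s" % (host, n, path, "".join(tail(path, n)))


if __name__ == "__main__":
    # Path copied from the error message above.
    log = ("/work/z01/z01/adrianj/.instant/"
           "ffc_form_e1514c0a3fffa4b6c0eda574739266f451a578aa/compile.log")
    print(describe_log(log))
```

Running this on a compute node (rather than a login node) shows whether the cmake output is readable there and what the underlying compile failure was.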
