Hi Jan,

My computations aren't parallel, and I don't believe I have OpenMPI installed -
I can't find the file SubsystemManager.cpp mentioned in the Bitbucket issue.
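
(To double-check whether Open MPI is actually present, and which version - this
assumes the stock Ubuntu packages - I can run

dpkg -l | grep -i openmpi

and paste the output if that helps.)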

I also get these 'MPI' crashes when I run the gmsh executable from the Linux
shell command line, and I get very similar crashes when running my own .py
(which calls dolfin) from a different directory, one that doesn't read the
'dolfin_parameters.xml' shown below.
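
If it is useful, a minimal check that should exercise the same MPI
initialisation without the demo script or any dolfin_parameters.xml (just a
sketch, assuming the stock python-dolfin package) is

python -c "from dolfin import *; print(UnitSquareMesh(2, 2).num_cells())"

run from an empty directory.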

I have done the following to uninstall/reinstall FEniCS:

sudo apt-get remove fenics*
sudo apt-get --purge autoremove fenics
sudo apt-get update
sudo add-apt-repository ppa:fenics-packages/fenics
sudo apt-get update
sudo apt-get install fenics
sudo apt-get dist-upgrade

(I also did this process with ipython, mayavi2 and gmsh.)
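
To confirm what actually ended up installed afterwards (assuming fenics and
python-dolfin are the relevant package names), I can check with

apt-cache policy fenics python-dolfin

and report the versions it shows.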

Thank you,

David


What is your OpenMPI version? It could be
https://bitbucket.org/fenics-project/dolfin/issue/384
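
(Assuming the stock Ubuntu Open MPI packages, something like

mpirun --version
ompi_info | grep "Open MPI:"

should print the version.)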

Jan


On Wed, 3 Dec 2014 16:27:25 +0000
David Holloway <[email protected]> wrote:

> Hi Jan,
>
> Thank you - here's a screen dump from trying to run a FEniCS demo .py
>
> In [1]: run d1_p2D.py
> Reading DOLFIN parameters from file "dolfin_parameters.xml".
> [ubuntu:03039] *** Process received signal ***
> [ubuntu:03039] Signal: Floating point exception (8)
> [ubuntu:03039] Signal code: Integer divide-by-zero (1)
> [ubuntu:03039] Failing at address: 0xb74e7da0
> [ubuntu:03039] [ 0] [0xb77bb40c]
> [ubuntu:03039] [ 1] /usr/lib/i386-linux-gnu/libhwloc.so.5(+0x2cda0) [0xb74e7da0]
> [ubuntu:03039] [ 2] /usr/lib/i386-linux-gnu/libhwloc.so.5(+0x2e71c) [0xb74e971c]
> [ubuntu:03039] [ 3] /usr/lib/i386-linux-gnu/libhwloc.so.5(+0x2ea8b) [0xb74e9a8b]
> [ubuntu:03039] [ 4] /usr/lib/i386-linux-gnu/libhwloc.so.5(+0x98f6) [0xb74c48f6]
> [ubuntu:03039] [ 5] /usr/lib/i386-linux-gnu/libhwloc.so.5(hwloc_topology_load+0x1c6) [0xb74c58ec]
> [ubuntu:03039] [ 6] /usr/lib/libopen-rte.so.4(orte_odls_base_open+0x7b1) [0xb770c881]
> [ubuntu:03039] [ 7] /usr/lib/openmpi/lib/openmpi/mca_ess_hnp.so(+0x2445) [0xb7797445]
> [ubuntu:03039] [ 8] /usr/lib/libopen-rte.so.4(orte_init+0x1cf) [0xb76e1b3f]
> [ubuntu:03039] [ 9] /usr/lib/libopen-rte.so.4(orte_daemon+0x256) [0xb76fe1c6]
> [ubuntu:03039] [10] orted() [0x80485b3]
> [ubuntu:03039] [11] /lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0xb7515a83]
> [ubuntu:03039] [12] orted() [0x80485f8]
> [ubuntu:03039] *** End of error message ***
> [ubuntu:03035] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 343
> [ubuntu:03035] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 140
> [ubuntu:03035] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file runtime/orte_init.c at line 128
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel
> process is likely to abort.  There are many reasons that a parallel
> process can fail during orte_init; some of which are due to
> configuration or environment problems.  This failure appears to be an
> internal failure; here's some additional information (which may only
> be relevant to an Open MPI developer):
>
>   orte_ess_set_name failed
>   --> Returned value Unable to start a daemon on the local node (-128) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel
> process is likely to abort.  There are many reasons that a parallel
> process can fail during MPI_INIT; some of which are due to
> configuration or environment problems.  This failure appears to be an
> internal failure; here's some additional information (which may only
> be relevant to an Open MPI developer):
>
>   ompi_mpi_init: orte_init failed
>   --> Returned "Unable to start a daemon on the local node" (-128) instead of "Success" (0)
> --------------------------------------------------------------------------
> [ubuntu:3035] *** An error occurred in MPI_Init_thread
> [ubuntu:3035] *** on a NULL communicator
> [ubuntu:3035] *** Unknown error
> [ubuntu:3035] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> --------------------------------------------------------------------------
> An MPI process is aborting at a time when it cannot guarantee
> that all of its peer processes in the job will be killed properly.
> You should double check that everything has shut down cleanly.
>
>   Reason:     Before MPI_INIT completed
>   Local host: ubuntu
>   PID:        3035
> --------------------------------------------------------------------------

_______________________________________________
fenics-support mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics-support