Hi, I am having a problem with the latest version of Open MPI.
In some executions (roughly 1 in 100), the following message is printed:
[tegasaste:01617] [NO-NAME] ORTE_ERROR_LOG: File read failure in file
util/universe_setup_file_io.c at line 123
It seems as if it tries to read the universe file and it
This has been around for a very long time (at least a year, if memory serves
correctly). The problem is that the system "hangs" while trying to flush the
io buffers through the RML because it loses connection to the head node
process (for 1.x, that's basically mpirun) - but the "flush" procedure
I have been noticing this for a while (at least 2 months) as well,
along with stale session directories. I filed a bug yesterday, #177:
https://svn.open-mpi.org/trac/ompi/ticket/177
I'll add this stack trace to it. I want to take a closer look
tomorrow to see what's really going on here.