Hi,
I have to spawn a set of processes on multiple hosts using my own mapping
pattern, including the processor ID, for example:
* process 1 on cpu0 of host 1
* process 2 on cpu1 of host 1
* process 3 on cpu1 of host 1
* process 4 on cpu0 of host 2
* process 5 on cpu1 of host 2
I see that only the
Hi,
I have to spawn multiple slave processes on a cluster from a single master
process.
The Open MPI distribution I use is 1.1.2.
I'm using an HP cluster with 2 Ethernet NICs on each machine.
My problem was that the master froze when calling MPI_Comm_spawn_multiple, and
the slaves froze when calling
I believe this is a "too many open files" problem. Try raising the limit
before launching the job:
ulimit -n some_number
Regards,
Mostyn
On Wed, 22 Nov 2006, Lydia Heck wrote:
I have, once again, successfully built and installed
MX and Open MPI, and I can run 64- and 128-CPU jobs on a 256-CPU cluster.
Open MPI version: 1.2b1
Compiler used: Studio 11
The same run on 32 CPUs almost completes: it starts writing the 32 restart
files and then fails with the same problem:
Signal:11 info.si_errno:0(Error 0) si_code:1(SEGV_MAPERR)
Failing at addr:33
/opt/ompi/lib/libopal.so.0.0.0:opal_backtrace_print+0x10
/opt/ompi/lib/libopal.so.0.0.0:0x99df5
/lib/amd64/li
Gadget2 (which I cannot attach because it is not publicly available)
runs perfectly fine on any number of processes on systems such
as Solaris 10 - Sun CT6 gigabit, Sun CT5 and Myrinet GM, IBM Regatta ...
Sorry to be so expansive ...
When I run the code on 32 CPUs on openmpi, mx using the studio11