On 08.02.2012 at 22:52, Tom Bryan wrote:

> <snip>
>>>> Yes, this should work across multiple machines. And it's using
>>>> `qrsh -inherit ...` so it's failing somewhere in Open MPI - is it
>>>> working with 1.4.4?
>>>
>>> I'm not sure. We no longer have our 1.4 test environment, so I'm in the
>>> process of building that now. I'll let you know once I have a chance to run
>>> that experiment.
>
> You said that both of these cases worked for you in 1.4. Were you running a
> modified version that did not use THREAD_MULTIPLE? I ask because I'm
> getting worse errors in 1.4. I'm using the same code that was working (in
> some cases) with 1.5.4.
>
> I built 1.4.4 with (among other options)
> --with-threads=posix --enable-mpi-threads
./configure --prefix=$HOME/local/openmpi-1.4.4-default-thread --with-sge --with-threads=posix --enable-mpi-threads

No problems here, even with THREAD_MULTIPLE. The only difference, as stated before, is that in singleton mode one or more additional lines appear (it looks like one per slave host, but not always; a race condition?):

[pc15370:31390] [[24201,0],1] routed:binomial: Connection to lifeline [[24201,0],0] lost

> <snip>
> ompi_mpi_init: orte_init failed
> --> Returned "Data unpack would read past end of buffer" (-26) instead of
> "Success" (0)
> --------------------------------------------------------------------------
> *** The MPI_Init_thread() function was called before MPI_INIT was invoked.
> *** This is disallowed by the MPI standard.
> *** Your MPI job will now abort.

Interesting error message, as it is not true that this is disallowed.

-- Reuti