> Are you referring to this SEGV error here? I am assuming this is OMPI
> 1.1.1 so you are using rsh PLS to launch your executables (using loose
> integration).

Oops, no: I wanted to compile OMPI 1.2.3 against OFED 1.1, and these are the errors I get. This problem has nothing to do with SGE anymore (Jeff suggested that I migrate to a "slightly" newer version, so I tried and failed with these errors). Should I start a whole new thread on this, since the SGE question is solved?

> >-sh-3.00$ ompi/bin/mpirun -d -np 2 -H node03,node06 hostname
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] connect_uni: connection not allowed
> > [headnode:23178] [0,0,0] setting up session dir with
> > [headnode:23178] universe default-universe-23178
> > [headnode:23178] user me
> > [headnode:23178] host headnode
> > [headnode:23178] jobid 0
> > [headnode:23178] procid 0
> > [headnode:23178] procdir:
> > /tmp/openmpi-sessions-me@headnode_0/default-universe-23178/0/0
> > [headnode:23178] jobdir:
> > /tmp/openmpi-sessions-me@headnode_0/default-universe-23178/0
> > [headnode:23178] unidir:
> > /tmp/openmpi-sessions-me@headnode_0/default-universe-23178
> > [headnode:23178] top: openmpi-sessions-me@headnode_0
> > [headnode:23178] tmp: /tmp
> > [headnode:23178] [0,0,0] contact_file
> > /tmp/openmpi-sessions-me@headnode_0/default-universe-23178/universe-setup.txt
> > [headnode:23178] [0,0,0] wrote setup file
> > [headnode:23178] *** Process received signal ***
> > [headnode:23178] Signal: Segmentation fault (11)
> > [headnode:23178] Signal code: Address not mapped (1)
> > [headnode:23178] Failing at address: 0x1
> > [headnode:23178] [ 0] /lib64/tls/libpthread.so.0 [0x39ed80c430]
> > [headnode:23178] [ 1] /lib64/tls/libc.so.6(strcmp+0) [0x39ecf6ff00]
> > [headnode:23178] [ 2]
> > /home/me/ompi/lib/openmpi/mca_pls_rsh.so(orte_pls_rsh_launch+0x24f)
> > [0x2a9723cc7f]
> > [headnode:23178] [ 3] /home/me/ompi/lib/openmpi/mca_rmgr_urm.so
> > [0x2a9764fa90]
> > [headnode:23178] [ 4] /home/me/ompi/bin/mpirun(orterun+0x35b)
> > [0x402ca3]
> > [headnode:23178] [ 5] /home/me/ompi/bin/mpirun(main+0x1b) [0x402943]
> > [headnode:23178] [ 6] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> > [0x39ecf1c3fb]
> > [headnode:23178] [ 7] /home/me/ompi/bin/mpirun [0x40289a]
> > [headnode:23178] *** End of error message ***
> > Segmentation fault
>
> So is it true that SEGV only occurred under the SGE environment and not
> a normal environment? If it is then I am baffled because starting rsh
> pls under the SGE environment in 1.1.1 should be no different than
> starting rsh pls without SGE.

Nope. The config.log and "ompi_info --all" output are attached a few posts back. Sorry for the confusion about the topic.

Thank you.