I'm getting an error message early on:

[csclprd3-0-11:17355] [[36373,0],17] plm:rsh: using "/opt/sge/bin/lx-amd64/qrsh -inherit -nostdin -V -verbose" for launching
unable to write to file /tmp/285019.1.verylong.q/qrsh_error: No space left on device
[csclprd3-6-10:18352] [[36373,0],21] plm:rsh: using "/opt/sge/bin/lx-amd64/qrsh -inherit -nostdin -V -verbose" for launching
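
The "No space left on device" part suggests that the filesystem holding /tmp on that node may be full. Below is a minimal sketch for checking this from inside a job, assuming Python 3 is available on the compute nodes. The helper name report_free_space is hypothetical, and the fallback to /tmp when TMPDIR is unset is an assumption, not something the scheduler or Open MPI guarantees.

```python
import os
import shutil
import socket


def report_free_space(path=None):
    """Return (host, path, free_bytes, total_bytes) for a scratch directory."""
    # Use TMPDIR if the job environment defines it; otherwise fall back
    # to /tmp (an assumption made for this sketch).
    if path is None:
        path = os.environ.get("TMPDIR", "/tmp")
    usage = shutil.disk_usage(path)
    return socket.gethostname(), path, usage.free, usage.total


if __name__ == "__main__":
    host, path, free, total = report_free_space()
    mib = 2 ** 20
    print(f"{host}: {path} has {free / mib:.1f} MiB free of {total / mib:.1f} MiB")
```

Running this on each node that reports the error would show whether TMPDIR is set there and how much space is left on the filesystem behind it.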
According to the OpenMPI FAQ:

'You may want to alter other parameters, but the important one is "control_slaves", specifying that the environment has "tight integration". Note also the lack of a start or stop procedure. The tight integration means that mpirun automatically picks up the slot count to use as a default in place of the '-np' argument, picks up a host file, spawns remote processes via 'qrsh' so that SGE can control and monitor them, and creates and destroys a per-job temporary directory ($TMPDIR), in which Open MPI's directory will be created (by default).'

When I look at my OpenMPI environment, there is no $TMPDIR environment variable.

How does OpenMPI determine where to put the "per-job temporary directory ($TMPDIR)"? Does it use an SoGE-defined environment variable? Is the host file used by OpenMPI created in this $TMPDIR temporary directory?

Bill L.