Hi Reuti

As far as I am concerned, you SGE users "own" the SGE support, so feel free to submit a patch!
Ralph

> On Sep 13, 2017, at 9:10 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>
> Hi,
>
> I wonder whether it ever came up in discussion that SGE can behave
> similarly to Torque/PBS regarding the mangling of hostnames. It's similar
> to https://github.com/open-mpi/ompi/issues/2328, in that a node can have
> multiple network interfaces, each with a unique name. SGE's operation can
> be routed to a specific network interface by the use of a file:
>
> $SGE_ROOT/$SGE_CELL/common/host_aliases
>
> which has the format:
>
> <sge-name of the node> <one or more blanks> <real long or short hostname>
>
> Hence the generated $PE_HOSTFILE lists the name known to SGE, although
> the `hostname` command provides the real name. In this case Open MPI would
> start a `qrsh -inherit …` call instead of forking, as it thinks that these
> are different machines (assuming an allocation_rule of $PE_SLOTS, so that
> `mpiexec` is supposed to be on the same machine as the started tasks).
>
> I tried the "old" way of providing a start_proc_args to the PE that creates
> a symbolic link to `hostname` in $TMPDIR, so that an adjusted `hostname`
> call is available inside the job script, but obviously Open MPI calls
> gethostname() directly rather than an external binary.
>
> So I mangled the hostname in the created machinefile in the job script to
> feed an "adjusted" $PE_HOSTFILE to Open MPI, and then it works as intended:
> Open MPI forks.
>
> Does anyone else need such a patch in Open MPI, and is it suitable to be
> included?
>
> -- Reuti
>
> PS: Only the headnodes have more than one network interface in our case,
> hence it didn't come to my attention until now, when the need arose to
> also use some cores on the headnodes. They are known internally to SGE as
> "login" and "master", but the external names may be "foo" and "baz", which
> is what gethostname() returns.
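For anyone wanting to try the job-script workaround described above before a patch exists, here is a minimal sketch in Python. It assumes the host_aliases layout exactly as quoted (SGE name first, real hostname second) and that the first whitespace-separated field of each $PE_HOSTFILE line is the host name. The function names are made up for illustration and are not part of Open MPI or SGE.

```python
def parse_host_aliases(text):
    """Map each SGE-internal node name to the real hostname.

    Assumes the layout quoted above: <sge-name> <blanks> <real hostname>.
    Blank lines and lines starting with '#' are skipped.
    """
    aliases = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        aliases[fields[0]] = fields[1]
    return aliases


def rewrite_hostfile(hostfile_text, aliases):
    """Replace the host name (assumed to be the first field of each line)
    with its real hostname; names without an alias are left unchanged."""
    out = []
    for line in hostfile_text.splitlines():
        fields = line.split()
        if fields:
            fields[0] = aliases.get(fields[0], fields[0])
        out.append(" ".join(fields))
    return "\n".join(out) + "\n"
```

In a job script one would presumably read $SGE_ROOT/$SGE_CELL/common/host_aliases and $PE_HOSTFILE, write the result of rewrite_hostfile() to a file in $TMPDIR, and point PE_HOSTFILE at that file before calling mpiexec. Note that this does no short/long hostname normalization, so the alias file has to contain whatever gethostname() actually returns on the node.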
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users