Dear Pak Lui,

I can delete the (SGE) job with qdel -f such that it disappears from the job list, but the application processes keep running, including the shepherds. I have to kill them with kill -15.
For some reason the kill -15 does not reach mpirun. (We use such a parameter to mpirun on our myrinet mx nodes with mpich, that's why I asked).

Just to confirm, there is no configure directive specific to gridengine when building openmpi?

Thanks

henk

> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Pak Lui
> Sent: 23 July 2007 15:16
> To: Open MPI Users
> Subject: Re: [OMPI users] sge qdel fails
>
> Hi Henk,
>
> The sge script should not require any extra parameter. The
> qdel command should send the kill signal to mpirun and also
> remove the SGE allocated tmp directory (in something like
> /tmp/174.1.all.q/) which contains the OMPI session dir for
> the running job, and in turns would cause orted and the user
> processes to exit.
>
> Maybe you could try qdel -f <jid> to force delete from the
> sge_qmaster, in case when sge_execd does not respond to the
> delete request by the sge_qmaster?
>
> SLIM H.A. wrote:
> > I am using OpenMPI 1.2.3 with SGE 6.0u7 over InfiniBand (OFED 1.2),
> > following the recommendation in the OpenMPI FAQ
> >
> > http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
> >
> > The job runs but when the user wants to delete the job with the qdel
> > command, this fails. Does the mpirun command
> >
> > mpirun -np $NSLOTS ./exe
> >
> > in the sge script require extra parameters?
> >
> > Thanks for any advice
> >
> > Henk
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
>
> - Pak Lui
> pak....@sun.com
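
For reference, the manual cleanup described at the top of this message (sending signal 15 to the leftover processes) can be sketched as a small shell helper. This is only an illustration, assuming the PID of a leftover process on the node is already known; the function name term_then_kill and the grace period are hypothetical and are not part of SGE or Open MPI.

```shell
#!/bin/sh
# term_then_kill PID [GRACE_SECONDS]
# Hypothetical helper (not an SGE or Open MPI tool): send SIGTERM
# (signal 15) to PID, wait up to GRACE_SECONDS for it to exit, and
# fall back to SIGKILL if it is still alive afterwards.
term_then_kill() {
    pid=$1
    grace=${2:-5}

    # Nothing to do if the process is already gone
    kill -0 "$pid" 2>/dev/null || return 0

    # Polite request first: same as "kill -15 PID"
    kill -TERM "$pid" 2>/dev/null

    waited=0
    while [ "$waited" -lt "$grace" ]; do
        # kill -0 only checks whether the process still exists
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Still alive after the grace period: force it
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

It would be called as, for example, term_then_kill 12345 10 on the node where the stray process lives. It does not explain why the signal from qdel fails to reach mpirun; it only automates the workaround.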