Hi,

On 12.09.2013 at 17:50, Edward Ned Harvey wrote:

> I'm having a heck of a time figuring out why.
>
> On rhel6, /etc/init.d/sgeexecd.myclustername script is run at startup, or via
> sudo after startup.
> sudo /etc/init.d/sgeexecd.myclustername start
>
> It just says "OK" and no other output, yet the daemon isn't running.
>
> I added the "-x" option to "#!/bin/sh -x" so I can debug it …
> I see it gets up to the "exec 1> /dev/null 2>&1" which effectively eliminates
> any further debug output…
> So I comment out that line and run again.
> Now I can see it launches sge_execd, and the exit status is 0, so the "touch"
> on the following line does indeed create the lock file.
> The "qping" loop immediately after that in the script … exits with 0 status,
> on the first try.
>
> And still, there is no process running at the end of that script.
>
> I modify the startup script to perform the qping 5 times unconditionally. I
> see the first time, it has exit value 0, and all subsequent times, it has
> exit value 1. This means it is indeed running for a very short period of
> time, but then it dies in less than a second.
>
> Any ideas what the problem is?

Please have a look at your /tmp. The starting execd will write the cause of
not being able to start in a file therein.

-- Reuti

> This is a machine that we recently reinstalled the OS, and we're reinstalling
> sgeexecd by the same process it was previously installed.
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
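
To follow the /tmp suggestion above, something like the sketch below may help spot the file right after a failed start. The helper name list_recent_files is made up for illustration and is not part of Grid Engine. It only lists regular files touched in the last few minutes, assuming a find that supports -maxdepth and -mmin (GNU find on RHEL does). On many installations the file is named along the lines of execd_messages.<pid>, but treat that name as an assumption and check whatever turns up.

```shell
# Hypothetical helper (not part of Grid Engine): list regular files directly
# inside a directory that were modified within the last N minutes.
#   $1 = directory to inspect (defaults to /tmp)
#   $2 = age window in minutes (defaults to 10)
list_recent_files() {
    dir=${1:-/tmp}
    minutes=${2:-10}
    # -maxdepth 1 keeps the search out of subdirectories;
    # -mmin -N matches files modified less than N minutes ago.
    find "$dir" -maxdepth 1 -type f -mmin "-$minutes" 2>/dev/null
}
```

Run the init script, then call `list_recent_files /tmp 5` and read any file that shows up, for example with `cat` or `less`.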
