On 15.06.2012 at 18:01, Michael Coffman wrote:

> I am trying to update my sge_execd and sge_shepherd binaries. Based on
> recent emails, I figured I could drop the GE2011.11 bits into place and they
> would work fine. I am however having issues:
>
> My grid environment is:
>
> Current Version - SGE - 6.2u5
> SGE_CELL=ftcrnd
> SGE_ROOT=/opt/grid-6.2u5
> SGE_CLUSTER_NAME=ftcrnd
>
> Binary path is /opt/grid/bin/lx24-amd64.
>
> I had to make a link in /opt/grid/bin for linux-x64 to get things to work.

Yep, this "lx24-amd64" is compiled into the binaries.

> I used the following commands and it did indeed update live and the running
> processes seemed happy and all seemed to be working fine:
>
> gbits=/opt/sa/tmp/gbits
> service sgeexecd softstop
> cd /opt/grid/bin
> ln -s lx24-amd64 linux-x64
> cd lx24-amd64
> mv sge_shepherd sge_shepherd.old
> mv sge_execd sge_execd.old
> cp $gbits/sge_shepherd .
> cp $gbits/sge_execd .
> service sgeexecd start
>
> Since yesterday though I have had a couple of jobs fail and put their queue
> into an error state.
>
> Mail from the failing job:
> Shepherd error:
>
> 06/14/2012 21:29:37 [20339:8436]: can't open file job_pid: Permission denied
>
> From the qmaster messages file:
> 06/14/2012 21:29:39|worker|gemaster|W|job 3885.1 failed on host
> cs428.ftc.avagotech.net general before job because: 06/14/2012 21:29:37
> [20339:8436]: can't open file job_pid: Permission denied
>
> I checked a job_pid file on a currently running job on the system that had
> the above errors, permissions down the entire tree seem fine and here is the
> job_pid file:
>
> -rw-r--r-- 1 grid grid 6 Jun 14 17:40 job_pid
>

This file usually goes to the spool directory of the job, where the sgeadmin
user must have write access. Under which account is sge_execd actually
running? I get:

$ ps -e -o user,ruser,command | grep sge
sgeadmin root /usr/sge/bin/lx24-amd64/sge_execd

-- Reuti

> Any clues? Is the path perhaps hard coded into sge_shepherd for this file?
>
> Thanks.
> --
> -MichaelC
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
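
As a rough way to follow up on the write-access question above, something like the sketch below could be run as the admin account (grid in the listing quoted above, sgeadmin in Reuti's setup). The helper name report_access is made up for illustration, and the directory you pass in is an assumption: the execd spool location differs per site, so look it up in your own configuration first.

```shell
#!/bin/sh
# Sketch only. report_access is a hypothetical helper, not part of Grid Engine.
# It prints the owner, group, and mode of DIR and says whether the user
# running this script can write to it.
report_access() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Show ownership and permissions of the directory itself.
    ls -ld "$dir"
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}
```

For example, after switching to the admin account, call report_access with the active_jobs directory below the execution host's spool directory (the exact path is site-specific and not given in this thread). A "not writable" result there would be consistent with the "can't open file job_pid: Permission denied" error, though it does not prove that this is the cause.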
