On Fri, Jun 15, 2012 at 10:11 AM, Rayson Ho <[email protected]> wrote:
> On Fri, Jun 15, 2012 at 12:01 PM, Michael Coffman
> <[email protected]> wrote:
> > From the qmaster messages file:
> > 06/14/2012 21:29:39|worker|gemaster|W|job 3885.1 failed on host
> > cs428.ftc.avagotech.net general before job because: 06/14/2012 21:29:37
> > [20339:8436]: can't open file job_pid: Permission denied
> >
> > I checked a job_pid file on a currently running job on the system that had
> > the above errors, permission down the entire tree seems fine and here is the
> > job_id file:
> >
> > -rw-r--r-- 1 grid grid 6 Jun 14 17:40 job_pid
>
> Is your execd spool dir on NFS or local??

Local.

> Also, does it happen to all nodes or just a node or queue?

Happened on 2 different nodes. Not all jobs caused this.

> Rayson
>
> >
> > Any clues? Is the path perhaps hard coded into sge_shepherd for this
> > file?
> >
> > Thanks.
> > --
> > -MichaelC

--
-MichaelC
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
