On Fri, Jun 15, 2012 at 10:11 AM, Rayson Ho <[email protected]> wrote:

> On Fri, Jun 15, 2012 at 12:01 PM, Michael Coffman
> <[email protected]> wrote:
> > From the qmaster messages file:
> > 06/14/2012 21:29:39|worker|gemaster|W|job 3885.1 failed on host
> > cs428.ftc.avagotech.net general before job because: 06/14/2012 21:29:37
> > [20339:8436]: can't open file job_pid: Permission denied
> >
> > I checked a job_pid file on a currently running job on the system that had
> > the above errors. Permissions down the entire tree seem fine, and here is
> > the job_pid file:
> >
> > -rw-r--r-- 1 grid  grid       6 Jun 14 17:40 job_pid
>
> Is your execd spool dir on NFS or local?

Local.


> Also, does it happen to all nodes or just a node or queue?
>
>
It happened on 2 different nodes, and not every job triggered it.
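
For anyone who wants to repeat the "permissions down the entire tree" check on the affected nodes, here is a rough sketch of one way to do it. It is not part of Grid Engine; the function name path_permissions is made up for illustration, and the path to pass in is whatever your local execd spool directory uses for the job's job_pid file. It only lists the mode, uid and gid of every level from the root down to the file, so a directory missing the search (x) bit for the relevant user should stand out.

```python
import os
import stat
import sys


def path_permissions(path):
    """Return (component, mode, uid, gid) for every level from the root down to path."""
    path = os.path.abspath(path)
    parts = []
    current = path
    # Walk upward from the file to the filesystem root, collecting each level.
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the root
            break
        current = parent
    rows = []
    # Report top-down so the output reads like the directory tree.
    for component in reversed(parts):
        st = os.stat(component)
        rows.append((component, stat.filemode(st.st_mode), st.st_uid, st.st_gid))
    return rows


if __name__ == "__main__":
    # Usage (path is site-specific): python3 this_script.py <path to job_pid>
    if len(sys.argv) > 1:
        for component, mode, uid, gid in path_permissions(sys.argv[1]):
            print(f"{mode} {uid:>6} {gid:>6} {component}")
```

The listing reflects the state at the time it is run, so it may look fine on a running job even if a failed job hit a different state. Running it as the same user that opens the file (which user that is in this case is an assumption to verify) gives the most relevant picture.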


> Rayson
>
>
>
> >
> > Any clues?    Is the path perhaps hard coded into sge_shepherd for this
> > file?
> >
> > Thanks.
> > --
> > -MichaelC
> >
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
> >
>



-- 
-MichaelC