Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-28 Thread Dave Love
berg...@merctech.com writes: In a continuing effort to resolve the problem where the queue gets put into an error state with the message (in qstat): "can't open file /opt/sge/6.2u5/default/spool/r820-1/active_jobs/93629.1/pe_hostfile: Permission denied" I've enabled

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-28 Thread Dave Love
Jewell, Chris c.p.jew...@massey.ac.nz writes: The message is from a failure of setuid(2) or similar. I don't know if it's a libc bug that errno seems not to be set (Success) as it should be. The two possible cases are: EAGAIN The uid does not match the current uid and uid brings

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-27 Thread bergman
In the message dated: Fri, 23 Aug 2013 12:59:12 -0400, The pithy ruminations from berg...@merctech.com on Re: [gridengine users] Random queue errors, and suspect pe_hostfiles were: In a continuing effort to resolve the problem where the queue gets put into an error state with the message

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-26 Thread Reuti
On 26.08.2013 at 01:12, Jewell, Chris wrote: The message is from a failure of setuid(2) or similar. I don't know if it's a libc bug that errno seems not to be set (Success) as it should be. The two possible cases are: EAGAIN The uid does not match the current uid and uid brings

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-25 Thread Jewell, Chris
The message is from a failure of setuid(2) or similar. I don't know if it's a libc bug that errno seems not to be set (Success) as it should be. The two possible cases are: EAGAIN The uid does not match the current uid and uid brings process over its RLIMIT_NPROC resource
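
For anyone following the thread, here is a rough sketch of how the errno from a failed setuid(2) call could be inspected. It is not taken from Grid Engine or from the original message: the names describe_setuid_failure and try_setuid are made up for illustration. The excerpt above names only EAGAIN before it is cut off; handling EPERM as well is an assumption based on the setuid(2) man page, not on the elided text.

```python
import errno
import os


def describe_setuid_failure(err):
    """Map an errno value from a failed setuid(2) call to a short explanation.

    Hypothetical helper for illustration only; not part of Grid Engine.
    """
    if err == errno.EAGAIN:
        # Case quoted in the thread: the target uid would go over RLIMIT_NPROC.
        return "EAGAIN: target uid would exceed its RLIMIT_NPROC limit"
    if err == errno.EPERM:
        # Assumption: the other case documented in setuid(2).
        return "EPERM: caller lacks the privilege to switch to the target uid"
    if err == 0:
        # The odd situation reported in the thread: failure, errno "Success".
        return "errno 0 (Success): the call failed but errno was not set"
    return "unexpected errno %d (%s)" % (err, os.strerror(err))


def try_setuid(uid):
    """Attempt setuid(uid); return None on success, else an explanation."""
    try:
        os.setuid(uid)
    except OSError as exc:
        return describe_setuid_failure(exc.errno)
    return None
```

Calling try_setuid() with a uid the process is not allowed to assume should come back with the EPERM explanation on an unprivileged account; the EAGAIN branch only triggers when the target user is already at its process limit.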

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-23 Thread William Hay
On 01/08/13 04:09, Jewell, Chris wrote: Hello all, A while since I posted here, so good to be back! My installation of GE 8.1.3 from the Scientific Linux 6.3 RPM repo has started misbehaving of late, since I introduced a share tree policy the

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-23 Thread Dave Love
Sorry, I apparently missed this before. William Hay w@ucl.ac.uk writes: On 01/08/13 04:09, Jewell, Chris wrote: Hello all, A while since I posted here, so good to be back! My installation of GE 8.1.3 from the Scientific Linux 6.3 RPM

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-23 Thread bergman
In the message dated: Fri, 23 Aug 2013 01:28:34 -, The pithy ruminations from Jewell, Chris on Re: [gridengine users] Random queue errors, and suspect pe_hostfiles were: I started with a search of the SGE mailing list archive, and found your post. :) Have you found

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-22 Thread bergman
In the message dated: Thu, 01 Aug 2013 03:09:56 -, The pithy ruminations from Jewell, Chris on [gridengine users] Random queue errors, and suspect pe_hostfiles were: Hello all, A while since I posted here, so good to be back! My installation of GE 8.1.3 from the Scientific Linux

Re: [gridengine users] Random queue errors, and suspect pe_hostfiles

2013-08-22 Thread Jewell, Chris
I started with a search of the SGE mailing list archive, and found your post. :) Have you found a solution? Hello all, Sorry for the long leave of absence. I've been thoroughly testing my system for this issue. I checked my RAID1 for consistency, and performed an xfs_repair to make