Hi,

On 04.10.2011 at 12:05, Schmidt, Burkhard wrote:

Thanks for your answer.

On 29.09.2011 at 20:50, Reuti wrote:

Hi,

On 28.09.2011 at 15:41, Schmidt, Burkhard wrote:

I'm running SGE 6.2u5 on an Xserve cluster running Mac OS X Server v10.6 Snow Leopard with Open Directory network accounts. All users belong to the same default group staff.

Is the complete cluster running OS X, or only the master node, or only the slaves?

The complete cluster is running Mac OS X Server v10.6 Snow Leopard.

There were issues in the past caused by an account having too many additional groups, but I'm not sure whether that applies here, as the error message was different.

http://gridengine.org/pipermail/users/2011-March/000447.html

Nevertheless: can you check the group count of the users in question?

I did so, and it is less than 14 on the execution hosts for all of my users. But it is more than 14 on the head node, due to the presence of 31 com.apple.sharepoint.* derived group memberships with IDs in the range 101 -- 178.

However, as all my users (not only those added after the upgrade to 10.6) have the same (large number of) group memberships on the head node, this doesn't seem to be the origin of my problem.
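
To compare the counts between the head node and the execution hosts, the check could be scripted roughly as follows. This is only a sketch: the helper names `group_ids` and `over_limit` are made up for illustration, and the default threshold of 14 is simply the figure mentioned in this thread, not a documented limit. It assumes Python's `os.getgrouplist` resolves the Open Directory memberships the same way the system does.

```python
import os
import pwd


def group_ids(username):
    """Return the sorted group IDs the system reports for a user."""
    entry = pwd.getpwnam(username)
    # getgrouplist always includes the primary group passed as second argument.
    return sorted(set(os.getgrouplist(username, entry.pw_gid)))


def over_limit(usernames, limit=14):
    """Map each user with more than `limit` groups to the group count.

    The default of 14 is the figure discussed in this thread (an assumption),
    not a value taken from the Grid Engine documentation.
    """
    counts = {name: len(group_ids(name)) for name in usernames}
    return {name: n for name, n in counts.items() if n > limit}
```

Running `over_limit([...])` with the same list of user names on the head node and on one execution host would show whether the two hosts disagree for the affected accounts.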

What's confusing me is the shepherd error message "[1319:18564]: can't open file job_pid: Permission denied". These files have permissions 644, the owner is the local user running Grid Engine, and the group is the local admin group. So they should be readable by everybody.

I was on vacation for a few days, hence the delay. Yes, the files are readable, but what about the enclosing directory? Usually it's something like /var/spool/sge/node01/active_jobs/12345.1 or similar in the spool directory of the node.

Do you have the spool directory local on each machine or in a shared space?
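
A file with mode 644 can still be unreachable if any directory above it lacks the search (x) bit for the user in question. A minimal sketch of such a check is below. The function name `untraversable_dirs` is hypothetical, and it only looks at the "other" permission class, on the assumption that the job user is neither the owner of the spool directories nor a member of their group. ACLs are not considered.

```python
import os
import stat


def untraversable_dirs(path):
    """List the directories leading to `path` that lack o+x, from / downwards.

    Only the "other" permission class is inspected (a simplification);
    owner/group membership and ACLs are ignored.
    """
    current = os.path.abspath(path)
    if not os.path.isdir(current):
        # Start at the directory that contains the file.
        current = os.path.dirname(current)
    blocked = []
    while True:
        mode = os.stat(current).st_mode
        if not mode & stat.S_IXOTH:
            blocked.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return blocked[::-1]
```

Called on an execution host with the path of the job_pid file in the active_jobs directory, an empty result would suggest the directory chain is not the problem, at least for the "other" class.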

-- Reuti
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
