Hi Barry

Under SGE 6.2u5, grid engine treats a failure to add a supplementary gid to
a job's process as an error.  Unfortunately the limit comes from NGROUPS_MAX,
which on OS X is set to 16.  On Linux it's 65K.

So, if a user has >14 group memberships, adding the supplemental gid will
fail and be treated as an error.

For SGE 6.2u5 the workaround is to ensure that grid engine users have fewer
than 14 group memberships.
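
A quick way to audit accounts against that rule is sketched below. This is only an illustration: the threshold of 14 is taken from the advice above and is not an SGE constant, the function names are made up for this example, and it assumes os.getgrouplist reports the same memberships that `id` shows on your hosts.

```python
import os
import pwd

# Assumption: threshold taken from the workaround above, not read from SGE.
ASSUMED_SAFE_GROUP_COUNT = 14


def too_many_groups(group_ids, limit=ASSUMED_SAFE_GROUP_COUNT):
    """Return True when the number of distinct gids is not below the limit."""
    return len(set(group_ids)) >= limit


def groups_for_user(username):
    """Return the gids the OS reports for a user (Unix only)."""
    entry = pwd.getpwnam(username)
    return os.getgrouplist(username, entry.pw_gid)


def report(usernames):
    """Print one line per user showing the group count and a verdict."""
    for name in usernames:
        gids = groups_for_user(name)
        verdict = "too many" if too_many_groups(gids) else "ok"
        print(f"{name}: {len(set(gids))} groups ({verdict})")
```

Calling report(["bmcinnes", "ppegion"]) on a submit host would list which accounts need trimming before they run jobs.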

Thanks
Stephen
############################################################################
# Stephen Dennis : Senior Sales Engineer
# Univa Corporation:  http://univa.com
# [email protected] : 310 310 0738 : skype stephendennis.com
############################################################################
________________________________________
From: [email protected] [[email protected]] On Behalf Of 
Barry McInnes [[email protected]]
Sent: Friday, March 18, 2011 11:31 AM
To: [email protected]
Subject: [gridengine users] ge62u5 mac 10.6 too many group ids

Hi,
When running qmaster on 10.5 we get submit errors for users who are in
too many groups, so the job fails. Some users in fewer groups (6-8) can
run jobs. For example, the first user below cannot submit; the second can:
[mac27:~/SGE] bmcinnes% id bmcinnes
uid=2101(bmcinnes) gid=200(climate)
groups=200(climate),1953027852(PSD\sysadmins),829578209(PSD\domain
admins),801476512(PSD\log1),204(_developer),100(_lpoperator),98(_lpadmin),81(_appserveradm),80(admin),79(_appserverusr),62(netaccounts),12(everyone),1207(rain),1100(systems),998(lmadmin),900(sawrtrs),400(cuac),2109053379(PSD\domain
users),1858905114(PSD\denied rodc password replication
group),1358185131(PSD\it_wikis),404(com.apple.sharepoint.group.3),928177777(PSD\coopcall),401(com.apple.access_screensharing),403(com.apple.sharepoint.group.2),402(com.apple.sharepoint.group.1)
[mac27:~/SGE] bmcinnes%
[mac27:~/SGE] bmcinnes%
[mac27:~/SGE] bmcinnes% id ppegion
uid=3009(ppegion) gid=200(climate)
groups=200(climate),62(netaccounts),12(everyone),594189391(PSD\climate),247203070(PSD\psd1group),2109053379(PSD\domain
users),404(com.apple.sharepoint.group.3),928177777(PSD\coopcall),403(com.apple.sharepoint.group.2),402(com.apple.sharepoint.group.1)
[mac27:~/SGE] bmcinnes%
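
For anyone who wants to count memberships from saved `id` output like the above, here is a rough sketch. It assumes the groups= field is a comma-separated list of entries of the form gid(name), and that names may contain spaces or backslashes as the AD groups do here. The function name count_groups is made up for this example.

```python
import re

# Each entry looks like 200(climate); match the numeric gid at the start
# of the field or right after a comma, allowing for wrapped whitespace.
GID_PATTERN = re.compile(r"(?:^|,)\s*(\d+)\(")


def count_groups(id_output):
    """Count distinct gids listed in the groups= field of `id` output."""
    _, found, groups_field = id_output.partition("groups=")
    if not found:
        return 0
    return len(set(GID_PATTERN.findall(groups_field)))
```

Feed it the full text printed by `id username`; wrapped lines are fine because whitespace after a comma is skipped.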

Mac OS adds its own group memberships to users, on top of our own group
settings.

When we go to Mac 10.6 Intel, the qmaster server fails to put any nodes
in service due to the same error, so users have no chance to even
submit jobs:

03/16/2011 13:41:49|worker|g5s2|W|rescheduling job 15015.1
03/16/2011 13:41:49|worker|g5s2|E|queue quad marked QERROR as result of
job 15015's failure at host mac40.psd.esrl.noaa.gov
03/16/2011 14:02:49|worker|g5s2|W|job 15015.1 failed on host
mac65.psd.esrl.noaa.gov general before job because: 03/16/2011 14:02:49
[0:22624]: can't set additional group id (uid=0, euid=0): the user
already has too many group ids
03/16/2011 14:02:49|worker|g5s2|W|rescheduling job 15015.1
03/16/2011 14:02:49|worker|g5s2|E|queue quad marked QERROR as result of
job 15015's failure at host mac65.psd.esrl.noaa.gov
03/16/2011 14:08:19|worker|g5s2|W|job 15015.1 failed on host
mac18.psd.esrl.noaa.gov general before job because: 03/16/2011 14:08:19
[0:42391]: can't set additional group id (uid=0, euid=0): the user
already has too many group ids
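
To see which execution hosts are hitting this, the qmaster messages can be scanned with something like the sketch below. It assumes each entry sits on a single line in the actual log file (the lines above are wrapped by the mail client) and that the wording matches the excerpt. The function name is made up for this example.

```python
import re

# Text taken from the log excerpt above.
ERROR_TEXT = "too many group ids"
HOST_PATTERN = re.compile(r"failed on host\s+(\S+)")


def hosts_with_group_id_errors(log_text):
    """Return hosts, in first-seen order, whose jobs failed with the gid error."""
    hosts = []
    for line in log_text.splitlines():
        if ERROR_TEXT not in line:
            continue
        match = HOST_PATTERN.search(line)
        if match and match.group(1) not in hosts:
            hosts.append(match.group(1))
    return hosts
```

Pass it the contents of the qmaster messages file; the result is the list of hosts to check for users with long group lists.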

We are using Active Directory authentication, and the Mac clients are
all 10.6.6.
We tried OGE 62u7 with the same group id error.

We are currently back at 10.5 PPC qmaster server to get jobs submitted
and run.

Any help appreciated.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users

