Dear all,
We have noticed a very strange problem every time we add an existing
user to a secondary group.
We manage our users in LDAP. When we add a user to a new group and then
type the "id" and "groups" commands we see that the user was indeed
added to the new group. The output of "getent group" confirms the
membership as well.
For example, for the user "thekla", whose primary group is "cstrc" and
who was also added to the group "build", we get:
[thekla@node01 ~]$ id
uid=2017(thekla) gid=5000(cstrc) groups=5000(cstrc),10257(build)
[thekla@node01 ~]$ groups
cstrc build
[thekla@node01 ~]$ getent group | grep build
build:*:10257:thekla
The above output is correct; it is what we get when we ssh directly to
one of the compute nodes.
However, when we submit a job on the nodes (getting access through SLURM
rather than direct ssh), the new group the user was added to is
missing:
[thekla@prometheus ~]$ salloc -N1
salloc: Granted job allocation 8136
[thekla@node01 ~]$ id
uid=2017(thekla) gid=5000(cstrc) groups=5000(cstrc)
[thekla@node01 ~]$ groups
cstrc
At the same time, "getent group" still shows the correct result:
[thekla@node01 ~]$ getent group | grep build
build:*:10257:thekla
This problem appears only when we get access through SLURM, i.e. when we
run a job.
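In case it helps with diagnosing this, the mismatch can be checked from
inside an allocation with a small script. This is only a sketch: it
assumes GNU coreutils "id", which reports the groups of the calling
process when no user name is given and does a fresh lookup when one is
given. The function name missing_groups is made up for illustration.

```shell
#!/bin/sh
# Print every GID in list $2 that does not appear in list $1.
# Both arguments are space-separated lists of numeric GIDs.
# (missing_groups is a hypothetical helper, not part of any tool.)
missing_groups() {
    for g in $2; do
        case " $1 " in
            *" $g "*) ;;              # GID present in both lists
            *) printf '%s\n' "$g" ;;  # GID found by lookup, absent from process
        esac
    done
}

# GIDs carried by the current process (inherited from whatever started it).
proc_gids=$(id -G)
# GIDs that a fresh lookup for the same user name returns right now.
lookup_gids=$(id -G "$(id -un)")

missing_groups "$proc_gids" "$lookup_gids"
```

Over direct ssh this should print nothing. Inside the salloc shell from
the example above we would expect it to print 10257, the GID of "build".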
Has anyone faced this problem before? The only workaround we have found
is to restart the SLURM service on the compute nodes every time we
add a user to a new group.
Thanks,
Thekla