I posted earlier (Dec 28, 2015) about this issue and was told to check that
the slurmdbd and slurmctld daemons were running as the same user. They
weren't at that time. I thought making that change would resolve the
problem, but it did not.

These daemons are now both running as root:
root      6463     1  0 17:01 ?        00:00:00 /share/apps/slurm-15.08.8/sbin/slurmdbd
root      6743     1  0 17:05 ?        00:00:00 /share/apps/slurm-15.08.8//sbin/slurmctld

On the compute node:
root      7874     1  0 17:03 ?        00:00:00 /share/apps/slurm-15.08.8//sbin/slurmd

Upon further testing, I only need to restart the slurmctld daemon for the
new user to be able to run a job. So it's not as big a deal to me now,
but it is different from older versions of slurm.

I'm adding a new user to an existing account and before I restart slurmctld
I see this in the slurmctld log when I try to "srun date" as that user:

[2016-03-30T17:04:50.107] error: User 9101 not found
[2016-03-30T17:04:50.107] _job_create: invalid account or partition for user 9101, account '(null)', and partition 'debug'
[2016-03-30T17:04:50.142] _slurm_rpc_allocate_resources: Invalid account or account/partition combination specified
[2016-03-30T17:05:11.381] Terminate signal (SIGINT or SIGTERM) received

Oddly, the account is reported as '(null)'.
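
In case it helps anyone hitting the same thing, here is a rough sketch of how the "User ... not found" lines could be pulled out of the slurmctld log. It assumes the log format shown above. The function name missing_uids is made up for illustration, and the log path is site-specific, so the function just reads from stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Slurm): print each numeric UID that
# slurmctld reported as "not found", one per line, without duplicates.
# Assumes log lines of the form:
#   [2016-03-30T17:04:50.107] error: User 9101 not found
missing_uids() {
    # Keep only the UID from matching lines, then de-duplicate.
    sed -n 's/^\[[^]]*] error: User \([0-9][0-9]*\) not found.*/\1/p' | sort -u
}
```

It would be used as `missing_uids < /path/to/slurmctld.log`, and each UID printed can then be checked with `getent passwd <uid>` to confirm that the controller host itself can resolve the user.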

Here is the command to add the user:
sacctmgr add user johndoe defaultaccount=boris partition=low,med,high,debug cluster=jane

slurm-15.08.8 on Ubuntu 14.04.4

Like I said, I can live with it since it's only one restart.

Thanks,
Terri
