Sorry, you just said they were; I somehow misread that. Try increasing the logging level. Perhaps the easiest way is to run slurmctld and slurmdbd interactively with the -Dvvv arguments. Then add a user and see whether any errors occur, particularly on the slurmctld side after the sacctmgr update is done.
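
The steps above could look roughly like the sketch below. It is only an illustration, not a tested procedure for this site: the daemon commands are left as comments because they need a real Slurm installation and stay in the foreground, the sacctmgr line is copied from the original report, and find_assoc_errors is a hypothetical helper whose patterns are taken from the slurmctld log lines quoted further down in this thread.

```shell
#!/bin/sh
# Terminals 1 and 2: run each daemon in the foreground (-D) with extra
# verbosity (-vvv). Left as comments; they require a Slurm install.
#   slurmdbd  -Dvvv
#   slurmctld -Dvvv
#
# Terminal 3: add the user (command from the original report), then try a job.
#   sacctmgr add user johndoe defaultaccount=boris partition=low,med,high,debug cluster=jane
#   srun date

# Hypothetical helper: print only the log lines that suggest slurmctld
# never received the new user's association. Reads files, or stdin if
# no file is given.
find_assoc_errors() {
    grep -E 'error: User [0-9]+ not found|[Ii]nvalid account' "$@"
}

# Demonstration on the log lines from the original report.
sample=$(mktemp)
cat > "$sample" <<'EOF'
[2016-03-30T17:04:50.107] error: User 9101 not found
[2016-03-30T17:04:50.107] _job_create: invalid account or partition for user 9101, account '(null)', and partition 'debug'
[2016-03-30T17:04:50.142] _slurm_rpc_allocate_resources: Invalid account or account/partition combination specified
[2016-03-30T17:05:11.381] Terminate signal (SIGINT or SIGTERM) received
EOF
find_assoc_errors "$sample"
rm -f "$sample"
```

If the filter prints nothing after the sacctmgr update, the association probably reached slurmctld; if it prints lines like the ones in the sample, the update did not arrive.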
slurmdbd will send the accounting update to slurmctld slightly after sacctmgr returns.

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center <http://www.nersc.gov>
[email protected]

On Wed, Mar 30, 2016 at 5:38 PM, Douglas Jacobsen <[email protected]> wrote:

> Are both slurmdbd and slurmctld running as the same UID? (If not, they
> need to be. I believe you can see the errors on slurmdbd at debug2 or
> debug3.)
>
> On Wed, Mar 30, 2016 at 5:32 PM, Terri Knight <[email protected]> wrote:
>
>> I posted earlier (Dec 28, 2015) about this issue and was told to check
>> that the slurmdbd and slurmctld daemons were running as the same user;
>> they weren't at that time. I thought making that change would resolve
>> the problem but it did not.
>>
>> These daemons are now both running as root:
>> root 6463 1 0 17:01 ? 00:00:00 /share/apps/slurm-15.08.8/sbin/slurmdbd
>> root 6743 1 0 17:05 ? 00:00:00 /share/apps/slurm-15.08.8//sbin/slurmctld
>>
>> On the compute node:
>> root 7874 1 0 17:03 ? 00:00:00 /share/apps/slurm-15.08.8//sbin/slurmd
>>
>> Upon further testing, I only need to restart the slurmctld daemon to
>> get the new user added such that he can run a job. So it is not as big
>> a deal to me now, but it is different from older versions of Slurm.
>>
>> I'm adding a new user to an existing account, and before I restart
>> slurmctld I see this in the slurmctld log when I try to "srun date" as
>> that user:
>>
>> [2016-03-30T17:04:50.107] error: User 9101 not found
>> [2016-03-30T17:04:50.107] _job_create: invalid account or partition for user 9101, account '(null)', and partition 'debug'
>> [2016-03-30T17:04:50.142] _slurm_rpc_allocate_resources: Invalid account or account/partition combination specified
>> [2016-03-30T17:05:11.381] Terminate signal (SIGINT or SIGTERM) received
>>
>> Oddly, the account is "null".
>>
>> Here is the command to add the user:
>> sacctmgr add user johndoe defaultaccount=boris partition=low,med,high,debug cluster=jane
>>
>> slurm-15.08.8 on Ubuntu 14.04.4
>>
>> Like I said, I can live with it since it's only 1 restart.
>>
>> Thanks,
>> Terri
