[slurm-dev] need to restart slurm daemons for accounting changes

2015-12-28 Thread Terri Knight
Since upgrading to Slurm 15.08.1 on Ubuntu 14.04.3, I have to restart the
mysql, slurmdbd, and slurmctld daemons before a newly added user can submit
a job (accounting is enabled).

$ sacctmgr add user ptrimmer defaultaccount=adamgrp partition=serial cluster=farm

$ sacctmgr dump farm
...
User - 'ptrimmer':Partition='serial':DefaultAccount='adamgrp':Fairshare=1
...

As user ptrimmer:
$ srun -p serial date
srun: error: Unable to allocate resources: Invalid account or account/partition combination specified

On the slurm server as root:
# service mysql stop
mysql stop/waiting
# service slurm-llnl-slurmdbd stop
 * Stopping slurm-llnl database server interface
   [ OK ]
# service slurm-llnl stop
 * Stopping slurm central management daemon slurmctld
   [ OK ]
slurmctld is stopped
# service mysql start
mysql start/running, process 6270
# service slurm-llnl-slurmdbd start
 * Starting slurm-llnl database server interface
   [ OK ]
# service slurm-llnl start
 * Starting slurm central management daemon slurmctld

Back to user ptrimmer:
ptrimmer@farm:~$  srun -p serial date
srun: job 5898165 queued and waiting for resources
srun: job 5898165 has been allocated resources
Mon Dec 28 12:57:20 PST 2015

I also tried running
$ scontrol reconfig
on the slurm server before restarting the slurm daemons but that did not
help.

Is this proper? In slurm 2.6 I did not have to do this.

thanks,
Terri Knight
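
A minimal sketch for confirming what the accounting database holds for the
new user, separately from what slurmctld has picked up. It parses "User - ..."
lines in the format quoted above. The helper name dump_user_assoc is an
assumption made for illustration and is not part of Slurm.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm command): read sacctmgr dump text on
# stdin and print "user|partition|defaultaccount" for the given user.
dump_user_assoc() {
    awk -v user="$1" -v q="'" -F: '
        $1 == "User - " q user q {
            part = ""; acct = ""
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                gsub(q, "", kv[2])          # strip the quoting around values
                if (kv[1] == "Partition")      part = kv[2]
                if (kv[1] == "DefaultAccount") acct = kv[2]
            }
            print user "|" part "|" acct
        }'
}

# Demo on the dump line quoted in the message (needs no Slurm install).
printf '%s\n' "User - 'ptrimmer':Partition='serial':DefaultAccount='adamgrp':Fairshare=1" |
    dump_user_assoc ptrimmer
# prints: ptrimmer|serial|adamgrp
```

On a live system the function could be fed the file written by
"sacctmgr dump farm" (assumed here to be farm.cfg), for example
"dump_user_assoc ptrimmer < farm.cfg". If the line is present but srun is
still rejected, that would suggest the database has the association and the
controller has not yet received it.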


[slurm-dev] What is the Cluster name?

2015-12-28 Thread Simpson Lachlan

Hi all,

The accounting management guide mentions a cluster name when describing how 
to set up the database:

"For example, to add a cluster named "snowflake" to the database execute this 
line..."

But I can't see any Name= or Cluster= in the conf.

Is the cluster name the "PartitionName"? If so, can I recommend updating the 
documentation for consistency?

Cheers
L.


This email (including any attachments or links) may contain
confidential and/or legally privileged information and is
intended only to be read or used by the addressee.  If you
are not the intended addressee, any use, distribution,
disclosure or copying of this email is strictly
prohibited.
Confidentiality and legal privilege attached to this email
(including any attachments) are not waived or lost by
reason of its mistaken delivery to you.
If you have received this email in error, please delete it
and notify us immediately by telephone or email.  Peter
MacCallum Cancer Centre provides no guarantee that this
transmission is free of virus or that it has not been
intercepted or altered and will not be liable for any delay
in its receipt.


[slurm-dev] Re: Slurm accounting recommendations?

2015-12-28 Thread Simpson Lachlan
Out of interest, and because I don't really see any docs about this, is XDMoD 
just a front end to slurmdbd?

If I go with XDMoD, do I even need slurmdbd?

Cheers
L.


From: Trey Dockendorf [mailto:treyd...@tamu.edu]
Sent: Saturday, 12 December 2015 5:03 AM
To: slurm-dev
Subject: [slurm-dev] Re: Slurm accounting recommendations?

We use Open XDMoD and it works great with SLURM.  It does not currently report 
memory utilization, but a future version (5.5, I think) will introduce the 
ability to provide job-level details, and memory utilization may be one of the 
metrics collected.  The current version (5.0) bases all information on data 
collected from sacct.

I'd highly recommend Open XDMoD.  It's actively developed and its setup is 
well documented.

- Trey

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treyd...@tamu.edu
Jabber: treyd...@tamu.edu

On Thu, Dec 10, 2015 at 6:44 PM, Simpson Lachlan wrote:

Hi,

New to SLURM (about 2 hours of reading).

I am looking to set up a test installation in house, and would like to have 
access to the accounting data: core and memory utilisation, average queue 
length, quietest queue times, etc.

I've found the page describing sacct and sstat, but the Related Software page 
points to a number of other tools, including Open XDMoD, slurmmon, and 
slurm-web.

My two (admittedly ignorant) questions are:

1. What do people use and recommend?

2. Would your recommendation change knowing that I am an experienced Linux 
user but a SLURM beginner?

Cheers
L.



[slurm-dev] RE: What is the Cluster name?

2015-12-28 Thread Simpson Lachlan

Gah. Apologies. 

I found the "ClusterName" buried in the accounting info. My bad.

Sorry

L.
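
For anyone else hunting for it: the cluster name is the ClusterName parameter
in slurm.conf, not PartitionName. The sketch below pulls it out of a config
file and prints the matching "sacctmgr add cluster" line from the accounting
guide. The function name cluster_name_from_conf and the throwaway sample file
are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the ClusterName value from a slurm.conf-style
# file. Commented-out lines are ignored; if the key appears more than once,
# this helper keeps the last one.
cluster_name_from_conf() {
    awk -F= '
        tolower($1) ~ /^[ \t]*clustername[ \t]*$/ { v = $2 }
        END {
            sub(/[ \t]*#.*/, "", v)   # drop a trailing comment
            gsub(/[ \t]/, "", v)      # drop stray whitespace
            print v
        }' "$1"
}

# Demo against a throwaway sample file (contents are made up).
conf=$(mktemp)
cat > "$conf" <<'EOF'
#ClusterName=old
ClusterName=snowflake
PartitionName=debug Nodes=n[1-4] Default=YES
EOF
echo "sacctmgr add cluster $(cluster_name_from_conf "$conf")"
rm -f "$conf"
# prints: sacctmgr add cluster snowflake
```

On a real installation, pass the path of the site's slurm.conf instead of the
sample file; the location varies by distribution.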





[slurm-dev] Re: need to restart slurm daemons for accounting changes

2015-12-28 Thread Douglas Jacobsen
I'm betting that slurmctld is running as a different uid than slurmdbd.
Once both are running as the same uid, slurmctld will start taking updates
from slurmdbd (via sacctmgr).

-Doug
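
One way to check this guess is to compare the owners of the two running
daemons. The sketch below parses output in the format of
"ps -e -o user= -o comm="; the helper names daemon_owners and same_owner and
the sample process list are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: read "user command" pairs on stdin (the format of
# `ps -e -o user= -o comm=`) and print the distinct owners of daemon $1.
daemon_owners() {
    awk -v name="$1" '$2 == name { print $1 }' | sort -u
}

# Hypothetical helper: succeed only if slurmctld and slurmdbd report the
# same, non-empty owner in the process list passed as $1.
same_owner() {
    ps_text=$1
    ctld=$(printf '%s\n' "$ps_text" | daemon_owners slurmctld)
    dbd=$(printf '%s\n' "$ps_text" | daemon_owners slurmdbd)
    if [ -n "$ctld" ] && [ "$ctld" = "$dbd" ]; then
        echo "match: $ctld"
    else
        echo "mismatch: slurmctld='$ctld' slurmdbd='$dbd'"
        return 1
    fi
}

# Demo with a made-up process list; on a live server pass
# "$(ps -e -o user= -o comm=)" instead.
sample='root     sshd
slurm    slurmdbd
slurm    slurmctld'
same_owner "$sample"
# prints: match: slurm
```

A match does not prove the daemons are talking to each other; it only rules
out the uid difference suggested here.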




[slurm-dev] Re: need to restart slurm daemons for accounting changes

2015-12-28 Thread Terri Knight
Indeed, slurmdbd and slurmctld are running as user slurm on the slurm server;
slurmd is running as root on the compute nodes.

Thanks!


On Mon, Dec 28, 2015 at 3:35 PM, Douglas Jacobsen wrote:

> I'm betting that slurmctld is running as a different uid than slurmdbd.
> Once both are running as the same uid, slurmctld will start taking updates
> from slurmdbd (via sacctmgr).
>
> -Doug


[slurm-dev] Re: Slurm accounting recommendations?

2015-12-28 Thread Chris Samuel

On Mon, 28 Dec 2015 08:25:04 PM Simpson Lachlan wrote:

> Out of interest, and because I don't really see any docs about this, is
> XDMoD just a front end to slurmdbd?
>
> If I go with XDMoD, do I even need slurmdbd?

One of our staff is trying out XDMoD at the moment; you import data from 
slurmdbd into it via sacct, so you do need slurmdbd for this.

http://xdmod.sourceforge.net/resource-manager-slurm.html

Hope this helps!

All the best,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci
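
As a rough illustration of working with sacct output of the kind described
above, and of the per-user core utilisation asked about earlier in the
thread, the sketch below totals core-hours per user from pipe-delimited
records. The function name core_hours_by_user and the sample records are made
up; the sacct field names mentioned afterwards (User, AllocCPUS, ElapsedRaw)
should be checked against the sacct man page for the installed version.

```shell
#!/bin/sh
# Hypothetical helper: read "user|alloc_cpus|elapsed_seconds" records on
# stdin and print core-hours per user, sorted by user name.
core_hours_by_user() {
    awk -F'|' '
        NF >= 3 && $1 != "" { secs[$1] += $2 * $3 }
        END { for (u in secs) printf "%s %.2f\n", u, secs[u] / 3600 }
    ' | sort
}

# Demo with made-up records (users alice and bob are placeholders).
printf '%s\n' 'alice|4|3600' 'alice|2|1800' 'bob|1|7200' | core_hours_by_user
# prints:
#   alice 5.00
#   bob 2.00
```

On a live system the input might come from something like
"sacct -a -X -n -P -S 2015-12-01 -E 2015-12-28 --format=User,AllocCPUS,ElapsedRaw".
This is only a quick summary, not a description of how XDMoD itself
processes the data.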