Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Ole Holm Nielsen
On 11/2/20 2:25 PM, navin srivastava wrote: Currently we are running Slurm version 17.11.x and want to move to 20.x. We are building the new server with Slurm 20.2 and planning to upgrade the client nodes from 17.x to 20.x. We wanted to check if we can upgrade the client from 17.x to

Re: [slurm-users] Reserving a GPU (Christopher Benjamin Coffey)

2020-11-02 Thread Christopher Benjamin Coffey
Hi All, Anyone know if it's possible yet to reserve a GPU? Maybe in 20.02? Thanks! Best, Chris -- Christopher Coffey High-Performance Computing Northern Arizona University 928-523-1167 On 5/19/20, 3:04 PM, "slurm-users on behalf of Christopher Benjamin Coffey" wrote: Hi Lisa,

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Paul Edmon
We have hit this when we naively ran using the service and it timed out and borked the database. Fortunately we had a backup to go back to. Since then we have run it straight from the command line. Like yours, our production DB is now 23 GB for 6 months' worth of data, so major schema updates

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Chris Samuel
On 11/2/20 7:31 am, Paul Edmon wrote: e. Run slurmdbd -Dv to do the database upgrade. Depending on the upgrade this can take a while because of database schema changes. I'd like to emphasize the importance of doing the DB upgrade in this way: do not use systemctl for this, as if systemd
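
The advice in this thread (stop the service, back up the database, then run slurmdbd in the foreground rather than through systemd) could be sketched roughly as below. This is only an illustration, not anyone's actual procedure: it assumes a MySQL/MariaDB backend, the database name `slurm_acct_db`, a backup file name chosen for the example, and credentials supplied through the MySQL client configuration. The helper names `dbd_upgrade_commands` and `run` are made up for this sketch. By default it only prints the commands.

```python
import shlex
import subprocess


def dbd_upgrade_commands(db_name="slurm_acct_db", backup_file="slurm_acct_db.sql"):
    """Return the commands for a foreground slurmdbd upgrade, in order.

    Assumptions (not from the thread): MySQL/MariaDB backend, the database
    name and backup file name used as defaults here.
    """
    return [
        # Stop the packaged service so systemd is not supervising the daemon.
        ["systemctl", "stop", "slurmdbd"],
        # Take a backup first; the thread reports needing one after a failed upgrade.
        ["mysqldump", "--single-transaction", "--databases", db_name,
         "--result-file=" + backup_file],
        # Run the new slurmdbd in the foreground with verbose logging so the
        # schema conversion is not cut short by a service start timeout.
        ["slurmdbd", "-D", "-v"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # slurmdbd -D stays in the foreground until it is interrupted.
            subprocess.run(argv, check=True)


run(dbd_upgrade_commands())
```

Keeping the commands as a plain list makes it easy to review them before anything touches the production database; pass `dry_run=False` only after checking the printed output.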

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Paul Edmon
We haven't really had MPI ugliness with the latest versions. Plus we've been rolling our own PMIx and building against that, which seems to have solved most of the cross-compatibility issues. -Paul Edmon- On 11/2/2020 10:38 AM, Fulcomer, Samuel wrote: Our strategy is a bit simpler. We're

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Fulcomer, Samuel
Our strategy is a bit simpler. We're migrating compute nodes to a new cluster running 20.x. This isn't an upgrade. We'll keep the old slurmdbd running for at least enough time to suck the remaining accounting data into XDMoD. The old cluster will keep running jobs until there are no more to run.

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Paul Edmon
We don't follow the recommended procedure here but rather build RPMs and upgrade using those. We haven't had any issues. Here is our procedure: 1. Build RPMs from source using a version of the slurm.spec file that we maintain. It's the version SchedMD provides but modified with some

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Paul Edmon
In general, I would follow this: https://slurm.schedmd.com/quickstart_admin.html#upgrade Namely: Almost every new major release of Slurm (e.g. 19.05.x to 20.02.x) involves changes to the state files with new data structures, new options, etc. Slurm permits upgrades to a new major release
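
For the original question (17.11.x to 20.x), a rough way to work out the intermediate steps is sketched below. It assumes the rule from the Slurm documentation of that period, that a major release can be upgraded to from the previous two major releases, and it assumes the release list shown; both should be checked against the upgrade guide linked above before relying on the result. The names `RELEASES`, `MAX_JUMP`, `major` and `upgrade_path` are made up for this sketch.

```python
# Major Slurm releases around the versions discussed in this thread, oldest first.
# Assumption: this list is complete for the range of interest.
RELEASES = ["17.02", "17.11", "18.08", "19.05", "20.02"]

# Assumption: an upgrade may start from either of the previous two major
# releases, so one step can move at most two positions forward in RELEASES.
MAX_JUMP = 2


def major(version):
    """Reduce a version such as '17.11.x' or '17.11.5' to '17.11'."""
    return ".".join(version.split(".")[:2])


def upgrade_path(current, target):
    """Return the major releases to pass through, including both endpoints."""
    start = RELEASES.index(major(current))
    end = RELEASES.index(major(target))
    if end < start:
        raise ValueError("downgrades are not covered by this sketch")
    path = [RELEASES[start]]
    position = start
    while position < end:
        # Take the largest step the assumed rule allows, without overshooting.
        position = min(position + MAX_JUMP, end)
        path.append(RELEASES[position])
    return path


print(" -> ".join(upgrade_path("17.11.x", "20.02.x")))
```

Under these assumptions the path goes through one intermediate major release instead of jumping straight from 17.11 to 20.02.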

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Fulcomer, Samuel
We're doing something similar. We're continuing to run production on 17.x and have set up a new server/cluster running 20.x for testing and MPI app rebuilds. Our plan had been to add recently purchased nodes to the new cluster, and at some point turn off submission on the old cluster and switch

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Christopher J Cawley
Depending on how large the database is, the database backend upgrades can take a while. Chris Christopher J. Cawley Systems Engineer/Linux Engineer, Information Technology Services 223 Aquia Building, Ffx, MSN: 1B5 George Mason University Phone: (703) 993-6397 Email: ccawl...@gmu.edu

Re: [slurm-users] Slurm Upgrade

2020-11-02 Thread Christopher J Cawley
I do not think so. In any case, make sure that you stop services and make a backup of the database. Chris Christopher J. Cawley Systems Engineer/Linux Engineer, Information Technology Services 223 Aquia Building, Ffx, MSN: 1B5 George Mason University Phone: (703) 993-6397 Email:

[slurm-users] Slurm Upgrade

2020-11-02 Thread navin srivastava
Dear All, Currently we are running Slurm version 17.11.x and want to move to 20.x. We are building the new server with Slurm 20.2 and planning to upgrade the client nodes from 17.x to 20.x. We wanted to check if we can upgrade the client from 17.x to 20.x directly or we need to go

[slurm-users] Reset all accounting data

2020-11-02 Thread Diego Zuccato
Hello all. I'm going to change (a lot!) our cluster config, including accounting weights. Since keeping historic data would cause a mess, how can I reset usage for all accounts/users? I already tried the obvious: # sacctmgr modify account set rawusage=0 You didn't set any conditions with