[slurm-dev] Slurm 17.02.7 and PMIx
Hi folks,

Just wondering if anyone here has had any success getting Slurm to
compile with PMIx support? I'm trying 17.02.7 and I find that with PMIx
I get either:

PMIx v1.2.2: Slurm complains and tells me it wants v2.

PMIx v2.0.1: Slurm can't find it because the header files are not where
it is looking for them, and when I do a symlink hack to make PMIx
detection work it then fails to compile, with:

/bin/sh ../../../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../slurm -I../../../.. -I../../../../src/common -I/usr/include -I/usr/local/pmix/latest/include -DHAVE_PMIX_VER=2 -g -O0 -pthread -Wall -g -O0 -fno-strict-aliasing -MT mpi_pmix_v2_la-pmixp_client.lo -MD -MP -MF .deps/mpi_pmix_v2_la-pmixp_client.Tpo -c -o mpi_pmix_v2_la-pmixp_client.lo `test -f 'pmixp_client.c' || echo './'`pmixp_client.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../../.. -I../../../../slurm -I../../../.. -I../../../../src/common -I/usr/include -I/usr/local/pmix/latest/include -DHAVE_PMIX_VER=2 -g -O0 -pthread -Wall -g -O0 -fno-strict-aliasing -MT mpi_pmix_v2_la-pmixp_client.lo -MD -MP -MF .deps/mpi_pmix_v2_la-pmixp_client.Tpo -c pmixp_client.c -fPIC -DPIC -o .libs/mpi_pmix_v2_la-pmixp_client.o
pmixp_client.c: In function ‘_set_procdatas’:
pmixp_client.c:468:24: error: request for member ‘size’ in something not a structure or union
   kvp->value.data.array.size = count;
                        ^
pmixp_client.c:482:24: error: request for member ‘array’ in something not a structure or union
   kvp->value.data.array.array = (pmix_info_t *)info;
                        ^
make[4]: *** [mpi_pmix_v2_la-pmixp_client.lo] Error 1

So I'm guessing that I'm missing something, but the documentation for
PMIx in Slurm seems pretty much non-existent. :-(

Anyone had any luck with this?

cheers,
Chris

--
 Christopher Samuel        Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au      Phone: +61 (0)3 903 55545
[slurm-dev] Re: Upgrading Slurm
On 04/10/17 20:51, Gennaro Oliva wrote:

> If you are talking about Slurm I would backup the configuration files
> also.

Not directly Slurm related, but don't forget to install and configure
etckeeper first. It puts your /etc/ directory under git version control
and will do commits of changes before and after any package
upgrade/install/removal, so you have a good history of the changes made.

I'm assuming that the Slurm config files in the Debian package are under
/etc, so that will be helpful to you for this.

> Anyway there have been a lot of major changes in SLURM and in Debian
> since 2013 (Wheezy release date), so be prepared that it will be no
> picnic.

The Debian package name also changed from slurm-llnl to slurm-wlm at
some point too, so missing the intermediate release may result in that
not transitioning properly.

To be honest I would never use a distro's packages for Slurm; I'd always
install it centrally (NFS exported to compute nodes) to keep things
simple. That way you decouple your Slurm version from the OS and can
keep it up to date (or keep it on a known working version).

All the best!
Chris

--
 Christopher Samuel        Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au      Phone: +61 (0)3 903 55545
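To make the etckeeper suggestion concrete, here is a sketch of what it
gives you, demonstrated on a scratch directory with plain git (on a real
Debian system you would just `apt install etckeeper` and run
`etckeeper init`; the file name below is only an example):

```shell
tmp=$(mktemp -d)                    # stand-in for /etc
cd "$tmp"
git init -q .                       # etckeeper init does this for /etc

echo 'ClusterName=demo' > slurm.conf
git add -A
git -c user.name=demo -c user.email=demo@example.org \
    commit -qm 'initial /etc snapshot'

echo 'ClusterName=prod' > slurm.conf   # a change, e.g. from an upgrade
git add -A
git -c user.name=demo -c user.email=demo@example.org \
    commit -qm 'after package upgrade'

git log --oneline | wc -l           # full history of config changes
```

etckeeper hooks this commit step into apt automatically, which is what
gives you the before/after history across package operations.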
[slurm-dev] Re: Setting up Environment Modules package
On 05/10/17 03:11, Mike Cammilleri wrote:

> 2. Install Environment Modules packages in a location visible to the
> entire cluster (NFS or similar), including the compute nodes, and the
> user then includes their 'module load' commands in their actual slurm
> submit scripts since the command would be available on the compute
> nodes - loading software (either local or from network locations
> depending on what they're loading) visible to the nodes

This is what we do; the management node for the cluster exports its
/usr/local read-only to the rest of the cluster.

We also have in our taskprolog.sh:

  echo export BASH_ENV=/etc/profile.d/module.sh

to try and ensure that bash shells have modules set up, just in case. :-)

--
 Christopher Samuel        Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au      Phone: +61 (0)3 903 55545
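For anyone unfamiliar with the task prolog mechanism: slurmd runs the
TaskProlog script and injects any line it prints of the form
"export VAR=value" into the task's environment. A minimal sketch (the
path /tmp/taskprolog.sh is only for illustration; in production you
would point the TaskProlog parameter in slurm.conf at it):

```shell
# Write a minimal task prolog; its stdout is parsed by slurmd.
cat > /tmp/taskprolog.sh <<'EOF'
#!/bin/bash
echo export BASH_ENV=/etc/profile.d/module.sh
EOF
chmod +x /tmp/taskprolog.sh

# What slurmd would see on the script's stdout:
/tmp/taskprolog.sh   # -> export BASH_ENV=/etc/profile.d/module.sh
```

Because BASH_ENV is sourced by non-interactive bash shells, this makes
the module function available inside batch scripts even when they don't
run a login shell.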
[slurm-dev] Re: Tasks distribution
I didn't realize prior to this that the "--distribution" flag to
"sbatch" only affects how an "srun" within the batch script will make
CPU allocations. Before that happens, Slurm must allocate CPUs to the
batch job, and _that_ distribution is dictated by how you have the
"select/cons_res" plugin configured:

> SelectType=select/cons_res
> SelectTypeParameters=CR_Core

The default behavior is to spread the allocation across the available
nodes -- thus, 4/4/3/3/3. If you'd rather "pack" allocations onto the
nodes, enable the CR_Pack_Nodes option:

> SelectType=select/cons_res
> SelectTypeParameters=CR_Core,CR_Pack_Nodes

This will produce the 4/4/4/4/1 allocation pattern. AFAIK there's no way
to alter which CPU allocation pattern gets used on a per-job basis.

Once the job has been assigned nodes and CPUs on those nodes, the
"--distribution" option you provide informs "srun" how to distribute the
tasks it starts. If you're not using "srun" to start the MPI program,
Open MPI itself knows nothing beyond seeing

  SLURM_NODELIST=n[009-013]
  SLURM_TASKS_PER_NODE=4(x2),3(x3)

in the environment, which produces the host list

  n009:4
  n010:4
  n011:3
  n012:3
  n013:3

for which the --map-by and --rank-by options to "mpirun" will affect the
distribution.

> On Oct 3, 2017, at 8:26 PM, Christopher Samuel wrote:
>
> On 02/10/17 20:51, Sysadmin CAOS wrote:
>
>> I'm execution my MPI program with "mpirun"... Maybe could be this the
>> problem? Do I need to execute with "srun"?
>
> I suspect so, try it and see..
>
> --
> Christopher Samuel        Senior Systems Administrator
> Melbourne Bioinformatics - The University of Melbourne
> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::
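The two allocation patterns above can be reproduced with a toy
calculation -- 17 tasks on 5 nodes (n009-n013) with 4 allocatable cores
each, matching the SLURM_TASKS_PER_NODE example:

```shell
tasks=17 nodes=5 cores=4

# Default cons_res behaviour: spread the tasks as evenly as possible
# across the allocated nodes -> 4 4 3 3 3
spread=""
i=0
while [ "$i" -lt "$nodes" ]; do
  if [ "$i" -lt "$((tasks % nodes))" ]; then extra=1; else extra=0; fi
  spread="$spread$((tasks / nodes + extra)) "
  i=$((i + 1))
done
echo "spread: $spread"

# CR_Pack_Nodes: fill each node to its core count before moving on
# to the next one -> 4 4 4 4 1
pack=""
left=$tasks
while [ "$left" -gt 0 ]; do
  if [ "$left" -lt "$cores" ]; then n=$left; else n=$cores; fi
  pack="$pack$n "
  left=$((left - n))
done
echo "packed: $pack"
```

The first loop is the 4/4/3/3/3 pattern the poster observed; the second
is the 4/4/4/4/1 pattern CR_Pack_Nodes produces.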
[slurm-dev] Re: Setting up Environment Modules package
We've had good luck putting the modules on an NFS-mounted file system.
Along with that, I'd suggest creating /etc/profile.d/zmodule.sh that
contains

  module use /modules

then symlinking /etc/profile.d/zmodule.csh to it, and setting this up on
all login and compute nodes.

Andy

On 10/04/2017 12:10 PM, Mike Cammilleri wrote:

> Hi Everyone,
>
> I'm in search of a best practice for setting up Environment Modules
> for our Slurm 16.05.6 installation (we have not had the time to
> upgrade to 17.02 yet). We're a small group and had no explicit need
> for this in the beginning, but as we are growing larger with more
> users we clearly need something like this. I see there are a couple of
> ways to implement Environment Modules and I'm wondering which would be
> the cleanest, most sensible way. I'll list my ideas below:
>
> 1. Install the Environment Modules package and relevant modulefiles on
> the slurm head/submit/login node, perhaps in the default /usr/local/
> location. The modulefiles would define paths to various software
> packages that exist in a location visible/readable to the compute
> nodes (NFS or similar). The user then loads the modules manually at
> the command line on the submit/login node and not in the slurm submit
> script - but specifies #SBATCH --export=ALL and imports the
> environment before submitting the sbatch job.
>
> 2. Install Environment Modules packages in a location visible to the
> entire cluster (NFS or similar), including the compute nodes, and the
> user then includes their 'module load' commands in their actual slurm
> submit scripts since the command would be available on the compute
> nodes - loading software (either local or from network locations
> depending on what they're loading) visible to the nodes.
>
> 3. Another variation would be to use a configuration manager like
> bcfg2 to make sure Environment Modules and necessary modulefiles and
> all configurations are present on all compute/submit nodes. Seems like
> that's potential for a mess though.
>
> Is there a preferred approach? I see in the archives some folks have
> strange behavior when a user uses --export=ALL, so it would seem to me
> that the cleaner approach is to have the 'module load' command
> available on all compute nodes and have users do this in their submit
> scripts. If this is the case, I'll need to configure Environment
> Modules and relevant modulefiles to live in special places when I
> build Environment Modules (./configure --prefix=/mounted-fs
> --modulefilesdir=/mounted-fs, etc.). We've been testing with
> modules-tcl-1.923.
>
> Thanks for any advice,
> mike
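Andy's profile.d hook can be sketched as follows, exercised here in a
scratch directory standing in for /etc/profile.d (paths are
illustrative). One nice property is that the line "module use /modules"
happens to be valid in both sh and csh syntax, which is why a symlink
can serve both shell families:

```shell
dest=$(mktemp -d)                    # stand-in for /etc/profile.d
echo 'module use /modules' > "$dest/zmodule.sh"
ln -s "$dest/zmodule.sh" "$dest/zmodule.csh"

# Both login-shell families now source the same one-line hook:
cat "$dest/zmodule.csh"              # -> module use /modules
```

On a real cluster you would create these two files on every login and
compute node (or distribute them via your config management).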
[slurm-dev] Setting up Environment Modules package
Hi Everyone,

I'm in search of a best practice for setting up Environment Modules for
our Slurm 16.05.6 installation (we have not had the time to upgrade to
17.02 yet). We're a small group and had no explicit need for this in the
beginning, but as we are growing larger with more users we clearly need
something like this. I see there are a couple of ways to implement
Environment Modules and I'm wondering which would be the cleanest, most
sensible way. I'll list my ideas below:

1. Install the Environment Modules package and relevant modulefiles on
the slurm head/submit/login node, perhaps in the default /usr/local/
location. The modulefiles would define paths to various software
packages that exist in a location visible/readable to the compute nodes
(NFS or similar). The user then loads the modules manually at the
command line on the submit/login node and not in the slurm submit
script - but specifies #SBATCH --export=ALL and imports the environment
before submitting the sbatch job.

2. Install Environment Modules packages in a location visible to the
entire cluster (NFS or similar), including the compute nodes, and the
user then includes their 'module load' commands in their actual slurm
submit scripts since the command would be available on the compute
nodes - loading software (either local or from network locations
depending on what they're loading) visible to the nodes.

3. Another variation would be to use a configuration manager like bcfg2
to make sure Environment Modules and necessary modulefiles and all
configurations are present on all compute/submit nodes. Seems like
that's potential for a mess though.

Is there a preferred approach? I see in the archives some folks have
strange behavior when a user uses --export=ALL, so it would seem to me
that the cleaner approach is to have the 'module load' command available
on all compute nodes and have users do this in their submit scripts.

If this is the case, I'll need to configure Environment Modules and
relevant modulefiles to live in special places when I build Environment
Modules (./configure --prefix=/mounted-fs --modulefilesdir=/mounted-fs,
etc.). We've been testing with modules-tcl-1.923.

Thanks for any advice,
mike
[slurm-dev] Re: Is PriorityUsageResetPeriod really required for hard limits?
Thanks for the replies and clarifications. It is actually our desired
usage policy that users never be able to run jobs once their allocation
is exhausted. They must submit a proposal, at which point we increase
their allocation, but we never want to reset their usage. It's good to
know that the reset period is not actually required, contrary to what
the Slurm documentation suggests, because we have a very real use case.
I'm assuming the usage is stored as a 64-bit integer, so hopefully we
don't end up overflowing in the future.

__
*Jacob D. Chappell*
*Research Computing Associate*
Research Computing | Research Computing Infrastructure
Information Technology Services | University of Kentucky
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
jacob.chapp...@uky.edu
Visit us: www.uky.edu/ITS
How are we doing? Send Feedback to itsabout...@uky.edu
ITS . . . it’s about technology. ITS . . . it’s about innovation. ITS . . . it’s about you!

On Wed, Oct 4, 2017 at 10:19 AM, Thomas M. Payerle wrote:

> On Tue, 3 Oct 2017, Christopher Samuel wrote:
>
>> On 29/09/17 06:34, Jacob Chappell wrote:
>>
>>> Hi all. The slurm.conf documentation says that if decayed usage is
>>> disabled, then PriorityUsageResetPeriod must be set to some value. Is
>>> this really true? What is the technical reason for this requirement if
>>> so? Can we set this period to sometime far into the future to have
>>> effectively an infinite period (no reset)?
>>
>> Basically this is because once a user exceeds something like their
>> maximum CPU run time limit then they will never be able to run jobs
>> again unless you either decay or reset usage.
>>
>> --
>> Christopher Samuel        Senior Systems Administrator
>> Melbourne Bioinformatics - The University of Melbourne
>> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>
> To answer your question, it is not required. Although if you do not
> have it set, you will, as Christopher pointed out, have to do something
> to reset usage if you do not want people to lose the ability to run
> jobs forever.
>
> We have a couple of different "types" of allocations with different
> reset periods, so a global PriorityUsageResetPeriod does not work for
> us. Instead we have cron jobs that run at the appropriate times and do
> something like "sacctmgr update account name=XXX rawusage=0" to do our
> resets. But PriorityUsageResetPeriod is set to none.
>
> Tom Payerle
> DIT-ACIGS/Mid-Atlantic Crossroads          paye...@umd.edu
> 5825 University Research Court             (301) 405-6135
> University of Maryland
> College Park, MD 20740-3831
[slurm-dev] Re: Is PriorityUsageResetPeriod really required for hard limits?
On Tue, 3 Oct 2017, Christopher Samuel wrote:

> On 29/09/17 06:34, Jacob Chappell wrote:
>
>> Hi all. The slurm.conf documentation says that if decayed usage is
>> disabled, then PriorityUsageResetPeriod must be set to some value. Is
>> this really true? What is the technical reason for this requirement if
>> so? Can we set this period to sometime far into the future to have
>> effectively an infinite period (no reset)?
>
> Basically this is because once a user exceeds something like their
> maximum CPU run time limit then they will never be able to run jobs
> again unless you either decay or reset usage.
>
> --
> Christopher Samuel        Senior Systems Administrator
> Melbourne Bioinformatics - The University of Melbourne
> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545

To answer your question, it is not required. Although if you do not have
it set, you will, as Christopher pointed out, have to do something to
reset usage if you do not want people to lose the ability to run jobs
forever.

We have a couple of different "types" of allocations with different
reset periods, so a global PriorityUsageResetPeriod does not work for
us. Instead we have cron jobs that run at the appropriate times and do
something like

  sacctmgr update account name=XXX rawusage=0

to do our resets. But PriorityUsageResetPeriod is set to none.

Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads          paye...@umd.edu
5825 University Research Court             (301) 405-6135
University of Maryland
College Park, MD 20740-3831
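For reference, a cron entry of the kind Tom describes might look like
the following (the account name and the quarterly schedule are made-up
placeholders; -i tells sacctmgr to apply the change without prompting
for confirmation):

```shell
# Illustrative crontab entry: zero the raw usage of account "projX"
# at midnight on the first day of each quarter.
0 0 1 1,4,7,10 * /usr/bin/sacctmgr -i update account name=projX rawusage=0
```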
[slurm-dev] Re: Upgrading Slurm
Ciao Elisabetta,

On Wed, Oct 04, 2017 at 01:38:21AM -0600, Elisabetta Falivene wrote:

> Just some other questions. How would you do the upgrade in the safer
> way? Letting aptitude do his job?

I prefer apt, but it's a matter of taste.

> Would you go to debian 9?

In my opinion it is preferable to do it in 2 steps: from wheezy to
jessie and from jessie to stretch. Upgrading from releases older than
Jessie is not supported.

If you don't have much software compiled from source under /usr/local,
/opt or in the users' home directories, it is better to leave your /home
partition unchanged and do a fresh install of your system than to make
the long jump. This will also have the benefit of putting the system
under your total control.

> And the nodes must be upgraded in the same way one by one?

Yes, I would use the same method for the front-end and the computing
nodes, unless you have automatic centralized installation and
configuration systems in place. In that case I would reinstall the nodes
from scratch after the front-end has been upgraded.

> Let's think about the worst case: upgrading nukes slurm. I don't
> really know well this machine's configuration. Would you backup
> something else besides the database before upgrading?

If you are talking about Slurm I would backup the configuration files
also.

Regarding Debian system upgrades, you can find some information here for
the first upgrade:
https://www.debian.org/releases/jessie/amd64/release-notes/ch-upgrading.en.html
and here for the second upgrade:
https://www.debian.org/releases/stretch/amd64/release-notes/ch-upgrading.html

Anyway, there have been a lot of major changes in SLURM and in Debian
since 2013 (Wheezy release date), so be prepared that it will be no
picnic.

Best regards
--
Gennaro Oliva
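The two-step repository switch Gennaro describes can be sketched as
below, done here on a scratch copy of a sources.list line so it can
actually run (on the real system you would edit /etc/apt/sources.list
and run "apt update && apt upgrade && apt full-upgrade" at each step;
the mirror URL is illustrative, and wheezy-era repos now live on
archive.debian.org):

```shell
f=$(mktemp)                          # stand-in for /etc/apt/sources.list
echo 'deb http://deb.debian.org/debian wheezy main' > "$f"

sed -i 's/wheezy/jessie/g'  "$f"     # step 1: wheezy  -> jessie
sed -i 's/jessie/stretch/g' "$f"     # step 2: jessie  -> stretch

cat "$f"   # -> deb http://deb.debian.org/debian stretch main
```

Doing the release names one step at a time, with a full upgrade and
reboot between them, is what keeps each hop inside Debian's supported
upgrade path.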
[slurm-dev] Re: Upgrading Slurm
Hi Elisabetta,

Elisabetta Falivene writes:

> Upgrading Slurm
>
> Thank you all for the useful advice!
>
> So the 'jump' should not be a problem if there are no running jobs
> (which is my case, as you guessed). Surely I'll report how it went
> after doing it. I would like to do some tests on a virtual machine,
> but I really can't imagine how to replicate the exact situation of a
> 7TB cluster locally...
>
> Just some other questions. How would you do the upgrade in the safer
> way? Letting aptitude do his job? Would you go to debian 9? And the
> nodes must be upgraded in the same way one by one?

If no jobs are running, I would just let aptitude get on with it. If
there are no other reasons not to, I would upgrade to Debian 9. In this
case, your version of Slurm will be 16.05 and thus not too old.

> Let's think about the worst case: upgrading nukes slurm. I don't
> really know well this machine's configuration. Would you backup
> something else besides the database before upgrading?

The only other thing I back up is the statesave directory, but this is
only interesting if you are upgrading while jobs are running. In your
case, only the database is worth backing up, and even then, that's only
really interesting if you need the old data for statistical purposes, or
you need to maintain, say, fairshare information across the upgrade.

In bocca al lupo!

Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin
Email loris.benn...@fu-berlin.de
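A pre-upgrade backup along the lines Loris describes might look like
this. Paths and the database name are illustrative (check
StateSaveLocation in slurm.conf for the real directory); here the tar
step is exercised on a scratch directory so the commands can run:

```shell
statesave=$(mktemp -d)               # stand-in for StateSaveLocation
echo demo > "$statesave/job_state"   # slurmctld keeps job state here

backup=$(mktemp -u).tar.gz
tar czf "$backup" -C "$statesave" .

# On the real master you would also dump the accounting database, e.g.:
#   mysqldump slurm_acct_db > slurm_acct_db.sql

tar tzf "$backup" | grep -c job_state   # verify the archive contents
```

The statesave copy only matters if jobs are queued or running across the
upgrade; the database dump is what preserves accounting and fairshare
history.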
[slurm-dev] Re: Upgrading Slurm
Awesome! Thank you Ole!

2017-10-04 9:59 GMT+02:00 Ole Holm Nielsen:

> On 10/04/2017 09:38 AM, Elisabetta Falivene wrote:
>
>> Ps: if you know some good source of information about how to set up a
>> cluster and slurm beside the official doc, I would be grateful if you
>> could share. It is difficult to find good material.
>
> I agree about the lack of availability of HowTo guides. That's why I
> wrote a Slurm HowTo Wiki while installing Slurm, but it is focused on
> CentOS 7: https://wiki.fysik.dtu.dk/niflheim/SLURM
>
> However, the Slurm setup in itself should not depend on the Linux
> distribution, so perhaps you can learn something useful from the Wiki
> anyway.
>
> /Ole
[slurm-dev] Re: Upgrading Slurm
On 04/10/17 17:12, Loris Bennett wrote:

> Ole's pages on Slurm are indeed very useful (Thanks, Ole!). I just
> thought I'd point out that the limitation on only upgrading by 2 major
> versions is for the case that you are upgrading a production system
> and don't want to lose any running jobs.

The on-disk format for spooled jobs may also change between releases, so
you probably want to keep that in mind as well..

--
 Christopher Samuel        Senior Systems Administrator
 Melbourne Bioinformatics - The University of Melbourne
 Email: sam...@unimelb.edu.au      Phone: +61 (0)3 903 55545
[slurm-dev] Re: Upgrading Slurm
Hi Elisabetta,

Ole Holm Nielsen writes:

> On 10/03/2017 03:29 PM, Elisabetta Falivene wrote:
>
>> I've been asked to upgrade our slurm installation. I have a slurm
>> 2.3.4 on a Debian 7.0 wheezy cluster (1 master + 8 nodes). I've not
>> installed it so I'm a bit confused about how to do this and how to
>> proceed without destroying anything.
>>
>> I was thinking to upgrade at least to Jessie (Debian 8) but what
>> about Slurm? I've read carefully the upgrading section
>> (https://slurm.schedmd.com/quickstart_admin.html) of the doc, reading
>> that the upgrade must be done incrementally and not jumping from
>> 2.3.4 to 17, for example.
>
> Yes, you may jump a maximum of 2 versions per upgrade.
> Quoting https://slurm.schedmd.com/quickstart_admin.html#upgrade
>
>> Slurm daemons will support RPCs and state files from the two previous
>> minor releases (e.g. a version 16.05.x SlurmDBD will support
>> slurmctld daemons and commands with a version of 16.05.x, 15.08.x or
>> 14.11.x).
>
>> Still it is not clear to me precisely how to do this. How would you
>> proceed if asked to upgrade a cluster you just don't know anything
>> about? What would you check? What version of o.s. and slurm would you
>> choose? What would you backup? And how would you proceed?
>>
>> Any info is gold! Thank you
>
> My 2 cents of information:
>
> My Slurm Wiki explains how to upgrade Slurm on CentOS 7:
> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
>
> Probably the general method is the same for Debian.

Ole's pages on Slurm are indeed very useful (Thanks, Ole!). I just
thought I'd point out that the limitation on only upgrading by 2 major
versions is for the case that you are upgrading a production system and
don't want to lose any running jobs. If you are upgrading the whole
operating system, you are probably planning a downtime anyway, and so
there won't be any such jobs.

In this case, there shouldn't in theory be a problem - although I must
admit that I wouldn't be that surprised if converting the database from
2.3.4 to, say, 17.02.7 didn't go 100% smoothly. However, Debian users
who just rely on Debian packages are always going to face this problem
of large version jumps between Debian releases, and so it would be
useful for the community to know how well this works.

Cheers,
Loris

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin
Email loris.benn...@fu-berlin.de