[slurm-dev] Re: Slurm versions 17.02.5 and 17.11.0-pre1 are now available

2017-07-15 Thread Andrew Elwell
> Slurm version 17.11.0-pre1 is the first pre-release of version 17.11, to be released in November 2017. This version contains the support for scheduling of a workload across a set (federation) of clusters which is described in some detail here:

[slurm-dev] Re: rpmbuild from tarball

2017-06-19 Thread Andrew Elwell
iel...@fysik.dtu.dk> wrote:
> On 06/19/2017 05:36 AM, Andrew Elwell wrote:
>> I've just tried and failed to get the github release (https://github.com/SchedMD/slurm/releases) of 16.05.10-2 to build using the 'rpmbuild -ta tarball' trick - it's failing o

[slurm-dev] rpmbuild from tarball

2017-06-18 Thread Andrew Elwell
I've just tried and failed to get the github release (https://github.com/SchedMD/slurm/releases) of 16.05.10-2 to build using the 'rpmbuild -ta tarball' trick - it's failing on line 88 of the spec ie
> Name: see META file
> Version: see META file
> Release: see META file
however the tarball
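
A quick check for these unfilled placeholders, run before handing a spec to rpmbuild, could look roughly like the sketch below. It is not part of the original post: the helper name `find_meta_placeholders` and the sample spec fragment are hypothetical, and the only thing assumed about the real file is the placeholder wording 'see META file' quoted above.

```python
import re

# Hypothetical helper: report spec tags whose value is still the
# literal placeholder "see META file" quoted in the post above.
PLACEHOLDER = re.compile(r"^(Name|Version|Release):\s*see META file\s*$")


def find_meta_placeholders(spec_text):
    """Return (line_number, tag) pairs for tags that were never filled in."""
    hits = []
    for number, line in enumerate(spec_text.splitlines(), start=1):
        match = PLACEHOLDER.match(line.strip())
        if match:
            hits.append((number, match.group(1)))
    return hits


# Illustrative fragment only (assumed, not copied from a real slurm.spec).
SAMPLE_SPEC = """\
Name:see META file
Version: see META file
Release: see META file
Summary: example package
"""

if __name__ == "__main__":
    for number, tag in find_meta_placeholders(SAMPLE_SPEC):
        print(f"line {number}: {tag} is still a placeholder")
```

An empty result would suggest the spec has already had its values substituted; any hits point at the tags rpmbuild is likely to reject.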

[slurm-dev] Re: Fwd: how to perform a DB upgrade?

2017-01-04 Thread Andrew Elwell
> 4. Start the new slurmdbd
Do this part by hand (ie, slurmdbd -Dvvv) as it takes longer than an init script / systemctl allows for it to start due to the migration, and it'll be flagged as 'failed'. When I did this for our test cluster, the initial startup of slurmdbd took ~30 mins or so to

[slurm-dev] slurmdbd user across multiple clusters

2016-11-01 Thread Andrew Elwell
In the docs for slurmdbd 16.05 it states[1]: SlurmUser The name of the user that the slurmctld daemon executes as. This user must exist on the machine executing the Slurm Database Daemon and have the same user ID as the hosts on which slurmctld execute. For security purposes, a user other than

[slurm-dev] Re: sreport "duplicate" lines

2016-10-20 Thread Andrew Elwell
> Looks like you've somehow created partition specific associations for some people - not something we do at all.
ISTR this was because 2.6 didn't let us have an overall restriction for the cluster and a sub-restriction on the number of jobs to run in a (debug) partition. I could understand

[slurm-dev] Re: sreport "duplicate" lines

2016-10-20 Thread Andrew Elwell
0
magnus  pawsey0001  achew    Ashley Chew     175   0
magnus  pawsey0001  achew    Ashley Chew     136   0
magnus  pawsey0001  aelwell  Andrew Elwell     2   0
magnus  pawsey0001  aelwell  Andrew Elwell  2236   0
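
One way to spot and collapse such repeated lines after the fact is to group rows by cluster, account and login and add up the Used column. The sketch below does that for the four rows listed above. The tuple layout and the function name `merge_duplicates` are assumptions made for illustration, and whether summing is the right treatment depends on how the underlying associations were set up.

```python
from collections import defaultdict

# Rows as (cluster, account, login, proper_name, used), values taken from
# the listing above; the tuple layout is an assumption of this sketch.
ROWS = [
    ("magnus", "pawsey0001", "achew", "Ashley Chew", 175),
    ("magnus", "pawsey0001", "achew", "Ashley Chew", 136),
    ("magnus", "pawsey0001", "aelwell", "Andrew Elwell", 2),
    ("magnus", "pawsey0001", "aelwell", "Andrew Elwell", 2236),
]


def merge_duplicates(rows):
    """Sum the 'used' value of rows that share cluster, account and login."""
    totals = defaultdict(int)
    for cluster, account, login, name, used in rows:
        totals[(cluster, account, login, name)] += used
    return dict(totals)


if __name__ == "__main__":
    for (cluster, account, login, name), used in merge_duplicates(ROWS).items():
        print(cluster, account, login, name, used)
```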

[slurm-dev] sreport "duplicate" lines

2016-10-20 Thread Andrew Elwell
CPU Hours
Cluster  Account     Login    Proper Name    Used  Energy
-------  ----------  -------  -------------  ----  ------
magnus   pawsey0001  aelwell  Andrew Elwell     2       0

[slurm-dev] Re: Packaging for fedora (and EPEL)

2016-10-17 Thread Andrew Elwell
> I've had consistent success with the documented system - "rpmbuild slurm-.tgz" then yum installing the resulting files, using 15.x, 16.05 and 17.02.
Yup, it seems to build well enough but then fails a few picky rpmlint rules - Nothing too major and *could* be worked around with patches but

[slurm-dev] Packaging for fedora (and EPEL)

2016-10-17 Thread Andrew Elwell
Hi folks, I see from https://bugzilla.redhat.com/show_bug.cgi?id=1149566 that there have been a few unsuccessful attempts to get slurm into fedora (and potentially EPEL). Is anyone on this list actively working on it at the moment? I'll update the bugzilla ticket to prod the last potential

[slurm-dev] Re: rpm dependencies in 16.05.5

2016-10-13 Thread Andrew Elwell
> I have a Wiki page describing how to install Munge and Slurm on CentOS 7:
Thanks Ole, there's some good notes in there I'll use. My original question was more a packaging issue - In this case I don't mind installing the rest of the slurm binaries, but ideally I'd like our slurmdbd host to be

[slurm-dev] rpm dependencies in 16.05.5

2016-10-13 Thread Andrew Elwell
Hi folks, I've just built 16.05.5 into rpms (using the rpmbuild -ta slurm*.tar.bz2 method) to update a CentOS 7 slurmdbd host. According to http://slurm.schedmd.com/accounting.html "Note that SlurmDBD relies upon existing Slurm plugins for authentication and Slurm sql for database use, but the

[slurm-dev] Re: Remote Visualization and Slurm

2016-08-18 Thread Andrew Elwell
> If anyone has a working remote visualization cluster that integrates well with slurm, I would love to hear from you.
We're using 'strudel' https://www.massive.org.au/userguide/cluster-instructions/strudel and our local instructions are

[slurm-dev] Cray Resource Utilization Reporting (RUR) via plugin

2015-03-20 Thread Andrew Elwell
Hi All, We're investigating the possibility of enabling RUR on our XC30s, with the end goal of integrating this into the slurmdbd for jobs. Is anyone else working on this? If not, is anyone else interested? I know that there's already ./acct_gather_energy/cray/acct_gather_energy_cray.c but I

[slurm-dev] FlexLM integration - roughly how much work?

2015-01-15 Thread Andrew Elwell
Hi Folks, At the Lugano meeting last year, SchedMD said that the FlexLM integration had come off the short term roadmap due to other features. We’re interested in the possibility of holding jobs until certain licences are available (hello ansys) rather than them running and failing. Can anyone

[slurm-dev] Re: sbatch --array question and a tale of job and task confusion

2014-11-19 Thread Andrew Elwell
I'll add that this is (most likely) being seen on slurm 2.6.6 on a Cray using ALPS. /me waves to Balt

[slurm-dev] including config files

2014-10-22 Thread Andrew Elwell
Hi Folks, According to the docs (http://slurm.schedmd.com/slurm.conf.html) it should be possible to have 'include otherconfig.conf' in my slurm.conf, however I'd like to make this ${ClusterName}.conf - is this possible? I see that in src/common/parse_config.c there seems to be some

[slurm-dev] Re: Error: Unable to contact slurm controller

2014-08-24 Thread Andrew Elwell
Hi Gerry,
[2014-08-21T09:30:09.673] fatal: system has no usable batch compute nodes
We see this on our systems (running Slurm + Alps/basil rather than native) when the slurmctld starts before the sdb has a list of batch nodes. It's bitten us when we've set the nodes to interactive rather than

[slurm-dev] --parsable(2) option for squeue / sinfo

2014-06-12 Thread Andrew Elwell
Hi folks, Wishlist item -- would it be easy to port the parsable flags into squeue? From a very quick glance over the code, it seems that sreport and sacctmgr use common/print_fields.c but that's not used from squeue. Many thanks Andrew
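
For context on why these flags are handy for scripting: in sreport and sacctmgr, --parsable emits '|'-delimited fields with a trailing '|' on each line, while --parsable2 omits the trailing delimiter. A minimal sketch of consuming either form follows. The function name `parse_parsable` and the sample strings are made up for illustration and are not captured command output.

```python
def parse_parsable(text, trailing_delimiter=False):
    """Parse '|'-delimited text into a list of dicts keyed by the header row.

    Set trailing_delimiter=True for the --parsable style, where every
    line ends with an extra '|'.
    """
    header = None
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if trailing_delimiter and line.endswith("|"):
            line = line[:-1]
        fields = line.split("|")
        if header is None:
            header = fields
        else:
            rows.append(dict(zip(header, fields)))
    return rows


# Illustrative samples only (assumed layout, not real command output).
SAMPLE_PARSABLE2 = "Cluster|Account|Login|Used\nmagnus|pawsey0001|aelwell|2\n"
SAMPLE_PARSABLE = "Cluster|Account|Login|Used|\nmagnus|pawsey0001|aelwell|2|\n"

if __name__ == "__main__":
    print(parse_parsable(SAMPLE_PARSABLE2))
    print(parse_parsable(SAMPLE_PARSABLE, trailing_delimiter=True))
```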

[slurm-dev] Pbs to slurmdbd

2014-01-28 Thread Andrew Elwell
Hi folks, We're migrating from pbs pro to slurm mid cpu accounting cycle. Since slurmdbd/sreport looks nicer than grepping through pbs logs for usage (no gold on this cluster), is there a way to populate slurmdbd records from pbs till we migrate? (I.e. has anyone done this already rather than me

[slurm-dev] Re: Pbs to slurmdbd

2014-01-28 Thread Andrew Elwell
> You might take a look at the moab_2_slurmdb.pl script in contribs/slurmdb-direct.
Thanks - I figured that was a good start - my concern was the use lib qw(/home/da/slurm/1.3/ line in the code - I wasn't sure how much bitrot had set in to make it work with 2.6.x :-)

[slurm-dev] Installation onto an XT30

2013-09-26 Thread Andrew Elwell
Hi Folks, I'm trying to install slurm (2.6.2) onto our Cray XT30 -- I've been following the guide at http://slurm.schedmd.com/cray.html and Gerrit's paper from CUG11, but I've got a few questions about daemon placement and configuration. 1) we use eslogin nodes (and other external services) so