Re: [slurm-users] [External] Re: Why does the make install path get hard coded into the slurmd binary?

2020-02-19 Thread Prentice Bisbal

The binaries are probably compiled with the RPATH switch enabled.

https://en.wikipedia.org/wiki/Rpath

Prentice

On 2/18/20 6:18 PM, Dean Schulze wrote:
The ./configure --prefix=... value gets into the Makefiles, which is 
no surprise.  But it is also getting into the slurmd binary and .o 
files.  Here's what I found in the slurmd/ source directory:


$ find ./ -type f -exec grep /home/dean/src/slurm.versions/slurm-19.05.4.build {} /dev/null \;

./Makefile:prefix = /home/dean/src/slurm.versions/slurm-19.05.4.build
Binary file ./slurmd.o matches
Binary file ./get_mach_stat.o matches
Binary file ./req.o matches
Binary file ./slurmd matches

I can't tell what is going on in the Makefile that puts that string 
into the .o file and the slurmd binary, however.
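
One way to see exactly which strings were baked in is to print them instead of only reporting that the file matches. The following is a minimal sketch, not anything shipped with Slurm: `embedded_paths` is a hypothetical helper name, and it assumes a grep that supports the -a and -o options (GNU grep does).

```shell
# Hypothetical helper: print each distinct string in a binary that starts
# with the given build prefix. The prefix is used as a regular expression,
# so a "." in it matches any character, which is harmless here.
embedded_paths() {
    prefix=$1
    file=$2
    # -a: treat the binary as text; -o: print only the matching part.
    # [[:graph:]]* extends the match up to the next NUL or whitespace byte.
    grep -a -o "${prefix}[[:graph:]]*" "$file" | sort -u
}
```

For example, `embedded_paths /home/dean/src/slurm.versions/slurm-19.05.4.build ./slurmd` would list every full path under the prefix that appears in the binary. To check the RPATH suggestion separately, `readelf -d ./slurmd` (from binutils) lists any RPATH or RUNPATH entries in the dynamic section, which helps tell a linker search path apart from an ordinary string constant compiled into the program.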



On Tue, Feb 18, 2020 at 3:44 PM Dean Schulze wrote:


I built slurm on one machine (controller) and copied the new
slurmd binary to a node.  When I started it with systemctl, it
failed with the message:

fatal: Unable to find slurmstepd file at
/home/dean/src/slurm.versions/slurm-19.05.4.build/

The path it refers to is what I gave to ./configure --prefix=...
on the controller where I built the binaries.  The --prefix= value
is where the make install step writes the slurm* binaries it
creates.  That path also gets written into the ExecStart= line of
the generated .service files.  I changed ExecStart= in the
.service files to /usr/local/sbin, where I placed the slurm* binaries.

Here's my slurmd.service file on my node:

[Unit]
Description=Slurm node daemon
After=munge.service network.target remote-fs.target
ConditionPathExists=/etc/slurm/slurm.conf

[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/slurmd
ExecStart=/usr/local/sbin/slurmd $SLURMD_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID
PIDFile=/var/run/slurmd.pid
KillMode=process
LimitNOFILE=131072
LimitMEMLOCK=infinity
LimitSTACK=infinity
Delegate=yes
TasksMax=infinity

[Install]
WantedBy=multi-user.target
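
The ExecStart= change described above can be sketched with sed. This is only an illustration under stated assumptions: `retarget_execstart` is a hypothetical helper name, and it assumes the daemon name on the ExecStart= line starts with "slurm". It prints the rewritten unit file to standard output and does not modify the file in place.

```shell
# Hypothetical helper: rewrite the directory part of ExecStart= so that it
# points at a different sbin directory, keeping the daemon name and options.
retarget_execstart() {
    unit_file=$1
    new_sbindir=$2
    # The greedy ".*/" consumes the old directory up to the last "/" that is
    # followed by a slurm* daemon name; the rest of the line is left alone.
    sed -E "s|^(ExecStart=).*/(slurm[a-z]*)|\1${new_sbindir}/\2|" "$unit_file"
}
```

For example, `retarget_execstart slurmd.service /usr/local/sbin` turns a generated line that points into the build prefix into the ExecStart= line shown in the unit file above. Note that this only changes where systemd finds slurmd; it does not change any path compiled into the binary itself.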

Why is the slurmd binary looking for the build path?  That path is
not in any .service or .conf file on the node.



Re: [slurm-users] Slurm Upgrade from 17.02

2020-02-19 Thread Marcus Wagner

Hi Ricardo,

If I remember right, you can only jump ahead two versions in a single 
upgrade. So you WILL have to upgrade to 18.08 first, even if you want to 
use 19.05 or the coming 20.02.


17.02 -> 17.11 -> 18.08 -> 19.05 -> 20.02
  ^                 ^
  |                 |
  |- you are here   |- "farthest jump" to a newer version in one step.


As SchedMD introduced cons_tres in 19.05, cons_res will become deprecated 
in future versions. The way you request GPUs is more consistent in the new 
version. So, I would upgrade to 19.05. You will still have to upgrade to 
18.08 as a first step, though.
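
The two-version rule above can be turned into a small planning sketch. This is a minimal illustration, not a SchedMD tool: `upgrade_path` and `RELEASES` are hypothetical names, the release list is just the one from the diagram, and the maximum jump of two releases is taken from this thread.

```shell
# Release order as drawn in the diagram above.
RELEASES="17.02 17.11 18.08 19.05 20.02"

# Hypothetical helper: print the hops from one release to another, taking
# the largest allowed jump (default: 2 releases) at each step.
upgrade_path() {
    echo "$RELEASES" | awk -v from="$1" -v to="$2" -v jump="${3:-2}" '{
        # Locate the start and end releases in the list.
        for (i = 1; i <= NF; i++) {
            if ($i == from) s = i
            if ($i == to) e = i
        }
        # Fail on unknown releases or a downgrade.
        if (!s || !e || s > e) exit 1
        path = $s
        while (s < e) {
            s = (s + jump < e) ? s + jump : e
            path = path " -> " $s
        }
        print path
    }'
}
```

Calling `upgrade_path 17.02 19.05` prints `17.02 -> 18.08 -> 19.05`, which is the route described here.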



Best
Marcus


On 2/19/20 3:10 PM, Ricardo Gregorio wrote:


hi all,

I am putting together an upgrade plan for slurm on our HPC. We are 
currently running old version 17.02.11. Would you guys advise us 
upgrading to 18.08 or 19.05?


I understand we will have to also upgrade the version of mariadb from 
5.5 to 10.X and pay attention to 'long db upgrade from 17.02 to 18.X 
or 19.X' and 'bug 6796' amongst other things.


We would appreciate your comments/recommendations

Regards,

*Ricardo Gregorio*

Research and Systems Administrator

Operations ITS


Rothamsted Research is a company limited by guarantee, registered in 
England at Harpenden, Hertfordshire, AL5 2JQ under the registration 
number 2393175 and a not for profit charity number 802038. 


--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de



Re: [slurm-users] Slurm Upgrade from 17.02

2020-02-19 Thread Chris Samuel

On 19/2/20 6:10 am, Ricardo Gregorio wrote:

I am putting together an upgrade plan for slurm on our HPC. We are 
currently running old version 17.02.11. Would you guys advise us 
upgrading to 18.08 or 19.05?


Slurm versions only support upgrading from 2 major versions back, so you 
could only upgrade from 17.02 to 17.11 or 18.08.  I'd suggest going 
straight to 18.08.


Remember you have to upgrade slurmdbd first, then upgrade slurmctld and 
then finally the slurmd's.


Also, as Ole points out, 20.02 is due out soon at which point 18.08 gets 
retired from support, so you'd probably want to jump to 19.05 from 18.08.


Don't forget to take backups first!  We do a mysqldump of the whole 
accounting DB and rsync backups of our state directories before an upgrade.
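
As a rough sketch of that backup step, the helper below only prints the two commands so they can be reviewed before anything is run. `backup_commands` is a hypothetical name, and the database name and directories used in the example call are assumptions; substitute the values from your own slurmdbd.conf and slurm.conf.

```shell
# Hypothetical helper: print (not run) a mysqldump of the accounting DB and
# an rsync of the state directory. The date stamp can be passed in as the
# fourth argument; it defaults to today.
backup_commands() {
    db=$1
    state_dir=$2
    dest=$3
    stamp=${4:-$(date +%Y%m%d)}
    printf 'mysqldump --single-transaction %s > %s/%s-%s.sql\n' \
        "$db" "$dest" "$db" "$stamp"
    printf 'rsync -a %s/ %s/state-%s/\n' "$state_dir" "$dest" "$stamp"
}
```

For example, `backup_commands slurm_acct_db /var/spool/slurmctld /backup` prints one mysqldump line and one rsync line with today's date in the target names. Take the backups before starting the upgrade, which then proceeds in the order given above: slurmdbd, then slurmctld, then the slurmd's.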


Best of luck!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



Re: [slurm-users] Slurm Upgrade from 17.02

2020-02-19 Thread Ole Holm Nielsen

On 2/19/20 3:10 PM, Ricardo Gregorio wrote:
I am putting together an upgrade plan for slurm on our HPC. We are 
currently running old version 17.02.11. Would you guys advise us upgrading 
to 18.08 or 19.05?


You should be able to upgrade 2 Slurm major versions in one step.  The 
18.08 version is just about to become unsupported since 20.02 will be 
released shortly.  We use 19.05.5.


I have collected a number of upgrading details in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm

You really, really want to perform a dry-run Slurm database upgrade on a 
test machine before doing the real upgrade!  See

https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#make-a-dry-run-database-upgrade

I understand we will have to also upgrade the version of mariadb from 5.5 
to 10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' 
and 'bug 6796' amongst other things.


We use the default MariaDB 5.5 in CentOS 7.7.  Upgrading to MariaDB 10 
seems to have quite a number of unresolved installation issues, so I would 
skip that for now.  Se s



We would appreciate your comments/recommendations


Slurm 19.05 works great for us.  We're happy with our SchedMD support 
contract.


/Ole



[slurm-users] Slurm Upgrade from 17.02

2020-02-19 Thread Ricardo Gregorio

hi all,

I am putting together an upgrade plan for slurm on our HPC. We are currently 
running old version 17.02.11. Would you guys advise us upgrading to 18.08 or 
19.05?

I understand we will have to also upgrade the version of mariadb from 5.5 to 
10.X and pay attention to 'long db upgrade from 17.02 to 18.X or 19.X' and 'bug 
6796' amongst other things.

We would appreciate your comments/recommendations

Regards,
Ricardo Gregorio
Research and Systems Administrator
Operations ITS


