Re: [slurm-users] parastation (mpi)

2023-11-24 Thread Christopher Samuel

On 11/24/23 06:16, Heckes, Frank wrote:

My colleagues are using these toolchains on the Jülich cluster (especially
Juwels). My question is whether these eb files can be shared? I would
be especially interested in the ones using NVHPC as the core module.


If Jülich developed that toolchain then I think you'd need to ask them 
whether they are agreeable to sharing them.


Does anyone know whether ParaStation MPI is still an active project,
because the GitHub repository doesn’t show many recent changes?


There's a number of different repos under that umbrella, and whilst 
psmpi does look active it seems the psmgmt one has had more commits 
recently. So it does look active to me.


https://github.com/ParaStation/

All the best.
Chris
--
Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA




Re: [slurm-users] Slurm version 23.11 is now available

2023-11-24 Thread Ole Holm Nielsen




On 11/24/23 12:15, Ole Holm Nielsen wrote:

On 11/24/23 09:31, Gestió Servidors wrote:
Some days ago, I started to configure a new server with SLURM 23.02.5.
Yesterday, I read on this mailing list that version 23.11.0 was
released, so today I have compiled this latest version. However, after
starting slurmdbd (with a database upgrade), I have got problems with
slurmctld, because “select/cons_res” has disappeared and this
parameter must now be changed to “select/cons_tres”. If I change this, what
is the impact on the system/cluster? What are the differences between
“select/cons_res” and “select/cons_tres”, given that the newest versions
make this setting mandatory?


In https://bugs.schedmd.com/show_bug.cgi?id=15470#c3 you can read:

When making a change from cons_res to cons_tres there isn't much you 
need to do.  These two select type plugins are very similar.  The 
difference being that the cons_tres plugin adds much more functionality 
related to GPUs.  If you're moving from cons_res to cons_tres there 
shouldn't be any effect on the running jobs.  If you were changing from 
cons_tres to cons_res and you had jobs on the system that used the new 
syntax that is available for GPUs, then you would run into problems.  
For your reference, this is described in the documentation here:

https://slurm.schedmd.com/slurm.conf.html#OPT_SelectType_1

This page shows the new options that are available when using the 
cons_tres plugin:

https://slurm.schedmd.com/cons_res.html#using_cons_tres


Note added: A few more details are on this Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#upgrade-cons-res-to-cons-tres


/Ole



[slurm-users] parastation (mpi)

2023-11-24 Thread Heckes, Frank
Hello all,
Some of our scientists found the toolchains based on ParaStation MPI to be faster than
those using OpenMPI or impi. I couldn’t find any eb files for toolchains
based on this MPI implementation.
My colleagues are using these toolchains on the Jülich cluster (especially Juwels).
My question is whether these eb files can be shared? I would be especially
interested in the ones using NVHPC as the core module.
Does anyone know whether ParaStation MPI is still an active project,
because the GitHub repository doesn’t show many recent changes?
Many thanks in advance, and sorry if I overlooked something in the eb files of
version 4.8.1.
Cheers,
-Frank Heckes



Re: [slurm-users] Slurm version 23.11 is now available

2023-11-24 Thread Ole Holm Nielsen

On 11/24/23 09:31, Gestió Servidors wrote:
Some days ago, I started to configure a new server with SLURM 23.02.5.
Yesterday, I read on this mailing list that version 23.11.0 was released,
so today I have compiled this latest version. However, after starting
slurmdbd (with a database upgrade), I have got problems with slurmctld,
because “select/cons_res” has disappeared and this parameter must now
be changed to “select/cons_tres”. If I change this, what is the impact on
the system/cluster? What are the differences between “select/cons_res” and
“select/cons_tres”, given that the newest versions make this setting
mandatory?


In https://bugs.schedmd.com/show_bug.cgi?id=15470#c3 you can read:


When making a change from cons_res to cons_tres there isn't much you need to 
do.  These two select type plugins are very similar.  The difference being that 
the cons_tres plugin adds much more functionality related to GPUs.  If you're 
moving from cons_res to cons_tres there shouldn't be any effect on the running 
jobs.  If you were changing from cons_tres to cons_res and you had jobs on the 
system that used the new syntax that is available for GPUs, then you would run 
into problems.  For your reference, this is described in the documentation here:
https://slurm.schedmd.com/slurm.conf.html#OPT_SelectType_1

This page shows the new options that are available when using the cons_tres 
plugin:
https://slurm.schedmd.com/cons_res.html#using_cons_tres
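
For illustration only, the configuration change described above could be sketched in Python roughly as follows. This is a minimal sketch, not an official migration tool: the function name `migrate_select_type` is hypothetical, and it assumes the plugin is set on an uncommented `SelectType=` line in slurm.conf. Any SelectTypeParameters line is left untouched.

```python
import re

# Matches an uncommented "SelectType=select/cons_res" line.
# slurm.conf keywords are case-insensitive, hence re.IGNORECASE.
_SELECT_TYPE = re.compile(
    r"^(\s*SelectType\s*=\s*)select/cons_res\b",
    re.IGNORECASE | re.MULTILINE,
)


def migrate_select_type(conf_text: str) -> tuple[str, int]:
    """Hypothetical helper: return the rewritten slurm.conf text and the
    number of SelectType lines switched from cons_res to cons_tres."""
    return _SELECT_TYPE.subn(r"\g<1>select/cons_tres", conf_text)


if __name__ == "__main__":
    sample = (
        "ClusterName=test\n"
        "#SelectType=select/cons_res\n"
        "SelectType=select/cons_res\n"
        "SelectTypeParameters=CR_Core_Memory\n"
    )
    new_text, changed = migrate_select_type(sample)
    print(changed)
    print(new_text)
```

The commented-out line is deliberately skipped, so only the active setting changes. The daemons still need to be restarted afterwards for the new plugin to take effect.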


IHTH,
Ole



Re: [slurm-users] Slurm version 23.11 is now available

2023-11-24 Thread Gestió Servidors
Hello,

Some days ago, I started to configure a new server with SLURM 23.02.5.
Yesterday, I read on this mailing list that version 23.11.0 was released, so
today I have compiled this latest version. However, after starting slurmdbd
(with a database upgrade), I have got problems with slurmctld, because
"select/cons_res" has disappeared and this parameter must now be changed to
"select/cons_tres". If I change this, what is the impact on the system/cluster?
What are the differences between "select/cons_res" and "select/cons_tres", given
that the newest versions make this setting mandatory?

Thanks.


Re: [slurm-users] slurm communication between versions

2023-11-24 Thread Ryan Novosielski
What do you mean by management node, slurmctld? Or just a node with the client 
software on it?
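
The distinction matters because, as far as I recall from the Slurm upgrade documentation for releases of this era, a daemon accepts RPCs only from its own and the two previous major releases, and slurmctld must not be older than slurmd or the client commands. A rough Python sketch of that rule follows; the release list is hard-coded by hand and the helper names are hypothetical, so treat it as an illustration rather than an authoritative check.

```python
# Assumption: ordered list of Slurm major releases, hard-coded for illustration.
MAJOR_RELEASES = ["20.02", "20.11", "21.08", "22.05", "23.02", "23.11"]


def major_release(version: str) -> str:
    """Reduce '22.05.3' or 'slurm-22.05' to the major release '22.05'."""
    numeric = version.split("-")[-1]
    return ".".join(numeric.split(".")[:2])


def is_supported(upstream: str, downstream: str, window: int = 2) -> bool:
    """Hypothetical helper: True if `upstream` (e.g. slurmctld) is the same
    release as `downstream` (e.g. slurmd or client commands) or at most
    `window` major releases newer."""
    gap = MAJOR_RELEASES.index(major_release(upstream)) - MAJOR_RELEASES.index(
        major_release(downstream)
    )
    return 0 <= gap <= window


if __name__ == "__main__":
    # slurmctld 20.02 serving slurmd 22.05: controller older than the nodes.
    print(is_supported("slurm-20.02", "slurm-22.05"))
    # slurmctld 22.05 serving 20.02 client commands: three releases apart.
    print(is_supported("22.05", "20.02"))
    # slurmctld 22.05 serving slurmd 20.11: inside the two-release window.
    print(is_supported("22.05.3", "20.11.9"))
```

Under these assumptions, 20.02 and 22.05 fall outside the supported window in either reading of "management node".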

--
#BlackLivesMatter

|| \\UTGERS, |---*O*---
||_// the State  | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\of NJ  | Office of Advanced Research Computing - MSB A555B, Newark
 `'

On Nov 23, 2023, at 12:14, Felix wrote:

Hello

I am curious about something and have a question at the same time:

Will slurm-20.02, which is installed on a management node, communicate with
slurm-22.05 installed on the work nodes?

They have the same configuration file, slurm.conf.

Or do the versions have to be the same? Slurm 20.02 was installed manually and
Slurm 22.05 was installed through dnf.

Thank you

Felix

--
Dr. Eng. Farcas Felix
National Institute of Research and Development of Isotopic and Molecular 
Technology,
IT - Department - Cluj-Napoca, Romania
Mobile: +40742195323