On 5/1/23 12:08, Angel de Vicente wrote:
Hello Ole,

Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> writes:

As Brian wrote:

On a technical note: slurm keeps the detailed accounting data for each cluster
in separate TABLES within a single database.

In the Federation page
https://slurm.schedmd.com/federation.html
it is implicitly assumed that the sacctmgr command talks only to a single
slurmdbd instance.  It is not, however, explicitly stated as an answer to your
question.

Hence my question. As I said in a previous mail, my reading of the
documentation is that this is the standard way to do it, but right now
I have it working the other way: each cluster has its own slurmdbd
daemon, and all of them connect to a single mysqld daemon on a third
machine (option 2 from my question).
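
For concreteness, a minimal sketch of what this layout (option 2) could
look like on one cluster. All host names, the cluster name, the database
user and the database name are hypothetical placeholders, not values from
the actual installation, and the password setting is left out:

```
# slurmdbd.conf on the slurmdbd host of one cluster (sketch only)
# Hypothetical host that runs this cluster's own slurmdbd:
DbdHost=clusterA-dbd
StorageType=accounting_storage/mysql
# Hypothetical third machine that runs the shared mysqld:
StorageHost=dbserver
# Assumed database account and database name, identical on every cluster:
StorageUser=slurm
StorageLoc=slurm_acct_db

# slurm.conf on the same cluster (sketch only)
# Hypothetical cluster name; it must differ between clusters:
ClusterName=clusterA
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=clusterA-dbd
```

The second cluster would carry the same settings with its own ClusterName
and its own DbdHost/AccountingStorageHost, while StorageHost and
StorageLoc stay the same, so both daemons write to the one database.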

I have a single database with detailed accounting data for each cluster
in separate tables, and from each cluster I can query the whole
database. As far as I can see everything is working fine, but it is
implemented differently from the standard approach.

I did it this way not because I wanted something special or outside of
the standard, but simply because it was not very clear to me from the
documentation which way to go, and this came naturally when implementing
it (maybe simply because I don't have Slurm installed on the database
machine). And I have no problem with changing the installation to a
single slurmdbd daemon if I need to.

But this being my first time I just hope to learn if this is really a
bad idea that is going to bite me in the near future when these machines
go to production and I should change to the standard way, or in general
whether someone has a clear idea of the pros/cons of both ways.

If you are implementing Slurm for the first time, the slurm-users mailing list is probably the most helpful place to ask questions. The official Slurm documentation is of course the place to start learning. Some people have found my Slurm Wiki page helpful:
https://wiki.fysik.dtu.dk/Niflheim_system/SLURM/
However, I do not describe federated clusters because we don't use this aspect.

I also recommend SchedMD's paid support contracts, since they are the experts and provide a fantastic service: https://www.schedmd.com/support.php

/Ole
