Hello!
You wrote to the list:
>I would also add that you can include other files within a slurm.conf
>file, which could make management of SLURM's configuration file easier
>for you.
True, include files are not possible with remote config loading, but I
don't think that's a big disadvantage; whoever chooses remote loading
accepts this limitation.
>________________________________________
>From: [email protected] [[email protected]] On
>Behalf Of Danny Auble [[email protected]]
>Sent: Thursday, May 12, 2011 4:00 PM
>To: [email protected]
>Subject: Re: [slurm-dev] RFD: simple getting rid of config syncronization.
>On Thursday, May 12, 2011 03:10:14 PM Andrej N. Gritsenko wrote:
>> Hello there!
>>
>> There is a little problem with SLURM: each node should have own copy
>> of slurm.conf and if you change it on controller then you should update
>> it on all other nodes as well. There is a simple solution for it - you
>> should just have it on one node - slurmdbd for example - and load it for
>> each slurmctld or slurmd by means of accounting_storage plugin. See a
>> simple API for that in attachment, it expands config file name (option -f
>> of slurmd or slurmctld or environment variable SLURM_CONF) so it can now
>> contain some non-local name in form "plugin:host:port", i.e. for example
>> "slurmdbd:sqlnode:7031". If use it with accounting_storage/slurmdbd the
>> slurmdbd.conf should contain slurmdbd variables and slurm.conf variables
>> too, or else RPC DBD_GET_CONFIG (and acct_storage_g_get_config() function
>> and appropriate acct_storage_p_get_config() as well) should be expanded
>> in future version of SLURM to request exact config type.
>I am not sure how scalable this would be.
>What if you had 10k nodes all asking for it at the same time?
Sure, that's possible, but only if the whole cluster gets restarted.
Even in that case:
a) nodes will never boot in the same millisecond, or even the same
second; boot time is close to random, so the requests will be dispersed
somewhat;
b) sending the config isn't a resource-consuming operation, it's
basically copying data from memory into a TCP socket;
c) nodes already communicate with slurmctld at startup (the registration
request), so I think the time cost is about the same.
>What if your slurmdbd isn't on the network your compute nodes have access to?
Then this method isn't applicable. I never claimed to propose a panacea,
you know, just another way to reach a config.
>What happens if your DBD is down or unreachable? It seems like it could
>potentially bring down your entire enterprise (as most people have 1 DBD to
>service multiple clusters) with a potential single point of failure.
What happens if the controller is down or unreachable? Isn't that about
the same? If that is the concern, then we should increase the reliability
of the service itself.
>Another concern is all the other user commands like srun, squeue and such
>would have to go and get this information as well when starting. It seems
>like it could overload the DBD quite fast if it was continuously being
>bombarded with requests for config information. It seems like you would have
>to have a wrapper around all the commands as well to tell the commands where
>the DBD was. How do you handle this in your cluster running this?
About the wrapper: as I said before, I set the SLURM_CONF variable in the
environment (e.g. SLURM_CONF=slurmdbd:sqlnode:7031) and that solves the
question. About srun, squeue and the other commands: yes, they will ask
for the config too, but how big an impact will they have on the storage?
How many such commands will a typical cluster see in a minute? Tens?
Hundreds? Is it a problem for storage on decent hardware to service a few
connections per second?
>I am not against the idea of having the database house this information, but
>currently you don't have to have the database to run. I am guessing very few
>people use the -f option (as you probably noted when you found the bug earlier
>;)).
>On most installations the slurm.conf doesn't change very often and there are
>tools out there that will dist to all your nodes when it does.
I understand your concerns. I've just published one possible solution;
it's your decision whether you want it and whether it would be useful for
someone else. ;)
>> This solution already works very well in our cluster but we use own
>> accounting_storage plugin so to get accounting_storage/slurmdbd work that
>> way you have to do something with described problem as either slurmdbd
>> will complain about unknown variables in config or scontrol will do. So
>> I've marked the proposal as RFD - if you want it then resolve the problem
>> (in 2.3.0 probably?) and include this patch. :)
>I am not sure I understand your statement about unknown variables? I am
>guessing that is on your install?
As I said above about the current implementation of the DBD_GET_CONFIG
RPC: it has no parameters and returns DBD_GOT_CONFIG with the contents of
slurmdbd.conf, so you have to put all the slurm.conf variables into that
file if you want to use it as proposed. At the least, the 'scontrol
reconfigure' command will complain about any non-slurm.conf variable
(and, as you know, slurmdbd.conf has variables of its own). I'm not sure
whether slurmdbd would also complain about non-slurmdbd.conf variables,
as we don't run slurmdbd anymore. And again, currently slurmdbd would
never send non-slurmdbd.conf variables in that RPC (DBD_GOT_CONFIG), so
it needs to be adjusted somehow to serve this purpose. The solution is to
add a config type (and probably a cluster name too) to the next version
of the DBD_GET_CONFIG request, so that slurmdbd will send either its own
config or another one as requested (i.e. slurm.conf, topology.conf,
etc.).
>Thanks for your ideas,
You're always welcome!
Andriy.