There are no firewalls, and I have always been able to run 'sacctmgr show clusters' as well as things like 'squeue -M ALL' from both the db server and the cluster head.

For now, I will have to restart slurmctld on all the clusters when there are changes to associations.  But that is definitely not ideal.

Brian Andrus


On 11/8/2018 1:31 PM, Chris Samuel wrote:
On Friday, 9 November 2018 5:38:22 AM AEDT Brian Andrus wrote:

Where, slurmctld is not picking up new accounts unless it is restarted.
This is usually because slurmdbd cannot connect back to the slurmctld on the
management node to do the RPC to tell it that a new account/user/etc has
appeared.   When you restart slurmctld it connects to slurmdbd and grabs all
that information.  That can be because either slurmctld has registered an IP
address for itself that slurmdbd cannot connect to or because of intervening
firewalls/ACLs.

Check that the connection can be made; you can see the IP address and port
number that slurmctld has registered with "sacctmgr show clusters".
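
One way to run that check is sketched below. It is only a rough illustration and assumes it is run on the slurmdbd host with sacctmgr on the PATH. The helper names (parse_clusters, can_connect) are made up for this example, and a successful TCP connect only shows that the port is reachable, not that the RPC itself will succeed.

```python
import socket
import subprocess


def parse_clusters(text):
    """Parse 'Cluster|ControlHost|ControlPort' lines into (name, host, port) tuples."""
    clusters = []
    for line in text.splitlines():
        parts = line.strip().split("|")
        # Skip blank or malformed lines.
        if len(parts) < 3 or not parts[2].isdigit():
            continue
        clusters.append((parts[0], parts[1], int(parts[2])))
    return clusters


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def main():
    # -n drops the header, -P gives '|'-delimited output without a trailing '|'.
    out = subprocess.run(
        ["sacctmgr", "-n", "-P", "show", "clusters",
         "format=Cluster,ControlHost,ControlPort"],
        capture_output=True, text=True, check=True,
    ).stdout
    for name, host, port in parse_clusters(out):
        if not host or port == 0:
            # An empty host or port 0 is assumed here to mean that slurmctld
            # has not registered an address with slurmdbd.
            print(f"{name}: no address registered")
        elif can_connect(host, port):
            print(f"{name}: {host}:{port} reachable")
        else:
            print(f"{name}: {host}:{port} NOT reachable")


if __name__ == "__main__":
    main()
```

If a cluster shows up as not reachable from the slurmdbd host, the registered address or something in between is the likely culprit.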

Best of luck!
Chris

