At long last, I have the answer to this: SLURM maintains this address by itself. If your ControlHost is incorrect, chances are the machine running slurmctld is contacting your AccountingStorageHost from that IP. In our case, an unexpected NAT was making traffic appear to come from the VM host instead of the VM guest.
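For anyone wanting to check their own setup, here is a minimal sketch (not anything from the SLURM distribution) that compares the ControlHost recorded in the accounting database with the address you expect. It assumes `sacctmgr -n -P show cluster format=Cluster,ControlHost,ControlPort` prints one pipe-delimited line per cluster; the function names, the cluster name, and the sample addresses are made up for illustration.

```python
import subprocess


def parse_clusters(text):
    """Parse pipe-delimited 'Cluster|ControlHost|ControlPort' lines.

    Returns {cluster_name: (control_host, control_port)}; blank or
    short lines are skipped.
    """
    clusters = {}
    for line in text.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 3 or not fields[0]:
            continue
        clusters[fields[0]] = (fields[1], fields[2])
    return clusters


def mismatched(clusters, expected):
    """Return {cluster: (registered_host, expected_host)} where they differ.

    `expected` maps cluster names to the address slurmctld should be
    registered under; clusters absent from `clusters` are ignored.
    """
    return {
        name: (clusters[name][0], want)
        for name, want in expected.items()
        if name in clusters and clusters[name][0] != want
    }


def query_sacctmgr():
    """Run sacctmgr and parse its output (needs a working SLURM install)."""
    result = subprocess.run(
        ["sacctmgr", "-n", "-P", "show", "cluster",
         "format=Cluster,ControlHost,ControlPort"],
        check=True, capture_output=True, text=True,
    )
    return parse_clusters(result.stdout)


if __name__ == "__main__":
    # Hypothetical sample: the database holds the VM host's address
    # (192.0.2.10) instead of the guest's (192.0.2.20).
    sample = "mycluster|192.0.2.10|6817\n"
    print(mismatched(parse_clusters(sample), {"mycluster": "192.0.2.20"}))
```

On a real system you would swap the sample string for `query_sacctmgr()`. A mismatch here only tells you which source address slurmdbd saw; the fix is still on the network side, as described next.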
The remedy was to fix the traffic problem (for us, changing AccountingStorageHost to an address the VM guest can reach directly) and restart slurmctld. Apparently slurmctld contacts slurmdbd and the ControlHost is updated automatically, which explains why it kept reverting to the wrong value when I finally did try changing it in the database.

> On Aug 9, 2016, at 6:55 PM, Ryan Novosielski <novos...@oarc.rutgers.edu> wrote:
>
> Is it really possible that no one has an answer to this? I guess I can start
> looking through the source code, but I'd hope that there might be someone who
> at least understands how this part works well enough to know how to undo this
> mistake. Thanks either way!
>
> ________________________________________
> From: Ryan Novosielski <novos...@rutgers.edu>
> Sent: Friday, August 5, 2016 2:44 AM
> To: slurm-dev
> Subject: [slurm-dev] Re: sacctmgr modify cluster controlhost?
>
> Hi all,
>
> I'd written about this some time ago -- I need to change the ControlHost for
> my cluster (it somehow got set to a machine that does not run slurmctld).
> There was one answer about a patch related to listening on all interfaces on
> a host, but my problem is that the ControlHost is flat out the wrong host
> (the IP in there is a different host altogether).
>
> Does anyone know if there's an appropriate syntax to change this parameter,
> or if I'll have to change it in the database manually (and if so, the right
> thing to change)?
>
> Thanks!
>
> ________________________________________
> From: Ryan Novosielski <novos...@rutgers.edu>
> Sent: Tuesday, May 10, 2016 11:36 AM
> To: slurm-dev
> Subject: [slurm-dev] sacctmgr modify cluster controlhost?
>
> Hi there,
>
> Using SLURM 15.08. Apparently our cluster ControlHost as shown by
> sacctmgr show cluster is incorrect. I'm not sure how it got that way,
> and I didn't notice it until I went to try to use the -M flag on a
> cluster to contact it.
>
> It doesn't appear that the sacctmgr modify cluster where
> cluster=<clustername> set controlhost=<correctaddr> works, though it
> looks like it might from the manual.
>
> Any ideas for the best way to set this without dropping the data from
> the database?
>
> Thanks!

--
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'