Hi again! I think we've found the root cause, but we are unable to mitigate it:

2016-02-16 16:13:22,217 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-8:null) Seq 6--1: MgmtId 57177340185273: Req: Routing to peer
2016-02-16 16:13:22,217 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-9:null) Seq 6--1: MgmtId 57177340185273: Req: Cancel request received
2016-02-16 16:13:22,899 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-10:null) Seq 1-4458000681143369786: MgmtId 57177340185273: Req: Resource [Host:1] is unreachable: Host 1: Link is closed
2016-02-16 16:13:22,899 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-10:null) Seq 1--1: MgmtId 57177340185273: Req: Routing to peer
2016-02-16 16:13:22,900 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-11:null) Seq 1--1: MgmtId 57177340185273: Req: Cancel request received
2016-02-16 16:13:22,905 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-12:null) Seq 3-2144839322535198778: MgmtId 57177340185273: Req: Resource [Host:3] is unreachable: Host 3: Link is closed

Here's a longer excerpt from the logfile during startup:
http://pastebin.com/SftVJCs4

Maybe someone knows how to resolve this? To me it looks like our single
management host is having some kind of identity crisis.

On Tuesday, 16.02.2016 at 15:12 +0100, Stephan Seitz wrote:
> Hi acs gurus!
>
> We're currently facing a really strange problem after two somewhat
> simple steps.
> 1. Reboot Management-Node (well, a 2nd NFS storage is also located
>    there)
> 2. Upgrade 4.7.0 to 4.7.1
>
> Both steps seemed successful and running, but after a few days I
> noticed the SSVM in "running, not connected" state, so I decided to
> restart the SSVM. That's where all the trouble began...
>
> I've pasted a somewhat repetitive log excerpt here:
> http://pastebin.com/8MM6XUBk
>
> If I try to (force) reconnect a host, we get huge repetitive log
> entries like the ones pasted here: http://pastebin.com/cNR3TtkG
>
> Cloudmonkey quits with the following response:
>
> (local) 🐵 > reconnect host id=df4182f8-24a0-40ca-9ccc-6489f374cd4c
> Error Connection refused by server: ('Connection aborted.', BadStatusLine("''",))
>
> I've tcpdump'ed the relevant traffic between the management node and
> the XenServers and found simply nothing except some (I assume)
> unrelated NFS packets.
>
> Could someone please shed some light on how to fix that?
>
> Thanks in advance!
>
> - Stephan