Paul,

thank you for your hint! That was the root cause of our problems:

https://bugs.launchpad.net/ubuntu/+source/ifenslave/+bug/1288196
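
For anyone chasing the same symptom, here is a minimal Python sketch that lists the MAC address each interface currently reports, so a bond with an unexpected MAC stands out. It assumes a Linux host that exposes /sys/class/net. The function names list_interface_macs and bond_slaves are made up for this example and are not part of CloudStack or ifenslave.

```python
from pathlib import Path


def list_interface_macs(sysfs_root: str = "/sys/class/net") -> dict[str, str]:
    """Map each interface name to the MAC address sysfs reports for it."""
    macs = {}
    for iface in sorted(Path(sysfs_root).iterdir()):
        address_file = iface / "address"
        # Entries such as the plain file "bonding_masters" have no address file.
        if address_file.is_file():
            macs[iface.name] = address_file.read_text().strip().lower()
    return macs


def bond_slaves(iface: str, sysfs_root: str = "/sys/class/net") -> list[str]:
    """Return the slave interfaces of a bond master, or [] if iface is not a bond."""
    slaves_file = Path(sysfs_root) / iface / "bonding" / "slaves"
    if not slaves_file.is_file():
        return []
    return slaves_file.read_text().split()


if __name__ == "__main__":
    for name, mac in list_interface_macs().items():
        slaves = bond_slaves(name)
        note = f"  (bond master of: {', '.join(slaves)})" if slaves else ""
        print(f"{name:12} {mac}{note}")
```

Running it before and after a reboot and comparing the output should show whether the bond's MAC changes between boots.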

We simply didn't know that the msid is derived from the MAC address.
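
To make that relationship visible, here is a small Python sketch that converts an msid back into MAC notation. It assumes the msid is simply the 48-bit MAC address read as a big-endian integer. The thread only states that the msid is "generated from" the MAC, so the exact mapping is an assumption, and the helper names msid_to_mac and mac_to_msid are invented for this example.

```python
def msid_to_mac(msid: int) -> str:
    """Render a numeric msid as a colon-separated MAC address.

    Assumption: the msid is the 6-byte MAC interpreted as a big-endian integer.
    """
    return ":".join(f"{byte:02x}" for byte in msid.to_bytes(6, "big"))


def mac_to_msid(mac: str) -> int:
    """Inverse of msid_to_mac: turn 'aa:bb:cc:dd:ee:ff' into an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


if __name__ == "__main__":
    # The two msid values from the mshost table quoted further down.
    for msid in (57177340185274, 57177340185273):
        print(msid, "->", msid_to_mac(msid))
```

Under that assumption, the two msid values in the mshost table below decode to MAC addresses that differ only in the last byte, which fits a bond coming up with a different slave's MAC after the reboot.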

Our services are manageable again ;)

Thanks again guys!

cheers,

- Stephan


On Wednesday, 17.02.2016, 19:16 +0000, Paul Angus wrote:
> The msid is generated from the MAC address of the host when the service
> starts. The two IDs are subtly different. Do you have some bonding in place
> that may be misconfigured and is generating the second MAC?
> 
> 
> 
> Paul Angus
> VP Technology, ShapeBlue
>
> t: @cloudyangus
> e: paul.an...@shapeblue.com | w: www.shapeblue.com
>
> -----Original Message-----
> From: Simon Weller [mailto:swel...@ena.com]
> Sent: Wednesday, February 17, 2016 6:11 PM
> To: dev@cloudstack.apache.org
> Cc: Glenn Wagner <glenn.wag...@shapeblue.com>
> Subject: Re: [update] ACS management unable to connect to xenserver hosts 
> after reboot
> 
> Stephan,
> 
> When you restart the management process, do you see any logs indicating it's 
> trying to peer with another management server?
> 
> - Si
> 
> ________________________________________
> From: Stephan Seitz <s.se...@secretresearchfacility.com>
> Sent: Wednesday, February 17, 2016 9:28 AM
> To: dev@cloudstack.apache.org
> Cc: Glenn Wagner
> Subject: Re: [update] ACS management unable to connect to xenserver hosts 
> after reboot
> 
> Glenn,
> 
> thanks for your reply. Unfortunately the SSVM has been destroyed.
> 
> We don't have any firewall in between. ACS and XenServers are located in the 
> same /22. I've double checked every connection and there's no iptables or 
> similar in the way.
> Since the SSVM is gone, I've instead verified that the consoleproxy VM is
> able to connect to port 8250.
> 
> To me it looks like there's some strange "identity" problem.
> 
> mysql> select * from mshost;
> +----+----------------+---------------+------------------+-------+---------+------------+--------------+---------------------+---------+-------------+
> | id | msid           | runid         | name             | state | version | service_ip | service_port | last_update         | removed | alert_count |
> +----+----------------+---------------+------------------+-------+---------+------------+--------------+---------------------+---------+-------------+
> |  1 | 57177340185274 | 1455209855143 | acs-management-1 | Up    | 4.7.1   | 10.97.13.1 |         9090 | 2016-02-12 16:55:56 | NULL    |           0 |
> |  3 | 57177340185273 | 1455639355379 | acs-management-1 | Up    | 4.7.1   | 10.97.13.1 |         9090 | 2016-02-17 11:31:50 | NULL    |           0 |
> +----+----------------+---------------+------------------+-------+---------+------------+--------------+---------------------+---------+-------------+
> 2 rows in set (0.00 sec)
> 
> Indeed, there is (and always has been) only one management host in this 
> infrastructure.
> 
> With SQL dumps at hand, we removed the second row and purged all the jobs
> related to that id, but after restarting cloudstack-management, this entry
> was created again.
> 
> Maybe I'm completely wrong, but is it possible that our management host
> "thinks" there's another management host responsible for our cluster?
> 
> Since we've been fiddling with this for at least two days without any
> success, I'm willing to have a few consulting hours spent on it.
> 
> cheers,
> 
> - Stephan
> 
> By the way, sorry if this is a double post, but I think the list ate my last mail...
> 
> 
> On Tuesday, 16.02.2016, 20:39 +0000, Glenn Wagner wrote:
> > Hi Stephan,
> >
> > Check that you can telnet to port 8250 on the management server from the
> > SSVM, and check that iptables has been set up correctly. It looks like a
> > firewall issue on the ACS management server.
> >
> > Thanks
> > Glenn
> >
> >
> >
> >
> >
> > Glenn Wagner
> > Senior Consultant, ShapeBlue
> > s: +27 21 527 0091 | m: +27 73 917 4111
> > e: glenn.wag...@shapeblue.com | w: www.shapeblue.com
> > a: 2nd Floor, Oudehuis Centre, 122 Main Rd, Somerset West Cape Town 7130
> > South Africa
> >
> > Shape Blue Ltd is a company incorporated in England & Wales. ShapeBlue
> > Services India LLP is a company incorporated in India and is operated
> > under license from Shape Blue Ltd. Shape Blue Brasil Consultoria Ltda
> > is a company incorporated in Brasil and is operated under license from
> > Shape Blue Ltd. ShapeBlue SA Pty Ltd is a company registered by The
> > Republic of South Africa and is traded under license from Shape Blue
> > Ltd. ShapeBlue is a registered trademark.
> > This email and any attachments to it may be confidential and are
> > intended solely for the use of the individual to whom it is addressed.
> > Any views or opinions expressed are solely those of the author and do
> > not necessarily represent those of Shape Blue Ltd or related
> > companies. If you are not the intended recipient of this email, you
> > must neither take any action based upon its contents, nor copy or show
> > it to anyone. Please contact the sender if you believe you have
> > received this email in error.
> >
> >
> >
> >
> >
> > -----Original Message-----
> > From: Stephan Seitz [mailto:s.se...@secretresearchfacility.com]
> > Sent: Tuesday, 16 February 2016 5:19 PM
> > To: us...@cloudstack.apache.org
> > Cc: dev@cloudstack.apache.org
> > Subject: [update] ACS management unable to connect to xenserver hosts
> > after reboot
> >
> > Hi again!
> >
> > I think we've found the root cause, but we are unable to mitigate it:
> >
> > 2016-02-16 16:13:22,217 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-8:null) Seq 6--1: MgmtId 57177340185273: Req: Routing to peer
> > 2016-02-16 16:13:22,217 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-9:null) Seq 6--1: MgmtId 57177340185273: Req: Cancel request received
> > 2016-02-16 16:13:22,899 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-10:null) Seq 1-4458000681143369786: MgmtId 57177340185273: Req: Resource [Host:1] is unreachable: Host 1: Link is closed
> > 2016-02-16 16:13:22,899 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-10:null) Seq 1--1: MgmtId 57177340185273: Req: Routing to peer
> > 2016-02-16 16:13:22,900 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-11:null) Seq 1--1: MgmtId 57177340185273: Req: Cancel request received
> > 2016-02-16 16:13:22,905 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (AgentManager-Handler-12:null) Seq 3-2144839322535198778: MgmtId 57177340185273: Req: Resource [Host:3] is unreachable: Host 3: Link is closed
> >
> > Here's a longer excerpt from the logfile during startup:
> >
> > http://pastebin.com/SftVJCs4
> >
> > Maybe someone knows how to resolve this? To me it looks like our
> > single management-host has some kind of identity crisis?
> >
> >
> > On Tuesday, 16.02.2016, 15:12 +0100, Stephan Seitz wrote:
> > > Hi acs gurus!
> > >
> > > We're currently facing a really strange problem after two somewhat
> > > simple steps.
> > > 1. Reboot the management node (a 2nd NFS storage is also located
> > > there)
> > > 2. Upgrade from 4.7.0 to 4.7.1
> > >
> > > Both steps appeared to succeed, but after a few days I noticed the
> > > SSVM in the "running, not connected" state, so I decided to restart
> > > it. That's where all the trouble began...
> > >
> > > I've pasted a somewhat repetitive log excerpt here:
> > > http://pastebin.com/8MM6XUBk
> > >
> > > If I try to (force) reconnect a host, we get huge repetitive log
> > > entries like the ones pasted here: http://pastebin.com/cNR3TtkG
> > >
> > > CloudMonkey quits with the following response:
> > >
> > > (local) 🐡 > reconnect host id=df4182f8-24a0-40ca-9ccc-6489f374cd4c
> > > Error Connection refused by server: ('Connection aborted.',
> > > BadStatusLine("''",))
> > >
> > >
> > > I've tcpdump'ed the relevant traffic between the management server and
> > > the XenServers and found nothing except some (I assume) unrelated NFS
> > > packets.
> > >
> > > Could someone please shed some light on how to fix that?
> > >
> > > Thanks in advance!
> > >
> > > - Stephan
> >
> >
> >
> > Find out more about ShapeBlue and our range of CloudStack related
> > services:
> > IaaS Cloud Design & Build | CSForge – rapid IaaS deployment framework
> > CloudStack Consulting | CloudStack Software Engineering CloudStack
> > Infrastructure Support | CloudStack Bootcamp Training Courses
> >
> 
> 

