Re: [DISCUSS] Management server (pre-)shutdown to avoid killing jobs

2017-12-20 Thread Rohit Yadav
Hi Marc,


I like the idea. I guess a locking service was needed in CloudStack not only to
solve the issue of locking and to get rid of the DB-based locks (which, if we
can get rid of them, may help people migrate to MySQL clusters in active-active
setups, which currently cannot be used due to the LOCK usage), but also to fix the
issue of claim and ownership (i.e. which management server owns which resource,
such as hosts, VMs, volumes, etc.).


To retain CloudStack as a turnkey/standalone solution, embedded ZooKeeper may be
used for this purpose, and the new CA framework, if applicable, could be used to
secure a cluster of management servers running the ZK plugin/services. This will
also require refactoring the job manager/service layer to be locking-service
aware. I guess a general, pluggable locking service manager could be
implemented for this purpose, one that supports plugins with a default plugin
that is (embedded-)ZK based.
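Such a pluggable locking manager could be sketched as follows. This is an illustrative Python sketch only; every class and method name here is hypothetical, and a real default plugin would wrap embedded ZooKeeper (for example via Curator lock recipes) rather than the process-local stand-in shown:

```python
import threading
from abc import ABC, abstractmethod
from contextlib import contextmanager

class LockingPlugin(ABC):
    """Backend contract; a ZK plugin would create an ephemeral znode here."""
    @abstractmethod
    def acquire(self, name: str, timeout: float) -> bool: ...
    @abstractmethod
    def release(self, name: str) -> None: ...

class InProcessLockingPlugin(LockingPlugin):
    """Default stand-in: process-local locks (single-JVM semantics only)."""
    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def acquire(self, name, timeout=5.0):
        with self._guard:
            lock = self._locks.setdefault(name, threading.Lock())
        return lock.acquire(timeout=timeout)

    def release(self, name):
        self._locks[name].release()

class LockingServiceManager:
    """What job/agent managers would code against instead of DB row locks."""
    def __init__(self, plugin: LockingPlugin):
        self._plugin = plugin

    @contextmanager
    def lock(self, resource_id: str, timeout: float = 5.0):
        if not self._plugin.acquire(resource_id, timeout):
            raise TimeoutError(f"could not lock {resource_id}")
        try:
            yield
        finally:
            self._plugin.release(resource_id)

manager = LockingServiceManager(InProcessLockingPlugin())
with manager.lock("host-42"):
    pass  # critical section: e.g. claiming ownership of host-42
```

Swapping the plugin for a ZK-backed one would give the same call sites cluster-wide mutual exclusion without DB LOCKs.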


With the agent-management server model, CloudStack agents such as the KVM, SSVM
and CPVM agents currently only have a single management server 'host' IP that
they connect to. With the introduction of the CA framework, I had tried to
change this to a list of hosts/IPs that an agent tries to connect to on
disconnection (say, a management server shutdown), and as mentioned there is PR
2309 that further improves/introduces a way of balancing. To solve the issue of
balancing (claim + ownership) of the agents across the cluster of management
servers, we may need a locking service/manager, such as one based on ZK, that
can help; it could trigger events such as rebalancing of tasks. We may also
explore using Gossip and other ways of discovery propagation and rebalancing of
agents with the new locking service/manager. I'm excited to see your attempt at
solving the problem.
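The host-list reconnection behaviour described above can be sketched roughly as follows. The names are hypothetical; the real agent logic lives in CloudStack's Java agent code, and this only illustrates cycling through candidate management servers on disconnection:

```python
class Agent:
    """Cycles through configured management server addresses on disconnection."""
    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.index = 0

    def next_host(self):
        # round-robin over the configured host list
        host = self.hosts[self.index % len(self.hosts)]
        self.index += 1
        return host

    def connect(self, try_connect):
        # try each candidate once per round until one accepts the connection
        for _ in range(len(self.hosts)):
            host = self.next_host()
            if try_connect(host):
                return host
        raise ConnectionError("no management server reachable")

agent = Agent(["ms1.example.com", "ms2.example.com"])
# if ms1 is shutting down, the agent falls through to ms2
connected = agent.connect(lambda h: h == "ms2.example.com")
```

A rebalancing event from the locking service could simply force a disconnect, letting this loop pick the next server.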


- Rohit


From: Marc-Aurèle Brothier 
Sent: Monday, December 18, 2017 7:26:21 PM
To: dev@cloudstack.apache.org
Subject: [DISCUSS] Management server (pre-)shutdown to avoid killing jobs

Hi everyone,

Another point, another thread. Currently, when shutting down a management
server, despite all the "stop()" methods not being called as far as I know,
the server could be in the middle of processing an async job task. This will
lead to a failed job, since the response won't be delivered to the correct
management server even though the job might have succeeded on the agent. To
overcome this limitation during our weekly production upgrades, we added a
pre-shutdown mechanism which works alongside HA-proxy. The management
server keeps an eye on a file "lb-agent" into which keywords can be
written, following the HA-proxy agent-check guide (
https://cbonte.github.io/haproxy-dconv/1.9/configuration.html#5.2-agent-check).
When it finds "maint", "stopped" or "drain", it stops these threads:
 - AsyncJobManager._heartbeatScheduler: responsible for fetching and starting
execution of AsyncJobs
 - AlertManagerImpl._timer: responsible for sending capacity check commands
 - StatsCollector._executor: responsible for scheduling stats commands

The management server then stops most of its scheduled tasks. The correct
thing to do before shutting down the server would be to send
"rebalance/reconnect" commands to all agents connected to that management
server, to ensure that commands won't go through this server at all.

Here, HA-proxy is responsible for no longer sending API requests to the
corresponding server, with the help of this local agent check.

In case you want to cancel the maintenance shutdown, you can write
"up/ready" to the file and the different schedulers will be restarted.
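The file check described above could look roughly like the following. This is a minimal sketch only: the real implementation hooks into AsyncJobManager, AlertManagerImpl and StatsCollector, not the placeholder Scheduler objects used here:

```python
from pathlib import Path

# keywords from the HA-proxy agent-check protocol that mean "drain this node"
DRAIN_STATES = {"maint", "stopped", "drain"}
READY_STATES = {"up", "ready"}

class Scheduler:
    """Placeholder for a stoppable scheduled-task thread."""
    def __init__(self):
        self.running = True
    def stop(self):
        self.running = False
    def start(self):
        self.running = True

def check_lb_agent_file(path: Path, schedulers):
    """Poll the agent-check file and toggle the schedulers accordingly."""
    state = path.read_text().strip().lower() if path.exists() else "ready"
    if state in DRAIN_STATES:
        for s in schedulers:
            s.stop()    # stop heartbeat / alert / stats scheduling
    elif state in READY_STATES:
        for s in schedulers:
            s.start()   # maintenance cancelled: resume scheduling
    return state
```

A background thread would call `check_lb_agent_file` periodically, while HA-proxy reads the same file via its agent-check to stop routing API requests.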

This is really more a change for operations around CloudStack for people doing
live upgrades on a regular basis, so I'm unsure whether the community would
want such a change in the code base. It also goes a bit in the opposite
direction of the change removing the need for HA-proxy:
https://github.com/apache/cloudstack/pull/2309

If there is enough positive feedback for such a change, I will port the
changes to match the upstream branch in a PR.

Kind regards,
Marc-Aurèle

rohit.ya...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 



Re: [DISCUSS] Management server (pre-)shutdown to avoid killing jobs

2017-12-19 Thread Rafael Weingärtner
This is the best design, Marc. I also find it quite ….. seeing all of those
threads being managed manually in ACS. I would only look at other
technologies, such as Spring Integration or Akka, to create the structure for
an async messaging system across nodes/JVMs (I find Kafka too damn
complicated to install and maintain). There is a big problem, though. In a
system that we design and code from scratch, it is quite easy to create a
project, plan, and implement a new and resilient design like this. However,
in legacy code such as ACS, how can we move to this type of solution?

We would need a plan to extract/move bit by bit to this new design. For
instance, we could start by recreating the system VMs or the hosts' agents
(these are the most isolated pieces of code I can think of). It's time to
think about ACS 5.0; this might be our chance…


Re: [DISCUSS] Management server (pre-)shutdown to avoid killing jobs

2017-12-18 Thread Marc-Aurèle Brothier
It's definitely a great direction to take and much more robust. ZK would be a
great fit to monitor the state of the management servers and agents with the
help of its ephemeral nodes. On the other hand, it's not encouraged to use ZK
as a messaging queue; Kafka would be a much better fit for that purpose,
having partitions/topics. Doing a quick overview of the architecture, I would
see ZK used as an inter-JVM lock, holding management server and agent status
nodes along with their capacities, with a direct connection from each of them
to ZK. Kafka would be used to exchange the command messages between management
servers, and between management servers and agents. With those two kinds of
brokers in the middle, the system would be super resilient. For example, if a
management server sends a command to stop a VM on a host, but that host's
agent is stopping to perform an upgrade, then when the agent connects back to
the Kafka topic its "stop" message would still be there, provided it hadn't
expired, and the command could be processed. Of course it would have taken
more time, but still, it would not return an error message. This would remove
the need to create and manage threads in the management server to handle all
the async tasks and checks, and move it to an event-driven approach. At the
same time, it adds two dependencies that require setup and configuration,
moving away from the goal of having an easy, almost all-included, installable
solution... Trade-offs to be discussed.
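The retention behaviour described, where a "stop" command survives an agent's restart as long as it hasn't expired, can be illustrated with a toy topic. This is a plain Python stand-in only; a real deployment would rely on an actual Kafka topic and its retention configuration rather than this in-memory sketch:

```python
import time

class ToyTopic:
    """Minimal stand-in for a Kafka topic with time-based retention."""
    def __init__(self, retention_secs):
        self.retention = retention_secs
        self.messages = []  # list of (timestamp, payload)

    def publish(self, payload):
        self.messages.append((time.time(), payload))

    def poll(self, now=None):
        if now is None:
            now = time.time()
        # expired messages are dropped; surviving ones are delivered in order
        live = [(t, p) for t, p in self.messages if now - t < self.retention]
        self.messages = []
        return [p for _, p in live]

topic = ToyTopic(retention_secs=3600)
topic.publish({"cmd": "StopVM", "vm": "i-123"})
# ... the agent disconnects for an upgrade, then reconnects later ...
commands = topic.poll()  # within retention, the StopVM command is still there
```

If the agent stays away longer than the retention window, `poll` returns nothing and the command is lost, which is exactly the expiry trade-off mentioned above.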



Re: [DISCUSS] Management server (pre-)shutdown to avoid killing jobs

2017-12-18 Thread ilya musayev
I very much agree with Paul; we should consider moving to a resilient model
with the least dependence on components such as HA-proxy.

Sending a notification to a partner management server to take over the job
management would be ideal.



RE: [DISCUSS] Management server (pre-)shutdown to avoid killing jobs

2017-12-18 Thread Paul Angus
Hi Marc-Aurèle,

Personally, my utopia would be to be able to pass async jobs between mgmt.
servers.
So rather than waiting an indeterminate time for a snapshot to complete,
monitoring of the job would be passed to another management server.

I would LOVE for something like ZooKeeper to monitor the state of the mgmt.
servers, so that 'other' management servers could take over the async jobs in
the (unlikely) event that a management server becomes unavailable.
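That handover could be sketched with owner heartbeats, as below. The schema and names are hypothetical (real CloudStack tracks jobs in the database, and ZK ephemeral nodes would make the liveness check event-driven instead of the timestamp poll shown here):

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds without a heartbeat => owner presumed dead

class JobRegistry:
    """Toy registry mapping async jobs to owning management servers."""
    def __init__(self):
        self.jobs = {}        # job_id -> owning server id
        self.heartbeats = {}  # server id -> last heartbeat timestamp

    def heartbeat(self, server, now=None):
        self.heartbeats[server] = now if now is not None else time.time()

    def claim(self, job_id, server):
        self.jobs[job_id] = server

    def take_over_stale_jobs(self, server, now=None):
        """Reassign to `server` any job whose owner stopped heartbeating."""
        now = now if now is not None else time.time()
        taken = []
        for job_id, owner in list(self.jobs.items()):
            last = self.heartbeats.get(owner, 0)
            if owner != server and now - last > HEARTBEAT_TIMEOUT:
                self.jobs[job_id] = server
                taken.append(job_id)
        return taken
```

With ZK, the `take_over_stale_jobs` pass would instead fire from a watch on the dead server's ephemeral node disappearing.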



Kind regards,

Paul Angus

paul.an...@shapeblue.com 
www.shapeblue.com
53 Chandos Place, Covent Garden, London  WC2N 4HSUK
@shapeblue
  
 

