+1, looks OK. We also need OS-level monitoring of the cartridge agent to keep it alive, so that we avoid the scenario where the process is up and the VM is up but the agent is down. That's an easy task.
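
To make the agent keep-alive idea concrete, here is a rough sketch of a watchdog loop. It is only an illustration, not how Stratos does it: the function name `keep_alive`, the restart limit, and the poll interval are all made up for this example, and the command passed in stands in for whatever actually launches the agent. In practice an existing supervisor (systemd, monit, supervisord, etc.) could do the same job.

```python
import subprocess
import sys
import time


def keep_alive(cmd, max_restarts=3, poll_interval=1.0):
    """Start cmd and restart it each time it exits.

    Gives up after max_restarts restarts and returns the number of
    restarts performed. All names and defaults here are hypothetical.
    """
    restarts = 0
    proc = subprocess.Popen(cmd)
    while True:
        # poll() returns None while the child is still running.
        if proc.poll() is not None:
            if restarts >= max_restarts:
                return restarts
            restarts += 1
            proc = subprocess.Popen(cmd)
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Stand-in for the agent: a child process that exits immediately.
    fake_agent = [sys.executable, "-c", "pass"]
    print("restarts:", keep_alive(fake_agent, max_restarts=2, poll_interval=0.1))
```

A real setup would probably restart without a hard limit and report repeated crashes, so that the "Down / Down" row of the table below still gets triggered when the agent cannot be brought back.
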

On Tue, Oct 29, 2013 at 3:52 PM, Lahiru Sandaruwan <[email protected]> wrote:

> Hi all,
>
> We (Imesh, Reka, and myself) had a small discussion on $subject while
> working on Stratos 4.0 M1.
>
> This is on handling faults in VM instances. For example there can be three
> basic faults.
>
> - Network Issue
> - Application process is terminated
> - VM itself is terminated
>
> Here is the decision table,
>
> Process: Down
> VM: Up
> Decision flow:
>   - Cartridge agent publish event to CC
>   - CC updates instance status in topology
>   - Autoscaler decides to kill it
>
> Process: Down
> VM: Down (It can be that agent is crashed)
> Decision flow:
>   - CEP identify that & publish event to Autoscaler
>   - Autoscaler calls CC to terminate (if available) and remove the instance
>     from topology
>   - Autoscaler will spawn another to cover that
>
> Process: Up
> VM: Up (but network issue)
> Decision flow:
>   - CEP sends statistics on fault requests to Autoscaler
>   - Autoscaler keep monitoring it and takes a decision to terminate the
>     instance
>   - Autoscaler will spawn another to cover that
>
> Feed your thoughts here...
>
> Thanks.
>
> --
> Lahiru Sandaruwan
> Software Engineer,
> Platform Technologies,
> WSO2 Inc., http://wso2.com
> lean.enterprise.middleware
>
> email: [email protected] cell: (+94) 773 325 954
> blog: http://lahiruwrites.blogspot.com/
> twitter: http://twitter.com/lahirus
> linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146

--
Best Regards,
Nirmal

Nirmal Fernando.
PPMC Member & Committer of Apache Stratos,
Senior Software Engineer, WSO2 Inc.

Blog: http://nirmalfdo.blogspot.com/
