Hi Nirmal,
On Thu, Jan 9, 2014 at 10:25 AM, Nirmal Fernando <[email protected]> wrote:

> Hi Lahiru,
>
> I'm sorry, but I don't get your point here and also I don't understand how
> the proposed solution would solve this problem

Hmm, is there any specific point that is not clear? Basically, it solves the
problem of repeatedly starting and terminating instances when there is a
long-running issue.

> (what if there's an error and that made member activation event to not
> sent ever).

If this happens for only one instance (the member activation event is never
sent), we will only wait the given timeout (say 5 mins), so there is no
issue. If it happens for consecutive instances (say 3), we can say that
there is some long-running issue. So after 3 consecutive failures we will
wait longer than the initial delay. If this continues, we can keep
increasing the wait, up to a ceiling.

Thanks.

> I wonder whether we need to be concerned on this kind of issues.
>
> On Thu, Jan 9, 2014 at 10:00 AM, Isuru Haththotuwa <[email protected]> wrote:
>
>> +1. We can make the waiting time x2 each time, but there should be a
>> ceiling value as well.
>>
>> On Wed, Jan 8, 2014 at 11:28 PM, Lahiru Sandaruwan <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> Currently if an instance is not joined after a timeout, we will
>>> terminate the instance and it will be removed from the pending state.
>>> Then the Autoscaler will decide to spawn more instances according to the
>>> rules, to cover terminated instances.
>>> If there is an error which blocks sending member activate event (in the
>>> cartridge, network or at any other place), system will be terminating and
>>> starting instances continuously, which is an utter waste of resources.
>>>
>>> So I suggest following scenario,
>>>
>>> We keep a count of unactivated instances per cluster. If this count
>>> exceeds a limit (say 3 - should be configurable), we will increase waiting
>>> time on the next instance activation. Maybe we keep increasing.
>>> We can reset the count whenever a member activation is received.
>>>
>>> Wdyt?
>>>
>>> Thanks.
>>>
>>> Sent from my mobile.
>>
>> --
>> Thanks and Regards,
>>
>> Isuru H.
>> Software Engineer, WSO2 Inc.
>> +94 716 358 048 <http://wso2.com/>
>
> --
> Best Regards,
> Nirmal
>
> Nirmal Fernando.
> PPMC Member & Committer of Apache Stratos,
> Senior Software Engineer, WSO2 Inc.
>
> Blog: http://nirmalfdo.blogspot.com/

--
Lahiru Sandaruwan
Software Engineer,
Platform Technologies,
WSO2 Inc., http://wso2.com
lean.enterprise.middleware

email: [email protected]
cell: (+94) 773 325 954
blog: http://lahiruwrites.blogspot.com/
twitter: http://twitter.com/lahirus
linked-in: http://lk.linkedin.com/pub/lahiru-sandaruwan/16/153/146
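
P.S. In case it helps make the proposal concrete, here is a rough Python sketch of the per-cluster counting discussed in this thread. It is only an illustration under assumptions, not Stratos code: the class name ActivationBackoff, its method names, and the one-hour ceiling are all hypothetical. The 5-minute timeout, the limit of 3, and the x2 growth are the example values mentioned above. I read "exceeds a limit" as "after 3 consecutive failures", so the longer wait applies from the fourth instance onwards.

```python
from dataclasses import dataclass


@dataclass
class ActivationBackoff:
    """Per-cluster activation timeout with capped growth (hypothetical helper)."""

    base_timeout_s: float = 300.0  # initial wait: "say 5 mins" in the thread
    failure_limit: int = 3         # consecutive failures before backing off
    multiplier: float = 2.0        # "x2 each time", as suggested in the thread
    ceiling_s: float = 3600.0      # assumed ceiling; the thread names no value
    consecutive_failures: int = 0  # unactivated instances since last activation

    def record_timeout(self) -> None:
        """An instance was terminated because it never activated."""
        self.consecutive_failures += 1

    def record_activation(self) -> None:
        """A member activation event arrived, so reset the count."""
        self.consecutive_failures = 0

    def current_timeout(self) -> float:
        """Seconds to wait for the next instance to activate."""
        steps = self.consecutive_failures - self.failure_limit + 1
        timeout = self.base_timeout_s
        # Below the limit, steps <= 0 and the loop body never runs.
        for _ in range(steps):
            timeout *= self.multiplier
            if timeout >= self.ceiling_s:
                return self.ceiling_s  # never wait longer than the ceiling
        return timeout
```

With these example values, the wait stays at 300 seconds for the first three instances, then grows to 600, 1200 and 2400 seconds on further consecutive failures, and is held at the 3600-second ceiling after that. A single member activation brings it back to 300 seconds.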
