Hi all,

Currently, if an instance has not joined within a timeout, we terminate it and remove it from the pending state. The Autoscaler then decides, according to its rules, to spawn more instances to cover the terminated ones. If an error blocks the member activate event from being sent (in the cartridge, the network, or anywhere else), the system will keep terminating and starting instances continuously, which is a complete waste of resources.
So I suggest the following: we keep a count of unactivated instances per cluster. If this count exceeds a limit (say 3; it should be configurable), we increase the waiting time for the next instance activation. Maybe we keep increasing it. We can reset the count whenever a member activation is received.

Wdyt?

Thanks.
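
To make the proposal concrete, here is a rough sketch of the counting and back-off logic in Python. It is only an illustration and not existing Autoscaler code: the class name `ActivationBackoff`, its method names, the default values, the doubling factor, and the upper cap on the waiting time are all assumptions made up for this example.

```python
class ActivationBackoff:
    """Per-cluster count of unactivated instances and the resulting wait time.

    All names and defaults here are hypothetical, chosen for illustration.
    """

    def __init__(self, base_timeout_s=300.0, failure_limit=3,
                 factor=2.0, max_timeout_s=3600.0):
        self.base_timeout_s = base_timeout_s  # normal activation timeout
        self.failure_limit = failure_limit    # configurable limit ("say 3")
        self.factor = factor                  # assumed growth per extra failure
        self.max_timeout_s = max_timeout_s    # assumed cap so the wait stays bounded
        self._failures = {}                   # cluster id -> unactivated count

    def record_timeout(self, cluster_id):
        """Call when an instance is terminated because it never activated."""
        self._failures[cluster_id] = self._failures.get(cluster_id, 0) + 1

    def record_activation(self, cluster_id):
        """Call when a member activation is received; resets the count."""
        self._failures.pop(cluster_id, None)

    def next_timeout_s(self, cluster_id):
        """Waiting time to apply to the next instance of this cluster."""
        excess = self._failures.get(cluster_id, 0) - self.failure_limit
        if excess <= 0:
            # Count has not exceeded the limit: keep the normal timeout.
            return self.base_timeout_s
        # Count exceeded the limit: grow the wait, but never beyond the cap.
        return min(self.base_timeout_s * self.factor ** excess,
                   self.max_timeout_s)
```

With the defaults above, the first three timeouts in a cluster leave the wait at 300 seconds, the fourth raises it to 600, the fifth to 1200, and so on up to the cap. A single successful activation brings it back to 300. Whether the wait should grow exponentially, linearly, or be capped at all is open for discussion.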
