Hi all,

Currently, if an instance has not joined after a timeout, we terminate
the instance and remove it from the pending state.
The Autoscaler then decides to spawn more instances according to the
rules, to cover the terminated instances.
If there is an error that blocks sending the member activate event (in the
cartridge, the network or anywhere else), the system will keep terminating
and starting instances continuously, which is an utter waste of resources.

So I suggest the following approach:

We keep a count of unactivated instances per cluster. If this count exceeds
a limit (say 3, which should be configurable), we increase the waiting time
for the next instance activation. Maybe we keep increasing it for as long as
the failures continue.
We can reset the count whenever a member activation is received.
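
To make the idea concrete, here is a rough sketch in Python. It is only an
illustration, not a patch against our code: the class name, the method names,
the default values, the doubling factor and the upper cap on the timeout are
all assumptions of mine and are open for discussion.

```python
from collections import defaultdict


class ActivationBackoff:
    """Hypothetical per-cluster tracker for instances that never activated."""

    def __init__(self, base_timeout_s=600.0, failure_limit=3,
                 factor=2.0, max_timeout_s=3600.0):
        # All defaults are placeholders; the limit should be configurable.
        self.base_timeout_s = base_timeout_s
        self.failure_limit = failure_limit
        self.factor = factor                # assumed growth rate per extra failure
        self.max_timeout_s = max_timeout_s  # assumed cap so the wait stays bounded
        self._unactivated = defaultdict(int)

    def record_timeout(self, cluster_id):
        """Call when a pending instance is terminated without activating."""
        self._unactivated[cluster_id] += 1

    def record_activation(self, cluster_id):
        """Call when a member activation is received; resets the count."""
        self._unactivated.pop(cluster_id, None)

    def timeout_for(self, cluster_id):
        """Waiting time to apply to the next instance of this cluster."""
        excess = self._unactivated.get(cluster_id, 0) - self.failure_limit
        if excess <= 0:
            # Count has not exceeded the limit: use the normal timeout.
            return self.base_timeout_s
        # Count exceeded the limit: grow the wait, but never beyond the cap.
        return min(self.base_timeout_s * self.factor ** excess,
                   self.max_timeout_s)
```

With these placeholder values, the first three failed instances in a cluster
wait the normal timeout, the fourth waits twice as long, the fifth four times
as long, and so on up to the cap. One successful activation brings the
cluster back to the normal timeout.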

Wdyt?

Thanks.

