andrijapanicsb opened a new issue #3895: Algorithm not set in agent.properties, 
when adding a new KVM hosts
URL: https://github.com/apache/cloudstack/issues/3895
 
 
   CloudStack agent, software load balancing logic
   
   ### how it works / supposed to work
   With the global settings as follows (the IP values are examples, showing two 
IP addresses for two mgmt hosts):
   - host=10.2.2.144,10.2.2.118
   - indirect.agent.lb.algorithm=roundrobin (or static)
   - indirect.agent.lb.check.interval = xxx (any nonzero value)
   
   ... the behaviour we achieve is that the list of mgmt servers is randomised 
when using "roundrobin" (or always kept in the same order as it appears in the 
"host" value, if we set "static") before being sent to all connected agents. 
So some agents will have "x.x.x.144,x.x.x.118@roundrobin" while others will 
have "x.x.x.118,x.x.x.144@roundrobin" in their agent.properties file.
   
   The presence of "@roundrobin" or "@static" in agent.properties is needed 
for the background task to run. The task reasons as follows: this is "@static" 
or "@roundrobin", i.e. NOT "@shuffle", so read the value of 
indirect.agent.lb.check.interval and then check every 
indirect.agent.lb.check.interval seconds whether the preferred host (which died 
previously) is up again; if it is, fail back (reconnect) to the first host from 
the list of hosts in agent.properties.
   If the algorithm is "shuffle", or if it is missing from agent.properties, 
the task assumes "shuffle" and internally sets 
indirect.agent.lb.check.interval=0, so it will NEVER try to reconnect to the 
first host from the list of hosts in agent.properties. If the first mshost 
dies, the agent will connect to the next one, but when the first host is back 
again, the agent will NOT reconnect to it.
   
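   The decision described above can be sketched in a few lines of Python. This 
is only an illustration of the logic as stated in this issue, not the agent's 
actual code: the function names (parse_host_value, effective_check_interval, 
fails_back) are hypothetical, and the interval of 60 seconds is an example value.

```python
from typing import List, Optional, Tuple


def parse_host_value(value: str) -> Tuple[List[str], Optional[str]]:
    """Split 'h1,h2@algorithm' into (hosts, algorithm).

    The algorithm is None when the '@algorithm' suffix is absent.
    """
    hosts_part, sep, algorithm = value.strip().partition("@")
    hosts = [h.strip() for h in hosts_part.split(",") if h.strip()]
    algorithm = algorithm.strip().lower()
    return hosts, (algorithm if sep and algorithm else None)


def effective_check_interval(algorithm: Optional[str], configured: int) -> int:
    """Return the interval the fail-back task would effectively use."""
    # A missing algorithm is treated as "shuffle", which disables the check.
    if algorithm is None or algorithm == "shuffle":
        return 0
    return configured


def fails_back(host_value: str, configured_interval: int) -> bool:
    """True if an agent with this 'host' value would try to fail back."""
    _, algorithm = parse_host_value(host_value)
    return effective_check_interval(algorithm, configured_interval) > 0


# Existing agent: suffix present, fail-back check is active.
print(fails_back("10.2.2.144,10.2.2.118@roundrobin", 60))
# Freshly added agent: suffix missing, fail-back check is disabled.
print(fails_back("10.2.2.144,10.2.2.118", 60))
```

   The two calls at the end correspond to an agent that received the 
"@roundrobin" suffix and one that did not; under the rules above, only the 
first one would ever try to reconnect to its preferred host.
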
   ### BUG
   When adding a new KVM host while the global settings are as described 
above, the new host will only have the list of hosts, withOUT the algorithm, in 
its agent.properties. This means that if the first host is down, the agent will 
reconnect (fail over) to the second host from the list, but when the first 
server is back online, it will NOT reconnect (fail back) to the first mshost, 
even though the global setting (indirect.agent.lb.check.interval = xxx) says it 
should do so.
   
   Not a critical bug, but it creates inconsistency in agent "fail-back" 
behavior between the existing and freshly added KVM hosts.
   
   Workaround: change the "host" setting once, then change it back to what it 
was before; everything will then be properly propagated to all connected 
agents (including the freshly added KVM hosts).
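
   To find out which KVM hosts are affected (before or after applying the 
workaround), one could check whether the "host" line in agent.properties 
carries an "@algorithm" suffix. The sketch below is a hypothetical helper, not 
part of CloudStack; the path /etc/cloudstack/agent/agent.properties is an 
assumption about where the file usually lives, and the parsing only handles 
simple "key=value" lines.

```python
from typing import Optional


def host_value(properties_text: str) -> Optional[str]:
    """Return the value of the 'host' key, or None if it is not set.

    If the key appears more than once, the last occurrence wins.
    """
    found = None
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "host":
            found = value.strip()
    return found


def missing_algorithm(properties_text: str) -> bool:
    """True if 'host' is set but has no '@algorithm' suffix."""
    value = host_value(properties_text)
    return value is not None and "@" not in value


def check_file(path: str = "/etc/cloudstack/agent/agent.properties") -> bool:
    """Check a file on disk; the default path is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return missing_algorithm(handle.read())
```

   Calling check_file() on a freshly added host would be expected to return 
True while the bug is present, and False once the host list has been 
propagated again with the algorithm appended.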
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
