GabrielBrascher commented on pull request #4978:
URL: https://github.com/apache/cloudstack/pull/4978#issuecomment-933484526


   @rhtyd thanks for the review. I hope that I can address all your comments 
here:
   
   > we should explore options to implement this without introducing a new 
service (my main concern is from security and upgrade point of view, a lot of 
people don't like non-essential services running on hypervisor)
   
   I understand that we should avoid adding new services, but I see HA as an 
essential part, and decoupling it from the CloudStack agent helps avoid 
problems specific to the Java process.
   
   Additionally, this PR adds a global setting (at cluster scope), 
`kvm.ha.webservice.enabled`. It is set to false by default and can easily be 
enabled or disabled, which determines whether the CloudStack HA workflow runs 
or skips the checks against the KVM HA Helper.
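
   As a rough illustration of that gating (not the code in this PR), the 
sketch below assumes the cluster-scoped settings are available as a plain 
dictionary. Apart from the setting key `kvm.ha.webservice.enabled`, every name 
here is hypothetical.

```python
# Illustrative sketch only; function and parameter names are hypothetical.
HA_WEBSERVICE_SETTING = "kvm.ha.webservice.enabled"


def is_ha_helper_enabled(cluster_settings: dict) -> bool:
    """Return True only if the cluster explicitly enables the KVM HA Helper."""
    # The setting defaults to false, so a missing key means "disabled".
    raw = cluster_settings.get(HA_WEBSERVICE_SETTING, "false")
    return str(raw).strip().lower() == "true"


def run_ha_checks(cluster_settings: dict, base_check, helper_check) -> list:
    """Run the base check, plus the helper check when the cluster enables it."""
    results = [base_check()]
    if is_ha_helper_enabled(cluster_settings):
        # Only consult the KVM HA Helper when the kill-switch is on.
        results.append(helper_check())
    return results
```

   With an empty settings dictionary only `base_check` is called, which 
mirrors the "disabled by default" behaviour described above.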
   
   > for example, (1) what if the admin wants to do some maintenance etc 
which requires stopping of the agent - in that case could your changes cause 
any side-effect, (2) systemd can be configured (probably already is?) to have 
this service always start on boot and on-crash/on-error
   
   You are right, this is something to be careful about.
   We've configured the service so that it always starts on boot and is 
restarted if the process/job is killed for any reason. The only way of 
stopping it is via systemd (e.g. `systemctl stop cloudstack-hahelper.service`).
   
   > agent has a stop command answer it can tell mgmt server why it is stopping 
- that can be used intelligently to not cause HA led migrations (I haven't 
checked, probably already-is?)
   
   We did not implement a way of signaling that the agent has been 
"intentionally stopped". For now, this relies on admins disabling the feature 
on the CloudStack side.
   I will need to add some information to the documentation about how to handle 
the cluster with this agent.
   
   > if this new service is essential, can it be secured using CA-framework 
generated certificates so at least the communication is validated (the simplest 
being server certificate was signed/created against the root CA cert)
   
   I can look into a way of adding CA certificates and validating the 
communication. For now, there is no such validation; however, the service 
binds only to the node IP in the management network (which in theory is an 
isolated/secure network).
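
   To make the binding behaviour concrete, here is a minimal sketch of an HTTP 
listener restricted to a single address. It is not the helper's actual 
implementation; the handler, the function name, and the response body are 
assumptions made purely for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short plain-text body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(bind_ip: str, port: int) -> HTTPServer:
    """Create a server that listens on bind_ip only, not on all interfaces."""
    # Passing a concrete address (rather than "" or "0.0.0.0") limits the
    # listener to the interface that owns that address.
    return HTTPServer((bind_ip, port), StatusHandler)
```

   Calling `make_server(mgmt_ip, port).serve_forever()`, where `mgmt_ip` and 
`port` are placeholders for the node's management-network address and the 
chosen port, would leave the service unreachable through the host's other 
interfaces.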
   
   > and a global setting/kill-switch for users who don't want/need this 
additional feature/service (for ex. NFS users?) and have it disabled by default
   
   Perfect, this is important indeed. We've added it via 
`kvm.ha.webservice.enabled`. It can be set per cluster, so admins control 
exactly which clusters have it enabled or disabled.



