I'm going to start working on JIRA issues VCL-6 and VCL-7, which cover updating healthcheck.pm.

healthcheck is a script that is run via cron. Its main goal is to check the status of the nodes (VMs, standalone lab machines, blades) and then perform an action, such as resetting the state of the machine in the database.

The goals of this update:
- move all database queries into utils.pm.
- remove the checks for valid hosts. Some nodes may not be registered, so this check doesn't make sense.
- add checks for VMs and each VM's host server. This would include any disk size issues, the state of the VM, etc.
- add disk usage checks for the management node.
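
As a rough illustration of the management node disk usage check, here is a minimal sketch that parses `df -P` output and reports file systems at or above a threshold. The subroutine name over_threshold and the 90% threshold are assumptions for illustration only; nothing like this exists in healthcheck.pm or utils.pm yet. It also assumes mount points contain no spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: parse `df -P` text and return the file systems
# whose usage is at or above $threshold (a percentage).
sub over_threshold {
    my ($df_output, $threshold) = @_;
    my @full;
    for my $line (split /\n/, $df_output) {
        next if $line =~ /^Filesystem/;      # skip the header line
        my @fields = split ' ', $line;
        next unless @fields >= 6;            # expect the 6 POSIX columns
        (my $pct = $fields[4]) =~ s/%$//;    # "95%" becomes "95"
        next unless $pct =~ /^\d+$/;
        push @full, { mount => $fields[5], used_pct => $pct }
            if $pct >= $threshold;
    }
    return @full;
}

# Run df on the management node; treat a failed command as empty output.
my $df_output = `df -P 2>/dev/null`;
$df_output = '' unless defined $df_output;

for my $fs (over_threshold($df_output, 90)) {    # 90 is an assumed threshold
    print "WARNING: $fs->{mount} is $fs->{used_pct}% full\n";
}
```

Keeping the parsing separate from the call to df means the same routine could be fed output collected from a VM host server over ssh, if we decide to go that way.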

Dependencies -
healthcheck.pm will need to use the provisioning modules' "node_status" routine. The node_status routine will need to accept variables instead of pulling from the data_hash. Since the healthcheck module checks all the nodes under the control of the management node running the healthcheck cron job, it can't use the data_hash, because the data_hash is constructed on a per-reservation basis.

So in summary, I'll need to add logic to the provisioning modules' node_status routines so that they accept input variables.
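
To make the idea concrete, here is a minimal sketch of a node_status that takes explicit arguments and only falls back to reservation data when an argument is missing. This is not the current code in any provisioning module; the helper resolve_node_info, the hash keys, and the returned status strings are all assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: prefer values passed in by the caller and fall
# back to the per-reservation data when an argument is not supplied.
sub resolve_node_info {
    my ($args, $data) = @_;
    $args ||= {};
    $data ||= {};
    my %node;
    for my $key (qw(computer_short_name computer_ip_address)) {
        $node{$key} = defined $args->{$key} ? $args->{$key} : $data->{$key};
    }
    return \%node;
}

# Sketch of node_status: $args comes from healthcheck, $data stands in
# for the reservation data_hash. Either one may be omitted.
sub node_status {
    my ($args, $data) = @_;
    my $node = resolve_node_info($args, $data);

    unless (defined $node->{computer_short_name}) {
        return { status => 'UNKNOWN', reason => 'no node name supplied' };
    }

    # The real checks (ping, ssh, current image) would go here. This
    # sketch only reports which node would be checked.
    return {
        status => 'UNCHECKED',
        node   => $node->{computer_short_name},
        ip     => $node->{computer_ip_address},
    };
}

# healthcheck-style call: no reservation data, only explicit values.
my $result = node_status({
    computer_short_name => 'vm7',          # made-up node name
    computer_ip_address => '10.0.0.17',    # made-up address
});
print "$result->{node} ($result->{ip}): $result->{status}\n";
```

With this shape, the reservation code path keeps working unchanged, and healthcheck can loop over every node assigned to the management node and pass each one in directly.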

This is a first step in updating the healthcheck process. Once this is complete we can explore additional steps, such as attempting a reload with a known working image, loading a stateless install in memory and running some kind of diagnostics tool, or even doing some power management (if enabled) to shut down machines during low usage or during a data center event that would create heat issues.

If there are no objections or other suggestions for this update, I'll proceed.

Aaron
