I'm going to start working on JIRA issues VCL-6 and VCL-7, which update
healthcheck.pm.
healthcheck is a script that is run via cron. Its main goal is to check
the status of the nodes (VMs, standalone lab machines, blades) and, where
needed, perform an action such as resetting the state of the machine in
the database.
The goals of this update:
- move all db queries to utils.pm.
- remove the checks for valid hosts; since some nodes may not be
registered, this check doesn't make sense.
- add checks for VMs and the VM's host server, including disk size
issues, the state of the VM, etc.
- add disk usage checks for the management node.
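As a rough illustration only (not existing VCL code), the management node
disk usage check could look something like the sketch below. The function
name over_threshold, the 90% threshold, and the use of "df -P" are all
assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of VCL: given a percentage threshold and the
# lines of "df -P" output, return the mount points at or above the threshold.
sub over_threshold {
    my ($threshold, @df_lines) = @_;
    my @full;
    for my $line (@df_lines) {
        # POSIX "df -P" data rows look like:
        #   filesystem  blocks  used  available  capacity%  mountpoint
        # The header row does not match and is skipped.
        next unless $line =~ /^\S+\s+\d+\s+\d+\s+\d+\s+(\d+)%\s+(\S.*)$/;
        push @full, $2 if $1 >= $threshold;
    }
    return @full;
}

# Assumes a POSIX-style df is available on the management node.
my @df_output = `df -P`;
chomp @df_output;
print "WARNING: $_ is at or above 90% full\n"
    for over_threshold(90, @df_output);
```

Keeping the parsing separate from the df call means the same routine could
be fed output collected from a VM host over ssh.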
Dependencies -
healthcheck.pm will need to use the provisioning modules' "node_status"
routine. The node_status routine will need to accept variables instead of
pulling them from the data_hash. Because the healthcheck module checks all
the nodes under the control of the management node running the healthcheck
cron job, it can't use the data_hash, which is constructed on a
per-reservation basis.
So in summary, I'll need to add logic to the provisioning modules'
node_status routines to accept input variables.
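To make the idea concrete, here is a minimal sketch of a node_status
routine that prefers explicitly passed values and only falls back to
per-reservation data when they are missing. It is not the actual VCL
routine: the argument names (node_name, node_type), the return structure,
and the plain hash standing in for the data_hash are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the real VCL node_status.
# $args:             values passed in directly (e.g. by healthcheck)
# $reservation_data: stand-in for the per-reservation data_hash
sub node_status {
    my ($args, $reservation_data) = @_;
    $args             ||= {};
    $reservation_data ||= {};

    # Prefer explicitly passed values; fall back to reservation data.
    my $node_name = $args->{node_name} // $reservation_data->{node_name};
    my $node_type = $args->{node_type} // $reservation_data->{node_type};

    # Without a node name there is nothing to check.
    return { status => 'UNKNOWN', reason => 'no node name supplied' }
        unless defined $node_name;

    # Placeholder: the real checks (ping, ssh, image name, ...) go here.
    return {
        node_name => $node_name,
        node_type => $node_type // 'unknown',
        status    => 'READY',
    };
}

# healthcheck-style call: no reservation exists, so pass the values in.
my $result = node_status({ node_name => 'blade1', node_type => 'blade' });
print "$result->{node_name}: $result->{status}\n";
```

With this shape, existing reservation-driven callers could keep passing
their data as the second argument, while healthcheck supplies only the
first.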
This is a first step in updating the healthcheck process. Once this is
complete we can explore additional steps, such as attempting a reload with
a known working image, loading a stateless install in memory and running
some kind of diagnostics tool, or even doing some power management (if
enabled) to shut down machines during low usage or during a data center
event that would create heat issues.
If there are no objections or other suggestions for this update, I'll
proceed.
Aaron