anuragaw opened a new pull request #3575: [WIP DO NOT MERGE] Health check feature for virtual router
URL: https://github.com/apache/cloudstack/pull/3575

We want to support more exhaustive health checks for VRs. This feature helps admins configure health checks and also expands their scope.

Adds new global settings:
* "router.health.checks.enabled" - If true, router health checks are performed periodically, as governed by the other settings.
* "router.health.checks.interval" - Interval (in minutes) at which router health checks are performed.
* "router.health.checks.data.refresh.interval" - Interval (in minutes) at which router health check data, such as the scheduling interval and excluded checks, is updated (mostly for advanced checks).
* "router.health.checks.results.fetch.interval" - Interval (in minutes) at which router health check results are fetched. On each check, the management server evaluates whether a restart is needed, as configured by "router.health.checks.failures.to.restart.vr".
* "router.health.checks.type" - Router health check type: basic or advanced (a superset of the basic checks, but heavier). Defaults to basic; a mismatched value also falls back to basic.
* "router.health.checks.failures.to.restart.vr" - Health check failures that should cause the router to restart. If empty, a restart never happens. Set to 'any' to restart on any failure.
* "router.health.checks.to.exclude" - Health checks that should be excluded when executing scheduled checks.
* "router.health.checks.free.disk.space.threshold" - Free disk space threshold (in MB) on the VR. Free space below this value is considered a failure, and the VR needs to be restarted.

Additionally, the feature looks for any executable script in the /root/health_scripts/ directory and adds its result to the JSON output of the overall health checks. This allows custom checks to run as part of the health check cron job.

Health checks can also be triggered manually using a new API added by this feature (both the CLI and the UI support this).
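To illustrate the restart rule described for "router.health.checks.failures.to.restart.vr", here is a minimal Python sketch of how that decision might be evaluated. It is not the code in this PR: the function name `should_restart_router`, the comma-separated format of the setting, and the check names used in the usage lines are all assumptions made for illustration.

```python
def should_restart_router(failed_checks, failures_to_restart_vr):
    """Return True if the failed checks warrant a VR restart.

    failed_checks: names of health checks that failed on the router
        (the names are hypothetical; the PR text does not list them).
    failures_to_restart_vr: raw value of the global setting
        "router.health.checks.failures.to.restart.vr".
    """
    setting = (failures_to_restart_vr or "").strip()

    # An empty setting means a restart never happens.
    if not setting:
        return False

    # Nothing failed, so there is nothing to react to.
    if not failed_checks:
        return False

    # 'any' means restart on any failure.
    if setting.lower() == "any":
        return True

    # Assumption: otherwise the setting is a comma-separated list of check names.
    restart_on = {name.strip() for name in setting.split(",") if name.strip()}
    return any(check in restart_on for check in failed_checks)


if __name__ == "__main__":
    # Hypothetical check names, for illustration only.
    print(should_restart_router(["dns"], ""))
    print(should_restart_router(["dns"], "any"))
    print(should_restart_router(["dns"], "disk_space, dns"))
```

Treating an empty value as "never restart" and 'any' as a wildcard follows the setting's description; how the real feature parses a list of specific checks may differ.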
## Description

## Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)

## Screenshots (if appropriate):

## How Has This Been Tested?
Integration tests, manually, CMK, UI
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
