On Wednesday, 22 April 2009 23:41:42, Mark Hamzy wrote:
> Hello,
>
> I am working on a feature to add system health metrics to HA. With this
> information, HA could failover nodes away from hardware that might have
> problems.
>
> The following is a short description of what we want this new feature to
> do.
>
> Feature Name: Health monitoring support
> Purpose: Allow pacemaker to schedule resources in a way that's sensitive
> to a variety of server-related health metrics
>
> Description:
> Add support in pacemaker for a class of attributes which would be specially
> treated. Under this proposal, all attributes defined for a node whose name
> matches the regular expression /^#health-.*$/ would be automatically added
> into the score for each resource being considered for scheduling on that
> node.
>
> The purpose of this is to allow multiple independent health monitors to
> each set their own health status and have that taken into account when
> scheduling resources. For example, IBM might define one called
> #health-ibmserver. Someone using smarttools (disk health monitors) might
> define one called #health-smarttools. Someone else using IPMI might define
> one called #health-ipmi. This means that this feature is not specific to
> any vendor, and various health monitor providers can develop health metrics
> for their hardware and not have to coordinate with each other in their
> development process.
>
> Typical usage of these variables is expected to be something like this:
>
> Health   Attribute-value   Meaning
> green    1000              server is happy, capable of running any resource
> yellow   0                 server is marginal - it is desirable to
>                            schedule resources somewhere else if you can
> red      -INFINITY         server is unreliable (but still up) and should not
>                            be used
>
> Note that the value given for green is likely to be configuration-specific,
> and should be configurable by the various health monitoring tools as they
> get developed.
>
> Special Note:
> IBM is already in the process of developing such a health monitoring tool
> for IBM X (intel-class) servers.
>
> So, what do you all think of this proposed functionality? Does it sound
> reasonable? Comments are appreciated.
>
> Mark
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
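
To make the proposed rule concrete, here is a minimal sketch of how the health attributes could be folded into a node's score. It is only an illustration of the proposal as quoted, not pacemaker code: the function names (parse_score, add_scores, health_score) are made up for this example, and the INFINITY value of 1000000 and the "-INFINITY always wins" rule are assumptions about how scores would be combined.

```python
import re

# Assumption for this sketch: scores saturate at +/-1000000 ("INFINITY").
INFINITY = 1_000_000

# The attribute-name pattern given in the proposal.
HEALTH_ATTR = re.compile(r"^#health-.*$")


def parse_score(value):
    """Turn an attribute value such as '1000' or '-INFINITY' into an int."""
    text = str(value).strip().upper()
    if text in ("INFINITY", "+INFINITY"):
        return INFINITY
    if text == "-INFINITY":
        return -INFINITY
    return int(text)


def add_scores(a, b):
    """Add two scores; -INFINITY dominates, then +INFINITY (assumed rule)."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def health_score(node_attrs):
    """Sum every '#health-*' attribute of a node, ignoring all others."""
    total = 0
    for name, value in node_attrs.items():
        if HEALTH_ATTR.match(name):
            total = add_scores(total, parse_score(value))
    return total
```

With this, a node reporting green from one monitor and yellow from another would contribute 1000 to each resource's score on that node, while a single red from any monitor would pull the node down to -INFINITY regardless of what the other monitors say.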

Hi,

this would be a standardized version of what is already possible with
attrd_updater. I am interested in such an addition and I think it could
be a valuable contribution to the project.

Two points should be discussed: in which section of the CIB this
information goes, and how the cluster retrieves it.

In any case, the score assigned to each health state should be
configurable. There should also be a clean, documented API for writing
your own health-... agents that monitor the system itself.

Greetings,

--
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75
mail: [email protected]
web: www.multinet.de
Registered office: 85630 Grasbrunn
Register court: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens
---
PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42
