Thank you for sharing; I missed that session. Somewhat related to the health checks: https://review.openstack.org/#/c/97748/
This is a spec/feature for oslo that I'm working on, to provide feedback to the process manager that runs the daemons (init.d, pacemaker, systemd, pacemaker+systemd, upstart).

The idea is that the daemons themselves could report their inner status, with a status code + status message, to allow, for example, degraded operation.

Feedback on the spec/comments is appreciated.

Best regards,
Miguel Ángel
ajo @ freenode.net

On Thursday, 13 November 2014 at 12:59, Angus Salkeld wrote:
> On Tue, Nov 11, 2014 at 12:13 PM, Angus Salkeld <asalk...@mirantis.com> wrote:
> > Hi all
> >
> > The HA session was really well attended and I'd like to give some feedback
> > from the session.
> >
> > Firstly there is some really good content here:
> > https://etherpad.openstack.org/p/kilo-crossproject-ha-integration
> >
> > 1. We SHOULD provide better health checks for OCF resources
> > (http://linux-ha.org/wiki/OCF_Resource_Agents).
> > These should be fast and reliable. We should probably bike shed on some
> > convention like "<project>-manage healthcheck"
> > and then roll this out for each project.
> >
> > 2. We should really move
> > https://github.com/madkiss/openstack-resource-agents to stackforge or
> > openstack if the author is agreeable to it (it's referred to in our
> > official docs).
>
> I have chatted to the author of this repo and he is happy for it to live
> under stackforge or openstack. Alternatively, each OCF resource could go
> into its respective project.
> Does anyone have any particular preference? I suspect stackforge will be the
> path of least resistance.
>
> -Angus
>
> > 3. All services SHOULD support Active/Active configurations
> > (better scaling and it's always tested)
> >
> > 4. We should be testing HA (there are a number of ideas on the etherpad
> > about this)
> >
> > 5. Many services do not recover in the case of failure mid-task
> > This seems like a big problem to me (some leave the DB in a mess).
> > Someone linked to an interesting article
> > (crash-only-software: http://lwn.net/Articles/191059/)
> > that suggests that if we do this
> > correctly we should not need the concept of clean shutdown.
> >
> > (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L459-L471)
> > I'd be interested in how people think this needs to be approached
> > (just raise bugs for each?).
> >
> > Regards
> > Angus
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
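
To make the "status code + status message" idea a bit more concrete, here is a minimal sketch in Python of how a daemon might aggregate its internal checks into one report. Everything in it is an assumption for illustration only: the `Status` codes, the `Health` tuple, `aggregate()` and the two example checks are hypothetical names, not taken from the spec under review or from any existing oslo API.

```python
import enum
from typing import Callable, Iterable, NamedTuple


class Status(enum.IntEnum):
    # Hypothetical codes; the real spec may define different values.
    OK = 0
    DEGRADED = 1
    FAILED = 2


class Health(NamedTuple):
    status: Status
    message: str


def aggregate(checks: Iterable[Callable[[], Health]]) -> Health:
    """Run every check and report the worst status with its messages."""
    results = []
    for check in checks:
        try:
            results.append(check())
        except Exception as exc:
            # A check that crashes is treated as a failed check.
            results.append(Health(Status.FAILED, f"{check.__name__}: {exc}"))
    if not results:
        return Health(Status.OK, "no checks registered")
    worst = max(r.status for r in results)
    messages = [r.message for r in results if r.status == worst]
    return Health(worst, "; ".join(messages))


# Hypothetical example checks a daemon might register.
def check_db() -> Health:
    return Health(Status.OK, "database reachable")


def check_queue() -> Health:
    return Health(Status.DEGRADED, "1 of 3 message brokers unreachable")


def main() -> int:
    """Print '<code> <message>' and return the code as an exit status."""
    health = aggregate([check_db, check_queue])
    print(f"{int(health.status)} {health.message}")
    return int(health.status)
```

Something like a "<project>-manage healthcheck" command could call `main()` and exit with its return value, so that the process manager sees both a machine-readable code and a human-readable reason. Reporting the worst status rather than the first failure is a design choice of this sketch; it lets a degraded daemon keep running while still surfacing why it is degraded.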