Hi all

The HA session was really well attended and I'd like to give some feedback
from the session.

Firstly there is some really good content here:
https://etherpad.openstack.org/p/kilo-crossproject-ha-integration

1. We SHOULD provide better health checks for OCF resources
(http://linux-ha.org/wiki/OCF_Resource_Agents).
These should be fast and reliable. We should probably bikeshed on some
convention like "<project>-manage healthcheck"
and then roll this out for each project.

2. We should really move
https://github.com/madkiss/openstack-resource-agents to stackforge or
openstack if the author is agreeable to it (it's referred to in our
official docs).

3. All services SHOULD support Active/Active configurations
    (better scaling and it's always tested)

4. We should be testing HA (there are a number of ideas on the etherpad
about this)

5. Many services do not recover if they fail mid-task.
    This seems like a big problem to me (some leave the DB in a mess).
Someone linked to an interesting article on crash-only software
(http://lwn.net/Articles/191059/) which suggests that if we do this
correctly we should not need the concept of a clean shutdown
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/service.py#L459-L471).
    I'd be interested in how people think this needs to be approached
(just raise bugs for each?).
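
To make point 5 a bit more concrete, here is a rough sketch of the
crash-only idea: progress is written atomically after every step, so a
restart after a crash simply resumes from the last recorded step and no
clean shutdown path is needed. This is only an illustration, not how any
OpenStack service does it today. The names save_state, load_state and
run_task and the JSON state file are all made up for the example, and it
assumes each step is idempotent (a crash between running a step and
recording it means that step runs again).

```python
import json
import os
import tempfile


def save_state(path, state):
    """Atomically replace the state file so a crash never leaves it half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the new file in as a single operation.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_state(path):
    """Return the recorded progress, or an empty dict on first start."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def run_task(path, task_id, steps):
    """Run steps in order, recording progress after each one.

    Calling this again after a crash skips the steps already recorded
    as complete. Steps are assumed to be idempotent (hypothetical API).
    """
    state = load_state(path)
    done = state.get(task_id, 0)
    for index in range(done, len(steps)):
        steps[index]()
        state[task_id] = index + 1
        save_state(path, state)
```

Startup and recovery are then the same code path: the service calls
run_task() for every known task when it starts, whether the previous
exit was clean or not.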
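
Going back to point 1, this is roughly what I imagine sitting behind a
"<project>-manage healthcheck" command. Again it is only a sketch to
bikeshed over: check_tcp and healthcheck are hypothetical names, and the
list of checks would be up to each project. The exit codes 0 and 1 are
chosen to line up with OCF_SUCCESS and OCF_ERR_GENERIC, so an OCF
resource agent's monitor action could pass the result straight through.

```python
import socket

EXIT_HEALTHY = 0    # same value as OCF_SUCCESS
EXIT_UNHEALTHY = 1  # same value as OCF_ERR_GENERIC


def check_tcp(host, port, timeout=2.0):
    """Return True if a TCP connection succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthcheck(checks):
    """Run every (name, callable) check and return (exit_code, failed_names).

    Each callable returns True when healthy. Checks should carry their
    own short timeouts so the whole command stays fast.
    """
    failed = [name for name, check in checks if not check()]
    code = EXIT_HEALTHY if not failed else EXIT_UNHEALTHY
    return code, failed
```

A project would then build its own list, for example
[("api", lambda: check_tcp("127.0.0.1", 8774))] where the host and port
are placeholders, and exit with the returned code.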

Regards
Angus