We introduced support for deploying Ceph in the Liberty release so that it could optionally be used as the backend for one or more of Cinder, Glance, Nova and, more recently, Gnocchi.

We used to deploy Ceph MONs on the controller nodes and Ceph OSDs on dedicated ceph-storage nodes, so a deployment of OpenStack with Ceph needed at least one additional node to host a Ceph OSD.

In our HA scenario the storage backends are configured as follows:

Glance -> Swift
Nova (ephemeral) -> Local
Cinder (persistent) -> LVM (on controllers)
Gnocchi -> Swift

The downside of the above configuration is that Cinder volumes cannot be replicated across the controller nodes and become unavailable if a controller fails, whereas production environments generally expect persistent storage to be highly available. Worse, Cinder volumes could be lost completely if a controller fails permanently.

With the Newton release and composable roles, we can now deploy Ceph OSDs on the compute nodes, removing the requirement for an additional node to host a Ceph OSD.

I would like to ask for feedback on the possibility of deploying Ceph by default in the HA scenario and using it as the backend for Cinder.
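
To make the proposal concrete, here is a rough Python sketch of the current HA defaults next to the proposed ones. It is purely illustrative and not TripleO code: the dictionary names, the helper functions and the SINGLE_NODE_BACKENDS set are made up for this mail, and treating LVM and Local as "data lives on a single node" is a simplifying assumption.

```python
# Illustrative model only; none of these names are TripleO parameters.

# Current HA scenario defaults, as listed above.
CURRENT_HA_DEFAULTS = {
    "Glance": "Swift",
    "Nova (ephemeral)": "Local",
    "Cinder (persistent)": "LVM",
    "Gnocchi": "Swift",
}

# Proposal under discussion: only the Cinder backend changes.
PROPOSED_HA_DEFAULTS = {**CURRENT_HA_DEFAULTS, "Cinder (persistent)": "Ceph"}

# Assumption for this sketch: these backends keep data on one node only.
SINGLE_NODE_BACKENDS = {"LVM", "Local"}


def single_node_services(defaults):
    """Return the services whose data would sit on a single node."""
    return sorted(
        service
        for service, backend in defaults.items()
        if backend in SINGLE_NODE_BACKENDS
    )


def changed_services(old, new):
    """Return {service: (old_backend, new_backend)} for every changed default."""
    return {s: (old[s], new[s]) for s in old if old[s] != new[s]}


if __name__ == "__main__":
    print("single-node today:   ", single_node_services(CURRENT_HA_DEFAULTS))
    print("single-node proposed:", single_node_services(PROPOSED_HA_DEFAULTS))
    print("changed defaults:    ", changed_services(CURRENT_HA_DEFAULTS, PROPOSED_HA_DEFAULTS))
```

Under these assumptions, the only service whose default changes is Cinder, and Nova ephemeral storage is the only one left on a single node.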

Also, using Swift as the backend for Glance and Gnocchi is enough to cover the availability issue for the data, but it also means we're storing that data on the controller nodes, which might or might not be wanted. I don't see a strong reason for defaulting them to Ceph, but it might make more sense when Ceph is available; feedback about this would be appreciated as well.

I think it would be important to take into account the recently created guiding principles [0]:

"While the software that OpenStack produces has well defined and documented APIs, the primary output of OpenStack is software, not API definitions. We expect people who say they run “OpenStack” to run the software produced by and in the community, rather than alternative implementations of the API."

In the case of Cinder, I think the situation is a bit muddy: LVM is not OpenStack software, and my limited understanding is that it is used as a reference implementation. In the case of Swift, however, I think RGW would be considered an 'alternative implementation of the API'.


[0] - https://governance.openstack.org/reference/principles.html#openstack-primarily-produces-software

Finally, a shared backend (Ceph) for Nova would allow live migrations but would probably decrease performance for the guests in general, so I'd be against defaulting Nova to Ceph. Feedback?

