Excerpts from Adam Kijak's message of 2016-10-12 12:23:41 +0000:
> > ________________________________________
> > From: Xav Paice <xavpa...@gmail.com>
> > Sent: Monday, October 10, 2016 8:41 PM
> > To: email@example.com
> > Subject: Re: [Openstack-operators] [openstack-operators][ceph][nova] How do
> > you handle Nova on Ceph?
> > On Mon, 2016-10-10 at 13:29 +0000, Adam Kijak wrote:
> > > Hello,
> > >
> > > We use a Ceph cluster for Nova (Glance and Cinder as well), and over
> > > time more and more data is stored there. We can't let the cluster
> > > grow indefinitely because of Ceph's limitations: sooner or later it
> > > has to be closed to new instances, images and volumes. Not to
> > > mention it's a big failure domain.
> > I'm really keen to hear more about those limitations.
> Basically it's all related to the failure domain ("blast radius") and the
> risk that comes with it: a bigger Ceph cluster means more users are
> affected when something goes wrong.
Are these risks well documented? Since Ceph is specifically designed
_not_ to have the kind of large blast radius that one might see with,
say, a centralized SAN, I'm curious to hear what events trigger
failures of that scale.
> Growing the Ceph cluster temporarily slows it down, so many users will be
> affected.
One might say that a Ceph cluster that can't be grown without the users
noticing is an over-subscribed Ceph cluster. My understanding is that
one is always advised to provision a certain amount of cluster capacity
for growing and replicating to replaced drives.
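That headroom advice can be made concrete with some back-of-the-envelope math. The sketch below is illustrative only: the host counts and disk sizes are made-up example values, and the 0.85 threshold mirrors Ceph's default nearfull ratio (`mon_osd_nearfull_ratio`), not anything from this thread.

```python
# Rough capacity-planning sketch for a replicated Ceph cluster.
# Idea: size the cluster so that one whole host can fail and its data
# can re-replicate onto the survivors without crossing nearfull.

def usable_capacity_tb(raw_per_host_tb, num_hosts, replication_size=3,
                       nearfull_ratio=0.85):
    """Usable capacity that survives a single-host failure.

    Reserve one host's worth of raw space for recovery, then stay
    under the nearfull ratio even after recovery finishes.
    """
    raw_after_failure = raw_per_host_tb * (num_hosts - 1)
    safe_raw = raw_after_failure * nearfull_ratio
    return safe_raw / replication_size

# Example: ten hosts with 48 TB of raw disk each, 3x replication.
print(round(usable_capacity_tb(raw_per_host_tb=48, num_hosts=10), 1))
```

The point of the calculation is that a noticeable slice of raw capacity is spoken for before any user data lands, which is exactly the provisioning-for-growth-and-recovery advice above.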
> There are bugs in Ceph which can cause data corruption. It's rare, but when
> it happens
> it can affect many (maybe all) users of the Ceph cluster.
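The blast-radius argument running through this thread can be sketched numerically. The toy below shards tenants across several independent clusters and measures the worst-case fraction hit by a single cluster-wide failure; the tenant count, cluster count, and hash-based mapping are all assumptions for illustration, not a recommendation from the thread.

```python
# Toy illustration of the "blast radius" argument: with tenants
# sharded across independent Ceph clusters, a bug or outage that
# takes out one whole cluster affects only the tenants mapped to it.

from collections import Counter

def shard(tenant_ids, num_clusters):
    """Map each tenant to one of num_clusters clusters by hash."""
    return {t: hash(t) % num_clusters for t in tenant_ids}

tenants = [f"tenant-{i}" for i in range(1000)]
mapping = shard(tenants, 4)
per_cluster = Counter(mapping.values())

# Worst-case fraction of tenants hit by a single-cluster failure:
worst = max(per_cluster.values()) / len(tenants)
print(f"worst-case blast radius: {worst:.0%}")  # roughly 1/4 with 4 clusters
```

With one big cluster the same failure reaches everyone, which is the trade-off against the operational cost of running several smaller clusters.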
OpenStack-operators mailing list