On 17.03.21 at 20:09, Stefan Kooman wrote:
> On 3/17/21 7:51 PM, Martin Verges wrote:
>> I am still not convinced that containerizing everything brings any
>> benefits except the collocation of services.
>>
>> Is there even a benefit?

> Decoupling from underlying host OS. On a test cluster I'm running Ubuntu Focal on the
> host (and a bunch of other stuff, hyperconverged setup) and for testing purposes I needed
> to run Ceph Mimic there. No Mimic (or Nautilus packages for that matter) available on
> Ubuntu Focal. In that case it can be convenient to "just run a container". Sure
> you can build them yourselves. And with the PXE way of deploying you have more
> flexibility, but most setups (I guess) are not like that.
>
> So, for me that was a benefit. I can think of other potential benefits, but I
> don't want to go there, as of yet, as I still need to convince myself containers
> are a proper solution to deploy software as far as Ceph is concerned. But it
> does look promising.

I think the main issue, trying to summarize the views in this thread, is that the
"container path" seems to be becoming the only offering (apart from the manual
installation steps):
ceph-deploy, and now also ceph-ansible, seem to have lost support or to be slowly
losing it.

I already triggered a discussion that took quite similar turns a while ago:
 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/4N5EEQWQ246QROV3RVMBCCBMXWGSVK36/#WMHVK6ZJRUHKEZLUPHMQQKD3AI2PTNZN
and there was similar feedback from many voices, some even holding off on 
upgrading Ceph to newer versions due to this move.

I personally still agree containers have their advantages in many use cases,
but they also have their downsides in other (mostly complementary) setups.[0]
There's no free lunch, but there is often more than one kind of lunch to choose from
according to taste, especially in the modular world of Unix.

My main message is that there seems to be a general need for a way that lets users
integrate Ceph with their potentially intricate, granular setups, in which they
have, and want to keep, full control of all components.
I believe this is something many voices ask for in addition to the "container
offering".

There are of course the manual installation commands, but running these comes
with a big disadvantage: they need to be adapted often between Ceph versions.
After all, the project is active, it gains new features, commands gain new flags,
and things are deprecated or replaced (and this is a good thing!).
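One example that comes to mind (exact flags quoted from memory, so please treat them
as an assumption rather than copy-paste material): preparing an OSD moved from
ceph-disk to ceph-volume around the Mimic/Nautilus transition, so any tool wrapping
the old command simply stopped working:

  # up to roughly Luminous/Mimic
  ceph-disk prepare /dev/sdb
  # Nautilus and later (ceph-disk has been removed)
  ceph-volume lvm create --data /dev/sdb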

That means anything building on top of the manual commands (be it ceph-deploy,
ceph-ansible or anything else) will break early and break often,
unless it receives a lot of energy from its maintainers. Just looking at the commit
histories of these projects reveals how many "brain hours" have been burnt there,
always racing to keep up with new Ceph releases.


However, my understanding of the cephadm "orchestrator layer" is that it is modular,
and essentially encapsulates many cumbersome sequences of manual steps into
"high level" commands.
From this (without having checked the code, so please correct me if I am wrong!) I would
expect it to be "easy" to, for example, have an SSH orchestrator running bare-metal,
i.e. essentially like "ceph-deploy", since the same commands the orchestrator now runs
inside containers could be executed bare-metal after installation of the packages.
Of course, some things, like an "upgrade" of all components in one orchestration step,
will not work this way, but I wonder whether the users asking for "full control"
actually ask for that.
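To make it concrete, these are the kind of "high level" commands I mean (syntax quoted
from memory of the Octopus orchestrator docs, so please double-check before relying on it):

  ceph orch host add node1
  ceph orch apply mon 3
  ceph orch daemon add osd node1:/dev/sdb
  ceph orch upgrade start --ceph-version 15.2.10

Each of these wraps a whole series of manual keyring / bootstrap / systemd steps, and
apart from the upgrade none of them is conceptually tied to containers.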

This "high level language" could then be used to integrate with ceph-ansible or 
other special tools people may use, abstracting away the manual commands,
removing most of the complexity and breakage whenever the manual commands are 
changed, significantly reducing the maintenance burden for these tools and 
other toolings
users may be using. In short, it would define an "API" which can be integrated 
into other tools, following the modular Unix spirit.
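The declarative service specs cephadm already understands point in that direction; a
snippet like the following (a minimal sketch with field names as I recall them from the
Octopus docs, and "mon-spec.yaml" is just a made-up file name) could just as well be
generated by Ansible, Puppet or whatever people already use, and then handed over in one go:

  # mon-spec.yaml
  service_type: mon
  placement:
    count: 3

  ceph orch apply -i mon-spec.yaml

The point being: the "API" would be the spec plus a handful of orchestrator verbs,
not the long tail of per-daemon commands.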

That is what I actually thought the SSH orchestrator (as it was called in the 
Nautilus docs once) was going to be, and then it turned out to be limited to 
the specific container use case
(at least for now).


Of course, this implementation is not going to come "for free"; somebody needs
to write it. I hope my summary inspires someone out there who knows the
existing orchestrator code
and is maybe interested in adding this (if it is of general interest, which I
believe it is). And I hope this won't reduce the thriving user base of Ceph in
the long run, since I am still in love with this project
in general (and the community around it!) and will keep recommending it :-).

Cheers,
        Oliver


[0]
 For example, when you have a full stack automated with Puppet or another
configuration management system, control over OS updates, test machines you
upgrade / reconfigure first before applying changes to production,
 monitoring of all services, etc., then adding containers for the storage services
is just another layer of complexity.
 All you get in such a case is another "thing" you need to update (by design it
will also contain libraries that are not part of Ceph),
 monitor for CVEs, and possibly re-pull and restart at different times than all the
other parts of your infrastructure, and so on.
 Things like RDMA become harder, since drivers and libraries from the "machine
OS" may need to be matched up or mixed in,
 and shared libraries inside the container use extra memory since they are of
different versions than those of the OS, etc.
 Containers are good for isolation, reproducibility and mobility of compute, but you
don't gain that much when you don't "re-orchestrate" often.
 So, as usual, mileage may vary.

