On 6/25/25 09:19, Joachim Kraftmayer wrote:
Hi,
I agree with configuring both interfaces as a bond.
In my experience, running a separate public and cluster network on the
bond has the following advantages:
Isolating public network traffic from cluster network traffic makes it
easier to monitor client traffic and inter-OSD traffic.
If it becomes necessary later, you can also prioritise or limit client
traffic via the separate interface.
It also helps when debugging and analysing issues in the Ceph cluster.
I have a different experience. When one of the nodes has its cluster
network down but a working public network, you get issues that are hard
to troubleshoot, especially as a newcomer to Ceph (this was a test
cluster). You will see slow operations and OSDs on the storage nodes
flagging their peers as down, while the flagged OSD daemon itself
responds to those messages that it is still running, etc. If you look at
ceph -w you will get conflicting information. For an experienced
operator this will not be too hard to troubleshoot, but for a less
experienced one it will be. This was also at a time when OSD "heartbeat"
checks were not yet a thing, and before Alertmanager et al. Ideally you
don't want "gray" failures like this: you want your Ceph node to be
either "UP" or "DOWN", not something in between. A single public
interface will give you that.
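
To make the "UP", "DOWN" or something in between point concrete, here is
a minimal sketch in Python. It is only an illustration, not how Ceph
itself decides anything: NodeLinks and classify are hypothetical names,
and the two booleans stand in for whatever reachability check you would
actually run against each network.

```python
from dataclasses import dataclass


@dataclass
class NodeLinks:
    """Reachability of one node as seen from a peer (hypothetical inputs)."""
    public_ok: bool
    cluster_ok: bool


def classify(links: NodeLinks, separate_cluster_network: bool) -> str:
    """Return "UP", "DOWN" or "GRAY" for a node.

    With a single (public) network there is only one link to lose, so the
    node is either reachable or not. With a separate cluster network the
    two links can disagree, which is the in-between state described above.
    """
    if not separate_cluster_network:
        # Only the public link exists; cluster_ok is ignored.
        return "UP" if links.public_ok else "DOWN"
    if links.public_ok and links.cluster_ok:
        return "UP"
    if not links.public_ok and not links.cluster_ok:
        return "DOWN"
    # Exactly one of the two links is broken.
    return "GRAY"


# Cluster network down, public network still working:
print(classify(NodeLinks(public_ok=True, cluster_ok=False),
               separate_cluster_network=True))
```

With separate_cluster_network=False the "GRAY" branch can never be
reached, which is the whole argument for a single public interface.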
Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io