Hi all,

Thanks for the interesting discussion. It is a bit disappointing to see that even CephFS with multiple MDS servers is not as HA as we would like.

I also read that failover time depends on the number of clients. We will have only three, and they will not do heavy I/O, so that should help a bit.

Is there any difference between an 'uncontrolled' (accidental) reboot of a Ceph server and a controlled reboot, where we (for example) first fail over the MDS in a controlled, gentle way?

MJ

On 26-10-2022 at 14:40, Eugen Block wrote:
Just one comment on the standby-replay setting: it really depends on the use case, and it can make things worse during failover. Just recently we had a customer where disabling standby-replay made failovers even faster and cleaner in a heavily used cluster. With standby-replay they had to manually clean things up in the mounted directory. So I would recommend testing both options.

Quoting William Edwards <[email protected]>:

On 26 Oct 2022 at 10:11, mj <[email protected]> wrote:

Hi!

We have read https://docs.ceph.com/en/latest/man/8/mount.ceph, and would like to see our expectations confirmed (or denied) here. :-)

Suppose we build a three-node cluster, three monitors, three MDSs, etc, in order to export a cephfs to multiple client nodes.

In the fstab of the (RHEL8) clients (web application servers), we will mount the CephFS like this:

ceph1,ceph2,ceph3:/ /mnt/ha-pool/ ceph name=admin,secretfile=/etc/ceph/admin.secret,noatime 0 2

We expect that the RHEL clients will then be able to use (read/write) a shared /mnt/ha-pool directory simultaneously.

Our question: how HA can we expect this setup to be? Looking for some practical experience here.

Specifically: can we reboot any of the three involved Ceph servers without the clients noticing anything? Or will there be certain timeouts involved, during which /mnt/ha-pool/ will appear unresponsive, and only *after* a timeout will the client switch monitor node and /mnt/ha-pool/ respond again?

Monitor failovers don’t cause a noticeable disruption IIRC.

MDS failovers do. The MDS needs to replay. You can minimise the effect with mds_standby_replay.
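
For anyone who wants to try this: below is a minimal sketch of how standby-replay could be switched on or off for one file system. It assumes a recent release, where (as far as I know) the setting is the per-file-system flag allow_standby_replay rather than a daemon option named mds_standby_replay. The file system name "cephfs" and the CEPH_BIN variable are placeholders of mine, not something from this thread; CEPH_BIN only exists so the function can be dry-run with echo.

```shell
# Sketch only: toggle standby-replay for one CephFS file system.
# Assumption: the release uses the per-file-system flag
# "allow_standby_replay". The file system name is a placeholder.
# CEPH_BIN is a hypothetical override (e.g. CEPH_BIN=echo) so the
# function can be dry-run without touching a cluster.
set_standby_replay() {
    fs_name="$1"   # name as listed by "ceph fs ls"
    enabled="$2"   # "true" or "false"
    case "$enabled" in
        true|false) ;;
        *) echo "usage: set_standby_replay <fs_name> true|false" >&2
           return 1 ;;
    esac
    "${CEPH_BIN:-ceph}" fs set "$fs_name" allow_standby_replay "$enabled"
}

# Example calls (not executed here):
#   set_standby_replay cephfs true    # a standby follows the active MDS journal
#   set_standby_replay cephfs false   # plain standby only
#   ceph fs status                    # shows the current MDS states
```

Running a test failover once with each value and comparing how long the mount stalls would show which setting behaves better for a given workload.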


Of course we hope the answer is: in such a setup, cephfs clients should not notice a reboot at all. :-)

All the best!

MJ
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]

