Re: [ceph-users] Ceph disk failure causing outage/ stalled writes

2017-12-20 Thread Konstantin Shalygin
the writes to the cluster get stalled for a good 10-15 mins and actually the machine just shut itself down. If your writes are stalled due to a host shutdown, then your cluster has a wrong configuration, such as 2 OSD hosts with min_size 2 and failure domain host. As is usually difficult to give
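
A rough sketch of the configuration problem described in this reply, assuming a replicated pool whose placement groups stop accepting I/O once fewer than min_size replicas are available. The function name `writes_blocked` and its parameters are hypothetical and are not part of any Ceph API; the pool size of 2 in the first call is also an assumption, since the message only mentions 2 OSD hosts and min_size 2.

```python
def writes_blocked(size: int, min_size: int, hosts: int, failed_hosts: int) -> bool:
    """Return True if a placement group can no longer accept writes.

    Simplified model (an assumption, not Ceph code): with failure domain
    "host", each replica of a placement group lives on a different host.
    """
    # At most one replica per host, so no more than min(size, hosts) replicas exist.
    replicas = min(size, hosts)
    # Worst case: every failed host held one replica of this placement group.
    surviving = replicas - min(failed_hosts, replicas)
    # I/O is blocked when fewer than min_size replicas remain available.
    return surviving < min_size


# Two hosts, size 2, min_size 2: losing one host leaves 1 replica, below min_size.
print(writes_blocked(size=2, min_size=2, hosts=2, failed_hosts=1))
# Three hosts, size 3, min_size 2: losing one host leaves 2 replicas, writes continue.
print(writes_blocked(size=3, min_size=2, hosts=3, failed_hosts=1))
```

Under this model the two-host layout blocks writes as soon as either host goes down, while a three-host layout with size 3 and min_size 2 tolerates the loss of one host.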

[ceph-users] Ceph disk failure causing outage/ stalled writes

2017-12-20 Thread Osama Hasebou
Hi Everyone, Lately we have been seeing a pattern: when a disk fails on Ceph, it gets marked as down while the actual disk might not be faulty yet, and the systemd OSD process still shows as running. Trying to kill the process doesn't work, and if the machine is rebooted,