Your min_size=2 is why the cluster is blocking and you can't mount CephFS.
Those 2 PGs, while the cluster is performing the backfilling, are currently
only on 1 OSD (osd.13). That is not enough OSDs to satisfy the min_size,
so any requests for data in those PGs will block and wait until a second
OSD becomes active for them.
We are using replica 2 and min size is 2. A small amount of data is
sitting around from when we were running the default 3.
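If you need client I/O to resume before the backfill finishes, one option is to temporarily lower min_size on the affected pools. This is a sketch only, and it carries real risk: with size=2 and min_size=1, writes proceed with a single surviving copy. The pool name "data" below is a placeholder for your actual pool names.

```shell
# Check current replication settings (replace "data" with your pool name)
ceph osd pool get data size
ceph osd pool get data min_size

# Temporarily allow I/O with only one active replica
ceph osd pool set data min_size 1

# Once the PGs are active+clean again, restore the safer setting
ceph osd pool set data min_size 2
```

Only worth doing if waiting out the backfill isn't acceptable.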
Looks like the problem started around here:
2017-06-22 14:54:29.173982 7f3c39f6f700 0 log_channel(cluster) log
[INF] : 1.2c9 deep-scrub ok
2017-06-22 14:54:29.690401
Something about it is blocking the cluster. I would first try running this
command. If that doesn't work, then I would restart the daemon.
# ceph osd down 13
Marking it down should force it to reassert itself to the cluster without
restarting the daemon and stopping any operations it's working on.
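If marking it down doesn't clear the blocked requests, restarting the daemon on the host that runs osd.13 would look something like this (assuming a systemd-based deployment; adjust the service name for your init system):

```shell
# First, ask the cluster to mark osd.13 down so it reasserts itself
ceph osd down 13

# If ops stay blocked, restart the daemon itself on its host
systemctl restart ceph-osd@13

# Then watch whether the blocked requests drain
ceph health detail | grep 'ops are blocked'
```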
Thanks for the response:
[root@ceph-control ~]# ceph health detail | grep 'ops are blocked'
100 ops are blocked > 134218 sec on osd.13
[root@ceph-control ~]# ceph osd blocked-by
osd num_blocked
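It might also be worth asking osd.13 directly what those ops are stuck on, via its admin socket on the host where the daemon runs (assuming the default admin socket location):

```shell
# Operations currently in flight / blocked on osd.13
ceph daemon osd.13 dump_ops_in_flight

# Recent slow operations, with per-stage timing
ceph daemon osd.13 dump_historic_ops
```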
A problem with osd.13?
Dan
On 06/23/2017 02:03 PM, David Turner wrote:
# ceph health detail | grep 'ops are blocked'
# ceph osd blocked-by
My guess is that you have an OSD in a funky state that is blocking the
requests and the peering. Let me know what the output of those commands
is.
Also, what are the replica sizes of your 2 pools? It shows that only 1 OSD
was in the acting set for those PGs.
Two of our OSD systems hit 75% disk utilization, so I added another
system to try to bring that back down. The system was usable for a day
while the data was being migrated, but now it does not respond
when I try to mount it:
mount -t ceph ceph-0,ceph-1,ceph-2,ceph-3:6789:/ /home
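For what it's worth, with cephx enabled the kernel client usually also needs credentials; a typical invocation looks like the following (the user name and secret file path here are just examples, not taken from your setup):

```shell
mount -t ceph ceph-0,ceph-1,ceph-2,ceph-3:6789:/ /home \
    -o name=admin,secretfile=/etc/ceph/admin.secret
```

Though given the blocked PGs above, the hang is more likely the cluster refusing I/O than an authentication problem.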