After a critical node failure on my lab cluster (the node won't come back up
and is still down), the RBD images are still reported as watched / mapped by
Ceph. I can't shell into the node to rbd unmap them because the node is down.
I am absolutely certain that nothing is using these images, and they don't
have snapshots either (and this IP is not even remotely close to those of the
monitors in the cluster). I blocked the IP using ceph osd blocklist add, but
after 30 minutes they are still being watched. Because they are watched (they
are RWO ceph-csi volumes), I can't re-use them in the cluster. As far as I'm
aware, Ceph should remove the watchers after 30 minutes, and they've been
blocklisted for hours now.

root@node0:~# rbd status kubernetes/csi-vol-e6a07ccd-93f6-4c47-a948-201501440fff
Watchers:
        watcher=10.0.0.103:0/992994811 client.1634081 cookie=139772597209280
root@node0:~# rbd snap list kubernetes/csi-vol-e6a07ccd-93f6-4c47-a948-201501440fff
root@node0:~# rbd info kubernetes/csi-vol-e6a07ccd-93f6-4c47-a948-201501440fff
rbd image 'csi-vol-e6a07ccd-93f6-4c47-a948-201501440fff':
        size 10 GiB in 2560 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 4ff5353b865e1
        block_name_prefix: rbd_data.4ff5353b865e1
        format: 2
        features: layering
        op_features:
        flags:
        create_timestamp: Fri Mar 31 14:46:51 2023
        access_timestamp: Fri Mar 31 14:46:51 2023
        modify_timestamp: Fri Mar 31 14:46:51 2023
root@node0:~# rados -p kubernetes listwatchers rbd_header.4ff5353b865e1
watcher=10.0.0.103:0/992994811 client.1634081 cookie=139772597209280
root@node0:~# ceph osd blocklist ls
10.0.0.103:0/0 2023-04-16T13:58:34.854232+0200
listed 1 entries
root@node0:~# ceph daemon osd.0 config get osd_client_watch_timeout
{
    "osd_client_watch_timeout": "30"
}

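In case it helps anyone reproduce the check, here is a rough sketch of how
the watcher address can be pulled out of the listwatchers output and turned
into a blocklist command for that exact client instance (address plus nonce)
rather than the whole IP. The function names watcher_addr and blocklist_cmd
are made up for this example, and the command is only printed, not run. I am
assuming the full ip:port/nonce form is what should be compared against the
blocklist; I have not confirmed that it behaves any differently from the
10.0.0.103:0/0 entry shown above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are made up for illustration only.

# Read "rados listwatchers" / "rbd status" output on stdin and print the
# address field (ip:port/nonce) of every "watcher=..." line.
watcher_addr() {
    sed -n 's/^[[:space:]]*watcher=\([^[:space:]]*\).*/\1/p'
}

# Read addresses on stdin and print (not execute) one blocklist command
# per address, so the result can be reviewed before running anything.
blocklist_cmd() {
    while IFS= read -r addr; do
        if [ -n "$addr" ]; then
            printf 'ceph osd blocklist add %s\n' "$addr"
        fi
    done
}
```

It would be used along these lines, with the pool and header object taken
from the output above:

rados -p kubernetes listwatchers rbd_header.4ff5353b865e1 | watcher_addr | blocklist_cmd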
Is it possible to kick a watcher out manually, or is there not much I can do
here besides shutting down the entire cluster (or the OSDs) and bringing them
back up? If it is a bug, I'm happy to help figure out its root cause and see
if I can help write a fix.

Cheers,
Max