Hi Ilya,

Here you go, no k8s services running this time:

sbezverk@kube-4:~$ sudo rbd map raw-volume --pool kubernetes --id admin -m 
192.168.80.233  --key=AQCeHO1ZILPPDRAA7zw3d76bplkvTwzoosybvA==
/dev/rbd0
sbezverk@kube-4:~$ sudo rbd status raw-volume --pool kubernetes --id admin -m 
192.168.80.233  --key=AQCeHO1ZILPPDRAA7zw3d76bplkvTwzoosybvA==
Watchers:
        watcher=192.168.80.235:0/3465920438 client.65327 cookie=1
sbezverk@kube-4:~$ sudo rbd info raw-volume --pool kubernetes --id admin -m 
192.168.80.233  --key=AQCeHO1ZILPPDRAA7zw3d76bplkvTwzoosybvA==
rbd image 'raw-volume':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rb.0.fafa.625558ec
        format: 1
sbezverk@kube-4:~$ sudo reboot

sbezverk@kube-4:~$ sudo rbd status raw-volume --pool kubernetes --id admin -m 
192.168.80.233  --key=AQCeHO1ZILPPDRAA7zw3d76bplkvTwzoosybvA==
Watchers: none

It seems that when the image is mapped manually, the issue is not reproducible. 

K8s does not just map the image; it also creates a loopback device which is 
linked to /dev/rbd0. Maybe this somehow reminds the rbd client to re-activate 
a watcher on reboot. I will try to manually mimic the exact steps k8s follows 
to see what exactly forces an active watcher after reboot.
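In case it helps, the manual steps I plan to try look roughly like this. This is only a sketch: the loop-device step via losetup is my guess at what k8s does under the hood, not confirmed behaviour.

```shell
# Map the image again (same command as in the transcript above).
sudo rbd map raw-volume --pool kubernetes --id admin -m 192.168.80.233 \
  --key=AQCeHO1ZILPPDRAA7zw3d76bplkvTwzoosybvA==

# Guessed k8s step: attach a loopback device backed by the mapped rbd device.
# --find picks the first free /dev/loopN, --show prints which one was used.
sudo losetup --find --show /dev/rbd0

# Confirm a loop device is now linked to /dev/rbd0.
losetup --list | grep rbd0
```

After that, reboot the node and run "rbd status" again to see whether the loopback attachment is what makes the watcher come back.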

Thank you
Serguei

On 2017-12-21, 5:49 AM, "Ilya Dryomov" <[email protected]> wrote:

    On Wed, Dec 20, 2017 at 6:20 PM, Serguei Bezverkhi (sbezverk)
    <[email protected]> wrote:
    > It took 30 minutes for the Watcher to time out after an ungraceful
    > restart. Is there a way to limit it to something a bit more reasonable?
    > Like 1-3 minutes?
    >
    > On 2017-12-20, 12:01 PM, "Serguei Bezverkhi (sbezverk)" 
<[email protected]> wrote:
    >
    >     Ok, here is what I found out. If I gracefully kill a pod, then the
    >     watcher gets properly cleared, but if it is done ungracefully,
    >     without "rbd unmap", then even after a node reboot the Watcher stays
    >     up for a long time; it has been more than 20 minutes and it is still
    >     active (no kubernetes services are running).
    
    Hi Serguei,
    
    Can you try taking k8s out of the equation -- set up a fresh VM with
    the same kernel, do "rbd map" in it and kill it?
    
    Thanks,
    
                    Ilya
    

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
