Thanks for the reply — I was fairly sure, since I had live-migrated all VMs off of that box and then killed everything except a handful of system processes (init, sshd, etc.), yet the watcher was still present. That said, I halted the machine (since nothing was running on it any longer), the watcher did indeed go away, and I was able to remove the images. Very, very strange. Situation solved, although I still don't know what the actual cause was.
Thanks,

--
Kenneth Van Alstyne
Systems Architect
Knight Point Systems, LLC
www.knightpoint.com

On Jan 10, 2019, at 4:03 AM, Ilya Dryomov <[email protected]> wrote:

On Wed, Jan 9, 2019 at 5:17 PM Kenneth Van Alstyne <[email protected]> wrote:

Hey folks:
I'm looking into what I would have thought was a simple problem, but it is turning out to be more complicated than I anticipated. A virtual machine managed by OpenNebula was blown away, but the backing RBD images remain. Upon investigating, it appears that the images still have watchers on the KVM node that the VM previously lived on. I can confirm that there are no mapped RBD images on the machine and that the qemu-system-x86_64 process is indeed no longer running. Any ideas?
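For reference, the two checks described above (no kernel-mapped RBD images, no surviving QEMU process) can be sketched roughly as below. This is a minimal sketch assuming a stock Linux KVM node; the function name is mine, and note that librbd mappings (as used by QEMU) never appear under /dev/rbd* or /sys/bus/rbd — only krbd mappings do.

```shell
check_rbd_watch_sources() {
    # Kernel-mapped (krbd) images show up under /sys/bus/rbd/devices;
    # QEMU uses librbd, whose watches live only inside the process itself.
    if [ -d /sys/bus/rbd/devices ] && [ -n "$(ls -A /sys/bus/rbd/devices 2>/dev/null)" ]; then
        echo "kernel rbd mappings present"
    else
        echo "no kernel rbd mappings"
    fi

    # A surviving QEMU process would still hold a librbd watch on the
    # image header object.
    if pgrep -a qemu-system-x86_64 >/dev/null 2>&1; then
        echo "qemu processes still running"
    else
        echo "no qemu-system-x86_64 processes"
    fi
}

check_rbd_watch_sources
```

If both checks come back clean but `rbd status` still reports a watcher, the watch is being kept alive by some other librados client on that address.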
Additional details are below:

# rbd info one-73-145-10
rbd image 'one-73-145-10':
        size 1024 GB in 262144 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.27174d6b8b4567
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
        parent: rbd/one-73@snap
        overlap: 102400 kB
# rbd status one-73-145-10
Watchers:
        watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880
# rados -p rbd listwatchers rbd_header.27174d6b8b4567
watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880

This appears to be a RADOS (i.e. not a kernel client) watch. Are you sure that nothing of the sort is running on that node? In order for the watch to stay alive, the watcher has to send periodic ping messages to the OSD. Perhaps determine the primary OSD with "ceph osd map rbd rbd_header.27174d6b8b4567", set debug_ms to 1 on that OSD, and monitor the log for a few minutes?

Thanks,

                Ilya
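When chasing a stale watch like this across several images, it can help to extract the watcher address and client ID programmatically so they can be matched against live hosts. Below is a minimal sketch in Python that parses the watcher lines from `rbd status` / `rados listwatchers` text output; the regex is based only on the line format shown above, and the function name is mine.

```python
import re

# Matches lines like:
#   watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880
WATCHER_RE = re.compile(
    r"watcher=(?P<addr>[\d.]+):(?P<port>\d+)/(?P<nonce>\d+)\s+"
    r"client\.(?P<client_id>\d+)\s+cookie=(?P<cookie>\d+)"
)

def parse_watchers(text):
    """Return one dict per watcher line found in the command output."""
    return [m.groupdict() for m in WATCHER_RE.finditer(text)]

# Sample output taken verbatim from the thread above.
sample = """\
Watchers:
        watcher=10.0.235.135:0/3820784110 client.33810559 cookie=140234310778880
"""

for w in parse_watchers(sample):
    print(w["addr"], w["client_id"])  # → 10.0.235.135 33810559
```

The `addr` field is what you would compare against the KVM node's IP, and `client.<id>` is the RADOS client session holding the watch.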
_______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
