Hi,

I'm currently investigating a case where a Ceph cluster ended up with 
inconsistent clone information.

Here's what I did to quickly reproduce it:
* Created new cluster (tested in hammer 0.94.6 and jewel 10.2.3)
* Created two pools: test and rbd
* Created a base image in pool test, created a snapshot, protected it and 
created a clone of this snapshot in pool rbd:
        # rbd -p test create --size 10 --image-format 2 base
        # rbd -p test snap create base@base
        # rbd -p test snap protect base@base
        # rbd clone test/base@base rbd/destination
* Created new user called "test" with rwx permissions to rbd pool only:
        caps: [mon] allow r
        caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=rbd
* Using this newly created user I removed the cloned image in the rbd pool. 
The command printed errors but ultimately removed the image:
        # rbd --id test -p rbd rm destination
        2016-12-21 11:50:03.758221 7f32b7459700 -1 librbd::image::OpenRequest: failed to retreive name: (1) Operation not permitted
        2016-12-21 11:50:03.758288 7f32b6c58700 -1 librbd::image::RefreshParentRequest: failed to open parent image: (1) Operation not permitted
        2016-12-21 11:50:03.758312 7f32b6c58700 -1 librbd::image::RefreshRequest: failed to refresh parent image: (1) Operation not permitted
        2016-12-21 11:50:03.758333 7f32b6c58700 -1 librbd::image::OpenRequest: failed to refresh image: (1) Operation not permitted
        2016-12-21 11:50:03.759366 7f32b6c58700 -1 librbd::ImageState: failed to open image: (1) Operation not permitted
        Removing image: 100% complete...done.
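
A user with the caps listed above could be created roughly as follows. This 
is only a sketch: the function name create_test_user is made up for 
illustration, and it assumes an admin keyring is available on the host where 
it runs.

```shell
# Sketch: create client.test with read-only mon access, class-read on
# rbd_children objects, and rwx on pool rbd only (no access to pool test).
# create_test_user is a hypothetical helper name, not part of Ceph.
create_test_user() {
    ceph auth get-or-create client.test \
        mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=rbd'
}
```

After calling create_test_user, "ceph auth get client.test" should show the 
same two caps lines as in the step above.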

At this point the cloned image no longer exists, but the original snapshot 
still holds a reference to it:

# rbd -p test snap unprotect base@base
2016-12-21 11:53:47.359060 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: cannot unprotect: at least 1 child(ren) [29b0238e1f29] in pool 'rbd'
2016-12-21 11:53:47.359678 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: encountered error: (16) Device or resource busy
2016-12-21 11:53:47.359691 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: 0x7fee39ae9340 should_complete_error: ret_val=-16
2016-12-21 11:53:47.360627 7fee037fe700 -1 librbd::SnapshotUnprotectRequest: 0x7fee39ae9340 should_complete_error: ret_val=-16
rbd: unprotecting snap failed: (16) Device or resource busy

# rbd -p test children base@base
rbd: listing children failed: (2) No such file or directory
2016-12-21 11:53:08.716987 7ff2b2eaad80 -1 librbd: Error looking up name for image id 29b0238e1f29 in pool rbd
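
To see where the leftover reference lives, the child pool can be inspected 
with rados. The sketch below assumes that the parent-to-child mapping is kept 
as omap entries on the rbd_children object in the clone's pool, and that any 
remaining objects of the removed clone would carry its image id in their 
names. The function name inspect_children is made up for illustration.

```shell
# Sketch: look for traces of a removed clone in its pool.
# Usage: inspect_children <pool> <image-id>
# inspect_children is a hypothetical helper name, not part of Ceph.
inspect_children() {
    pool="$1"
    image_id="$2"
    # Dump the omap entries of rbd_children (assumed to map parent
    # snapshots to child image ids); values are shown as a hex dump.
    rados -p "$pool" listomapvals rbd_children
    # List objects in the pool whose names contain the image id.
    rados -p "$pool" ls | grep "$image_id" \
        || echo "no objects found for image id $image_id"
}
```

With the values from the output above this would be called as 
"inspect_children rbd 29b0238e1f29". It only reads from the pool and does 
not change anything.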


Any ideas on how this could be fixed?


Thanks,
Bartek
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
