> -----Original Message-----
> From: ceph-users [mailto:[email protected]] On Behalf Of 
> Daznis
> Sent: 09 January 2017 12:54
> To: ceph-users <[email protected]>
> Subject: [ceph-users] Ceph cache tier removal.
> 
> Hello,
> 
> 
> I'm running a preliminary test of cache tier removal on a live cluster before
> I try it on a production one. I'm trying to avoid downtime, but from what I
> have noticed it's either impossible or I'm doing something wrong. My cluster
> is running CentOS 7.2 and Ceph 0.94.9.
> 
> Example 1:
> I'm setting the cache layer to forward:
>     1. ceph osd tier cache-mode test-cache forward
> Then flushing the cache:
>     1. rados -p test-cache cache-flush-evict-all
> Then I'm getting stuck with some objects that can't be removed:
> 
> rbd_header.29c3cdb2ae8944a
> failed to evict /rbd_header.29c3cdb2ae8944a: (16) Device or resource busy
>         rbd_header.28c96316763845e
> failed to evict /rbd_header.28c96316763845e: (16) Device or resource busy
> error from cache-flush-evict-all: (1) Operation not permitted
> 

These are probably the objects that have watchers attached. The current evict
logic seems to be unable to evict them, hence the error. I'm not sure anything
can be done to work around this other than what you have tried, i.e. stopping
the VM, which removes the watcher.
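
To confirm that watchers really are what blocks the eviction, the objects named
in the "failed to evict" lines can be checked with rados listwatchers. A rough
sketch, assuming the pool name test-cache from the original message and the
error format quoted above; the helper names busy_objects and
list_busy_watchers are made up for illustration:

```shell
#!/bin/sh
# busy_objects: read cache-flush-evict-all output on stdin and print the
# names of the objects that failed to evict with error 16 (EBUSY).
# Assumes lines of the form:
#   failed to evict /<object>: (16) Device or resource busy
busy_objects() {
    sed -n 's|^[[:space:]]*failed to evict /\(.*\): (16) .*$|\1|p'
}

# list_busy_watchers (hypothetical wrapper): re-run the flush/evict on the
# given cache pool and show the watchers on every object that stayed busy.
list_busy_watchers() {
    pool="$1"
    rados -p "$pool" cache-flush-evict-all 2>&1 | busy_objects |
    while read -r obj; do
        echo "== $obj"
        rados -p "$pool" listwatchers "$obj"
    done
}
```

Calling list_busy_watchers test-cache should then print the watcher entries
for each rbd_header object that could not be evicted. Note that it runs the
flush/evict again, so it is only a sketch for a test cluster.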

> I found a workaround for this. You can bypass these errors by either:
>       1. running ceph osd tier remove-overlay test-pool
>       2. turning off the VMs that are using them.
> 
> For the second option: I can boot the VMs normally after recreating a new
> overlay/cache tier. At this point everything is working fine, but I'm trying
> to avoid downtime, as it takes almost 8h to start everything and check that
> it is in optimal condition.
> 
> Now for the first option: I can remove the overlay and flush the cache layer,
> and the VMs run fine with it removed. Issues start after I have re-added the
> cache layer to the cold pool and try to write to or read from the disk. For
> no apparent reason the VMs just freeze, and you need to force stop/start all
> VMs to get them working again.

Which pool are the VMs pointed at, base or cache? I'm wondering if it has
something to do with the pool id changing.
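
One way to check the pool id theory is to save the pool lines from
ceph osd dump before removing the tier and again after re-adding it, then
compare them. A small sketch, assuming ceph osd dump prints one line per pool
starting with "pool <id> '<name>'"; the helper name pool_lines and the file
names are made up:

```shell
#!/bin/sh
# pool_lines: keep only the per-pool lines from `ceph osd dump` output
# read on stdin (pool id, name and tier settings are on these lines).
pool_lines() {
    grep '^pool '
}

# Intended use (not run here):
#   ceph osd dump | pool_lines > pools.before
#   ... remove the tier, then add it back ...
#   ceph osd dump | pool_lines > pools.after
#   diff pools.before pools.after
```

If the ids of the base and cache pools are identical in both files, the
pool id is probably not the cause of the freeze.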

> 
> From what I have read, all objects should leave the cache tier, and you
> shouldn't have to "force" removal of a tier that still holds objects.
> 
> Now onto the questions:
> 
>    1. Is it normal for a VPS to freeze while adding a cache layer/tier?
>    2. Do VMs need to be offline to remove the caching layer?
>    3. I have read somewhere that snapshots might interfere with cache
>       tier cleanup. Is that true?
>    4. Are there other ways to remove the caching tier on a live system?
> 
> 
> Regards,
> 
> 
> Darius
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
