Re: [ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-02 Thread Craig Lewis
On Mon, Dec 1, 2014 at 1:51 AM, Daniel Schneller <daniel.schnel...@centerdevice.com> wrote:
> I could not find any way to throttle the background deletion activity
> (the command returns almost immediately).
I'm only aware of osd snap trim sleep. I haven't tried this since my Firefly upgrade…
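For context, a minimal sketch of how that throttle is typically applied on a Firefly-era cluster (the 0.05 second value is purely an illustrative assumption, not taken from this thread):

    # set at runtime on all OSDs, no restart needed (illustrative value)
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'

    # or persist it in ceph.conf so it survives OSD restarts
    [osd]
        osd snap trim sleep = 0.05

The option makes each OSD pause for the given number of seconds between snap trim operations, trading a longer overall trim time for less impact on client I/O.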

Re: [ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-01 Thread Daniel Schneller
Thanks for your input. We will see what we can find out with the logs and how to proceed from there.

Re: [ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-01 Thread Dan Van Der Ster
> On 01 Dec 2014, at 13:37, Daniel Schneller wrote:
>
> On 2014-12-01 10:03:35 +, Dan Van Der Ster said:
>
>> Which version of Ceph are you using? This could be related:
>> http://tracker.ceph.com/issues/9487
>
> Firefly. I had seen this ticket earlier (when deleting a whole pool) and…

Re: [ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-01 Thread Daniel Schneller
On 2014-12-01 10:03:35 +, Dan Van Der Ster said:
> Which version of Ceph are you using? This could be related:
> http://tracker.ceph.com/issues/9487

Firefly. I had seen this ticket earlier (when deleting a whole pool) and hoped the backport of the fix would be available some time soon. I must…

Re: [ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-01 Thread Dan Van Der Ster
Hi,

Which version of Ceph are you using? This could be related: http://tracker.ceph.com/issues/9487

See "ReplicatedPG: don't move on to the next snap immediately"; basically, the OSD is getting into a tight loop "trimming" the snapshot objects. The fix above breaks out of that loop more frequently…
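As an aside not from the original message, one way to confirm which release the OSD daemons are actually running before relying on that fix (standard Ceph CLI; the exact backport release for the fix is not stated in this thread):

    # ask every running OSD daemon for its version
    ceph tell osd.* version
    # version of the locally installed ceph tools
    ceph --version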

[ceph-users] Removing Snapshots Killing Cluster Performance

2014-12-01 Thread Daniel Schneller
Hi!

We take regular (nightly) snapshots of our Rados Gateway pools for backup purposes. This allows us - with some manual pokery - to restore clients' documents should they delete them accidentally. The cluster is a 4-server setup with 12x4TB spinning disks each, totaling about 175TB. We are running…
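For reference, a minimal sketch of the kind of nightly pool snapshot workflow described above, using standard Ceph commands (the pool, snapshot, and object names are illustrative assumptions only):

    # create a named snapshot of an RGW data pool
    ceph osd pool mksnap .rgw.buckets backup-2014-12-01
    # read an object's contents as of the snapshot (manual restore)
    rados -p .rgw.buckets -s backup-2014-12-01 get <object-name> restored.bin
    # remove the snapshot once it is no longer needed
    ceph osd pool rmsnap .rgw.buckets backup-2014-12-01

Removing a snapshot with rmsnap is what queues the background snap trimming work whose performance impact is discussed in this thread.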