We're on Jewel and you're right, I'm pretty sure the snap stuff is also now 
handled in the op thread.

The dump historic ops socket command showed a 10s delay at the "Reached PG" 
stage. Going by Greg's response [1], that would suggest that the OSD itself 
isn't blocking, but the PG is, as it is currently sleeping whilst trimming. I 
think in the former case (the OSD itself blocking), it would show a high time 
on the "Started" part of the op? Anyway, I will carry out some more testing 
with higher osd op threads and see if that makes any difference. Thanks for 
the suggestion.
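
For anyone wanting to check the same thing on their own cluster, below is a 
rough sketch of how the per-stage delay of a single op could be worked out. 
It assumes the events of one op have already been pulled out of the dump 
historic ops output as (timestamp, event name) pairs. The JSON layout of that 
output differs between releases, so the extraction step is deliberately left 
out, and the timestamp format, the function names and the sample values are 
all illustrative assumptions rather than real output.

```python
from datetime import datetime

# Assumed timestamp layout; adjust if your release prints something different.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def event_gaps(events):
    """Return (event_name, seconds_since_previous_event) for every event after the first."""
    parsed = [(datetime.strptime(ts, TIME_FORMAT), name) for ts, name in events]
    return [
        (name, (when - prev_when).total_seconds())
        for (prev_when, _), (when, name) in zip(parsed, parsed[1:])
    ]


def slowest_stage(events):
    """Return the (event_name, gap_in_seconds) pair with the longest wait before it."""
    return max(event_gaps(events), key=lambda gap: gap[1])


# Hypothetical events for one op, made up to mirror the roughly 10s wait described above.
sample = [
    ("2017-01-13 10:00:00.000100", "initiated"),
    ("2017-01-13 10:00:00.000200", "queued_for_pg"),
    ("2017-01-13 10:00:10.000300", "reached_pg"),
    ("2017-01-13 10:00:10.000400", "started"),
    ("2017-01-13 10:00:10.002400", "done"),
]

print(slowest_stage(sample))
```

With the made-up sample, the longest gap sits in front of "reached_pg" at 
roughly 10s, which is the pattern I was seeing. If the op threads themselves 
were the bottleneck, I would expect the large gap to show up in front of a 
different event instead.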

Nick


[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008652.html

> -----Original Message-----
> From: Dan van der Ster [mailto:[email protected]]
> Sent: 13 January 2017 10:28
> To: Nick Fisk <[email protected]>
> Cc: ceph-users <[email protected]>
> Subject: Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?
> 
> Hammer or jewel? I've forgotten which thread pool is handling the snap trim 
> nowadays -- is it the op thread yet? If so, perhaps all the
> op threads are stuck sleeping? Just a wild guess. (Maybe increasing # op 
> threads would help?).
> 
> -- Dan
> 
> 
> On Thu, Jan 12, 2017 at 3:11 PM, Nick Fisk <[email protected]> wrote:
> > Hi,
> >
> > I had been testing some higher values with the osd_snap_trim_sleep
> > variable to try and reduce the impact of removing RBD snapshots on our
> > cluster and I have come across what I believe to be a possible unintended 
> > consequence. The value of the sleep seems to keep the lock on the PG
> > open so that no other IO can use the PG whilst the snap removal
> > operation is sleeping.
> >
> > I had set the variable to 10s to completely minimise the impact as I
> > had some multi TB snapshots to remove and noticed that suddenly all IO to 
> > the cluster had a latency of roughly 10s as well, all the dumped ops
> > show waiting on PG for 10s as well.
> >
> > Is the osd_snap_trim_sleep variable only ever meant to be used up to
> > say a max of 0.1s and this is a known side effect, or should the lock on 
> > the PG be removed so that normal IO can continue during the sleeps?
> >
> > Nick
> >
> > _______________________________________________
> > ceph-users mailing list
> > [email protected]
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

