Thanks Jason.

I am running Jewel currently.  I do have object-map enabled, but it sounds like 
the no-op discard optimization is specific to the newer librbd?

Thanks again,
Brendan

________________________________________
From: Jason Dillaman [[email protected]]
Sent: Monday, November 27, 2017 5:49 PM
To: Brendan Moloney
Cc: [email protected]
Subject: Re: [ceph-users] I/O stalls when doing fstrim on large RBD

My only possible suggestion would be to try a Luminous librbd client
and enable the object-map feature on the image. When the object map is
enabled, librbd will optimize away all the no-op discard requests. If
that doesn't work, it could be an issue in the OS / controller
interaction.

On Mon, Nov 27, 2017 at 5:56 PM, Brendan Moloney <[email protected]> wrote:
> Hi,
>
> Anyone have input on this?  I am surprised there are not more people running 
> into this issue.  I guess most people don't have multi-TB RBD images?  I 
> think ext4 might also fare better since it does keep track of blocks that
> have been discarded in the past and not modified so that they don't get 
> discarded again.
>
> For the benefit of anyone else on the list who is following along, I went 
> ahead and made the blktrace output available: 
> https://filebin.ca/3imcZ5IHImxW/fstrim_blktrace.tar.gz
>
> Here is the script I wrote to chunk up the fstrim: 
> https://gist.github.com/moloney/5763a02e3847a5368af56110cc583544
>
> Using this script and the "cfq" scheduler, I do seem to have things running a
> bit more smoothly. I also raised the disk timeout values from 30 seconds to
> 180.  I am still not convinced that the issue is resolved, though. I will need
> to wait and see what happens when a VM under heavy load does an fstrim.
>
> Thanks,
> Brendan
>
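
For readers following along, the idea of chunking up the fstrim mentioned
above can be sketched roughly as follows. This is NOT the script from the
gist, only a minimal illustration under stated assumptions: the helper names
(trim_ranges, fstrim_command, chunked_fstrim), the pause between chunks, and
the dry-run default are all hypothetical. The only external piece relied on
is the util-linux fstrim command with its --offset and --length options
(both in bytes).

```python
import subprocess
import time


def trim_ranges(total_bytes, chunk_bytes):
    """Yield (offset, length) pairs covering total_bytes in chunk_bytes steps."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    offset = 0
    while offset < total_bytes:
        # The final chunk may be shorter than chunk_bytes.
        length = min(chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


def fstrim_command(mountpoint, offset, length):
    """Build the fstrim argv for one chunk (offset and length in bytes)."""
    return ["fstrim", "--offset", str(offset), "--length", str(length), mountpoint]


def chunked_fstrim(mountpoint, total_bytes, chunk_bytes, pause_s=1.0, dry_run=True):
    """Trim one chunk at a time, pausing between chunks (hypothetical helper).

    With dry_run=True the commands are only printed, not executed.
    """
    for offset, length in trim_ranges(total_bytes, chunk_bytes):
        cmd = fstrim_command(mountpoint, offset, length)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # typically requires root
            time.sleep(pause_s)  # give other I/O a chance between chunks
```

Calling chunked_fstrim("/mnt/data", 4 * 2**40, 2**30) (the mount point and
sizes are placeholders) would print one fstrim command per 1 GiB chunk of a
4 TiB filesystem. The chunk size and pause that keep I/O from stalling would
need to be found by experiment.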



--
Jason
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
