> On Feb 20, 2019, at 2:06 PM, Dennis Zhou <[email protected]> wrote:
>
> Hi everyone,
>
> Ric proposed a similar topic already [1], but I'd like to just add a
> little more to that discussion with a broader proposal for discussing
> discard.
>
> Proposal:
> Discard is a topic that seems to need a little bit of TLC from the
> kernel side. The two main approaches with discard are either to enable
> it as a mount option or to periodically run fstrim. At Facebook, we've
> seen performance wins with both enabling discard and disabling discard
> in favor of fstrim. I would like to find a middle ground where we don't
> have to have application developers performance tune for the hardware.
>
> XFS has had async discard support, but it has been problematic for our
> fleet. We were seeing bursts of large discard requests caused by async
> discard in XFS. This resulted in degraded drive performance increasing
> latency for dependent services.
>
> An alternative to discard is a filesystem level choice of reusing
> blocks. Rather than immediately issuing a discard, we can choose to
> reuse a block which will cause the underlying FTL to invalidate a block
> that we would have discarded. This saves the work done by the initial
> discard. There will have to be a balance here though with saving discard
> with hopes of reusing it and eventually issuing the discards.
>
> Another issue is each drive has its own proprietary FTL which introduces
> a wide variability of performance with discard. Some drives don't need
> discard to perform well and others need discard enabled. There is also a
> class of drives where discard is so not performant that it locks up the
> drive. I would like to discuss how we can quantify this and help tune
> discard within the kernel.
>
> In summary, I'd like to explore discard more and understand how we can
> build a more balanced discard experience where end users don't have to
> be super cognizant of the implications on their applications.
>
> [1]
> https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> Thanks,
> Dennis
Maybe we should use BPF to handle complexity and variations?
Thanks,
Song