On Fri, Apr 6, 2018 at 1:58 AM, Borja Marcos <bor...@sarenet.es> wrote:
> > On 5 Apr 2018, at 17:00, Warner Losh <i...@bsdimp.com> wrote:
> >
> > I'm working on trim shaping in -current right now. It's focused on NVMe,
> > but since I'm doing the bulk of it in cam_iosched.c, it will eventually be
> > available for ada and da. The notion is to measure how long the TRIMs take,
> > and only send them at 80% of that rate when there's other traffic in the
> > queue (so if trims are taking 100ms, send them no faster than 8/s). While
> > this will allow for better read/write traffic, it does slow the TRIMs down
> > which slows down whatever they may be blocking in the upper layers. Can't
> > speak to ZFS much, but for UFS that's freeing of blocks so things like new
> > block allocation may be delayed if we're almost out of disk (which we have
> > no signal for, so there's no way for the lower layers to prioritize trims
> > or not).
>
> Have you considered "hard" shaping including discarding TRIMs when needed?
> Remember that a TRIM is not a write, which is subject to a contract with
> the application, but a better-if-you-do-it operation.

Well, yes and no. TRIM is there to improve performance, in the long term, of
the drives because they'd otherwise get too fragmented and/or have an
unacceptably high write amplification. It's more than just a hint, but maybe,
in some cases, less than a write. "Better if you do it" does give some leeway;
how much depends on the application. If we were to implement a hard limit on
the latency of TRIMs, it would have to be user configurable. There's also the
strategy of returning some TRIMs right away, while letting only a percentage
through to the device.

If I go through with what you're calling hard shaping, I'd also look for ways
to allow the upper layers to tell me to hurry up. We have it in the buffer
daemon between all the users of bufs when there's a buf shortage, but no
similar signal from UFS down to the device to tell it that the results are
needed NOW vs needed eventually.
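
To make the shaping arithmetic quoted above concrete, here is a minimal sketch
in Python. It is not the cam_iosched.c code. The names TrimShaper,
record_completion and may_send are hypothetical, and the moving-average
smoothing of the measured latency is an assumption made only for illustration.
It shows the one rule stated in the thread: with other traffic queued, issue
TRIMs at no more than 80% of the measured completion rate, so a 100ms TRIM
latency caps the rate at 8 TRIMs per second.

```python
class TrimShaper:
    """Illustrative TRIM rate limiter (hypothetical, not the kernel code)."""

    def __init__(self, fraction=0.8, alpha=0.25):
        self.fraction = fraction    # share of the measured rate we allow
        self.alpha = alpha          # smoothing weight (an assumption)
        self.avg_latency = None     # smoothed TRIM latency, in seconds
        self.next_allowed = 0.0     # earliest time the next TRIM may go out

    def record_completion(self, latency_s):
        """Fold one measured TRIM latency into the running average."""
        if latency_s <= 0:
            raise ValueError("latency must be positive")
        if self.avg_latency is None:
            self.avg_latency = latency_s
        else:
            self.avg_latency += self.alpha * (latency_s - self.avg_latency)

    def max_rate(self):
        """Rate cap in TRIMs per second, or None before any measurement."""
        if self.avg_latency is None:
            return None
        return self.fraction / self.avg_latency

    def may_send(self, now, other_io_pending):
        """Decide whether a TRIM may be issued at time `now` (seconds)."""
        # With no competing traffic, or no measurement yet, do not throttle.
        if not other_io_pending or self.avg_latency is None:
            return True
        if now < self.next_allowed:
            return False
        # Space TRIMs so that the rate is fraction / avg_latency.
        self.next_allowed = now + self.avg_latency / self.fraction
        return True


shaper = TrimShaper()
shaper.record_completion(0.100)     # TRIMs are taking 100ms
sent = sum(shaper.may_send(ms / 1000, True) for ms in range(1000))
print(shaper.max_rate(), sent)
```

In this sketch, throttling applies only while other I/O is pending; an idle
queue lets TRIMs through unshaped, which matches the "when there's other
traffic in the queue" condition in the quoted text.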
And the urgency of the need varies somewhat over time. You could easily send
down a boatload of TRIMs with no urgent need for blocks, time passes, and then
you have an urgent need for blocks. So you can't just add something to the bio
going down saying whether it's needed or not, since you might not have another
TRIM to send down. A new BIO type and/or a tweak to BIO_FLUSH might suffice
and be well defined for drivers that don't do weird things. The notion of the
upper layers being able to cancel a TRIM that's been queued up was also
floated, since TRIM + WRITE in quick succession often gives no different
performance than just the bare WRITE. And I have no clue what ZFS does wrt
TRIMs.

So I've considered it, yes. But there are more tricky corners here to consider
if it were to be implemented, due to (a) the diversity of quality in the
marketplace and (b) the diversity of workloads FreeBSD is used for.

> Otherwise, as you say, you might be blocking other operations in the upper
> layers.
> I am assuming here that with many devices doing TRIMs is better than not
> doing them.
> And in case of queue congestion doing *some* TRIMs should be better than
> doing no TRIMs at all.
>
> Yep, not the first time I propose something of the sort, but my queue of
> suggestions to eventually discard TRIMs doesn't implement the same method ;)

I'm looking at all options, to be honest. I'm not sure what will work the best
in the long term. I've observed that, at least with UFS, it's quite easy to
survive for hours without finishing the trim with millions of TRIMs in the
queue. All it affects are monitoring programs that freak out when you have
this many items in the queue for so long (thinking something must be wrong).
Of course, we control the monitoring programs, so that's easy to fix. (I
discovered this when I was doing 1 IOP for TRIMs when running early, buggy
versions of this, btw). The backup causes UFS to be waiting on the blocks, if
there is a block shortage.
Since most of the time there's not, this didn't cause problems when I tweaked
a parameter and drained the TRIMs 8 hours after tons of files were deleted....

Warner
_______________________________________________
firstname.lastname@example.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"