On 5/1/24 14:38, mike tancsa wrote:
I'm kind of struggling to verify whether TRIM is actually working with my SSDs under ZFS on RELENG_14.

On a pool that has almost no files on it (capacity at 0% of 3 TB), shouldn't

zpool trim -w <pool>

be almost instant after a couple of runs? Instead, it always seems to take about 10 minutes to complete.

Looking at the stats,

kstat.zfs.tortank1.misc.iostats.trim_bytes_failed: 0
kstat.zfs.tortank1.misc.iostats.trim_extents_failed: 0
kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
kstat.zfs.tortank1.misc.iostats.trim_extents_skipped: 253898
kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
kstat.zfs.tortank1.misc.iostats.trim_extents_written: 1169158

What are these skipped bytes, and why are they being skipped?

One of the drives for example

 sysctl -a kern.cam.ada.0
kern.cam.ada.0.trim_ticks: 0
kern.cam.ada.0.trim_goal: 0
kern.cam.ada.0.sort_io_queue: 0
kern.cam.ada.0.rotating: 0
kern.cam.ada.0.unmapped_io: 1
kern.cam.ada.0.flags: 0x1be3bde<CAN_48BIT,CAN_FLUSHCACHE,CAN_NCQ,CAN_DMA,WAS_OTAG,CAN_TRIM,OPEN,SCTX_INIT,CAN_POWERMGT,CAN_DMA48,CAN_LOG,CAN_WCACHE,CAN_RAHEAD,PROBED,ANNOUNCED,DIRTY,PIM_ATA_EXT,UNMAPPEDIO>
kern.cam.ada.0.max_seq_zones: 0
kern.cam.ada.0.optimal_nonseq_zones: 0
kern.cam.ada.0.optimal_seq_zones: 0
kern.cam.ada.0.zone_support: None
kern.cam.ada.0.zone_mode: Not Zoned
kern.cam.ada.0.write_cache: -1
kern.cam.ada.0.read_ahead: -1
kern.cam.ada.0.trim_lbas: 7771432624
kern.cam.ada.0.trim_ranges: 371381
kern.cam.ada.0.trim_count: 310842
kern.cam.ada.0.delete_method: DSM_TRIM

If I take one of the disks out of the pool, replace it with a spare, and do a manual trim, it seems to work.

I had a hard time seeing evidence of this at the disk level while fiddling with TRIM recently. It appears that at least some counters are driver- and operation-specific; for example, the da driver seems to update its counters in some code paths but not others. I assume ada behaves differently. There is a bug report open for da, but I haven't seen any feedback on it yet:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277673

You could try running gstat with the -d flag (which adds statistics for delete/BIO_DELETE operations) while the trim is running. That should give you a real-time view of what's happening at the disk level, though it may not tell you more than you're already seeing.
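If it helps, the delete columns show up on the right-hand side of gstat's display. Here's a small sketch that pulls the delete-ops column out of a mocked-up line of `gstat -dp` output (the sample line, its numbers, and the ada1 device name are made up for illustration; double-check the column positions against your own gstat output):

```shell
# Mocked-up single refresh of `gstat -dp` output; the d/s, kBps, ms/d
# columns added by -d are the BIO_DELETE (TRIM) statistics.
sample=' L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    0    412      0      0    0.0      0      0    0.0    412  52000    1.9   78.1| ada1'

# Extract the delete-ops-per-second column (field 9) for ada1.
dps=$(echo "$sample" | awk '/ada1/ {print $9}')
echo "delete ops/s for ada1: $dps"    # prints 412 for the sample line
```

A nonzero d/s while zpool trim is running would confirm the TRIMs are actually reaching the disk.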

E.g., here is one disk in the pool that was taking a long time for each zpool trim:

# time trim -f /dev/ada1
trim /dev/ada1 offset 0 length 1000204886016
0.000u 0.057s 1:29.33 0.0%      5+184k 0+0io 0pf+0w
and then, re-running it immediately:
#  time trim -f /dev/ada1
trim /dev/ada1 offset 0 length 1000204886016
0.000u 0.052s 0:04.15 1.2%      1+52k 0+0io 0pf+0w

90 seconds the first time, then 4 seconds on the re-run.
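As for your earlier question about the skipped bytes: if I'm reading the OpenZFS code right, a manual trim skips extents smaller than the vfs.zfs.trim.extent_bytes_min tunable (32 KiB by default), on the theory that very small TRIMs aren't worth issuing. Your kstats look consistent with that; a quick back-of-the-envelope check using the numbers from your mail:

```shell
# Average size of a skipped extent, computed from the kstats quoted above.
bytes_skipped=2743435264
extents_skipped=253898
avg=$(( bytes_skipped / extents_skipped ))
echo "average skipped extent: $avg bytes"   # 10805 bytes, ~10.5 KiB,
                                            # well under a 32 KiB cutoff
```

You can check the cutoff on your system with `sysctl vfs.zfs.trim.extent_bytes_min`.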


-Matthew
