On 5/1/2024 4:24 PM, Matthew Grooms wrote:
On 5/1/24 14:38, mike tancsa wrote:
I'm struggling to check whether TRIM is actually working with my
SSDs on RELENG_14 with ZFS.
On a pool that has almost no files on it (capacity at 0% of 3TB),
shouldn't
zpool trim -w <pool> be almost instant after a couple of runs?
Instead it always seems to take about 10 minutes to complete.
Looking at the stats,
kstat.zfs.tortank1.misc.iostats.trim_bytes_failed: 0
kstat.zfs.tortank1.misc.iostats.trim_extents_failed: 0
kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
kstat.zfs.tortank1.misc.iostats.trim_extents_skipped: 253898
kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
kstat.zfs.tortank1.misc.iostats.trim_extents_written: 1169158
What are these skipped bytes, and why are they being skipped?
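One quick sanity check on those counters: the skipped bytes are a tiny fraction of the total, which would be consistent with ZFS skipping extents below its minimum trim size rather than dropping whole ranges (the average skipped extent above works out to roughly 10 KB). A minimal sketch of the arithmetic, with the counter values pasted inline from the post so the pipeline is reproducible; on the live system you would pipe `sysctl kstat.zfs.tortank1.misc.iostats` in instead of the here-doc:

```shell
# Compute what fraction of trim bytes were skipped vs. written, using the
# counter values from the post (inlined so the arithmetic is reproducible).
awk -F': ' '
  /trim_bytes_skipped/ { skipped = $2 }
  /trim_bytes_written/ { written = $2 }
  END { printf "skipped: %.2f%% of trim bytes\n", 100 * skipped / (skipped + written) }
' <<'EOF'
kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
EOF
```

which prints `skipped: 0.02% of trim bytes`, i.e. the skipping is real but negligible in volume.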
On one of the drives, for instance, I had a hard time seeing evidence of
this at the disk level while fiddling with TRIM recently. It appeared
that at least some counters are driver- and operation-specific. For
example, the da driver appears to update counters in some paths but
not others. I assume that ada is different. There is a bug report for
da, but I haven't seen any feedback yet:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277673
You could try running gstat with the -d flag during the window when
the delete operations are expected to occur. That should give you a
real-time view of what's happening at the disk level, though it may
not offer more information than you're already seeing.
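For scripting rather than watching, gstat also has a batch mode: -b prints a single sample and exits, and -d adds the delete columns. A sketch of filtering that output down to just the providers seeing BIO_DELETE traffic; the sample rows are inlined here (the first two are real rows from the output below, the idle ada1 row is made up to show the filter working), and on the live system you would replace the here-doc with `gstat -bd -I 1s`:

```shell
# Keep only the providers that saw BIO_DELETE traffic in this sample.
# Fields with -d: L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
awk 'NR > 1 && $9 > 0 { printf "%s: %s deletes/s, %s kB/s deleted\n", $NF, $9, $10 }' <<'EOF'
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d  %busy Name
    0   1254      0      0    0.0    986   5202    2.0    244  8362733    4.5   55.6  ada0
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0   63.3  ada2
    0      5      5     64    0.1      0      0    0.0      0        0    0.0    0.1  ada1
EOF
```

The idle ada1 row is dropped and only the two busy providers are reported.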
It *seems* to be doing something. What I don't understand is this: if
I run it once, do nothing (no writes, snapshots, etc.), and then run
trim again, gstat still shows delete activity, even though there
should be nothing left to mark as trimmed?
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d  %busy Name
    0   1254      0      0    0.0    986   5202    2.0    244  8362733    4.5   55.6  ada0
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0   63.3  ada2
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0   63.3  ada2p1
    0   4313      0      0    0.0   1024   5190    0.8   3266  6463815    0.4   62.8  ada3
    0   1254      0      0    0.0    986   5202    2.0    244  8362733    4.5   55.6  ada0p1
    0   4238      0      0    0.0    960   4874    0.7   3254  6280362    0.4   59.8  ada5
    0   4313      0      0    0.0   1024   5190    0.8   3266  6463815    0.4   62.8  ada3p1
    0   4238      0      0    0.0    960   4874    0.7   3254  6280362    0.4   59.8  ada5p1
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d  %busy Name
    2   2381      0      0    0.0   1580   9946    0.9    767  5990286    1.8   70.0  ada0
    2   2801      0      0    0.0   1540   9782    0.9   1227 11936510    1.0   65.2  ada2
    2   2801      0      0    0.0   1540   9782    0.9   1227 11936510    1.0   65.2  ada2p1
    0   2072      0      0    0.0   1529   9566    0.8    509 12549587    2.1   57.0  ada3
    2   2381      0      0    0.0   1580   9946    0.9    767  5990286    1.8   70.0  ada0p1
    0   2042      0      0    0.0   1517   9427    0.6    491 12549535    1.9   52.4  ada5
    0   2072      0      0    0.0   1529   9566    0.8    509 12549587    2.1   57.0  ada3p1
    0   2042      0      0    0.0   1517   9427    0.6    491 12549535    1.9   52.4  ada5p1
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d  %busy Name
    2   1949      0      0    0.0   1094   5926    1.2    827 11267200    1.8   78.8  ada0
    0   2083      0      0    0.0   1115   6034    0.7    939 16537981    1.4   67.2  ada2
    0   2083      0      0    0.0   1115   6034    0.7    939 16537981    1.4   67.2  ada2p1
    2   2525      0      0    0.0   1098   5914    0.8   1399 16021615    1.1   79.3  ada3
    2   1949      0      0    0.0   1094   5926    1.2    827 11267200    1.8   78.8  ada0p1
   12   2471      0      0    0.0   1018   5399    1.0   1425 15395566    1.1   80.5  ada5
    2   2525      0      0    0.0   1098   5914    0.8   1399 16021615    1.1   79.3  ada3p1
   12   2471      0      0    0.0   1018   5399    1.0   1425 15395566    1.1   80.5  ada5p1
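One thing worth noting in those samples: the delete kBps column dwarfs the write kBps column because each BIO_DELETE can cover a very large LBA extent, so a modest op rate still sums to a huge byte rate. Dividing the delete bandwidth by the delete op rate gives a rough mean extent size per operation; a quick sketch using the ada0 row from the first sample:

```shell
# Mean extent size per BIO_DELETE for ada0 in the first gstat sample:
# 8362733 kB/s of deleted range spread over 244 deletes/s.
awk 'BEGIN {
    dps = 244; kbps = 8362733
    printf "mean delete size: %.1f MB per BIO_DELETE\n", kbps / dps / 1024
}'
```

That comes out to about 33.5 MB per BIO_DELETE, which is why the delete byte counts look absurd next to the write byte counts.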
The ultimate problem is that after a while with a lot of writes, disk
performance is toast until I do a manual trim -f of the disk :( This
is most noticeable on consumer WD SSDs. I haven't done any extensive
tests with Samsung SSDs to see whether they suffer the same
performance penalty; it might be that they are just better at masking
the problem. I don't see the same issue with ZFS on Linux on the same
disks/hardware.
I have an open PR at
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277992 that I think
may actually cover two separate problems.
---Mike
For example, here is one disk in the pool that was taking a long time
on each zpool trim:
# time trim -f /dev/ada1
trim /dev/ada1 offset 0 length 1000204886016
0.000u 0.057s 1:29.33 0.0% 5+184k 0+0io 0pf+0w
and re-running it immediately afterwards:
# time trim -f /dev/ada1
trim /dev/ada1 offset 0 length 1000204886016
0.000u 0.052s 0:04.15 1.2% 1+52k 0+0io 0pf+0w
90 seconds the first time, then 4 seconds after that.
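Those two timings are easy to turn into an implied deallocation rate: the device is about 1 TB, and the resulting numbers (especially the rerun) are far beyond anything the flash could physically process, which is consistent with the drive short-circuiting ranges it already knows are deallocated rather than re-erasing anything. A sketch of the arithmetic, using 1:29.33 = 89.33 s for the first pass:

```shell
# Whole-device TRIM rate implied by the two timings above:
# 1000204886016 bytes in 89.33 s, then the same range again in 4.15 s.
awk 'BEGIN {
    bytes = 1000204886016
    printf "first pass:  %.1f GB/s\n", bytes / 89.33 / 1e9
    printf "second pass: %.1f GB/s\n", bytes / 4.15 / 1e9
}'
```

Roughly 11 GB/s for the first pass versus 241 GB/s for the rerun, so the second pass is clearly mostly a firmware no-op.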
-Matthew