summary: WD Blue SA510 SSDs attached to the mpr controller seem to
start throwing errors on random disks in the pool (see
https://lists.freebsd.org/archives/freebsd-hardware/2024-March/000100.html
for examples) after copying and destroying a 200G ZFS dataset with many
small files 3 or 4 times on a set of 4 disks in raidz1. Doing a hard
trim -f da on the disks and recreating the pool allows me to run the
tests 3 or 4 more times before hitting the errors again. The same tests
with the same disks attached to a SATA controller don't show the
errors. I also ran into the same problem with a similar LSI controller
that uses the mrsas driver (<AVAGO Invader SAS Controller>).
It seems to be TRIM related? Using Samsung SSDs on the mpr controller
does not seem to show the issue.
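Roughly, the copy/destroy cycle described above could be scripted like
the sketch below. This is only an illustration: the pool name, dataset
name and source directory are placeholders, the helper repro_commands
is made up, and the exact commands are an assumption rather than a
transcript of the original test. It also assumes the pool uses the
default mountpoint (/<pool>). The script only prints the commands, it
does not run them.

```python
import shlex


def repro_commands(pool, dataset, source_dir, cycles=4):
    """Build (but do not run) the copy/destroy cycle as a list of argv lists.

    pool, dataset and source_dir are hypothetical placeholders.
    """
    target = f"{pool}/{dataset}"
    cmds = []
    for _ in range(cycles):
        # Create a fresh dataset for this round.
        cmds.append(["zfs", "create", target])
        # Copy the tree of small files into it (assumes default mountpoint).
        cmds.append(["cp", "-a", source_dir, f"/{target}/"])
        # Check whether the pool reports any problem after the copy.
        cmds.append(["zpool", "status", "-x", pool])
        # Destroy the dataset, which is what triggers the bulk of the TRIMs.
        cmds.append(["zfs", "destroy", "-r", target])
    return cmds


if __name__ == "__main__":
    for argv in repro_commands("testpool", "scratch", "/data/smallfiles"):
        print(shlex.join(argv))
```

Printing the plan first makes it easy to review before pasting it into
a root shell on the test box.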
OK, some updates. I took the same 4 disks off the mpr controller and
attached them to the motherboard SATA ports, and the problem seems to
disappear. If it is still related to TRIM, I notice that on the mpr
controller the delete method is ATA_TRIM, and when attached to the
motherboard SATA it's DSM_TRIM. Not sure if there is any difference
there? Or is it some other problem? PR time for the mpr driver?
kern.cam.ada.1.trim_ticks: 0
kern.cam.ada.1.trim_goal: 0
kern.cam.ada.1.flags: 0x1be3bde<CAN_48BIT,CAN_FLUSHCACHE,CAN_NCQ,CAN_DMA,WAS_OTAG,CAN_TRIM,OPEN,SCTX_INIT,CAN_POWERMGT,CAN_DMA48,CAN_LOG,CAN_WCACHE,CAN_RAHEAD,PROBED,ANNOUNCED,DIRTY,PIM_ATA_EXT,UNMAPPEDIO>
kern.cam.ada.1.trim_lbas: 6356918872
kern.cam.ada.1.trim_ranges: 171552
kern.cam.ada.1.trim_count: 84205
kern.cam.ada.1.delete_method: DSM_TRIM
kern.cam.da.6.trim_ticks: 0
kern.cam.da.6.trim_goal: 0
kern.cam.da.6.sort_io_queue: 0
kern.cam.da.6.unmapped_io: 1
kern.cam.da.6.rotating: 0
kern.cam.da.6.flags: 0x10ef40<WAS_OTAG,OPEN,SCTX_INIT,CAN_RC16,PROBED,ANNOUCNED,CAN_ATA_DMA,CAN_ATA_LOG,UNMAPPEDIO>
kern.cam.da.6.p_type: 0
kern.cam.da.6.error_inject: 0
kern.cam.da.6.max_seq_zones: 0
kern.cam.da.6.optimal_nonseq_zones: 0
kern.cam.da.6.optimal_seq_zones: 0
kern.cam.da.6.zone_support: None
kern.cam.da.6.zone_mode: Not Zoned
kern.cam.da.6.trim_lbas: 0
kern.cam.da.6.trim_ranges: 0
kern.cam.da.6.trim_count: 0
kern.cam.da.6.minimum_cmd_size: 6
kern.cam.da.6.delete_max: 17179607040
kern.cam.da.6.delete_method: ATA_TRIM
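To compare the delete method per disk without eyeballing the whole
dump, something like the following sketch could help. It only parses
text in the kern.cam.<driver>.<unit>.<name>: <value> shape shown above;
the function names and the SAMPLE string (a few lines copied from the
output above) are made up for illustration, and a value wrapped onto
the next line is glued back onto its key.

```python
import re

# Matches lines such as "kern.cam.da.6.delete_method: ATA_TRIM".
LINE_RE = re.compile(r"^kern\.cam\.(ada|da)\.(\d+)\.(\w+):\s*(.*)$")


def parse_cam_sysctls(text):
    """Return {device: {name: value}} from `sysctl kern.cam` style text."""
    devices = {}
    last = None  # (device, name) of the most recent key, for wrapped values
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m:
            driver, unit, name, value = m.groups()
            dev = driver + unit
            devices.setdefault(dev, {})[name] = value
            last = (dev, name)
        elif last is not None:
            # Not a key line: treat it as the wrapped tail of the last value.
            dev, name = last
            devices[dev][name] += line
    return devices


def delete_methods(text):
    """Return {device: delete_method} for every device found in text."""
    parsed = parse_cam_sysctls(text)
    return {dev: vals.get("delete_method") for dev, vals in parsed.items()}


SAMPLE = """\
kern.cam.ada.1.trim_count: 84205
kern.cam.ada.1.delete_method: DSM_TRIM
kern.cam.da.6.flags:
0x10ef40<WAS_OTAG,OPEN,SCTX_INIT,CAN_RC16,PROBED,ANNOUCNED,CAN_ATA_DMA,CAN_ATA_LOG,UNMAPPEDIO>
kern.cam.da.6.trim_count: 0
kern.cam.da.6.delete_method: ATA_TRIM
"""

if __name__ == "__main__":
    for dev, method in sorted(delete_methods(SAMPLE).items()):
        print(dev, method)
```

On a live system the text could come from saved sysctl kern.cam output
instead of SAMPLE.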
camcontrol identify doesn't show much difference really:
diff -bu wd.mpr wd.ata
--- wd.mpr 2024-03-21 08:27:02.995734000 -0400
+++ wd.ata 2024-03-21 08:21:42.310055000 -0400
@@ -1,5 +1,6 @@
+# camcontrol ide ada1
pass6: <WD Blue SA510 2.5 1000GB 52046100> ACS-4 ATA SATA 3.x device
-pass6: 600.000MB/s transfers, Command Queueing Enabled
+pass6: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
protocol ACS-4 ATA SATA 3.x
device model WD Blue SA510 2.5 1000GB
The controller is:
mprutil show adapter
mpr0 Adapter:
Board Name: INSPUR 3008IT
Board Assembly: INSPUR
Chip Name: LSISAS3008
Chip Revision: ALL
BIOS Revision: 18.00.00.00
Firmware Revision: 16.00.12.00
Integrated RAID: no
SATA NCQ: ENABLED
PCIe Width/Speed: x8 (8.0 GB/sec)
IOC Speed: Full
Temperature: 51 C
PhyNum  CtlrHandle  DevHandle  Disabled  Speed  Min  Max  Device
0       0001        0009       N         6.0    3.0  12   SAS Initiator
1       0001        0009       N         6.0    3.0  12   SAS Initiator
2       0001        0009       N         6.0    3.0  12   SAS Initiator
3       0001        0009       N         6.0    3.0  12   SAS Initiator
4                              N                3.0  12   SAS Initiator
5                              N                3.0  12   SAS Initiator
6                              N                3.0  12   SAS Initiator
7                              N                3.0  12   SAS Initiator