Hi,

On Fri, Feb 19, 2021 at 4:45 PM Xavier <t...@lws-hosting.com> wrote:
>
> Package: zfsutils-linux
> Version: 0.8.6-1~bpo10+1
> Severity: important
>
> Dear Maintainer,
>
> The recently added cron "TRIM the first Sunday of every month" makes some SSD
> drives crash.
>
> The problem appears on reasonably busy and otherwise stable servers:
> * with about 100 containers,
> * each on a separate zvol, ext4 mounted with discard option,
> * on a 6 identical drives raidz2.
>
> The issue has been observed on these drives:
> * Micron_5100_MTFDDAK960TCB
> * Samsung_SSD_850_EVO_1TB
> * Samsung_SSD_860_EVO_1TB
>
I'm particularly interested in the protocol used by these disks. I suspect all
of them are SATA 2.x/3.0? Prior to SATA 3.1, the TRIM command is considered
blocking (non-queued), and this might be the root cause of the crashes under
your busy workload. In other words, actively trimming SATA 2.x/3.0 disks could
be harmful to heavy workloads, even when the disks are enterprise-grade with
spare IOPS capacity.

> When affected (it is not always the case), the systems could not complete the
> cancelling of the trim with:
> # zpool trim -c pool
> Testing trim on one drive only, and reducing the rate to as low as 500000,
> did not help.
>
> A reset seems the only solution, followed by a zpool trim -c after reboot.
>
> It would be wise to deactivate that cron by default, or at least to provide
> some kind of convenient way to do so, like an option in /etc/default/zfs.
>

Thanks for this advice. We'll look into how to get something landed for
bullseye and buster-bpo.

Regards,
Aron
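
P.S. In case it helps when collecting the SATA version of each drive, here is
a rough sketch that reads it out of `smartctl -i` output. It assumes smartctl
prints a line of the form "SATA Version is:  SATA 3.1, 6.0 Gb/s ..."; the
function names are made up for illustration and are not part of any package.
A reported version of 3.1 or later only means queued TRIM is possible, not
that the drive actually implements it.

```python
import re

# Assumed format of the relevant smartctl -i line, e.g.
#   "SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)"
_SATA_LINE = re.compile(r"SATA Version is:\s*SATA\s+(\d+)\.(\d+)")


def sata_version(smartctl_info):
    """Return the SATA version as a (major, minor) tuple, or None if absent."""
    match = _SATA_LINE.search(smartctl_info)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def predates_queued_trim(smartctl_info):
    """True if the reported version is older than SATA 3.1.

    Returns None when no version line is found. False does not prove that
    the drive supports queued TRIM, only that the spec revision allows it.
    """
    version = sata_version(smartctl_info)
    if version is None:
        return None
    return version < (3, 1)
```

Feeding it the text of `smartctl -i /dev/sdX` for each pool member should be
enough to tell whether the drives fall into the SATA 2.x/3.0 group.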