On Tue, Jul 10, 2018 at 23:30:06 +0100, Andrew Gabriel wrote:
> On 10/07/2018 09:26, Pawel Jakub Dawidek wrote:
> > On 7/9/18 23:39, Ken Merry wrote:
> >> Hi ZFS folks,
> >>
> >> We (Spectra Logic) have seen some odd behavior with resilvers in RAIDZ3
> >> pools.
> >>
> >> The codebase in question is FreeBSD stable/11 from July 2017, at
> >> approximately FreeBSD SVN version 321310.
> >>
> >> We have customer systems with (sometimes) hundreds of SMR drives in RAIDZ3
> >> vdevs in a large pool. (A typical arrangement is a 23-drive RAIDZ3, and
> >> some customers will put everything in one giant pool made up of a number
> >> of 23-drive RAIDZ3 arrays.)
> >>
> >> The SMR drives in question have a bug that sometimes causes them to go off
> >> the SAS bus for up to two minutes. (They're usually gone a lot less
> >> than that, up to 10 seconds.) Once they come back online, zfsd puts the
> >> drive back in the pool and makes it online.
> >>
> >> If a resilver is active on a different drive, once the drive that
> >> temporarily left comes back, the resilver apparently starts over from the
> >> beginning.
> >>
> >> This leads to resilvers that take forever to complete, especially on
> >> systems with high load.
> > Since resilver is single threaded, adding the drive immediately doesn't
> > buy you any additional redundancy. Maybe it would make sense for the
> > zfsd to delay reinserting the drive until after ongoing resilver is done?
>
> A zpool can only have one resilver running, but it resilvers all the
> disks requiring resilver together.
> So yes, it starts again when an additional drive returns to the zpool,
> but then does them all in parallel.

Ok. Does it go back to the beginning, or to the lowest common denominator
transaction group in that case?

> Doing them serially would take longer which won't help if your drive
> failure rate is already bumping up against your drive resilver
> completion rate. It would also mean it doesn't have data from the later
> drive available to reconstruct data on the original resilvering drive.

So if you online a drive, can ZFS access data from the onlined drive before
the resilver gets to that point?

> You need to get your drive firmware fixed - it's plain broken.

The vendor released a firmware update that helped a bit, but has refused to
do any more for the drive model in question. They have a newer model of the
same drive (which we're shipping now) that is far less likely (10% of the
rate of the older drive) to exhibit the problem.

But that doesn't fix the customers with the older drive already out in the
field. Yes, I agree, the drive is broken.

Ken
--
Kenneth Merry
[email protected]

------------------------------------------
openzfs: openzfs-developer
Permalink: https://openzfs.topicbox.com/groups/developer/T2a7340f4c0c48fa9-M3b1ff81935b0cb7c4cbceb1f
Delivery options: https://openzfs.topicbox.com/groups
