On 10/07/2018 09:26, Pawel Jakub Dawidek wrote:
On 7/9/18 23:39, Ken Merry wrote:
Hi ZFS folks,

We (Spectra Logic) have seen some odd behavior with resilvers in RAIDZ3 pools.

The codebase in question is FreeBSD stable/11 from July 2017, at approximately 
FreeBSD SVN version 321310.

We have customer systems with (sometimes) hundreds of SMR drives in RAIDZ3 
vdevs in a large pool.  (A typical arrangement is a 23-drive RAIDZ3, and some 
customers will put everything in one giant pool made up of a number of 23-drive 
RAIDZ3 arrays.)

The SMR drives in question have a bug that sometimes causes them to drop off the 
SAS bus for up to two minutes.  (They’re usually gone for far less than that, 
typically up to 10 seconds.)  Once they come back, zfsd puts the drive back in 
the pool and brings it online.

If a resilver is active on a different drive, once the drive that temporarily 
left comes back, the resilver apparently starts over from the beginning.

This leads to resilvers that take forever to complete, especially on systems 
with high load.
Since resilvering is single-threaded, adding the drive back immediately
doesn't buy you any additional redundancy. Maybe it would make sense for
zfsd to delay reinserting the drive until the ongoing resilver is done?
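
A minimal sketch of that idea as a standalone Python script, not zfsd's actual logic. It assumes `zpool status` prints "resilver in progress" on its scan line while a resilver is running; the function names and the polling interval are hypothetical.

```python
import subprocess
import time


def resilver_in_progress(status_text):
    """Return True if `zpool status` output reports a running resilver."""
    return "resilver in progress" in status_text


def online_when_idle(pool, device, poll_seconds=60,
                     run=subprocess.run, sleep=time.sleep):
    """Wait until no resilver is running on `pool`, then online `device`.

    `run` and `sleep` are injectable so the loop can be exercised
    without a real pool.
    """
    while True:
        status = run(["zpool", "status", pool],
                     capture_output=True, text=True, check=True).stdout
        if not resilver_in_progress(status):
            break
        sleep(poll_seconds)
    # Bringing the device online starts a resilver for it.
    run(["zpool", "online", pool, device], check=True)
```

Calling `online_when_idle("tank", "da5")` (pool and device names made up) would poll once a minute and only bring the drive online after the current resilver has finished.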

A zpool can only have one resilver running at a time, but that resilver covers all the disks that need it. So yes, it starts again when an additional drive returns to the zpool, but it then resilvers them all in parallel.

Doing them serially would take longer, which won't help if your drive failure rate is already bumping up against your drive resilver completion rate. It would also mean that data from the later drive isn't available to reconstruct data on the original resilvering drive.
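
As a rough illustration of that comparison, here is a toy Python model. It assumes, as a simplification and not a measurement, that one resilver pass takes a fixed number of hours no matter how many drives it covers; the function names are made up for this example.

```python
def parallel_restart_hours(pass_hours, return_offsets):
    """Total hours when each returning drive restarts one shared pass.

    return_offsets: for each returning drive, the hours elapsed since the
    previous (re)start of the pass. Each must be shorter than pass_hours,
    otherwise that pass would already have finished.
    """
    for offset in return_offsets:
        if not 0 <= offset < pass_hours:
            raise ValueError("offset must fall inside a running pass")
    # Work done before each restart is lost; the final pass runs to the end.
    return sum(return_offsets) + pass_hours


def serial_hours(pass_hours, return_offsets):
    """Total hours when every drive gets its own full pass, back to back."""
    return pass_hours * (1 + len(return_offsets))


print(parallel_restart_hours(24, [6, 12]))  # 42
print(serial_hours(24, [6, 12]))            # 72
```

With a 24-hour pass and two more drives returning 6 and then 12 hours into the running pass, the shared, restarted resilver finishes after 42 hours, against 72 hours when done serially. Every further dropout still pushes the finish time out, which is the behaviour Ken describes.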

You need to get your drive firmware fixed - it's plain broken.

Andrew

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/T2a7340f4c0c48fa9-Mc07409b864bfd773c0409075