I don't know enough about ZFS internals to say how resilvers work, but I have
some personal experience relevant to this issue.  Due to a power supply
problem I was having drives go offline.  A drive would come right back when
onlined and start to resilver.  By the time I found out what the problem
was and fixed it, I had three drives of a raidz2 vdev all resilvering at
once.  I thought this was a little much, so I killed the processes using this
pool and waited until the resilvers completed before restarting them.  No
data loss during the resilvers or a subsequent scrub.

On Wed, Jul 11, 2018 at 10:16 AM, Kenneth D. Merry <[email protected]> wrote:

> On Tue, Jul 10, 2018 at 23:30:06 +0100, Andrew Gabriel wrote:
> > On 10/07/2018 09:26, Pawel Jakub Dawidek wrote:
> > > On 7/9/18 23:39, Ken Merry wrote:
> > >> Hi ZFS folks,
> > >>
> > >> We (Spectra Logic) have seen some odd behavior with resilvers in
> RAIDZ3 pools.
> > >>
> > >> The codebase in question is FreeBSD stable/11 from July 2017, at
> approximately FreeBSD SVN version 321310.
> > >>
> > >> We have customer systems with (sometimes) hundreds of SMR drives in
> RAIDZ3 vdevs in a large pool.  (A typical arrangement is a 23-drive RAIDZ3,
> and some customers will put everything in one giant pool made up of a
> number of 23-drive RAIDZ3 arrays.)
> > >>
> > >> The SMR drives in question have a bug that sometimes causes them to
> go off the SAS bus for up to two minutes.  (They're usually gone a lot
> less than that, up to 10 seconds.)  Once they come back online, zfsd puts
> the drive back in the pool and makes it online.
> > >>
> > >> If a resilver is active on a different drive, once the drive that
> temporarily left comes back, the resilver apparently starts over from the
> beginning.
> > >>
> > >> This leads to resilvers that take forever to complete, especially on
> systems with high load.
> > > Since resilver is single threaded, adding the drive immediately doesn't
> > > buy you any additional redundancy. Maybe it would make sense for the
> > > zfsd to delay reinserting the drive until after the ongoing resilver is
> > > done?
> >
> > A zpool can only have one resilver running, but it resilvers all the
> > disks requiring resilver together.
> > So yes, it starts again when an additional drive returns to the zpool,
> > but then does them all in parallel.
>
> Ok.  Does it go back to the beginning, or to the lowest common denominator
> transaction group in that case?
>
> > Doing them serially would take longer which won't help if your drive
> > failure rate is already bumping up against your drive resilver
> > completion rate. It would also mean it doesn't have data from the later
> > drive available to reconstruct data on the original resilvering drive.
>
> So if you online a drive, can ZFS access data from the onlined drive before
> the resilver gets to that point?
>
> > You need to get your drive firmware fixed - it's plain broken.
> 
> The vendor released a firmware update that helped a bit, but has refused
> to do any more for the drive model in question.  They have a newer model
> of the same drive (which we're shipping now) that is far less likely (10%
> of the rate of the older drive) to exhibit the problem.  But that doesn't
> fix the customers with the older drive already out in the field.
> 
> Yes, I agree, the drive is broken.
> 
> Ken
> --
> Kenneth Merry
> [email protected]

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/T2a7340f4c0c48fa9-Mec2d71febcd73c25296b37d8