On Tue, Jul 10, 2018 at 23:30:06 +0100, Andrew Gabriel wrote:
> On 10/07/2018 09:26, Pawel Jakub Dawidek wrote:
> > On 7/9/18 23:39, Ken Merry wrote:
> >> Hi ZFS folks,
> >>
> >> We (Spectra Logic) have seen some odd behavior with resilvers in RAIDZ3 
> >> pools.
> >>
> >> The codebase in question is FreeBSD stable/11 from July 2017, at 
> >> approximately FreeBSD SVN version 321310.
> >>
> >> We have customer systems with (sometimes) hundreds of SMR drives in RAIDZ3 
> >> vdevs in a large pool.  (A typical arrangement is a 23-drive RAIDZ3, and 
> >> some customers will put everything in one giant pool made up of a number 
> >> of 23-drive RAIDZ3 arrays.)
> >>
> >> The SMR drives in question have a bug that sometimes causes them to go off 
> >> the SAS bus for up to two minutes.  (They're usually gone a lot less 
> >> than that, up to 10 seconds.)  Once they come back online, zfsd puts the 
> >> drive back in the pool and makes it online.
> >>
> >> If a resilver is active on a different drive, once the drive that 
> >> temporarily left comes back, the resilver apparently starts over from the 
> >> beginning.
> >>
> >> This leads to resilvers that take forever to complete, especially on 
> >> systems with high load.
> > Since resilver is single threaded, adding the drive immediately doesn't
> > buy you any additional redundancy. Maybe it would make sense for the
> > zfsd to delay reinserting the drive until after ongoing resilver is done?
> 
> A zpool can only have one resilver running, but it resilvers all the 
> disks requiring resilver together.
> So yes, it starts again when an additional drive returns to the zpool, 
> but then does them all in parallel.

Ok.  Does it go back to the beginning, or to the lowest common denominator
transaction group in that case?
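
To make the question concrete, here is a rough toy model of the two
possibilities.  It is not ZFS code and says nothing about how the scan
is actually ordered; the function name, the time units, and the "lose
one unit of progress" rewind are all made up for illustration.

```python
def resilver_finish_time(work, drive_returns, rewind_to):
    """Toy model only, not ZFS behavior.

    work          -- time units one uninterrupted resilver pass needs
    drive_returns -- sorted times at which a dropped drive comes back
    rewind_to     -- hypothetical policy: maps the progress made so far
                     to the progress kept after a drive returns
    """
    now = 0.0
    progress = 0.0
    for t in drive_returns:
        if now + (work - progress) <= t:
            break  # the pass finished before this drive came back
        progress += t - now        # work done up to the interruption
        now = t
        progress = rewind_to(progress)
    return now + (work - progress)


# A 10-unit resilver with a drive returning at t = 4, 8 and 12.
returns = [4, 8, 12]

# Policy A: every return sends the resilver back to the beginning.
print(resilver_finish_time(10, returns, lambda p: 0.0))              # 22.0

# Policy B (assumed): a return only costs one unit of progress.
print(resilver_finish_time(10, returns, lambda p: max(0.0, p - 1)))  # 12.0
```

Under policy A, if the drives drop out more often than once per full
pass, the resilver never finishes, which matches what we see on busy
systems.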

> Doing them serially would take longer which won't help if your drive 
> failure rate is already bumping up against your drive resilver 
> completion rate. It would also mean it doesn't have data from the later 
> drive available to reconstruct data on the original resilvering drive.

So if you online a drive, can ZFS access data from the onlined drive before
the resilver gets to that point?

> You need to get your drive firmware fixed - it's plain broken.

The vendor released a firmware update that helped a bit, but has refused
to do any more for the drive model in question.  They have a newer model
of the same drive (which we're shipping now) that is far less likely (10%
of the rate of the older drive) to exhibit the problem.  But that doesn't
help customers who already have the older drive out in the field.

Yes, I agree, the drive is broken. 

Ken
-- 
Kenneth Merry
[email protected]

------------------------------------------
openzfs: openzfs-developer
Permalink: 
https://openzfs.topicbox.com/groups/developer/T2a7340f4c0c48fa9-M3b1ff81935b0cb7c4cbceb1f