On Fri, Aug 13, 2010 at 02:56:49PM -0400, Toby Burress wrote:
> On Fri, Aug 13, 2010 at 02:42:12PM -0400, David Miller wrote:
> > What does your zpool look like? Ideally if you're using RAIDz or RAIDz2
> > then you should be using multiple RAIDz sets in the pool. This way IO is
> > striped across the RAIDz sets and any degradation, and recovery, should
> > only involve the smaller RAIDz set. Which should be relatively quick
> > depending on the size and type of drives involved.
>
> The server that is taking a billion years to resilver does in fact have
> 15 disks in one big raidz2 pool. The other server has a single pool of
> three raidz2 arrays of 8 disks each, so hopefully that will yield better
> recoveries. Although if the bottleneck is reads, then wouldn't it be
> faster to read from 14 disks than 7? And if the bottleneck is just
> writes, then wow, I need to buy some different disks next time.
>
> Since the load on the machine is 3, and it's doing nothing but
> resilvering, I suspect the bottleneck is actually the CPU. I don't
> know a ton about the implementation of ZFS, but I do know it checksums
> every block. It would be insane for it not to verify those checksums
> while resilvering, and perhaps it even recomputes them while writing them
> to the new disk.
Did you lose 1 or 2 disks in the raidz2 pool? I'm not current on how ZFS
does reconstruction, but generally to recover the data of a lost disk in a
RAID5/6 set you have to read all of the other disks and then compute the
missing data. That's why a 15-disk-wide stripe can be worse than a smaller
stripe set...

-j
_______________________________________________
bblisa mailing list
[email protected]
http://www.bblisa.org/mailman/listinfo/bblisa
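
To make the "read all of the other disks and then compute the missing
data" point concrete, here is a minimal sketch of single-parity (RAID5
style) reconstruction using XOR. It is an illustration only, not how ZFS
implements raidz: the helper names and the 4-disk stripe are made up for
the example, and raidz2/RAID6 add a second parity that needs different
math. The thing to notice is that the rebuild has to touch every
surviving member of the stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Rebuild the one missing block of a single-parity stripe.

    Every surviving block (data and parity) must be read, so the
    amount of reading grows with the width of the stripe.
    """
    return xor_blocks(surviving_blocks)


# Hypothetical 4-disk stripe: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed.
survivors = [data[0], data[2], parity]
recovered = rebuild_missing(survivors)
```

With a 15-disk stripe the `survivors` list would hold 14 blocks for every
stripe being rebuilt, versus 7 for an 8-disk set.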
