[zfs-discuss] Two disks giving errors in a raidz pool, advice needed
Hello,

I have looked around this mailing list and other virtual spaces and I wasn't able to find a situation similar to this weird one. I have a 6-disk raidz pool (zfs v15). After a scrub, the status of the pool and all disks still shows up as ONLINE, but two of the disks are starting to give me errors and I do have fatal data corruption. The disks seem to be failing differently:

- disk 2 has 78 (not growing) read errors, 43k (growing) write errors and 3 (not growing) checksum errors.
- disk 5 has 0 read errors, 0 write errors but 7.4k checksum errors (growing).

Data corruption is around 22k files. I plan to replace both disks. Which disk do you think should be replaced first to lose as little data as possible? I was thinking of replacing disk 5 first, as it seems to have a lot of silent data corruption, so maybe it's a bad idea to use its output to replace disk 2.

Also, checksum and read errors on disk 2 do not seem to be growing as I used the pool to back up data (corrupted files could not be accessed, but a lot of files were fine), but write errors are growing extremely fast. So reading uncorrupted data from disk 2 seems to be working, but writing to it seems to be problematic.

Do you guys also think I should change disk 5 first, or am I missing something? I'm not an expert with zfs, so any insight to help me replace those disks without losing too much data would be much appreciated :)

Regards,
Ryan

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
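[Editor's note: the per-disk counters Ryan quotes are the READ/WRITE/CKSUM columns of `zpool status`. A minimal sketch of how to read and watch them; the pool name `tank` is a placeholder.]

```shell
# Pool health, per-device READ/WRITE/CKSUM error counters, and the
# list of files ZFS knows are corrupt (-v). "tank" is a placeholder
# pool name; substitute the real one.
zpool status -v tank

# Poll the per-device lines once a minute to see which counters grow.
# The device-name pattern is illustrative (Solaris cXtYdZ names).
while sleep 60; do
    zpool status tank | grep -E 'c[0-9]+t[0-9]+d[0-9]+'
done
```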
Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed
On Mon, 23 Apr 2012, Manuel Ryan wrote:

> Do you guys also think I should change disk 5 first or am I missing
> something ?

From your description, this sounds like the best course of action, but you should look at your system log files to see what sort of issues are being logged. Also consult the output of 'iostat -xe' to see what low-level errors are being reported.

> I'm not an expert with zfs so any insight to help me replace those
> disks without losing too much data would be much appreciated :)

If this is really raidz1 then more data is definitely at risk if several disks seem to be failing at once.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
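[Editor's note: Bob's two checks, sketched for a Solaris-era system where /var/adm/messages is the usual system log. The grep pattern is illustrative; exact log paths and driver names vary.]

```shell
# Kernel-level per-device error counters: the trailing columns of
# iostat -xe report soft, hard and transport errors per device.
iostat -xe

# Scan the system log for disk/controller complaints.
grep -iE 'scsi|sata|disk|error|retryable|fatal' /var/adm/messages | tail -40
```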
Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed
On 04/23/12 01:47 PM, Manuel Ryan wrote:

> [original message quoted in full; trimmed here]
>
> Do you guys also think I should change disk 5 first or am I missing
> something ?

If it were my data, I'd set the pool read only, backup, rebuild and restore. You do risk further data loss (maybe even pool loss) while the new drive is resilvering.

I would only use raidz for unimportant data, or for a copy of data from a more robust pool.

--
Ian.
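[Editor's note: a sketch of Ian's "set the pool read only, backup" step. Assumptions: the pool is named `tank`, the backup path is illustrative, and the running release supports `zpool import -o readonly=on` (added after pool v15's era; on older releases, `zfs set readonly=on` per filesystem is the fallback).]

```shell
# Snapshot first: a readonly-imported pool cannot create snapshots.
zfs snapshot -r tank@rescue

# Re-import the pool read-only so nothing further can be damaged
# while data is copied off.
zpool export tank
zpool import -o readonly=on tank

# Back up the snapshot stream (destination path is a placeholder).
zfs send -R tank@rescue > /backup/tank-rescue.zfs
```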
Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed
On Mon, Apr 23, 2012 at 02:16:40PM +1200, Ian Collins wrote:

> If it were my data, I'd set the pool read only, backup, rebuild and
> restore. You do risk further data loss (maybe even pool loss) while
> the new drive is resilvering.

You're definitely in a pickle. The first priority is to try and ensure that no further damage is done. Check and make sure you have an ample power supply. Setting the pool readonly would be a good start. Powering down and checking all the connectors and cables would be another.

Write errors are an interesting result. Check the SMART data on that disk - either it is totally out of sectors to reallocate, or it has some kind of interface problem.

If you can, image all the disks elsewhere, with something like ddrescue. Reading sequentially rather than doing random IO through the filesystem can sometimes have better results for marginal disks/sectors. That gives you scratch copies to work on or fall back to, as you try other recovery methods.

zfs v15 is fairly old.. Consider presenting a copy of the pool to a newer Solaris that may have more robust recovery, as one experiment.

I wouldn't zpool replace anything at this point - the moment you do, you throw away any of the good data on that disk, which might help you recover sectors that are bad on other disks. If you have to swap disks, I would try and get as many of the readable sectors copied across to the new disk as possible (ddrescue again) with the pool offline, and then just physically swap disks, so at least the good data remains usable.

Try and get some clarity on what's happening with the hardware at an individual disk level - what reads successfully (at least at the physical layer, below zfs checksums). Try and get at the root cause of the write errors first; they're impeding zfs's recovery of what looks like otherwise-recoverable data.

> I would only use raidz for unimportant data, or for a copy of data
> from a more robust pool.

Well, yeah, but a systemic problem (like bad ram or power or controller) can manifest as a multi-disk failure no matter how many redundant disks.

--
Dan.
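[Editor's note: Dan's imaging suggestion, sketched with GNU ddrescue. Device and output paths are placeholders; the pool should be exported and the disk idle while imaging.]

```shell
# First pass (-n): copy everything that reads cleanly and skip the
# slow scraping of bad areas; the map file records what is missing.
ddrescue -n /dev/rdsk/c0t5d0 /images/disk5.img /images/disk5.map

# Second pass: go back and retry only the bad areas, 3 times each.
ddrescue -r3 /dev/rdsk/c0t5d0 /images/disk5.img /images/disk5.map
```

Working from the image (or a copy of it) means failed experiments cost nothing; the marginal disk is only read once, sequentially.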
Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed
Thank you for this very detailed answer !

After a reboot of the machine, I have no more write errors on disk 2 (only 4 checksum errors, not growing), I was able to access data which I previously couldn't, and now only the checksum errors on disk 5 are growing.

Fortunately, I was able to recover all important data in those conditions (yeah !). Unfortunately, I don't have the spare disks to backup everything or try your idea of copying every disk (very good strategy btw, I hadn't thought about it !). And since I can live with losing the pool now, I'll gamble away and replace drive 5 tomorrow, and if that fails I'll just destroy the pool, replace the 2 physical disks and build a new one (maybe raidz2 this time :))

I'll try to leave all 6 original disks in the machine while replacing, maybe zfs will be smart enough to use the 6 drives to build the replacement disk ?

It's a miracle that zpool still shows disk 5 as ONLINE, here's a SMART dump of disk 5 (1265 Current_Pending_Sector, ouch) :

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   178   173   051    Pre-fail  Always       -       3804
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       1050
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       86
  5 Reallocated_Sector_Ct   0x0033   198   198   140    Pre-fail  Always       -       55
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4606
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       84
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       30
193 Load_Cycle_Count        0x0032   179   179   000    Old_age   Always       -       65652
194 Temperature_Celsius     0x0022   119   109   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   145   145   000    Old_age   Always       -       55
197 Current_Pending_Sector  0x0032   195   195   000    Old_age   Always       -       1265
198 Offline_Uncorrectable   0x0030   200   189   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   001   000    Old_age   Offline      -       1
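[Editor's note: a dump like the one above comes from smartmontools, typically `smartctl -A /dev/rdsk/c0t5d0`. A small sketch of pulling one raw counter out of such output with awk, using a couple of sample lines inlined so it is self-contained.]

```shell
# Two sample lines in smartctl -A layout; the raw value is the last
# whitespace-separated field on each attribute line.
cat > /tmp/smart_sample.txt <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   198   198   140    Pre-fail  Always       -       55
197 Current_Pending_Sector  0x0032   195   195   000    Old_age   Always       -       1265
EOF

# Current_Pending_Sector counts sectors the drive could not read and
# has queued for reallocation on the next write to them.
pending=$(awk '/Current_Pending_Sector/ {print $NF}' /tmp/smart_sample.txt)
echo "pending sectors: $pending"
```

Watching that raw value over time (rather than the normalized VALUE column) shows whether the disk is still deteriorating.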
Re: [zfs-discuss] Two disks giving errors in a raidz pool, advice needed
On Mon, Apr 23, 2012 at 05:48:16AM +0200, Manuel Ryan wrote:

> After a reboot of the machine, I have no more write errors on disk 2
> (only 4 checksum, not growing), I was able to access data which I
> previously couldn't and now only the checksum errors on disk 5 are
> growing.

Well, that's good, but what changed? If it was just a reboot and perhaps a power-cycle of the disks, I don't think you've solved much in the long term..

> Fortunately, I was able to recover all important data in those
> conditions (yeah !),

.. though that's clearly the most important thing!

If you're down to just checksum errors now, then run a scrub and see if they can all be repaired, before replacing the disk. If you haven't been able to get a scrub to complete, then either:

* delete unimportant / rescued data, until none of the problem sectors are referenced any longer, or
* replace the disk like I suggested last time, with a copy under zfs' nose and a switch.

> And since I can live with losing the pool now, I'll gamble away and
> replace drive 5 tomorrow and if that fails I'll just destroy the pool,
> replace the 2 physical disks and build a new one (maybe raidz2 this
> time :))

You know what? If you're prepared to do that in the worst of circumstances, it would be a very good idea to do that under the best of circumstances. If you can, just rebuild it raidz2 and be happier next time something flaky happens with this hardware.

> I'll try to leave all 6 original disks in the machine while replacing,
> maybe zfs will be smart enough to use the 6 drives to build the
> replacement disk ?

I don't think it will.. others who know the code, feel free to comment otherwise. If you've got the physical space for the extra disk, why not keep it there and build the pool raidz2 with the same capacity?

> It's a miracle that zpool still shows disk 5 as ONLINE, here's a SMART
> dump of disk 5 (1265 Current_Pending_Sector, ouch)

That's all indicative of read errors. Note that your reallocated sector count on that disk is still low, so most of those will probably clear when overwritten and given a chance to re-map. If these all appeared suddenly, clearly the disk has developed a problem. Normally, they appear gradually as head sensitivity diminishes.

How often do you normally run a scrub, before this happened? It's possible they were accumulating for a while but went undetected for lack of read attempts to the disk. Scrub more often!

--
Dan.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
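[Editor's note: Dan's closing advice sketched as commands. Pool and device names and the cron schedule are illustrative. With the spare slot Dan mentions, a 7-disk raidz2 keeps the usable capacity of the old 6-disk raidz1 while surviving two disk failures instead of one.]

```shell
# Repair what is repairable, then check which errors remain.
zpool scrub tank
zpool status -v tank

# After backing up: rebuild with double parity using the extra slot
# (7 x raidz2 = 5 disks of data, same as 6 x raidz1).
zpool create tank2 raidz2 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0

# Scrub on a schedule so pending sectors are found early,
# e.g. a weekly crontab entry:
# 0 3 * * 0  /usr/sbin/zpool scrub tank2
```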