[zfs-discuss] Kickstart hot spare attachment
For my latest test I set up a stripe of two mirrors with one hot spare like so:

    zpool create -f -m /export/zmir zmir mirror c0t0d0 c3t2d0 \
        mirror c3t3d0 c3t4d0 spare c3t1d0

I spun down c3t2d0 and c3t4d0 simultaneously, and while the system kept
running (my tar over NFS barely hiccuped), the zpool command hung again. I
rebooted the machine with -dnq, and although the system didn't come up the
first time, it did after a fsck and a second reboot. However, once again the
hot spare isn't getting used:

    # zpool status -v
      pool: zmir
     state: DEGRADED
    status: One or more devices could not be opened. Sufficient replicas
            exist for the pool to continue functioning in a degraded state.
    action: Attach the missing device and online it using 'zpool online'.
       see: http://www.sun.com/msg/ZFS-8000-D3
     scrub: resilver completed with 0 errors on Tue Dec 12 09:15:49 2006
    config:

            NAME        STATE     READ WRITE CKSUM
            zmir        DEGRADED     0     0     0
              mirror    DEGRADED     0     0     0
                c0t0d0  ONLINE       0     0     0
                c3t2d0  UNAVAIL      0     0     0  cannot open
              mirror    DEGRADED     0     0     0
                c3t3d0  ONLINE       0     0     0
                c3t4d0  UNAVAIL      0     0     0  cannot open
            spares
              c3t1d0    AVAIL

A few questions:

- I know I can attach it via the zpool commands, but is there a way to
  kickstart the attachment process if it fails to attach automatically upon
  disk failure?

- In this instance the spare is twice as big as the other drives -- does
  that make a difference?

- Is there something inherent to an old SCSI bus that causes spun-down
  drives to hang the system in some way, even if it's just hanging the
  zpool/zfs system calls? Would a thumper be more resilient to this?

Jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Kickstart hot spare attachment
On Tue, Dec 12, 2006 at 07:53:32AM -0800, Jim Hranicky wrote:
> - I know I can attach it via the zpool commands, but is there a way to
>   kickstart the attachment process if it fails to attach automatically
>   upon disk failure?

Yep. Just do a 'zpool replace zmir target spare'. This is what the FMA
agent does in response to failed drive faults.

> - In this instance the spare is twice as big as the other drives -- does
>   that make a difference?

Nope. The 'size' of a replacing vdev is the minimum size of its two
children, so it won't affect anything.

> - Is there something inherent to an old SCSI bus that causes spun-down
>   drives to hang the system in some way, even if it's just hanging the
>   zpool/zfs system calls? Would a thumper be more resilient to this?

There are a number of drive failure modes that result in arbitrarily
misbehaving drives, as opposed to drives which fail to open entirely. We
are working on a more complete FMA diagnosis engine which will be able to
diagnose this type of failure and proactively fault the device. I'm not
sure exactly what behavior you're seeing with 'spun-down drives', so this
may or may not address your issue.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
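To make the manual step concrete, here is a small sketch that derives the
'zpool replace' invocations from captured 'zpool status' output. The status
text, pool name (zmir), and spare name (c3t1d0) are copied from the original
post; the awk one-liner itself is just an illustration, not anything the FMA
agent actually runs:

```shell
# Captured 'zpool status' config section from the original post:
status='        NAME        STATE     READ WRITE CKSUM
        zmir        DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            c0t0d0  ONLINE       0     0     0
            c3t2d0  UNAVAIL      0     0     0  cannot open
          mirror    DEGRADED     0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  UNAVAIL      0     0     0  cannot open
        spares
          c3t1d0    AVAIL'

spare=c3t1d0

# For every leaf device whose STATE column reads UNAVAIL, print the
# 'zpool replace' command that would attach the spare in its place:
cmds=$(echo "$status" | awk -v spare="$spare" \
    '$2 == "UNAVAIL" { print "zpool replace zmir " $1 " " spare }')
echo "$cmds"
```

With one spare and two failed disks, of course, only one of the printed
commands can actually succeed; the second failed mirror half stays degraded
until another disk is available.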
Re: [zfs-discuss] Kickstart hot spare attachment
Eric Schrock wrote:
> On Tue, Dec 12, 2006 at 07:53:32AM -0800, Jim Hranicky wrote:
>> - I know I can attach it via the zpool commands, but is there a way to
>>   kickstart the attachment process if it fails to attach automatically
>>   upon disk failure?
>
> Yep. Just do a 'zpool replace zmir target spare'. This is what the FMA
> agent does in response to failed drive faults.

Sure, but that's what I want to avoid. The FMA agent should do this by
itself, but it's not, so I guess I'm just wondering why, or if there's a
good way to get it to do so. If this happens in the middle of the night I
don't want to have to run the commands by hand.

>> - Is there something inherent to an old SCSI bus that causes spun-down
>>   drives to hang the system in some way, even if it's just hanging the
>>   zpool/zfs system calls? Would a thumper be more resilient to this?
>
> There are a number of drive failure modes that result in arbitrarily
> misbehaving drives, as opposed to drives which fail to open entirely. We
> are working on a more complete FMA diagnosis engine which will be able
> to diagnose this type of failure and proactively fault the device. I'm
> not sure exactly what behavior you're seeing with 'spun-down drives', so
> this may or may not address your issue.

For instance, the zpool command hanging, or the system hanging when trying
to reboot normally.

Jim
Re: [zfs-discuss] Kickstart hot spare attachment
On Tue, Dec 12, 2006 at 02:08:57PM -0500, James F. Hranicky wrote:
> Sure, but that's what I want to avoid. The FMA agent should do this by
> itself, but it's not, so I guess I'm just wondering why, or if there's a
> good way to get it to do so. If this happens in the middle of the night
> I don't want to have to run the commands by hand.

Yes, the FMA agent should do this. Can you run 'fmdump -v' and see if the
DE correctly identified the faulted devices?

> For instance, the zpool command hanging, or the system hanging when
> trying to reboot normally.

If the SCSI commands hang forever, then there is nothing that ZFS can do,
as a single write will never return. The more likely case is that the
commands are continually timing out with very long response times, and ZFS
will continue to talk to the drives forever. The future FMA integration I
mentioned will solve this problem. In the meantime, you should be able to
'zpool offline' the affected devices by hand.

There is also associated work going on to better handle asynchronous
response times across devices. Currently, a single slow device will slow
the entire pool to a crawl.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
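Pending the FMA work, the middle-of-the-night case could be papered over
with a cron-driven check along these lines. This is only a sketch: it relies
on 'zpool status -x' printing "all pools are healthy" when nothing is wrong,
and the alert action is left as a placeholder. To keep the sketch runnable
without a real pool, get_status is a stand-in for the actual command:

```shell
# Stand-in for 'zpool status -x'; in real use, replace the body of this
# function with:  /usr/sbin/zpool status -x
get_status() {
    echo "all pools are healthy"
}

out=$(get_status)
if [ "$out" = "all pools are healthy" ]; then
    result="OK"
else
    # Here one could mail the admin, or attempt the manual
    # 'zpool replace' / 'zpool offline' steps discussed above.
    result="ALERT: $out"
fi
echo "$result"
```

Run from cron every few minutes, something like this at least gets a human
paged, even if the spare attach itself still has to be done by hand.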
Re: [zfs-discuss] Kickstart hot spare attachment
Eric Schrock wrote:
> On Tue, Dec 12, 2006 at 02:08:57PM -0500, James F. Hranicky wrote:
>> Sure, but that's what I want to avoid. The FMA agent should do this by
>>   itself, but it's not, so I guess I'm just wondering why, or if
>>   there's a good way to get it to do so. If this happens in the middle
>>   of the night I don't want to have to run the commands by hand.
>
> Yes, the FMA agent should do this. Can you run 'fmdump -v' and see if
> the DE correctly identified the faulted devices?

Here you go:

    # fmdump -v
    TIME                 UUID                                 SUNW-MSG-ID
    Nov 29 16:29:12.1947 e50198f2-2eb9-c58b-d7c5-87aaae5cb935 ZFS-8000-D3
      100%  fault.fs.zfs.device
            Problem in: zfs://pool=8e63f0b8e4263e71/vdev=9272c0973ecdb27c
               Affects: zfs://pool=8e63f0b8e4263e71/vdev=9272c0973ecdb27c
                   FRU: -
    Nov 30 10:31:48.8844 1a44a780-05c0-cb6e-d44f-f1d8999f40e5 ZFS-8000-D3
      100%  fault.fs.zfs.device
            Problem in: zfs://pool=51f1caf6cad1aa2f/vdev=769276842b0efd54
               Affects: zfs://pool=51f1caf6cad1aa2f/vdev=769276842b0efd54
                   FRU: -
    Dec 11 14:04:57.8803 c46d21e0-200d-43a1-e5db-ae9c9ebf3482 ZFS-8000-D3
      100%  fault.fs.zfs.device
            Problem in: zfs://pool=2646e20c1cb0a9d0/vdev=52070de44ec80c15
               Affects: zfs://pool=2646e20c1cb0a9d0/vdev=52070de44ec80c15
                   FRU: -
    Dec 11 14:42:32.1271 1319464e-7a8c-e65b-962e-db386e90f7f2 ZFS-8000-D3
      100%  fault.fs.zfs.device
            Problem in: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745
               Affects: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745
                   FRU: -

I'm not really sure what it means.

>> For instance, the zpool command hanging, or the system hanging when
>> trying to reboot normally.
>
> If the SCSI commands hang forever, then there is nothing that ZFS can
> do, as a single write will never return. The more likely case is that
> the commands are continually timing out with very long response times,
> and ZFS will continue to talk to them forever. The future FMA
> integration I mentioned will solve this problem. In the meantime, you
> should be able to 'zpool offline' the affected devices by hand.

Well, as long as I know which device is affected :-) If zpool status
doesn't return, it may be difficult to figure out. Do you know if the SATA
controllers in a Thumper can better handle this problem?

> There is also associated work going on to better handle asynchronous
> response times across devices. Currently, a single slow device will
> slow the entire pool to a crawl.

Do you have an idea as to when this might be available?

Thanks for all your input,
Jim
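As an aside, the pool= and vdev= values in that fmdump output are ZFS
GUIDs. A minimal sketch for pulling them out of a captured record follows;
the record text is copied from the fmdump output above, and mapping a GUID
back to a device name (e.g. by eyeballing zdb's configuration dump) is a
separate, manual step:

```shell
# One record captured from 'fmdump -v' (taken from the output above):
record='Dec 11 14:42:32.1271 1319464e-7a8c-e65b-962e-db386e90f7f2 ZFS-8000-D3
  100%  fault.fs.zfs.device
        Problem in: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745'

# Extract the hex GUIDs from the "Problem in" line:
pool=$(echo "$record" | grep 'Problem in' | \
    sed 's/.*pool=\([0-9a-f]*\).*/\1/')
vdev=$(echo "$record" | grep 'Problem in' | \
    sed 's/.*vdev=\([0-9a-f]*\).*/\1/')

echo "pool GUID: $pool"
echo "vdev GUID: $vdev"
```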
Re: [zfs-discuss] Kickstart hot spare attachment
On Tue, Dec 12, 2006 at 02:38:22PM -0500, James F. Hranicky wrote:
>     Dec 11 14:42:32.1271 1319464e-7a8c-e65b-962e-db386e90f7f2 ZFS-8000-D3
>       100%  fault.fs.zfs.device
>             Problem in: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745
>                Affects: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745
>                    FRU: -
>
> I'm not really sure what it means.

Hmmm, it means that we correctly noticed that the device had failed, but
for whatever reason the ZFS FMA agent didn't correctly replace the drive. I
am cleaning up the hot spare behavior as we speak, so I will try to
reproduce this.

> Well, as long as I know which device is affected :-) If zpool status
> doesn't return, it may be difficult to figure out. Do you know if the
> SATA controllers in a Thumper can better handle this problem?

I will be starting a variety of experiments in this vein in the near
future. Others may be able to describe their experiences so far. How
exactly did you 'spin down' the drives in question? Is there a particular
failure mode you're interested in?

> Do you have an idea as to when this might be available?

It will be a while before the complete functionality is finished. I have
begun the work, but there are several distinct phases. First, I am cleaning
up the existing hot spare behavior. Second, I'm adding proper hotplug
support to ZFS so that it detects device removal without freaking out and
correctly resilvers/replaces drives when they are plugged back in. Finally,
I'll be adding a ZFS diagnosis engine to both analyze ZFS faults and
consume SMART data to predict disk failure and proactively offline devices.
I would estimate that it will be a few months before I get all of this into
Nevada.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Re: [zfs-discuss] Kickstart hot spare attachment
Eric Schrock wrote:
> Hmmm, it means that we correctly noticed that the device had failed,
> but for whatever reason the ZFS FMA agent didn't correctly replace the
> drive. I am cleaning up the hot spare behavior as we speak, so I will
> try to reproduce this.

Ok, great.

>> Well, as long as I know which device is affected :-) If zpool status
>> doesn't return, it may be difficult to figure out. Do you know if the
>> SATA controllers in a Thumper can better handle this problem?
>
> I will be starting a variety of experiments in this vein in the near
> future. Others may be able to describe their experiences so far. How
> exactly did you 'spin down' the drives in question? Is there a
> particular failure mode you're interested in?

The Andataco cabinet has a button for each disk slot that, if you hold it
down, will spin the drive down so you can pull it out. I'm interested in
any failure mode that might happen to my server :-)

Basically, we're very interested in building a nice ZFS server box that
will house a good chunk of our data, be it homes, research or whatever. I
just have to know the server is as bulletproof as possible; that's why I'm
doing the stress tests.

> It will be a while before the complete functionality is finished. I
> have begun the work, but there are several distinct phases. First, I am
> cleaning up the existing hot spare behavior. Second, I'm adding proper
> hotplug support to ZFS so that it detects device removal without
> freaking out and correctly resilvers/replaces drives when they are
> plugged back in. Finally, I'll be adding a ZFS diagnosis engine to both
> analyze ZFS faults and consume SMART data to predict disk failure and
> proactively offline devices. I would estimate that it will be a few
> months before I get all of this into Nevada.

Ok, thanks.

Jim