Re: [zfs-discuss] ZFS, power failures, and UPSes (and ZFS recovery guide links)
Ian Collins wrote:
> David Magda wrote:
>> On Jun 30, 2009, at 14:08, Bob Friesenhahn wrote:
>>> I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.
>>
>> Can't remember where I read the claim, but supposedly if power isn't restored within about ten minutes, then it will probably be out for a few hours. If this 'statistic' is true, it would mean that your UPS should last (say) fifteen minutes, and after that you really need a generator.
>
> Or run your systems off DC and get as much backup as you have room (and budget!) for batteries. I once visited a central exchange with 48 hours of battery capacity...

The way Google handles UPSes is to have a small 12V battery integrated with each PC power supply. When the machine is on, the battery has its charge maintained. This is not unlike a laptop in that it has a built-in battery backup, but it uses an inexpensive sealed lead-acid battery instead of lithium-ion. Here is info along with photos of the Google server internals:
http://news.cnet.com/8301-1001_3-10209580-92.html
http://willysr.blogspot.com/2009/04/googles-server-design.html

(IIRC there have been power supply UPSes with an internal battery since at least the late 1980s. Either that or they were UPSes that fit inside the standard PC (AT) compatible desktop case, making the power protection system entirely internal to the computer. I think I saw these models one time while browsing late 1980s or early 1990s issues of PC Magazine that reviewed UPSes. They still exist... one company selling them is http://www.globtek.com/html/ups.html . A Google search for 'power supply built in UPS' would likely find more.)

I also did additional searches in the zfs-discuss archives and found a thread from mid-February, which led me to other threads.

It looks like there are still scattered instances where ZFS has not recovered gracefully from power failures or other failures, and where it became necessary to perform a manual transaction group (txg) rollback. Here is a consolidated list of links related to manual uberblock transaction group (txg) rollback and similar ZFS data recovery guides, including undeleting:

Section 1: Nathan Hand's guide and related thread

Nathan Hand's guide to invalidating uberblocks (Dec 2008 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=85794
or
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg22153.html

Section 2: Victor Latushkin's guide and related threads

Thread: zpool unimportable (corrupt zpool metadata??) but no zdb -l device problems (Oct 2008 to Feb 2009 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=76960
or
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg19839.html

Repair report: Re: Solved - a big THANKS to Victor Latushkin @ Sun / Moscow
http://www.opensolaris.org/jive/message.jspa?messageID=289537#289537

Some recovery discussion by Victor: zdb -bv alone took several hours to walk the block tree
http://www.opensolaris.org/jive/message.jspa?messageID=292991#292991
or
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/022365.html
or
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg20095.html

Victor Latushkin's guide: Thanks to the COW nature of ZFS it was possible to successfully recover a pool state which was only 5 seconds older than the last unopenable one.
http://mail.opensolaris.org/pipermail/zfs-discuss/2008-October/022331.html
or
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg20061.html

Section 3: Reliability debates, recovery tool planning, uberblock info

Thread: Availability: ZFS needs to handle disk removal / driver failure better (August 2008 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=70811
or
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg19057.html

Thread: ZFS: unreliable for professional usage? (Feb 2009 thread)
http://www.opensolaris.org/jive/thread.jspa?threadID=91426
or
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg23833.html

Richard Elling's post explaining that uberblocks are kept in a 128-entry circular queue which is 4x redundant, with 2 copies each at the beginning and end of the vdev. Other metadata, by default, is 2x redundant and spatially diverse.
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg24145.html

Jeff Bonwick's post about Bug ID 6667683
http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg23961.html

Bug ID 6667683: need a way to rollback to an uberblock from a previous txg
Description: If we are unable to open the pool based on the most recent uberblock then it might be useful to try an older txg uberblock as it might provide a better view of the world. Having a utility to reset the uberblock to a previous txg might provide a nice recovery mechanism.
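To make the idea behind Bug ID 6667683 concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not ZFS code: the `Uberblock` class, its `valid` flag (a stand-in for "the checksum verifies and the pool opens from this uberblock"), the slot choice `txg % RING_SIZE`, and the `max_rollback` parameter are all hypothetical. Only the 128-entry ring size comes from Richard Elling's post.

```python
from dataclasses import dataclass
from typing import List, Optional

RING_SIZE = 128  # ring size as described in Richard Elling's post


@dataclass
class Uberblock:
    txg: int     # transaction group this uberblock commits
    valid: bool  # toy stand-in: True if the pool can be opened from it


def write_uberblock(ring: List[Optional[Uberblock]], ub: Uberblock) -> None:
    """Store an uberblock in the ring, overwriting the oldest slot (toy assumption)."""
    ring[ub.txg % RING_SIZE] = ub


def pick_uberblock(ring: List[Optional[Uberblock]],
                   max_rollback: int = 0) -> Optional[Uberblock]:
    """Return the newest usable uberblock, stepping back at most max_rollback entries.

    max_rollback=0 models "only ever try the most recent uberblock";
    a larger value models the rollback utility the bug report asks for.
    """
    newest_first = sorted((ub for ub in ring if ub is not None),
                          key=lambda ub: ub.txg, reverse=True)
    for ub in newest_first[:max_rollback + 1]:
        if ub.valid:
            return ub
    return None  # nothing usable within the allowed rollback distance


if __name__ == "__main__":
    ring: List[Optional[Uberblock]] = [None] * RING_SIZE
    for txg in range(100, 110):
        # Pretend the power failed while txg 109 was being written out.
        write_uberblock(ring, Uberblock(txg=txg, valid=(txg != 109)))
    print(pick_uberblock(ring))                  # newest only
    print(pick_uberblock(ring, max_rollback=1))  # allowed to step back one txg
```

In this model, stepping back one entry trades the last few seconds of writes for a pool that opens again, which is the trade described in Victor Latushkin's recovery report.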
Re: [zfs-discuss] ZFS, power failures, and UPSes
On Thu, 2 Jul 2009, Ian Collins wrote:
> 5+ is typical for telco use.

Aah, but we start getting into rooms full of giant 2V wet lead acid cells and giant busbars the size of railway tracks.

--
Andre van Eyssen.
mail: an...@purplecow.org jabber: an...@interact.purplecow.org
purplecow.org: UNIX for the masses http://www2.purplecow.org
purplecow.org: PCOWpix http://pix.purplecow.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS, power failures, and UPSes
Hello,

I've looked around Google and the zfs-discuss archives but have not been able to find a good answer to this question (and the related questions that follow it):

How well does ZFS handle unexpected power failures (e.g. environmental power failures, power supply dying, etc.)? Does it consistently recover gracefully?

Should having a UPS be considered a (strong) recommendation or a "don't even think about running without it" item?

Are there any communications/interfacing caveats to be aware of when choosing the UPS?

In this particular case, we're talking about a home file server running OpenSolaris 2009.06. Actual environment power failures are generally 1 per year. I know there are a few blog articles about this type of application, but I don't recall seeing any (or any detailed) discussion about power failures and UPSes as they relate to ZFS. I did see that the ZFS Evil Tuning Guide says cache flushes are done every 5 seconds.

Here is one post that didn't get any replies about a year ago after someone had a power failure, then UPS battery failure, while copying data to a ZFS pool:
http://lists.macosforge.org/pipermail/zfs-discuss/2008-July/000670.html

Both theoretical answers and real life experiences would be appreciated, as the former tells me where ZFS is needed while the latter tells me where it has been or is now.

Thanks,
-hk
Re: [zfs-discuss] ZFS, power failures, and UPSes
I've seen enough people suffer from corrupted pools that a UPS is definitely good advice. However, I'm running a (very low usage) ZFS server at home and it's suffered through at least half a dozen power outages without any problems at all. I do plan to buy a UPS as soon as I can, but it seems to be surviving very well so far.

--
This message posted from opensolaris.org
Re: [zfs-discuss] ZFS, power failures, and UPSes
A related question: If you are on a UPS, is it OK to disable ZIL?

The evil tuning guide says "The ZIL is an essential part of ZFS and should never be disabled." However, if you have a UPS, what can go wrong that really requires ZIL?

Opinions?

Monish

----- Original Message -----
From: Ross no-re...@opensolaris.org
To: zfs-discuss@opensolaris.org
Sent: Tuesday, June 30, 2009 3:04 PM
Subject: Re: [zfs-discuss] ZFS, power failures, and UPSes

> I've seen enough people suffer from corrupted pools that a UPS is definitely good advice. However, I'm running a (very low usage) ZFS server at home and it's suffered through at least half a dozen power outages without any problems at all. I do plan to buy a UPS as soon as I can, but it seems to be surviving very well so far.
Re: [zfs-discuss] ZFS, power failures, and UPSes
Haudy Kazemi wrote:
> Hello, I've looked around Google and the zfs-discuss archives but have not been able to find a good answer to this question (and the related questions that follow it): How well does ZFS handle unexpected power failures? (e.g. environmental power failures, power supply dying, etc.) Does it consistently gracefully recover?

Mostly. Unless you are unlucky. Backups are your friend in *any* environment though.

> Should having a UPS be considered a (strong) recommendation or a "don't even think about running without it" item?

There has been quite an interesting thread on this over the last few months. I won't repeat my comments, but they are there for digital posterity in the zfs-discuss archives. Certainly in a large environment with a lot of data being written, one should consider this a mandatory requirement if you care about your data, particularly if there are many links in your storage chain that can cause data corruption due to power failure.

> Are there any communications/interfacing caveats to be aware of when choosing the UPS? In this particular case, we're talking about a home file server running OpenSolaris 2009.06.

As far as a home server goes, particularly if it is not write intensive, you will 'most likely' be fine. I have a home one with a v120 running S10 u6 with a D1000 and 7 x 300 GB SCSI disks in a RAIDZ2 that has seen numerous power interruptions with no faults. This machine is a Samba server for my Macs and printing business. I also have a mail / web server on another v120 which experiences the same power faults and regularly bounces back without issues. But your mileage may vary. It all depends on how much you care about the data, really.

I haven't used OpenSolaris specifically, however, as I prefer the generally better supported S10 releases. (Yes, I know you can get support for OS, but I tend to be conservative and standardize as much as possible. I do have millions of files stored on ZFS volumes for our Uni and I sleep well ;))

> Actual environment power failures are generally 1 per year. I know there are a few blog articles about this type of application, but I don't recall seeing any (or any detailed) discussion about power failures and UPSes as they relate to ZFS. I did see that the ZFS Evil Tuning Guide says cache flushes are done every 5 seconds.

The flush time you mention is based on older versions of ZFS. Newer ones can have a flush time as long as 30 seconds, I believe.

> Here is one post that didn't get any replies about a year ago after someone had a power failure, then UPS battery failure while copying data to a ZFS pool: http://lists.macosforge.org/pipermail/zfs-discuss/2008-July/000670.html
> Both theoretical answers and real life experiences would be appreciated as the former tells me where ZFS is needed while the latter tells me where it has been or is now.
> Thanks, -hk
Re: [zfs-discuss] ZFS, power failures, and UPSes
On Tue, 30 Jun 2009, Monish Shah wrote:
> The evil tuning guide says "The ZIL is an essential part of ZFS and should never be disabled." However, if you have a UPS, what can go wrong that really requires ZIL?

Without addressing a single ZFS-specific issue:

* panics
* crashes
* hardware failures
  - dead RAM
  - dead CPU
  - dead systemboard
  - dead something else
* natural disasters
* UPS failure
* UPS failure (must be said twice)
* Human error (what does this button do?)
* Cabling problems (say, where did my disks go?)
* Malicious actions (Fired? Let me turn their power off!)

That's just a warm-up; I'm sure people can add both the ZFS-specific reasons and also the fallacy that a UPS does anything more than mitigate one particular single point of failure.

Don't forget to buy two UPSes and split your machine across both. And don't forget to actually maintain the UPS. And check the batteries. And schedule a load test.

The single best way to learn about the joys of UPS behaviour is to sit down and have a drink with a facilities manager who has been doing the job for at least ten years. At least you'll hear some funny stories about the day a loose screw on one floor took out a house UPS and 100+ hosts and NEs with it.

Andre.

--
Andre van Eyssen.
mail: an...@purplecow.org jabber: an...@interact.purplecow.org
purplecow.org: UNIX for the masses http://www2.purplecow.org
purplecow.org: PCOWpix http://pix.purplecow.org
Re: [zfs-discuss] ZFS, power failures, and UPSes
Monish Shah wrote:
> A related question: If you are on a UPS, is it OK to disable ZIL? The evil tuning guide says "The ZIL is an essential part of ZFS and should never be disabled." However, if you have a UPS, what can go wrong that really requires ZIL?

The UPS.

> Opinions?
>
> Monish
>
> ----- Original Message -----
> From: Ross no-re...@opensolaris.org
> To: zfs-discuss@opensolaris.org
> Sent: Tuesday, June 30, 2009 3:04 PM
> Subject: Re: [zfs-discuss] ZFS, power failures, and UPSes
>
>> I've seen enough people suffer from corrupted pools that a UPS is definitely good advice. However, I'm running a (very low usage) ZFS server at home and it's suffered through at least half a dozen power outages without any problems at all. I do plan to buy a UPS as soon as I can, but it seems to be surviving very well so far.

--
Dr Doug Baker
Sun Microsystems Systems Support Engineer.
UK Mission Critical Solution Centre.
Tel : 0870 600 3222
Re: [zfs-discuss] ZFS, power failures, and UPSes
On 06/30/09 03:00 AM, Andre van Eyssen wrote:
> On Tue, 30 Jun 2009, Monish Shah wrote:
>> The evil tuning guide says "The ZIL is an essential part of ZFS and should never be disabled." However, if you have a UPS, what can go wrong that really requires ZIL?
>
> Without addressing a single ZFS-specific issue:
>
> * panics
> * crashes
> * hardware failures
>   - dead RAM
>   - dead CPU
>   - dead systemboard
>   - dead something else
> * natural disasters
> * UPS failure
> * UPS failure (must be said twice)
> * Human error (what does this button do?)
> * Cabling problems (say, where did my disks go?)
> * Malicious actions (Fired? Let me turn their power off!)
>
> That's just a warm-up; I'm sure people can add both the ZFS-specific reasons and also the fallacy that a UPS does anything more than mitigate one particular single point of failure.

Actually, they do quite a bit more than that. They create jobs, generate revenue for battery manufacturers, and for techs that change batteries and do PM maintenance on the large units. Let's not forget that they add significant revenue to the transportation industry, given their weight for shipping.

In the last 28 years of doing this stuff, I've found a few times that the UPS has actually worked and lasted as long as the outage. Many other times, the unit has failed (circuits), or the batteries are beyond their service life. But really, something approaching 40% of the time they actually work out OK. So they also create repair and recycling jobs. :-)

> Don't forget to buy two UPSes and split your machine across both. And don't forget to actually maintain the UPS. And check the batteries. And schedule a load test.
>
> The single best way to learn about the joys of UPS behaviour is to sit down and have a drink with a facilities manager who has been doing the job for at least ten years. At least you'll hear some funny stories about the day a loose screw on one floor took out a house UPS and 100+ hosts and NEs with it.
>
> Andre.
Re: [zfs-discuss] ZFS, power failures, and UPSes
On Tue, 30 Jun 2009, Neal Pollack wrote:
> Actually, they do quite a bit more than that. They create jobs, generate revenue for battery manufacturers, and tech's that change batteries and do PM maintenance on the large units. Let's not

It sounds like this is a responsibility which should be moved to the US federal government since UPSs create jobs.

> In the last 28 years of doing this stuff, I've found a few times that the UPS has actually worked and lasted as long as the outage.

I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] ZFS, power failures, and UPSes
Bob Friesenhahn wrote:
> On Tue, 30 Jun 2009, Neal Pollack wrote:
>> Actually, they do quite a bit more than that. They create jobs, generate revenue for battery manufacturers, and tech's that change batteries and do PM maintenance on the large units. Let's not
>
> It sounds like this is a responsibility which should be moved to the US federal government since UPSs create jobs.

Actually, I think UPS already employs some 410,000+ people, making it the 3rd largest private employer in the USA. (5th overall, if you include the Federal Gov't and the US Postal Service). <wink>

>> In the last 28 years of doing this stuff, I've found a few times that the UPS has actually worked and lasted as long as the outage.
>
> I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.
>
> Bob

As someone who has spent enough time doing data center work, I can attest to the fact that UPSes are really useful only as extremely-short-interval solutions. A dozen or so minutes, at best.

The best design I've seen was for an old BBN (hey, remember them!) site just outside of Cambridge, MA. It took in utility power, ran it through a conditioner setup, and then through this nice switch thing. The switch took three inputs: utility, a local diesel generator, and a line of marine batteries. The switch itself was internally redundant (which isn't hard to do, it's '50s tech), so you could draw power from any (or even all 3 at once). Nothing really fancy; it was simple, with no semiconductor stuff to fail - just all '50s-ish hardwired circuitry. I don't even think there was a transistor in the whole shebang. Lots of capacitors, though. :-)

The gist of the whole thing was that if utility power was out more than 5 minutes, there was no good predictor of how long it would remain out. I saw a nice little graph that showed no real good prediction of outage time based on existing outage length (i.e. if the power has been out X minutes, you can expect it to be restored in Y minutes...). I suspect it was something like 20 years of accumulated data or so...

The end of this is simple: UPSes should give you enough time to start the gen-pack. If you are having problems with your gen-pack, you'll never have enough UPS time to fix it (and it's not cost-effective to try to make it so), so FIX YOUR GEN PACK BEFORE the outage. Which means - TEST it, and TEST it, and TEST it again!

For home use, I set my UPS to immediately shut down anything attached to it for /any/ service outage. Large enough batteries to handle anything more than a couple of minutes are frankly a fire-hazard for the home, not to mention a maintenance PITA.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Re: [zfs-discuss] ZFS, power failures, and UPSes
On Tue, Jun 30, 2009 at 1:36 PM, Erik Trimble erik.trim...@sun.com wrote:
> Bob Friesenhahn wrote:
>> On Tue, 30 Jun 2009, Neal Pollack wrote:
>>> Actually, they do quite a bit more than that. They create jobs, generate revenue for battery manufacturers, and tech's that change batteries and do PM maintenance on the large units. Let's not
>>
>> It sounds like this is a responsibility which should be moved to the US federal government since UPSs create jobs.
>
> Actually, I think UPS already employs some 410,000+ people, making it the 3rd largest private employer in the USA. (5th overall, if you include the Federal Gov't and the US Postal Service). <wink>
>
>>> In the last 28 years of doing this stuff, I've found a few times that the UPS has actually worked and lasted as long as the outage.
>>
>> I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.
>>
>> Bob
>
> As someone who has spent enough time doing data center work, I can attest to the fact that UPSes are really useful only as extremely-short-interval solutions. A dozen or so minutes, at best.
>
> The best design I've seen was for an old BBN (hey, remember them!) site just outside of Cambridge, MA. It took in utility power, ran it through a conditioner setup, and then through this nice switch thing. The switch took three inputs: utility, a local diesel generator, and a line of marine batteries. The switch itself was internally redundant (which isn't hard to do, it's '50s tech), so you could draw power from any (or even all 3 at once). Nothing really fancy; it was simple, with no semiconductor stuff to fail - just all '50s-ish hardwired circuitry. I don't even think there was a transistor in the whole shebang. Lots of capacitors, though. :-)
>
> The gist of the whole thing was that if utility power was out more than 5 minutes, there was no good predictor of how long it would remain out. I saw a nice little graph that showed no real good prediction of outage time based on existing outage length (i.e. if the power has been out X minutes, you can expect it to be restored in Y minutes...). I suspect it was something like 20 years of accumulated data or so...
>
> The end of this is simple: UPSes should give you enough time to start the gen-pack. If you are having problems with your gen-pack, you'll never have enough UPS time to fix it (and it's not cost-effective to try to make it so), so FIX YOUR GEN PACK BEFORE the outage. Which means - TEST it, and TEST it, and TEST it again!

Slight corollary -- just because you have a generator and test it doesn't mean you can assume you can get fuel in a timely manner (so still be prepared to shut down if needed). I have seen places whose DR plans completely rely on the assumption that there will never be any problems refueling their generators. However, last year after Ike hit, one of AT&T's central offices lost power because it ran out of fuel (and couldn't get refilled in time).

> For home use, I set my UPS to immediately shut down anything attached to it for /any/ service outage. Large enough batteries to handle anything more than a couple of minutes are frankly a fire-hazard for the home, not to mention a maintenance PITA.
>
> --
> Erik Trimble
> Java System Support
> Mailstop: usca22-123
> Phone: x17195
> Santa Clara, CA
Re: [zfs-discuss] ZFS, power failures, and UPSes
ms == Monish Shah mon...@indranetworks.com writes:
sl == Scott Lawson scott.law...@manukau.ac.nz writes:
np == Neal Pollack neal.poll...@sun.com writes:

ms> If you are on a UPS, is it OK to disable ZIL?

sl> I have seen numerous UPS' failures over the years,

yeah at my place in NYC we've had more problems with the UPS than with the service. At the very least a UPS needs to switch off for new batteries every two years, and the raw service does not go out that often for me. It starts to make more sense to use a UPS if you have dual power supplies, dual UPS's, bypass switches. Or crappy aboveground power.

anyway, typical machines panic because of bugs a lot more often than either UPS or line problems.

**BUT THIS IS ALL BESIDE THE POINT**!

The ZIL is for implementing fsync() for databases and also the part of NFS that allows servers to reboot without client data loss. It has *NOTHING TO DO* with losing your entire pool. Disabling the ZIL does not make catastrophic pool loss more likely, not even a little bit!

Unfortunately some software developer decided to write a bunch of DIRE WARNINGS to SCARE PEOPLE INTO ASSUMPTIONS leading them to use the maximum amount of code of which said developer is justly proud, regardless of whether they're using it for the right reason or not.

oddly, I don't think disabling ZIL will make catastrophic loss more likely for databases running above the ZFS, either, because unlike non-COW filesystems ZFS never recovers to a state where writes appear to have happened out-of-order prior to the crash. Yes, disabling the ZIL could break the 'D' in ACID for databases running above that ZFS, but in a way that rolls them back in time, not makes them become corrupt. Running without ZIL is as if a snapshot were taken at each TXG commit time, and on reboot after a crash you recover to the most recent TXG-snapshot that fully committed, thus databases will be ``crash-consistent'' even without the ZIL, unless I'm mistaken.

Adding an SSD *does* make catastrophic pool loss more likely, because if you break the SSD and then export the pool, you can never import it again. so, adding an SSD for the ZIL as a suggestive good-little-boy alternative to disabling the ZIL makes catastrophic loss of the entire pool more likely, not less.

The advantage of rolling with ZIL is, if you're using NFS you should be able to crash and reboot the server without the clients noticing. Also MTA's that accept messages, databases that confirm orders and bookings, won't lose anything they've accepted or confirmed in the crash (if everything else works).

I wish ZIL could be enabled and disabled per filesystem instead of per kernel.
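The "rolls back in time, not corrupt" argument above can be illustrated with a small toy model in Python. This is only a sketch of the claim as stated in the message, not of ZFS internals: the `ToyPool` class, its method names, and the record names `w1`..`w3` are hypothetical, and the model only covers synchronous writes (the ones a database issues with fsync()).

```python
class ToyPool:
    """Toy model: writes reach the main pool only as whole transaction groups."""

    def __init__(self, zil_enabled: bool):
        self.zil_enabled = zil_enabled
        self.on_disk = []     # state as of the last fully committed txg
        self.pending = []     # writes accepted since then, held only in RAM
        self.intent_log = []  # stand-in for the ZIL: sync writes logged to stable storage

    def sync_write(self, record: str) -> None:
        """Accept a synchronous write and acknowledge it to the application."""
        self.pending.append(record)
        if self.zil_enabled:
            # With the ZIL, the record is on stable storage before the ack.
            self.intent_log.append(record)

    def commit_txg(self) -> None:
        """Atomically move all pending writes into the on-disk state."""
        self.on_disk.extend(self.pending)
        self.pending.clear()
        self.intent_log.clear()  # logged records are now covered by the txg

    def crash_and_recover(self) -> list:
        """Lose RAM contents, then replay whatever the intent log holds."""
        self.pending.clear()
        self.on_disk.extend(self.intent_log)
        self.intent_log.clear()
        return list(self.on_disk)


def run(zil_enabled: bool) -> list:
    pool = ToyPool(zil_enabled)
    pool.sync_write("w1")
    pool.sync_write("w2")
    pool.commit_txg()
    pool.sync_write("w3")  # acknowledged to the application, then the power fails
    return pool.crash_and_recover()


if __name__ == "__main__":
    print("with ZIL:   ", run(True))
    print("without ZIL:", run(False))
```

In both cases the recovered state is an in-order prefix of the acknowledged writes. Without the intent log the prefix is simply shorter: the acknowledged `w3` is gone (the broken 'D' in ACID), but nothing appears out of order.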
Re: [zfs-discuss] ZFS, power failures, and UPSes
On Jun 30, 2009, at 14:08, Bob Friesenhahn wrote:
> I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.

Can't remember where I read the claim, but supposedly if power isn't restored within about ten minutes, then it will probably be out for a few hours. If this 'statistic' is true, it would mean that your UPS should last (say) fifteen minutes, and after that you really need a generator.

At $WORK we currently have about thirty minutes worth of juice at full load, but as time drags on and we start shutting down less essential stuff we can increase that. The PBX and security system have their own UPSes in their own racks, so there are two layers of battery there.
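As a rough illustration of how shedding load stretches the thirty minutes mentioned above, here is a back-of-the-envelope sketch in Python. It assumes stored energy drains in direct proportion to load, which is a simplification (real battery runtime curves are not linear, so a vendor runtime chart is the better source). The function name and the example numbers other than the thirty-minute figure are hypothetical.

```python
def remaining_runtime_min(full_load_runtime_min, stages_so_far, new_load_fraction):
    """Estimate minutes left after changing load, under a linear energy model.

    full_load_runtime_min: runtime at 100% load (e.g. 30 minutes)
    stages_so_far: list of (minutes, load_fraction) already run on battery
    new_load_fraction: load from now on, as a fraction of full load (0 < f <= 1)
    """
    if not 0 < new_load_fraction <= 1:
        raise ValueError("new_load_fraction must be in (0, 1]")
    # Energy is measured in "full-load minutes" so no battery specs are needed.
    used = sum(minutes * fraction for minutes, fraction in stages_so_far)
    remaining = full_load_runtime_min - used
    if remaining <= 0:
        return 0.0
    return remaining / new_load_fraction


if __name__ == "__main__":
    # Ten minutes at full load, then half of the equipment is shut down.
    print(remaining_runtime_min(30, [(10, 1.0)], 0.5))
```

Under these assumptions, ten minutes at full load leaves twenty full-load minutes of energy, which halving the load turns into forty more minutes.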
Re: [zfs-discuss] ZFS, power failures, and UPSes
David Magda wrote:
> On Jun 30, 2009, at 14:08, Bob Friesenhahn wrote:
>> I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.
>
> Can't remember where I read the claim, but supposedly if power isn't restored within about ten minutes, then it will probably be out for a few hours. If this 'statistic' is true, it would mean that your UPS should last (say) fifteen minutes, and after that you really need a generator.

Most UPSes from any vendor are designed to run for around 12 minutes at full load. So that would appear to back that claim up, and from my experience that is pretty much on the money...

> At $WORK we currently have about thirty minutes worth of juice at full load, but as time drags on and we start shutting down less essential stuff we can increase that. The PBX and security system have their own UPSes in their own racks, so there are two layers of battery there.

The problem comes when the power cut happens in the middle of the night and you aren't there. Then you either need an automated shutdown system instigated by traps from the UPS (shutting things down in the correct order) or a generator. About here the generator becomes a very good option. The no-generator scenario above needs to be consistently tested to maintain its validity, which is a royal pain in the neck.

Gen sets are worth their weight in gold. I can't even think how many times in the last few years they have saved our bacon (through both planned and unplanned outages).
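The "shutting things down in the correct order" part can be sketched as a dependency sort. The following Python example is an assumption-laden illustration, not any UPS vendor's software: the host names and the `depends_on` map are hypothetical, and receiving the UPS trap itself is left out. It uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


def shutdown_order(depends_on):
    """Return hosts in a safe shutdown order.

    depends_on maps each host to the set of hosts it needs in order to run.
    A host must go down before anything it depends on, so the shutdown
    order is the reverse of a valid startup order.
    """
    startup = list(TopologicalSorter(depends_on).static_order())
    return list(reversed(startup))


if __name__ == "__main__":
    # Hypothetical example: a web app uses a database whose storage
    # lives on the ZFS file server.
    depends_on = {
        "webapp": {"database"},
        "database": {"fileserver"},
        "fileserver": set(),
    }
    for host in shutdown_order(depends_on):
        print("shut down", host)
```

A handler triggered by the UPS could walk this list and stop each host in turn, so that the file server is the last machine to go down. Testing that handler regularly is the "royal pain" described above.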
Re: [zfs-discuss] ZFS, power failures, and UPSes
David Magda wrote:
> On Jun 30, 2009, at 14:08, Bob Friesenhahn wrote:
>> I have seen UPSs help quite a lot for short glitches lasting seconds, or a minute. Otherwise the outage is usually longer than the UPSs can stay up since the problem required human attention. A standby generator is needed for any long outages.
>
> Can't remember where I read the claim, but supposedly if power isn't restored within about ten minutes, then it will probably be out for a few hours. If this 'statistic' is true, it would mean that your UPS should last (say) fifteen minutes, and after that you really need a generator.

Or run your systems off DC and get as much backup as you have room (and budget!) for batteries. I once visited a central exchange with 48 hours of battery capacity...

--
Ian.