Experience may not be the best teacher, but it is the most expensive one...
On Fri, Oct 8, 2010 at 13:34, David Lum <[email protected]> wrote:
> So, the root cause: ESX 3.5 OS was installed onto the SAN volume that contained
> my VM’s. The install of that OS (effectively) removes pointers that VM’s
> need when they boot up. Best practice is to disconnect the SAN links when
> installing this version of the OS so this doesn’t happen. In fact our SE did
> this but apparently didn’t disconnect one far enough. If we had left the
> VM’s running we could have used a VM converter to move them to a different
> storage location.
>
> ESX 4.0 doesn’t allow this activity.
>
> Our SE feels really bad about the work he created for me – personally I’m
> just really happy he’s a stand up guy and explained what happened. You do
> this stuff long enough and something like this eventually happens – it’s
> called “experience”.
>
> Dave
>
> From: Andrew S. Baker [mailto:[email protected]]
> Sent: Friday, October 08, 2010 9:36 AM
> To: NT System Admin Issues
> Subject: Re: How'd this for a bad day? AKA bad me
>
> I've said it before, but I will say it again.
>
> In a highly virtualized, heavily consolidated world, we need more planning,
> more thinking and more time for effective execution.
>
> Cutting corners will become more and more painful, and will bite more and
> more organizations.
>
> Hopefully, enough near misses will teach enough entities to do the right
> thing. That's just my optimism speaking, however.
>
> It will be incumbent on each technology professional to advocate or fight
> for the right solutions, or have an excellent exit strategy planned out. :)
>
> ASB (My XeeSM Profile)
> Exploiting Technology for Business Advantage...
>
> On Fri, Oct 8, 2010 at 11:27 AM, Raper, Jonathan - Eagle
> <[email protected]> wrote:
>
> +1 from here as well. A vCenter reboot should not require a host reboot. If
> it did, that would (IMHO) be a huge problem in the design and purpose behind
> VMware. Talk to VMware. If your maintenance is not current, get current.
>
> On a related note, YESTERDAY, one of our storage groups on our SAN ran out
> of space (fortunately I’m not in or over the group responsible for that
> anymore!), and thus took down a number of systems, all part of our core
> electronic medical record system, eClinicalWorks, all virtual… We were
> without that app for more than 6 hours, and are still dealing with database
> replication issues today as a result….
>
> TGIF!
>
> Jonathan L. Raper, A+, MCSA, MCSE
> Technology Coordinator
> Eagle Physicians & Associates, PA
> [email protected]
> www.eaglemds.com
>
> ________________________________
>
> From: Jonathan Link [mailto:[email protected]]
> Sent: Friday, October 08, 2010 9:40 AM
> To: NT System Admin Issues
> Subject: Re: How'd this for a bad day? AKA bad me
>
> +1 I'm just getting caught up on emails this morning. vCenter reboot
> shouldn't necessitate a reboot of a host server.
>
> On Fri, Oct 8, 2010 at 9:34 AM, Jeff Bunting <[email protected]> wrote:
>
> Why do you need to power down VMs to reboot vCenter? vCenter might be the
> problem with the missing VMs. VMWare support might be able to help you with
> those.
>
> Jeff
>
> On Fri, Oct 8, 2010 at 5:51 AM, David Lum <[email protected]> wrote:
>
> I have 7 production systems running on 3 different ESX boxes in an ESX
> cluster, and 2 different logical SAN volumes (sorry am not SAN savvy, I just
> know I have two different SAN volumes to choose from when making a VM).
>
> Today, a SAN blows up and takes out half – our SharePoint server (heavily
> used), a Terminal Server, and an internal occasionally-used web server
> (Namescape rDirectory). Then somehow, when I was told to power down the
> other 4 VM’s so our VMWare guy could reboot a vCenter server, 3 of the 4
> remaining VM’s decided to go AWOL (a combination of “missing” and
> “disconnected”). That took out my other two Terminal Servers and another
> lightly used internal web server.
>
> Did I mention I don’t have the normal backups for these things because
> …well…I’m an idiot and didn’t confirm our backup guy installed backup
> software on these servers as I stood them up (process error on my part since
> I should confirm it’s on there). None of these store data – they all talk to
> a backend SQL and the Terminal Servers are used to run apps that are slow if
> they run the same apps over VPN. SharePoint we got back quick because we do
> have a staging equivalent of it, so it was a repoint to a config and content
> DB, DNS change, and done.
>
> I do have copious notes on how I built the others and can rebuild from
> scratch easily enough (I just finished the three TS boxes), but dude…six
> servers at once?
>
> The most frustrating part was discovering that the 4 systems that had been
> powered off could have been “migrated” before power off and there would have
> been no issue with them – the power down nuked ‘em.
>
> Oh, and the lone surviving server – the PGP Universal Server that manages
> the encrypted machines. (Yes, the PGP machines will still boot w/out the
> server up, but still, I’ve been on this server 50% of my time over the last
> two weeks!).
>
> Dave
>
> ~ Finally, powerful endpoint security that ISN'T a resource hog! ~
> ~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/> ~
>
> ---
> To manage subscriptions click here:
> http://lyris.sunbelt-software.com/read/my_forums/
> or send an email to [email protected]
> with the body: unsubscribe ntsysadmin
