Experience may not be the best teacher, but it is the most expensive one...

On Fri, Oct 8, 2010 at 13:34, David Lum <[email protected]> wrote:
> So, the root cause: ESX 3.5 OS was installed onto a SAN volume that contained
> my VM’s. The install of that OS (effectively) removes pointers that VM’s
> need when they boot up. Best practice is to disconnect the SAN links when
> installing this version of the OS so this doesn’t happen. In fact our SE did
> this but apparently didn’t disconnect one far enough. If we had left the
> VM’s running we could have used a VM converter to move them to a different
> storage location.
>
>
>
> ESX 4.0 doesn’t allow this activity.
>
>
>
> Our SE feels really bad about the work he created for me – personally I’m
> just really happy he’s a stand up guy and explained what happened. You do
> this stuff long enough and something like this eventually happens – it’s
> called “experience”.
>
>
>
> Dave
>
>
>
> From: Andrew S. Baker [mailto:[email protected]]
> Sent: Friday, October 08, 2010 9:36 AM
> To: NT System Admin Issues
> Subject: Re: How'd this for a bad day? AKA bad me
>
>
>
> I've said it before, but I will say it again.
>
>
>
> In a highly virtualized, heavily consolidated world, we need more planning,
> more thinking and more time for effective execution.
>
> Cutting corners will become more and more painful, and will bite more and
> more organizations.
>
>
>
> Hopefully, enough near misses will teach enough entities to do the right
> thing.   That's just my optimism speaking, however.
>
>
>
> It will be incumbent on each technology professional to advocate or fight
> for the right solutions, or have an excellent exit strategy planned out. :)
>
> ASB (My XeeSM Profile)
> Exploiting Technology for Business Advantage...
>
>
> On Fri, Oct 8, 2010 at 11:27 AM, Raper, Jonathan - Eagle
> <[email protected]> wrote:
>
> +1 from here as well. A vCenter reboot should not require a host reboot. If
> it did, that would (IMHO) be a huge problem in the design and purpose behind
> VMware. Talk to VMware. If your maintenance is not current, get current.
>
>
>
> On a related note, YESTERDAY, one of our storage groups on our SAN ran out
> of space (fortunately I’m not in or over the group responsible for that
> anymore!), and thus took down a number of systems, all part of our core
> electronic medical record system, eClinicalWorks, all virtual… We were
> without that app for more than 6 hours, and are still dealing with database
> replication issues today as a result….
>
>
>
> TGIF!
>
> Jonathan L. Raper, A+, MCSA, MCSE
> Technology Coordinator
> Eagle Physicians & Associates, PA
> [email protected]
> www.eaglemds.com
>
> ________________________________
>
> From: Jonathan Link [mailto:[email protected]]
> Sent: Friday, October 08, 2010 9:40 AM
>
> To: NT System Admin Issues
> Subject: Re: How'd this for a bad day? AKA bad me
>
>
>
> +1  I'm just getting caught up on emails this morning.  vCenter reboot
> shouldn't necessitate a reboot of a host server.
>
>
>
> On Fri, Oct 8, 2010 at 9:34 AM, Jeff Bunting <[email protected]> wrote:
>
> Why do you need to power down VMs to reboot vCenter?  vCenter might be the
> problem with the missing VMs.  VMware support might be able to help you with
> those.
>
> Jeff
>
> On Fri, Oct 8, 2010 at 5:51 AM, David Lum <[email protected]> wrote:
>
> I have 7 production systems running on 3 different ESX boxes in an ESX
> cluster, and 2 different logical SAN volumes (sorry am not SAN savvy, I just
> know I have two different SAN volumes to choose from when making a VM).
>
>
>
> Today, a SAN blows up and takes out half – our SharePoint server (heavily
> used), a Terminal Server, and an internal occasionally-used web server
> (Namescape rDirectory). Then somehow, when I was told to power down the
> other 4 VM’s so our VMware guy could reboot a vCenter server, 3 of the 4
> remaining VM’s decided to go AWOL (a combination of “missing” and
> “disconnected”). That took out my other two Terminal Servers and another
> lightly used internal web server.
>
>
>
> Did I mention I don’t have the normal backups for these things because
> …well…I’m an idiot and didn’t confirm our backup guy installed backup
> software on these servers as I stood them up (process error on my part since
> I should confirm it’s on there). None of these store data – they all talk to
> a backend SQL and the Terminal Servers are used to run apps that are slow
> when run over VPN. SharePoint we got back quick because we do
> have a staging equivalent of it, so it was a repoint to a config and content
> DB, DNS change, and done.
>
>
>
> I do have copious notes on how I built the others and can rebuild from
> scratch easily enough (I just finished the three TS boxes), but dude…six
> servers at once?
>
>
>
> The most frustrating part was discovering that the 4 systems that had been
> powered off could have been “migrated” before power off and there would have
> been no issue with them – the power down nuked ‘em.
>
>
>
> Oh, and the lone surviving server – the PGP Universal Server that manages
> the encrypted machines. (Yes, the PGP machines will still boot w/out the
> server up, but still, I’ve been on this server 50% of my time over the last
> two weeks!).
>
>
>
> Dave
>
>
>
> ~ Finally, powerful endpoint security that ISN'T a resource hog! ~
> ~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/>  ~
>
> ---
> To manage subscriptions click here:
> http://lyris.sunbelt-software.com/read/my_forums/
> or send an email to [email protected]
> with the body: unsubscribe ntsysadmin
>

