Thanks Kevin

I'm not sure where we got that info on 3 failed discoveries; possibly from our 
predecessor. The TechNet article on maintenance mode also doesn't mention 
discoveries specifically as being suspended during maintenance mode, which 
seemed to back up my theory.
https://technet.microsoft.com/en-us/library/hh212870(v=sc.12).aspx

I'm familiar with the cmdlet for removing objects from disabled discoveries; 
we do a cleanup using it every 3 months or so.

Would this<https://technet.microsoft.com/en-us/library/hh205987(v=sc.12).aspx> 
be the correct place to look for detailed info on SCOM, or can you 
recommend some other good sources?

Thanks for all the feedback.
It seems I need to do a bit more reading.

Kind Regards
Gareth Miles

From: [email protected] [mailto:[email protected]] On 
Behalf Of Kevin Holman
Sent: Monday, May 15, 2017 2:57 PM
To: [email protected]
Subject: [msmom] RE: SCOM Network outage

When any class instance goes into Maintenance Mode, *ALL* workflows that 
*target* that class type will unload.  Period.

There is nothing special about discoveries.  They are workflows just like rules 
or monitors and will unload when their targeted class instance goes into MM.  
No exceptions.  Discoveries do NOT continue.

As to:  "why we disable the discovery of a SQL instance in a SQL cluster rather 
than put it into MM, then after 3 failed discoveries the instance will be 
removed from SCOM"

I have no idea what this is about... but I might not be understanding your 
perception.  It is not true at all.  If an object is discovered in SCOM, it is 
ALWAYS discovered.  There is no concept of "3 failed discoveries then removed" 
in SCOM.  If we run the discovery, and we do not discover the object, it is 
deleted immediately.  If we disable the discovery, we are stuck with the object 
FOREVER, until we run Remove-SCOMDisabledClassInstance PowerShell cmdlet, which 
will delete objects where there is an override disabling a discovery explicitly.
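
To make the three cases above concrete, here is a toy Python model. It is 
only a sketch of the behaviour described in this mail, not SCOM code: the 
class and method names are made up, and keying the override by object name 
is a simplification (real overrides target the discovery for a class, group 
or instance).

```python
class ToyManagementGroup:
    """Toy model of discovered objects. Hypothetical, not a SCOM API."""

    def __init__(self):
        self.objects = set()    # objects currently discovered
        self.disabled = set()   # objects whose discovery is disabled by override

    def run_discovery(self, name, found):
        if name in self.disabled:
            return  # a disabled discovery never runs, so nothing changes
        if found:
            self.objects.add(name)
        else:
            # Not found on a single run: removed right away, no "3 strikes".
            self.objects.discard(name)

    def disable_discovery(self, name):
        self.disabled.add(name)

    def remove_disabled_instances(self):
        # Stands in for running Remove-SCOMDisabledClassInstance.
        self.objects -= self.disabled


mg = ToyManagementGroup()
mg.run_discovery("SQL-A", found=True)
mg.run_discovery("SQL-B", found=True)
mg.run_discovery("SQL-A", found=False)   # SQL-A is gone after one run
mg.disable_discovery("SQL-B")
mg.run_discovery("SQL-B", found=False)   # never runs, SQL-B stays
print(sorted(mg.objects))                # ['SQL-B']
mg.remove_disabled_instances()
print(sorted(mg.objects))                # []
```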



From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Gareth Miles
Sent: Monday, May 15, 2017 1:46 AM
To: [email protected]<mailto:[email protected]>
Subject: [msmom] RE: SCOM Network outage

Thanks for the reply Kevin

Is the stopping of discoveries in MM specific to the Windows Computer class?
My understanding was that MM suppresses most workflows such as alerts, perf 
collections etc., but that discoveries continue.
That is why we disable the discovery of a SQL instance in a SQL cluster rather 
than put it into MM; then after 3 failed discoveries the instance will be 
removed from SCOM.
Have I got this wrong?

Putting the Health Service Watcher objects into MM is a good tip.

Kind Regards
Gareth Miles



From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Kevin Holman
Sent: Sunday, May 14, 2017 9:14 PM
To: [email protected]<mailto:[email protected]>
Subject: [msmom] RE: SCOM Network outage

Below:

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Gareth Miles
Sent: Friday, May 12, 2017 8:40 AM
To: [email protected]<mailto:[email protected]>
Subject: [msmom] SCOM Network outage

Hi

Our company is having an emergency network outage next weekend for 6 hours, 
possibly longer.

I have a SCOM 2012 SP1 management group with 6 management servers in our office 
which will be affected, and 13 gateway servers around the world which connect 
to 3 of the management servers, with the agent count fairly evenly distributed 
amongst the three management servers.
The site with the largest agent count has around 750 agents, with two gateways 
and the agents split between them.
The other gateways have between 200 and 400 agents connecting to them.

During the network outage the gateways will not be able to connect to the 
management servers, and the management servers will lose connection to the 
OperationsManager and data warehouse DB servers.

I have three plans in mind, but I'm not sure which is the best of the three, or 
if there's a cleaner way of managing the outage.
Any advice would be appreciated.

Plan 1
Put all agents into maintenance mode at the Windows Computer level before the 
network outage, so only discoveries are processed.

KH - that is an incorrect assumption.  When you place a Windows Computer into 
MM, EVERYTHING unloads.  Discoveries are no different than rules or monitors in 
this regard.

When the network outage occurs, the gateways and the agents will queue the 
discovery data until network connectivity returns.

KH - no - there will be nothing to queue.  If the agents go into MM, they will 
unload the workflows and send nothing across the wire.

Plan 2
Put all agents into maintenance mode, then shut down the management servers and 
DB servers until the network is back.

Plan 3
Leave as is, let gateway and agents queue data till network connectivity 
returns.


Also, what happens to a gateway's or agent's queue when it can't connect to 
its management server or gateway? Does the queue fill up to a certain size, or 
until the disk is full?

Kind regards
Gareth Miles

KH - Agents will queue until their queue is full, then will FIFO (first in 
first out) based on a prioritization.  We dump perf data first, and alerts last.
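
As a rough illustration of that eviction order, here is a toy Python model. 
It is a sketch only, not the actual HealthService queue: the class name is 
hypothetical, capacity is counted in items rather than bytes, and placing 
"event" between perf and alerts is an assumption made for the example.

```python
from collections import deque

# First entry is dropped first. Perf first and alerts last come from the
# description above; the position of "event" is an assumption.
DROP_ORDER = ["perf", "event", "alert"]


class ToyAgentQueue:
    """Bounded queue that evicts the oldest item of the least important kind."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed to be at least 1
        self.queues = {kind: deque() for kind in DROP_ORDER}

    def __len__(self):
        return sum(len(q) for q in self.queues.values())

    def enqueue(self, kind, item):
        self.queues[kind].append(item)
        # Over capacity: drop from the lowest-priority kind that has data,
        # oldest item first (FIFO within a kind).
        while len(self) > self.capacity:
            for drop_kind in DROP_ORDER:
                if self.queues[drop_kind]:
                    self.queues[drop_kind].popleft()
                    break

    def contents(self):
        return {kind: list(q) for kind, q in self.queues.items()}


q = ToyAgentQueue(capacity=3)
q.enqueue("perf", "cpu-1")
q.enqueue("alert", "disk-full")
q.enqueue("perf", "cpu-2")
q.enqueue("alert", "svc-down")   # queue is full, so the oldest perf item goes
print(q.contents())
# {'perf': ['cpu-2'], 'event': [], 'alert': ['disk-full', 'svc-down']}
```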

Honestly - your choice of action is largely irrelevant.  If the outage is 
network only, then normally you want the agents to queue and write alerts to 
their queues so you don't miss anything.  However, you might see additional 
alerts from agents because of the network outage impacting applications, so 
this will result in a large number of alerts that won't be "actionable".  So 
placing them into MM or not is a judgement call.

Shutting down the gateways and management servers is largely irrelevant.  If 
they queue, they will fill the queue then cut off any more downstream 
healthservices until the queue can clear.

Probably the biggest thing I would want to do is to ensure you place the 
agents' Health Service Watcher objects into MM, because you don't really want a 
ton of "computer down" alerts when you know you have a planned network outage.





