Why not just convert these geographically distributed management servers to host their own isolated SCOM management groups, and just connect them together with a central Service Manager instance?
From: [email protected] [mailto:[email protected]] On Behalf Of Sven Wells
Sent: Monday, July 29, 2013 7:05 AM
To: [email protected]
Subject: RE: [msmom] RE: SCOM 2012 Some Mgt Svrs stopped inserting data into Ops DB.

The remote SCOM 2012 management servers have an avg. latency of 32ms (NC based servers) and 125ms (UK based server). Up until mid-July these remote Mgmt servers were working fine. As far as I know the network bandwidth/technology has not changed since mid-July. Also, in our SCOM 2007 environment we have always had geographically based Mgmt servers without any issues, which included Gateways. The avg. latency on these is anywhere from 27ms up.

The main reason we set up geographically based management servers is so that, if a network component (i.e., a router/switch) is lost between our core sites, we won't get Alert notifications for the systems that have been cut off. If we have all agents report back to, say, an Austin mgmt. server and then all of a sudden we lose a router/switch in, say, the UK, Alerts will go out that these agents are down, even though they are not. This has happened in the past, and the folks who get these alerts are not happy when they are able to log on to these systems but SCOM "says" they are down. We also have not been able to find a way to have SCOM 'recognize' that a router/switch is down and suppress the resulting agent health alerts. Other management systems, e.g. Nagios, can do this, but so far we've not seen anything that states SCOM can do it.

Another reason for geographically based management servers is to cut down on all the traffic that agents can generate: if geographically placed agents are assigned to their geographically based management servers, that traffic is reduced on the network. My thinking on this could be off a bit.
Thanks,
Sven

From: [email protected]<mailto:[email protected]> [mailto:[email protected]] On Behalf Of Kevin Holman
Sent: Sunday, July 28, 2013 1:35 PM
To: [email protected]<mailto:[email protected]>
Subject: [msmom] RE: SCOM 2012 Some Mgt Svrs stopped inserting data into Ops DB.

What is the latency between the remote management servers, and the other management servers in the same datacenter as the OpsMgr databases?

The assumption is that no management server should be deployed where network latency is greater than 5ms. There are VERY few WAN technologies that can maintain less than 5ms latencies, therefore it is not recommended to deploy management servers across geographic locations EVER... *except* for some VERY specific disaster recovery scenarios, and even those scenarios must maintain less than 20ms latency and require some level of customization in order to maintain resource pool resiliency and availability.

Is there a specific reason why you deployed management servers across geographic locations? Even in SCOM 2007, this was not recommended.

From: [email protected]<mailto:[email protected]> [mailto:[email protected]] On Behalf Of Sven Wells
Sent: Friday, July 26, 2013 4:40 AM
To: [email protected]<mailto:[email protected]>
Subject: [msmom] SCOM 2012 Some Mgt Svrs stopped inserting data into Ops DB.

We noticed yesterday that three of our 12 Management Servers stopped inserting data into the Ops DB at the same time a week ago. There is no Performance View data for these management servers, or their assigned agents, after 18 Jul. We are looking through logs to see what, if anything, occurred on that date.

These three Management Servers are located geographically in different places (2 in NC, 1 in Cambridge, UK) from the other 9 MSes. The other 9 MSes are all located in Austin, TX. All of our MSes are virtual. The Operations and Datawarehouse DBs are located in Austin, TX.
We recently attempted to create a Resource Pool which included the 3 geographically different MSes, and one or two of the Austin MSes, but that did not seem to work as the Resource Pool would stop heartbeating. After some advice that a Resource Pool should contain same-geographic MSes, we removed those 3 MSes and replaced them with Austin MSes.

All Management Servers are showing Green/Healthy in the console. All management servers were working fine until 18 Jul 2013. The agents assigned to those three management servers are all showing Event ID 2120:

Event Type: Warning
Event Source: HealthService
Event Category: Health Service
Event ID: 2120
Date: 7/19/2013
Time: 7:18:07 AM
User: N/A
Computer: XXXXXXX
Description: The Health Service has deleted one or more items for management group "XXXX" which could not be sent in 1440 minutes. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

This event is occurring every hour and has been since 18 July 2013.

Any thoughts as to why these three Management Servers, one of which has been completely rebooted and the others having just had all of their System Center services stopped/restarted yesterday, have stopped inserting data?

Thanks,
Sven

Sven Wells
SYSTEMS ADMINISTRATION SPECIALIST
TECHNOLOGY AND LABORATORY SVCS
Wilmington NC HQ PPD
Phone +1 910 558 6870
[email protected] <mailto:[email protected]>
www.ppdi.com <http://www.ppdi.com/>

This email transmission and any documents, files or previous email messages attached to it may contain information that is confidential or legally privileged. If you are not the intended recipient or a person responsible for delivering this transmission to the intended recipient, you are hereby notified that you must not read this transmission and that any disclosure, copying, printing, distribution or use of this transmission is strictly prohibited.
If you have received this transmission in error, please immediately notify the sender by telephone or return email and delete the original transmission and its attachments without reading or saving in any manner.
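
For readers following the latency discussion, here is a minimal Python sketch that compares the latencies reported in this thread (32ms for the NC servers, 125ms for the UK server) against the two limits quoted in Kevin Holman's reply (no more than 5ms in general, less than 20ms for specific disaster recovery scenarios). The function name `classify_latency`, the constant names, and the category labels are hypothetical and exist only for illustration; this is not a SCOM tool or API.

```python
# Limits as quoted in the thread, in milliseconds.
GENERAL_LIMIT_MS = 5.0   # "no management server ... where latency is greater than 5ms"
DR_LIMIT_MS = 20.0       # DR scenarios "must maintain less than 20ms latency"


def classify_latency(latency_ms: float) -> str:
    """Hypothetical helper: bucket a measured latency by the quoted guidance."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms <= GENERAL_LIMIT_MS:
        return "within general guidance"
    if latency_ms < DR_LIMIT_MS:
        return "specific DR scenarios only"
    return "outside guidance"


if __name__ == "__main__":
    # Average latencies reported in the thread for the remote management servers.
    reported = {"NC": 32.0, "UK": 125.0}
    for site, ms in reported.items():
        print(f"{site}: {ms:g} ms -> {classify_latency(ms)}")
```

By this reading, both remote sites fall outside the quoted guidance, which matches the direction of the replies above.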
