RE: [msmom] RE: OpsMgr agent using too much processor time

Orlebeck, Geoffrey Wed, 18 Feb 2015 13:57:10 -0800

Thank you both. We'll try modifying the monitor per Kevin's recommendation, 
and if we don't see improvement, we'll look into disabling the alert on just 
the problematic servers.

Thanks again.
Geoff

From: [email protected] [mailto:[email protected]] On 
Behalf Of Andrew Kunz
Sent: Wednesday, February 18, 2015 12:43 PM
To: [email protected]
Subject: RE: [msmom] RE: OpsMgr agent using too much processor time

Are you virtualized on VMware? They have some articles out there indicating 
that CPU metrics measured from inside the guest can be questionable. I'm not 
sure about that, so take it with a grain of salt.

That being said, we see this alert quite often, and without actually going in 
and reading the script I have formed an opinion.

When we review the alert in question, we find that the host generating it is 
almost always very idle (a single DB, perhaps a couple of websites, or some 
other non-resource-intensive workload), somewhere below 5% total CPU. After 
comparing performance views of the machine itself with those of the health 
service, I made the following "assumption":

I have an idle machine at less than 5% CPU, and the health service is consuming 
50% of that. The health service needs some minimum amount of resources, so I 
suspect the script only takes the process utilization into account, not the 
total machine utilization.
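
To make the arithmetic in that assumption concrete, here is a minimal Python 
sketch. It is not the monitor's actual script (which nobody in this thread has 
read); the function name and the sample figures are illustrative only, taken 
from the "less than 5% total, 50% of that" example above.

```python
def absolute_cpu_percent(total_cpu_percent: float, process_share: float) -> float:
    """Return the CPU a process uses as a percentage of the whole machine.

    total_cpu_percent: overall machine utilization, 0-100.
    process_share: fraction of that utilization owned by the process, 0-1.
    """
    return total_cpu_percent * process_share


# Hypothetical idle host: 5% total CPU, health service owns half of it.
idle_host = absolute_cpu_percent(total_cpu_percent=5.0, process_share=0.5)

# Hypothetical busy host: 80% total CPU, health service owns the same half.
busy_host = absolute_cpu_percent(total_cpu_percent=80.0, process_share=0.5)

print(f"idle host: health service uses {idle_host}% of the machine")
print(f"busy host: health service uses {busy_host}% of the machine")
```

The relative share is identical in both cases, but only the busy host has an 
agent that is really costing the machine anything. A check that looks only at 
the process's share would not tell the two apart.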

We have a group with an override applied, and we put these nuisance machines 
into it.

Andrew

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Orlebeck, Geoffrey
Sent: Wednesday, February 18, 2015 2:30 PM
To: '[email protected]'
Subject: [msmom] RE: OpsMgr agent using too much processor time

Thank you for the reply. I just got some additional details and the SQL server 
is only hosting one DB (other than standard system DBs). The IIS server is 
servicing a couple websites. I ran your query on both 2007 and 2012 ops DBs and 
neither of the offending servers showed up in the top 50. I expanded it to top 
100 and found only the SQL server. 2007 OpsDB showed 75 HostedInstances and 
2012 OpsDB returned 78 HostedInstances.

Could this circle back to a topic we discussed previously, where we have a 
single custom management pack in 2007 containing all monitors/rules?

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Kevin Holman
Sent: Wednesday, February 18, 2015 10:49 AM
To: [email protected]<mailto:[email protected]>
Subject: [msmom] RE: OpsMgr agent using too much processor time

During multi-homing, it is normal to get a lot of noise from that monitor for 
agent using too much CPU.  Multi-homing runs twice the workflows and this can 
add up to a lot of stuff, especially if those servers host a large number of 
instances.

For instance - if that SQL server has a large number of databases (over 100) 
then you can really see an impact when being monitored, and especially so when 
multi-homed.
On the IIS server, if there are a large number of web sites, application pools, 
etc.... same deal.

That might explain why these two servers are consuming more resources....

I have an instance count query here:

http://blogs.technet.com/b/kevinholman/archive/2007/10/18/useful-operations-manager-2007-sql-queries.aspx

Get the discovered instance count of the top 50 agents

DECLARE @RelationshipTypeId_Manages UNIQUEIDENTIFIER
SELECT @RelationshipTypeId_Manages = dbo.fn_RelationshipTypeId_Manages()
SELECT TOP 50 bme.DisplayName, SUM(1) AS HostedInstances
FROM BaseManagedEntity bme
RIGHT JOIN (
SELECT
      HBME.BaseManagedEntityId AS HS_BMEID,
      TBME.FullName AS TopLevelEntityName,
      BME.FullName AS BaseEntityName,
      TYPE.TypeName AS TypedEntityName
FROM BaseManagedEntity BME WITH(NOLOCK)
      INNER JOIN TypedManagedEntity TME WITH(NOLOCK)
            ON BME.BaseManagedEntityId = TME.BaseManagedEntityId
            AND BME.IsDeleted = 0 AND TME.IsDeleted = 0
      INNER JOIN BaseManagedEntity TBME WITH(NOLOCK)
            ON BME.TopLevelHostEntityId = TBME.BaseManagedEntityId
            AND TBME.IsDeleted = 0
      INNER JOIN ManagedType TYPE WITH(NOLOCK)
            ON TME.ManagedTypeID = TYPE.ManagedTypeID
      LEFT JOIN Relationship R WITH(NOLOCK)
            ON R.TargetEntityId = TBME.BaseManagedEntityId
            AND R.RelationshipTypeId = @RelationshipTypeId_Manages
            AND R.IsDeleted = 0
      LEFT JOIN BaseManagedEntity HBME WITH(NOLOCK)
            ON R.SourceEntityId = HBME.BaseManagedEntityId
) AS dt ON dt.HS_BMEID = bme.BaseManagedEntityId
GROUP BY bme.DisplayName
ORDER BY HostedInstances DESC

See if those two servers are high in the list.

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]] On Behalf Of Orlebeck, Geoffrey
Sent: Wednesday, February 18, 2015 9:28 AM
To: '[email protected]'
Subject: [msmom] OpsMgr agent using too much processor time

I'm trying to pinpoint why 2 of our servers (out of ~300 with agents) keep 
alerting on "OpsMgr Agent process are using too much processor time". Some 
background/environment info:

1)    Both servers are VMs running Windows Server 2008 R2.

2)    All servers previously on SCOM 2007 are multi-homed between our 2007/2012 
environments.

3)    Our SCOM 2007 environment is triggering the alert.

4)    Every 5 minutes the MonitoringHost.exe process on both servers spikes to 
between 60% and 100% CPU. Each spike lasts anywhere from 5 to 15 seconds.

5)    2007 and 2012 SCOM management servers are patched to the latest 
respective releases.

6)    Agent versions on all multi-homed servers are identical (v7.1.10184.0)

7)    The alerts started after pushing out the 2012 agent and multi-homing 
began. None of our other multi-homed servers (ranging from Windows Server 2003 
SP2 to 2012 R2) are alerting on this issue.
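
As a rough illustration of what the spike pattern in item 4 amounts to, the 
following Python sketch averages it over one 5-minute window. It assumes, 
purely for the sake of the arithmetic, that MonitoringHost.exe is otherwise 
idle between spikes; the function name is hypothetical and this says nothing 
about how the monitor itself samples CPU.

```python
def average_cpu_percent(spike_percent: float,
                        spike_seconds: float,
                        interval_seconds: float = 300.0) -> float:
    """Average CPU over one interval, assuming the process is idle outside the spike."""
    return spike_percent * spike_seconds / interval_seconds


# Mildest case reported: 60% for 5 seconds out of every 5 minutes.
low = average_cpu_percent(spike_percent=60.0, spike_seconds=5.0)

# Worst case reported: 100% for 15 seconds out of every 5 minutes.
high = average_cpu_percent(spike_percent=100.0, spike_seconds=15.0)

print(f"average over the window: between {low}% and {high}%")
```

Under that assumption the average load is small, so whether the alert fires 
would depend on whether a sample happens to land inside a spike rather than on 
sustained usage.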

The MonitoringHost.exe process starts off small, around 5 MB, steadily grows, 
and eventually resets. I haven't figured out the exact trigger, whether it's 
time or size, but the largest I've seen committed to a single 
MonitoringHost.exe process is about 175 MB.

Kevin Holman's blog entry (Link 
Here<http://blogs.technet.com/b/kevinholman/archive/2009/07/20/do-you-randomly-see-a-monitoringhost-exe-process-consuming-lots-of-cpu.aspx>)
 points to a hotfix (http://support.microsoft.com/kb/968967) but the newest OS 
that hotfix applies to is 2008 Std (non-R2). There is a comment from Kevin in 
the comments section of his article about placing various MPs into Maintenance 
Mode and seeing if the issue persists. However, due to my novice experience, 
I'm not sure how to tell which MPs are applied to these two servers. They are 
both generic 2008 R2 boxes; one has SQL, the other IIS. Otherwise they appear 
identical to me. So outside of those two roles/programs, I'm not sure what MPs 
SCOM would use other than the basic Windows OS MPs... I think.

Should I be looking in the 2007 environment because the alerts are coming from 
2007? Or is there a place to correlate the timestamps of the MonitoringHost.exe 
spiking to see what monitor/rule or performance data the agent is trying to 
gather?

I understand some of these questions may be quite basic. I don't have much 
experience (yet) in the SCOM world, and I dug around most of yesterday and part 
of today trying to pinpoint the cause, so now I'm just hoping for a little 
help.

Thank you for your time.

-Geoff

Confidentiality Notice: This is a transmission from Community Hospital of the 
Monterey Peninsula. This message and any attached documents may be confidential 
and contain information protected by state and federal medical privacy 
statutes. They are intended only for the use of the addressee. If you are not 
the intended recipient, any disclosure, copying, or distribution of this 
information is strictly prohibited. If you received this transmission in error, 
please accept our apologies and notify the sender. Thank you.
