Hi Martin,
this is the output of the commands you requested.
1.) uname -m
x86_64
2.) file `which monit`
ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically
linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
I also ran the command you supplied to read the CPU usage directly while
restarting the httpd service, as I know this generates an alert.
Date: Wed, 07 Dec 2011 09:57:37
Action: exec
Host: <hostname removed>
Description: cpu system usage of 50.0% matches resource limit [cpu
system usage>30.0%]
Wed Dec 7 09:57:34 GMT 2011
cpu 207060 501 103542 49452254 25303 83 1569 0 0
Wed Dec 7 09:57:35 GMT 2011
cpu 207061 501 103543 49452353 25303 83 1569 0 0
Wed Dec 7 09:57:36 GMT 2011
cpu 207062 501 103543 49452451 25303 83 1569 0 0
Wed Dec 7 09:57:37 GMT 2011
cpu 207087 501 103559 49452510 25304 83 1569 0 0
Wed Dec 7 09:57:38 GMT 2011
cpu 207088 501 103561 49452608 25304 83 1569 0 0
Wed Dec 7 09:57:40 GMT 2011
If my understanding of /proc/stat is correct, this still doesn't make any
sense, but I may be wrong.
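
To show what I mean, here is a minimal sketch of how two consecutive samples
can be turned into percentages. It assumes the usual field order of the
aggregate "cpu" line described in proc(5) (user, nice, system, idle, iowait,
irq, softirq, steal). The function names are my own, and this is only my
reading of /proc/stat, not necessarily how monit calculates its figure.

```python
# Assumed field order of the aggregate "cpu" line in /proc/stat (proc(5)).
# Trailing fields such as guest are ignored here; as far as I understand,
# guest time is already counted inside user time.
FIELDS = ("user", "nice", "system", "idle",
          "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line into a dict of cumulative tick counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # zip() stops at the shorter sequence, so extra columns are dropped.
    return dict(zip(FIELDS, map(int, parts[1:])))


def cpu_percentages(before, after):
    """Share of elapsed ticks spent in each state between two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    deltas = {name: second[name] - first[name] for name in first}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# The 09:57:36 and 09:57:37 samples from the output above.
sample_1 = "cpu 207062 501 103543 49452451 25303 83 1569 0 0"
sample_2 = "cpu 207087 501 103559 49452510 25304 83 1569 0 0"

pct = cpu_percentages(sample_1, sample_2)
print("user %.1f%%  system %.1f%%  idle %.1f%%"
      % (pct["user"], pct["system"], pct["idle"]))
```

For that one-second interval the deltas are 25 user, 16 system, 59 idle and
1 iowait tick out of 101, so system time comes to roughly 16%, which is in
line with the vmstat figures and well below the 50% in the alert.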
Regards
Wayne
On 7 December 2011 09:37, Martin Pala <[email protected]> wrote:
> Please can you check that your monit binary matches the system
> architecture? (i.e. for example 64-bit monit binary on 64-bit system - not
> 32-bit monit on 64-bit system)
>
> To verify, please provide the output of the following commands:
> 1.) uname -m
> 2.) file `which monit`
>
> Monit takes the statistics from the /proc/stat kernel interface. You can
> collect the statistics manually like this - for example to fetch the state
> in 1 second intervals (30 samples):
>
> $ for ((i=0; i<30; i++)); do date; grep "cpu " /proc/stat; sleep 1; done
>
> Note: monit takes the first /proc/stat line ("cpu"), which contains the
> overall CPU usage of the system (a summary of all CPUs). /proc/stat also
> contains per-CPU statistics; if you want to collect all of them, simply
> replace the "grep 'cpu '" with "cat".
>
> Regards,
> Martin
>
>
> On Dec 7, 2011, at 10:04 AM, Lawrence, Wayne wrote:
>
> Hi Martin,
>
> I have tried various methods to identify the cause of this and took your
> advice and used vmstat. I simply restarted the httpd process from the monit
> web interface while the command was running and got the following warning.
>
> Description: cpu system usage of 50.0% matches resource limit [cpu
> system usage>30.0%]
>
> But vmstat doesn't show that level of usage at the point of the alert. As
> you can see, there is some usage in the 3rd line of the output, when I
> restarted the httpd service, but it doesn't seem enough to trigger an alert.
>
> vmstat 1 10
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>  0  0      0 859596 114684 856908    0    0     4     6   81   77  0  0 99  0  0
>  0  0      0 859448 114684 856916    0    0     0     0  100   94  1  0 99  0  0
>  0  0      0 898352 114692 815600    0    0     0   168  555  605 23 15 61  1  0
>
> Not sure if there are any other tests I can run to narrow this down a bit
> further, as it still isn't making sense.
>
> Regards
>
> Wayne
>
>
>
>
>
> On 7 December 2011 08:27, Martin Pala <[email protected]> wrote:
>
>> Hi Lawrence,
>>
>> the test which triggers the alert is "system" cpu, i.e. the time the
>> system spends in kernel mode. The CPU usage could be caused by some
>> background kernel task. To verify that the monit report matches the system
>> CPU usage, you should use either "vmstat" or "top" instead of "ps".
>>
>> Best regards,
>> Martin
>>
>>
>> On Dec 6, 2011, at 1:19 PM, Lawrence, Wayne wrote:
>>
>> Hi Igor,
>>
>> the operating system is RHEL6 and monit version is 5.3.1
>>
>> this is what i have in my config
>>
>> if cpu usage (user) > 70% then alert
>> if cpu usage (system) > 30% then alert
>> if cpu usage (wait) > 20% then alert
>>
>> this is one of the errors
>> Description: cpu system usage of 50.0% matches resource limit [cpu system
>> usage>30.0%]
>>
>> this is what i get in /var/log/messages
>> Dec 6 12:01:29 <hostname-removed> monit[864]: <hostname-removed> cpu
>> system usage of 50.0% matches resource limit [cpu system usage>30.0%]
>> Dec 6 12:02:29 <hostname-removed> monit[864]:
>> <hostname-removed><hostname-removed>' cpu system usage check succeeded
>> [current cpu system usage=0.9%]
>>
>> this is the output of ps --no-headers -A -o "%cpu sz ucomm" | sort -k1nr | head -20
>>
>> 12:01:29 up 4 days, 20:24, 2 users, load average: 0.04, 0.01, 0.00
>> total used free shared buffers cached
>> Mem: 2055108 1092176 962932 0 53156 811864
>> -/+ buffers/cache: 227156 1827952
>> Swap: 4128760 0 4128760
>> 1.2 44308 perl
>> 0.0 0 aio/0
>> 0.0 0 async/mgr
>> 0.0 0 ata/0
>> 0.0 0 ata_aux
>> 0.0 0 bdi-default
>> 0.0 0 cpuset
>> 0.0 0 crypto/0
>> 0.0 0 events/0
>> 0.0 0 ext4-dio-unwrit
>> 0.0 0 flush-253:0
>> 0.0 0 jbd2/dm-0-8
>> 0.0 0 kacpi_hotplug
>> 0.0 0 kacpi_notify
>> 0.0 0 kacpid
>> 0.0 0 kauditd
>> 0.0 0 kblockd/0
>> 0.0 0 kdmflush
>> 0.0 0 khelper
>> 0.0 0 khubd
>>
>> I have to say I am at a total loss, as there is no way the usage figures
>> are accurate.
>> If there is any other info I can supply that would be useful, please let me
>> know.
>>
>> Regards
>>
>> Wayne
>>
>>
>> On 6 December 2011 12:03, Igor Homyakov <[email protected]
>> > wrote:
>>
>>> Hi Lawrence,
>>>
>>> Could you be a little bit more specific? Please provide information
>>> about your operating system, the monit version on which the problem
>>> occurred, and so on.
>>>
>>> Regards
>>> Igor Homyakov
>>>
>>> On Tue, Dec 6, 2011 at 15:35, Lawrence, Wayne
>>> <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > I have a few CPU usage checks in my monitrc but it seems monit is
>>> > misreporting the usage.
>>> >
>>> > I have run several tests and it seems that monit is multiplying the
>>> > actual usage by 10.
>>> >
>>> > I ran a process with top running in another shell and CPU usage for
>>> > the user was never above 10% yet monit informed me that there was
>>> > 100% cpu usage.
>>> >
>>> > I have tried various configurations including the one that came with
>>> > the default config for system cpu monitoring and all seem to
>>> > demonstrate the same issue.
>>> >
>>> > Any advice welcomed on this
>>> >
>>> > Regards
>>> >
>>> > Wayne Lawrence
>>> >
>>> >
>>> >
>>> > --
>>> > To unsubscribe:
>>> > https://lists.nongnu.org/mailman/listinfo/monit-general