Re: [DISCUSS] Top & Sampling Rates (kvm.resource.LibvirtComputingResource)

Chiradeep Vittal Thu, 27 Feb 2014 22:34:23 -0800

O_O. Good to know!

On 2/26/14, 11:58 PM, "Marty Sweet" <msweet....@gmail.com> wrote:


>Hi Guys,
>
>Does anyone have any ideas about this?
>My main concern is the KVM resource collector and I assume the other
>hypervisor setups are receiving the wrong values.
>
>The CSAgent periodically runs the following command:
> [kvm.resource.LibvirtComputingResource]
>(agentRequest-Handler-2:null) Executing: /bin/bash -c idle=$(top -b -n
>1|grep Cpu\(s\):|cut -d% -f4|cut -d, -f2);echo $idle
>
>===
>When running top manually I get the same results (2 secs after each
>command):
>======
>root@aurora:/var/log/cloudstack/agent# top -b -n1 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>root@aurora:/var/log/cloudstack/agent# top -b -n1 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>root@aurora:/var/log/cloudstack/agent# top -b -n1 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>root@aurora:/var/log/cloudstack/agent# top -b -n2 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>Cpu(s): 29.2%us,  1.1%sy,  0.0%ni, 69.1%id,  0.5%wa,  0.0%hi,  0.1%si,
>0.0%st
>=======
>Apparently this is because:
>"This is because top, vmstat, iostat all in their first run collect
>data since the last reboot time of the system.
>And the successive iterations run on the sampling period that you
>specify. So, in the first run of top, you will see the %idle time
>because from the time of reboot to the time of running top, it was
>that much % idle. But in next iterations, since it is busy it doesn't
>show any %idle.
>Exclude the first iteration and try sampling over the interval you want."
>http://serverfault.com/questions/436446/top-showing-64-idle-on-first-screen-or-batch-run-while-there-is-no-idle-time-a
>========
>
>Wouldn't this result in Cloudstack-Agent getting the wrong idle value
>for the system?
>
>This hasn't been fixed in 4.3.0, so I will create a patch along the
>following lines (if others agree):
>/bin/bash -c idle=$(top -d0.10 -b -n 2|grep Cpu\(s\):|tail -n1|cut -d%
>-f4|cut -d, -f2);echo $idle
>-> top -d0.10 shortens the refresh interval so the command completes
>faster.
>-> tail -n1 takes the last line of the output (the latest idle value).
>===
>
>
>Let me know what you think,
>Regards,
>Marty
>
>
>---------- Forwarded message ----------
>From: Marty Sweet <msweet....@gmail.com>
>Date: Sun, Feb 23, 2014 at 1:20 PM
>Subject: Segfault: Top & Sampling Rates
>(kvm.resource.LibvirtComputingResource)
>To: "dev@cloudstack.apache.org" <dev@cloudstack.apache.org>
>Cc: "us...@cloudstack.apache.org" <us...@cloudstack.apache.org>
>
>
>Hi,
>
>I have just noticed the occasional following error messages in kern.log.
>This is happening on all but 1 of my nodes. Is anyone else
>experiencing this issue?
>=====
>Feb 23 06:53:24 aurora kernel: [10185338.400091] top[27631]: segfault
>at 0 ip 00007f025eba3315 sp 00007fff3f9ed308 error 4 in
>libc-2.15.so[7f025ea6f000+1b5000]
>=====
>
>I happened to have one of the nodes in trace mode, showing
>cloudstack-agent is starting it:
>
>/var/log/cloudstack/agent/agent.log
>======
>2014-02-23 06:53:23,654 DEBUG [kvm.resource.LibvirtComputingResource]
>(agentRequest-Handler-1:null) Executing: /bin/bash -c idle=$(top -b -n
>1|grep Cpu\(s\):|cut -d% -f4|cut -d, -f2);echo $idle
>2014-02-23 06:53:23,661 DEBUG [kvm.resource.LibvirtComputingResource]
>(agentRequest-Handler-2:null) Executing: /bin/bash -c idle=$(top -b -n
>1|grep Cpu\(s\):|cut -d% -f4|cut -d, -f2);echo $idle
>======
>
>## This led me to find the following (potential) bug:
>
>When running this manually I get the same result (2 secs before each
>command):
>======
>root@aurora:/var/log/cloudstack/agent# top -b -n1 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>root@aurora:/var/log/cloudstack/agent# top -b -n1 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>root@aurora:/var/log/cloudstack/agent# top -b -n1 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>root@aurora:/var/log/cloudstack/agent# top -b -n2 | grep Cpu
>Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,
>0.0%st
>Cpu(s): 29.2%us,  1.1%sy,  0.0%ni, 69.1%id,  0.5%wa,  0.0%hi,  0.1%si,
>0.0%st
>=======
>Apparently this is because:
>"This is because top, vmstat, iostat all in their first run collect
>data since the last reboot time of the system.
>And the successive iterations run on the sampling period that you
>specify. So, in the first run of top, you will see the %idle time
>because from the time of reboot to the time of running top, it was
>that much % idle. But in next iterations, since it is busy it doesn't
>show any %idle.
>Exclude the first iteration and try sampling over the interval you want."
>http://serverfault.com/questions/436446/top-showing-64-idle-on-first-screen-or-batch-run-while-there-is-no-idle-time-a
>========
>
>Wouldn't this result in Cloudstack-Agent getting the wrong idle value
>for the system?
>
>If this hasn't been fixed in 4.3.0, I will create a patch along the
>following lines (if others agree):
>/bin/bash -c idle=$(top -d0.01 -b -n 2|grep Cpu\(s\):|tail -n1|cut -d%
>-f4|cut -d, -f2);echo $idle
>-> top -d0.01 shortens the refresh interval so the command completes
>faster.
>-> tail -n1 takes the last line of the output (the latest idle value).
>
>Ubuntu 12.04 / KVM / CS 4.2.0
>Linux aurora 3.5.0-34-generic #55~precise1-Ubuntu SMP Fri Jun 7
>16:25:50 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>
>Thanks,
>Marty
>
>
>-- 
>Marty
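
For anyone who wants to try the proposed change outside the agent, here is a
minimal shell sketch of the same pipeline. The helper name parse_idle is
hypothetical (it is not part of CloudStack), and the sketch assumes the older
top summary layout shown in this thread ("92.4%id"); other top builds may
format that line differently, in which case the cut fields would need
adjusting.

```shell
#!/bin/sh
# Hypothetical helper (not part of CloudStack): read `top -b` output on stdin
# and print the idle percentage from the LAST "Cpu(s):" summary line.
# Assumes the summary layout quoted in this thread, e.g. "92.4%id".
parse_idle() {
    grep 'Cpu(s):' | tail -n1 | cut -d% -f4 | cut -d, -f2 | tr -d ' '
}

# The two summary lines from the `top -b -n2` run quoted in this thread,
# unwrapped to one line each as top prints them.
sample='Cpu(s):  6.3%us,  1.0%sy,  0.0%ni, 92.4%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu(s): 29.2%us,  1.1%sy,  0.0%ni, 69.1%id,  0.5%wa,  0.0%hi,  0.1%si,  0.0%st'

# First iteration only (what `top -b -n 1` gives the agent today).
printf '%s\n' "$sample" | head -n1 | parse_idle   # prints 92.4

# Last iteration of a two-iteration run (what the proposed patch reads).
printf '%s\n' "$sample" | parse_idle              # prints 69.1
```

On a live host the equivalent call would be something like
`top -b -d 0.1 -n 2 | parse_idle`. Taking the last line with tail -n1 is what
discards the first iteration, so the value reflects the sampled interval.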
