On 10/31/07, Morris, Kevin J. (LNG-DAY) <[EMAIL PROTECTED]> wrote:

> Cannot afford to run something on Linux?
> - How does ESALPS gather the Linux process-level data outside of
>   running an agent/application ON Linux?

The big difference is whether you use an active agent or a passive
agent. ESALPS uses snmpd, which only uses resources when you retrieve
data. And because ESALPS also has the VM usage data per virtual
machine, there is no need to ask for per-process data when VM already
knows the virtual machine was idle.

The other difference is where you archive your data. When each Linux
server keeps its own archive, maintaining the archive (and retrieving
data from it) takes resources on Linux and may also impact whatever
you were trying to measure.

> The data that Linux logged would be wrong?
> - I assume you are referring to the inaccuracies of the "tick based CPU
>   time accounting"?
> - I had the understanding that RHEL5 and SLES10 resolved this issue
>   with the new "virtual CPU time accounting"?

The "tick based" accounting is only part of it. Usage is reported in
units of 10 ms, which is a lot on modern CPUs. This means that many
tools will miss fractions of 10 ms on each observation and get a very
low capture ratio. The Linux kernel does keep more accurate usage
data, and ESALPS uses the process hierarchy to catch the missing
fractions, normally achieving a capture ratio in the high 90s.

The other part is the virtualization effect, where the Linux kernel
believes it uses 100% of the CPU, but because of shared resources on
z/VM it only gets 10% (so all readings are an order of magnitude off).
The latest Linux kernels have changed this to report 10%. So what's
the conclusion when you see Linux report 10% usage: is it short on CPU
or not? ;-) ESALPS has both numbers and reports both, which means you
have the true numbers for accounting and the Linux numbers to
understand the performance implications inside Linux.

> - The memory, disk, #page faults, etc. are still accurate. Correct?

Memory as reported by "free", for example, is correct for what Linux
believes. But which part of that is resident in z/VM is something
ESALPS gets from z/VM monitor data.
And you need to include swap usage in that as well.

Linux internally does report disk rates, but in a shared environment
you cannot translate rates into I/O response times (which is what
impacts your application). The response time measurements inside the
Linux device driver are subject to virtual machine dispatch and likely
to make you bark up the wrong tree. ESALPS gets I/O metrics from the
subchannel measurement data and combines that with the VM data to
allocate the usage to virtual machines.

Rob
--
Rob van der Heij
Velocity Software, Inc
http://velocitysoftware.com/
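
As a footnote to the capture ratio and virtualization points above,
here is a minimal Python sketch of the arithmetic. It is a simplified
model, not how ESALPS or the Linux kernel actually work: it assumes a
tool loses the fraction of a 10 ms unit on every observation, and that
real usage is simply the Linux-reported figure scaled by the share of
the real CPU the guest received. The names TICK_MS, reported_ms,
capture_ratio and real_cpu_percent, and the sample numbers, are made
up for illustration.

```python
TICK_MS = 10  # reporting granularity discussed above (assumed fixed)


def reported_ms(true_ms):
    """CPU time seen by a tool that only counts whole 10 ms units."""
    return (true_ms // TICK_MS) * TICK_MS


def capture_ratio(true_usage_ms):
    """Fraction of the real CPU time that the whole-unit view accounts for."""
    total = sum(true_usage_ms)
    if total == 0:
        return 1.0  # nothing used, nothing missed
    return sum(reported_ms(t) for t in true_usage_ms) / total


def real_cpu_percent(linux_percent, share_of_real_cpu):
    """Scale what Linux believes it used by the share z/VM actually gave it."""
    return linux_percent * share_of_real_cpu


if __name__ == "__main__":
    # Hypothetical per-observation CPU times in milliseconds.
    sample = [3, 7, 12, 25, 4, 9, 31, 18]
    print("capture ratio: %.2f" % capture_ratio(sample))
    # Linux believes 100%, but the guest only got a tenth of a real CPU.
    print("real CPU: %.1f%%" % real_cpu_percent(100, 0.10))
```

In this model the short observations (3, 7, 4 and 9 ms) vanish
completely, which is why the ratio ends up so far below 1 for
workloads made of many brief bursts.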
