Brooks Davis wrote:
On Tue, Feb 15, 2005 at 05:27:53PM -0500, Robert E. Parrott wrote:
Is there a way to make ganglia hyperthreading-aware in v3.0?
We have a cluster of dual Xeons with hyperthreading, which ganglia
reports as 4 processors. This is annoying when using the web frontend,
since full load appears to be only 1/2 load, and people think there are
twice as many CPUs available.
With 2.5.7, I hacked the code using a patch from OSC to report only 2
processors, one per real processor, when hyperthreading was found enabled.
However, in 3.0 this is not enabled, so I've backed down to 2.5.7 again.
Is there a reason this kind of patch is not in place? Are there options
I'm not aware of in the config files?
There is no sensible way to do this that will always work. The
problem is that some users will want the current behavior and others
will certainly not. It depends on their applications. A small, but
non-zero set of applications exists for which HTT really is nearly as
good as two CPUs. The real issue is that a single number-of-CPUs variable
is nowhere near sufficient to represent
the hierarchy of CPU-like things on your machine. It's actually the
case that as long as the FSB speeds match, there's no reason why your
CPUs even need to be running at the same internal clock rate, on the
current x86 architecture.
I think a configuration option for gmond to divide the number of CPUs by
some value before reporting would be a decent, low-effort method of
allowing people to report what they want.
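
To make the divide-before-reporting idea concrete, here is a minimal
sketch in Python. The name cpu_divisor is hypothetical; it is not an
existing gmond option, and the function only illustrates the arithmetic
such an option might apply.

```python
def reported_cpu_count(logical_cpus: int, cpu_divisor: int = 1) -> int:
    """Return the CPU count to report after applying a divisor.

    cpu_divisor is a hypothetical setting, not a real gmond option.
    A value of 2 would suit nodes where every physical CPU shows up
    as two logical CPUs because of hyperthreading.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    if cpu_divisor < 1:
        raise ValueError("cpu_divisor must be at least 1")
    # Round down, but never report fewer than one CPU.
    return max(1, logical_cpus // cpu_divisor)
```

On a dual Xeon with hyperthreading, reported_cpu_count(4, 2) gives 2,
while the default divisor of 1 keeps the current behavior.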
-- Brooks
I have not looked at the method ganglia uses to gather the CPU count on
Linux, but I do know that on a 2.4.21 Linux kernel, /proc/cpuinfo shows
hyperthreaded CPUs as sharing the same physical id and runqueue numbers.
I agree with Brooks that this should be configurable, but I would not
recommend dividing the number of CPUs; rather, count the unique physical ids.
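
A rough sketch of counting unique physical ids, in Python rather than
ganglia's own C. It assumes /proc/cpuinfo-style text with "processor" and
"physical id" fields; the function name and the sample text are made up
for illustration, and this is not how ganglia currently gathers its count.

```python
from typing import Tuple


def count_cpus(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (logical, physical) CPU counts from /proc/cpuinfo-style text."""
    logical = 0
    physical_ids = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line between processor entries
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_ids.add(value.strip())
    # Fall back to the logical count if no "physical id" field is present.
    physical = len(physical_ids) if physical_ids else logical
    return logical, physical


# Trimmed, hypothetical sample: two physical packages, two siblings each.
SAMPLE = """\
processor\t: 0
physical id\t: 0
processor\t: 1
physical id\t: 0
processor\t: 2
physical id\t: 3
processor\t: 3
physical id\t: 3
"""
```

For the sample above, count_cpus(SAMPLE) returns (4, 2). On a real node
you would pass in the contents of /proc/cpuinfo instead.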
I also agree that just reducing the number of cpus doesn't represent the
situation correctly. If you have 2 running processes, one heavy and one
light, on a single HT CPU, and ganglia shows 1 CPU, your load
report will be inaccurate: you will have crossed the red line on the
chart but not in reality. Tricksy.
Ian