Hello folks,

I've been investigating an issue with a product I work on that sends
metrics to Ganglia.  We provide memory metrics to gmond for each node in
our cluster and in some cases these metrics overlap with those provided by
gmond out of the box, e.g. memory and load.

What I've found is that numerous areas of Ganglia2 report memory usage
that appears to be off by a factor of 1024.  Specifically, memory used,
free, buffers, swap, etc. are reported in TB where the appropriate unit
would be GB, or MB in some cases.  I can provide a screenshot of this
issue, but it will need to be sent separately as the list has a cap of
40KB on email.

The other areas affected by this inconsistency are the cluster physical
view and the individual node physical view.  I can provide screenshots of
these as well if necessary.

I've tracked the root cause down to units: our product sends its memory
metrics to gmond in bytes, whereas Ganglia2 appears to expect memory
metrics to be in KB.  Here are two examples of that assumption in
show_node.php:

# The metrics we need for this node.
$mem_total_gb = $metrics['mem_total']['VAL']/1048576;

# Turning into MBs. A MB is 1024 bytes.
$swap_free=$metrics['swap_free']['VAL']/1024.0;
$swap_total=sprintf("%.1f", $metrics['swap_total']['VAL']/1024.0);

Another from physical_view.php:

# Divide by 1024^2 to get Memory in GB.
$Memory = sprintf("%.1f GB", cluster_sum("mem_total", $metrics)/(float)1048576);

There are similar issues in mem_report.php and mem_report.json which affect
the memory report I mentioned initially.  There are two basic issues as I
see it:

1) Ganglia2 does not interpret the unit that may be associated with memory
metrics.  It assumes the metrics will be in KB (as that is the behavior of
gmond), converts with that assumption in mind, and then graphs or reports
the result in a particular view.  If the metrics are in bytes, multiple
areas of Ganglia2 report inaccurate values.

2) Assuming that we modify our product to report memory metrics to gmond in
KB instead of bytes to resolve issue #1 and match gmond's default behavior,
we run into a different issue.  As was reported on a previous thread (
http://www.mail-archive.com/ganglia-general%40lists.sourceforge.net/msg07121.html)
the graph is displayed as millions of KB instead of the more intuitive
multiple of bytes.  In Ganglia2 today the gmond memory metrics appear as
millions of KB to represent GB instead of billions of bytes.  This is
certainly less of an issue than #1 (incorrect data vs. non-intuitive
display of correct data) but it is still an issue for us.
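
To make the factor of 1024 in issue #1 concrete, here is a rough sketch of
the arithmetic.  It is written in Python for brevity rather than the
frontend's PHP.  The 16 GB node and the function name are made up for
illustration; only the 1048576 divisor comes from show_node.php.

```python
KB_PER_GB = 1024 ** 2  # the 1048576 divisor used in show_node.php


def gb_as_frontend_computes(value):
    """Mimic the frontend: treat the raw metric as KB and divide by 1024^2."""
    return value / KB_PER_GB


actual_bytes = 16 * 1024 ** 3          # hypothetical node with 16 GB of RAM
reported_in_kb = actual_bytes // 1024  # what gmond's own metric would carry
reported_in_bytes = actual_bytes       # what our product currently sends

print(gb_as_frontend_computes(reported_in_kb))     # 16.0
print(gb_as_frontend_computes(reported_in_bytes))  # 16384.0, shown as 16 "TB"
```

The same 16 GB node therefore shows up as roughly 16 TB as soon as the
value arrives in bytes.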


With all of that being said, has anyone run into this before, or does
anyone have suggestions on how to approach this problem?  It would be
wonderful if the summary memory report were able to convert intelligently
based on the unit associated with the reported metric rather than assume
the metric is in KB.  Ideally it would be able to read whether the metric
is in "KB" or "B" or "GB", etc. and make the appropriate conversion.  I
imagine it would be more difficult to resolve issue #2, as the unit is just
a text field and these graphs are essentially dynamic based on their
presence in the RRD directory.
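
Roughly what I have in mind for the unit-aware conversion is sketched
below, again in Python rather than PHP.  The unit table, the function
names, and the idea that the unit string is available alongside the value
are all my assumptions for illustration, not existing Ganglia code.

```python
# Multipliers to bytes for the unit strings we might see.  This table is an
# assumption for illustration; it is not something Ganglia defines.
UNIT_TO_BYTES = {
    "B": 1,
    "BYTES": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def to_bytes(value, units):
    """Normalise a metric value to bytes using its reported unit string."""
    key = units.strip().upper()
    if key not in UNIT_TO_BYTES:
        raise ValueError(f"unrecognised memory unit: {units!r}")
    return value * UNIT_TO_BYTES[key]


def humanize(num_bytes):
    """Scale a byte count to the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for suffix in ("B", "KB", "MB", "GB", "TB"):
        if value < 1024 or suffix == "TB":
            return f"{value:.1f} {suffix}"
        value /= 1024


# The same 16 GB of memory, reported in KB (gmond) and in bytes (our product).
print(humanize(to_bytes(16777216, "KB")))
print(humanize(to_bytes(17179869184, "B")))
```

With something along these lines, both lines print the same figure
regardless of which unit the sender chose, and the display side of issue #2
would also be scaled to a sensible unit.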

I've been able to modify the appropriate PHP files in Ganglia2 to test a
resolution for #1, but patching the frontend locally is obviously not a
solution for our product going forward.  If Ganglia2 were adaptable to
different units for memory, that would be the best solution.

I hope my description of the issue makes sense.  Please let me know if I
can offer any more details on the problem and we can discuss further.

Best regards,
Jon