It might be worth checking the disk I/O. In practice, disk I/O is the main
constraint when monitoring clusters/grids.
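
One quick way to check is to look at the cpu columns of vmstat: "wa" is the
share of time the CPU sat idle waiting on disk I/O, while "us" and "sy" are
user and system time. Below is a minimal Python sketch that averages those
columns. It assumes the 17-column AIX vmstat layout shown in the quoted
message below; the function name summarize_vmstat is made up for
illustration and is not part of Ganglia or AIX.

```python
# Sketch only: averages a few columns from AIX-style vmstat rows.
# Assumed column order (17 fields), as in the quoted vmstat output:
#   r b avm fre re pi po fr sr cy in sy cs us sy id wa

def summarize_vmstat(rows):
    """Return the average run queue, blocked threads and cpu percentages."""
    samples = []
    for row in rows:
        # Drop a leading "> " quote marker and surrounding blanks, if any.
        fields = row.lstrip("> ").split()
        # Skip headers, separators and anything that is not 17 integers.
        if len(fields) != 17 or not all(f.isdigit() for f in fields):
            continue
        values = [int(f) for f in fields]
        samples.append({
            "r": values[0],    # runnable kernel threads
            "b": values[1],    # kernel threads blocked in the wait queue
            "us": values[13],  # user cpu %
            "sy": values[14],  # system cpu %
            "id": values[15],  # idle cpu %
            "wa": values[16],  # idle while waiting for disk I/O, %
        })
    if not samples:
        raise ValueError("no vmstat data rows found")
    return {key: sum(s[key] for s in samples) / len(samples)
            for key in samples[0]}


# The three data rows from the quoted vmstat snippet.
rows = [
    " 5  1 304078  4075   0   0   0   0    0   0 285 22896 1079 90 10  0  0",
    " 6  1 303783  4373   0   0   0   0    0   0 249 2038 787 98  2  0  0",
    " 6  0 304004  4145   0   0   0   0    0   0 239 17285 1001 92  8  0  0",
]
summary = summarize_vmstat(rows)
print(summary)
```

A consistently high "wa" together with a non-zero "b" would point at the
disks; a "wa" near zero with "us" close to 100 would point at CPU instead.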

On Dec 12, 2007 5:42 PM, Markus Reusch <[EMAIL PROTECTED]> wrote:

> Hi list(eners),
>
> We use Ganglia to monitor about 20-30 AIX systems. All data is collected
> on 2 LPARs which run gmetad, rrdtool and Apache, so it is accessible
> to users through the supplied PHP web frontend.
> Our RRD database is currently about 400 MB in size and consists of
> about 4800 files. I have no idea whether this size or number of files is
> a problem at all.
> Every day, several users connect to the web frontend and open those
> graphs, which results in some rrdtool calls that exhaust all the
> available CPU time of the machine, no matter whether it is just 4 rrdtool
> processes or 20 of them. The machine is already at its limit when
> working with 4 rrdtool calls. When about 20 of them are started, a
> two-digit number of kernel threads is queued...
> When all this occurs, you cannot type properly in your ssh
> session, and even some shell scripts called by cron have
> problems and give delayed output. This is not peak behaviour; it is
> constantly like this.
>
> The LPAR has AIX 5.3 with 1 x 1.7 GHz CPU and 2 GB RAM. VMM is tuned and
> the machine is usually not paging at all. Not a strong machine, but we
> thought it sufficient for Ganglia + rrdtool + Apache.
>
> Other applications on the machine are not noticeable, as you can see
> (topas):
> Name            PID  CPU%  PgSp Owner
> rrdtool      458924  26.0   0.5 nobody
> rrdtool      966772  25.4   0.4 nobody
> rrdtool     1302770  23.0   3.6 root
> rrdtool     1183858  22.4   0.4 nobody
> gmetad      1110100   8.2   5.3 nobody
> topas       1159188   0.6   1.7 root
> gmond        352448   0.1   1.7 nobody
> java         848106   0.0  16.2 root
> sched         12294   0.0   0.4 root
> clstrmgr     405650   0.0  22.5 root
>
>
>
> A snippet from vmstat so you can see for yourself (currently 4 rrdtool
> processes are running):
> kthr    memory              page              faults        cpu
> ----- ----------- ------------------------ ------------ -----------
>  r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
>  5  1 304078  4075   0   0   0   0    0   0 285 22896 1079 90 10  0  0
>  6  1 303783  4373   0   0   0   0    0   0 249 2038 787 98  2  0  0
>  6  0 304004  4145   0   0   0   0    0   0 239 17285 1001 92  8  0  0
>
>
> Here is a sample rrdtool process as shown by ps:
> /usr/bin/rrdtool graph --start 1197456160 --end 1197459760 --width 300
> --height 75 --title weight - DEF:sum=/var/lib/ganglia/rrds/PL1650 Linie
> SRZ/<somehostname>/weight.rrd:sum:AVERAGE AREA:sum#0000ff:<somehostname>
> last hour (now -1.00)
>
> My questions:
> Is it normal behaviour/configuration that so many rrdtool processes are
> started when people send requests to the web frontend?
> Do you dedicate a "big" box just to running Ganglia + rrdtool + Apache,
> or does it run smoothly even on a small Intel/Linux box?
>
> Any suggestions/hints are welcome, thank you in advance.
>
> Greetings
> Markus
>
> -------------------------------------------------------------------------
> SF.Net email is sponsored by:
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> Ganglia-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/ganglia-general
>



-- 
~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=
Regards,
Aroop
~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=