On Thu, Dec 17, 2009 at 01:58:31PM -0500, Douglas Wade Needham wrote:
> 
> Here are the details:
>       Version:        3.1.2
>       Host:           VM running Debian Etch (1 VCPU, 512 MB) w/ 2.6.26-2-xen-amd64 kernel
>         HW Host:      Dual AMD Opteron 244's running at 1.8GHz

I assume "Host" here really means the virtual server allocated for this
task, not to be confused with the "HW Host" listed below it, which also
runs Debian as the virtual server host, and that all of them are stock
Debian Etch. Correct?

I have seen severe IO starvation issues (including panics) with the stock
Debian guest kernel, which were only resolved after upgrading it to 2.6.32.
Here "guest" refers to the kernel used in the virtual machine, not to the
kernel on the server that hosts all the VMs, which would be more difficult
to upgrade.

> When we have a program connect across our 1Gbps network connection to
> this gmetad, we end up with very gappy data, if the hosts don't just
> get marked as down and the RRDs stop updating.  I have already started
> pressuring those who would approve moving our RRDs to a memory fs,
> but in the meantime... :(

IMHO this is your only solution, considering the IO issues that VMs have
and the way gmetad scales and uses RRD (which is also very IO unfriendly),
even though it has been mentioned that rrdcached could help somewhat.

> I have been running straces which have indicated that we occasionally
> have threads which block on the futex() call for 10+ seconds, and
> occasionally for as long as 500+ seconds.  To limit the impact of the
> strace (which itself can cause the same problem), I even had to do:
> 
>       strace -f -tt -T -s 160 -e trace=process,futex,signal -o gmetad.strace.out2 -p 12618

interesting, and most likely IO related I suspect, but you would probably
get a better picture by using a pthread-specific tracer such as mutrace
instead (warning: fairly new code, and only packaged for Fedora 12 AFAIK):

  http://git.0pointer.de/?p=mutrace.git

> But in doing this, I have come up with the following questions:
> 
> 1) Is there any difference between '-d 1' and '-d 10'?  Or between
>    'debug 1' and 'debug 10' in the config file?
> 
>    In looking through the code, it does not seem to be the case.  I
>    would just like confirmation.

not for gmetad AFAIK, but there are several arbitrary uses of "debug_level"
which usually mean you want to use the highest level possible most of the
time anyway.
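
To illustrate how two non-zero levels can end up behaving the same: if a
piece of code only ever tests the level for truth, 1 and 10 are
indistinguishable, and only a threshold-style test tells them apart. A
minimal sketch in C follows. All names in it (this debug_level variable,
debug_enabled, debug_at_least) are hypothetical and for illustration only;
it is not copied from the gmetad source.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a global debug setting; not gmetad's code. */
static int debug_level = 0;

/* Truth-style check: every non-zero level takes the same branch. */
static bool debug_enabled(void)
{
    return debug_level != 0;
}

/* Threshold-style check: only this kind of test distinguishes levels. */
static bool debug_at_least(int wanted)
{
    return debug_level >= wanted;
}
```

With checks of the second kind scattered around and using arbitrary
thresholds, picking the highest level is the simplest way to be sure that
every message is enabled.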

> 2) Am I seeing correctly that we have the following pthread_mutex
>    definitions?
> 
>    - server_socket_mutex
>    - server_interactive_mutex
>    - Allocated mutex for root summary.
>    - Allocated mutex for each grid partial-summary (1 per data source)
>    - Allocated mutex for each cluster partial summary (1+ per data source)

there is also an rrd_mutex for updating the RRD, and I would recommend keeping
away from the multiple summary mutexes if you want to keep your sanity.
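
As a rough illustration of why one mutex around RRD updates can produce
long futex() waits when IO is slow, here is a minimal sketch in C. It is
not gmetad code: rrd_mutex is a local stand-in that only borrows the name,
slow_rrd_update, worker and run_workers are hypothetical helpers, and the
sleep merely simulates a slow disk write done while the lock is held.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

#define MAX_WORKERS 16

/* Local stand-in for a global update lock; only the name is borrowed. */
static pthread_mutex_t rrd_mutex = PTHREAD_MUTEX_INITIALIZER;
static int updates_done = 0;

/* Simulate an update that does slow IO while holding the lock. */
static void slow_rrd_update(long io_delay_ms)
{
    struct timespec delay = { io_delay_ms / 1000,
                              (io_delay_ms % 1000) * 1000000L };

    pthread_mutex_lock(&rrd_mutex);
    nanosleep(&delay, NULL);        /* stands in for the disk write */
    updates_done++;
    pthread_mutex_unlock(&rrd_mutex);
}

static void *worker(void *arg)
{
    slow_rrd_update(*(const long *)arg);
    return NULL;
}

/* Start n threads doing one update each; return wall-clock seconds. */
static double run_workers(int n, long io_delay_ms)
{
    pthread_t threads[MAX_WORKERS];
    struct timespec start, end;
    int started = 0;

    if (n > MAX_WORKERS)
        n = MAX_WORKERS;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < n; i++) {
        if (pthread_create(&threads[i], NULL, worker, &io_delay_ms) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling run_workers(4, 50) takes at least about 200 ms rather than 50 ms,
because each thread has to wait for the previous holder to finish its
simulated write. With real disk latency in place of the sleep, the waiting
threads would show up in strace as blocked in futex().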

> 3) Would there be any interests in patches against 3.1.2 to watch
>    calls to pthread_mutex_lock() and pthread_mutex_unlock() to display
>    when a call took more than a certain amount of time to return, or
>    if a lock was held for longer than a certain time??

definitely interesting, preferably if it is enabled at compile time to
avoid any added performance degradation and race conditions of its own,
but probably OK too if it is only enabled at run time through "debug" mode.
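
One possible shape for such instrumentation, as a minimal sketch in C under
stated assumptions: the timed_mutex type, the MUTEX_TIMING compile-time
switch and the MUTEX_WARN_SECS threshold are all hypothetical names made up
for this example, and nothing here is taken from the gmetad source. The
wrappers report how long a lock took to acquire and how long it was held.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical compile-time switch: build with -DMUTEX_TIMING=0 to
 * compile the instrumentation out entirely. */
#ifndef MUTEX_TIMING
#define MUTEX_TIMING 1
#endif

/* Hypothetical warning threshold, in seconds. */
#define MUTEX_WARN_SECS 1.0

typedef struct {
    pthread_mutex_t mutex;
    struct timespec acquired;   /* written and read only by the holder */
    const char *name;           /* label used in warnings */
} timed_mutex;

static double elapsed_secs(const struct timespec *from)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - from->tv_sec)
         + (double)(now.tv_nsec - from->tv_nsec) / 1e9;
}

static int timed_mutex_init(timed_mutex *m, const char *name)
{
    m->name = name;
    return pthread_mutex_init(&m->mutex, NULL);
}

/* Lock and return the seconds spent waiting (0.0 when timing is off).
 * Error returns from pthread_mutex_lock() are ignored for brevity. */
static double timed_mutex_lock(timed_mutex *m)
{
#if MUTEX_TIMING
    struct timespec start;
    double waited;

    clock_gettime(CLOCK_MONOTONIC, &start);
    pthread_mutex_lock(&m->mutex);
    waited = elapsed_secs(&start);
    clock_gettime(CLOCK_MONOTONIC, &m->acquired);
    if (waited > MUTEX_WARN_SECS)
        fprintf(stderr, "mutex %s: waited %.3f s to lock\n", m->name, waited);
    return waited;
#else
    pthread_mutex_lock(&m->mutex);
    return 0.0;
#endif
}

/* Unlock and return the seconds the lock was held (0.0 when timing is off). */
static double timed_mutex_unlock(timed_mutex *m)
{
#if MUTEX_TIMING
    double held = elapsed_secs(&m->acquired);

    pthread_mutex_unlock(&m->mutex);
    if (held > MUTEX_WARN_SECS)
        fprintf(stderr, "mutex %s: held for %.3f s\n", m->name, held);
    return held;
#else
    pthread_mutex_unlock(&m->mutex);
    return 0.0;
#endif
}
```

The hold timestamp is only touched while the lock is held, so the wrapper
adds no shared state of its own that could race. The two clock_gettime()
calls per lock are the added cost, which is the reason for keeping it
behind a compile-time switch.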

beware though that trunk (where the patch would need to be applied first)
and 3.1 (where 3.1.2 comes from) might not be in sync on this code, which
has seen several changes lately.

it would also be interesting to see how a patched 3.0.7 (or the 3.0 branch
HEAD) would perform in this case as an alternative.

there is also a python version of gmetad in trunk which might help with
what you are doing.

> This last one comes, as given my suspicions on thread starvation, I am
> going to have to instrument a gmetad a bit more to look at the mutexes
> and how long we are in critical sections.

beware gmetad code is a little "rusty" so report back if you see anything
else that doesn't look quite right.

Carlo

PS. this thread might be better suited to ganglia-developers.

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general