On Tue, Sep 02, 2008 at 12:54:07PM -0400, Ofer Inbar wrote:
> Brad Nicholes <[EMAIL PROTECTED]> wrote:
> > Thanks Carlo, this is some good feedback. I know that both Bernard
> > and Cos have reported having issues with this bug. Could either (or
> > both) of you independently confirm that this patch fixes the problem?
>
> To reproduce this bug, I'd need a host in a state where it accepts TCP
> connections but then leaves them hung, which is not something I want
> to do on any of my production hosts,

It shouldn't be a problem at all if your failover sources are set up
correctly anyway. You don't need to crash the machine; just stop the
gmond process by running something like:

  # kill -STOP `pidof gmond`

To fix it after you are done, you can do:

  # kill -CONT `pidof gmond`

You will need a patched gmetad though. It doesn't need to be the same
one you have in production, even if I'd expect you to roll it out there
quickly if this problem is really a showstopper for your 3.1 production
deployment, as Brad seemed to think.

> If anyone out there on the list has a way to set up a Ganglia
> testing cluster and then deliberately put one of the data sources in
> his state, wanna test out this patch?

That is what I did, but I have to admit that my test environment was
tiny: I only used 1 Linux box (my Gentoo Linux workstation) and 1
Windows box (a Windows Vista box where I build my Windows Ganglia
binaries), configured together in one single cluster running 3.1. The
failover source wasn't set up correctly though, as I don't have a way to
synchronize the clocks between the two, they are in different VLANs, and
my little Linksys switch can't do multicast routing.

Brad is probably looking for someone else to come up with a more
realistic, production-like test, but if no one can do that, I might be
able to configure one by moving around some cables and setting up a more
realistic failover scenario (running Linux on the Windows box), even if
that probably defeats the "independent confirmation" part of the testing
request.

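
For anyone who wants to script the suspend/resume steps above, here is a minimal sketch in shell. The function names (suspend_proc, resume_proc, proc_state) are made up for illustration and are not part of Ganglia. The state check reads /proc on Linux and falls back to ps elsewhere, which is an assumption about the test box rather than something I have tried on every platform.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not Ganglia tools.

# suspend_proc PID: freeze the process with SIGSTOP. The kernel should
# keep completing TCP handshakes on its listening sockets, but the
# process itself never answers, so clients are left hanging.
suspend_proc() {
    kill -STOP "$1"
}

# resume_proc PID: let the process run again with SIGCONT.
resume_proc() {
    kill -CONT "$1"
}

# proc_state PID: print the one-letter process state (T means stopped).
proc_state() {
    if [ -r "/proc/$1/status" ]; then
        # Linux: the line looks like "State:  T (stopped)"
        sed -n 's/^State:[[:space:]]*\(.\).*/\1/p' "/proc/$1/status"
    else
        # Elsewhere: take the first character of the ps "stat" column
        ps -o stat= -p "$1" | tr -d ' ' | cut -c1
    fi
}
```

For example, run suspend_proc "$(pidof gmond)" before pointing the patched gmetad at the host, check that proc_state reports T, and run resume_proc "$(pidof gmond)" when you are done. This assumes a single gmond process, since pidof can print several PIDs.
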
Carlo

