On Fri, Aug 29, 2008 at 02:40:00PM -0400, Ofer Inbar wrote:
> Carlo Marcelo Arenas Belon <[EMAIL PROTECTED]> wrote:
> > > Is this Ganglia 3.0.x or 3.1.0, or both...?
> > 
> > both :
> > 
> >   http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=92
> 
> Huh

Indeed!

> this bug is from March 2007, and someone added a patch back then
> to work around it.

Can't believe it is 2008 already, but the bug was actually from Apr 2006
and the workaround as you pointed out was from March 2007.

The workaround didn't fix the problem, though; it is more of a new feature
and will be tracked as such in BUG208.

> Should this have made it into 3.1, or 3.1.1?  It
> doesn't look like it.

There is a fix in trunk now with r1738 and, unless something goes wrong with
it, it will most likely be released with 3.1.2 and 3.0.8.

3.1.1 is already in testing and since this bug is not a showstopper for that
specific release, I'd be surprised if the release manager decides it should
be backported to it, but that shouldn't prevent you patching your own package
with the proposed patch if you don't want to wait.

> Also, 
> 
> | ------- Additional Comment #1 From Timothy Witham 2007-03-29 09:01
> | Created an attachment (id=56)
> | gmetad patch to fix bugzilla #92
> | 
> | A quick hack is to pick a random host from the list, which is what
> | this patch does.  It resolves the problem, but might not be ideal.
> | The documentation might need to be fixed since the sources are no
> | longer tried in order.
> | 
> | ------- Additional Comment #2 From Timothy Witham 2008-06-05 10:38
> | My patch still loses if we are talking to a gmond affected by Bug#38.
> | In that case, we receive incomplete data, but since it is some data,
> | we keep talking to that host every time.  Maybe we should just talk to
> | a random host every time.  Better to fix Bug#38 though...
> 
> Bug#38 puts a gmond effectively in deaf mode without logging errors if
> there's a network reconfig.

not quite: from what I recall, the error log will fill up with messages from
gmond about not being able to send the multicast updates.

> That would have the effect of gmetad
> getting incomplete, old data from that gmond, but that seems to be a
> different problem.

yes, and that is why it has a different bug.

> To solve it in gmetad, we'd want gmetad to have
> some way of judging whether the data it's getting from a gmond is
> fresh and current, which is not the same as judging whether it
> actually *got* the data from the gmond.

the problematic code was introduced as a fix for BUG27 and was indeed
trying to detect if gmond was able to use the source or not by looking
at the obvious lack of TCP connectivity.  BUG92 showed that the heuristic
was incomplete because it didn't cover a gmond or system that is hung but
still responds to a TCP three-way handshake (which I'd have to guess is not
that frequently observed, or at least less frequently observed than a
completely unresponsive source).
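
To make the distinction concrete, here is a minimal sketch in Python (not
the gmetad code, which is C, and not the r1738 patch).  It assumes a source
that writes its dump and then closes the connection.  The names
fetch_from_source and poll_first_usable and the timeout values are made up
for illustration.  A source counts as failed if the connect fails or if the
handshake succeeds but the data never arrives in full.

```python
import socket


def fetch_from_source(host, port, connect_timeout=2.0, read_timeout=2.0):
    """Return the bytes sent by one source, or None if it looks unusable."""
    try:
        conn = socket.create_connection((host, port), timeout=connect_timeout)
    except OSError:
        # No TCP connectivity at all: the case a connect-only check catches.
        return None
    chunks = []
    with conn:
        conn.settimeout(read_timeout)
        try:
            while True:
                data = conn.recv(4096)
                if not data:
                    break  # peer closed the connection: the dump is complete
                chunks.append(data)
        except OSError:
            # The handshake succeeded but the peer stalled: a hung source.
            return None
    return b"".join(chunks) or None


def poll_first_usable(sources, **timeouts):
    """Try sources in order; return (source, payload) for the first usable one."""
    for host, port in sources:
        payload = fetch_from_source(host, port, **timeouts)
        if payload is not None:
            return (host, port), payload
    return None, None
```

With a connect-only check the first source in the list would be kept
forever as long as its kernel answers the handshake; with the read timeout
the loop moves on to the next source in the list.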

> IOW, I don't think a fix to this bug should wait on being able to also
> handle deaf gmond's more intelligently.

agree, and that is why handling gmond more intelligently will be done as part
of BUG208 instead.
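
Purely as an illustration of what "judging freshness" could mean (this is
not what BUG208 will implement, just a sketch under assumptions): the code
below assumes the dump contains HOST elements carrying NAME, TN (seconds
since the host was last heard from) and TMAX attributes.  The factor of 4
and the 50% threshold are arbitrary choices, and the function names are
hypothetical.

```python
import xml.etree.ElementTree as ET


def stale_hosts(xml_text, factor=4):
    """Return (names of stale hosts, total host count) for one XML dump.

    Assumption: a host is stale once TN exceeds factor * TMAX.
    """
    root = ET.fromstring(xml_text)
    hosts = list(root.iter("HOST"))
    stale = []
    for host in hosts:
        tn = int(host.get("TN", "0"))
        tmax = int(host.get("TMAX", "0"))
        if tmax > 0 and tn > factor * tmax:
            stale.append(host.get("NAME"))
    return stale, len(hosts)


def looks_fresh(xml_text, max_stale_fraction=0.5, factor=4):
    """Treat a dump as fresh if at most max_stale_fraction of hosts are stale."""
    stale, total = stale_hosts(xml_text, factor)
    if total == 0:
        return False  # an empty dump carries no evidence of freshness
    return len(stale) / total <= max_stale_fraction
```

A poller could call looks_fresh() on the payload it received and fall
through to the next source when it returns False, which is a separate
question from whether the TCP connection delivered any data at all.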

Carlo

_______________________________________________
Ganglia-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ganglia-general