On Mon, Nov 30, 2009 at 08:12:34AM +0000, Daniel Pocock wrote:
>
> Carlo Marcelo Arenas Belon wrote:
>> On Sun, Nov 29, 2009 at 10:57:01AM +0000, Carlo Marcelo Arenas Belon wrote:
>>   
>>> On Tue, Nov 24, 2009 at 06:03:51PM -0800, Bernard Li wrote:
>>>     
>>>> Please help us test on as many OS/archs as possible, as this would go
>>>> GA quite immediately ;-)
>>>>       
>>> FreeBSD is not able to return any XML data through TCP/8649 (tested with
>>> FreeBSD 8.0 amd64).
>>
>> the problem wasn't actually the TCP/8649 service but the fact that gmond
>> was going into an infinite loop after sending the first metric update.
>>
>> the issue was tracked down to r2043 and a 3.1.5 development package with
>> that patch reverted is available for testing from :
>>
>>   http://sajino.sajinet.com.pe/ganglia/ganglia-3.1.5.2101.tar.gz
>>   
> Did you see this issue with 3.1.3 or 3.1.4?  They both contain the same  
> patch.

Both 3.1.3 and 3.1.4 should have the same problem, but I haven't been able
to test 3.1.3 since it is no longer available (FreeBSD 8 was only released
a couple of days ago anyway).  3.1.4 at least shows the same behavior
there, and the "fixed" package also seems to work fine with OpenBSD 4.4
amd64, NetBSD 4 i386 and DragonFlyBSD 2.4.1 i386 and amd64 (after also
being patched with r2124 to work around BUG245).

>>> DragonFlyBSD fails to build but a 3.2 version of ganglia which includes
>>> fixes for that fails with the same TCP issue than FreeBSD and so this
>>> issue might be affecting other BSD as well.
>>
>> confirmed also to be affecting OpenBSD (tested with OpenBSD 4.5 amd64)
>> but considering the nature of the "fix" wouldn't be surprised if other
>> configurations were also affected.
>>   
> Are you proposing a fix or just revert the change?

Your call, even though a fix for this feature would probably be preferred:
there is nothing special about the BSDs that would explain why only they
are affected, so the problem might well be more generic.

At the very least a revert is needed for 3.1, as this counts as a
regression, but I haven't done that yet; I am waiting for you to revert it
on trunk first and then decide how to proceed from there, depending on how
critical this feature was for the release.

> The change has been working on Linux, Solaris and Cygwin.

Other than doing a manual bisect (using git instead of svn here would have
been useful) to find where the problem was introduced, and validating that
reverting it corrects the problem, I haven't done much analysis of it.  But
the fact that it broke in such a strange way (I was indeed expecting the
culprit to be somewhere else, especially considering all the recent changes
in the networking code and the fact that it originally seemed to be
triggered by a TCP request) probably points to a bigger issue which just
happens not to have been visible on the configurations used to test Linux,
Solaris and Cygwin, especially considering how pervasive it was (it broke
every BSD I had access to test, at least).

Carlo

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
