On Mon, Nov 30, 2009 at 08:12:34AM +0000, Daniel Pocock wrote:
>
> Carlo Marcelo Arenas Belon wrote:
>> On Sun, Nov 29, 2009 at 10:57:01AM +0000, Carlo Marcelo Arenas Belon wrote:
>>
>>> On Tue, Nov 24, 2009 at 06:03:51PM -0800, Bernard Li wrote:
>>>
>>>> Please help us test on as many OS/archs as possible, as this would go
>>>> GA quite immediately ;-)
>>>>
>>> FreeBSD is not able to return any XML data through TCP/8649 (tested with
>>> FreeBSD 8.0 amd64).
>>
>> the problem wasn't actually the TCP/8649 service but the fact that gmond
>> was going into an infinite loop after sending the first metric update.
>>
>> the issue was tracked down to r2043 and a 3.1.5 development package with
>> that patch reverted is available for testing from :
>>
>> http://sajino.sajinet.com.pe/ganglia/ganglia-3.1.5.2101.tar.gz
>>
> Did you see this issue with 3.1.3 or 3.1.4? They both contain the same
> patch.

Both 3.1.3 and 3.1.4 should have the same problem, but I haven't been able
to test 3.1.3 since it is no longer available (FreeBSD 8 was just released
a couple of days ago anyway).

3.1.4 shows the same behavior, at least there, and the "fixed" package also
seems to work fine with OpenBSD 4.4 amd64, NetBSD 4 i386 and DragonFlyBSD
2.4.1 i386 and amd64 (after also being patched with r2124 to work around
BUG245).

>>> DragonFlyBSD fails to build but a 3.2 version of ganglia which includes
>>> fixes for that fails with the same TCP issue than FreeBSD and so this
>>> issue might be affecting other BSD as well.
>>
>> confirmed also to be affecting OpenBSD (tested with OpenBSD 4.5 amd64)
>> but considering the nature of the "fix" wouldn't be surprised if other
>> configurations were also affected.
>>
> Are you proposing a fix or just revert the change?

Your call, even though a fix for this feature would probably be preferred:
there is nothing special about the BSDs that would explain why only they
are affected, so the problem might be more generic.

At least a revert would be needed for 3.1, as this counts as a regression,
but I haven't done that either. I was waiting for you to first revert it on
trunk and then decide how to proceed from there, depending on how critical
this feature was for the release.

> The change has been working on Linux, Solaris and Cygwin.

Other than doing a manual bisect (using git instead of svn would have been
useful here) to find where the problem was introduced, and validating that
reverting it corrects the problem, I haven't done much analysis of it.

The fact that it broke in such a strange way probably points to a bigger
issue which just happens not to have been visible on the configurations
used to test Linux, Solaris and Cygwin. I was indeed expecting the culprit
to be somewhere else, especially considering all the recent changes in the
networking and the fact that it originally seemed to be triggered by a TCP
request. That seems even more likely considering how pervasive it was (it
broke every BSD I had access to test, at least).

Carlo

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers