Bugs item #1266927, was opened at 2005-08-23 02:33
Message generated for change (Comment added) made by bernardli
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=1266927&group_id=9368

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Packages
Group: 4.2
Status: Open
>Resolution: Accepted
Priority: 9
Submitted By: Erich Focht (efocht)
>Assigned to: Bernard Li (bernardli)
Summary: ganglia test fails if other gmonds around

Initial Comment:
My ganglia tests were failing systematically. Debugging showed the reason: I had another master node in the same network and the test was detecting more hosts than expected.

This can be fixed by making the test smarter, so that it detects whether the cluster's own nodes are monitored and ignores any other nodes.

----------------------------------------------------------------------

>Comment By: Bernard Li (bernardli)
Date: 2005-08-29 21:57

Message:
Logged In: YES
user_id=879102

After the teleconference I discussed this with Erich and we decided to add another Ganglia test, which checks whether the number of active gmonds is the same as the number of gmonds expected to be running. If it is not, this test will fail with an error message (ganglia.err) saying that a gmond from a server outside the cluster may be running on the same network, and the user can decide what to do about it.

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-29 01:33

Message:
Logged In: YES
user_id=338721

The first question here is: why should a test fail if ganglia actually works? The new test fixes this issue.

The second question, which you raise additionally, is: why should somebody have multiple clusters in the same subnet? Actually the clusters don't need to be in the same subnet, they need to be on the same switch. If I have a 48 port Gbit switch and want to connect a 16 node Xeon and a 16 node Opteron cluster to it, I would feel extremely limited if OSCAR folks would tell me I shouldn't.

Whether I feel disturbed by multiple hosts showing up, and how I solve this new problem, is another issue (for example by defining VLANs on my switch or by more carefully crafting the ganglia configuration) which could be the subject of an RFE.

IMO the two questions are related but need to be solved separately.

----------------------------------------------------------------------

Comment By: Bernard Li (bernardli)
Date: 2005-08-28 22:27

Message:
Logged In: YES
user_id=879102

I'm starting to think this may not be the best solution... it begs the question - why do you have another gmond running in the same subnet that is not part of the cluster?

Although this allows the test to pass, when you bring up the Ganglia page it will still show the 'renegade' node - this may be confusing to the user.

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-26 15:56

Message:
Logged In: YES
user_id=338721

Merged to branch (svn r3607).

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2005-08-26 13:26

Message:
Logged In: YES
user_id=505737

Ok, fix it.

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-24 04:41

Message:
Logged In: YES
user_id=338721

Fix this in branch, too?

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-23 02:40

Message:
Logged In: YES
user_id=338721

Fixed in trunk (svn 3571).
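
For illustration only, the two checks discussed in this thread (are all of the cluster's own nodes monitored, and does the number of reporting gmonds match the expected number) might be sketched as follows in Python. This is not the actual OSCAR test script, which is not shown in this thread; the function names, the host names, and the simplified XML sample are assumptions, modelled on the HOST elements that gmond emits in its XML dump.

```python
import xml.etree.ElementTree as ET


def reported_hosts(gmond_xml):
    """Return the set of host names found in a gmond XML dump."""
    root = ET.fromstring(gmond_xml)
    return {host.get("NAME") for host in root.iter("HOST")}


def check_cluster(expected, reported):
    """Split the comparison into missing own nodes and foreign hosts."""
    expected = set(expected)
    missing = expected - reported   # own nodes that are not monitored
    foreign = reported - expected   # hosts that are not part of this cluster
    return missing, foreign


# Simplified, hypothetical sample: two cluster nodes plus one outside host.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
  <CLUSTER NAME="oscar">
    <HOST NAME="oscarnode1"/>
    <HOST NAME="oscarnode2"/>
    <HOST NAME="othermaster"/>
  </CLUSTER>
</GANGLIA_XML>"""

expected_nodes = ["oscarnode1", "oscarnode2"]
reported = reported_hosts(SAMPLE_XML)
missing, foreign = check_cluster(expected_nodes, reported)

# Check 1: passes as long as every node of the own cluster is monitored.
monitoring_ok = not missing

# Check 2: fails if the number of reporting gmonds differs from the
# expected number, hinting at a gmond from outside the cluster.
count_matches = len(reported) == len(expected_nodes)

if not count_matches:
    print("warning: possible foreign gmond(s):", sorted(foreign))
```

With this sample the first check passes while the second one fails, which mirrors the situation described in the initial comment: ganglia works for the cluster, yet an additional host is visible on the network.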

----------------------------------------------------------------------

_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
