Bugs item #1266927, was opened at 2005-08-23 02:33
Message generated for change (Comment added) made by bernardli
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=1266927&group_id=9368

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Packages
Group: 4.2
Status: Open
>Resolution: Accepted
Priority: 9
Submitted By: Erich Focht (efocht)
>Assigned to: Bernard Li (bernardli)
Summary: ganglia test fails if other gmonds around

Initial Comment:
My ganglia tests were failing systematically. Debugging 
showed the reason: I had another master node in the 
same network and the test was detecting more hosts than 
expected. 
 
This can be fixed by making the test smarter, so that it 
checks whether the cluster's own nodes are monitored and 
ignores any other nodes. 
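
A minimal sketch of such a check, not the actual OSCAR test code. It
assumes the test already has a gmond XML dump as a string and a list of
the cluster's own host names. The function names, the sample XML and the
host names are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def reported_hosts(gmond_xml):
    """Return the set of host names found in a gmond XML dump."""
    root = ET.fromstring(gmond_xml)
    return {host.get("NAME") for host in root.iter("HOST")}


def missing_cluster_hosts(expected, gmond_xml):
    """Return the expected hosts that gmond does not report.

    Hosts reported by gmond but not in `expected` (for example nodes
    of another cluster on the same network) are simply ignored.
    """
    return set(expected) - reported_hosts(gmond_xml)


# Hypothetical sample data: two own nodes plus one foreign master node.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
  <CLUSTER NAME="oscar">
    <HOST NAME="oscarnode1" IP="10.0.0.1"/>
    <HOST NAME="oscarnode2" IP="10.0.0.2"/>
    <HOST NAME="othermaster" IP="10.0.0.100"/>
  </CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    missing = missing_cluster_hosts(["oscarnode1", "oscarnode2"], SAMPLE_XML)
    if missing:
        print("FAIL: not monitored: %s" % ", ".join(sorted(missing)))
    else:
        print("PASS")
```

With this approach the test only fails when one of the cluster's own
nodes is absent, however many foreign hosts show up.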

----------------------------------------------------------------------

>Comment By: Bernard Li (bernardli)
Date: 2005-08-29 21:57

Message:
Logged In: YES 
user_id=879102

After the teleconference I discussed this with Erich, and we
decided to add another Ganglia test that checks whether the
number of active gmonds matches the number of gmonds
expected to be running.

If it does not, the test fails with an error message
(ganglia.err) saying that a gmond from a server outside the
cluster may be running on the same network, and the user can
decide what to do about it.
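
A rough sketch of the count comparison described above, under the
assumption that the test already has the expected and the active host
names as lists. The function name and the message wording are
hypothetical, and writing the message to ganglia.err is left to the
caller.

```python
def check_gmond_count(expected_hosts, active_hosts):
    """Compare the number of active gmonds with the expected number.

    Returns (ok, message). `ok` is False when the counts differ, and
    `message` then explains the possible cause.
    """
    expected = set(expected_hosts)
    active = set(active_hosts)
    if len(active) == len(expected):
        return True, ""

    # Hosts seen on the network that are not part of this cluster.
    extra = sorted(active - expected)
    message = (
        "expected %d gmonds but found %d; a gmond from a server outside "
        "the cluster may be running on the same network"
        % (len(expected), len(active))
    )
    if extra:
        message += " (unexpected hosts: %s)" % ", ".join(extra)
    return False, message


if __name__ == "__main__":
    ok, message = check_gmond_count(
        ["oscarnode1", "oscarnode2"],
        ["oscarnode1", "oscarnode2", "othermaster"],
    )
    print("PASS" if ok else "FAIL: " + message)
```

Reporting the unexpected host names alongside the counts gives the user
something concrete to act on.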

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-29 01:33

Message:
Logged In: YES 
user_id=338721

The first question here is: why should a test fail if ganglia actually 
works? The new test fixes this issue. 
 
The second question which you raise additionally is: why should 
somebody have multiple clusters in the same subnet? Actually 
the clusters don't need to be in the same subnet, they need to 
be on the same switch. If I have a 48 port Gbit switch and want 
to connect a 16 node Xeon and a 16 node Opteron cluster to it, I 
would feel extremely limited if OSCAR folks would tell me I 
shouldn't. Whether I feel disturbed by multiple hosts showing up 
and how I solve this new problem is another issue (like for 
example defining VLANs on my switch or more carefully crafting 
the ganglia configuration) which could be the subject of an RFE. 
 
IMO the two questions are related but need to be solved 
separately. 

----------------------------------------------------------------------

Comment By: Bernard Li (bernardli)
Date: 2005-08-28 22:27

Message:
Logged In: YES 
user_id=879102

I'm starting to think this may not be the best solution... 
it begs the question - why do you have another gmond running
in the same subnet that is not part of the cluster?

Although this allows the test to pass, the 'renegade' node
will still show up when you bring up the Ganglia page -
this may be confusing to the user.

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-26 15:56

Message:
Logged In: YES 
user_id=338721

Merged to branch (svn r3607). 

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2005-08-26 13:26

Message:
Logged In: YES 
user_id=505737

Ok, fix it.

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-24 04:41

Message:
Logged In: YES 
user_id=338721

Fix this in branch, too? 

----------------------------------------------------------------------

Comment By: Erich Focht (efocht)
Date: 2005-08-23 02:40

Message:
Logged In: YES 
user_id=338721

Fixed in trunk (svn 3571). 

----------------------------------------------------------------------



_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
