FWIW, here's the ganglia.err and the output from a basic 'gstat' cmd.


[EMAIL PROTECTED]:$ cat /tmp/ganglia.err
Client nodes: oscarnode1.oscardomain
Match pattern: headnode|
Number of hosts matched: 3
Gstat output:
CLUSTER INFORMATION
       Name: OSCAR Cluster
      Hosts: 2
Gexec Hosts: 0
 Dead Hosts: 0
  Localtime: Thu Nov 17 16:01:45 2005

CLUSTER HOSTS
Hostname                     LOAD                       CPU              Gexec
 CPUs (Procs/Total) [     1,     5, 15min] [  User,  Nice, System, Idle, Wio]

headnode
    2 (    0/  112) [  0.02,  0.07,  0.04] [   0.1,   0.0,   0.4,  98.5,   1.0] OFF
oscarnode1.oscardomain
    2 (    0/   55) [  0.07,  0.02,  0.08] [   0.9,   0.0,   0.3,  98.5,   0.3] OFF

The number of nodes expected is different from the number of nodes
detected.
Check to see if gmond is running on all your nodes and make sure that you
are not having any network issues.
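
As a cross-check on the message above, the summary block that gstat prints
can be compared against the node count by hand or with a few lines of
script. What follows is only a minimal sketch, not what the OSCAR Ganglia
test does: the function names parse_gstat_summary and check_hosts are
hypothetical, and the expected count of 2 (head node plus one client node)
is an assumption read off the listing above.

```python
# Minimal sketch: parse the "CLUSTER INFORMATION" block of gstat output.
# Function names and the expected host count are illustrative assumptions.

SAMPLE = """CLUSTER INFORMATION
       Name: OSCAR Cluster
      Hosts: 2
Gexec Hosts: 0
 Dead Hosts: 0
  Localtime: Thu Nov 17 16:01:45 2005

CLUSTER HOSTS
"""


def parse_gstat_summary(text):
    """Return the key/value pairs of the CLUSTER INFORMATION block."""
    summary = {}
    in_block = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "CLUSTER INFORMATION":
            in_block = True
            continue
        if in_block:
            if not stripped:
                break  # a blank line ends the summary block
            # Split on the first colon only, so the time value stays whole.
            key, sep, value = stripped.partition(":")
            if sep:
                summary[key.strip()] = value.strip()
    return summary


def check_hosts(summary, expected_hosts):
    """True if gstat reports the expected host count and no dead hosts."""
    hosts = int(summary["Hosts"])
    dead = int(summary["Dead Hosts"])
    return hosts == expected_hosts and dead == 0


if __name__ == "__main__":
    info = parse_gstat_summary(SAMPLE)
    print(info["Hosts"], info["Dead Hosts"], check_hosts(info, expected_hosts=2))
```

If a check like this passes while the Ganglia test still fails, the host
matching step may be a better place to look than gmond itself, though
that is a guess rather than a diagnosis.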





[EMAIL PROTECTED]:$ cat /tmp/gstat.output
CLUSTER INFORMATION
       Name: OSCAR Cluster
      Hosts: 2
Gexec Hosts: 0
 Dead Hosts: 0
  Localtime: Thu Nov 17 16:19:18 2005

There are no hosts running gexec at this time




 _________________________________________________________________________
  Thomas Naughton                                      [EMAIL PROTECTED]
  Research Associate                                   (865) 576-4184


On Thu, 17 Nov 2005, Thomas Naughton wrote:

Ok, after stopping/starting pbs_server, maui, and pbs_mom a few times, the
tests are running just fine... well, except for the Ganglia test. ;)



[EMAIL PROTECTED] testing]# ./test_cluster
Performing root tests...
Maui service check:maui                                        [PASSED]
Torque node check                                              [PASSED]
Torque service check:pbs_server                                [PASSED]
/home mounts                                                   [PASSED]

Preparing user tests...
Performing user tests...
SSH ping test                                                  [PASSED]
SSH server->node                                               [PASSED]
SSH node->server                                               [PASSED]
Ganglia setup test                                             [FAILED]
Torque default queue definition                                [PASSED]
Torque Shell Test                                              [PASSED]
PVM (via Torque)                                               [PASSED]
MPICH (via Torque)                                             [PASSED]
LAM/MPI (via Torque)                                           [PASSED]

There were issues running some user test scripts.  Please check your logs
located in /home/oscartst.

Run APItests...

Running Installation tests for pvm
[PASS]       2005-11-17T16:02:28Z   pvmd-path-ls.apt
[PASS]       2005-11-17T16:02:28Z   envvar-pvm_arch.apt
[PASS]       2005-11-17T16:02:28Z   envvar-pvm_root.apt
[PASS]       2005-11-17T16:02:28Z   pvmd-path-which.apt
[PASS]       2005-11-17T16:02:28Z   modulecmd-path-ls.apt
[PASS]       2005-11-17T16:02:28Z   pvm-module-list.apt
[PASS]       2005-11-17T16:02:28Z   pvm-module-show-pvm_rsh.apt
[PASS]       2005-11-17T16:02:28Z   pvm-module-show-pvm_arch.apt
[PASS]       2005-11-17T16:02:28Z   pvm-module-show-pvm_root.apt
[EMAIL PROTECTED] testing]#

--tjn

_________________________________________________________________________
 Thomas Naughton                                      [EMAIL PROTECTED]
 Research Associate                                   (865) 576-4184


On Thu, 17 Nov 2005, Thomas Naughton wrote:

Hey,

Today I did a quick test of an unofficial 4.2.1.b2, i.e., what is currently [EMAIL PROTECTED] I did this test on FC4-x86.

All went pretty well, but I still had the hang when restarting maui, as
previously noted.  I also had failures on a few tests; for two of them I
had to CTRL+C to get the tests to progress (I think those were the MPICH
and LAM/MPI via Torque test_user scripts).  These are the 'Broken pipe'
items in the screen paste below.

So, I'll keep looking to see what's up with these issues, probably
something stupid (ain't it always).

later,
--tjn

Performing root tests...
Maui service check:maui                                        [PASSED]
Torque node check                                              [PASSED]
Torque service check:pbs_server                                [PASSED]
/home mounts                                                   [PASSED]

Preparing user tests...
Performing user tests...
SSH ping test                                                  [PASSED]
SSH server->node                                               [PASSED]
SSH node->server                                               [PASSED]
Ganglia setup test                                             [FAILED]
Torque default queue definition                                [PASSED]
Torque Shell Test                                              [PASSED]
PVM (via Torque)                                               [FAILED]
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe

There were issues running some user test scripts.  Please check your logs
located in /home/oscartst.

Run APItests...

Running Installation tests for pvm
[PASS]       2005-11-17T15:01:02Z   pvmd-path-ls.apt
[PASS]       2005-11-17T15:01:02Z   envvar-pvm_arch.apt
[PASS]       2005-11-17T15:01:02Z   envvar-pvm_root.apt
[PASS]       2005-11-17T15:01:02Z   pvmd-path-which.apt
[PASS]       2005-11-17T15:01:02Z   modulecmd-path-ls.apt
[PASS]       2005-11-17T15:01:02Z   pvm-module-list.apt
[PASS]       2005-11-17T15:01:02Z   pvm-module-show-pvm_rsh.apt
[PASS]       2005-11-17T15:01:02Z   pvm-module-show-pvm_arch.apt
[PASS]       2005-11-17T15:01:02Z   pvm-module-show-pvm_root.apt

ERROR 2 REPORTED ABOVE.

...Hit <ENTER> key to exit...



_________________________________________________________________________
 Thomas Naughton                                      [EMAIL PROTECTED]
 Research Associate                                   (865) 576-4184



-------------------------------------------------------
This SF.Net email is sponsored by the JBoss Inc.  Get Certified Today
Register for a JBoss Training Course.  Free Certification Exam
for All Training Attendees Through End of 2005. For more info visit:
http://ads.osdn.com/?ad_id=7628&alloc_id=16845&op=click
_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel



