When the test_cluster script says "there must be a PBS problem", the cause is not necessarily PBS itself.  PBS depends on several other things working first.  Here's what to look for (in order):
Is /home mounted on all nodes?
Can a non-root user ssh from the head node to a node, from there to another node, and then back to the server again without being prompted for input?
Does "pbsnodes -a" report all nodes as "free"?
Does "service maui status" report Maui as running?

If all of those check out, we'll proceed from there.
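
For anyone who wants to script the checklist above, here is a rough Python sketch, not a tested OSCAR tool. The function names (nodes_not_free, run, check_cluster) are made up for illustration. It assumes the usual "pbsnodes -a" layout (node name at column 0, indented "key = value" lines below it), that "service maui status" exits non-zero when Maui is not running, and that /proc/mounts is readable on each node. It only tests the head-node-to-node ssh hop, not the full node-to-node-and-back chain, and it should be run as a non-root user.

```python
import subprocess
import sys


def nodes_not_free(pbsnodes_output):
    """Return {node: state} for every node whose state is not exactly 'free'.

    Assumed layout: a node name at column 0, then indented 'key = value' lines.
    """
    bad = {}
    node = None
    for line in pbsnodes_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            node = line.strip()
            bad[node] = "unknown"  # until a state line says otherwise
        elif node is not None and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "state":
                if value == "free":
                    bad.pop(node, None)
                else:
                    bad[node] = value
    return bad


def run(cmd):
    """Run cmd with no terminal on stdin; return (exit_code, combined output)."""
    try:
        proc = subprocess.run(cmd, stdin=subprocess.DEVNULL,
                              stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                              universal_newlines=True)
    except OSError as exc:  # e.g. the command is not installed
        return 127, str(exc)
    return proc.returncode, proc.stdout


def check_cluster(nodes):
    """Walk the checklist in order and return a list of failure messages."""
    failures = []
    for n in nodes:
        # BatchMode=yes makes ssh fail instead of prompting for input.
        ssh = ["ssh", "-o", "BatchMode=yes", n]
        if run(ssh + ["true"])[0] != 0:
            failures.append("%s: ssh fails or wants input" % n)
        elif run(ssh + ["grep -q ' /home ' /proc/mounts"])[0] != 0:
            failures.append("%s: /home is not mounted" % n)
    code, out = run(["pbsnodes", "-a"])
    if code != 0:
        failures.append("pbsnodes -a failed: " + out.strip())
    else:
        for node, state in sorted(nodes_not_free(out).items()):
            failures.append("%s: PBS state is %s" % (node, state))
    code, out = run(["service", "maui", "status"])
    if code != 0:  # assumption: non-zero exit means Maui is not running
        failures.append("maui: " + (out.strip() or "not running"))
    return failures


if __name__ == "__main__":
    problems = check_cluster(sys.argv[1:])
    for p in problems:
        print(p)
    sys.exit(1 if problems else 0)
```

Saved under any name you like and run on the head node with the node names as arguments, it prints one line per failed check and exits non-zero if anything failed.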

        Jeremy

At 09:04 AM 3/29/2002 -0800, Kyndig Renshai wrote:

Abhin,

I came across this exact problem.  I asked the list and no one replied. So I guess I'm not the only one mystified as to why the PBS tests that are part of the suite will not run.  I have not come up with a fix, since PBS is new to me.

A question for the OSCAR developers, though: I was looking at grid-in-a-box (www.ncsa.uiuc.edu), and they recommended cluster-in-a-box (OSCAR 1.2.1).  However, they deployed Condor-G in creating their grids.  I've looked at Condor and it is a cross-platform solution: they have ported a portion of the functionality so that it's possible to manage both win32 and unix/linux machines in the cluster/pool in a centralized fashion.  (They lack a spiffy GUI like xpbs, but hey.)

I'm just wondering if you could add a feature to OSCAR that would allow a choice of management tools (Condor/PBS)? I'm not a scripter yet, so along with Abhin, I would greatly appreciate it if anyone who is an expert on OSCAR could contribute some scripts for testing the PBS system, along with examples for the rest of the features that work.  (The rule in extreme programming is that testing is crucial to any system integration.)

I noticed that someone mentioned that csh was the default on the master node, but the backend nodes are running bash.  Is this what needs to be fixed to get the OSCAR_tests to run?

Ren

  "abhin g.s" <[EMAIL PROTECTED]> wrote:
Hi,

I got OSCAR 1.2 installed without any errors, and created
and copied the test material from the wizard. But when I
log in as the newly created user "rasc" (created using
the wizard), I can ssh to the clients but can't run
./test_cluster. It shows: simple pbs job taking too
much time so aborting..

>>> Recently the server restarted twice while client
installation via tftp was under way, but after a fresh
reinstallation of RH 7.1 and OSCAR 1.2 nothing is wrong.

>>> I really want to run the NAS Parallel Benchmarks 2.3
or something like that to check the GFLOPS of the
cluster, so please guide me on installing it and
benchmarking my cluster. NPB 2.3 gave me a lot of errors
during compilation.

Please help.

abhin.g.s
rasc team

=====
RASC. Always With You. © 2002 Research And Service Centre. http://rasc.8m.com


_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users


