Hi all, I've been running several OSCAR clusters for over six months now and they've been working pretty much seamlessly. However...
We've just had a fairly serious power outage here, which took down one of the clusters mid-processing. Since it has come back up I've been having trouble with the PBS queue. The symptoms are as follows:

- Running showq reports that there are 16 free nodes (there are 16 nodes in total, plus the master node).
- Running pbsnodes -l returns nothing, i.e. there are no marked nodes.
- Running pbsnodes -a reports that all of the nodes are up and running, but are all free.
- Running the OSCAR test script (from the installation GUI):
  - The following tests fail:
    - PBS HDF5 Test
    - Checking for 16 free nodes
    Both of these tests fail with the error message: "Not enough free nodes. Tests incomplete."
  - The MPICH over PBS tests pass.

To summarise, for some reason the PBS tests fail, but the MPICH tests, which run over PBS, pass! I'm not really sure where to start looking for this problem.

Any help you can give me would be much appreciated. Thanks for your time,

Rich

--------------------------------
Richard Bruin
PhD Student
Department of Earth Sciences
University of Cambridge
