Okay, I've checked all the items you suggested and what I have looks correct.
"pbsnodes -a" reports that all my nodes are available. My /var/spool/pbs/server_name doesn't have an IP in it; it contains pbs_oscar. Since I removed nfs_oscar and pbs_oscar from the end of the line for each node in /etc/hosts, all the nodes can ping the master node. My /etc/hosts file looks correct: the server name is not on the same line as the localhost entry. The /var/spool/pbs/mom_priv file looks fine on all my nodes.

I've been switching back and forth between pbs_sched and maui just to see if I can get one of them to work. While doing that, with pbs_sched running, I noticed this line in one of my logs:

#######################
Pbs_sched;Job;35.domain.com;Not enough of the right type of nodes available.
#######################

The command I'm running is:

"qsub -l nodes=1:ppn:1,walltime=30:00:00 /path/to/job.pbs"

If I turn off pbs_serv and start maui, the job will submit but won't do anything. I get the connection refused errors.

Thanks again. :)

-----Original Message-----
From: Erich Focht [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 21, 2005 2:34 PM
To: [email protected]
Cc: Johnston Michael J Contr AFRL/DES
Subject: Re: [Oscar-users] Problems with PBS

Hi Mike,

On Wednesday 21 December 2005 21:37, Johnston Michael J Contr AFRL/DES wrote:
>
> I'm having some strange problems with PBS. I've tried to google the errors,
> but didn't come up with anything helpful. I did find one post by Bernard
> about only having LAM or MPICH installed. I had both installed and removed
> MPICH.

That shouldn't matter, you should be able to use both.

> #######################
>
> PBS_Server: Connection refused (111) in contact_sched, Could not connect
> Scheduler - port 15004
>
> #######################

Does "pbsnodes -a" return any useful info? What is the server name in /var/spool/pbs/server_name? Does that name correspond to the IP of the master node? What do the lines describing the IP addresses of the master node look like in /etc/hosts?
Make sure you have the real IP address in front of the master name (check that the master name does not appear on the localhost line). The master should use the name that belongs to its _internal_ IP. It is important what hostname your master has when starting the pbs server!

Also have a look at /var/spool/pbs/mom_priv on the master and the nodes and check whether you find the correct master hostname in there.

Restart the daemons: pbs_mom on the clients and the master, pbs_server on the master. Finally restart maui on the master.

Good luck!

Best regards,
Erich

_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users
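
The /etc/hosts check described above (the master name must sit on a real IP line and not on the localhost line) could be scripted roughly as follows. This is only a sketch: the function name check_master_in_hosts is made up for illustration, and it assumes a standard hosts-format file (address first, then names, with # starting a comment).

```shell
#!/bin/sh
# Sketch only: check_master_in_hosts is a hypothetical helper, not part of PBS or OSCAR.
# Usage: check_master_in_hosts <hosts-file> <master-name>
# Prints:
#   loopback  if the name appears on a 127.x.x.x or ::1 line (the problem case)
#   ok <ip>   if the name appears only on a non-loopback line
#   missing   if the name is not found at all
check_master_in_hosts() {
    awk -v name="$2" '
        { sub(/#.*/, "") }          # strip comments; awk re-splits the fields
        NF < 2 { next }             # skip blank lines and lines with no names
        {
            # field 1 is the address, the remaining fields are names/aliases
            for (i = 2; i <= NF; i++) {
                if ($i == name) {
                    if ($1 ~ /^127\./ || $1 == "::1") loop = 1
                    else if (ip == "") ip = $1
                }
            }
        }
        END {
            if (loop) print "loopback"
            else if (ip != "") print "ok " ip
            else print "missing"
        }
    ' "$1"
}
```

On the master it might be called as check_master_in_hosts /etc/hosts "$(cat /var/spool/pbs/server_name)". A result of "loopback" would point to the situation warned about above; "ok" only says the name maps to some non-loopback address, so whether that is the internal IP still has to be checked by eye.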
