I uninstalled the branch version and installed Torque 2.3.5, and then everything was fine. qrun worked just fine with the branch version.
Thanks for the tips on mom_priv/config.

j

> 2. Re: torque/maui integration - cannot set hostlist error
>    (Garrick Staples)
>
> Message: 2
> Date: Mon, 22 Dec 2008 21:05:07 -0800
> From: Garrick Staples <[email protected]>
> Subject: Re: [Mauiusers] torque/maui integration - cannot set hostlist error
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="us-ascii"
>
> On Sun, Dec 21, 2008 at 08:32:08PM -0500, John Kitchin alleged:
> > Hi everyone,
> >
> > I am in the process of replacing PBSPro on our cluster with Torque/Maui. I
> > have installed the latest versions of Torque and Maui, and Torque appears to
> > run fine on its own and runs jobs. The installations seem to have gone well
> > according to the directions and tests. I have not been able to get maui to
> > schedule jobs though (after stopping pbs_sched and starting maui as user
> > jtest), they just remain in the queue in a deferred state.
> >
> > our basic setup is a login/submit node where pbs_server and maui run called
> > beowulf (beowulf.cheme.cmu.edu is the full name), with the execute nodes on
> > an internal network.
> >
> > Typical output of checkjob on a deferred job is:
> >
> > job is deferred. Reason: RMFailure (job cannot be started - cannot set hostlist)
> > Holds: Defer (hold reason: RMFailure)
> > PE: 1.00 StartPriority: 2
> > cannot select job 52 for partition DEFAULT (job hold active)
> >
> > the torque log indicates an error connecting to MOM:
> > 12/21/2008 18:04:32;0008;PBS_Server;Job;52.beowulf;Job Modified at request of jt...@beowulf
> > 12/21/2008 18:04:32;0001;PBS_Server;Req;;Server could not connect to MOM
> > 12/21/2008 18:04:32;0080;PBS_Server;Req;req_reject;Reject reply code=15070(Server could not connect to MOM), aux=0, type=ModifyJob, from jt...@beowulf
> > 12/21/2008 18:05:16;0002;PBS_Server;Svr;PBS_Server;Torque Server Version = 2.4.0b1, loglevel = 0
>
> This means that something is wrong between pbs_server and pbs_mom. I don't
> think this has anything to do with maui.
>
> Test with 'qrun'. That is a torque command that will attempt to start the
> job. If that also fails, then you really know it isn't maui.
>
> Also, you are running trunk. You should really start with the latest 2.1.x or
> 2.3.6 (releasing soon).
>
> > on the nodes, the mom config files contain
> > matsim (jtest) ~ > ssh c1n10 'cat /var/spool/torque/mom_priv/config'
> > $clienthost beowulf
> > $restricted *.cheme.cmu.edu
>
> $clienthost is ancient. You want to use $pbsserver.
>
> And why use $restricted? That disables security.
>
> --
> Garrick Staples, GNU/Linux HPCC SysAdmin
> University of Southern California
>
> See the Dishonor Roll at http://www.californiansagainsthate.com/
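
For anyone finding this thread later, here is a minimal sketch of the mom_priv/config change suggested in the quoted message. It assumes the file looks like the two-line config shown there and that $restricted is not needed on your nodes. The function name migrate_mom_config is hypothetical, not part of Torque; it only filters text from stdin to stdout and does not touch any files.

```shell
#!/bin/sh
# Hypothetical helper (not a Torque tool): reads a pbs_mom config on stdin
# and writes the adjusted config to stdout.
#   - "$clienthost <host>" lines become "$pbsserver <host>"
#   - "$restricted ..." lines are dropped (assumption: they are not needed)
migrate_mom_config() {
    sed -e 's/^\$clienthost[[:space:]]\{1,\}/$pbsserver /' \
        -e '/^\$restricted[[:space:]]/d'
}

# Example: feed it the config from the quoted message.
printf '%s\n' '$clienthost beowulf' '$restricted *.cheme.cmu.edu' | migrate_mom_config
```

To check a node without changing it, something like ssh c1n10 'cat /var/spool/torque/mom_priv/config' | migrate_mom_config shows what the new file would contain. To test whether pbs_server can reach the MOM independently of maui, qrun with the job id (for example qrun 52) is the check recommended in the quoted message.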
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
