Michel-
You might also try substituting pbs_sched for Maui in your debugging:
service maui stop
service pbs_sched start

And see if jobs get launched...
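
Before swapping schedulers, it may help to pull out what Maui logged for the stuck job, so the two runs can be compared. The following is only a minimal sketch, assuming a POSIX shell with grep and the log line format shown in the maui.log extract quoted further down in this thread. maui_job_problems is a hypothetical helper name, not a Maui or PBS command.

```shell
# Hypothetical helper (not part of Maui or PBS): print the ERROR, WARNING
# and ALERT lines in a Maui log that mention one quoted job ID.
# usage: maui_job_problems LOGFILE JOBID
maui_job_problems() {
    logfile=$1
    jobid=$2
    # The quotes around the job ID are required so that 214 does not
    # also match 2140; INFO lines and function traces are skipped.
    grep -E "(ERROR|WARNING|ALERT):.*job '${jobid}'" "$logfile"
}
```

For example, `maui_job_problems /path/to/maui.log 214`, where the log path is a placeholder that depends on the installation.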

Jeremy

At 11:16 AM 7/2/2003 +0200, Michel Beheregaray wrote:
Quoting Jeremy Enos <[EMAIL PROTECTED]>:

Yes, qrun launches the jobs while they are queued. I had also suspected Maui,
so I will try installing the latest version of Maui (3.2.6).

Thanks,
Michel.

> Michel-
> I don't have any configurations like the one you describe, so I don't have
> a good answer for you. It does make me suspicious that older versions
> (with older/different configs) worked for you though. Can you "qrun" the
> stuck jobs successfully? If so, it would be a stronger indication towards
> a Maui config issue.
>
> Jeremy
>
> At 03:43 PM 7/1/2003 +0200, Michel Beheregaray wrote:
> >Hello,
> > I installed an OSCAR cluster, version 2.2.1 (with OpenPBS 2.3.16 and
> >MAUI 3.0.7p8).
> >
> >ifrcir123.univ-pau.fr is the OSCAR and PBS server.
> >
> >I configured a route queue on another PBS server (ifrcir101.univ-pau.fr,
> >outside the OSCAR cluster):
> >
> >create queue rcourt
> >set queue rcourt queue_type = Route
> >set queue rcourt route_destinations = [EMAIL PROTECTED]
> >set queue rcourt enabled = True
> >set queue rcourt started = True
> >
> >After starting pbs_server and maui on ifrcir123.univ-pau.fr, I submit
> >a job from ifrcir101 to the rcourt queue.
> >The job starts running on ifrcir123 (or one of its nodes) and all is well.
> >The behaviour is the same if I submit other jobs from the same host (even
> >as different users).
> >
> >Then I submit another job (or the same script) from ifrcir123 (so locally)
> >to the court queue.
> >The job stays queued ("qstat -a" shows state Q) and here is the maui.log
> >extract:
> >
> >06/30 10:10:19 JobStart(214)
> >06/30 10:10:19 JobDistributeTasks(214,0,NodeList,TaskMap)
> >06/30 10:10:19 AMReserveJobAllocation(214,Reason,ErrMsg)
> >06/30 10:10:19 RMStartJob(214,SC)
> >06/30 10:10:19 PBSStartJob(214,0)
> >06/30 10:10:19 ERROR: cannot set job '214' attr 'Resource_List:neednodes' to 'iplmat133.univ-pau.fr' (rc: 15001 'Unknown Job Id')
> >06/30 10:10:19 ERROR: cannot set hostlist for job '214')
> >06/30 10:10:19 WARNING: cannot start job '214' through PBS
> >06/30 10:10:19 WARNING: cannot start job '214' through resource manager
> >06/30 10:10:19 ALERT: job '214' deferred after 1 failed start attempts (API failure on last attempt)
> >06/30 10:10:19 JobDefer(214,1:00:00,RMFailure,job could not be started through RM)
> >06/30 10:10:19 ALERT: job '214' cannot run (deferring job for 3600 seconds)
> >[001] 214 1: 1: 1(1) ALL 2:00:00(????????) campillo lcs Idle DEFAULT [court 1] 1056960619 [NONE] [NONE] [NONE] >= 0 >= 0 [NONE]
> >06/30 10:10:19 AMCancelAllocationReservation([NONE],214,Reason)
> >06/30 10:10:19 ERROR: cannot run job '214' in partition DEFAULT
> >06/30 10:10:19 ReservePriorityJob(214,DEFAULT,ResCount)
> >06/30 10:10:19 JobReserve(214,Priority)
> >06/30 10:10:19 INFO: 16 feasible tasks found for job 214:0 (1 Needed)
> >06/30 10:10:19 INFO: 16 feasible tasks found for job 214:0 (1 Needed)
> >06/30 10:10:19 INFO: located resources for 1 tasks (16) in best partition for job 214 at time 0:00:00
> >06/30 10:10:19 INFO: tasks located for job 214: 1 of 1 required (16 feasible)
> >06/30 10:10:19 JobDistributeTasks(214,0,NodeList,TaskMap)
> >06/30 10:10:19 ReservationJCreate(214,MNodeList,0:00:00,Priority,Res)
> >06/30 10:10:19 INFO: job '214' reserved 1 tasks (partition DEFAULT) to start in 0:00:00 on Mon Jun 30 10:10:19
> >
> >Then I kill the jobs and restart the PBS server and Maui on ifrcir123.
> >I repeat the test, but this time I first submit a job from ifrcir123
> >(locally): the job starts running and all is well. Then I submit a second
> >job from ifrcir101 and the result is the same: the job stays queued.
> >So I cannot submit jobs from different hosts to the same PBS server.
> >
> >I tried with other PBS servers on the network and the result is the same.
> >Only the server that submits the first job is accepted for new jobs.
> >The others are rejected.
> >
> >However, this works fine if the PBS server that receives the jobs is a
> >2.3.12 PBS server with Maui 3.0.6p3.
> >
> >Is there a Maui parameter that lets it accept two or more submitting
> >servers at a time?
> >Is there a problem (and a patch?) with this version of Maui?
> >What did I do wrong?
> >
> >Thank you for your help,
> >--
> >Michel BEHEREGARAY
> >Centre Informatique de l'Universite de Pau et des Pays de l'Adour (CIUPPA)
> >Bat IFR, rue Jules Ferry, F-64000 PAU Tél.: +33 (0)5 59 72 20 12
> >Courriel : [EMAIL PROTECTED] Fax : +33 (0)5 59 72 20 19
> >
> >
> >-------------------------------------------------------
> >This SF.Net email sponsored by: Free pre-built ASP.NET sites including
> >Data Reports, E-commerce, Portals, and Forums are available now.
> >Download today and enter to win an XBOX or Visual Studio .NET.
> >http://aspnet.click-url.com/go/psa00100006ave/direct;at.asp_061203_01/01
> >_______________________________________________
> >Oscar-users mailing list
> >[EMAIL PROTECTED]
> >https://lists.sourceforge.net/lists/listinfo/oscar-users
>





