On 08.12.2011 at 10:00, wzlu wrote:

> On 2011/12/8 04:56 PM, Reuti wrote:
>> On 08.12.2011 at 01:11, wzlu wrote:
>>
>>> I tried all the available queues and got the same messages.
>>>
>>> $ qsub -w v c3-parallel-pgimpich2-demo.sh
>>> Unable to run job: Job 144914 cannot run in queue "cc001-t001" because it is not contained in its hard queue list (-q)
>>> Job 144914 cannot run in queue "q0-em64t-ge" because it is not contained in its hard queue list (-q)
>>> Job 144914 cannot run in queue "q0-em64t-ib" because it is not contained in its hard queue list (-q)
>>> Job 144914 cannot run in PE "mpich" because it only offers 0 slots
>>> verification: no suitable queues.
>>> Exiting.
>> Is there any resource request in the job script itself? As you don't request any queue on the command line, I would think so. Can you provide the header, please?
>>
>> -- Reuti
> Yes, there is "-pe mpich 2" in the job script.
>
> The job script looks like the following:
> # Specifies the name of the shell to use for the job (recommended)
> #$ -S /bin/sh
> # Job name (optional)
> #$ -N c3-parallel-pgimpich1-demo
> # Specifies the queue name (required)
> #$ -q q0-em64t-ddr

Okay, here you request a queue. Specifying a queue request on the command line will override the one in the script, which explains the changing complaints about the hard queue list. Do the queues q0-em64t-ge, q0-em64t-ib and q0-em64t-ddr have the same list of hosts, and do you limit the overall slot count per exechost?

-- Reuti
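A quick way to check both points, assuming a standard qconf/qstat setup; the queue name is the one requested in the job script above, and <exechost> is a placeholder for one of its execution hosts:

$ qconf -sq q0-em64t-ddr | egrep '^(hostlist|pe_list|slots)'   # hosts behind the queue, attached PEs, per-host slot setting
$ qconf -se <exechost>                                         # check complex_values for a slots=... limit on the exec host
$ qstat -g c                                                   # used/available/total slots per cluster queue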
> # Specifies number of nodes (required)
> #$ -pe mpich 2
> # Specifies output files (optional)
>
> Best Regards,
> Lu
>
>>
>>> $ qsub -q q0-em64t-ib -w v c3-parallel-pgimpich2-demo.sh
>>> Unable to run job: Job 144915 cannot run in queue "cc001-t001" because it is not contained in its hard queue list (-q)
>>> Job 144915 cannot run in queue "q0-em64t-ddr" because it is not contained in its hard queue list (-q)
>>> Job 144915 cannot run in queue "q0-em64t-ge" because it is not contained in its hard queue list (-q)
>>> Job 144915 cannot run in PE "mpich" because it only offers 0 slots
>>> verification: no suitable queues.
>>> Exiting.
>>>
>>> $ qsub -q q0-em64t-ge -w v c3-parallel-pgimpich2-demo.sh
>>> Unable to run job: Job 144916 cannot run in queue "cc001-t001" because it is not contained in its hard queue list (-q)
>>> Job 144916 cannot run in queue "q0-em64t-ddr" because it is not contained in its hard queue list (-q)
>>> Job 144916 cannot run in queue "q0-em64t-ib" because it is not contained in its hard queue list (-q)
>>> Job 144916 cannot run in PE "mpich" because it only offers 0 slots
>>> verification: no suitable queues.
>>> Exiting.
>>>
>>> I created a temporary PE and used qalter to run those queued jobs.
>>>
>>> Best Regards,
>>> Lu
>>>
>>> On 2011/12/7 07:32 PM, Reuti wrote:
>>>> Hi,
>>>>
>>>> On 07.12.2011 at 08:13, wzlu wrote:
>>>>
>>>>> The same problem occurs again.
>>>>>
>>>>> I tried the command "qsub -w v" and got the following message:
>>>>> Unable to run job: Job 144878 cannot run in queue "cc001-t001" because it is not contained in its hard queue list (-q)
>>>>> Job 144878 cannot run in queue "q0-em64t-ge" because it is not contained in its hard queue list (-q)
>>>>> Job 144878 cannot run in queue "q0-em64t-ib" because it is not contained in its hard queue list (-q)
>>>>> Job 144878 cannot run in PE "mpich" because it only offers 0 slots
>>>> So, the PE you requested is "mpich". Did you request any queue in the `qsub` command, and is the PE attached to this queue?
>>>>
>>>> -- Reuti
>>>>
>>>>> verification: no suitable queues.
>>>>> Exiting.
>>>>>
>>>>> Do you have any idea? Thanks.
>>>>>
>>>>> Best Regards,
>>>>> Lu
>>>>>
>>>>> On 2011/10/7 09:04 PM, Reuti wrote:
>>>>>> On 06.10.2011 at 14:40, Jesse Becker wrote:
>>>>>>
>>>>>>> I ran into this a few months ago, and it had almost nothing to do with PE slots. Unfortunately, I can't recall what I did to fix it either. Try submitting test jobs with "-w v" and "-w p" to get more of an idea of what's going on.
>>>>>> Yes, this needs to be investigated by hand. It's an RFE to get better scheduling output. Like here, you would like to know why the slots couldn't be allocated. That there are only zero slots available is already the result of another limit.
>>>>>>
>>>>>> Could be memory, RQS, slots, ...
>>>>>>
>>>>>> -- Reuti
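A small illustration of the "-w v" / "-w p" suggestion, using the script and job number from this thread and assuming standard qsub/qalter options:

$ qsub -w p c3-parallel-pgimpich2-demo.sh   # validate against the current cluster state instead of an idealized empty one
$ qalter -w v 144914                        # re-run the verification for a job that is already pending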
>>>>>>> On Thu, Oct 06, 2011 at 04:39:39AM -0400, wzlu wrote:
>>>>>>>> Dear All,
>>>>>>>>
>>>>>>>> There are 144 nodes in my queue and I configured 1 slot for each node.
>>>>>>>> That is, 144 nodes with 144 slots.
>>>>>>>> The PE is using 121 slots now. One job needs 12 PE slots, and there are enough nodes and slots for this job.
>>>>>>>> But it stays queued with "cannot run in PE "mpich" because it only offers 0 slots".
>>>>>>>>
>>>>>>>> The configuration is as follows:
>>>>>>>>
>>>>>>>> $ qconf -sp mpich
>>>>>>>> pe_name mpich
>>>>>>>> slots 81920
>>>>>>>> user_lists NONE
>>>>>>>> xuser_lists NONE
>>>>>>>> start_proc_args /bin/true
>>>>>>>> stop_proc_args /bin/true
>>>>>>>> allocation_rule $round_robin
>>>>>>>> control_slaves TRUE
>>>>>>>> job_is_first_task FALSE
>>>>>>>> urgency_slots min
>>>>>>>>
>>>>>>>> $ qconf -ssconf
>>>>>>>> algorithm default
>>>>>>>> schedule_interval 0:0:5
>>>>>>>> maxujobs 0
>>>>>>>> queue_sort_method load
>>>>>>>> job_load_adjustments NONE
>>>>>>>> load_adjustment_decay_time 0:7:30
>>>>>>>> load_formula slots
>>>>>>>> schedd_job_info true
>>>>>>>> flush_submit_sec 0
>>>>>>>> flush_finish_sec 0
>>>>>>>> params none
>>>>>>>> reprioritize_interval 0:0:0
>>>>>>>> halftime 168
>>>>>>>> usage_weight_list cpu=1.000000,mem=0.000000,io=0.000000
>>>>>>>> compensation_factor 5.000000
>>>>>>>> weight_user 0.250000
>>>>>>>> weight_project 0.250000
>>>>>>>> weight_department 0.250000
>>>>>>>> weight_job 0.250000
>>>>>>>> weight_tickets_functional 0
>>>>>>>> weight_tickets_share 0
>>>>>>>> share_override_tickets TRUE
>>>>>>>> share_functional_shares TRUE
>>>>>>>> max_functional_jobs_to_schedule 200
>>>>>>>> report_pjob_tickets TRUE
>>>>>>>> max_pending_tasks_per_job 50
>>>>>>>> halflife_decay_list none
>>>>>>>> policy_hierarchy OFS
>>>>>>>> weight_ticket 0.010000
>>>>>>>> weight_waiting_time 0.000000
>>>>>>>> weight_deadline 3600000.000000
>>>>>>>> weight_urgency 0.100000
>>>>>>>> weight_priority 1.000000
>>>>>>>> max_reservation 0
>>>>>>>> default_duration 00:15:00
>>>>>>>>
>>>>>>>> How can I fix this problem? Thanks a lot.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Lu
>>>>>>> --
>>>>>>> Jesse Becker
>>>>>>> NHGRI Linux support (Digicon Contractor)
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
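Since schedd_job_info is set to true in the scheduler configuration above, the scheduler's reason for leaving a job pending can also be read back with qstat, and if the root cause turns out to be that "mpich" is simply missing from the target queue's pe_list, attaching it is a one-liner. A sketch only, assuming the queue requested in the job script (q0-em64t-ddr) is the one that should offer the PE:

$ qstat -j 144914                                  # the "scheduling info:" section explains why the job stays pending
$ qconf -sq q0-em64t-ddr | grep pe_list            # check whether "mpich" is attached to this queue
$ qconf -aattr queue pe_list mpich q0-em64t-ddr    # if it is missing, add the PE to the queue's pe_list
$ qconf -srqs                                      # also check resource quota sets that could cap the available slots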
