On 08.12.2011 at 10:00, wzlu wrote:

> On 2011/12/8 at 04:56 PM, Reuti wrote:
>> On 08.12.2011 at 01:11, wzlu wrote:
>> 
>>> I tried for all available queues and get the same messages.
>>> 
>>> $ qsub -w v c3-parallel-pgimpich2-demo.sh
>>> Unable to run job: Job 144914 cannot run in queue "cc001-t001" because it 
>>> is not contained in its hard queue list (-q)
>>> Job 144914 cannot run in queue "q0-em64t-ge" because it is not contained in 
>>> its hard queue list (-q)
>>> Job 144914 cannot run in queue "q0-em64t-ib" because it is not contained in 
>>> its hard queue list (-q)
>>> Job 144914 cannot run in PE "mpich" because it only offers 0 slots
>>> verification: no suitable queues.
>>> Exiting.
>> Is there any resource request in the job script itself? As you don't request 
>> any queue on the command line, I would think so. Can you provide the header 
>> please.
>> 
>> -- Reuti
> Yes, there is "-pe mpich 2" in the job script.
> 
> The job script is like the following:
> # Specifies the name of the shell to use for the job (recommended)
> #$ -S /bin/sh
> # Job name(optional)
> #$ -N c3-parallel-pgimpich1-demo
> # Specifies queue name(required)
> #$ -q q0-em64t-ddr

Okay, here you request a queue. A queue request on the command line overrides 
the one embedded in the script, which explains the changing complaints about 
the hard queue list between your submissions.
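As a quick check (just a sketch; the job ID, queue name and script name are 
taken from this thread), you can see which hard queue list a pending job 
actually ended up with, and confirm that a command-line `-q` wins over the 
embedded directive:

```shell
# Show the effective resource requests of the pending job;
# the hard_queue_list line reveals which -q request won.
qstat -j 144914 | grep hard_queue_list

# A -q on the command line overrides the embedded
# "#$ -q q0-em64t-ddr" directive in the script:
qsub -q q0-em64t-ib -w v c3-parallel-pgimpich2-demo.sh
```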

Do the queues q0-em64t-ge, q0-em64t-ib and q0-em64t-ddr have the same list of 
hosts, and do you limit the overall slot count per exechost?
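To compare the queues' host lists and check for limits that could push the PE 
down to 0 slots (a sketch with standard SGE tools; `<hostname>` is a 
placeholder for one of your exec hosts):

```shell
# Host list of each queue in question
qconf -sq q0-em64t-ddr | grep hostlist
qconf -sq q0-em64t-ge  | grep hostlist
qconf -sq q0-em64t-ib  | grep hostlist

# Is the PE attached to the queue at all?
qconf -sq q0-em64t-ddr | grep pe_list

# Per-host slot limits defined on the exec host itself
qconf -se <hostname> | grep complex_values

# Any resource quota sets (RQS) in force?
qconf -srqsl
```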

-- Reuti


> # Specifies number of nodes(required)
> #$ -pe mpich 2
> # Specifies output files(optional)
> 
> Best Regards,
> Lu
> 
>> 
>> 
>>> $ qsub -q q0-em64t-ib -w v c3-parallel-pgimpich2-demo.sh
>>> Unable to run job: Job 144915 cannot run in queue "cc001-t001" because it 
>>> is not contained in its hard queue list (-q)
>>> Job 144915 cannot run in queue "q0-em64t-ddr" because it is not contained 
>>> in its hard queue list (-q)
>>> Job 144915 cannot run in queue "q0-em64t-ge" because it is not contained in 
>>> its hard queue list (-q)
>>> Job 144915 cannot run in PE "mpich" because it only offers 0 slots
>>> verification: no suitable queues.
>>> Exiting.
>>> $ qsub -q q0-em64t-ge -w v c3-parallel-pgimpich2-demo.sh
>>> Unable to run job: Job 144916 cannot run in queue "cc001-t001" because it 
>>> is not contained in its hard queue list (-q)
>>> Job 144916 cannot run in queue "q0-em64t-ddr" because it is not contained 
>>> in its hard queue list (-q)
>>> Job 144916 cannot run in queue "q0-em64t-ib" because it is not contained in 
>>> its hard queue list (-q)
>>> Job 144916 cannot run in PE "mpich" because it only offers 0 slots
>>> verification: no suitable queues.
>>> Exiting.
>>> 
>>> I created a temporary PE and used qalter to run those queued jobs.
>>> 
>>> Best Regards,
>>> Lu
>>> 
>>> On 2011/12/7 at 07:32 PM, Reuti wrote:
>>>> Hi,
>>>> 
>>>> On 07.12.2011 at 08:13, wzlu wrote:
>>>> 
>>>>> The same problem occurs again.
>>>>> 
>>>>> I tried the command "qsub -w v" and got the following message.
>>>>> Unable to run job: Job 144878 cannot run in queue "cc001-t001" because it 
>>>>> is not contained in its hard queue list (-q)
>>>>> Job 144878 cannot run in queue "q0-em64t-ge" because it is not contained 
>>>>> in its hard queue list (-q)
>>>>> Job 144878 cannot run in queue "q0-em64t-ib" because it is not contained 
>>>>> in its hard queue list (-q)
>>>>> Job 144878 cannot run in PE "mpich" because it only offers 0 slots
>>>> so, the PE you requested is "mpich". Did you request any queue in the 
>>>> `qsub` command, and is the PE attached to this queue?
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> verification: no suitable queues.
>>>>> Exiting.
>>>>> 
>>>>> Any ideas? Thanks.
>>>>> 
>>>>> Best Regards,
>>>>> Lu
>>>>> 
>>>>> On 2011/10/7 at 09:04 PM, Reuti wrote:
>>>>>> On 06.10.2011 at 14:40, Jesse Becker wrote:
>>>>>> 
>>>>>>> I ran into this a few months ago, and it had almost nothing to do with
>>>>>>> PE slots. Unfortunately, I can't recall what I did to fix it either.
>>>>>>> Try submitting test jobs with "-w v" and "-w p" to get more of an idea
>>>>>>> of what's going on.
>>>>>> Yes, this needs to be investigated by hand. There is an RFE for better 
>>>>>> scheduling output. Here, for instance, you would like to know why the 
>>>>>> slots couldn't be allocated: that only zero slots are available is 
>>>>>> already the result of some other limit.
>>>>>> 
>>>>>> Could be memory, RQS, slots, ...
>>>>>> 
>>>>>> -- Reuti
>>>>>> 
>>>>>>> On Thu, Oct 06, 2011 at 04:39:39AM -0400, wzlu wrote:
>>>>>>>> Dear All,
>>>>>>>> 
>>>>>>>> There are 144 nodes in my queue, and I configured 1 slot for each node. 
>>>>>>>> That is 144 nodes with 144 slots.
>>>>>>>> The PE currently uses 121 slots. One job needs 12 PE slots, and there 
>>>>>>>> are enough nodes and slots for this job.
>>>>>>>> But it stays queued with "cannot run in PE "mpich" because it only 
>>>>>>>> offers 0 slots".
>>>>>>>> 
>>>>>>>> The configuration is as follows:
>>>>>>>> 
>>>>>>>> $ qconf -sp mpich
>>>>>>>> pe_name mpich
>>>>>>>> slots 81920
>>>>>>>> user_lists NONE
>>>>>>>> xuser_lists NONE
>>>>>>>> start_proc_args /bin/true
>>>>>>>> stop_proc_args /bin/true
>>>>>>>> allocation_rule $round_robin
>>>>>>>> control_slaves TRUE
>>>>>>>> job_is_first_task FALSE
>>>>>>>> urgency_slots min
>>>>>>>> 
>>>>>>>> $ qconf -ssconf
>>>>>>>> algorithm default
>>>>>>>> schedule_interval 0:0:5
>>>>>>>> maxujobs 0
>>>>>>>> queue_sort_method load
>>>>>>>> job_load_adjustments NONE
>>>>>>>> load_adjustment_decay_time 0:7:30
>>>>>>>> load_formula slots
>>>>>>>> schedd_job_info true
>>>>>>>> flush_submit_sec 0
>>>>>>>> flush_finish_sec 0
>>>>>>>> params none
>>>>>>>> reprioritize_interval 0:0:0
>>>>>>>> halftime 168
>>>>>>>> usage_weight_list cpu=1.000000,mem=0.000000,io=0.000000
>>>>>>>> compensation_factor 5.000000
>>>>>>>> weight_user 0.250000
>>>>>>>> weight_project 0.250000
>>>>>>>> weight_department 0.250000
>>>>>>>> weight_job 0.250000
>>>>>>>> weight_tickets_functional 0
>>>>>>>> weight_tickets_share 0
>>>>>>>> share_override_tickets TRUE
>>>>>>>> share_functional_shares TRUE
>>>>>>>> max_functional_jobs_to_schedule 200
>>>>>>>> report_pjob_tickets TRUE
>>>>>>>> max_pending_tasks_per_job 50
>>>>>>>> halflife_decay_list none
>>>>>>>> policy_hierarchy OFS
>>>>>>>> weight_ticket 0.010000
>>>>>>>> weight_waiting_time 0.000000
>>>>>>>> weight_deadline 3600000.000000
>>>>>>>> weight_urgency 0.100000
>>>>>>>> weight_priority 1.000000
>>>>>>>> max_reservation 0
>>>>>>>> default_duration 00:15:00
>>>>>>>> 
>>>>>>>> How can I fix this problem? Thanks a lot.
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Lu
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> [email protected]
>>>>>>>> https://gridengine.org/mailman/listinfo/users
>>>>>>> -- 
>>>>>>> Jesse Becker
>>>>>>> NHGRI Linux support (Digicon Contractor)
>>>>> 
> 

