We have a parallel environment (PE) named "threaded", and each node has 30
slots and 120 GB of RAM. Jobs requesting 19 or more PE slots get stuck in the
queue in the qw state with the following scheduling message:

parallel environment:  threaded range: 19
scheduling info:            cannot run in PE "threaded" because it only offers 0 slots

This doesn't make sense to me: currently more than 30 nodes are idle, with
30 slots each.

I am running a simple test job; no other complexes are requested:
echo "sleep 10" | qsub -pe threaded 19
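
For anyone reproducing this, a minimal sketch of commands that show what the scheduler sees for the PE and for the pending job. The wrapper name diagnose_pe and its arguments are hypothetical. The qconf, qstat and qalter calls are standard Grid Engine client commands, but their output on this cluster is not shown here.

```shell
#!/bin/sh
# Sketch only: diagnose_pe is a hypothetical helper name, not part of Grid Engine.
# Usage: diagnose_pe PE_NAME JOB_ID
diagnose_pe() {
    pe_name=$1
    job_id=$2

    # Bail out early when the Grid Engine client tools are not on PATH.
    if ! command -v qconf >/dev/null 2>&1; then
        echo "qconf not found on PATH" >&2
        return 127
    fi

    qconf -sp "$pe_name"     # PE definition: total slots, allocation_rule
    qconf -sc                # complex definitions: consumable flags and defaults
    qstat -f                 # per-queue-instance slot usage and states
    qstat -j "$job_id"       # full scheduling info for the pending job
    qalter -w v "$job_id"    # verify whether the job could run in an empty cluster
}
```

For example, `diagnose_pe threaded <job_id>` with the ID of the stuck job. The "slots" and "allocation_rule" lines of the PE and the "default" column of `qconf -sc` are the first things to compare against the 19-slot request.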

We are using GE 2011.11p1

Here is the SGE configuration of one of the execution hosts:

hostname              compute-2-2.local
load_scaling          NONE
complex_values        slots=30,h_vmem=120G,io_slots=30
load_values           arch=linux-x64,num_proc=30,mem_total=123136.023438M, \
                      swap_total=3999.992188M,virtual_total=127136.015625M, \
                      load_avg=11.020000,load_short=11.000000, \
                      load_medium=11.020000,load_long=10.810000, \
                      mem_free=75806.339844M,swap_free=3973.246094M, \
                      virtual_free=79779.585938M,mem_used=47329.683594M, \
                      swap_used=26.746094M,virtual_used=47356.429688M, \
                      cpu=36.200000,m_socket=30,m_core=30,np_load_avg=0.367333, \
                      np_load_short=0.366667,np_load_medium=0.367333, \
                      np_load_long=0.360333
processors            30
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         NONE
report_variables      NONE
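
One thing worth checking against the complex_values line above is how per-slot consumables scale. If h_vmem is defined as a consumable with a non-zero default (an assumption; the complex definition is not shown in this post), a PE job is debited that default once per slot, so the host-level h_vmem=120G can cap the usable slots below 30. A rough sketch of the arithmetic, where the function name and the per-slot values are purely illustrative:

```shell
#!/bin/sh
# Sketch only: max_slots_by_consumable is a hypothetical helper.
# Prints how many slots fit when each slot debits PER_SLOT units from a
# host-level consumable holding CAPACITY units (integer division).
max_slots_by_consumable() {
    capacity=$1
    per_slot=$2
    echo $(( capacity / per_slot ))
}

# Host capacity from complex_values: h_vmem=120G, expressed in MB.
HOST_H_VMEM_MB=$(( 120 * 1024 ))

# ASSUMPTION: these per-slot defaults are made-up examples, not values
# taken from this cluster's configuration.
for per_slot_mb in 4096 6500 8192; do
    printf 'per-slot %s MB -> at most %s slots\n' \
        "$per_slot_mb" \
        "$(max_slots_by_consumable "$HOST_H_VMEM_MB" "$per_slot_mb")"
done
```

With these example values, only the 4096 MB case leaves all 30 slots usable. Whether anything like this applies here depends on the actual h_vmem and io_slots entries in `qconf -sc`.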

Thanks,