After making the suggested changes, it worked.

Thanks Reuti
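
For anyone reading the archive, here is a minimal sketch of the checks and the test submission suggested in the quoted messages below. The runtime value is an illustrative assumption, not the value actually used, and qsub_args is a hypothetical helper name. The scheduler settings themselves ("max_reservation", "default_duration") would be edited with "qconf -msconf", which opens an editor and is therefore not scripted here.

```shell
#!/bin/sh
# Sketch only: RUNTIME is an assumed placeholder, not a value from the thread.
PE_NAME="threaded"     # PE name from the original post
SLOTS=19               # slot count from the original post
RUNTIME="01:00:00"     # assumed expected runtime (h_rt) for the test job

# Hypothetical helper: request the PE, ask for a reservation (-R y), and
# state the expected runtime so the scheduler does not fall back to
# default_duration.
qsub_args() {
    printf '%s\n' "-pe $PE_NAME $SLOTS -R y -l h_rt=$RUNTIME"
}

if command -v qsub >/dev/null 2>&1; then
    # Show the settings discussed in the thread before submitting.
    qconf -sp "$PE_NAME" | grep '^slots'
    qconf -ssconf | grep -E '^(max_reservation|default_duration)'
    # Word splitting of the argument string is intentional here.
    echo "sleep 10" | qsub $(qsub_args)
else
    # Dry run on machines without Grid Engine installed.
    echo "qsub not found; would run: qsub $(qsub_args)"
fi
```

On a host without Grid Engine the script only prints the command it would have run, so it is safe to try anywhere.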


On Fri, Mar 28, 2014 at 3:49 PM, Reuti <[email protected]> wrote:

> On 28.03.2014 at 21:31, Karun K wrote:
>
> > No other requests, just slots
> >
> > echo "sleep 10" | qsub -pe threaded 19
>
> Yes, this job. Was this also the request for all jobs running right now?
>
> Is the slots value set in the PE definition large enough to cover all
> nodes?
>
>
> > default_duration is set to INFINITY currently.
>
> It's best to put a value here like a couple of days representing the usual
> runtime of the jobs, otherwise the INFINITY might allow jobs with a
> reservation to slip in in case no "-l h_rt=..." is requested which sets the
> actual expected runtime (and they might reserve slots beforehand too).
>
> Can you submit the jobs with a reservation ("-R y") and switch on
> reservation by setting "max_reservation" in the scheduler configuration?
>
> -- Reuti
>
>
> > Thanks!
> >
> >
> > On Thu, Mar 27, 2014 at 5:33 PM, Reuti <[email protected]> wrote:
> > On 28.03.2014 at 00:28, Karun K wrote:
> >
> > > We have a PE named "threaded"; each node has 30 slots and 120 GB of RAM.
> > > Jobs requesting 19 or more PE slots are stuck in the queue in qw state with the following error:
> > >
> > > parallel environment:  threaded range: 19
> > > scheduling info:            cannot run in PE "threaded" because it only offers 0 slots
> >
> > This output is (often) misleading.
> >
> > No job requests any "-l h_rt=..." and "-R y" for a reservation? What is the value of "default_duration" in the scheduler configuration?
> >
> > -- Reuti
> >
> >
> > > which doesn't make any sense. Currently there are more than 30 idle nodes with 30 slots each.
> > >
> > > I am running a simple test job; no other complexes are requested.
> > > echo "sleep 10" | qsub -pe threaded 19
> > >
> > > We are using GE 2011.11p1
> > >
> > > Here is the SGE configuration of one of the execution hosts:
> > >
> > > hostname              compute-2-2.local
> > > load_scaling          NONE
> > > complex_values        slots=30,h_vmem=120G,io_slots=30
> > > load_values           arch=linux-x64,num_proc=30,mem_total=123136.023438M, \
> > >                       swap_total=3999.992188M,virtual_total=127136.015625M, \
> > >                       load_avg=11.020000,load_short=11.000000, \
> > >                       load_medium=11.020000,load_long=10.810000, \
> > >                       mem_free=75806.339844M,swap_free=3973.246094M, \
> > >                       virtual_free=79779.585938M,mem_used=47329.683594M, \
> > >                       swap_used=26.746094M,virtual_used=47356.429688M, \
> > >                       cpu=36.200000,m_socket=30,m_core=30,np_load_avg=0.367333, \
> > >                       np_load_short=0.366667,np_load_medium=0.367333, \
> > >                       np_load_long=0.360333
> > > processors            30
> > > user_lists            NONE
> > > xuser_lists           NONE
> > > projects              NONE
> > > xprojects             NONE
> > > usage_scaling         NONE
> > > report_variables      NONE
> > >
> > > Thanks,
> > > _______________________________________________
> > > users mailing list
> > > [email protected]
> > > https://gridengine.org/mailman/listinfo/users
> >
> >
>
>