On 13.01.2014, at 18:33, Joe Borġ wrote:

> Thanks.  Can you please tell me what I'm doing wrong?
> 
> qsub -q test.q -R y -l h_rt=60 -pe test.pe 1 small.bash
> qsub -q test.q -R y -l h_rt=120 -pe test.pe 2 big.bash
> qsub -q test.q -R y -l h_rt=60 -pe test.pe 1 small.bash
> qsub -q test.q -R y -l h_rt=60 -pe test.pe 1 small.bash

Only the parallel job needs "-R y".
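In other words, the sequence above can drop "-R y" from the serial jobs; only big.bash, which needs both slots, has to request a reservation (same job scripts and h_rt values as in your example):

```shell
# Reservation requested only for the parallel job:
qsub -q test.q      -l h_rt=60  -pe test.pe 1 small.bash
qsub -q test.q -R y -l h_rt=120 -pe test.pe 2 big.bash
qsub -q test.q      -l h_rt=60  -pe test.pe 1 small.bash
qsub -q test.q      -l h_rt=60  -pe test.pe 1 small.bash
```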


> 
> job-ID  prior   name       user         state submit/start at     queue          slots ja-task-ID
> -------------------------------------------------------------------------------------------------
>  156757 0.50000 small.bash joe.borg     qw    01/13/2014 16:45:18                    1
>  156761 0.50000 big.bash   joe.borg     qw    01/13/2014 16:55:31                    2
>  156762 0.50000 small.bash joe.borg     qw    01/13/2014 16:55:33                    1
>  156763 0.50000 small.bash joe.borg     qw    01/13/2014 16:55:34                    1
> 
> ...But when I release...

Is max_reservation set?

Note also that the reservation feature really shows its effect in a steadily
running cluster. If all four jobs are on hold and released at once, I wouldn't
be surprised if the start order isn't strictly FIFO.
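You can check both relevant scheduler settings at once with qconf (the values shown below are illustrative, not taken from your cluster):

```shell
# Show the current scheduler configuration and filter the two
# settings that matter for reservation/backfilling:
qconf -ssconf | egrep 'max_reservation|default_duration'
# max_reservation     16
# default_duration    8760:00:00   # some finite value, not INFINITY
```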


> job-ID  prior   name       user         state submit/start at     queue          slots ja-task-ID
> -------------------------------------------------------------------------------------------------
>  156757 0.50000 small.bash joe.borg     r     01/13/2014 16:56:06 test.q@test        1
>  156762 0.50000 small.bash joe.borg     r     01/13/2014 16:56:06 test.q@test        1
>  156761 0.50000 big.bash   joe.borg     qw    01/13/2014 16:55:31                    2
>  156763 0.50000 small.bash joe.borg     qw    01/13/2014 16:55:34                    1

As job 156762 has the same runtime as 156757, backfilling will occur to use the
otherwise idling core. Whether or not job 156762 is started, the parallel job
156761 will begin at the same point in time; only 156763 shouldn't start before it.
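The backfilling rule here can be sketched with a little arithmetic: a serial job may start in an otherwise idle slot only if its requested h_rt finishes before the parallel job's reservation begins. A toy illustration (not SGE's actual scheduler code), using the times from your example:

```shell
# Toy sketch of the backfill decision: big.bash holds a reservation
# on both slots starting at t=60; a serial job may backfill only if
# now + h_rt <= reservation_start.
now=0
reservation_start=60
for h_rt in 60 61; do
  if [ $((now + h_rt)) -le "$reservation_start" ]; then
    echo "h_rt=$h_rt: backfill allowed"
  else
    echo "h_rt=$h_rt: would delay the reservation, stays queued"
  fi
done
```

With h_rt=60 the job fits exactly and backfills (as 156762 did); one second more and it would delay the reservation, so it waits (as 156763 should).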

-- Reuti


> 
> 
> Thanks
> 
> 
> 
> Regards,
> Joseph David Borġ 
> josephb.org
> 
> 
> On 13 January 2014 17:26, Reuti <[email protected]> wrote:
> On 13.01.2014, at 17:24, Joe Borġ wrote:
> 
> > Hi Reuti,
> >
> > I am using a PE, so that's fine.
> >
> > I've not set either of the other 3.  Will the job be killed if 
> > default_duration is exceeded?
> 
> No. It can be set to any value you like (even a few weeks), but it shouldn't
> be set to "INFINITY", as SGE judges infinity to be smaller than infinity, and
> so backfilling will always occur.
> 
> -- Reuti
> 
> 
> > Thanks
> >
> >
> >
> > Regards,
> > Joseph David Borġ
> > josephb.org
> >
> >
> > On 13 January 2014 16:16, Reuti <[email protected]> wrote:
> > Hi,
> >
> > On 13.01.2014, at 16:58, Joe Borġ wrote:
> >
> > > I'm trying to set up an SGE queue and am having a problem getting the 
> > > jobs to start in the right order.  Here is my example - test.q with 2 
> > > possible slots and the following jobs queued:
> > >
> > > job-ID  prior   name       user         state submit/start at     queue          slots ja-task-ID
> > > -------------------------------------------------------------------------------------------------
> > >  1      0.50000 small.bash joe.borg     qw    01/13/2014 15:43:16                    1
> > >  2      0.50000 big.bash   joe.borg     qw    01/13/2014 15:43:24                    2
> > >  3      0.50000 small.bash joe.borg     qw    01/13/2014 15:43:27                    1
> > >  4      0.50000 small.bash joe.borg     qw    01/13/2014 15:43:28                    1
> > >
> > > I want the jobs to run in that order, but (obviously), when I enable the 
> > > queue, the small jobs fill the available slots and the big job has to 
> > > wait for them to complete.  I'd like it set up so that only job 1 runs;
> > > finishes, then 2 (with both slots), then the final 2 jobs, 3 & 4, 
> > > together.
> > >
> > > I've looked at -R y on submission, but it doesn't seem to work.
> >
> > For the reservation to work (and it's only necessary to request it for the 
> > parallel job) it's necessary to have suitable "h_rt" requests for all jobs.
> >
> > - Do you request any "h_rt" for all jobs?
> > - Do you have a "default_duration" set to a proper value in the schedule 
> > configuration otherwise?
> > - Is "max_reservation" set to a value like 16?
> >
> > -- Reuti
> >
> >
> > > Regards,
> > > Joseph David Borġ
> > > josephb.org
> > > _______________________________________________
> > > users mailing list
> > > [email protected]
> > > https://gridengine.org/mailman/listinfo/users
> >
> >
> 
> 


