We run into this problem continuously. The solution we came up with is to cache the data that is going to be used. Typically there are X cores per host, so the same data is read X times (more if the process is repeated). Each process therefore checks whether the file has already been cached; if not, it tries to lock the file (so multiple copies are avoided). Once it holds the lock, it copies the file to the host.
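Roughly, the pattern looks like this (a simplified, untested sketch: the paths are examples and the lock here is an atomic mkdir, which is not necessarily what the real script uses):

#!/bin/bash
# Sketch of the check-lock-copy pattern described above; paths and
# the locking mechanism are examples only.
SRC="$1"                               # file on the shared NFS export
CACHE_DIR="/scratch/cache"             # local disk on the execution host
CACHED="$CACHE_DIR/$(basename "$SRC")"
LOCK="$CACHED.lock"

mkdir -p "$CACHE_DIR"

if [ ! -f "$CACHED" ]; then
    # mkdir either creates the directory or fails, atomically, so
    # exactly one process wins the lock and performs the copy; the
    # others just wait until it is done.
    if mkdir "$LOCK" 2>/dev/null; then
        cp "$SRC" "$CACHED.tmp" && mv "$CACHED.tmp" "$CACHED"
        rmdir "$LOCK"
    else
        while [ -d "$LOCK" ]; do sleep 1; done
    fi
fi

# Every process on this host now reads the local copy.
echo "$CACHED"

Copying to a temporary name and renaming at the end means that any process which finds the cached file present can trust that it is complete.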
If you want to take a look at the script:
http://bazaar.launchpad.net/~translectures/tlk/trunk/view/head:/scripts/tLtask-train/scripts/get_cache.sh

The problem is that this script is quite system-dependent and that the first processes are going to take more time, as they have to cache the data.

Excerpts from Arnau Bria's message of 2014-03-26 09:30:25 +0100:
> On Tue, 25 Mar 2014 16:53:43 +0100 Reuti wrote:
>
> > Hi,
>
> Hi Reuti,
>
> > On 25.03.2014 at 15:37, Arnau Bria wrote:
> >
> > > I've been looking for a parameter that limits the amount of jobs to
> > > be started in each schedule interval, but I did not find it (man
> > > sge_sched_conf).
> > >
> > > Is there any way to limit that?
> >
> > IIRC there was a similar question on the list before. The solution
> > was to put a random sleep in the queue prolog to avoid overloading
> > of any NFS server from where the jobs will read data (in case that's
> > your goal).
>
> Yes, that's my goal.
> I'll have to study that solution. I don't know if we can add extra
> walltime to users' jobs, as they have to pay per walltime used...
>
> Thanks for your answer,
>
> > -- Reuti
>
> Cheers,
> Arnau

--
NiCo
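PS: the random-sleep prolog Reuti mentions above could be as simple as this (an untested sketch; MAX_DELAY and the idea of a standalone prolog script are just examples, not a tested SGE configuration):

#!/bin/bash
# Hypothetical SGE queue prolog: delay each job by a random number
# of seconds so that simultaneously scheduled jobs do not all hit
# the NFS server at the same moment. MAX_DELAY is a made-up knob.
MAX_DELAY=60
sleep $((RANDOM % MAX_DELAY))
exit 0   # so SGE goes on to start the job normally

Note Arnau's caveat above, though: the sleep may end up being billed as part of the job's walltime.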
