Pretty much:

1. I submit a job for 8 slots.
2. A large number of slots are reserved (maybe the entire cluster).
3. When a single node with 8 slots is available, the job is run and the rest of the reservation is released.

I don’t anticipate needing to run parallel jobs very often, but when they come in, I want them to run before any serial jobs with lower priority.

--
Gary Jackson

On 9/14/17, 10:07 AM, "Reuti" <[email protected]> wrote:

Hi,

Maybe I don't get it in the right way:

> On 14.09.2017 at 15:41, Jackson, Gary L. <[email protected]> wrote:
>
> Thanks! This definitely has me headed in the right direction.
>
> Is there any way to overprovision the reservation? For instance, I’d like to reserve more resources than needed, and then when the needed resources become free, the job is started and the rest of the resources are released?

You want to submit a job requesting 20 cores for an 8 core job. When 20 cores become free, you will use the necessary 8 cores and release the 12 superfluous ones?

> The purpose of this is to reduce turnaround time on the parallel jobs at the expense of delaying serial jobs.

You could also limit the number of serial jobs in general by an RQS, and in addition pack them at one end of the cluster by a queue sort order. Or in a simple setup certain nodes could also be dedicated to the serial jobs only and the rest of the cluster to the parallel ones.

-- Reuti

> --
> Gary Jackson
>
> On 9/13/17, 4:44 PM, "Reuti" <[email protected]> wrote:
>
> Hi,
>
> On 13.09.2017 at 19:15, Jackson, Gary L. wrote:
>
>> I’d like to run multithreaded jobs on a cluster that has previously been used exclusively for serial jobs. The problem is that serial jobs are bypassing the parallel jobs despite the parallel jobs having higher priority. Since an entire node never comes free, a parallel job will never run. How do I set up scheduling policy to suspend scheduling serial jobs until higher-priority parallel jobs are scheduled?
>
> You will need slot reservation for the parallel job. Mainly as a starting point:
>
> $ qconf -ssconf
> ...
> max_reservation 20
> default_duration 8760:00:00
>
> and submit the parallel jobs with "-R y". The best would be to supply all jobs (serial and parallel) with a sensible expected run time by "-l h_rt=…". This enables both slot reservation for the large parallel job (which may keep some reserved slots idle until enough slots are collected) and backfilling, which still allows serial (or smaller parallel) jobs to start on idle reserved cores if it is known for sure that they will finish before the last necessary slot for the parallel job becomes free.
>
> -- Reuti
>
> http://gridengine.org/pipermail/users/2012-July/004090.html

_______________________________________________
SGE-discuss mailing list
[email protected]
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
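
For anyone finding this thread later, here is a rough sketch of the submission side Reuti describes: parallel jobs submitted with "-R y" so they get a reservation, and every job given a run-time estimate with "-l h_rt" so backfilling can work. It only prints the qsub command lines and does not submit anything. The parallel environment name "smp", the run times and the script names are placeholders that do not come from the thread; substitute whatever your site defines.

```shell
#!/bin/sh
# Dry-run sketch: print qsub command lines instead of submitting jobs.
# PE_NAME is an assumption; use a parallel environment that exists at your site.
PE_NAME="smp"

# Parallel job: request a reservation (-R y), slots in a PE, and a run-time limit.
# $1 = number of slots, $2 = h_rt as HH:MM:SS, $3 = job script (placeholder name)
parallel_cmd() {
    echo "qsub -R y -pe $PE_NAME $1 -l h_rt=$2 $3"
}

# Serial job: no reservation, but still a run-time limit so it can be backfilled.
# $1 = h_rt as HH:MM:SS, $2 = job script (placeholder name)
serial_cmd() {
    echo "qsub -l h_rt=$1 $2"
}

parallel_cmd 8 12:00:00 parallel_job.sh
serial_cmd 01:00:00 serial_job.sh
```

Once the printed lines look right for your setup, they can be run by hand. Reservations only take effect if max_reservation in the scheduler configuration is greater than zero, as in the qconf -ssconf output quoted above.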
