Re: [SGE-discuss] Mixing shared memory parallel and serial jobs

Jackson, Gary L. Thu, 14 Sep 2017 07:22:02 -0700

Pretty much:

1. I submit a job for 8 slots
2. A large number of slots are reserved (maybe the entire cluster)
3. When a single node with 8 slots is available, the job is run and the rest of 
the reservation is released.


I don’t anticipate needing to run parallel jobs very often, but when they come 
in, I want them to run before any serial jobs with lower priority.

-- 
Gary Jackson

On 9/14/17, 10:07 AM, "Reuti" <[email protected]> wrote:

    Hi,
    
    Maybe I'm not understanding this correctly:
    
    > On 14.09.2017 at 15:41, Jackson, Gary L. <[email protected]> wrote:
    > 
    > 
    > Thanks! This definitely has me headed in the right direction.
    > 
    > Is there any way to overprovision the reservation? For instance, I’d like
    > to reserve more resources than needed, and then when the needed resources
    > become free, the job is started and the rest of the resources are released?
    
    So you want to submit an 8-core job but request 20 cores for it? And when
    20 cores become free, you would use the 8 you need and release the 12
    superfluous ones?
    
    
    > The purpose of this is to reduce turnaround time on the parallel jobs at
    > the expense of delaying serial jobs.
    
    You could also limit the total number of serial jobs with an RQS (resource
    quota set) and, in addition, pack them at one end of the cluster with a
    queue sort order. Or, in a simple setup, certain nodes could be dedicated to
    serial jobs only and the rest of the cluster to the parallel ones.
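
    A minimal sketch of the RQS idea, assuming the serial jobs run in their own
    cluster queue. The queue name "serial.q", the rule name, and the 64-slot cap
    are illustrative assumptions, not values from this thread. The script only
    writes the rule to a file for review; loading it is left as a comment.

```shell
#!/bin/sh
# Sketch only: writes a resource quota set to a file for review.
# Assumed, not from the thread: queue name "serial.q", the rule name,
# and the 64-slot cap.
RQS_FILE="${TMPDIR:-/tmp}/limit_serial.rqs"

cat > "$RQS_FILE" <<'EOF'
{
   name         limit_serial
   description  "Cap the slots that serial jobs may occupy"
   enabled      TRUE
   limit        queues serial.q to slots=64
}
EOF

cat "$RQS_FILE"
# After reviewing the file, it could be loaded with:
#   qconf -Arqs "$RQS_FILE"
```

    For the packing part, the scheduler configuration has a queue_sort_method
    setting (load or seqno) and each queue has a seq_no attribute; which values
    suit a given cluster depends on its layout.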
    
    -- Reuti
    
    
    > -- 
    > Gary Jackson
    > 
    > On 9/13/17, 4:44 PM, "Reuti" <[email protected]> wrote:
    > 
    >    -----BEGIN PGP SIGNED MESSAGE-----
    >    Hash: SHA1
    > 
    >    Hi,
    > 
    >    On 13.09.2017 at 19:15, Jackson, Gary L. wrote:
    > 
    >> 
    >> I’d like to run multithreaded jobs on a cluster that has previously been
    >> used exclusively for serial jobs. The problem is that serial jobs are
    >> bypassing the parallel jobs despite the parallel jobs having higher
    >> priority. Since an entire node never comes free, a parallel job will never
    >> run. How do I set up scheduling policy to suspend scheduling serial jobs
    >> until higher-priority parallel jobs are scheduled?
    > 
    >    You will need slot reservation for the parallel jobs. As a starting
    >    point:
    > 
    >    $ qconf -ssconf
    >    ...
    >    max_reservation                   20
    >    default_duration                  8760:00:00
    > 
    >    and submit the parallel jobs with "-R y". It is best to give all jobs
    >    (serial and parallel) a sensible expected run time with "-l h_rt=…".
    >    This enables both slot reservation for the large parallel job (which
    >    may keep some reserved slots idle until enough have been collected) and
    >    backfilling, which still allows serial (or smaller parallel) jobs to
    >    start on idle reserved cores if it is known for certain that they will
    >    finish before the last slot the parallel job needs becomes free.
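
    >    The two kinds of submission described above might look like the
    >    following dry-run sketch, which only builds and prints the command
    >    lines. The parallel environment name "smp", the script names, and the
    >    run-time values are assumptions for illustration. It also assumes the
    >    PE uses allocation_rule $pe_slots so that all 8 slots land on one node.

```shell
#!/bin/sh
# Dry-run sketch: builds the two qsub command lines and prints them.
# Assumed, not from the thread: PE name "smp", the script names, and
# the h_rt values.
PE_NAME="smp"
SLOTS=8

# Parallel job: -R y requests a reservation, -pe asks for 8 slots in the
# PE, and -l h_rt tells the scheduler the expected run time.
parallel_cmd="qsub -R y -pe ${PE_NAME} ${SLOTS} -l h_rt=12:00:00 parallel_job.sh"

# Serial job: no reservation, but a realistic h_rt makes it a candidate
# for backfilling onto reserved slots that are still idle.
serial_cmd="qsub -l h_rt=00:30:00 serial_job.sh"

echo "$parallel_cmd"
echo "$serial_cmd"
```

    >    A serial job without an h_rt request falls back to default_duration,
    >    which at 8760:00:00 is far too long for it to be backfilled.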
    > 
    >    - -- Reuti
    > 
    >    http://gridengine.org/pipermail/users/2012-July/004090.html
    >    -----BEGIN PGP SIGNATURE-----
    >    Comment: GPGTools - https://gpgtools.org
    > 
    >    iEYEARECAAYFAlm5mJ0ACgkQo/GbGkBRnRoGhQCeMU0RBHIVWLdeNddKse2sw/jd
    >    RNQAnj7PXXealFk0DYpmv53rca2MFqAM
    >    =5awF
    >    -----END PGP SIGNATURE-----
    > 
    > 
    > 
    
    

_______________________________________________
SGE-discuss mailing list
[email protected]
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
