I finally figured out that --ntasks=1 is what I wanted. My applications
are multi-threaded (pthread), and what I really want is to be able
to bind each pthread to a separate physical core. Setting --cpus-per-task
to the number of threads should work except that my system has
HyperThreading enabled. On the 24-core machine, the configuration
is as follows:
NUMA node0 CPU(s): 0-11,24-35
NUMA node1 CPU(s): 12-23,36-47
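(If it helps, I am assuming the hyperthread siblings can be confirmed with
something like

cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list

which, on this layout, presumably pairs logical CPU 0 with CPU 24, and so on.)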
I can't seem to get SLURM to schedule one thread per physical core. The
relevant parts from my slurm.conf and one of my job submission
scripts are as follows:
slurm.conf:
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CORE,CR_ONE_TASK_PER_CORE
TaskPlugin=task/affinity
TaskPluginParam=Sched
NodeName=localhost NodeAddr=127.0.0.1 Sockets=2 CoresPerSocket=12
ThreadsPerCore=2 Procs=48 State=UNKNOWN
job.batch:
#SBATCH --ntasks=1 # One process
#SBATCH --ntasks-per-core=1 # Does not seem to help?
#SBATCH --cpus-per-task=2 # Two threads
srun /path/to/application
Submitting the above job allocates both CPUs on the same physical core
(i.e., the two hyperthread siblings). I noticed that specifying
--hint=nomultithread has no effect.
Also, I tried setting the ThreadsPerCore param to 1 and the Procs
param to 24 but it did not help either. Once again, it feels like
I'm missing something obvious; any help is appreciated.
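For reference, here is the variant I plan to try next. This is only a
sketch; it assumes that --hint=nomultithread is honored for batch
submissions and that srun's --cpu_bind=cores option works together with
TaskPlugin=task/affinity:

#SBATCH --ntasks=1                # One process
#SBATCH --cpus-per-task=2         # Two threads
#SBATCH --hint=nomultithread      # Use only physical cores, if honored
srun --cpu_bind=cores /path/to/application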
Thanks again,
Rohan
> On Feb 11, 2016, at 3:31 PM, Rohan Garg <[email protected]> wrote:
>
>
> Thanks for the suggestions, Ben. I tried this on the 24-core machine,
> and it worked!
>
> ----- Original Message -----
> From: "Fitzpatrick, Ben" <[email protected]>
> To: "slurm-dev" <[email protected]>
> Sent: Thursday, February 11, 2016 12:02:02 PM
> Subject: [slurm-dev] Re: Setting up SLURM for a single multi-core node
>
> Hi Rohan,
>
> We use:
>
> SelectTypeParameters=CR_CORE_DEFAULT_DIST_BLOCK
>
> to try to keep jobs on the one socket if possible - the documentation
> describes it like this:
>
> CR_CORE_DEFAULT_DIST_BLOCK
>     Allocate cores within a node using block distribution by
>     default. This is a pseudo-best-fit algorithm that minimizes
>     the number of boards and minimizes the number of sockets
>     (within minimum boards) used for the allocation. This default
>     behavior can be overridden by specifying a particular "-m"
>     parameter with srun/salloc/sbatch. Without this option,
>     cores will be allocated cyclically across the sockets.
>
> and would probably run like this to get 8 cores:
>
> #!/bin/bash
> #SBATCH --ntasks-per-core=1
> #SBATCH --ntasks=8
> srun /path/to/application1
>
> (that or just --ntasks=16).
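>
> If you ever want to request that distribution explicitly for a single
> job rather than relying on the default, I believe the -m/--distribution
> flag mentioned in the excerpt above can be passed to srun, along the
> lines of (untested on my side):
>
> srun -m block:block /path/to/application1
>
> i.e. block distribution across both nodes and sockets.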
>
> Cheers,
>
> Ben
>
> -----Original Message-----
> From: Rohan Garg [mailto:[email protected]]
> Sent: 11 February 2016 16:06
> To: slurm-dev
> Subject: [slurm-dev] Re: Setting up SLURM for a single multi-core node
>
>
> Thanks for the quick response, Benjamin.
>
> My intent in using the `--sockets-per-node=1` option was to try to
> force the allocation of a job onto a single socket for spatial locality.
> But this is not a hard constraint for me. As I said before, my main
> goal is to be able to occupy all the available cores on a multi-core
> machine, given enough jobs.
>
>> You have 3 jobs that each request 1 socket = 3 sockets.
>> You have 2 phys. sockets.
>
> I'm not sure if this is how it's behaving because:
>
> a) just specifying the `--cores-per-socket` option and the
> `--threads-per-core` option doesn't change the behavior of the
> scheduler; and
> b) in another setting with a 24-core machine (12 cores each on 2 sockets),
> depending on how I submit my jobs, I am able to occupy up to 20 cores
> by submitting 5 jobs, each of which uses the `--sockets-per-node=1`
> option (along with the other two options).
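>
> (For what it's worth, I have been checking what actually gets allocated
> with something along the lines of
>
> scontrol -d show job <jobid> | grep CPU_IDs
>
> where <jobid> is a placeholder; I'm assuming the -d flag reports the
> per-node CPU_IDs the same way in 15.08.)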
>
> ----- Original Message -----
> From: "Benjamin Redling" <[email protected]>
> To: "slurm-dev" <[email protected]>
> Sent: Thursday, February 11, 2016 8:18:02 AM
> Subject: [slurm-dev] Re: Setting up SLURM for a single multi-core node
>
> On 2016-02-11 07:36, Rohan Garg wrote:
>>
>> Hello,
>>
>> I'm trying to set up SLURM-15.08.1 on a single multi-core node to
>> manage multi-threaded jobs. The machine has 16 physical cores
>> on 2 sockets with HyperThreading enabled. I'm using the EASY
>> scheduling algorithm with backfilling. The goal is to fully utilize all
>> the available cores at all times.
>>
>> Given a list of three jobs with requirements of 8 cores, 2 cores,
>> and 4 cores, the expectation is that the jobs should be co-scheduled
>> to utilize 14 of the 16 available cores. However, I can't seem to
>> get SLURM to work as expected. SLURM runs the latter two jobs
>> together but refuses to schedule the first job until they finish.
>> (Is this the expected behavior of the EASY-backfilling algorithm?)
>>
>> Here's the list of jobs:
>>
>> $ cat job1.batch
>>
>> #!/bin/bash
>> #SBATCH --sockets-per-node=1
>> #SBATCH --cores-per-socket=8
>> #SBATCH --threads-per-core=1
>> srun /path/to/application1
>>
>> $ cat job2.batch
>>
>> #!/bin/bash
>> #SBATCH --sockets-per-node=1
>> #SBATCH --cores-per-socket=2
>> #SBATCH --threads-per-core=1
>> srun /path/to/application2
>>
>> $ cat job3.batch
>>
>> #!/bin/bash
>> #SBATCH --sockets-per-node=1
>> #SBATCH --cores-per-socket=4
>> #SBATCH --threads-per-core=1
>> srun /path/to/application3
>
> At a quick glance:
>
> In general let the scheduler do the work. Don't micro-manage.
> Be aware that your SBATCH settings are constraints, not hints.
>
> You have 3 jobs that each request 1 socket = 3 sockets.
> You have 2 phys. sockets.
>
> /Benjamin
> --
> FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
> vox: +49 3641 9 44323 | fax: +49 3641 9 44321