2015-09-08 12:55 GMT+02:00 Raymond Wan <[email protected]>:
>
> Dear all,
>
> I'm trying to figure out how to configure a "cluster" with a single
> computer (i.e., the execution and master node are the same). After I
> figure this out, I hope that setting up a cluster with multiple nodes
> will not be too difficult.
>
> In particular, I think the default setting permits only a single job
> per node at a time. However, I'd like to set things up so that more
> than one job can run at a time.
>
> I'm looking at the CPU Management User and Administrator Guide [1],
> and in particular, the Consumable Resources in Slurm page [2]. I hope
> I'm on the right track?
>
> In the examples, I understand the memory (CR_Memory) example. But I
> don't quite understand the CR_CPU_Memory example. What are -N and -n?
> The man page says -N is the number of nodes... so with only one node,
> that is meaningless in my case. -n is the "number of tasks". Is
> "number of tasks" the same as "number of CPUs"?
>
> Is there a reason why the example used both -N and -n and not just -n?
> Do the two parameters interact somehow?
>
> If I have a computer with 2 cores and 10 threads each, that is 20
> CPUs. So -n can range from 1 to 20?
>
> And under SelectTypeParameters, if I set CR_CPU_Memory, then a job
> enters the running state only if both CPU and memory are available.
>
> So far, I hope I'm correct? If so, then my "real" question is that
> the jobs I would like to run are mainly I/O intensive. CPU and memory
> usage is important, but the bottleneck is probably disk I/O. If I've
> set up k disk partitions using an object store, I'd like no more than
> k jobs to run at a time, and I'd like each one to write to a different
> partition.
>
> I *think* this is "impossible" to do, since it would be hard to force
> users to write to one partition and not any others. But I thought
> I'd ask anyway in case there is something within SLURM that I've
> missed. Any suggestions?
>
You can use something like this:

https://github.com/fafik23/slurm_plugins/blob/master/unshare/unshare.c

It uses the unshare syscall (Linux namespaces) to unmount specified
filesystems for a job.

You can use licenses to achieve a kind of limit on the number of jobs
using a specified mountpoint, but... that's not a real IOPS threshold.
Currently I don't know of any Linux mechanism that allows limiting a
process to a specified number of I/O operations per second. On our
side, we've been considering writing our own FUSE filesystem with this
functionality.

If you are using local disks, gres may fit better than licenses...

cheers,
marcin
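[To illustrate the licenses suggestion above: a minimal sketch, assuming k = 4
partitions and made-up license names "part0".."part3" -- adjust to your setup.
In slurm.conf, define one license per disk partition:

    # slurm.conf -- one cluster-wide license token per partition (names are examples)
    Licenses=part0:1,part1:1,part2:1,part3:1

Each job then requests the license for the partition it intends to write to:

    $ sbatch -L part2:1 job.sh

Since each license has a count of 1, at most one job per partition (and at most
k jobs total) will run at a time. As noted above, this only throttles
concurrency; Slurm cannot force the job to actually confine its writes to that
partition.]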
