You shouldn't need to muck with the Slurm cgroups code nor use a full
chroot solution to remove access to certain filesystems for unauthorized
jobs. You can write a simple plugin that puts the job in a private mount
namespace via the unshare(2) system call, marks the new mount tree
private, then unmounts the unauthorized filesystems.

Since you've marked the mount tree in the current namespace as
private, the unmount shouldn't propagate to other namespaces,
so your Lustre filesystems will stay mounted in the default
namespace.

You would need to do this in a spank plugin so that the unshare(2)
and mount/umount calls run in the context of the processes you are
launching.
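
For illustration, here's a rough, untested sketch of what such a plugin
might look like. The plugin name, the hardcoded /p/lcrater2 path, and the
choice of the task_init_privileged callback (the unmount needs root) are
placeholders, and whatever check you'd use to decide that a job is
unauthorized (a license, a spank option, etc.) is left out:

 /* privatefs.c -- sketch of a spank plugin that hides a filesystem
  * from a job by giving its tasks a private mount namespace. */
 #define _GNU_SOURCE
 #include <sched.h>        /* unshare(2), CLONE_NEWNS */
 #include <sys/mount.h>    /* mount(2), umount2(2), MS_REC, MS_PRIVATE */
 #include <slurm/spank.h>

 SPANK_PLUGIN(privatefs, 1);

 /* Runs in the forked task, before privileges are dropped, so the
  * unshare/mount/umount below affect only this job's processes. */
 int slurm_spank_task_init_privileged(spank_t sp, int ac, char **av)
 {
         /* Only meaningful on the remote (slurmstepd) side. */
         if (!spank_remote(sp))
                 return ESPANK_SUCCESS;

         /* New mount namespace for this task. */
         if (unshare(CLONE_NEWNS) < 0)
                 return ESPANK_ERROR;

         /* Equivalent of "mount --make-rprivate /": keeps the unmount
          * from propagating back to the default namespace. */
         if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
                 return ESPANK_ERROR;

         /* Drop the filesystem this job isn't allowed to use.
          * MNT_DETACH avoids EBUSY inside the new namespace. */
         if (umount2("/p/lcrater2", MNT_DETACH) < 0)
                 return ESPANK_ERROR;

         return ESPANK_SUCCESS;
 }

You'd build it as a shared object and load it from plugstack.conf like any
other spank plugin.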

Here's an example of the same process using the unshare(1) shell utility:


 # hype356 /root > df /p/lcrater2
 Filesystem           1K-blocks      Used Available Use% Mounted on
 zwicky-mds2-ib0@o2ib9:/lc2
                     218665928880 125545872548 93120053348  58%
                     /p/lcrater2
 # hype356 /root > unshare -m bash
 hype356@root:df /p/lcrater2
 Filesystem           1K-blocks      Used Available Use% Mounted on
 zwicky-mds2-ib0@o2ib9:/lc2
                     218665928880 125545872548 93120053348  58%
                     /p/lcrater2
 hype356@root:mount --make-rprivate / 
 hype356@root:umount /p/lcrater2
 hype356@root:df /p/lcrater2
 Filesystem           1K-blocks      Used Available Use% Mounted on 
 /dev/sda3            1919906792  26320708 1796060428   2% /


Invoking a new login on this node shows that /p/lcrater2 is actually
still mounted in the default namespace:

 grondo@hype356:~$ df /p/lcrater2
 Filesystem           1K-blocks      Used Available Use% Mounted on
 zwicky-mds2-ib0@o2ib9:/lc2
                 218665928880 125545872548 93120053348  58% /p/lcrater2


mark


Moe Jette <je...@schedmd.com> writes:

> If you define 20 lustre licenses in Slurm and every job using those
> resources requests a license, that should work. Linux cgroups could
> prevent file system access for jobs without a license, but that would
> require some non-trivial changes to the Slurm code.
>
>
> Quoting Marcin Stolarek <stolarek.mar...@gmail.com>:
>
>> 2012/12/16 Aaron Knister <aaron.knis...@gmail.com>
>>
>>> Hi Marcin,
>>>
>> Hi Aaron,
>>
>>
>>> Could you describe the use case for preventing access when the lustre
>>> license isn't specified? That might help me offer a better solution.
>>>
>>
>> For instance, I know that my lustre installation can efficiently support
>> only 20 running jobs, but I have more CPU resources, so I'd like to prevent
>> running more than 20 jobs on lustre while allowing other jobs (those using
>> only local scratch, or not using scratch at all).
>>
>>>
>>> Off the top of my head I could see using prolog/epilog scripts to mount
>>> and unmount lustre as required; however, this only works if your nodes are
>>> job-exclusive (one job per node).
>>>
>>
>> This is not my case; I have a heterogeneous installation with 64 cores on
>> a few nodes, which are sometimes running 64 separate jobs :)
>>
>>>
>>> Another option would be to use a chroot environment for each job and only
>>> bind-mount required filesystems. This approach has a lot of overhead,
>>> though, and I'm not entirely sure of the implementation details. You could
>>> also try the cgroup namespace subsystem. I did some reading this morning
>>> and couldn't find documentation for it, but I had some vague recollection
>>> that you could control which filesystems are accessible to each namespace.
>>>
>> The chroot proposition sounds interesting; I don't know how to use cgroups
>> to limit filesystem IO operations.
>>
>> thank you,
>> marcin
>>
>>>
>>> Sent from my iPhone
>>>
>>> On Dec 16, 2012, at 10:12 AM, Marcin Stolarek <stolarek.mar...@gmail.com>
>>> wrote:
>>>
>>> Hi all,
>>>
>>> I'm thinking about describing lustre as a licence in our slurm
>>> installation. Do you know of any way to prevent a job from writing to or
>>> reading from a specified file system if the user hasn't specified the
>>> appropriate licence string?
>>>
>>> thanks,
>>> marcin
>>>
>>>
>>
