On Wed, Aug 14, 2019 at 05:11:02PM +0200, Nicolas FOURNIALS wrote:
> Hi,
>
> On 14/08/2019 at 16:35, Andreas Haupt wrote:
> > Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
> > that easy. An idea could be to e.g. play with device permissions.
>
> That's what we do by having /dev/nvidia[0-n] files owned by root and
> with permissions 660.
Apologies, I should have followed up on this. It looks like they've
already started work on handling the NVidia device permissions. Look
under the branches section, and there are useful notes in both the
"hardened" and "nvidia_dev_chgrp" branches.
https://github.com/RSE-Sheffield/sge-gpuprolog/b
On 14/08/2019 at 19:50, Dj Merrill wrote:
Thanks everyone for the feedback. I found this on Github that looks
promising:
https://github.com/RSE-Sheffield/sge-gpuprolog
Thanks for pointing it out.
I can probably edit the scripts to also change the permissions on the
/dev/nvidia* devices as well.
Thanks everyone for the feedback. I found this on Github that looks
promising:
https://github.com/RSE-Sheffield/sge-gpuprolog
and this to go with it:
https://gist.github.com/willfurnass/10277756070c4f374e6149a281324841
I can probably edit the scripts to also change the permissions on the
/dev/nvidia* devices as well.
On Wed, 14 Aug 2019 at 7:21am, Dj Merrill wrote:
To date in our HPC Grid running Son of Grid Engine 8.1.9, we've had
single Nvidia GPU cards per compute node. We are contemplating the
purchase of a single compute node that has multiple GPU cards in it, and
want to ensure that running jobs only have access to the GPU resources
they ask for, and don't have access to the others.
In the message dated: Wed, 14 Aug 2019 10:21:12 -0400,
The pithy ruminations from Dj Merrill on
[[gridengine users] Multi-GPU setup] were:
=> To date in our HPC Grid running Son of Grid Engine 8.1.9, we've had
=> single Nvidia GPU cards per compute node. We are contemplating the
=> purchase of a single compute node that has multiple GPU cards in it.
You could probably do this using consumables and resource quotas to
enforce them.
Ian
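A minimal sketch of that approach (the complex name "gpu" and the quota
values are illustrative, not site defaults):

    # 1. Define a consumable complex (qconf -mc), one line per complex:
    #    name  shortcut  type  relop  requestable  consumable  default  urgency
    gpu        gpu       INT   <=     YES          YES         0        0

    # 2. Advertise the GPUs on each multi-GPU host (qconf -me <hostname>):
    complex_values        gpu=4

    # 3. Optionally cap per-user usage with a resource quota set (qconf -arqs):
    {
       name         gpu_per_user
       enabled      TRUE
       limit        users {*} to gpu=2
    }

    # 4. Jobs then request GPUs explicitly:
    qsub -l gpu=1 job.sh

Note that a plain INT consumable only does the accounting - it doesn't
tell a job *which* GPU it was given, which is where the prolog tricks
elsewhere in this thread come in.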
On Wed, Aug 14, 2019 at 8:34 AM Christopher Heiny wrote:
> On Wed, 2019-08-14 at 16:35 +0200, Andreas Haupt wrote:
> > Hi Dj,
> >
> > we do this by setting $CUDA_VISIBLE_DEVICES in a prolog script (and
> > according to what has been requested by the job).
On Wed, 2019-08-14 at 16:35 +0200, Andreas Haupt wrote:
> Hi Dj,
>
> we do this by setting $CUDA_VISIBLE_DEVICES in a prolog script (and
> according to what has been requested by the job).
>
> Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
> that easy. An idea could be to e.g. play with device permissions.
Yes, UGE supports this out of the box. Depending on whether the job is a
regular job or a Docker container, the method used to restrict access to
only the assigned GPU is slightly different. UGE will also only schedule
jobs to nodes where it is guaranteed to be able to do this.
The interface for
Hello,
from a kernel/mechanism point of view, it is perfectly possible to
restrict device access using cgroups. I use that on my current cluster;
it works really well (both for things like CPU cores and GPUs - you only
see what you request, even when using something like 'nvidia-smi').
Sadly, my curre
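For anyone curious about the mechanism, a rough sketch with the cgroup v1
devices controller (paths and the cgroup name are made up for the example;
stock SoGE won't create these for you, so it would have to be wired into a
prolog or similar):

    # Create a devices cgroup for the job and start from deny-all
    cg=/sys/fs/cgroup/devices/sge/job_$JOB_ID
    mkdir -p $cg
    echo "a *:* rwm" > $cg/devices.deny

    # Re-allow the standard character devices every job needs
    for d in "c 1:3" "c 1:5" "c 1:8" "c 1:9" "c 5:0" "c 5:2"; do
        echo "$d rwm" > $cg/devices.allow   # null zero random urandom tty ptmx
    done

    # Allow only the assigned GPU: NVIDIA GPUs are char major 195 with
    # minor = device index; nvidiactl is 195:255 (nvidia-uvm has a dynamic
    # major you would have to look up in /proc/devices)
    echo "c 195:0   rw" > $cg/devices.allow
    echo "c 195:255 rw" > $cg/devices.allow

    # Move the job's process tree into the cgroup
    echo $$ > $cg/cgroup.procs

With that in place, even 'nvidia-smi' inside the job only sees the allowed
device, matching the behaviour described above.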
Hi,
On 14/08/2019 at 16:35, Andreas Haupt wrote:
Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
that easy. An idea could be to e.g. play with device permissions.
That's what we do by having /dev/nvidia[0-n] files owned by root and
with permissions 660.
Prolog (executed as root) then grants the job owner's group access to
the requested devices.
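A minimal sketch of such a prolog (assuming the queue passes $job_owner
as an argument, which queue_conf(5) supports; how the assigned device ids
reach the script is site-specific, so $ASSIGNED_GPU_IDS is a placeholder):

    #!/bin/sh
    # gpu_prolog.sh, run as root, configured in the queue e.g. as:
    #   prolog   root@/opt/sge/scripts/gpu_prolog.sh $job_owner
    job_owner=$1
    for id in $ASSIGNED_GPU_IDS; do
        # permissions stay 660; only the group changes
        chgrp "$(id -gn "$job_owner")" "/dev/nvidia$id"
    done
    # the matching epilog would put the devices back:
    #   chgrp root /dev/nvidia$id

One caveat: chgrp opens the device to everyone in the owner's primary
group, so a per-user ACL (setfacl -m "u:$job_owner:rw" /dev/nvidiaN) can
be a tighter alternative.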
Hi DJ,
I'm not sure if SoGE supports it, but UGE has the concept of "resource
maps" (aka RSMAP) complexes which we use to assign specific hardware
resources to specific jobs. It functions sort of as a hybrid array/scalar
consumable.
It looks like this in the host complex_values configuration:
cu
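A hedged sketch of what such a configuration can look like in UGE (ids
and names are illustrative; check the UGE documentation for your version):

    # Complex definition (qconf -mc): type RSMAP instead of INT
    #   name  shortcut  type   relop  requestable  consumable  default  urgency
    gpu       gpu       RSMAP  <=     YES          YES         0        0

    # Host configuration (qconf -me <host>): two GPUs with ids 0 and 1
    complex_values        gpu=2(0 1)

    # A job requests one; UGE then exports the granted id to the job
    # (as $SGE_HGR_gpu), so a prolog or the job itself can set
    # CUDA_VISIBLE_DEVICES from it:
    qsub -l gpu=1 job.sh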
Hi Dj,
we do this by setting $CUDA_VISIBLE_DEVICES in a prolog script (and
according to what has been requested by the job).
Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
that easy. An idea could be to e.g. play with device permissions.
Cheers,
Andreas
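A rough sketch of that idea (how the prolog learns the granted ids is
site-specific, so $granted is a placeholder; note that an SGE prolog
cannot change the job's environment directly, hence the file the job
sources):

    #!/bin/sh
    # prolog: record which GPUs this job may use; $TMPDIR is the per-job
    # scratch directory SGE creates for both the prolog and the job
    granted="0"                                    # site-specific discovery
    echo "export CUDA_VISIBLE_DEVICES=$granted" > "$TMPDIR/gpu_env.sh"

    # the job script then begins with:
    #   . "$TMPDIR/gpu_env.sh"

Since CUDA_VISIBLE_DEVICES is purely cooperative, a job can simply unset
it - which is exactly why the replies in this thread reach for device
permissions or cgroups as well.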
On Wed, 2019-08-
To date in our HPC Grid running Son of Grid Engine 8.1.9, we've had
single Nvidia GPU cards per compute node. We are contemplating the
purchase of a single compute node that has multiple GPU cards in it, and
want to ensure that running jobs only have access to the GPU resources
they ask for, and don't have access to the others.