Re: [gridengine users] Multi-GPU setup

2019-08-28 Thread Hay, William
On Wed, Aug 14, 2019 at 05:11:02PM +0200, Nicolas FOURNIALS wrote:
> Hi,
>
> On 14/08/2019 at 16:35, Andreas Haupt wrote:
> > Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
> > that easy. An idea could be to e.g. play with device permissions.
>
> That's what we do by having…

Re: [gridengine users] Multi-GPU setup

2019-08-20 Thread Dj Merrill
Apologies, I should have followed up on this. It looks like they've already started work on handling the NVidia device permissions. Look under the branches section, and there are useful notes in both the "hardened" and "nvidia_dev_chgrp" branches. https://github.com/RSE-Sheffield/sge-gpuprolog/b…

Re: [gridengine users] Multi-GPU setup

2019-08-20 Thread Nicolas FOURNIALS
On 14/08/2019 at 19:50, Dj Merrill wrote:
> Thanks everyone for the feedback. I found this on Github that looks
> promising: https://github.com/RSE-Sheffield/sge-gpuprolog

Thanks for pointing it out.

> I can probably edit the scripts to also change the permissions on the
> /dev/nvidia* devices as…

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Dj Merrill
Thanks everyone for the feedback. I found this on Github that looks promising:
https://github.com/RSE-Sheffield/sge-gpuprolog
and this to go with it:
https://gist.github.com/willfurnass/10277756070c4f374e6149a281324841
I can probably edit the scripts to also change the permissions on the /dev/nvidia* devices as…

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Joshua Baker-LePain
On Wed, 14 Aug 2019 at 7:21am, Dj Merrill wrote:
> To date in our HPC Grid running Son of Grid Engine 8.1.9, we've had
> single Nvidia GPU cards per compute node. We are contemplating the
> purchase of a single compute node that has multiple GPU cards in it,
> and want to ensure that running jobs only have access to the GPU
> resources they ask for…

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread bergman
In the message dated: Wed, 14 Aug 2019 10:21:12 -0400,
The pithy ruminations from Dj Merrill on
[[gridengine users] Multi-GPU setup] were:
=> To date in our HPC Grid running Son of Grid Engine 8.1.9, we've had
=> single Nvidia GPU cards per compute node. We are contemplating the
=> purchase of a single compute node that has multiple GPU cards in it…

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Ian Kaufman
You could probably do this using consumables and resource quotas to enforce them.

Ian

On Wed, Aug 14, 2019 at 8:34 AM Christopher Heiny wrote:
> On Wed, 2019-08-14 at 16:35 +0200, Andreas Haupt wrote:
> > Hi Dj,
> >
> > we do this by setting $CUDA_VISIBLE_DEVICES in a prolog script (and
> > according to what has been requested by the job)…
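
A minimal sketch of the consumable-plus-quota approach Ian mentions; the complex name "gpu", the host name, and the limit values are illustrative assumptions, not details from this thread:

    # qconf -mc: a per-host consumable counting GPUs
    #name  shortcut  type  relop  requestable  consumable  default  urgency
    gpu    gpu       INT   <=     YES          YES         0        0

    # qconf -me gpunode01: this node offers 4 GPUs
    complex_values    gpu=4

    # qconf -arqs: resource quota capping GPU use per user
    {
       name         max_user_gpus
       description  "No user may hold more than 4 GPUs at once"
       enabled      TRUE
       limit        users {*} to gpu=4
    }

A plain INT consumable only counts devices, though; it does not tell a job which GPU it was granted, which is why the rest of the thread turns to prolog scripts, resource maps, and cgroups.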

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Christopher Heiny
On Wed, 2019-08-14 at 16:35 +0200, Andreas Haupt wrote:
> Hi Dj,
>
> we do this by setting $CUDA_VISIBLE_DEVICES in a prolog script (and
> according to what has been requested by the job).
>
> Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
> that easy. An idea could be to e.g. play with device permissions…

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Friedrich Ferstl
Yes, UGE supports this out of the box. Depending on whether the job is a regular job or a Docker container, the method used to restrict access only to the assigned GPU is slightly different. UGE will also only schedule jobs onto nodes where it is guaranteed to be able to do this. The interface for…

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Tina Friedrich
Hello, from a kernel/mechanism point of view it is perfectly possible to restrict device access using cgroups. I use that on my current cluster and it works really well (both for things like CPU cores and GPUs - you only see what you request, even using something like 'nvidia-smi'). Sadly, my current…
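
For context, a hand-run sketch of the cgroup-v1 devices-controller mechanism Tina is referring to; a real integration would do this from the scheduler's prolog/epilog, and the cgroup name, $JOB_PID, and the choice of GPU 0 are assumptions:

    # create a devices cgroup for the job and deny everything by default
    mkdir /sys/fs/cgroup/devices/job_12345
    echo 'a *:* rwm' > /sys/fs/cgroup/devices/job_12345/devices.deny

    # re-allow basic character devices (/dev/null, /dev/zero, ...)
    echo 'c 1:* rwm' > /sys/fs/cgroup/devices/job_12345/devices.allow
    # ... and only GPU 0; NVIDIA devices use major 195, and 195:255 is
    # /dev/nvidiactl, which the driver always needs
    echo 'c 195:0 rw'   > /sys/fs/cgroup/devices/job_12345/devices.allow
    echo 'c 195:255 rw' > /sys/fs/cgroup/devices/job_12345/devices.allow

    # move the job's shell into the cgroup; children inherit it
    echo "$JOB_PID" > /sys/fs/cgroup/devices/job_12345/tasks

In practice you would also need to allow ttys, pts, and /dev/nvidia-uvm (whose major number is assigned dynamically). Processes in the cgroup then get EPERM on the other /dev/nvidia* nodes, so, as Tina notes, nvidia-smi only shows the allowed device.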

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Nicolas FOURNIALS
Hi,

On 14/08/2019 at 16:35, Andreas Haupt wrote:
> Preventing access to the 'wrong' gpu devices by "malicious jobs" is not
> that easy. An idea could be to e.g. play with device permissions.

That's what we do by having /dev/nvidia[0-n] files owned by root and with permissions 660. Prolog (executed…
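
Since Nicolas' message is cut off, here is a minimal sketch of the permission dance he describes. It assumes the prolog/epilog are configured to run as root (root@/path/to/script in qconf -mconf) and that the assigned device numbers arrive in a GPU_IDS variable; both are assumptions, not details from his setup:

    # prolog fragment (as root): hand the assigned devices
    # to the job owner's primary group
    for dev in $GPU_IDS; do
        chgrp "$(id -gn "$USER")" "/dev/nvidia$dev"
    done

    # epilog fragment (as root): return the devices to root:root
    for dev in $GPU_IDS; do
        chgrp root "/dev/nvidia$dev"
        chmod 660  "/dev/nvidia$dev"
    done

With the devices at root:root 660 by default, a job can only open the nodes its prolog handed over, regardless of what CUDA_VISIBLE_DEVICES says.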

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Skylar Thompson
Hi DJ, I'm not sure if SoGE supports it, but UGE has the concept of "resource maps" (aka RSMAP) complexes, which we use to assign specific hardware resources to specific jobs. It functions sort of as a hybrid array/scalar consumable. It looks like this in the host complex_values configuration: cu…
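
Skylar's configuration line is truncated above, so here is only a hedged sketch of what a UGE resource map typically looks like; the complex name "gpu", the host name, and the device ids are assumptions:

    # qconf -mc: an RSMAP complex holding per-device ids
    #name  shortcut  type   relop  requestable  consumable  default  urgency
    gpu    gpu       RSMAP  <=     YES          YES         NONE     0

    # qconf -me gpunode01: two GPUs, with ids 0 and 1
    complex_values    gpu=2(0 1)

A job then requests e.g. -l gpu=1; UGE picks a free id from the map and (as far as I know) exposes it to the job in $SGE_HGR_gpu, which a wrapper or prolog can copy into CUDA_VISIBLE_DEVICES.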

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Andreas Haupt
Hi Dj,

we do this by setting $CUDA_VISIBLE_DEVICES in a prolog script (and according to what has been requested by the job).

Preventing access to the 'wrong' gpu devices by "malicious jobs" is not that easy. An idea could be to e.g. play with device permissions.

Cheers,
Andreas

On Wed, 2019-08-…
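
Andreas' script itself isn't shown, so the following is only an illustrative sketch of the prolog idea, under several assumptions: one GPU per job, four devices, a lock directory the prolog can write to, and a job script that sources the file left in $TMPDIR (a variable exported inside a prolog does not propagate to the job on its own):

    #!/bin/bash
    # prolog sketch: grab the first free GPU via an atomic mkdir lock
    LOCKROOT=/var/run/gpu-locks        # assumed location
    mkdir -p "$LOCKROOT"
    for dev in 0 1 2 3; do             # assumed device ids
        if mkdir "$LOCKROOT/gpu$dev" 2>/dev/null; then
            echo "$JOB_ID" > "$LOCKROOT/gpu$dev/owner"
            # the job script sources this file to pick up its device
            echo "export CUDA_VISIBLE_DEVICES=$dev" > "$TMPDIR/gpu_env.sh"
            exit 0
        fi
    done
    exit 100      # no free GPU: exit code 100 puts the job in error state

A matching epilog removes the lock directory again. Note that this only steers well-behaved jobs: nothing stops a job from ignoring CUDA_VISIBLE_DEVICES, which is exactly the "malicious jobs" problem the rest of the thread is about.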

[gridengine users] Multi-GPU setup

2019-08-14 Thread Dj Merrill
To date in our HPC Grid running Son of Grid Engine 8.1.9, we've had single Nvidia GPU cards per compute node. We are contemplating the purchase of a single compute node that has multiple GPU cards in it, and want to ensure that running jobs only have access to the GPU resources they ask for, and d…