SGE has no knowledge of GPUs. Defining a consumable such as "ngpus" is
one way to count them, but SGE still does not know which GPU is
assigned to which job (or process).
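As a reference, setting up such a consumable looks roughly like this
(the names "ngpus" and "node01" here are just examples, not anything
SGE predefines):

    # add the complex via "qconf -mc", one line like:
    #   ngpus   ngpus   INT   <=   YES   YES   0   0
    # then set the number of GPUs on each exec host:
    qconf -aattr exechost complex_values ngpus=2 node01

and jobs request it with "qsub -l ngpus=1 job.sh". That gives you the
counting, but not the mapping to a physical device.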

What I did was write a script that assigns the available GPU id(s) to a
job (or MPI process), much like an SGE load sensor, but placed in
/etc/profile.d (Red Hat Linux) so that it runs for every process
(including those started over ssh as the qrsh transport), which is very
useful for parallel GPU jobs. I also set the GPUs to "Exclusive Thread"
compute mode. For a parallel job, your program either needs to pick the
usable GPU for each process (the one assigned by the script and
exported as $CUDA_VISIBLE_DEVICES), or you can leave the GPU id unset
in the program for each process and let CUDA pick up the visible device
on its own.
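
Roughly, the profile.d script does something like this (just a sketch
of the idea, not the exact script I use; it assumes nvidia-smi is
available and the query options may differ with driver version):

    # /etc/profile.d/cuda_gpu.sh (sketch)
    # pick the first GPU that has no compute process on it and export it;
    # with the cards in "Exclusive Thread" mode, a process that ends up
    # on an already-busy device fails instead of silently sharing it.
    for id in $(nvidia-smi --query-gpu=index --format=csv,noheader); do
        busy=$(nvidia-smi -i "$id" --query-compute-apps=pid --format=csv,noheader | wc -l)
        if [ "$busy" -eq 0 ]; then
            export CUDA_VISIBLE_DEVICES=$id
            break
        fi
    done

If two processes on the same node run the script at the same time they
can still race for the same card, so in practice you may want some
locking around it.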

On Wed, Nov 19, 2014 at 11:41 AM, Kevin Taylor
<[email protected]> wrote:
>
> The catch for us is that we want to know which GPU we're using for the code
> we have.
>
>
>> Date: Wed, 19 Nov 2014 16:58:32 +0100
>> From: [email protected]
>> To: [email protected]; [email protected]
>> Subject: Re: [gridengine users] Requesting a resource OR another resource
>
>>
>> Hi.
>>
>> You have two GPUs on one host, so why not define a consumable resource
>> gpu=2 and request it with -l gpu=1?
>>
>> The value of gpu will be decreased by one, and it would be possible for
>> another job to ask for the remaining gpu, or you could request two GPUs
>> for one job with -l gpu=2.
>>
>> Best regards.
>> Robi
>>
>>
>> On 19.11.2014 14:44, Kevin Taylor wrote:
>> > I'm not sure if this is possible or not, but thought I'd ask it.
>> >
>> > We have a setup of consumable resources for our GPUs. If a system has
>> > two, we have complexes called gpu1_free and gpu2_free. They'll be equal
>> > to 1 if the GPU is free and 0 if it's not. Typically we just request
>> > like this: qsub -l gpu1_free=1 job.sh
>> >
>> > Is there a way though qsub to say
>> >
>> > qsub -l gpu1_free=1 OR gpu2_free=1 job.sh
>> >
>> > I know putting multiple -l's will ask for both, but we just want one or
>> > the other, whichever.
>> >
>> > Univa has that nice RSMAP feature that would solve our issue, but we
>> > haven't worked out finances on that yet so we're seeing if we can just
>> > make it work a little easier with what we have.
>> >
>> > Thanks.
>> >
>> >
>> >
>>
>
>
