Thanks, Ian!

I haven't checked the GPU load sensor in detail either. It sounds to
me as if it only tracks the number of GPUs allocated to a job, but the
job doesn't know which GPUs it actually got, so it can't set
CUDA_VISIBLE_DEVICES (some programs need this environment variable to
be set). This can be done by writing some scripts/programs, but to me
it is not a reliable solution, since on a multi-GPU node some jobs may
still happen to collide with each other on the same GPU. If GE could
remember which GPUs are allocated to each job, that would be perfect.
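For what it's worth, here is a rough sketch of the script approach I
mean, as an SGE prolog. It claims free GPUs with lock directories so
concurrent jobs on one node don't land on the same device. The GPU
count (0-3), the lock path, and the NGPUS_REQUESTED variable are all
made up for illustration; they are not anything SGE provides:

```shell
#!/bin/sh
# Hypothetical prolog sketch: claim free GPUs via lock directories so
# concurrent jobs on the same node do not collide on one GPU.
# LOCKDIR and NGPUS_REQUESTED are illustrative names, not SGE features.
LOCKDIR=${LOCKDIR:-/tmp/gpu_locks}
mkdir -p "$LOCKDIR"

claimed=""
for gpu in 0 1 2 3; do                       # assume a 4-GPU node
    # mkdir is atomic, so two jobs cannot both claim the same GPU
    if mkdir "$LOCKDIR/gpu$gpu" 2>/dev/null; then
        claimed="${claimed:+$claimed,}$gpu"
        n=$(echo "$claimed" | tr ',' '\n' | wc -l)
        [ "$n" -ge "${NGPUS_REQUESTED:-1}" ] && break
    fi
done

# Jobs that honor this variable will only see the claimed GPUs.
export CUDA_VISIBLE_DEVICES="$claimed"
```

A matching epilog would have to remove the job's lock directories
again, and a stale-lock cleanup is needed if a job dies uncleanly,
which is exactly why it would be nicer if GE tracked this itself.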


On Mon, Apr 14, 2014 at 1:46 PM, Ian Kaufman <ikauf...@eng.ucsd.edu> wrote:
> I believe there already is support for GPUs - there is a GPU Load
> Sensor in Open Grid Engine. You may have to build it yourself, I
> haven't checked to see if it comes pre-packaged.
>
> Univa has Phi support, and I believe OGE/OGS has it as well, or at
> least has been working on it.
>
> Ian
>
> On Mon, Apr 14, 2014 at 10:35 AM, Feng Zhang <prod.f...@gmail.com> wrote:
>> Hi,
>>
>> Is there any plan to implement GPU resource management in SGE in
>> the near future, like Slurm or Torque have? There are some ways to do
>> this using scripts/programs, but I wonder whether SGE itself can
>> recognize and manage GPUs (and Phi). It doesn't need to be
>> complicated or powerful, just do the basic work.
>>
>> Thanks,
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
>
>
>
> --
> Ian Kaufman
> Research Systems Administrator
> UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu