Thanks, Ian! I haven't looked at the GPU load sensor in detail either. It sounds like it only reports the number of GPUs allocated to a job, but the job doesn't know which GPUs it actually got, so it can't set CUDA_VISIBLE_DEVICES (some programs need this environment variable to be set). This can be done with some scripts/programs, but to me it isn't an accurate solution, since jobs may still happen to collide on the same GPU on a multi-GPU node. If GE could keep a record of which GPUs are allocated to each job, that would be perfect.
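For what it's worth, the scripts/programs workaround I have in mind looks roughly like the sketch below: a job prolog claims per-GPU lock files on the node and exports CUDA_VISIBLE_DEVICES from whatever it managed to grab. All the names here (claim_gpus, release_gpus, the lock directory) are made up for illustration, and this is exactly the kind of thing that stays inaccurate without GE's help, since a job that ignores the variable can still land on a busy GPU:

```python
# Illustrative prolog-style helper: claim N GPUs via per-GPU lock files
# and build a CUDA_VISIBLE_DEVICES string. Names and paths are invented
# for this sketch; nothing here is part of SGE itself.
import os
import tempfile

def claim_gpus(num_needed, total_gpus, lock_dir):
    """Claim num_needed GPU indices by creating per-GPU lock files.

    os.O_CREAT | os.O_EXCL makes the create atomic, so two jobs starting
    at the same time cannot claim the same GPU. Raises RuntimeError (after
    rolling back partial claims) if not enough GPUs are free.
    """
    claimed = []
    for gpu in range(total_gpus):
        if len(claimed) == num_needed:
            break
        path = os.path.join(lock_dir, "gpu%d.lock" % gpu)
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            claimed.append(gpu)
        except FileExistsError:
            continue  # already claimed by another job
    if len(claimed) < num_needed:
        for gpu in claimed:  # roll back partial claims
            os.unlink(os.path.join(lock_dir, "gpu%d.lock" % gpu))
        raise RuntimeError("not enough free GPUs")
    return claimed

def release_gpus(gpus, lock_dir):
    """Epilog counterpart: free the lock files claimed by the job."""
    for gpu in gpus:
        os.unlink(os.path.join(lock_dir, "gpu%d.lock" % gpu))

if __name__ == "__main__":
    lock_dir = tempfile.mkdtemp()
    mine = claim_gpus(2, 4, lock_dir)
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in mine)
    print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints "0,1"
    release_gpus(mine, lock_dir)
```

The lock directory has to live on node-local storage (not a shared filesystem) for the per-node bookkeeping to make sense.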
On Mon, Apr 14, 2014 at 1:46 PM, Ian Kaufman <ikauf...@eng.ucsd.edu> wrote:
> I believe there already is support for GPUs - there is a GPU load
> sensor in Open Grid Engine. You may have to build it yourself; I
> haven't checked to see if it comes pre-packaged.
>
> Univa has Phi support, and I believe OGE/OGS has it as well, or at
> least has been working on it.
>
> Ian
>
> On Mon, Apr 14, 2014 at 10:35 AM, Feng Zhang <prod.f...@gmail.com> wrote:
>> Hi,
>>
>> Is there any plan to implement GPU resource management in SGE in
>> the near future, like Slurm or Torque have? There are ways to do this
>> using scripts/programs, but I wonder whether SGE itself can
>> recognize and manage GPUs (and Phi). It doesn't need to be complicated
>> or powerful, just do the basic work.
>>
>> Thanks,
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
>
> --
> Ian Kaufman
> Research Systems Administrator
> UC San Diego, Jacobs School of Engineering
> ikaufman AT ucsd DOT edu
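For anyone following along: a load sensor, as I understand it, is just a long-running program that sge_execd talks to over stdin/stdout - it waits for a line, answers with a begin/end framed report of host:complex:value lines, and exits when it reads "quit". A minimal sketch of that shape is below; the complex name "num_gpu" and the use of nvidia-smi are my own assumptions, not what the actual OGE GPU load sensor does:

```python
# Minimal sketch of an SGE-style load sensor reporting a GPU count.
# Assumes a "num_gpu" complex has been defined (e.g. via qconf -mc) and
# that nvidia-smi is installed; both are assumptions for this example.
import socket
import subprocess
import sys

def count_gpus():
    """Count GPUs via `nvidia-smi --list-gpus`; 0 if the tool is absent."""
    try:
        out = subprocess.check_output(["nvidia-smi", "--list-gpus"])
        return len(out.splitlines())
    except (OSError, subprocess.CalledProcessError):
        return 0

def report_gpus():
    """Emit one load report in the sensor's begin/end framing."""
    print("begin")
    print("%s:num_gpu:%d" % (socket.gethostname(), count_gpus()))
    print("end")
    sys.stdout.flush()

def main():
    # sge_execd sends a newline when it wants a report and "quit" to stop.
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        report_gpus()

if __name__ == "__main__":
    main()
```

Note this only tells the scheduler how many GPUs a node has - it still doesn't tell a job *which* GPUs it got, which is the gap discussed above.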