On Tue, May 15, 2012 at 6:11 AM, Semi <[email protected]> wrote:
> Can you give me more detailed answer and correct my definitions.

Hi Semi,

I was away for the past 2 days.

Please always cc the list when you reply. (Reuti, Ron, and I always suggest this: there are many ways to configure Grid Engine, others may see something that we don't, and it is usually better to get feedback from more people.)

On the other hand, if you really need it, you might consider support ( http://www.scalablelogic.com/scalable-grid-engine-support ). There is always someone who can respond to your questions even when I am away.

> qconf -sc|grep gpu
> gpu gpu INT <= YES YES 0 0
>
> qconf -me sge135
> hostname sge135
> load_scaling NONE
> complex_values gpu=2
>
> qconf -mconf sge135
> sge135:
> mailer /bin/mail
> xterm /usr/bin/X11/xterm
> qlogin_daemon /usr/sbin/in.telnetd
> rlogin_daemon /usr/sbin/in.rlogind
> load_sensor /storage/SGE6U8/gpu-load-sensor/cuda_sensor

Note that if you statically define a host to have 2 GPUs, then you don't need to use the cuda_sensor.

The GPU load sensor distributed by the Open Grid Scheduler project (which you can find in other Grid Engine implementations) is very similar to Bright Computing's GPU Management in the Bright Cluster Manager:

http://www.brightcomputing.com/NVIDIA-GPU-Cluster-Management-Monitoring.php

We both monitor temperature, fan speed, voltage, ECC, etc. When we started the GPU load sensor development we didn't know that Bright had something similar...

From a scheduling point of view, you can ignore most of that. Some sites like to bias node priority based on GPU temperature, and in some cases, if the ECC errors are really bad, the GPU should not be used for GPU jobs.

> qsub -l gpu=1 test.sh
>
> And if I need parallel run on GPU. What I have to do? How define pe for GPU?

You just use "qsub -l gpu=2" if you want to use 2 GPUs for that job.
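
If it helps to see what a load sensor actually does before you plug one in, here is a minimal sketch in shell. It is not the real gpu_sensor.c; it only assumes the usual Grid Engine load sensor protocol, where execd writes a line to the sensor's stdin for each load interval, the sensor answers with a "begin" / "host:name:value" / "end" block, and the sensor exits when it reads "quit". The attribute name gpu_demo, the fixed value 2, and the RUN_SENSOR switch are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch of the Grid Engine load sensor protocol (NOT the real gpu_sensor.c).
# The attribute name and value used below are hypothetical placeholders.

# emit_report HOST NAME VALUE: print one load report block.
emit_report() {
    echo "begin"
    echo "$1:$2:$3"
    echo "end"
}

# sensor_loop HOST NAME VALUE: answer each request line read from stdin
# with one report, and stop on "quit" (or when stdin is closed).
sensor_loop() {
    while read -r request; do
        [ "$request" = "quit" ] && return 0
        emit_report "$1" "$2" "$3"
    done
}

# Only enter the loop when asked to (RUN_SENSOR is a hypothetical switch),
# so the file can also be sourced and the functions tried by hand.
if [ "${RUN_SENSOR:-0}" = "1" ]; then
    sensor_loop "$(hostname)" "gpu_demo" 2
fi
```

Run it interactively first, the same way as with the compiled sensor: start it with RUN_SENSOR=1, press Enter to get a report, and type quit to stop it.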

Rayson

> On 5/14/2012 2:51 PM, Rayson Ho wrote:
>
> Just get the load sensor from:
>
> https://gridscheduler.svn.sourceforge.net/svnroot/gridscheduler/trunk/source/dist/gpu/gpu_sensor.c
>
> Compile it on your system - and make sure that it has the CUDA SDK &
> libraries installed (Google is your friend - look for the nvidia-ml
> library).
>
> % cc gpu_sensor.c -lnvidia-ml
>
> Before you use it as a load sensor, compile and run it interactively:
>
> % cc gpu_sensor.c -DSTANDALONE -lnvidia-ml
>
> Make sure that the code is reporting something meaningful on your system.
>
> Rayson
>
> On Mon, May 14, 2012 at 4:55 AM, Semi <[email protected]> wrote:
>
> Please help in GPU integration under SGE and parallel running of NAMD and
> GAMESS on GPU via SGE.
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
