Thanks for a quick reply. You have summarized correctly. So you would recommend a setup where the gpu-frontend-node would manage all machines, cpu and gpu? That is good to know, but requires some rethinking of our current setup.
The desire from the users is to be able to send different workloads to different queues depending on the type of task, i.e. some tasks are better suited to cpu, others to gpu. Additionally, they wish to use the cpus on the gpu-nodes to maximize overall utilization. I am not sure whether this is possible, or whether it is a different debate altogether.

My desire as a sysadmin is to be able to use the all-function as a way of temporarily removing machines for tests or upgrades without users adding more jobs.

Johan

On Fri, Nov 4, 2011 at 9:54 AM, William Hay <[email protected]> wrote:
> On 4 November 2011 07:24, Johan Finstadsveen <[email protected]> wrote:
> > Hi,
> > Unsure whether this is the correct forum for this debate.
> >
> > We are currently in the process of acquiring a gpu-cluster. From before we
> > have a cpu-based cluster running Rocks 5.3 and SGE. The desire from the
> > users is to have three different queues from a single frontend (types of
> > queues: all, cpu, gpu).
> > What is the optimal/best practice setup in this case? Should one frontend
> > administrate all machines (old and new cluster). Or should a dedicated
> > server use SGE to send queues to the two cluster-frontends (ie, have three
> > SGE). Or are there other setups or solutions that are more optimal?
> > Best regards
> > Johan Finstadsveen
>
> If I understand correctly you are debating whether to use a single
> cluster or a front end cluster that uses transfer queues to feed two
> backend clusters? Presumably the all queue would be for CPU jobs that
> can run on the machines to which the GPUs are attached as well as the
> dedicated CPU resource. I'd suggest a single cluster as I believe that
> with the transfer queue setup it would be hard to avoid making an early
> commitment as to which cluster a given job should run on, which could
> lead to unnecessarily wasted resources.
>
> What if anything would be the advantage to the user of selecting the
> cpu queue? Are the machines in the cpu cluster better in some way?
>
> William
