On Mon, May 28, 2012 at 03:16:56PM -0400, Alan McKay wrote:
> Hey folks,
Hola,
> That cluster has a dedicated node for sgemaster, which also exports /gridware to the other nodes. To me this seems wasteful since the load on that system is flatlined at just about zero 100% of the time. It essentially does nothing. I have several months of "munin" data to show this.
It is perhaps wasteful now, but I consider it to be good practice (more or less). Details below.
> This would seem to tell me that I could put sgemaster on one of the compute nodes without putting any excess load on that node. Then repurpose the box running sgemaster.
You could, but I think that is a bad idea (see below).
> As for the /gridware currently exported by the sgemaster, I would move that to our ZFS appliance.
We do the same thing--works well too.
> Do most people keep a dedicated node for sgemaster? Or do most do as I'd like to do and run it on a compute node?
It all depends on your environment. On small clusters with few jobs, dedicated hardware for SGE is probably overkill. On large clusters with thousands of jobs (or more), dedicated hardware is a requirement, and the hardware should be moderately beefy as well.

Running the master on a compute node is a bad idea. As mentioned elsewhere in this thread, processes on the compute node can run amok and may interfere with the qmaster process. On the other hand, it is possible for the qmaster to consume a very large amount of RAM (if not CPU cycles), forcing either the compute jobs or the qmaster itself to swap--and that sucks for everyone.

A compromise is to run the qmaster on a login node. This system typically does not run jobs, and often has sufficient resources (CPU/RAM) to handle the additional load (such as it may be) of running the SGE qmaster.

--
Jesse Becker
NHGRI Linux support (Digicon Contractor)
Specialization is for insects. -- R.A.Heinlein
