I agree - separating the master, login, and storage is wise, specifically to keep one user or one poorly constructed script from ruining everyone's day. For larger, more active clusters, this is my SOP. For smaller ones, I might skip the login node.
As far as specs, these days, finding a low spec system is not really worth it. The price delta isn't huge. But, it also depends on what the head is doing in addition to Grid Engine. If you are using ROCKS, Warewulf, or Perceus, or some other software toolkit to manage and deploy the nodes, you might need more CPU, RAM, and storage.

I personally prefer homogeneous hardware for frontend, login, and compute. Sure, it gives plenty of elbow room for the frontend and login, but I'd prefer that over those systems going down due to random loss of resources.

Obviously give more RAM to the compute nodes, and either do RAM only (lots of RAM) or big disks for scratch. I only do stateless nodes that run in RAM and use any local disks as scratch space.

Ian

On Wed, Jul 20, 2016 at 7:56 AM, Notorious Biggles <notoriousbigg...@gmail.com> wrote:

> Hi all,
>
> I have some money available to replace the infrastructure nodes of one of
> my company's grid engine clusters and I wanted a sanity check before I
> order anything new.
>
> Initially we contacted the company we originally bought the cluster from
> and they quoted us for a combined login/storage/master node with loads of
> everything and a hefty price tag. I feel an aversion to combining login
> nodes with storage and master nodes - we already have that on one of the
> clusters and a user being able to crash the entire cluster seems a bad
> thing to me and it happened often enough.
>
> I read Rayson's blog post about scaling grid engine to 10k nodes at
> http://blogs.scalablelogic.com/2012/11/running-10000-node-grid-engine-cluster.html
> and it seems that 4 cores and 1 GB of memory is more than enough to run a
> grid engine master. Given that I'd be lucky to have 100 nodes to a master,
> can anybody see a reason to spec a high powered master node? I look at my
> existing master nodes with 8+ cores and 24+ GB of memory and in Ganglia all
> I see is acres of green from memory being used as cache and buffers. It
> seems rather a waste.
>
> The other thing I was curious about is what kind of spec seems reasonable
> to you for a login node. My one cluster with separate login nodes has
> similar specs to the master nodes - 8 cores, 24 GB memory and it seems
> wasted. I can see an argument for these nodes to be more than just a low
> end box, especially if anybody is trying to do some kind of visualization
> on them, but I've never had complaints about them being under-powered yet.
>
> Any thoughts you might have are appreciated.
>
> Thanks
> Biggles
>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
>

--
Ian Kaufman
Research Systems Administrator
UC San Diego, Jacobs School of Engineering
ikaufman AT ucsd DOT edu