On Fri, Jun 19, 2015 at 10:03 AM, Daniel Gaston <[email protected]> wrote:
> Hi Everyone,
>
> I've looked through the archives and the web but still have some questions
> on this question.
>
> 1) If I was looking at building a small compute/HPC cluster is Mesos
> overkill in such a situation?
>

Mesos isn't overkill, though there may be platforms developed more specifically for your use case (vs. Mesos, which is extremely generalized).

> 2) What is the minimum number of physical nodes? It seems from
> documentation and examples ideally this is something like 5, with 3 masters
> and say two slaves.
>

Technically speaking, you can do it all with one node. It just depends on what properties you need. Having three masters (or any HA grouping, i.e. an odd number greater than 1) is overkill if high availability isn't a requirement - you can just have a single master node and live with the fact that if it goes down, you can't schedule any new tasks until you bring it back. Unless you have some true HA requirements, it seems intuitively wasteful to have 3 masters and 2 slaves (unless the cost of 5 nodes is inconsequential to you and you hate the environment).

> 3) What are some other good resources in terms of doing this?
> Appropriate specs for individual nodes, particularly where you would likely
> want slave/compute nodes to be much beefier than Master nodes. What other
> equipment would you need, just nodes and switches?
>

Depends on what your workloads look like. Mesos itself (both master and slave) is very thin - under most circumstances it won't even need a whole CPU core to itself. Remember, Mesos itself doesn't do any real work other than coordination - it's the processes you use it to schedule/run that are going to use up the physical resources.

So the question to ask yourself in this situation is: which primary resources does my workload use? Is it CPU heavy, memory heavy, maybe disk or network I/O heavy? That's how you decide what machines to throw at it. The question is more or less the same whether you use Mesos to schedule or not.
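
A minimal sketch of one way to get a first answer to that question for a single representative job, assuming a Unix-like system and Python 3. It runs a command once and compares CPU time to wall-clock time, plus peak memory. The `profile_command` helper and the sample workload are hypothetical and only for illustration; they are not part of Mesos.

```python
import resource
import subprocess
import sys
import time


def profile_command(argv):
    """Run argv once; report wall time, CPU time, and peak memory of child processes."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    completed = subprocess.run(argv, check=False)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # User + system CPU time consumed by the child we just waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "exit_code": completed.returncode,
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        # ru_maxrss is a high-water mark over all waited-for children so far.
        # It is reported in kilobytes on Linux (bytes on macOS).
        "peak_rss": after.ru_maxrss,
        "cpu_to_wall_ratio": cpu / wall if wall > 0 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical stand-in workload; replace with a representative job.
    stats = profile_command(
        [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    )
    for key, value in stats.items():
        print(f"{key}: {value}")
```

As a rough reading: a CPU-to-wall ratio near 1.0 (or above, for multi-threaded jobs) suggests a CPU-bound workload, while a ratio well below 1.0 suggests the job spends most of its time waiting on disk, network, or something else. This does not measure I/O volume directly, so it is only a starting point.
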
Identifying resource requirements should be possible both by understanding what the process does, and by measuring it with standard unix tools.

As for the second part of your question, you just need a set of computers that can run modern Linux and talk to each other over TCP/IP. You probably want them on a private network.

> 4) Would it make sense to have a smaller number of physical nodes split up
> into virtual nodes or will this just make everything much more complex?
>

This is probably not necessary. Mesos has native support for process isolation via cgroups, which obviates one of the advantages of VMs. Structurally, the whole *point* of Mesos is to abstract away the concept of individual machines into pools of compute capacity, so you're kinda working at cross purposes if you go down this road too far.

> Any thoughts, opinions, or directions to resources is much appreciated!
>
> Cheers,
> Dan
>

--
*bigo* / oliver nicholas | staff engineer, infrastructure | uber technologies, inc.

