Re: Thoughts and opinions in physically building a cluster

2015-06-25 Thread Dick Davies
That doesn't sound too bad (it's a fairly typical setup, e.g. on an Amazon VPC).
You probably want to avoid NAT or similar things between the master and the
slaves, otherwise you need a lot of LIBPROCESS_IP tricks, so keeping everything
on the same switch sounds good.
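(For reference, if a slave does end up behind NAT, the usual workaround is to
tell libprocess which address to advertise before starting the daemon; roughly
something like the lines below, where the addresses and paths are just
placeholders:

    export LIBPROCESS_IP=10.0.0.12    # address the masters can reach this slave on
    mesos-slave --master=zk://10.0.0.2:2181/mesos --work_dir=/var/lib/mesos

but it's nicer to avoid needing it at all.)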

Personally I quite like the master/slave distinction.

I wouldn't want a runaway set of tasks to bog down the masters, and
operationally we'd alert if we start to lose masters, whereas the slaves
are 'cattle' and we can just spin up more as they die if need be (it's a
little trickier to scale out masters and ZooKeepers, so they get treated
as though they were a bit less expendable).

I co-locate the zookeeper ensemble on the masters on smaller clusters
to save VM count,
but that's more personal taste than anything.
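For what it's worth, the co-located layout is just each master pointing at the
local ensemble; a rough sketch for a three-node setup (hostnames, quorum size
and paths here are placeholders):

    mesos-master --zk=zk://master1:2181,master2:2181,master3:2181/mesos \
                 --quorum=2 --work_dir=/var/lib/mesos

with ZooKeeper running on the same three boxes.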

On 25 June 2015 at 17:12, Daniel Gaston daniel.gas...@dal.ca wrote:
 So this may be another relatively noob question, but when designing a mesos 
 cluster, is it basically as simple as the nodes connected by a switch? Since 
 any of the nodes can be master nodes or acting as both master and slave, I 
 am guessing there is no need for another head node as you would have with a 
 traditional cluster design. But would each of the nodes then have to be 
 connected to the external/institutional network?

 My rough idea was for this small cluster to not be connected to the main 
 institutional network but for my workstation to be connected to both the 
 cluster's network as well as to the institutional network


 
 From: CCAAT cc...@tampabay.rr.com
 Sent: June-19-15 4:57 PM
 To: user@mesos.apache.org
 Cc: cc...@tampabay.rr.com
 Subject: Re: Thoughts and opinions in physically building a cluster

 On 06/19/2015 01:28 PM, Daniel Gaston wrote:

 On 19/06/2015 18:38, Oliver Nicholas wrote:
 Unless you have some true HA requirements, it seems intuitively
 wasteful to have 3 masters and 2 slaves (unless the cost of 5 nodes is
 inconsequential to you and you hate the environment).
 Any particular reason not to have three nodes which are acting both as
 master and slaves?

 None at all. I'm not a cluster or networking guru, and have only played with 
 mesos in
 cloud-based settings so I wasn't sure how this would work. But it makes 
 sense, that way
 the 'standby' masters are still participating in the zookeeper quorum while 
 still being
 available to do real work as slave nodes.

 Daniel, there is no such thing as a 'cluster guru'. It's all 'seat of
 the pants' flying right now, so you are fine with what you are doing and
 proposing. If code does not exist to meet your specific needs and goals,
 it can (should?) be created.


 I'm working on an architectural expansion where nodes (virtual, actual
 or bare metal) migrate from master -- entrepreneur -- worker -- slave
 -- embedded (bare metal or specially attached hardware). I'm proposing
 to do all of this with the Autonomy_Function, with decisions being made
 bottom-up as opposed to the current top-down dichotomy. I'm probably going
 to have to 'fork codes' for a while to get things stable, and then
 hope they are included once other minds see the validity of the ideas.


 Surely one box can be set up as both master and slave. Moving slaves
 to masters should be an automatic function and will probably be
 addressed in future versions of Mesos.


 PS: Keep pushing your ideas and do not take no for an answer!
 Mesos belongs to everybody.

 hth,
 James



Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread Daniel Gaston
Thanks Oliver, a lot of great suggestions. One of the reasons I was interested
in Mesos was the idea of it being more generalized. While this small HPC
cluster will serve one primary job, it will also be used for research purposes,
so being able to easily test out frameworks and not be 'locked in' to one way
of doing things is appealing. Most jobs are relatively CPU/RAM heavy (and heavy
on small-file disk I/O, unfortunately), but I already have a good handle on
building individual compute servers that would handle that, so they would make
suitable slave/compute nodes. HA would be nice in terms of ensuring turn-around
times on workflows, but it likely isn't a major issue: if the cluster is down
for a few hours, no one will lose any sleep or die. If the node could be
brought back up reasonably quickly, it should be fine.


Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread Brian Candler

On 19/06/2015 18:38, Oliver Nicholas wrote:
Unless you have some true HA requirements, it seems intuitively 
wasteful to have 3 masters and 2 slaves (unless the cost of 5 nodes is 
inconsequential to you and you hate the environment).
Any particular reason not to have three nodes which are acting both as 
master and slaves?


Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread Oliver Nicholas
On Fri, Jun 19, 2015 at 11:22 AM, Brian Candler b.cand...@pobox.com wrote:

 On 19/06/2015 18:38, Oliver Nicholas wrote:

 Unless you have some true HA requirements, it seems intuitively wasteful
 to have 3 masters and 2 slaves (unless the cost of 5 nodes is
 inconsequential to you and you hate the environment).

 Any particular reason not to have three nodes which are acting both as
 master and slaves?


Certainly seems reasonable to me!

-- 
*bigo* / oliver nicholas | staff engineer, infrastructure | uber
technologies, inc.


Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread Oliver Nicholas
On Fri, Jun 19, 2015 at 10:03 AM, Daniel Gaston daniel.gas...@dal.ca
wrote:

  Hi Everyone,

 I've looked through the archives and the web but still have some questions
 on this question.

 1) If I was looking at building a small compute/HPC cluster is Mesos
 overkill in such a situation?


 Mesos isn't overkill, though there may be platforms developed more
specifically for your use case (vs. Mesos which is extremely generalized).


  2) What is the minimum number of physical nodes? It seems from
 documentation and examples ideally this is something like 5, with 3 masters
 and say two slaves.

Technically speaking, you can do it all with one node.  It just depends
what properties you need.  Having three masters (or any HA grouping, i.e. an
odd number greater than 1) is overkill if high availability isn't a
requirement - you can just have a single master node and live with the fact
that if it goes down, you can't schedule any new tasks until you bring it
back.
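For the single-master case that can look as simple as the following (addresses
and paths are placeholders, and no ZooKeeper is involved at all):

    # one master, no HA
    mesos-master --ip=192.168.1.10 --work_dir=/var/lib/mesos
    # each slave points straight at it (5050 is the default master port)
    mesos-slave --master=192.168.1.10:5050 --work_dir=/var/lib/mesos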

Unless you have some true HA requirements, it seems intuitively wasteful to
have 3 masters and 2 slaves (unless the cost of 5 nodes is inconsequential
to you and you hate the environment).


  3) What are some other good resources in terms of doing this?
 Appropriate specs for individual nodes, particularly where you would likely
 want slave/compute nodes to be much beefier than Master nodes. What other
 equipment would you need, just nodes and switches?

Depends what your workloads look like.  Mesos itself (both master and
slave) is very thin - under most circumstances it won't even need a whole
CPU core to itself.  Remember, Mesos itself doesn't do any real work other
than coordination - it's the processes you use it to schedule/run that are
going to use up the physical resources.

So the question you ask yourself in this situation is which primary
resources does my workload use? Is it CPU heavy, memory heavy, maybe disk
or network I/O heavy? That's how you decide what machines to throw at it.
The question is more or less the same whether you use Mesos to schedule or
not.  Identifying resource requirements should be possible both by
understanding what the process does, and by measuring it with standard unix
tools.
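For example, something as crude as the following tells you most of what you
need about a representative job (assuming GNU time and sysstat are installed;
./my_workload is just a stand-in for your actual job):

    /usr/bin/time -v ./my_workload    # look at "Maximum resident set size"
    vmstat 5                          # CPU and memory pressure while it runs
    iostat -x 5                       # disk utilisation while it runs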

As for the second part of your question, you just need a set of computers
that can run modern Linux and talk to each other over TCP/IP.  You probably
want them on a private network.


 4) Would it make sense to have a smaller number of physical nodes split up
 into virtual nodes or will this just make everything much more complex?

This is probably not necessary.  Mesos has native support for process
isolation via cgroups, which obviates one of the advantages of VMs.
Structurally, the whole *point* of Mesos is to abstract away the concept of
individual machines into pools of compute capacity, so you're kinda working
at cross purposes if you go down this road too far.
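(Enabling that isolation is just a slave flag, along the lines of the sketch
below -- exact isolator names can vary a little between Mesos versions, and the
ZooKeeper hosts are placeholders:

    mesos-slave --master=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
                --isolation=cgroups/cpu,cgroups/mem

so VMs don't buy you much beyond what cgroups already give you.)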



  Any thoughts, opinions, or directions to resources is much appreciated!



 Cheers,
 Dan





-- 
*bigo* / oliver nicholas | staff engineer, infrastructure | uber
technologies, inc.


Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread Eelco Maljaars | Maljaars IT
Hi, 


exactly what I’ve been doing for a few smaller setups. I see this as the 
minimum ‘production’ setup. For testing or development I just run everything on 
a single node, sometimes even a VM. 

Regards, 

Eelco 

 On 19 Jun 2015, at 20:23, Oliver Nicholas b...@uber.com wrote:
 
 
 On Fri, Jun 19, 2015 at 11:22 AM, Brian Candler b.cand...@pobox.com 
 mailto:b.cand...@pobox.com wrote:
 On 19/06/2015 18:38, Oliver Nicholas wrote:
 Unless you have some true HA requirements, it seems intuitively wasteful to 
 have 3 masters and 2 slaves (unless the cost of 5 nodes is inconsequential to 
 you and you hate the environment).
 Any particular reason not to have three nodes which are acting both as master 
 and slaves?
 
 Certainly seems reasonable to me!
 
 -- 
 bigo / oliver nicholas | staff engineer, infrastructure | uber technologies, 
 inc.





Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread Dave Martens
Thanks for all of these comments - I had similar questions.

What is the minimum RAM for a master or a slave?  I have heard that the
Mesos slave software adds 1GB of RAM on top of what the slave's workload
processing will require.  I have read that 8GB is the min for a Mesos
machine but it wasn't clear that this was an official/hard requirement.
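(Related to that: one common way to keep headroom for the OS and the mesos-slave
daemon is to advertise slightly less than the physical RAM; the numbers below
are only an example, not a recommendation:

    mesos-slave --master=zk://zk1:2181/mesos --resources='cpus:8;mem:14336'

i.e. a 16GB box offering roughly 14GB to tasks.)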


On Fri, Jun 19, 2015 at 11:31 AM, Eelco Maljaars | Maljaars IT 
ee...@maljaars-it.nl wrote:

 Hi,


 exactly what I’ve been doing for a few smaller setups. I see this as the
 minimum ‘production’ setup. For testing or development I just run
 everything on a single node, sometimes even a VM.

 Regards,

 Eelco


 On 19 Jun 2015, at 20:23, Oliver Nicholas b...@uber.com wrote:


 On Fri, Jun 19, 2015 at 11:22 AM, Brian Candler b.cand...@pobox.com
 wrote:

 On 19/06/2015 18:38, Oliver Nicholas wrote:

 Unless you have some true HA requirements, it seems intuitively wasteful
 to have 3 masters and 2 slaves (unless the cost of 5 nodes is
 inconsequential to you and you hate the environment).

 Any particular reason not to have three nodes which are acting both as
 master and slaves?


 Certainly seems reasonable to me!

 --
 *bigo* / oliver nicholas | staff engineer, infrastructure | uber
 technologies, inc.





Re: Thoughts and opinions in physically building a cluster

2015-06-19 Thread CCAAT

On 06/19/2015 01:45 PM, Dave Martens wrote:

Thanks for all of these comments - I had similar questions.

What is the minimum RAM for a master or a slave?  I have heard that the
Mesos slave software adds 1GB of RAM on top of what the slave's workload
processing will require.  I have read that 8GB is the min for a Mesos
machine but it wasn't clear that this was an official/hard requirement.


There are probably published/standard numbers for the various distros 
that the slave node is built upon (actual or virtual). Actually, with a 
robust (CI) infrastructure, these sorts of resource metrics and the 
various benchmarks should be revealed to the user community routinely.

I'm not certain what, if any, of this sort of data is being published.


If you tune (strip) the Operating System, the virtual image or the 
kernel, then these numbers are most likely lower. I'm not sure much has 
been published on tuning the OS, kernels or installs for Mesos. HPC
offerings will surely be pushing the envelope of these and many more
related metrics, tuning for specialized classes of problem, specific
hardware, and other performance goals.



Most will run on bloatware, but the smarter datacenters and HPC folks 
will 'cut the pork' for those single-digit performance gains. YMMV.



hth,
James