We run 100% on AWS and have been running Mesos in production since version
0.19.

Our cluster consists of 3 dedicated ZooKeeper nodes (m3.2xl), 3 dedicated
masters (m3.2xl), 8 dedicated slaves (m4.4xl), and 2 HAProxy (m4.medium)
instances used in conjunction with marathon-lb for routing requests into
backend services running on Mesos.
  

We use Terraform, a HashiCorp tool, for provisioning the cluster nodes and
Ansible for configuring Mesos, Chronos, Marathon, and Mesos-DNS. For
monitoring we use Datadog, which has a built-in integration for
tracking various stats in the cluster such as CPU, disk, memory, roles, etc.
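
For anyone who wants to see the raw numbers behind that kind of dashboard, here is a minimal sketch that reads a Mesos master's `/metrics/snapshot` endpoint and flags utilization metrics above a limit. It is not how our Datadog integration is configured; the thresholds and the helper names `fetch_snapshot` and `find_hot_metrics` are assumptions made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumed thresholds for illustration only; tune them for your own cluster.
THRESHOLDS = {
    "master/cpus_percent": 0.85,
    "master/mem_percent": 0.85,
    "master/disk_percent": 0.90,
}


def fetch_snapshot(master_url):
    """Fetch the metrics snapshot from a Mesos master, e.g. http://host:5050."""
    url = master_url.rstrip("/") + "/metrics/snapshot"
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def find_hot_metrics(snapshot, thresholds=THRESHOLDS):
    """Return {metric: value} for every metric at or above its threshold."""
    return {
        name: snapshot[name]
        for name, limit in thresholds.items()
        if name in snapshot and snapshot[name] >= limit
    }
```

Calling `find_hot_metrics(fetch_snapshot("http://<master-host>:5050"))` against the leading master would return only the metrics that crossed their limit, which is roughly what an alerting rule does.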

  

As for optimization, we currently run two different workloads: ELT
(Spark/MR/Hadoop) and Scala-based microservices. I've since started using
different attributes to prevent my batch-oriented jobs from consuming too many
resources and at times blocking my realtime microservices. Instead of
running all services across all nodes, I use constraints on both Marathon and
Chronos, which basically partitions my servers into two groups.
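
To make the partitioning idea concrete, here is a rough sketch of what the Marathon and Chronos definitions could look like. The attribute name `workload`, its values, the resource numbers, and the schedule are all hypothetical, not our actual settings. It assumes each agent is started with an attribute such as `--attributes=workload:batch` or `--attributes=workload:realtime`, and that your Chronos version supports job constraints.

```python
import json

# Hypothetical attribute name; agents would advertise workload:batch
# or workload:realtime via the --attributes flag.
ATTRIBUTE = "workload"


def marathon_app(app_id, cmd, group):
    """Marathon app pinned to agents whose attribute equals `group`."""
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.5,
        "mem": 256,
        "instances": 2,
        # CLUSTER with a value restricts tasks to agents with that value.
        "constraints": [[ATTRIBUTE, "CLUSTER", group]],
    }


def chronos_job(name, command, group):
    """Chronos job pinned to agents whose attribute equals `group`."""
    return {
        "name": name,
        "command": command,
        "schedule": "R/2016-01-11T02:00:00Z/PT24H",  # placeholder schedule
        "cpus": 1.0,
        "mem": 1024,
        "constraints": [[ATTRIBUTE, "EQUALS", group]],
    }


if __name__ == "__main__":
    print(json.dumps(marathon_app("/orders-api", "./run.sh", "realtime"), indent=2))
    print(json.dumps(chronos_job("nightly-elt", "./elt.sh", "batch"), indent=2))
```

The resulting JSON would be posted to Marathon's `/v2/apps` and Chronos's job endpoint respectively. With both sides constrained, batch jobs can only receive offers from the batch group and never compete with the realtime services.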

  

The only real issue we ran into while running in AWS was the sizing of
our masters. Since I knew from the start that I would run my masters as
dedicated nodes, I started with m3.medium, which ended up being way too small.
We would see issues with noisy neighbors: % CPU steal was always high (~50%),
which would cause huge latency and timeouts between my masters, slaves, and
ZooKeeper. After replacing the m3.mediums with m4.2xl, this issue has
gone away.
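
If you want to check for the same symptom on your own instances, steal time can be computed from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. This is a minimal sketch, not the tool we used; the function names are made up for illustration.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    ticks = [int(v) for v in fields[1:]]
    if len(ticks) < 8:
        raise ValueError("kernel does not report steal time")
    return ticks


def steal_percent(before, after):
    """Percentage of CPU time stolen by the hypervisor between two samples."""
    prev = parse_cpu_line(before)
    curr = parse_cpu_line(after)
    deltas = [c - p for p, c in zip(prev, curr)]
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice]
    # Guest time is already included in user/nice, so sum only the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()
```

Take one sample with `read_cpu_line()`, sleep for a second or more, take another, and pass both to `steal_percent`. A value that stays in the tens of percent suggests the instance is being starved by the hypervisor rather than by its own workload.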

  

Let me know if you have any specific questions.

  

\--RB

  

> On Jan 10 2016, at 2:27 am, lwq Adolph <kenan3...@gmail.com> wrote:
>
> Hi everyone:
>
> My future Mesos cluster will be at least 100 nodes, so optimization of Mesos
> is important. Could you share your experience of using Mesos in a production
> environment? It can cover the following topics:
>
> 1\. monitoring tools for a Mesos cluster
>
> 2\. optimization of Mesos parameters
>
> Thanks very much
>
> \--
>
> Thanks & Best Regards
>
> 卢文泉 | Adolph Lu
>
> TEL:+86 15651006559
>
> Linker Networks(<http://www.linkernetworks.com/>)


