Re: CPU resource allocation: ignore?

2015-03-11 Thread Connor Doyle
If you don't care at all about accounting for usage of that resource then you 
should be able to set it to 0.0.  As Ian mentioned, this won't be enforced with 
the cpu isolator disabled.
--
Connor

 On Mar 11, 2015, at 08:43, Ian Downes idow...@twitter.com wrote:
 
 The --isolation flag for the slave determines how resources are *isolated*, 
 i.e., by not specifying any cpu isolator there will be no isolation between 
 executors for cpu usage; the Linux scheduler will try to balance their 
 execution.
 
 Cpu and memory are considered required resources for executors and I believe 
 the master enforces this.
 
 What behavior are you trying to achieve? If your jobs don't require much 
 cpu then can you not just set a small value, like 0.25 cpu?
 
 On Wed, Mar 11, 2015 at 7:20 AM, Geoffroy Jabouley 
 geoffroy.jabou...@gmail.com wrote:
 Hello
 
 As cpu relative shares are *not very* relevant in our heterogeneous cluster, 
 we would like to get rid of CPU resources management and only use MEM 
 resources for our cluster and tasks allocation.
 
 Even when modifying the isolation flag of our slave to 
 --isolation=cgroups/mem, we see these in the logs:
 
 from the slave, at startup:
 I0311 15:09:55.006750 50906 slave.cpp:289] Slave resources: 
 ports(*):[31000-32000, 80-443]; cpus(*):2; mem(*):1979; disk(*):22974
 
 from the master:
 I0311 15:15:16.764714 50884 hierarchical_allocator_process.hpp:563] 
 Recovered ports(*):[31000-32000, 80-443]; cpus(*):2; mem(*):1979; 
 disk(*):22974 (total allocatable: ports(*):[31000-32000, 80-443]; cpus(*):2; 
 mem(*):1979; disk(*):22974) on slave 
 20150311-150951-3982541578-5050-50860-S0 from framework 
 20150311-150951-3982541578-5050-50860-
 
 And mesos master UI is showing both CPU and MEM resources status.
 
 
 
 Btw, we are using Marathon and Jenkins frameworks to start our mesos tasks, 
 and the cpus field seems mandatory (set to 1.0 by default). So I guess you 
 cannot easily bypass cpu resources allocation...
 
 
 Any idea?
 Regards
 
 2015-02-19 15:15 GMT+01:00 Ryan Thomas r.n.tho...@gmail.com:
 Hey Don,
 
 Have you tried only setting the 'cgroups/mem' isolation flag on the slave 
 and not the cpu one? 
 
 http://mesosphere.com/docs/reference/mesos-slave/
 
 
 ryan
 
 On 19 February 2015 at 14:13, Donald Laidlaw donlaid...@me.com wrote:
 I am using Mesos 0.21.1 with Marathon 0.8.0 and running everything in 
 docker containers.
 
 Is there a way to have mesos ignore the cpu relative shares? That is, not 
 limit the docker container CPU at all when it runs. I would still want to 
 have the Memory resource limitation, but would rather just let the linux 
 system under the containers schedule all the CPU.
 
 This would allow us to just allocate tasks to mesos slaves based on 
 available memory only, and to let those tasks get whatever CPU they could 
 when they needed it. This is desirable where there can be lots of 
 relatively high-memory tasks that have very low CPU requirements, especially 
 if we do not know the capabilities of the slave machines with regard to 
 CPU. Some of them may have fast CPUs, some slow, so it is hard to pick a 
 relative number for that slave.
 
 Thanks,
 
 Don Laidlaw
 


Re: CPU resource allocation: ignore?

2015-03-11 Thread Geoffroy Jabouley
Ok so it seems better to keep the cpu isolator and use a small cpu share.


btw, when trying to create a mesos task using Marathon with cpu=0.0, I get
the following errors:

[2015-03-11 17:05:48,395] INFO Received status update for task
test-app.7b6ad5d9-c808-11e4-946b-56847afe9799: *TASK_LOST (Task uses
invalid resources: cpus(*):0)* (mesosphere.marathon.MarathonScheduler:148)
[2015-03-11 17:05:48,402] INFO Task
test-app.7b6ad5d9-c808-11e4-946b-56847afe9799 expunged and removed from
TaskTracker (mesosphere.marathon.tasks.TaskTracker:107)

so I guess this is not possible.
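For reference, a Marathon app with a small but nonzero cpus value can be submitted through the REST API. This is a minimal sketch, not a tested recipe: the host, port, and app id are made up, and 0.01 matches the MIN_CPUS constant discussed elsewhere in this thread.

```python
import json
import urllib.request

# Hypothetical Marathon endpoint; adjust host/port for your deployment.
MARATHON_APPS = "http://localhost:8080/v2/apps"

# App definition using the smallest cpus value the allocator will accept.
app = {
    "id": "/low-cpu-app",      # made-up app id
    "cmd": "sleep 3600",
    "cpus": 0.01,              # small but nonzero; 0.0 is rejected as invalid
    "mem": 64.0,
    "instances": 1,
}

def submit(app_definition):
    """POST the app definition to Marathon."""
    req = urllib.request.Request(
        MARATHON_APPS,
        data=json.dumps(app_definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```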

2015-03-11 17:05 GMT+01:00 Ian Downes idow...@twitter.com:

 Sorry, I meant that no cpu isolator only means no isolation.

 *The allocator* *does enforce a non-zero cpu allocation*, specifically
 see MIN_CPUS defined in src/master/constants.cpp to be 0.01 and used by the
 allocator:

 bool HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::allocatable(
     const Resources& resources)
 {
   Option<double> cpus = resources.cpus();
   Option<Bytes> mem = resources.mem();

   return (cpus.isSome() && cpus.get() >= MIN_CPUS) ||
          (mem.isSome() && mem.get() >= MIN_MEM);
 }
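A rough Python mirror of that check, useful for reasoning about which resource bundles count as allocatable. MIN_CPUS is taken from src/master/constants.cpp as quoted above; the MIN_MEM value below is an assumption, so verify it against your Mesos version.

```python
# Mirrors the allocator's allocatable() check from the Mesos master.
MIN_CPUS = 0.01     # from src/master/constants.cpp
MIN_MEM_MB = 32     # assumed value; verify against your version's constants.cpp

def allocatable(cpus=None, mem_mb=None):
    """True if the resources are big enough for the allocator to offer them."""
    cpus_ok = cpus is not None and cpus >= MIN_CPUS
    mem_ok = mem_mb is not None and mem_mb >= MIN_MEM_MB
    return cpus_ok or mem_ok

print(allocatable(cpus=0.0))              # False: below MIN_CPUS, no mem
print(allocatable(cpus=0.01))             # True: exactly MIN_CPUS
print(allocatable(cpus=0.0, mem_mb=512))  # True: mem alone satisfies the OR
```

Note the OR: a bundle with zero cpus but enough memory still passes this particular check, so a task-level validation elsewhere is what rejects cpus(*):0.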

 On Wed, Mar 11, 2015 at 8:54 AM, Connor Doyle con...@mesosphere.io
 wrote:

 If you don't care at all about accounting for usage of that resource then you
 should be able to set it to 0.0.  As Ian mentioned, this won't be enforced
 with the cpu isolator disabled.
 --
 Connor

 On Mar 11, 2015, at 08:43, Ian Downes idow...@twitter.com wrote:

 The --isolation flag for the slave determines how resources are
 *isolated*, i.e., by not specifying any cpu isolator there will be no
 isolation between executors for cpu usage; the Linux scheduler will try to
 balance their execution.

 Cpu and memory are considered required resources for executors and I
 believe the master enforces this.

 What behavior are you trying to achieve? If your jobs don't require
 much cpu then can you not just set a small value, like 0.25 cpu?

 On Wed, Mar 11, 2015 at 7:20 AM, Geoffroy Jabouley 
 geoffroy.jabou...@gmail.com wrote:

 Hello

 As cpu relative shares are *not very* relevant in our heterogeneous
 cluster, we would like to get rid of CPU resources management and only use
 MEM resources for our cluster and tasks allocation.

 Even when modifying the isolation flag of our slave to
 --isolation=cgroups/mem, we see these in the logs:

 *from the slave, at startup:*
 I0311 15:09:55.006750 50906 slave.cpp:289] Slave resources:
 ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979; disk(*):22974

 *from the master:*
 I0311 15:15:16.764714 50884 hierarchical_allocator_process.hpp:563]
 Recovered ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979;
 disk(*):22974 (total allocatable: ports(*):[31000-32000, 80-443];
 *cpus(*):2*; mem(*):1979; disk(*):22974) on slave
 20150311-150951-3982541578-5050-50860-S0 from framework
 20150311-150951-3982541578-5050-50860-

 And mesos master UI is showing both CPU and MEM resources status.



 Btw, we are using Marathon and Jenkins frameworks to start our mesos
 tasks, and the cpus field seems mandatory (set to 1.0 by default). So I
 guess you cannot easily bypass cpu resources allocation...


 Any idea?
 Regards

 2015-02-19 15:15 GMT+01:00 Ryan Thomas r.n.tho...@gmail.com:

 Hey Don,

 Have you tried only setting the 'cgroups/mem' isolation flag on the
 slave and not the cpu one?

 http://mesosphere.com/docs/reference/mesos-slave/


 ryan

 On 19 February 2015 at 14:13, Donald Laidlaw donlaid...@me.com wrote:

 I am using Mesos 0.21.1 with Marathon 0.8.0 and running everything in
 docker containers.

 Is there a way to have mesos ignore the cpu relative shares? That is,
 not limit the docker container CPU at all when it runs. I would still want
 to have the Memory resource limitation, but would rather just let the 
 linux
 system under the containers schedule all the CPU.

 This would allow us to just allocate tasks to mesos slaves based on
 available memory only, and to let those tasks get whatever CPU they could
 when they needed it. This is desirable where there can be lots of
 relatively high-memory tasks that have very low CPU requirements, especially
 if we do not know the capabilities of the slave machines with regard to CPU.
 Some of them may have fast CPUs, some slow, so it is hard to pick a relative
 number for that slave.

 Thanks,

 Don Laidlaw








Re: CPU resource allocation: ignore?

2015-03-11 Thread Ian Downes
The --isolation flag for the slave determines how resources are *isolated*,
i.e., by not specifying any cpu isolator there will be no isolation between
executors for cpu usage; the Linux scheduler will try to balance their
execution.

Cpu and memory are considered required resources for executors and I
believe the master enforces this.

What behavior are you trying to achieve? If your jobs don't require
much cpu then can you not just set a small value, like 0.25 cpu?

On Wed, Mar 11, 2015 at 7:20 AM, Geoffroy Jabouley 
geoffroy.jabou...@gmail.com wrote:

 Hello

 As cpu relative shares are *not very* relevant in our heterogeneous
 cluster, we would like to get rid of CPU resources management and only use
 MEM resources for our cluster and tasks allocation.

 Even when modifying the isolation flag of our slave to
 --isolation=cgroups/mem, we see these in the logs:

 *from the slave, at startup:*
 I0311 15:09:55.006750 50906 slave.cpp:289] Slave resources:
 ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979; disk(*):22974

 *from the master:*
 I0311 15:15:16.764714 50884 hierarchical_allocator_process.hpp:563]
 Recovered ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979;
 disk(*):22974 (total allocatable: ports(*):[31000-32000, 80-443];
 *cpus(*):2*; mem(*):1979; disk(*):22974) on slave
 20150311-150951-3982541578-5050-50860-S0 from framework
 20150311-150951-3982541578-5050-50860-

 And mesos master UI is showing both CPU and MEM resources status.



 Btw, we are using Marathon and Jenkins frameworks to start our mesos
 tasks, and the cpus field seems mandatory (set to 1.0 by default). So I
 guess you cannot easily bypass cpu resources allocation...


 Any idea?
 Regards

 2015-02-19 15:15 GMT+01:00 Ryan Thomas r.n.tho...@gmail.com:

 Hey Don,

 Have you tried only setting the 'cgroups/mem' isolation flag on the slave
 and not the cpu one?

 http://mesosphere.com/docs/reference/mesos-slave/


 ryan

 On 19 February 2015 at 14:13, Donald Laidlaw donlaid...@me.com wrote:

 I am using Mesos 0.21.1 with Marathon 0.8.0 and running everything in
 docker containers.

 Is there a way to have mesos ignore the cpu relative shares? That is,
 not limit the docker container CPU at all when it runs. I would still want
 to have the Memory resource limitation, but would rather just let the linux
 system under the containers schedule all the CPU.

 This would allow us to just allocate tasks to mesos slaves based on
 available memory only, and to let those tasks get whatever CPU they could
 when they needed it. This is desirable where there can be lots of relatively
 high-memory tasks that have very low CPU requirements, especially if we do
 not know the capabilities of the slave machines with regard to CPU. Some
 of them may have fast CPUs, some slow, so it is hard to pick a relative
 number for that slave.

 Thanks,

 Don Laidlaw






Re: mesos-collectd-plugin

2015-03-11 Thread Dick Davies
Hi Dan

I can see a couple of things that could be wrong
(NB: not a collectd expert, but these are differences I see from
my working config).

1. Is /opt/collectd/etc/collectd.conf your main collectd config file?

otherwise, it's not being read at all by collectd.

2. I configure the plugin in that file, i.e. the

<Module mesos-master>

block should be in /opt/collectd/etc/collectd.conf, not tucked down
in the python module path directory.

3. Are you sure your master listens on localhost? Mine doesn't, I
needed to set that Host line
to match the IP I set that master to listen on ( e.g. in /etc/mesos-master/ip ).

Pretty sure one of those will do the trick
(NB: you'll only get metrics from the elected master; the 'standby'
masters still get polled
but collectd will ignore any data from them unless they're the primary)
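A minimal Python sketch of that filtering, assuming you poll each master's /metrics/snapshot endpoint yourself; the master/elected gauge is 1.0 only on the current leader (the host names below are made up):

```python
def elected_master_metrics(snapshots):
    """Given {host: metrics dict} from each master's /metrics/snapshot,
    keep only the data reported by the elected (leading) master."""
    return {host: metrics for host, metrics in snapshots.items()
            if metrics.get("master/elected") == 1.0}

# Made-up example: only master1 is the leader.
snapshots = {
    "master1": {"master/elected": 1.0, "master/cpus_percent": 0.4},
    "master2": {"master/elected": 0.0, "master/cpus_percent": 0.0},
}
print(elected_master_metrics(snapshots))  # only master1's metrics survive
```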

On 11 March 2015 at 19:52, Dan Dong dongda...@gmail.com wrote:
 Hi, Dick,
   I put the plugin under:
 $ ls -l /opt/collectd/lib/collectd/plugins/python/
 total 504
 -rw-r--r-- 1 root root345 Mar 10 19:40 mesos-master.conf
 -rw-r--r-- 1 root root  1 Mar 10 15:06 mesos-master.py
 -rw-r--r-- 1 root root322 Mar 10 19:44 mesos-slave.conf
 -rw-r--r-- 1 root root   6808 Mar 10 15:06 mesos-slave.py
 -rw-r--r-- 1 root root 288892 Mar 10 19:35 python.a
 -rwxr-xr-x 1 root root969 Mar 10 19:35 python.la
 -rwxr-xr-x 1 root root 188262 Mar 10 19:35 python.so

 And in /opt/collectd/etc/collectd.conf, I set:

 <LoadPlugin python>
   Globals true
 </LoadPlugin>
 .

 <Plugin python>
   ModulePath "/opt/collectd/lib/collectd/plugins/python/"
   LogTraces true
 </Plugin>

 $ cat /opt/collectd/lib/collectd/plugins/python/mesos-master.conf
 <LoadPlugin python>
   Globals true
 </LoadPlugin>

 <Plugin python>
   ModulePath "/opt/collectd/lib/collectd/plugins/python/"
   Import "mesos-master"
   <Module mesos-master>
     Host "localhost"
     Port 5050
     Verbose false
     Version "0.21.0"
   </Module>
 </Plugin>

 Anything wrong with the above settings?

 Cheers,
 Dan



 2015-03-10 17:21 GMT-05:00 Dick Davies d...@hellooperator.net:

 Hi Dan

 The .py files (the plugin) live in the collectd python path,
 it sounds like maybe you're not loading the plugin .conf file into
 your collectd config?

 The output will depend on what your collectd is set to write to, I use
 it with write_graphite.

 On 10 March 2015 at 20:41, Dan Dong dongda...@gmail.com wrote:
  Hi, All,
Does anybody use this mesos-collectd-plugin:
  https://github.com/rayrod2030/collectd-mesos
 
  I have installed collectd and this plugin, then configured it as per the
  instructions and restarted the collectd daemon, but nothing seems to
  happen on the mesos:5050 web UI (the python plugin has been turned on in
  collectd.conf).
 
  My question is:
  1. Should I install collectd and this mesos-collectd-plugin on each
  master
  and slave nodes and restart collectd daemon? (This is what I have done.)
  2. Should the config file mesos-master.conf be configured only on the
  master node and mesos-slave.conf only on the slave nodes? (This is what I
  have done.) Or should both of them appear only on the master node?
  3. Is there an example( or a figure) of what output one is expected to
  see
  by this plugin?
 
  Cheers,
  Dan
 




Re: mesos-collectd-plugin

2015-03-11 Thread Dan Dong
Hi, Dick,
  1. Yes, /opt/collectd/etc/collectd.conf is my main and only collectd
config file.
  2. Have put the <Module mesos-master> block into collectd.conf now.
  3. I have set only 1 master, and set the Host line to the IP address of
the master node and restarted collectd.

I think I have to install graphite on my cluster too to make the following
line work in collectd.conf:
LoadPlugin write_graphite

Cheers,
Dan


2015-03-11 15:09 GMT-05:00 Dick Davies d...@hellooperator.net:

 Hi Dan

 I can see a couple of things that could be wrong
 (NB: not a collectd expert, but these are differences I see from
 my working config).

 1. Is /opt/collectd/etc/collectd.conf your main collectd config file?

 otherwise, it's not being read at all by collectd.

 2. I configure the plugin in that file, i.e. the

 <Module mesos-master>

 block should be in /opt/collectd/etc/collectd.conf, not tucked down
 in the python module path directory.

 3. Are you sure your master listens on localhost? Mine doesn't, I
 needed to set that Host line
 to match the IP I set that master to listen on ( e.g. in
 /etc/mesos-master/ip ).

 Pretty sure one of those will do the trick
 (NB: you'll only get metrics from the elected master; the 'standby'
 masters still get polled
 but collectd will ignore any data from them unless they're the primary)

 On 11 March 2015 at 19:52, Dan Dong dongda...@gmail.com wrote:
  Hi, Dick,
I put the plugin under:
  $ ls -l /opt/collectd/lib/collectd/plugins/python/
  total 504
  -rw-r--r-- 1 root root345 Mar 10 19:40 mesos-master.conf
  -rw-r--r-- 1 root root  1 Mar 10 15:06 mesos-master.py
  -rw-r--r-- 1 root root322 Mar 10 19:44 mesos-slave.conf
  -rw-r--r-- 1 root root   6808 Mar 10 15:06 mesos-slave.py
  -rw-r--r-- 1 root root 288892 Mar 10 19:35 python.a
  -rwxr-xr-x 1 root root969 Mar 10 19:35 python.la
  -rwxr-xr-x 1 root root 188262 Mar 10 19:35 python.so
 
  And in /opt/collectd/etc/collectd.conf, I set:
 
  <LoadPlugin python>
    Globals true
  </LoadPlugin>
  .

  <Plugin python>
    ModulePath "/opt/collectd/lib/collectd/plugins/python/"
    LogTraces true
  </Plugin>
 
  $ cat /opt/collectd/lib/collectd/plugins/python/mesos-master.conf
  <LoadPlugin python>
    Globals true
  </LoadPlugin>

  <Plugin python>
    ModulePath "/opt/collectd/lib/collectd/plugins/python/"
    Import "mesos-master"
    <Module mesos-master>
      Host "localhost"
      Port 5050
      Verbose false
      Version "0.21.0"
    </Module>
  </Plugin>
 
  Anything wrong with the above settings?
 
  Cheers,
  Dan
 
 
 
  2015-03-10 17:21 GMT-05:00 Dick Davies d...@hellooperator.net:
 
  Hi Dan
 
  The .py files (the plugin) live in the collectd python path,
  it sounds like maybe you're not loading the plugin .conf file into
  your collectd config?
 
  The output will depend on what your collectd is set to write to, I use
  it with write_graphite.
 
  On 10 March 2015 at 20:41, Dan Dong dongda...@gmail.com wrote:
   Hi, All,
 Does anybody use this mesos-collectd-plugin:
   https://github.com/rayrod2030/collectd-mesos
  
   I have installed collectd and this plugin, then configured it as per the
   instructions and restarted the collectd daemon, but nothing seems to
   happen on the mesos:5050 web UI (the python plugin has been turned on in
   collectd.conf).
  
   My question is:
   1. Should I install collectd and this mesos-collectd-plugin on each
   master
   and slave nodes and restart collectd daemon? (This is what I have
 done.)
   2. Should the config file mesos-master.conf be configured only on the
   master node and mesos-slave.conf only on the slave nodes? (This is what
   I have done.) Or should both of them appear only on the master node?
   3. Is there an example( or a figure) of what output one is expected to
   see
   by this plugin?
  
   Cheers,
   Dan
  
 
 



Call for MesosCon 2015 Sponsors

2015-03-11 Thread Chris Aniszczyk
Hello Mesos community!

We're in the process of planning MesosCon 2015
http://events.linuxfoundation.org/events/mesoscon for this August and are
super thankful to our current list of sponsors: Cisco, eBay, Hubspot,
Mesosphere, Twitter and VMWare.

We are still looking for sponsors! If you're interested in sponsoring
MesosCon, you can read the prospectus here:
http://events.linuxfoundation.org/events/mesoscon/sponsor

Feel free to reach out to me if you have any questions.

Anyways, looking forward to seeing all of you at MesosCon in beautiful
Seattle.

-- 
Cheers,

Chris Aniszczyk
http://aniszczyk.org
+1 512 961 6719


Re: mesos on coreos

2015-03-11 Thread Gurvinder Singh
Thanks Alex for the information and others too for sharing their
experiences.

- Gurvinder
On 03/11/2015 07:50 PM, Alex Rukletsov wrote:
 Gurvinder,
 
 no, there are no publicly available binaries, nor is there documentation
 at this point. We will publish either or both as soon as it is rock solid.
 
 On Wed, Mar 11, 2015 at 2:08 AM, Gurvinder Singh
 gurvinder.si...@uninett.no wrote:
 
 On 03/10/2015 11:41 PM, Tim Chen wrote:
  Hi all,
 
  As Alex said you can run Mesos in CoreOS without Docker if you put
  the dependencies in.
 
 Tim, is there any documentation on using Mesos outside a container on
 CoreOS, or a binary available which we can wget in a cloud-init
 file to fulfill the dependencies? We would like to test out Mesos on
 CoreOS outside Docker.
 
 - Gurvinder
  It is a common ask though to run Mesos-slave in a Docker container in
  general, either on CoreOS or not. It's definitely a bit involved as you
  need to mount in a directory for persisting work dir and also mounting
  in /sys/fs for cgroups, also you should use the --pid=host flag since
  Docker 1.5 so it shares the host pid namespace.
 
  Although you get a lot less isolation, there are still motivations to
  run slave in Docker regardless.
 
  One thing that's missing from the mesos docker containerizer is that it
  won't be able to recover tasks on restart, and I have a series of
  patches pending review to fix that.
 
  Tim
 
  On Tue, Mar 10, 2015 at 3:16 PM, Alex Rukletsov a...@mesosphere.io wrote:
 
  My 2¢.
 
 
  First of all, it doesn’t look like a great idea to package
  resource manager into Docker putting one more abstraction
 layer
  between a resource itself and resource manager.
 
 
  You can run mesos-slave on CoreOS node without putting it into a
  Docker container.
 
  —Alex
 
 
 
 



Re: CPU resource allocation: ignore?

2015-03-11 Thread Ben Whitehead
you could try using the posix cpu isolator (it only provides monitoring, not
limiting), i.e. `--isolation='posix/cpu,cgroups/mem'`, and then only use a
small allocation of CPU for your marathon app (0.1)

On Wed, Mar 11, 2015 at 9:41 AM, Cole Brown bigs...@gmail.com wrote:

 I may be wrong in this, but it seems as though the ideal solution for this
 would be to avoid using the cpu isolator, but allocate MIN_CPUS cpus to
 your tasks. That way you can avoid having the isolator give you 1/100th of
 a CPU/sec of CPU time, while still allowing yourself 100 tasks per CPU
 resource.

 Though I could imagine the resource fragmentation clashing with other
 processes!


 On Wed, Mar 11, 2015 at 11:43 AM Ian Downes idow...@twitter.com wrote:

 The --isolation flag for the slave determines how resources are
 *isolated*, i.e., by not specifying any cpu isolator there will be no
 isolation between executors for cpu usage; the Linux scheduler will try to
 balance their execution.

 Cpu and memory are considered required resources for executors and I
 believe the master enforces this.

 What behavior are you trying to achieve? If your jobs don't require
 much cpu then can you not just set a small value, like 0.25 cpu?

 On Wed, Mar 11, 2015 at 7:20 AM, Geoffroy Jabouley 
 geoffroy.jabou...@gmail.com wrote:

 Hello

 As cpu relative shares are *not very* relevant in our heterogeneous
 cluster, we would like to get rid of CPU resources management and only use
 MEM resources for our cluster and tasks allocation.

 Even when modifying the isolation flag of our slave to
 --isolation=cgroups/mem, we see these in the logs:

 *from the slave, at startup:*
 I0311 15:09:55.006750 50906 slave.cpp:289] Slave resources:
 ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979; disk(*):22974

 *from the master:*
 I0311 15:15:16.764714 50884 hierarchical_allocator_process.hpp:563]
 Recovered ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979;
 disk(*):22974 (total allocatable: ports(*):[31000-32000, 80-443];
 *cpus(*):2*; mem(*):1979; disk(*):22974) on slave
 20150311-150951-3982541578-5050-50860-S0 from framework
 20150311-150951-3982541578-5050-50860-

 And mesos master UI is showing both CPU and MEM resources status.



 Btw, we are using Marathon and Jenkins frameworks to start our mesos
 tasks, and the cpus field seems mandatory (set to 1.0 by default). So I
 guess you cannot easily bypass cpu resources allocation...


 Any idea?
 Regards

 2015-02-19 15:15 GMT+01:00 Ryan Thomas r.n.tho...@gmail.com:

 Hey Don,

 Have you tried only setting the 'cgroups/mem' isolation flag on the
 slave and not the cpu one?

 http://mesosphere.com/docs/reference/mesos-slave/


 ryan

 On 19 February 2015 at 14:13, Donald Laidlaw donlaid...@me.com wrote:

 I am using Mesos 0.21.1 with Marathon 0.8.0 and running everything in
 docker containers.

 Is there a way to have mesos ignore the cpu relative shares? That is,
 not limit the docker container CPU at all when it runs. I would still want
 to have the Memory resource limitation, but would rather just let the 
 linux
 system under the containers schedule all the CPU.

 This would allow us to just allocate tasks to mesos slaves based on
 available memory only, and to let those tasks get whatever CPU they could
 when they needed it. This is desirable where there can be lots of
 relatively high-memory tasks that have very low CPU requirements, especially
 if we do not know the capabilities of the slave machines with regard to CPU.
 Some of them may have fast CPUs, some slow, so it is hard to pick a relative
 number for that slave.

 Thanks,

 Don Laidlaw







Re: mesos on coreos

2015-03-11 Thread Gurvinder Singh
On 03/10/2015 11:41 PM, Tim Chen wrote:
 Hi all,
 
 As Alex said you can run Mesos in CoreOS without Docker if you put
 the dependencies in.
 
Tim, is there any documentation on using Mesos outside a container on
CoreOS, or a binary available which we can wget in a cloud-init
file to fulfill the dependencies? We would like to test out Mesos on
CoreOS outside Docker.

- Gurvinder
 It is a common ask though to run Mesos-slave in a Docker container in
 general, either on CoreOS or not. It's definitely a bit involved as you
 need to mount in a directory for persisting work dir and also mounting
 in /sys/fs for cgroups, also you should use the --pid=host flag since
 Docker 1.5 so it shares the host pid namespace.
 
 Although you get a lot less isolation, there are still motivations to
 run slave in Docker regardless. 
 
 One thing that's missing from the mesos docker containerizer is that it
 won't be able to recover tasks on restart, and I have a series of
 patches pending review to fix that.
 
 Tim
 
 On Tue, Mar 10, 2015 at 3:16 PM, Alex Rukletsov a...@mesosphere.io wrote:
 
 My 2¢.
  
 
 First of all, it doesn’t look like a great idea to package
 resource manager into Docker putting one more abstraction layer
 between a resource itself and resource manager. 
 
 
 You can run mesos-slave on CoreOS node without putting it into a
 Docker container.
  
 —Alex
 
 



Re: Question on Monitoring a Mesos Cluster

2015-03-11 Thread Alex Rukletsov
The master/cpus_percent metric is nothing more than used / total. It
represents resources allocated to tasks, but tasks may not use them
fully (or may use more if isolation is not enabled). You can't get
actual cluster utilisation from it; the best option is to aggregate the
system/* metrics, which report node load. This however includes all the
processes running on a node, not only mesos and its tasks. Hope this
helps.
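As a small worked example, that percentage can be recomputed from the master/cpus_used and master/cpus_total gauges in a /metrics/snapshot response; the numbers below are illustrative:

```python
def allocation_percent(snapshot):
    """Recompute master/cpus_percent as used / total from a
    /metrics/snapshot dict. Note: this is allocation, not actual usage."""
    total = snapshot["master/cpus_total"]
    if total == 0:
        return 0.0
    return snapshot["master/cpus_used"] / total

# Illustrative numbers: 508 of 540 cores allocated.
snap = {"master/cpus_used": 508.0, "master/cpus_total": 540.0}
print(round(allocation_percent(snap), 2))  # 0.94
```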


On Mon, Mar 9, 2015 at 8:16 AM, Andras Kerekes 
andras.kere...@ishisystems.com wrote:

 We use the same monitoring script from rayrod2030. However instead of the
 master_cpus_percent, we use the master_cpus_used and master_cpus_total to
 calculate a percentage. And this will give the allocated percentage of
 CPUs in
 the cluster, the actual utilization is measured by collectd.

 -Original Message-
 From: rasput...@gmail.com [mailto:rasput...@gmail.com] On Behalf Of Dick
 Davies
 Sent: Saturday, March 07, 2015 2:15 PM
 To: user@mesos.apache.org
 Subject: Re: Question on Monitoring a Mesos Cluster

 Yeah, that confused me too - I think that figure is specific to the
 master/slave polled (and that'll just be the active one since you're only
 reporting when master/elected is true).

 I'm using this one https://github.com/rayrod2030/collectd-mesos  , not
 sure if
 that's the same as yours?


 On 7 March 2015 at 18:56, Jeff Schroeder jeffschroe...@computer.org
 wrote:
  Responses inline
 
  On Sat, Mar 7, 2015 at 12:48 PM, CCAAT cc...@tampabay.rr.com wrote:
 
  ... snip ...
 
  After getting everything working, I built a few dashboards, one of
  which displays these stats from http://master:5051/metrics/snapshot:
 
  master/disk_percent
  master/cpus_percent
  master/mem_percent
 
  I had assumed that this was something like aggregate cluster
  utilization, but this seems incorrect in practice. I have a small
  cluster with ~1T of memory, ~25T of Disks, and ~540 CPU cores. I had
  a dozen or so small tasks running, and launched 500 tasks with 1G of
  memory and 1 CPU each.
 
  Now I'd expect to see the disk/cpu/mem percentage metrics above go up
  considerably. I did notice that cpus_percent went to around 0.94.
 
  What is the correct way to measure overall cluster utilization for
  capacity planning? We can have the NOC watch this and simply add
  more hardware when the number starts getting low.
 
 
  Boy, I cannot wait to read the tidbits of wisdom here. Maybe the
  development group has more accurate information if not some vague
  roadmap on resource/process monitoring. Sooner or later, this is
  going to become a quintessential need; so I hope the deep thinkers
  are all over this need both in the user and dev groups.
 
  In fact the monitoring can easily create a significant load on the
  cluster/cloud, if one is not judicious in how this is architected,
  implemented and dynamically tuned.
 
 
 
 
  Monitoring via passive metrics gathering and application telemetry
  is one of the best ways to do it. That is how I've implemented things
 
 
 
  The beauty of the rest api is that it isn't heavyweight, and every
  master has it on port 5050 (by default) and every slave has it on port
  5051 (by default). Since I'm throwing this all into graphite (well
  technically cassandra fronted by cyanite fronted by graphite-api...
  but same difference), I found a reasonable way to do capacity
  planning. Collectd will poll the master/slave on each mesos host every
  10 seconds (localhost:5050 on masters and localhost:5051 on slaves).
  This gets put into graphite via collectd's write_graphite plugin.
  These 3 graphite targets give me percentages of utilization for nice
 graphs:
 
  alias(asPercent(collectd.mesos.clustername.gauge-master_cpu_used,
  collectd.mesos.clustername.gauge-master_cpu_total), "Total CPU Usage")
  alias(asPercent(collectd.mesos.clustername.gauge-master_mem_used,
  collectd.mesos.clustername.gauge-master_mem_total), "Total Memory Usage")
  alias(asPercent(collectd.mesos.clustername.gauge-master_disk_used,
  collectd.mesos.clustername.gauge-master_disk_total), "Total Disk Usage")
 
  With that data, you can have your monitoring tools such as
  nagios/icinga poll graphite. Using the native graphite render api, you
 can
  do things like:
 
  * if the cpu usage is over 80% for 24 hours, send a warning event
  * if the cpu usage is over 95% for 6 hours, send a critical event
 
  This allows mostly no-impact monitoring since the monitoring tools are
  hitting graphite.
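Those threshold rules can be sketched as a simple windowed check; in practice the series would come from a graphite query, but the sample data below is made up:

```python
def breach(series, threshold, points_required):
    """True if the last `points_required` samples all exceed `threshold`,
    i.e. usage has stayed high for the whole window."""
    window = series[-points_required:]
    return len(window) >= points_required and all(v > threshold for v in window)

# One sample per hour: warn after 24h above 80%, page after 6h above 95%.
cpu = [0.70] * 10 + [0.85] * 24
print(breach(cpu, 0.80, 24))  # True: last 24 samples are all above 80%
print(breach(cpu, 0.95, 6))   # False: nothing has crossed 95%
```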
 
  Anyways, back to the original questions:
 
  How does everyone do proper monitoring and capacity planning for large
  mesos clusters? I expect my cluster to grow beyond what it currently
  is by quite a bit.
 
  --
  Jeff Schroeder
 
  Don't drink and derive, alcohol and analysis don't mix.
  http://www.digitalprognosis.com



Re: mesos on coreos

2015-03-11 Thread Adam Bordelon
Some current issues are listed under
https://issues.apache.org/jira/browse/MESOS-2115
See also previous email discussions:
http://www.mail-archive.com/user%40mesos.apache.org/msg02123.html
http://www.mail-archive.com/user%40mesos.apache.org/msg01617.html

On Tue, Mar 10, 2015 at 4:58 PM, craig w codecr...@gmail.com wrote:

 Is there any documentation describing what's necessary to run mesos master
 and slaves in Docker containers? You already mentioned a few things
 (mounting work dir, /sys/fs, etc).

 Thanks

 On Tue, Mar 10, 2015 at 6:41 PM, Tim Chen t...@mesosphere.io wrote:

 Hi all,

  As Alex said you can run Mesos in CoreOS without Docker if you put the
  dependencies in.

 It is a common ask though to run Mesos-slave in a Docker container in
 general, either on CoreOS or not. It's definitely a bit involved as you
 need to mount in a directory for persisting work dir and also mounting in
 /sys/fs for cgroups, also you should use the --pid=host flag since Docker
 1.5 so it shares the host pid namespace.

 Although you get a lot less isolation, there are still motivations to run
 slave in Docker regardless.

 One thing that's missing from the mesos docker containerizer is that it
 won't be able to recover tasks on restart, and I have a series of patches
 pending review to fix that.

 Tim

 On Tue, Mar 10, 2015 at 3:16 PM, Alex Rukletsov a...@mesosphere.io
 wrote:

 My 2¢.


 First of all, it doesn’t look like a great idea to package resource
 manager into Docker putting one more abstraction layer between a resource
 itself and resource manager.


 You can run mesos-slave on CoreOS node without putting it into a Docker
 container.

 —Alex





 --

 https://github.com/mindscratch
 https://www.google.com/+CraigWickesser
 https://twitter.com/mind_scratch
 https://twitter.com/craig_links




Re: CPU resource allocation: ignore?

2015-03-11 Thread Geoffroy Jabouley
Hello

As cpu relative shares are *not very* relevant in our heterogeneous
cluster, we would like to get rid of CPU resources management and only use
MEM resources for our cluster and tasks allocation.

Even when modifying the isolation flag of our slave to
--isolation=cgroups/mem, we see these in the logs:

*from the slave, at startup:*
I0311 15:09:55.006750 50906 slave.cpp:289] Slave resources:
ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979; disk(*):22974

*from the master:*
I0311 15:15:16.764714 50884 hierarchical_allocator_process.hpp:563]
Recovered ports(*):[31000-32000, 80-443]; *cpus(*):2*; mem(*):1979;
disk(*):22974 (total allocatable: ports(*):[31000-32000, 80-443];
*cpus(*):2*; mem(*):1979; disk(*):22974) on slave
20150311-150951-3982541578-5050-50860-S0 from framework
20150311-150951-3982541578-5050-50860-

And mesos master UI is showing both CPU and MEM resources status.



Btw, we are using Marathon and Jenkins frameworks to start our mesos tasks,
and the cpus field seems mandatory (set to 1.0 by default). So I guess
you cannot easily bypass cpu resources allocation...


Any idea?
Regards

2015-02-19 15:15 GMT+01:00 Ryan Thomas r.n.tho...@gmail.com:

 Hey Don,

 Have you tried only setting the 'cgroups/mem' isolation flag on the slave
 and not the cpu one?

 http://mesosphere.com/docs/reference/mesos-slave/


 ryan

 On 19 February 2015 at 14:13, Donald Laidlaw donlaid...@me.com wrote:

 I am using Mesos 0.21.1 with Marathon 0.8.0 and running everything in
 docker containers.

 Is there a way to have mesos ignore the cpu relative shares? That is, not
 limit the docker container CPU at all when it runs. I would still want to
 have the Memory resource limitation, but would rather just let the linux
 system under the containers schedule all the CPU.

 This would allow us to just allocate tasks to mesos slaves based on
 available memory only, and to let those tasks get whatever CPU they could
 when they needed it. This is desirable where there can be lots of relatively
 high-memory tasks that have very low CPU requirements, especially if we do
 not know the capabilities of the slave machines with regard to CPU. Some
 of them may have fast CPUs, some slow, so it is hard to pick a relative
 number for that slave.

 Thanks,

 Don Laidlaw