Re: Mesos-DNS host based HTTP-redirection from slave to container

2015-08-02 Thread Ryan Thomas
Hey Itamar,

Using DNS to redirect to a port will only be possible if you're using SRV
records (I'm not sure what mesos-dns uses), but this doesn't really matter,
as SRV records won't be looked up by the browser.
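
For reference, if mesos-dns does publish SRV records, you can see the port
it hands out with a plain dig query (the record name below assumes the
usual _<app>._<protocol>.<framework> naming):

    dig +short SRV _my-app._tcp.marathon.mesos
    # prints "priority weight port target", e.g. "0 1 31001 <target-host>."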

For this solution I have a small daemon written in Go running on a number
of hosts (that aren't slaves). It locates the marathon master and pulls
down my apps - I tag apps with a Host label (something like
foo.example.com) and then create a haproxy config file with backends
selected by the Host header. There are a few more smarts in it, such as
only pulling apps with a green health check.

This daemon manages the lifecycle of haproxy on the node - it uses a
polling model, not an event-driven one fed by the marathon event stream.
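
As a rough sketch of that polling approach (not the actual daemon - the
Marathon address, the embed parameter and the jq paths are assumptions
based on the Marathon v2 API):

    #!/bin/sh
    # Poll Marathon and emit one haproxy backend per app tagged with a
    # Host label; frontend ACLs on the Host header then pick the backend.
    MARATHON=http://marathon.example.com:8080
    curl -s "$MARATHON/v2/apps?embed=apps.tasks" | jq -r '
      .apps[] | select(.labels.Host != null) |
      "backend \(.labels.Host)",
      (.tasks[] | "  server \(.id) \(.host):\(.ports[0]) check")'
    # ...write the output into haproxy.cfg, reload haproxy, sleep, repeat.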

Another solution, one that does use the event stream, is
https://github.com/QubitProducts/bamboo - it's been a while since I checked
it out, but it was functional back then.

Hope that helps.

ryan

On 2 August 2015 at 07:10, Itamar Ostricher ita...@yowza3d.com wrote:

 I use marathon to launch an nginx docker container named my-app, and set
 up Mesos-DNS such that my-app.marathon.mesos returns the IP of the slave
 running the container (e.g. 10.20.30.40).

 Now, my-app is running on some dynamically-allocated port (e.g. 31001),
 but I would like http://my-app.marathon.mesos/foo to hit my app at
 http://10.20.30.40:31001/foo

 Is there a best practice way to achieve this behavior?

 I was thinking about a proxy running on each slave, listening on port 80,
 routing incoming HTTP requests to the correct port on localhost based on
 the request's Host header. The correct port could be determined by
 querying mesos-dns itself.

 This sounds like a pretty common use-case, so I wondered if anyone can
 point me at an existing solution for this.

 Thanks!
 - Itamar.



Re: Mesos-DNS host based HTTP-redirection from slave to container

2015-08-02 Thread Ryan Thomas
Yes it appears that mesos-dns does use SRV records - I should really check
it out :)



Re: Mesos-DNS host based HTTP-redirection from slave to container

2015-08-02 Thread Ryan Thomas
If you are going to be pulling data down yourself, it would be better to do
it from marathon than from mesos-dns, as you will have additional data
about the tasks available.
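
For example, Marathon's tasks endpoint already carries the host, ports and
health state that mesos-dns won't give you (the address is illustrative;
the field names are from the Marathon v2 API):

    curl -s -H 'Accept: application/json' \
      http://marathon.example.com:8080/v2/tasks |
      jq '.tasks[] | {appId, host, ports, healthCheckResults}'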

On 2 August 2015 at 11:12, tommy xiao xia...@gmail.com wrote:

 mesos-dns stores the app's IP and ports, so you can query mesos-dns to set
 up a routing rule to define the URL.


 --
 Deshi Xiao
 Twitter: xds2000
 E-mail: xiaods(AT)gmail.com



Mesos 0.21.0 release page correction

2015-05-04 Thread Ryan Thomas
Whilst this is a bit old, the docs here
http://mesos.apache.org/blog/mesos-0-21-0-released/ for 0.21.0 link to
the wrong ticket for the shared filesystem isolator.

Cheers,

ryan


Re: CPU resource allocation: ignore?

2015-02-19 Thread Ryan Thomas
Hey Don,

Have you tried only setting the 'cgroups/mem' isolation flag on the slave,
and not the CPU one?

http://mesosphere.com/docs/reference/mesos-slave/
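
Something along these lines on each slave should do it (assuming the stock
packaging; restart the slave afterwards):

    # enforce memory limits only, leaving CPU unisolated
    echo cgroups/mem | sudo tee /etc/mesos-slave/isolation
    # equivalent to starting the slave with:
    mesos-slave --isolation=cgroups/mem ...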


ryan

On 19 February 2015 at 14:13, Donald Laidlaw donlaid...@me.com wrote:

 I am using Mesos 0.21.1 with Marathon 0.8.0 and running everything in
 docker containers.

 Is there a way to have mesos ignore the cpu relative shares? That is, not
 limit the docker container CPU at all when it runs. I would still want to
 have the memory resource limitation, but would rather just let the linux
 system under the containers schedule all the CPU.

 This would allow us to just allocate tasks to mesos slaves based on
 available memory only, and to let those tasks get whatever CPU they could
 when they needed it. This is desirable where there can be lots of
 relatively high-memory tasks that have very low CPU requirements,
 especially if we do not know the capabilities of the slave machines with
 regard to CPU. Some of them may have fast CPUs, some slow, so it is hard
 to pick a relative number for that slave.

 Thanks,

 Don Laidlaw



Re: Unable to follow Sandbox links from Mesos UI.

2015-01-22 Thread Ryan Thomas
It is a request from your browser session, not from the master, that goes
to the slaves - so in order to view the sandbox you need to ensure that the
machine your browser is on can resolve and route to the masters _and_ the
slaves.

The master doesn't proxy the sandbox requests through itself (yet) - they
are made directly from your browser instance to the slaves.

Make sure you can resolve the slaves from the machine you're browsing the
UI on.
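
A quick check from that machine (hostname and port taken from the error Dan
reported; state.json is served on the slave's HTTP port):

    getent hosts centos-2.local    # name resolution
    curl -sf http://centos-2.local:5051/state.json >/dev/null && echo reachable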

Cheers,

ryan

On 22 January 2015 at 15:42, Dan Dong dongda...@gmail.com wrote:

 Thank you all - the master and slaves can resolve each other's hostnames
 and ssh in without a password, and firewalls have been switched off on all
 the machines too.
 So I'm confused: what could block the UI from pulling this info from the
 slaves?

 Cheers,
 Dan


 2015-01-21 16:35 GMT-06:00 Cody Maloney c...@mesosphere.io:

 Also see https://issues.apache.org/jira/browse/MESOS-2129 if you want to
 track progress on changing this.

 Unfortunately it is on hold for me at the moment to fix.

 Cody



Re: cluster wide init

2015-01-22 Thread Ryan Thomas
If this was going to be used to allocate tasks outside of the scheduler's
resource management, and on every slave, why not just use the OS-provided
init system instead?
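
For example, something as small as an upstart job on each slave covers the
"must always run here" case (a hypothetical sketch - the names and paths
are illustrative):

    # /etc/init/node-agent.conf
    description "per-slave agent"
    start on runlevel [2345]
    stop on runlevel [!2345]
    respawn
    exec /usr/local/bin/node-agent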

On 22 January 2015 at 19:40, Sharma Podila spod...@netflix.com wrote:

 Schedulers can only use resources on slaves that are unused by and
 unallocated to other schedulers. Therefore, schedulers cannot achieve this
 unless you reserve slots on every slave for the scheduler, which seems
 like a forced fit. Init-like support would be more fundamental to the
 Mesos cluster itself, if available.


 On Thu, Jan 22, 2015 at 10:08 AM, Ryan Thomas r.n.tho...@gmail.com
 wrote:

 This seems more like the responsibility of the scheduler that is running,
 like marathon or aurora.

 I haven't tried it, but I would imagine that if you had 10 slaves and
 started a job with 11 tasks with host exclusivity, marathon would start
 the 11th task when you spin up an 11th slave.


 On Thursday, 22 January 2015, Sharma Podila spod...@netflix.com wrote:

 Just a thought looking forward...
 It might be useful to define an init kind of feature in Mesos slaves.
 Configuration can be defined in the Mesos master that lists services that
 must be run on all slaves. When slaves register, they get the list of
 services to run all the time. Updates to the configuration can be
 dynamically reflected on all slaves, and this therefore ensures that all
 slaves run the required services. Sophistication can be put in place to
 have different sets of services for different types of slaves (by resource
 types/quantity, etc.).
 Such a feature bodes well with Mesos being the DataCenter OS/Kernel.


 On Thu, Jan 22, 2015 at 9:43 AM, CCAAT cc...@tampabay.rr.com wrote:

 On 01/21/2015 11:10 PM, Shuai Lin wrote:

 OK, I'll take a look at the debian package.

 thanks,
 James




  You can always write the init wrapper scripts for marathon. There is an
 official debian package, which you can find in mesos's apt repo.

 On Thu, Jan 22, 2015 at 4:20 AM, CCAAT cc...@tampabay.rr.com wrote:

 Hello all,

 I was reading about Marathon: Marathon scheduler processes were
 started outside of Mesos using init, upstart, or a similar tool [1]

 So my related questions are:

 Does Marathon work with mesos + Openrc as the init system?

 Are there any other frameworks that work with Mesos + Openrc?


 James



 [1] http://mesosphere.github.io/marathon/








Re: Unable to follow Sandbox links from Mesos UI.

2015-01-21 Thread Ryan Thomas
Hey Dan,

The UI will attempt to pull that info directly from the slave, so you need
to make sure the host is resolvable and routable from your browser.

Cheers,

Ryan

From my phone

On Wednesday, 21 January 2015, Dan Dong dongda...@gmail.com wrote:

 Hi, All,
  When I try to access a sandbox on the mesos UI, I see the following info
  (the same error appears on every slave sandbox):

  Failed to connect to slave '20150115-144719-3205108908-5050-4552-S0'
  on 'centos-2.local:5051'.

  Potential reasons:
  - The slave's hostname, 'centos-2.local', is not accessible from your network
  - The slave's port, '5051', is not accessible from your network

  I checked that:
  - slave centos-2.local can be logged into from any machine in the cluster
    without a password, by ssh centos-2.local;
  - port 5051 on slave centos-2.local can be connected to from the master,
    by telnet centos-2.local 5051.

  The stdout and stderr are there under each slave's /tmp/mesos/..., but it
  seems the mesos UI just cannot access them. (Both master and slaves are
  in the same network IP range.) Should I open any port on the slaves? Any
  hint what the problem is here?

  Cheers,
  Dan




Re: Killing Docker containers

2014-10-15 Thread Ryan Thomas
I've updated the RB with the feedback from the previous review:
https://reviews.apache.org/r/26734/


On 15 October 2014 08:57, Ryan Thomas r.n.tho...@gmail.com wrote:

 Here is the RB link: https://reviews.apache.org/r/26709/ - fixed at a
 30-second timeout at the moment, but I'd imagine that this is something
 we want to make configurable.

 ryan

 On 15 October 2014 08:32, Ankur Chauhan an...@malloc64.com wrote:

 ++ I was planning on submitting that patch. But if someone has this
 sorted out already, I'll defer.

 Sent from my iPhone



Re: Killing Docker containers

2014-10-15 Thread Ryan Thomas
Latest review is here: https://reviews.apache.org/r/26736/ - I had to
update it due to style failures.

Cheers,

Ryan

On Thursday, 16 October 2014, Scott Rankin sran...@crsinc.com wrote:

  Thanks, Ryan.  That solution sounds perfect.


Re: Killing Docker containers

2014-10-14 Thread Ryan Thomas
The docker stop command will attempt to kill the container if it doesn't
stop in 10 seconds by default. I think we should be using this with the -t
flag to control the time between stop and kill rather than just using kill.
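
In other words (the container id is a placeholder):

    # docker stop sends SIGTERM, then SIGKILL once the timeout expires;
    # -t overrides the 10-second default, e.g. 30 seconds to clean up:
    docker stop -t 30 <container-id>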

I'll try to submit a patch.

Cheers,

ryan

On 15 October 2014 05:37, Scott Rankin sran...@crsinc.com wrote:

  Hi All,

  I’m working on prototyping Mesos+Marathon for our services platform,
 using apps deployed as Docker containers.  Our applications register
 themselves with our service discovery framework on startup and un-register
 themselves when they shut down (assuming they shut down reasonably
 gracefully).  What I’m finding is that when Mesos shuts down a docker
 container, it uses “docker kill” as opposed to “docker stop”.  I can see
 the reasoning behind this, but it causes a problem in that the container
 doesn’t get a chance to clean up after itself.

  Is this something that might be addressed?  Perhaps by trying docker
 stop and then running kill if it doesn’t shut down after 30 seconds or
 something?

  Thanks,
 Scott


Docker containerizer port conflict

2014-10-05 Thread Ryan Thomas
I've been trying out the docker integration with mesos & marathon since
bridged networking was added, and I've run into a couple of issues - the
most disturbing being the allocation of already-in-use ports (I suspect
this may be a marathon issue) and the failure to recover the tasks once
this occurs.

What I am running is a very simple setup, driven locally from vagrant. I
attempt to run the python3 container specified here under Bridged
Networking (https://mesosphere.github.io/marathon/docs/native-docker.html).

What I see is that, whilst the container is being pulled for the first
time, every task exits as KILLED. Once the image has been pulled, the
container starts but mesos does not realise this - causing it to fail to
start additional containers due to port allocation conflicts. Killing the
unrecognised container in docker will unblock mesos and let it start up
the containers.

Now, once this is started, if I attempt to scale the number of instances up
in marathon, I see in the UI that it attempts to start another container (a
third in my case, two slaves) with the same port allocations that are
already in use on the slave.

This is the error in the slave logs:

E1005 10:41:01.812988  2883 slave.cpp:2485] Container
'05cf52f1-b915-45e5-9071-6b46fda3b71c' for executor
'bridged-webapp.18747ba3-4c7c-11e4-9567-080027100ea3' of framework
'20141005-083953-159390892-5050-9177-' failed to start: Failed to
'docker run -d -c 512 -m 67108864 -e PORT=31000 -e PORT0=31000 -e
PORTS=31000,31001 -e PORT1=31001 -e MESOS_SANDBOX=/mnt/mesos/sandbox -v
/tmp/mesos/slaves/20141005-101854-159390892-5050-1326-0/frameworks/20141005-083953-159390892-5050-9177-/executors/bridged-webapp.18747ba3-4c7c-11e4-9567-080027100ea3/runs/05cf52f1-b915-45e5-9071-6b46fda3b71c:/mnt/mesos/sandbox
--net bridge -p 31000:8080/tcp -p 31001:161/udp --entrypoint /bin/sh --name
mesos-05cf52f1-b915-45e5-9071-6b46fda3b71c python:3 -c python3 -m
http.server 8080': exit status = exited with status 1 stderr = WARNING:
Your kernel does not support swap limit capabilities. Limitation discarded.
2014/10/05 10:41:01 Error response from daemon: Cannot start container
b2516e3356ca1cf3163f6926249b4e936ec9afe4549ee37f4a9d5df62dbbaf1b: Bind for
0.0.0.0:31000 failed: port is already allocated

There is nothing in the stderr or stdout of the task.
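
To see what was already holding the port when this happens, on the slave
(31000 taken from the log above):

    sudo netstat -tlnp | grep 31000
    docker ps    # then match the container publishing 0.0.0.0:31000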

I have set up the slaves according to the docs (set the containerizers and
the timeout) - any help here would be appreciated.
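
For reference, the slave settings I mean are along the lines of:

    mesos-slave --containerizers=docker,mesos \
                --executor_registration_timeout=5mins ...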

Cheers,

Ryan


Re: Frontend loadbalancer configuration for long running tasks

2014-09-09 Thread Ryan Thomas
Hi Ankur,

I saw this on the mesos subreddit not five minutes ago!

http://www.qubitproducts.com/content/Opensourcing-Bamboo

Cheers,

Ryan
On 9 Sep 2014 18:53, Ankur Chauhan an...@malloc64.com wrote:

 Hi all,

 (Please let me know if this is not the correct place for such a question).
 I have been looking at mesos + marathon + haproxy as a way of deploying
 long-running web applications. Mesos coupled with marathon's /tasks api
 gives me all the information needed to get haproxy configured and load
 balancing all the tasks, but it seems a little too simplistic.

 I was wondering if there are other projects, or if others could share how
 they configure/reconfigure their load balancers when new tasks come alive.

 Just to make things a little more concrete, consider the following use
 case:

 There are two web applications that are running as tasks on mesos:
 1. webapp1 (http + https) on app1.domain.com
 2. webapp2 (http + https) on app2.domain.com

 We want to configure a HAProxy server that routes traffic from users (:80
 and :443) and load balances it correctly onto the correct set of tasks.
 Obviously there is some haproxy configuration happening here, but I am
 interested in finding out what others have been doing in similar cases
 before I go around building yet another haproxy reconfigure-and-reload
 script.

 -- Ankur



Re: Mesos webcast

2014-09-09 Thread Ryan Thomas
Hey Vinod,

Will this be recorded? It starts at 4am in my timezone :)

Cheers,

Ryan

On 10 September 2014 03:22, Vinod Kone vinodk...@gmail.com wrote:

 Hi folks,

 I'm doing a webcast on Mesos this Thursday (h/t Mesosphere) where I will
 talk about some of the core features of Mesos (slave recovery,
 authentication, and authorization). At the end, we will have time for Q&A
 for any and all questions related to Mesos.

 More details:
 https://attendee.gotowebinar.com/register/7957587123935365890

 Thanks,



Re: Mesos 0.20.0 with Docker registry availability

2014-09-05 Thread Ryan Thomas
Whilst this is somewhat unrelated to the mesos implementation, I think it
is generally good practice to have immutable tags on images - this is
something I dislike about docker :)

Whilst the GC of old images will eventually become a problem, it will
really only be the layer delta that is consumed with each new tag. But
yes, I think there would need to be some mechanism to clear out the images
in the local registry.
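
The cron job craig describes is typically just the dangling-image idiom
(exact flags may depend on your docker version):

    # remove untagged image layers left behind by re-tagged pulls
    docker images -q --filter dangling=true | xargs -r docker rmi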

ryan
On 5 Sep 2014 18:03, mccraig mccraig mccraigmccr...@gmail.com wrote:

 ah, so I will have to use a different tag to update an app

 one immediate problem I can see is that it makes garbage collecting old
 docker images from slaves harder: currently I update the image associated
 with a tag and restart tasks to update the running app, then occasionally
 run a cron job to remove all docker images with no tag

 if every updated image has a new tag it will be harder to figure out which
 images to remove... perhaps any with no running container, though that
 could lead to unnecessary pulls and slower restarts of failed tasks

 :craig

 On 5 Sep 2014, at 08:43, Ryan Thomas r.n.tho...@gmail.com wrote:

 Hey Craig,

 docker run will attempt a pull of the image if it cannot find a matching
 image and tag in its local repository.

 So it should only pull on the first run of a given tag.

 ryan
 On 5 Sep 2014 17:41, mccraig mccraig mccraigmccr...@gmail.com wrote:

 hi tim,

 if it doesn't pull on every run, when will it pull?

 :craig

 On 5 Sep 2014, at 07:05, Tim Chen t...@mesosphere.io wrote:

 Hi Maxime,

 It is a very valid concern and that's why I've added a patch that should
 go out in 0.20.1 to not do a docker pull on every run anymore.

 Mesos will still try to docker pull when the image isn't available
 locally (via docker inspect), but only once.

 The downside of course is that you're not able to automatically get the
 latest tagged image, but I think it's a worthwhile price to pay to gain
 the benefits of not depending on the registry, being able to run local
 images, and more.

 Tim


 On Thu, Sep 4, 2014 at 10:50 PM, Maxime Brugidou 
 maxime.brugi...@gmail.com wrote:

 Hi,

 The current Docker integration in 0.20 does a docker pull from the
 registry before running any task. This means that your entire Mesos cluster
 becomes unusable if the registry goes down.

 The docs allow you to configure a custom .dockercfg for your tasks to
 point to a private docker registry.

 However, it is not easy to run an HA docker registry. The docker-registry
 project recommends using S3 storage, but this is definitely not an option
 for some people.

 I know that for regular artifacts, Mesos can use HDFS storage and you
 can run your HDFS datanodes as Mesos tasks.

 So even if I attempt to have docker registry storage in HDFS (which is
 not supported by docker-registry at the moment), I am stuck on a
 chicken-and-egg problem. I want to have as few services outside of Mesos
 as possible, and it is hard to maintain HA services (especially outside
 of Mesos).

 Is there anyone running Mesos with Docker in production without S3? I am
 trying to make all the services outside of Mesos (the infra services that
 are necessary to run Mesos, like DNS, haproxy, the Chef server, etc.)
 either HA or not critical for the cluster to run. The docker registry is
 a new piece of infra outside of Mesos that is critical...

 Best,
 Maxime





Re: Issue with Multinode Cluster

2014-08-25 Thread Ryan Thomas
If you're using the mesos-init-wrapper you can write the IP to
/etc/mesos-master/ip and that flag will be set. This goes for all the
flags, and can be done for the slave as well in /etc/mesos-slave.
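
For example, on the master in this thread (restart the service afterwards;
the slave-side equivalent is /etc/mesos-slave/ip):

    echo 10.1.100.116 | sudo tee /etc/mesos-master/ip
    sudo service mesos-master restart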


On 26 August 2014 10:18, Vinod Kone vinodk...@gmail.com wrote:

 From the logs, it looks like the master is binding to its loopback
 address (127.0.0.1) and publishing that to ZK. So the slave is trying to
 reach the master on its loopback interface, which is failing.

 Start the master with --ip flag set to its visible ip (10.1.100.116).
 Mesosphere probably has a file (/etc/defaults/mesos-master?) to set these
 flags.


 On Mon, Aug 25, 2014 at 3:26 PM, Frank Hinek frank.hi...@gmail.com
 wrote:

 Logs attached from master, slave, and zookeeper after a reboot of both
 nodes.




 On August 25, 2014 at 1:14:07 PM, Vinod Kone (vinodk...@gmail.com) wrote:

 what do the master and slave logs say?


 On Mon, Aug 25, 2014 at 9:03 AM, Frank Hinek frank.hi...@gmail.com
 wrote:

  I was able to get a single node environment setup on Ubuntu 14.04.1
 following this guide: http://mesosphere.io/learn/install_ubuntu_debian/

  The single slave registered with the master via the local Zookeeper and
 I could run basic commands by posting to Marathon.

  I then tried to build a multi node cluster following this guide:
 http://mesosphere.io/docs/mesosphere/getting-started/cloud-install/

 The guide walks you through using the Mesosphere packages to install
 Mesos, Marathon, and Zookeeper on one node that will be the master, and
 on the slave just Mesos.  You then disable automatic start of:
 mesos-slave on the master, mesos-master on the slave, and zookeeper on
 the slave.  It ends up looking like:

  NODE 1 (MASTER):
  - IP Address: 10.1.100.116
  - mesos-master
  - marathon
  - zookeeper

  NODE 2 (SLAVE):
  - IP Address: 10.1.100.117
  - mesos-slave

 The issue I’m running into is that the slave is rarely able to register
 with the master via Zookeeper.  I can never run any jobs from marathon
 (just trying a simple sleep 5 command).  Even when the slave does
 register, the Mesos UI shows 1 “Deactivated” slave — it never goes active.

  Here are the values I have for /etc/mesos/zk:

  MASTER: zk://10.1.100.116:2181/mesos
  SLAVE: zk://10.1.100.116:2181/mesos

  Any ideas of what to troubleshoot?  Would greatly appreciate pointers.

  Environment details:
  - Ubuntu Server 14.04.1 running as VMs on ESXi 5.5U1
  - Mesos: 0.20.0
  - Marathon 0.6.1

  There are no apparent connectivity issues, and I’m not having any
 problems with other VMs on the ESXi host.  All VM to VM communication is on
 the same VLAN and within the same host.

  Zookeeper log on master (slave briefly registered so I tried to run a
 sleep 5 command from marathon and then the slave disconnected):

  2014-08-25 11:50:34,976 - INFO  [NIOServerCxn.Factory:
 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket
 connection from /10.1.100.117:45778
 2014-08-25 11:50:34,977 - WARN  [NIOServerCxn.Factory:
 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@793] - Connection request from old
 client /10.1.100.117:45778; will be dropped if server is in r-o mode
 2014-08-25 11:50:34,977 - INFO  [NIOServerCxn.Factory:
 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@839] - Client attempting to
 establish new session at /10.1.100.117:45778
 2014-08-25 11:50:34,978 - INFO  [SyncThread:0:ZooKeeperServer@595] -
 Established session 0x1480b22f7fc with negotiated timeout 1 for
 client /10.1.100.117:45778
 2014-08-25 11:51:05,724 - INFO  [ProcessThread(sid:0
 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
 when processing sessionid:0x1480b22f7f1 type:create cxid:0x53faafa9
 zxid:0x49 txntype:-1 reqpath:n/a Error Path:/marathon Error:KeeperErrorCode
 = NodeExists for /marathon
 2014-08-25 11:51:05,724 - INFO  [ProcessThread(sid:0
 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
 when processing sessionid:0x1480b22f7f1 type:create cxid:0x53faafaa
 zxid:0x4a txntype:-1 reqpath:n/a Error Path:/marathon/state
 Error:KeeperErrorCode = NodeExists for /marathon/state
 2014-08-25 11:51:09,145 - INFO  [ProcessThread(sid:0
 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
 when processing sessionid:0x1480b22f7f1 type:create cxid:0x53faafb5
 zxid:0x4d txntype:-1 reqpath:n/a Error Path:/marathon Error:KeeperErrorCode
 = NodeExists for /marathon
 2014-08-25 11:51:09,146 - INFO  [ProcessThread(sid:0
 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
 when processing sessionid:0x1480b22f7f1 type:create cxid:0x53faafb6
 zxid:0x4e txntype:-1 reqpath:n/a Error Path:/marathon/state
 Error:KeeperErrorCode = NodeExists for /marathon/state






Re: Issue with Multinode Cluster

2014-08-25 Thread Ryan Thomas
I'm not sure what the best-practice is, but I use the /etc/mesos* method as
I find it more explicit.


On 26 August 2014 10:38, Frank Hinek frank.hi...@gmail.com wrote:

 Vinod: bingo!  I’ve spent 2 days trying to figure this out.  The only
 interfaces on the VMs were eth0 and lo—interesting that it picked the
 loopback automatically or that the tutorials didn’t note this.

 Ryan: Is it considered better practice to modify /etc/default/mesos-master
 or write the IP to /etc/mesos-master/ip ?



Re: MesosCon attendee introduction thread

2014-08-14 Thread Ryan Thomas
Hi all,

My name is Ryan Thomas and I am a development team lead at Atlassian in
Sydney, Australia. We have been playing with Mesos / Marathon / Aurora /
Docker since about November last year, and I'm really keen to talk to
people about the development and deployment process around microservices.

I'm occasionally in the IRC channels, and I'm on twitter as
@hobos_delight.

Cheers, and looking forward to chatting with everyone!


ryan


On 15 August 2014 09:14, Ray Rodriguez rayrod2...@gmail.com wrote:

 Hi everyone,

 My name is Ray Rodriguez and I am a data infrastructure engineer in the
 data science team at Sailthru in New York City.  I first started
 experimenting with Mesos/Marathon/Chronos about 8 months ago and am
 currently building out a Spark cluster running on Mesos.  I'm also into all
 things automation/CM/Infrastructure as Code etc., including chef,
 consul/etcd, zookeeper, docker, and coreos.  I'm the author of a couple of
 Mesos cookbooks and recently contributed a collectd plugin for parsing
 Mesos stats (https://github.com/rayrod2030/collectd-mesos).

 Looking forward to talking to everyone about their experiences running
 Spark on Mesos in production and the rest of the Mesos ecosystem.

 Twitter: @rayray2030


 On Thu, Aug 14, 2014 at 7:05 PM, Dave Lester daveles...@gmail.com wrote:

 Hi All,

 I thought it would be nice to kick off a thread for folks to introduce
 themselves in advance of #MesosCon
 http://events.linuxfoundation.org/events/mesoscon, so here goes:

 My name is Dave Lester, and I am Open Source Advocate at Twitter. Twitter
 is an organizing sponsor for #MesosCon, and I've worked closely with Chris
 Aniszczyk, the Linux Foundation, and a great team of volunteers to
 hopefully make this an awesome community event.

 I'm interested in meeting more companies using Mesos that we can add to
 our #PoweredByMesos list
 http://mesos.apache.org/documentation/latest/powered-by-mesos/, and
 chatting with folks about Apache Aurora
 http://aurora.incubator.apache.org. Right now my Thursday and Friday
 evenings are free, so let's grab a beer and chat more.

 I'm also on Twitter: @davelester

 Next!





Re: MesosCon attendee introduction thread

2014-08-14 Thread Ryan Thomas
Hey David,

I'll be keen to have a chat, we've been using Clojure for a bit of work as
well.

ryan


On 15 August 2014 12:08, David Greenberg dsg123456...@gmail.com wrote:

 I'm David Greenberg, and I work at Two Sigma. I've been rearchitecting our
 main compute cluster to run on top of Mesos. I've been developing an
 internal framework with an interesting, different scheduler model. Another
 member of our team will also be at the conference. I'm really excited to
 talk to others about using Mesos from Clojure and developing custom
 frameworks!


 On Thu, Aug 14, 2014 at 9:15 PM, Bill Farner wfar...@apache.org wrote:

 I'm Bill Farner, tech lead of the Aurora team at Twitter for the past 4+
 years, and an Aurora committer.  I will be giving a talk detailing some of
 the history of Aurora, and explaining some new features we have on the
 roadmap.  We Aurora developers have been really excited to see the project
 successfully in use by other companies, and I can't wait to discuss details
 with folks at the conference!

 -=Bill


 On Thu, Aug 14, 2014 at 4:55 PM, Brian Wickman wick...@apache.org
 wrote:

 I'm Brian Wickman (@wickman http://twitter.com/wickman) from the
 cloud infrastructure group at Twitter and an Aurora committer.  I'll be
 around both days, probably spending Friday hacking on pesos
 https://github.com/wickman/pesos and related projects.  I'd also be
 happy to give ad-hoc Aurora tutorials during the Hackathon, e.g. advanced
 Aurora configuration and/or hacking the Aurora executor come to mind.

 Looking forward to meeting everyone!

 ~brian

