Re: [VOTE] Release Apache Mesos 0.23.1 (rc1)

2015-09-23 Thread Vinod Kone
+1 (binding) Tested on CI for centos5/6 @vinodkone > On Sep 23, 2015, at 3:45 PM, Adam Bordelon wrote: > > +1 (binding) Tested on CI for CentOS7, Fedora22, and Ubuntu 14.04. > >> On Wed, Sep 23, 2015 at 10:25 AM, haosdent wrote: >> >> +1 test on Ubuntu 14.04 >> >>> On Tue, Sep 22, 2015 at

Re: Metric for tasks queued/waiting?

2015-09-23 Thread Niklas Nielsen
Created a ticket for us to continue the discussion: https://issues.apache.org/jira/browse/MESOS-3507 We can try to capture the explicit use-case from Aaron and maybe create another ticket to track a more-or-less generic path we could go down. Cheers, Niklas On 23 September 2015 at 15:55, Sharma

Re: Metric for tasks queued/waiting?

2015-09-23 Thread Sharma Podila
Discussing in a separate place/JIRA ticket sounds good. Basically, representing contention using a summary of pending resource requests from each framework could be the hints to mesos master. However, this gets into intricacies, not the least of which is diversity of resource requests, qualified by

Re: [VOTE] Release Apache Mesos 0.23.1 (rc1)

2015-09-23 Thread Adam Bordelon
+1 (binding) Tested on CI for CentOS7, Fedora22, and Ubuntu 14.04. On Wed, Sep 23, 2015 at 10:25 AM, haosdent wrote: > +1 test on Ubuntu 14.04 > > On Tue, Sep 22, 2015 at 9:06 AM, Adam Bordelon wrote: > >> Hi friends, >> >> Please vote on releasing the following candidate as Apache Mesos 0.23.1

Re: Metric for tasks queued/waiting?

2015-09-23 Thread Niklas Nielsen
I'd love to see this solved in a general way: "How does the framework communicate (insert intent, metric, hint, etc.) to Mesos?" In one way, the 'webui_url' in the framework info conveys "This is how you get to my web UI", as providing a web UI was a common pattern for the frameworks. This could

Re: Metric for tasks queued/waiting?

2015-09-23 Thread Alex Gaudio
Hi Aaron, You might consider trying to solve the autoscaling problem with Relay, a Python tool I use to solve this problem. Feel free to shoot me an email if you are interested. github.com/sailthru/relay Alex On Wed, Sep 23, 2015, 11:03 AM David Greenberg wrote: > In addition, this technique

Re: [VOTE] Release Apache Mesos 0.24.1 (rc1)

2015-09-23 Thread Adam Bordelon
+1 (binding) Tested on CI for CentOS7, Fedora22, and Ubuntu 14.04. On Tue, Sep 22, 2015 at 1:41 PM, Niklas Nielsen wrote: > +1 (binding) > > On 21 September 2015 at 11:46, Vinod Kone wrote: > >> +1 (binding) >> >> Tested on CI for CentOS5 and CentOS6. >> >> On Fri, Sep 18, 2015 at 6:21 PM, Adam

Re: mesos-containerizer: error while loading shared libraries: libmesos-0.24.0.so

2015-09-23 Thread Joris Van Remoortere
Can you run the slave and executor with GLOG_v=1 set on the environment and try to provide some more context for this error: > mesos-containerizer: error while loading shared libraries: > libmesos-0.24.0.so: cannot open shared object file: No such file or > directory Are there any logs on the sla

Re: Changing mesos slave configuration

2015-09-23 Thread Joris Van Remoortere
We are adding better support for systemd in 0.25. The ticket is MESOS-3425. Naturally this is still somewhat experimental, but we would love your feedback. We will add some documentation on recommended setups on systemd. With the changes going into 0.25 you should be able to launch your slave with

Re: Detecting slave crashes event

2015-09-23 Thread Joris Van Remoortere
There is a plan for event subscription, but it is still in the early design phase. In 0.25 we are adding slave exit hooks: MESOS-3015 This will allow you to generate whatever events you like based on removal of a slave. This is your best bet in terms of an immediate solution :-) @Kapil and @Nikla

Re: SSL in Mesos 0.23

2015-09-23 Thread Benjamin Mahler
+joris On Thu, Sep 17, 2015 at 6:44 AM, tommy xiao wrote: > read many more reports on SSL. Does it mean that Mesos currently can't > support ssl interconn? > > 2015-09-17 18:55 GMT+08:00 Carlos Sanchez : > >> I got back to SSL and made some progress, SSL is enabled now (I think >> I needed to expo

Re: Detecting slave crashes event

2015-09-23 Thread Benjamin Mahler
I believe some of the contributors from Mesosphere have been thinking about it, but not sure on the plans. I'll let them reply here. On Wed, Sep 16, 2015 at 11:11 AM, Paul Bell wrote: > Thank you, Benjamin. > > So, I could periodically request the metrics endpoint, or stream the logs > (maybe vi

Re: [VOTE] Release Apache Mesos 0.23.1 (rc1)

2015-09-23 Thread haosdent
+1 test on Ubuntu 14.04 On Tue, Sep 22, 2015 at 9:06 AM, Adam Bordelon wrote: > Hi friends, > > Please vote on releasing the following candidate as Apache Mesos 0.23.1. > > 0.23.1 is a bug fix release and includes the following: > > ---

Re: Changing mesos slave configuration

2015-09-23 Thread Pradeep Chhetri
Thank you for the issue link. I will go through to understand which configuration changes can be done with and without recovery. On Wed, Sep 23, 2015 at 4:25 PM, Vinod Kone wrote: > It's not yet possible to make certain slave configuration changes while > making recovery (reconnecting with old e

Re: how does resource allocation work with docker?

2015-09-23 Thread Guangya Liu
Hi Clarke, Yes, you are right: Mesos now uses 1024 * CPUS as the CPU share; see https://github.com/apache/mesos/blob/master/src/docker/docker.cpp#L394 for details. Regarding CPUs, I think that by default the container has the same number of cores/CPUs as the Docker server, not sure
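The 1024 * CPUS rule mentioned in this message can be illustrated with a small sketch. The function names are hypothetical and are not Mesos APIs; the sketch covers only the multiplication quoted in the thread, and any extra clamping or rounding Mesos may apply is not modeled.

```python
# Relative CPU weight per CPU, following the "1024 * CPUS" rule quoted above.
CPU_SHARES_PER_CPU = 1024


def cpus_to_shares(cpus: float) -> int:
    """Convert a (possibly fractional) CPU allocation to a cpu-shares value."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return int(CPU_SHARES_PER_CPU * cpus)


def shares_to_cpus(shares: int) -> float:
    """Inverse mapping: the CPU allocation implied by a cpu-shares value."""
    return shares / CPU_SHARES_PER_CPU


if __name__ == "__main__":
    # Hypothetical allocations, chosen only for illustration.
    for cpus in (0.5, 1, 2.5):
        print(cpus, "->", cpus_to_shares(cpus))
```

CPU shares are relative weights rather than hard caps, which matches the observation elsewhere in this thread that there is no hard enforcement of total CPUs.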

Re: Changing mesos slave configuration

2015-09-23 Thread Vinod Kone
It's not yet possible to make certain slave configuration changes while making recovery (reconnecting with old executors) work. See https://issues.apache.org/jira/browse/MESOS-1739 and attached tickets for details. On Wed, Sep 23, 2015 at 7:37 AM, Pradeep Chhetri < pradeep.chhetr...@gmail.com> wr

Re: Metric for tasks queued/waiting?

2015-09-23 Thread David Greenberg
In addition, this technique could be implemented in the allocator with an understanding of global demand: https://www.youtube.com/watch?v=BkBMYUe76oI That would allow for tunable fair-sharing based on DRF-principles. On Wed, Sep 23, 2015 at 10:59 AM haosdent wrote: > Feel free to open a story i

RE: how does resource allocation work with docker?

2015-09-23 Thread Clarke, Trevor
I think you misunderstand. I'm not looking to change the container resource allocation, I'm looking to have the program in the container make adjustments (changing static buffer allocations based on memory limits or adjusting the number of threads based on cpu allocation). But I need a way to co

Re: Metric for tasks queued/waiting?

2015-09-23 Thread haosdent
Feel free to open a story in jira if you think your ideas are awesome. :-) On Sep 23, 2015 10:54 PM, "Sharma Podila" wrote: > Ah, OK, thanks. Yes, Fenzo is a Java library. > > It might be a nice addition to Mesos master to get a global view of > contention for resources. In addition to autoscaling

RE: how does resource allocation work with docker?

2015-09-23 Thread haosdent
>Is there a way for a process in the docker container to determine how many cpus it has been allocated? Because Docker limits resources before starting the container, I think there isn't a way to change a container's resource usage once it is already running. You can only allocate resources before launc

Re: Metric for tasks queued/waiting?

2015-09-23 Thread Sharma Podila
Ah, OK, thanks. Yes, Fenzo is a Java library. It might be a nice addition to Mesos master to get a global view of contention for resources. In addition to autoscaling, it would be useful in the allocator. On Wed, Sep 23, 2015 at 7:29 AM, Aaron Carey wrote: > Thanks Sharma, > > I was in the au

Re: Changing mesos slave configuration

2015-09-23 Thread Pradeep Chhetri
Thank you for the replies. Paul, I am talking about the same directory. There is a file named slave.info inside /tmp/mesos/meta/slaves/latest and this needs to be cleaned before starting mesos slave with a configuration change. No, I am not using systemd. It is basically sysvinit which is spaw

RE: how does resource allocation work with docker?

2015-09-23 Thread Clarke, Trevor
That's just the scheduler...I did locate the appropriate docker run call and it appears that mesos cpu offer * 1024 is used as the cpu share so there's no hard enforcement of total cpus but the scheduler block is scaled to the number of requested cpus. It also appears there's no way to determine

RE: Metric for tasks queued/waiting?

2015-09-23 Thread Aaron Carey
Thanks Sharma, I was in the audience for a talk you did about Fenzo at MesosCon :) It looked great but we're a python shop primarily so the Java requirement would be a problem for us. The scaling in the scheduler makes total sense, (obvious when you think about it!), I was naively hoping for s

Re: Metric for tasks queued/waiting?

2015-09-23 Thread Sharma Podila
Jobs/tasks wait in framework schedulers, not mesos master. Autoscaling triggers must come from schedulers, not only because that's who knows the pending task set size, but, also because it knows how many of them need to be launched right away, on what kind of machines. We built such an autoscaling

Re: how does resource allocation work with docker?

2015-09-23 Thread Guangya Liu
Hi Clarke, You can take a look at the framework for docker here https://github.com/apache/mesos/blob/master/src/examples/docker_no_executor_framework.cpp All of the resource limitations for a docker task should be defined in the framework. Thanks, Guangya On Wed, Sep 23, 2015 at 9:33 PM, Clark

RE: combining stdout/stderr

2015-09-23 Thread Clarke, Trevor
That's my fallback, but it would require modifying and rebuilding the docker containers, and it won't redirect any stderr from the OS instance in the container, just the application that's executing. From: haosdent [haosd...@gmail.com] Sent: Wednesday, September 23

how does resource allocation work with docker?

2015-09-23 Thread Clarke, Trevor
When a framework accepts an offer and starts a docker task, how does mesos enforce the task's allocated cpus? Is CPU share used and scaled appropriately? Are cpu sets explicitly specified to limit execution to the allocated cpus? Is there a way for a process in the docker container to determine

Re: combining stdout/stderr

2015-09-23 Thread haosdent
How about using 2>&1? On Wed, Sep 23, 2015 at 9:16 PM, Clarke, Trevor wrote: > Is there a way to have mesos combine stderr with stdout in a docker > process so there's only one log stream with proper interleave?
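The `2>&1` suggestion sends a process's stderr to wherever its stdout is going, so both streams end up in one log. As a minimal sketch of the same idea outside a shell, the Python standard library offers `stderr=subprocess.STDOUT`. The helper name `run_combined` and the child command are hypothetical, chosen only for illustration; this does not show how Mesos or Docker launch tasks.

```python
import subprocess
import sys


def run_combined(argv):
    """Run argv with stderr merged into stdout, like `cmd 2>&1` in a shell."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # route stderr into the same pipe as stdout
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical child process that writes one line to each stream.
    child = [
        sys.executable,
        "-c",
        "import sys; print('to stdout', flush=True); "
        "print('to stderr', file=sys.stderr)",
    ]
    print(run_combined(child), end="")
```

Because both streams share one pipe, the lines arrive in the order the child wrote them, subject to the child's own buffering.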

combining stdout/stderr

2015-09-23 Thread Clarke, Trevor
Is there a way to have mesos combine stderr with stdout in a docker process so there's only one log stream with proper interleave?

Re: Changing mesos slave configuration

2015-09-23 Thread craig w
I believe Brian might be referring to the "KillMode" in the systemd unit file: # the default is cgroup, which means kill all processes # in the control group of this process, which is not # what you'd want KillMode=process On Wed, Sep 23, 2015 at 8:11 AM, Brian Devins wrote: > Are you using sys

Re: Changing mesos slave configuration

2015-09-23 Thread Brian Devins
Are you using systemd? There is a known issue with slave recovery on systemd. I'm on mobile or I would link you to the last thread around this but there is a line you can add to the config that is supposed to fix it. Whether it will fix it is another matter. I am fighting this issue at work myself.

RE: Metric for tasks queued/waiting?

2015-09-23 Thread Aaron Carey
No, I basically had the same question as Jim (but maybe didn't word it so well ;)) I'll have a look at your response there :) From: haosdent [haosd...@gmail.com] Sent: 23 September 2015 10:12 To: user@mesos.apache.org Subject: Re: Metric for tasks queued/waiting?

Re: Changing mesos slave configuration

2015-09-23 Thread Paul Bell
Hi Pradeep, Perhaps I am speaking to a slightly different point, but when I change /etc/default/mesos-slave to add a new attribute, I have to remove file /tmp/mesos/meta/slaves/latest. IIRC, mesos-slave itself, in failing to start after such a change, tells me to do this: rm -f /tmp/mesos/meta/s

Re: Mesos master metrics endpoint?

2015-09-23 Thread haosdent
Hi, @James As far as I know, when the framework requests offers, master/messages_resource_request is incremented. And when the framework accepts the offer and launches tasks, master/messages_launch_tasks is incremented. But Mesos doesn't know how many tasks are pending in the framework, because I think this inf

Re: Metric for tasks queued/waiting?

2015-09-23 Thread haosdent
Does /metrics/snapshot not satisfy your requirement? On Wed, Sep 23, 2015 at 4:50 PM, Aaron Carey wrote: > Hi all, > > Is there any way to get a metric of all tasks currently waiting/queued in > Mesos (across all schedulers)? The snapshot metrics seem to cover ever > other kind of task state? Th
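For readers who have not used the endpoint, here is a minimal sketch of reading `/metrics/snapshot` and picking out the task counters. The master address `http://localhost:5050` and both function names are assumptions made for illustration; the sketch also assumes the endpoint returns a flat JSON object of metric names to numbers.

```python
import json
from urllib.request import urlopen


def fetch_snapshot(master="http://localhost:5050"):
    """Fetch the metrics snapshot from a master (address is an assumption)."""
    with urlopen(master + "/metrics/snapshot", timeout=5) as resp:
        return json.load(resp)


def task_metrics(snapshot):
    """Keep only the per-state task counters from a snapshot dictionary."""
    return {
        name: value
        for name, value in snapshot.items()
        if name.startswith("master/tasks_")
    }


if __name__ == "__main__":
    # Requires a reachable master; prints one counter per line.
    for name, value in sorted(task_metrics(fetch_snapshot()).items()):
        print(name, value)
```

As other replies in this thread point out, these counters cover task states the master knows about; tasks still waiting inside a framework scheduler do not appear here.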

Mesos master metrics endpoint?

2015-09-23 Thread James Vanns
Hi all. It appears there is a glaring omission in the 'Tasks' section of the following doc; http://mesos.apache.org/documentation/latest/monitoring/ Shouldn't there be a 'Tasks waiting' metric!? We generally have tasks hanging around for a while because their resource requests can't (yet) be met

Metric for tasks queued/waiting?

2015-09-23 Thread Aaron Carey
Hi all, Is there any way to get a metric of all tasks currently waiting/queued in Mesos (across all schedulers)? The snapshot metrics seem to cover every other kind of task state. This would be quite useful for auto-scaling purposes. Thanks, Aaron

Changing mesos slave configuration

2015-09-23 Thread Pradeep Chhetri
Hello all, I have often faced the problem that whenever I try to add a configuration parameter to mesos-slave or change any configuration (e.g. add a new attribute in mesos-slave), the mesos slave doesn't come up on restart. I have to delete the slave.info file and then restart the slave but it