On augmenting TLS configuration options in libprocess

2019-05-24 Thread Alex Rukletsov
Folks, We reviewed TLS configuration options in libprocess and came up with the following proposal [1] to allow for certificate verification in client mode only. In short, the proposal suggests adding two flags to libprocess so that it can be configured to: * always require presence and verify

Re: '*.json' endpoints removed in 1.7

2019-05-11 Thread Alex Rukletsov
; breaking some user/tooling, in my opinion. We could revisit this if and > when we do a Mesos 2.0. > > On Wed, Aug 8, 2018 at 9:25 AM Alex Rukletsov wrote: > > > Folks, > > > > The long ago deprecated '*.json' endpoints will be removed in Mesos > 1.7.0. > > Pleas

Re: [VOTE] Release Apache Mesos 1.4.3 (rc1)

2019-01-28 Thread Alex Rukletsov
This will be the last official 1.4.x release. Even though we agreed to keep the branch and occasionally back port fixes to it post last release, maybe it makes sense to include all pending patches into 1.4.3? I see for example Gilbert added the fix for MESOS-9532 [1]. We were also considering back

Re: Join us at MesosCon 2018 next week!

2018-11-07 Thread Alex Rukletsov
I'd like to thank everyone involved in organising this MesosCon, and especially Gastón, Jörg, and Andy. I enjoyed the laid-back "underground" style this year; it was easy to engage in conversations with users and Mesos developers. Looking forward to the next MesosCon! Alex On Thu, Nov 1, 2018 at

On committer candidate nomination

2018-10-16 Thread Alex Rukletsov
Folks, A seemingly complex and long path to become a committer can drive away potential candidates shortly after they start contributing to the project. Around a year ago Jim Jagielski raised a concern about the high entry bar we have in the project. We heard the feedback and decided to

Re: [VOTE] Release Apache Mesos 1.7.0 (rc3)

2018-09-14 Thread Alex Rukletsov
+1 (binding) Mesosphere's internal CI run with the aforementioned tag. Observed 4 flaky tests, 3 are known: https://issues.apache.org/jira/browse/MESOS-5048 https://issues.apache.org/jira/browse/MESOS-8260 https://issues.apache.org/jira/browse/MESOS-8951 One has been introduced as part of adding

Re: [VOTE] Release Apache Mesos 1.7.0 (rc1)

2018-08-22 Thread Alex Rukletsov
MESOS-9177 has been filed today. It is very likely a regression introduced by one of the state.json improvements. We are still investigating, but it is obviously a -1 (binding) for rc1. Alex. On Wed, Aug 22, 2018 at 4:34 AM, Chun-Hung Hsiao wrote: > Hi all, > > Please vote on releasing the

Re: [VOTE] Release Apache Mesos 1.4.2 (rc1)

2018-08-20 Thread Alex Rukletsov
+1 binding (make check on Mac OS 10.13.5) On Mon, Aug 20, 2018 at 8:28 PM, Kapil Arya wrote: > +1 binding (internal CI). > > The Apache CI failures reported by Vinod are all known flaky tests. I have > inserted the details inline. > > Best, > Kapil > > On Tue, Aug 14, 2018 at 11:03 AM Vinod

'*.json' endpoints removed in 1.7

2018-08-08 Thread Alex Rukletsov
Folks, The long ago deprecated '*.json' endpoints will be removed in Mesos 1.7.0. Please use their non-'.json' counterparts instead. Commit: https://github.com/apache/mesos/commit/42551cb5290b7b04101f7d800b4b8fd573e47b91 JIRA ticket: https://issues.apache.org/jira/browse/MESOS-4509 Alex.

Re: [VOTE] Release Apache Mesos 1.3.3 (rc1)

2018-07-20 Thread Alex Rukletsov
MPark— what's the decision regarding the 1.3.3 release? On Mon, Jul 9, 2018 at 8:52 PM, Michael Park wrote: > I'm considering simply abandoning the 1.3.3 release and bringing the 1.3.x > branch to end of life. > If anyone really wants a 1.3.3, I'm certainly willing to finish the > release

Re: Backport Policy

2018-07-16 Thread Alex Rukletsov
y as necessary, and leave it on > the committers to decide if backporting a particular change is necessary. > > > On 07/13/2018 12:54 am, Alex Rukletsov wrote: > >> This is exactly where our views differ, Ben : ) >> >> Ideally, I would like a release manager to have

Re: Backport Policy

2018-07-13 Thread Alex Rukletsov
erally backport every bug fix I commit > that applies cleanly, right after I commit it to master (with the > exceptions I listed below). > > On Thu, Jul 12, 2018 at 8:39 AM, Alex Rukletsov > wrote: > > > I would like to back port as little as possible. I suggest the fo

Re: Proposing change to the allocatable check in the allocator

2018-06-12 Thread Alex Rukletsov
Instead of the master flag, why not a master API call? This would allow updating the value without restarting the master. Another thought is that we should explain to operators how and when to use this knob. For example, if they observe a behavioural pattern A, then it means B is happening, and

Re: Update the *Minimum Linux Kernel version* supported on Mesos

2018-04-08 Thread Alex Rukletsov
This does not seem like a disruptive change to me, so I'm +1. On Thu, Apr 5, 2018 at 6:36 PM, Jie Yu wrote: > User namespaces require >= 3.12 (November 2013). Can we make that the >> minimum? > > > No, we need to support CentOS7 which uses 3.10 (some variant) > > - Jie > > On

Re: Release policy and 1.6 release schedule

2018-03-26 Thread Alex Rukletsov
I would like us to do monthly releases and support 10 branches at a time. Ideally, releasing that often reduces the burden for the release manager, because there are fewer changes and fewer new features. However, we lack automation to support this pace: our release guide [1] is several pages long

Re: Mesos 1.5.0 Release

2017-12-22 Thread Alex Rukletsov
https://issues.apache.org/jira/browse/MESOS-8297 has just landed. Let's include it in 1.5.0 as well. On Fri, Dec 22, 2017 at 4:35 AM, Jie Yu wrote: > Yeah, I am doing a grooming right now. > > Sent from my iPhone > > > On Dec 21, 2017, at 7:25 PM, Benjamin Mahler

[RESULT][VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-31 Thread Alex Rukletsov
Hi all, The vote for Mesos 1.1.3 (rc2) has passed with the following votes. +1 (Binding) -- Alex R Till Tönshoff Vinod Kone There were no 0 or -1 votes. Please find the release at: https://dist.apache.org/repos/dist/release/mesos/1.1.3 It is recommended to use a

Re: [VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-31 Thread Alex Rukletsov
cmake > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos- > Release/40/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=-- > verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu% > 3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)

[VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-25 Thread Alex Rukletsov
Folks, Please vote on releasing the following candidate as Apache Mesos 1.1.3. Note that this will be the last 1.1.x release. 1.1.3 includes the following: ** Bug  * [MESOS-5187] - The filesystem/linux isolator

Re: Mesos 1.1.3 release

2017-08-17 Thread Alex Rukletsov
We have two more issues that I would like to have in 1.1.3 because it's the last 1.1.x release: https://issues.apache.org/jira/browse/MESOS-7865 https://issues.apache.org/jira/browse/MESOS-7863 They are in review and will be back ported soon. On Tue, Jul 25, 2017 at 11:28 AM, Alex Rukletsov

Re: Mesos 1.1.3 release

2017-07-25 Thread Alex Rukletsov
MESOS-7643 is still unresolved. I am moving the cut date for one more week, because this is the last patch release for 1.1.x. On Fri, Jul 14, 2017 at 6:34 PM, Alex Rukletsov <a...@mesosphere.com> wrote: > Folks, > > We are planning to cut the 1.1.3 release once MESOS-7643 is res

Re: Mesos 1.1.3 release

2017-07-14 Thread Alex Rukletsov
://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12331463 Till & Alex. On Wed, Jun 14, 2017 at 12:59 PM, Alex Rukletsov <a...@mesosphere.com> wrote: > Folks, > > there are only 2 back ported tickets to the 1.1.x branch so far (MESOS-7540 > and MESOS-7569). Since this

Re: Executors and CPU allocations

2017-06-26 Thread Alex Rukletsov
Regarding your second idea, you may have a "dummy" task with, say, 1.8 CPU and "run" it iff there is at least another real task running, while assigning 0.1 CPU for your executor. You can do some bookkeeping in the executor to determine whether a certain executor is idle (and hence a "dummy" task

On Apache Mesos release process

2017-06-17 Thread Alex Rukletsov
Folks, for more than a year Apache Mesos releases have been done according to our "then new" release policy [1]. It seems to work quite well, but today I would like to address things that can be improved. Let's start with pain points: * A minor bug can cancel a release vote, even for a patch release.

Mesos 1.1.3 release

2017-06-14 Thread Alex Rukletsov
Folks, there are only 2 back ported tickets to the 1.1.x branch so far (MESOS-7540 and MESOS-7569). Since this will be the last 1.1.x release, we are delaying it for 3 more weeks to leave more time for people to include critical bug fixes. Till & Alex.

Re: [VOTE] Release Apache Mesos 1.2.1 (rc1)

2017-06-12 Thread Alex Rukletsov
PortMapping tests are indeed in bad shape. There are JIRAs already, have a look before filing new ones: MESOS-4646, MESOS-5687, MESOS-2765, MESOS-5690, MESOS-5688, MESOS-5689, MESOS-4643, MESOS-4644, MESOS-5309 On Sat, Jun 10, 2017 at 10:58 AM, Adam Bordelon wrote: > +1

[RESULT][VOTE] Release Apache Mesos 1.1.2 (rc2)

2017-05-19 Thread Alex Rukletsov
Hi all, The vote for Mesos 1.1.2 (rc2) has passed with the following votes. +1 (Binding) -- Vinod Kone Till Tönshoff Alex Rukletsov There were no 0 or -1 votes. Please find the release at: https://dist.apache.org/repos/dist/release/mesos/1.1.2 It is recommended

[VOTE] Release Apache Mesos 1.1.2 (rc2)

2017-05-12 Thread Alex Rukletsov
Folks, Please vote on releasing the following candidate as Apache Mesos 1.1.2. 1.1.2 includes the following: ** Bug * [MESOS-2537] - AC_ARG_ENABLED checks are broken. * [MESOS-5028] - Copy provisioner cannot

Re: [VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-12 Thread Alex Rukletsov
est we include this fix in 1.1.2 >> https://issues.apache.org/jira/browse/MESOS-7471 >> >> On Thu, May 4, 2017 at 12:07 PM, Alex Rukletsov <a...@mesosphere.com> >> wrote: >> >>> Hi all, >>&g

Re: [VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-10 Thread Alex Rukletsov
7471 > > On Thu, May 4, 2017 at 12:07 PM, Alex Rukletsov <a...@mesosphere.com> > wrote: > >> Hi all, >> >> Please vote on releasing the following candidate as Apache Mesos 1.1.2. >> >> 1.1.2 includes the following: >> --

[VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-04 Thread Alex Rukletsov
Hi all, Please vote on releasing the following candidate as Apache Mesos 1.1.2. 1.1.2 includes the following: ** Bug * [MESOS-2537] - AC_ARG_ENABLED checks are broken. * [MESOS-5028] - Copy provisioner cannot

Re: Default executor grace period

2017-04-25 Thread Alex Rukletsov
Commented on the ticket. On Tue, Jan 17, 2017 at 12:27 PM, Tomek Janiszewski wrote: > Created issue for this: https://issues.apache.org/jira/browse/MESOS-6933 > > On Mon, 16 Jan 2017 at 17:13, Tomek Janiszewski > wrote: > >> It looks like it's

Re: mesos container cluster came across health check coredump log

2017-03-31 Thread Alex Rukletsov
Cool, looking forward to it! On Fri, Mar 31, 2017 at 4:30 AM, tommy xiao <xia...@gmail.com> wrote: > Alex,Yes, let me have a try. > > 2017-03-31 3:16 GMT+08:00 Alex Rukletsov <a...@mesosphere.com>: > >> This is https://issues.apache.org/jira/browse/MESOS-7210.

Re: mesos container cluster came across health check coredump log

2017-03-30 Thread Alex Rukletsov
This is https://issues.apache.org/jira/browse/MESOS-7210. Deshi, do you want to send the patch? I or Haosdent can shepherd. A. On Thu, Mar 30, 2017 at 12:27 PM, tommy xiao wrote: > interesting for the specified case. > > 2017-03-30 7:52 GMT+08:00 Jie Yu :

[RESULT][VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-14 Thread Alex Rukletsov
Hi folks, The vote for Mesos 1.1.1 (rc2) has passed with the following votes. +1 (Binding) -- *** AlexR *** Till Tönshoff *** Vinod Kone There were no 0 or -1 votes. Please find the release at: https://dist.apache.org/repos/dist/release/mesos/1.1.1 It is

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-14 Thread Alex Rukletsov
, MESOS-7218 > > > On Mar 4, 2017, at 1:09 AM, Vinod Kone <vinodk...@apache.org> wrote: > > +1 (binding) > > Since the perf issue I reported earlier doesn't seem to be a blocker. > > On Fri, Mar 3, 2017 at 12:14 AM, Alex Rukletsov <a...@mesosphere.com> > wro

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-03 Thread Alex Rukletsov
ke > [image: Success] > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos- > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=-- > verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu% > 3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&

[VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-02-27 Thread Alex Rukletsov
Hi all, Please vote on releasing the following candidate as Apache Mesos 1.1.1. 1.1.1 includes the following: ** Bug * [MESOS-6002] - The whiteout file cannot be removed correctly using aufs backend. *

Re: customized IP for health check

2017-01-18 Thread Alex Rukletsov
I'm not sure that exposing a domain will help: do you know the IP of your task upfront, i.e., at the moment when you construct TaskInfo? Isn't your task listening on all interfaces? On Wed, Jan 18, 2017 at 9:54 AM, CmingXu wrote: > the network I am currently used is USER,

[Design Doc] Arbitrary task checks in Mesos

2017-01-05 Thread Alex Rukletsov
We've recently been working on a design for arbitrary task checks [1] in Mesos (currently called probes, but this will likely change). Please have a look and leave comments on the doc or

Mesos 1.1.1 release dashboard

2016-12-22 Thread Alex Rukletsov
Folks, We are planning to cut the 1.1.1 release early next week. If you have any patches that need to get into 1.1.1, please make sure that either it is already in the 1.1.x branch or the corresponding ticket has a target version including 1.1.1 *by Monday* Dec 26. The release dashboard:

Re: Mesos on AWS

2016-12-21 Thread Alex Rukletsov
Kiril, from what you described it does not sound like the problem is the Linux distribution. It may be your AWS configuration. However, if a combination of health checks and a heavily loaded agent leads to agent termination, I would like to investigate this issue. Please come back (with logs!) if

Re: Proposal: mesosadm, the command to bootstrap the mesos cluster.

2016-12-14 Thread Alex Rukletsov
I have a different opinion on this. Several years ago I came across the concept of "mean wizards": any helpers that hide away important steps from the user and hence do not give them the opportunity to learn how things actually work. (If you're interested, it was about projects in Borland IDEs that

Re: Command healthcheck failed but status KILLED

2016-12-12 Thread Alex Rukletsov
Technically the task has not failed but was killed by the executor (because it failed a health check). On Fri, Dec 9, 2016 at 11:27 AM, Tomek Janiszewski wrote: > Hi > > What is desired behavior when command health check failed? On Mesos 1.0.2 > when health check fails task

Re: Duplicate task IDs

2016-12-11 Thread Alex Rukletsov
I'm fine with prohibiting non-unique IDs, but why do you plan to keep the most recent in case of a conflict? I'd expect any duplicate (that we can find out) is rejected / killed / banned / unchurched. On 9 Dec 2016 8:13 pm, "Joris Van Remoortere" wrote: > Hey Neil, > > I

Re: Quota

2016-12-11 Thread Alex Rukletsov
Granularity in the allocator is a single agent. Hence even though you set quota for 0.0001 CPU, at least one agent is "blocked". This is probably the reason why Marathon is not getting offers. You can turn on verbose master logs and check allocator messages to confirm. Alex. On 10 Dec 2016 2:14 am,

Re: healthcheck task?

2016-12-07 Thread Alex Rukletsov
rg/apache/mesos/ > Protos.HealthCheck.html > but not a single example of how to use it > > On Wed, Dec 7, 2016 at 11:22 AM, Alex Rukletsov <a...@mesosphere.com> > wrote: > >> What exactly do you mean under "health check task"? >> >> On Wed, Dec

Re: healthcheck task?

2016-12-07 Thread Alex Rukletsov
What exactly do you mean by "health check task"? On Wed, Dec 7, 2016 at 5:09 PM, Victor L wrote: > Can someone recommend simple example of how to add healthcheck task to > java framework? > Thanks, > >

Re: [VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-11-30 Thread Alex Rukletsov
&(!ubuntu-us1)&&(!ubuntu-eu2)/> >> cmake >> [image: Success] >> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/25/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7

Re: [VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-11-28 Thread Alex Rukletsov
I see LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem failing on CentOS 7 and Fedora 23, see e.g., [1]. I don't see any backports touching [2]; could it be a regression, or is this test known to be problematic in 0.28.x? [1] http://pastebin.com/c5PzfGF8 [2]

On increasing visibility into experimental features.

2016-11-01 Thread Alex Rukletsov
Folks, In addition to the "known bugs" proposal in a parallel thread, we think that maintaining a list of still-experimental features for each minor release will significantly help users adjust their expectations. Our suggestion is to include a new section in the CHANGELOG called

Transition TASK_KILLING -> TASK_RUNNING

2016-10-31 Thread Alex Rukletsov
We've recently discovered a bug that may lead to a task being transitioned from killing to running state. More information about it in MESOS-6457 [1]. We plan to fix it in 1.2.0 and will backport it to all supported versions. [1] https://issues.apache.org/jira/browse/MESOS-6457

Re: [VOTE] Release Apache Mesos 1.1.0 (rc1)

2016-10-25 Thread Alex Rukletsov
This vote is cancelled. We'll cut RC2 later this week after the blockers are resolved. On Tue, Oct 25, 2016 at 5:48 AM, Zameer Manji wrote: > I'm going to -1 (non binding) for the same reason as David Robinson. > > I would classify the FD leak as serious and a violation of

On Mesos versioning and deprecation policy

2016-10-12 Thread Alex Rukletsov
Folks, There have been a bunch of online [1, 2] and offline discussions about our deprecation and versioning policy. I found that people—including myself—read the versioning doc [3] differently; moreover some aspects are not captured there. I would like to start a discussion around this topic by

Re: How to shutdown mesos-agent gracefully?

2016-10-12 Thread Alex Rukletsov
To make sure: you are aware of SIGUSR1? On Tue, Oct 11, 2016 at 5:37 PM, tommy xiao wrote: > Hi Ma, > > could you please input more background, why Maintenance feature is not > best option for your request? > > 2016-10-11 14:47 GMT+08:00 haosdent : > > >

Re: 1.1.0 release

2016-10-12 Thread Alex Rukletsov
, Alex & Till On Tue, Oct 11, 2016 at 5:30 PM, Alex Rukletsov <a...@mesosphere.io> wrote: > Folks, > > in preparation for Mesos 1.1.0 release we would like to ask people who > have worked on features in 1.1.0 to either: > * update the CHANGELOG and declare the feature impl

Re: 1.1.0 release

2016-10-11 Thread Alex Rukletsov
Folks, in preparation for Mesos 1.1.0 release we would like to ask people who have worked on features in 1.1.0 to either: * update the CHANGELOG and declare the feature implemented or experimental, make sure documentation is updated as well; * postpone to 1.2 and update the related epic; *

Re: what is the status on this?

2016-09-21 Thread Alex Rukletsov
out having agreement with your shepherd. Joseph Wu is driving the effort, get in touch with him and I'm sure you'll figure out the plan! On Tue, Sep 13, 2016 at 9:41 PM, kant kodali <kanth...@gmail.com> wrote: > @Alex Rukletsov I am sorry I took some time to respond. I am very excited &g

Re: mesos marathon roles

2016-09-08 Thread Alex Rukletsov
Vincent, role in a "consumed" resource can be "*", but the allocator will account this resource based on the consumer's role. In other words, if your Marathon is registered in role "prod", all "*" resources it consumes will be accounted for "prod" role. Hence yes, you can let everything

Re: what is the status on this?

2016-09-05 Thread Alex Rukletsov
desperately in search for alternative preferably using consul. I just > hear lot of positive response when comes it consul. It will be great to see > mesos and consul working together in which we would be ready to jump at it > and make a switch for YARN to Mesos. > > Thanks, > Kant &

Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Alex Rukletsov
+1 (binding) make check on Mac OS 10.11.6 with apple clang-703.0.31. DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky (MESOS-4570), but this does not seem to be a regression or a blocker. On Fri, Aug 12, 2016 at 10:30 PM, Radoslaw Gruchalski wrote: > I am trying

Re: [VOTE] Release Apache Mesos 1.0.0 (rc2)

2016-07-15 Thread Alex Rukletsov
Haosdent investigated the issue, and it seems that health checks do work for docker executor. Hence I retract my negative vote. On Fri, Jul 15, 2016 at 12:57 PM, Alex Rukletsov <a...@mesosphere.com> wrote: > -1 (binding): MESOS-5848 > <https://issues.apache.org/jira/browse/MESOS

Re: [VOTE] Release Apache Mesos 1.0.0 (rc2)

2016-07-15 Thread Alex Rukletsov
-1 (binding): MESOS-5848 . The fix is on the way. On Wed, Jul 13, 2016 at 1:19 AM, Zhitao Li wrote: > +1 (nonbinding) > > Tested by 1)running all tests on Mac OS, 2) perform upgrade and downgrade > on a small test cluster

Re: removed slace "ID": (131.154.96.172): health check timed out

2016-04-18 Thread Alex Rukletsov
I believe it's because slaves are able to connect to the master, but the master is not able to connect to the slaves. That's why you see them connected for some time and gone afterwards. On Mon, Apr 18, 2016 at 6:47 PM, Stefano Bianchi wrote: > Indeed, i dont know why, i

Re: removed slace "ID": (131.154.96.172): health check timed out

2016-04-18 Thread Alex Rukletsov
Does this also happen when master3 is leading? My guess is that you're not allowing incoming connections from master1 and master2 to slave3. Generally, masters should be able to connect to slaves, not just respond to their requests. On 18 Apr 2016 13:17, "Stefano Bianchi"

Re: Mesos agents across a WAN?

2016-03-31 Thread Alex Rukletsov
Jeff, regarding 3: we are investigating this: https://issues.apache.org/jira/browse/MESOS-3548 On Thu, Mar 31, 2016 at 3:56 AM, Jeff Schroeder wrote: > Given regional bare metal Mesos clusters on multiple continents, are there > any known issues running some of the

Re: Executors no longer inherit environment variables from the agent

2016-03-10 Thread Alex Rukletsov
I have two questions. First, does this change include the executor library? We currently use environment variables to propagate various config values from an agent to executors. If it does, what is the alternative? Second, what will be the preferred way to pass config values to executors? It

Re: Sync Mesos-Master to Slaves

2015-12-28 Thread Alex Rukletsov
-slave on ubuntu 14, to reproduce this comportement. > > When I deploy only on Ubuntu 14 master+slave, the issue disappear … > > Fred > > > > > > > On 09 Dec 2015, at 16:30, Alex Rukletsov <a...@mesosphere.com> wrote: > > Frederic, > > I have skimmed throug

Re: mesos-elasticsearch vs Elasticsearch with Marathon

2015-12-28 Thread Alex Rukletsov
/1.4/modules-discovery.html. > We deploy elasticsearch via Marathon and it works great. > > On Mon, Dec 28, 2015 at 2:17 PM, Eric LEMOINE <elemo...@mirantis.com> wrote: >> >> On Mon, Dec 28, 2015 at 7:55 PM, Alex Rukletsov <a...@mesosphere.com> >> wrote: >> > Eri

Re: mesos-elasticsearch vs Elasticsearch with Marathon

2015-12-28 Thread Alex Rukletsov
Eric— give me a chance to answer that before you fall into frustration : ). Also, you can directly write to framework developers (mesos...@container-solutions.com) and they either confirm or bust my guess. Or maybe one of the authors — Frank — will chime in in this thread. Marathon has no idea

Re: Sync Mesos-Master to Slaves

2015-12-09 Thread Alex Rukletsov
Frederic, I have skimmed through the logs and they do not seem to be complete (especially for master1). Could you please say what task has been killed (id) and which master failover triggered that? I see at least three failovers in the logs : ). Also, could you please share some background

Re: Verifying Zero Downtime Upgrade Process For Existing Mesos Cluster

2015-12-07 Thread Alex Rukletsov
Hi Abishek, I would strongly advise against skipping 6 versions. It's hard to say whether there were any changes that will prevent 0.25 masters from talking to 0.19 slaves (my intuition says there were some breaking changes to protobufs). We do *not* support upgrading by skipping versions, so please upgrade

Re: [VOTE] Release Apache Mesos 0.26.0 (rc3)

2015-12-02 Thread Alex Rukletsov
`make check -j7` — OK `make distcheck -j7` — fails, probably MESOS-3973 , see hints below. Both on Mac OS 10.10.4 I see the following lines in the log: ... libtool: warning: 'libmesos.la' has not been installed in

Re: Change roles and weights without restarting Mesos

2015-11-27 Thread Alex Rukletsov
Hey Mario, it's not possible right now, but there are several efforts which intend to fix it in the near future. Take a look at [1] and [2]. [1] https://issues.apache.org/jira/browse/MESOS-3988 [2] https://issues.apache.org/jira/browse/MESOS-3177 On Fri, Nov 27, 2015 at 2:24 PM, Mario

Re: Is it possible to monitor resource usage per-task for the same executor?

2015-11-02 Thread Alex Rukletsov
In Mesos, resources are isolated and accounted for per container. A task is basically a description; it is up to the executor how to interpret it. In some cases, for example if an executor *just* creates a message in its internal queue for incoming tasks, it is almost impossible to track resource usage

Re: How to trace offers given to services/frameworks

2015-09-29 Thread Alex Rukletsov
The master logs the number of offers it sends to a framework. If you need exact information about offer resources and you use the built-in allocator, run the master with `GLOG_v=2`, which will trigger detailed allocation logging in the built-in allocator. On Tue, Sep 29, 2015 at 10:35 AM,

Re: Fwd: [Breaking Change 0.24 & Upgrade path] ZooKeeper MasterInfo change.

2015-09-25 Thread Alex Rukletsov
James— Marco will correct me if I'm wrong, but my understanding is that this change does *not* impact what ZooKeeper version you can use with Mesos. We have changed the format of the message stored in ZK from protobuf to JSON. This message is needed by frameworks for mesos master leader

Re: Reservations for multiple different agents

2015-09-22 Thread Alex Rukletsov
Rinaldo, or you may try to install or port svn libs and check whether it works. On Tue, Sep 22, 2015 at 2:25 AM, Guangya Liu wrote: > Hi Rinaldo, > > The dynamic reservation endpoint support was introduced in 0.25.0, you may > want to use the latest code to build. > > If

Re: [VOTE] Release Apache Mesos 0.24.0 (rc2)

2015-09-05 Thread Alex Rukletsov
Afaik, Pythontest is flaky on OS X, and should be fine on Ubuntu. On 4 Sep 2015 10:48 pm, "Bernd Mathiske" wrote: > And also Ubuntu 13.10: [ FAILED ] ExamplesTest.PythonFramework, known > flaky test, so still +1 > > On Sep 4, 2015, at 9:11 PM, Bernd Mathiske

Re: How does mesos determine how much memory on a node is available for offer?

2015-09-03 Thread Alex Rukletsov
Mesos agent (aka slave) estimates the memory available and advertises all of it minus 1GB. If there is less than 2GB available, only half is advertised [1]. [1]: https://github.com/apache/mesos/blob/master/src/slave/containerizer/containerizer.cpp#L98 On Thu, Sep 3, 2015 at 4:01 AM, Anand
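The rule described in this message can be written out as a small worked example. This is only a sketch of the arithmetic as stated above, not the agent's actual code: the function name `advertised_mem_mb` and the use of megabytes as the unit are assumptions made for illustration.

```python
GB_IN_MB = 1024  # assumed unit for this sketch: megabytes


def advertised_mem_mb(available_mb: int) -> int:
    """Memory an agent would advertise under the rule quoted above.

    Hypothetical helper: with less than 2 GB available, half is
    advertised; otherwise everything except 1 GB is advertised.
    """
    if available_mb < 2 * GB_IN_MB:
        return available_mb // 2
    return available_mb - GB_IN_MB


if __name__ == "__main__":
    for mem in (1024, 2048, 8192):
        print(mem, "->", advertised_mem_mb(mem))
```

Note that the two branches agree at exactly 2 GB (both yield 1 GB), so the advertised amount has no jump at the threshold.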

Re: mesos-master resource offer details

2015-09-02 Thread Alex Rukletsov
urce offer made available - the cpu's being > offered and I'm stuck there.. > > I really appreciate if you have any suggestions! Thanks. > > On Wed, Sep 2, 2015 at 9:54 AM, Alex Rukletsov <a...@mesosphere.com> > wrote: > >> To what Haosdent said: you cannot get a li

Re: mesos-master resource offer details

2015-09-02 Thread Alex Rukletsov
To add to what Haosdent said: you cannot get a list of offers from master logs, but you can get a list of allocations from the built-in allocator if you bump up the log level (GLOG_v=2). On Wed, Sep 2, 2015 at 7:36 AM, haosdent wrote: > If the offer is rejected by your framework,

Re: Use docker start rather than docker run?

2015-08-28 Thread Alex Rukletsov
to wait to use that feature. You could also choose to implement a custom executor for now if you like. Tim On Fri, Aug 28, 2015 at 10:43 AM, Alex Rukletsov a...@mesosphere.com wrote: Paul, that component is called DockerContainerizer and it's part of Mesos Agent (check /Users/alex

Re: Are the resource options documented?

2015-08-25 Thread Alex Rukletsov
From Mesos' point of view, a resource is just a string; your agents may advertise gpu, bananas, pandas and so on. However, some resources are known to Mesos, and for them isolation is possible. A good example is the cgroups isolator for mem resources, which will invoke the OOM killer if necessary.

Re: Custom Scheduler: Diagnosing cause of container task failures

2015-08-25 Thread Alex Rukletsov
It looks like we can have a better error message here. @Jay, mind filing a JIRA ticket with a description, status update, and your fix attached? Thanks! On Fri, Aug 21, 2015 at 7:36 PM, Jay Taylor j...@jaytaylor.com wrote: Eventually I was able to isolate what was going on; in this case the

Re: Launching tasks with reserved resources

2015-08-17 Thread Alex Rukletsov
Hi Gidon, just to make sure, you mean static reservations on Mesos agents (via the --resources flag) and not dynamic reservations, right? Let me first try to explain why you get the TASK_ERROR message. The built-in allocator merges '*' and reserved resources, hinting the master to create a single

Re: Launching tasks with reserved resources

2015-08-17 Thread Alex Rukletsov
to be aware of the reservation policies. Regards, Gidon From:Alex Rukletsov a...@mesosphere.com To:user@mesos.apache.org Date:17/08/2015 01:02 PM Subject:Re: Launching tasks with reserved resources -- Hi Gidon, just

Re: [VOTE] Release Apache Mesos 0.23.0 (rc1)

2015-07-06 Thread Alex Rukletsov
-1 Compilation error on Mac OS 10.10.4 with clang 3.5, which is supported according to release notes. More details: https://issues.apache.org/jira/browse/MESOS-2991 On Mon, Jul 6, 2015 at 11:55 AM, Jörg Schad jo...@mesosphere.io wrote: P.S. to my prior +1 Tested on ubuntu-trusty-14.04

Re: When do executors shutdown?

2015-06-30 Thread Alex Rukletsov
in memory data). On 30 Jun 2015, at 12:32, Alex Rukletsov a...@mesosphere.com wrote: There are two types of tasks: (1) those that specify an executor and (2) those that specify a command. When a task of type (1) arrives at a slave, the slave checks whether an executor with the same executorID

Re: Setting minimum offer size

2015-06-30 Thread Alex Rukletsov
to configure what all filters it recognizes while making offers, which will also make the effect on scalability limited,as far as I understand. Thoughts? Thanks, Dharmesh On Sun, Jun 28, 2015 at 7:29 PM, Alex Rukletsov a...@mesosphere.com wrote: Sharma, that's exactly what we plan to add to Mesos

Re: Setting minimum offer size

2015-06-28 Thread Alex Rukletsov
Sharma, that's exactly what we plan to add to Mesos. Dynamic reservations will land in 0.23; the next step is to optimistically offer reserved but as yet unused resources (we call them optimistic offers) to other frameworks as revocable. The alternative with one framework will of course work, but

Re: what reason caused the high cached memory by rcuos process

2015-06-22 Thread Alex Rukletsov
is what issue spawn many more rcuos process. i only running mesos-slave instance. 2015-05-26 23:50 GMT+08:00 Alex Rukletsov a...@mesosphere.com: What exactly is your concern? On Mon, May 25, 2015 at 2:45 PM, tommy xiao xia...@gmail.com wrote: Today i setup a testing cluster in azure Cloud

Re: Resource modelling questions

2015-06-19 Thread Alex Rukletsov
Inlined. On Fri, Jun 19, 2015 at 4:17 AM, zhou weitao zhouwtl...@gmail.com wrote: Alex, hi, 2015-06-18 23:25 GMT+08:00 Alex Rukletsov a...@mesosphere.com: Zhou, I haven't read the *Design* yet, but I don't think it is solving the same question between priorities and quota. For example

Re: Resource modelling questions

2015-06-18 Thread Alex Rukletsov
Zhou, I haven't read the *Design* yet, but I don't think it is solving the same question between priorities and quota. For example, assume we only have 10G memory reservating for framework A totally, then another urgency framework is getting nothing. which is statical partition still. While

Re: mesosphere.io broken?

2015-06-17 Thread Alex Rukletsov
For downloads, use https://mesosphere.com/downloads/ Elastic Mesos has been decommissioned, use https://google.mesosphere.com/ or https://digitalocean.mesosphere.com/ but keep in mind they will be decommissioned soon (~1 month) as well. However, if you want to try DCOS installation on AWS, check

Re: Setting Rate of Resource Offers

2015-06-14 Thread Alex Rukletsov
Christopher, try adjusting the master's allocation_interval flag. It specifies how often the allocator performs batch allocations to frameworks. As Ondrej pointed out, if your framework explicitly declines offers, it won't be re-offered the same resources for some period of time. On Sat, Jun 13, 2015 at

Re: Can Mesos master offer resources to multiple frameworks simultaneously?

2015-06-10 Thread Alex Rukletsov
I'll try to answer these questions. 1. Currently, the only language you can use is C++. You can work around this by writing a proxy in C++ that delegates the calls to, say, Python scripts. See http://mesos.apache.org/documentation/latest/allocation-module/ for more details. 2. The default

Re: 答复: [DISCUSS] Renaming Mesos Slave

2015-06-08 Thread Alex Rukletsov
While I'm apathetic to changing the name, I think we should do more than just vote on an alternate name in case we decide to proceed and replace the master/slave terminology. Such a change is very expensive, and it makes sense to do it once rather than to rush and pick an ambiguous term. If we make

Re: Restarting mesos-slave in a node restarts all the apps

2015-05-29 Thread Alex Rukletsov
Siva, yes, this is intended behaviour: keep tasks running and give the Mesos Worker some time to re-register. You can adjust this timeout via --slave_reregister_timeout, but keep in mind 10 min is the minimum. On Fri, May 29, 2015 at 8:04 AM, Sivaram Kannan sivara...@gmail.com wrote: Hi ,

Re: Reminder: /stats.json is deprecated

2015-05-20 Thread Alex Rukletsov
Reminder: please don't forget to update your code to use the /metrics/snapshot endpoint instead of the deprecated /stats.json prior to the Mesos 0.23 release. On Wed, Apr 8, 2015 at 1:07 PM, Alex Rukletsov a...@mesosphere.com wrote: Folks, if you build tooling around Mesos, please be advised that in current
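For tooling that still reads /stats.json, the switch might look roughly like the sketch below. It assumes the endpoint returns a flat JSON object mapping metric names to numbers and that the master listens on port 5050; the helper names (`snapshot_url`, `parse_snapshot`, `fetch_snapshot`) and the sample payload are made up for illustration and are not part of Mesos.

```python
import json
from urllib.request import urlopen


def snapshot_url(host: str, port: int = 5050) -> str:
    """Build the metrics URL; port 5050 is an assumed default."""
    return f"http://{host}:{port}/metrics/snapshot"


def parse_snapshot(body: str) -> dict:
    """Parse the response body, assumed to be a flat JSON object."""
    return json.loads(body)


def fetch_snapshot(host: str, port: int = 5050, timeout: float = 5.0) -> dict:
    """Fetch and parse a metrics snapshot from a running master."""
    with urlopen(snapshot_url(host, port), timeout=timeout) as resp:
        return parse_snapshot(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Illustrative payload only, not captured from a real cluster.
    sample = '{"master/uptime_secs": 12.5}'
    print(parse_snapshot(sample)["master/uptime_secs"])
```

Keeping URL construction and parsing separate from the network call makes it possible to check the tooling against a saved response before pointing it at a live master.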
