On augmenting TLS configuration options in libprocess

2019-05-24 Thread Alex Rukletsov
Folks, We reviewed TLS configuration options in libprocess and came up with the following proposal [1] to allow for certificate verification in client mode only. In short, the proposal suggests to add two flags to libprocess so that it can be configured to: * always require presence and verify se

Re: '*.json' endpoints removed in 1.7

2019-05-11 Thread Alex Rukletsov
f > breaking some user/tooling, in my opinion. We could revisit this if and > when we do a Mesos 2.0. > > On Wed, Aug 8, 2018 at 9:25 AM Alex Rukletsov wrote: > > > Folks, > > > > The long ago deprecated '*.json' endpoints will be removed in Mesos > 1.7.0.

Re: [VOTE] Release Apache Mesos 1.4.3 (rc1)

2019-01-28 Thread Alex Rukletsov
This will be the last official 1.4.x release. Even though we agreed to keep the branch and occasionally back port fixes to it post last release, maybe it makes sense to include all pending patches into 1.4.3? I see for example Gilbert added the fix for MESOS-9532 [1]. We were also considering back

Re: Join us at MesosCon 2018 next week!

2018-11-07 Thread Alex Rukletsov
I'd like to thank everyone involved in organising this MesosCon, and especially Gastón, Jörg, and Andy. I enjoyed the laid-back "underground" style this year; it was easy to engage in conversations with users and Mesos developers. Looking forward to the next MesosCon! Alex On Thu, Nov 1, 2018 at

On committer candidate nomination

2018-10-16 Thread Alex Rukletsov
Folks, A seemingly complex and long path to become a committer can drive away potential candidates shortly after they start contributing to the project. Around a year ago Jim Jagielski raised a concern about the high entry bar we have in the project. We heard the feedback and decided to liberalize

Re: [VOTE] Release Apache Mesos 1.7.0 (rc3)

2018-09-14 Thread Alex Rukletsov
+1 (binding) Mesosphere's internal CI run with the aforementioned tag. Observed 4 flaky tests, 3 are known: https://issues.apache.org/jira/browse/MESOS-5048 https://issues.apache.org/jira/browse/MESOS-8260 https://issues.apache.org/jira/browse/MESOS-8951 One has been introduced as part of adding

Re: [VOTE] Release Apache Mesos 1.7.0 (rc1)

2018-08-22 Thread Alex Rukletsov
MESOS-9177 has been filed today. It is very likely a regression introduced by one of the state.json improvements. We are still investigating, but it is obviously a -1 (binding) for rc1. Alex. On Wed, Aug 22, 2018 at 4:34 AM, Chun-Hung Hsiao wrote: > Hi all, > > Please vote on releasing the f

Re: [VOTE] Release Apache Mesos 1.4.2 (rc1)

2018-08-20 Thread Alex Rukletsov
+1 binding (make check on Mac OS 10.13.5) On Mon, Aug 20, 2018 at 8:28 PM, Kapil Arya wrote: > +1 binding (internal CI). > > The Apache CI failures reported by Vinod are all known flaky tests. I have > inserted the details inline. > > Best, > Kapil > > On Tue, Aug 14, 2018 at 11:03 AM Vinod Kone

'*.json' endpoints removed in 1.7

2018-08-08 Thread Alex Rukletsov
Folks, The long ago deprecated '*.json' endpoints will be removed in Mesos 1.7.0. Please use their non-'.json' counterparts instead. Commit: https://github.com/apache/mesos/commit/42551cb5290b7b04101f7d800b4b8fd573e47b91 JIRA ticket: https://issues.apache.org/jira/browse/MESOS-4509 Alex.

Re: [VOTE] Release Apache Mesos 1.3.3 (rc1)

2018-07-20 Thread Alex Rukletsov
MPark— what's the decision regarding the 1.3.3 release? On Mon, Jul 9, 2018 at 8:52 PM, Michael Park wrote: > I'm considering simply abandoning the 1.3.3 release and bringing the 1.3.x > branch to end of life. > If anyone really wants a 1.3.3, I'm certainly willing to finish the > release porti

Re: Backport Policy

2018-07-16 Thread Alex Rukletsov
general we should backport only as necessary, and leave it on > the committers to decide if backporting a particular change is necessary. > > > On 07/13/2018 12:54 am, Alex Rukletsov wrote: > >> This is exactly where our views differ, Ben : ) >> >> Ideally, I would

Re: Backport Policy

2018-07-13 Thread Alex Rukletsov
low, I generally backport every bug fix I commit > that applies cleanly, right after I commit it to master (with the > exceptions I listed below). > > On Thu, Jul 12, 2018 at 8:39 AM, Alex Rukletsov > wrote: > > > I would like to back port as little as possible. I suggest

Re: Proposing change to the allocatable check in the allocator

2018-06-12 Thread Alex Rukletsov
Instead of the master flag, why not a master API call? This will allow updating the value without restarting the master. Another thought is that we should explain to operators how and when to use this knob. For example, if they observe a behavioural pattern A, then it means B is happening, and tunin

Re: Update the *Minimum Linux Kernel version* supported on Mesos

2018-04-08 Thread Alex Rukletsov
This does not seem to me as a disruptive change, so I'm +1. On Thu, Apr 5, 2018 at 6:36 PM, Jie Yu wrote: > User namespaces require >= 3.12 (November 2013). Can we make that the >> minimum? > > > No, we need to support CentOS7 which uses 3.10 (some variant) > > - Jie > > On Thu, Apr 5, 2018 at 8

Re: Release policy and 1.6 release schedule

2018-03-26 Thread Alex Rukletsov
I would like us to do monthly releases and support 10 branches at a time. Ideally, releasing that often reduces the burden for the release manager, because there are fewer changes and fewer new features. However, we lack automation to support this pace: our release guide [1] is several pages long and

Re: Mesos 1.5.0 Release

2017-12-22 Thread Alex Rukletsov
https://issues.apache.org/jira/browse/MESOS-8297 has just landed. Let's include it in 1.5.0 as well. On Fri, Dec 22, 2017 at 4:35 AM, Jie Yu wrote: > Yeah, I am doing a grooming right now. > > Sent from my iPhone > > > On Dec 21, 2017, at 7:25 PM, Benjamin Mahler wrote: > > > > Meng is working

[RESULT][VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-31 Thread Alex Rukletsov
Hi all, The vote for Mesos 1.1.3 (rc2) has passed with the following votes. +1 (Binding) -- Alex R Till Tönshoff Vinod Kone There were no 0 or -1 votes. Please find the release at: https://dist.apache.org/repos/dist/release/mesos/1.1.3 It is recommended to use a mir

Re: [VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-31 Thread Alex Rukletsov
iew/Mesos/job/Mesos- > Release/40/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose, > ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04, > label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/> > > cmake > > <https://builds.apache.org/view/M-R

[VOTE] Release Apache Mesos 1.1.3 (rc2)

2017-08-25 Thread Alex Rukletsov
Folks, Please vote on releasing the following candidate as Apache Mesos 1.1.3. Note that this will be the last 1.1.x release. 1.1.3 includes the following: ** Bug  * [MESOS-5187] - The filesystem/linux isolator does

Re: Mesos 1.1.3 release

2017-08-17 Thread Alex Rukletsov
We have two more issues that I would like to have in 1.1.3 because it's the last 1.1.x release: https://issues.apache.org/jira/browse/MESOS-7865 https://issues.apache.org/jira/browse/MESOS-7863 They are in review and will be back ported soon. On Tue, Jul 25, 2017 at 11:28 AM, Alex Rukl

Re: Mesos 1.1.3 release

2017-07-25 Thread Alex Rukletsov
MESOS-7643 is still unresolved. I am moving the cut date for one more week, because this is the last patch release for 1.1.x. On Fri, Jul 14, 2017 at 6:34 PM, Alex Rukletsov wrote: > Folks, > > We are planning to cut the 1.1.3 release once MESOS-7643 is resolved. If > you have an

Re: Mesos 1.1.3 release

2017-07-14 Thread Alex Rukletsov
://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12331463 Till & Alex. On Wed, Jun 14, 2017 at 12:59 PM, Alex Rukletsov wrote: > Folks, > > there are only 2 back ported tickets to the 1.1.x branch so far (MESOS-7540 > and MESOS-7569). Since this will be the last 1.1.x

Re: Executors and CPU allocations

2017-06-26 Thread Alex Rukletsov
Regarding your second idea, you may have a "dummy" task with, say, 1.8 CPU and "run" it iff there is at least another real task running, while assigning 0.1 CPU for your executor. You can do some bookkeeping in the executor to determine whether a certain executor is idle (and hence a "dummy" task s

On Apache Mesos release process

2017-06-17 Thread Alex Rukletsov
Folks, for more than a year Apache Mesos releases are done according to our "then new" release policy [1]. It seems to work quite well, but today I would like to address things that can be improved. Let's start with pain points: * A minor bug can cancel a release vote, even for a patch release. *

Mesos 1.1.3 release

2017-06-14 Thread Alex Rukletsov
Folks, there are only 2 back ported tickets to the 1.1.x branch so far (MESOS-7540 and MESOS-7569). Since this will be the last 1.1.x release, we are delaying it for 3 more weeks to leave more time for people to include critical bug fixes. Till & Alex.

Re: [VOTE] Release Apache Mesos 1.2.1 (rc1)

2017-06-12 Thread Alex Rukletsov
PortMapping tests are indeed in bad shape. There are JIRAs already, have a look before filing new ones: MESOS-4646, MESOS-5687, MESOS-2765, MESOS-5690, MESOS-5688, MESOS-5689, MESOS-4643, MESOS-4644, MESOS-5309 On Sat, Jun 10, 2017 at 10:58 AM, Adam Bordelon wrote: > +1 (binding) Good enough fo

[RESULT][VOTE] Release Apache Mesos 1.1.2 (rc2)

2017-05-19 Thread Alex Rukletsov
Hi all, The vote for Mesos 1.1.2 (rc2) has passed with the following votes. +1 (Binding) -- Vinod Kone Till Tönshoff Alex Rukletsov There were no 0 or -1 votes. Please find the release at: https://dist.apache.org/repos/dist/release/mesos/1.1.2 It is recommended to

[VOTE] Release Apache Mesos 1.1.2 (rc2)

2017-05-12 Thread Alex Rukletsov
Folks, Please vote on releasing the following candidate as Apache Mesos 1.1.2. 1.1.2 includes the following: ** Bug * [MESOS-2537] - AC_ARG_ENABLED checks are broken. * [MESOS-5028] - Copy provisioner cannot repl

Re: [VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-12 Thread Alex Rukletsov
.2 >> https://issues.apache.org/jira/browse/MESOS-7471 >> >> On Thu, May 4, 2017 at 12:07 PM, Alex Rukletsov >> wrote: >> >>> Hi all, >>> >>> Please vote on releasing the following candidate as Apache Mesos 1.1.2. >>> >>> 1

Re: [VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-10 Thread Alex Rukletsov
This vote is cancelled. Vinod, I'll look into the failure and report back. After that, I'll start a new vote. On 9 May 2017 10:07 am, "Jie Yu" wrote: > -1 > > I suggest we include this fix in 1.1.2 > https://issues.apache.org/jira/browse/MESOS-7471 > >

[VOTE] Release Apache Mesos 1.1.2 (rc1)

2017-05-04 Thread Alex Rukletsov
Hi all, Please vote on releasing the following candidate as Apache Mesos 1.1.2. 1.1.2 includes the following: ** Bug * [MESOS-2537] - AC_ARG_ENABLED checks are broken. * [MESOS-5028] - Copy provisioner cannot rep

Re: Default executor grace period

2017-04-25 Thread Alex Rukletsov
Commented on the ticket. On Tue, Jan 17, 2017 at 12:27 PM, Tomek Janiszewski wrote: > Created issue for this: https://issues.apache.org/jira/browse/MESOS-6933 > > On Mon, 16 Jan 2017 at 17:13, Tomek Janiszewski > wrote: > >> It looks like it's supported because executor prints grace per

Re: mesos container cluster came across health check coredump log

2017-03-31 Thread Alex Rukletsov
Cool, looking forward to it! On Fri, Mar 31, 2017 at 4:30 AM, tommy xiao wrote: > Alex,Yes, let me have a try. > > 2017-03-31 3:16 GMT+08:00 Alex Rukletsov : > >> This is https://issues.apache.org/jira/browse/MESOS-7210. Deshi, do you >> want to send the patch? I

Re: mesos container cluster came across health check coredump log

2017-03-30 Thread Alex Rukletsov
This is https://issues.apache.org/jira/browse/MESOS-7210. Deshi, do you want to send the patch? I or Haosdent can shepherd. A. On Thu, Mar 30, 2017 at 12:27 PM, tommy xiao wrote: > interesting for the specified case. > > 2017-03-30 7:52 GMT+08:00 Jie Yu : > >> + AlexR, haosdent >> >> For poster

[RESULT][VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-14 Thread Alex Rukletsov
Hi folks, The vote for Mesos 1.1.1 (rc2) has passed with the following votes. +1 (Binding) -- *** AlexR *** Till Tönshoff *** Vinod Kone There were no 0 or -1 votes. Please find the release at: https://dist.apache.org/repos/dist/release/mesos/1.1.1 It is recommende

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-14 Thread Alex Rukletsov
> > On Mar 4, 2017, at 1:09 AM, Vinod Kone wrote: > > +1 (binding) > > Since the perf issue I reported earlier doesn't seem to be a blocker. > > On Fri, Mar 3, 2017 at 12:14 AM, Alex Rukletsov > wrote: > >> Was this perf issue introduced by one of the fixe

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-03 Thread Alex Rukletsov
> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos- > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=-- > verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu% > 3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ub

[VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-02-27 Thread Alex Rukletsov
Hi all, Please vote on releasing the following candidate as Apache Mesos 1.1.1. 1.1.1 includes the following: ** Bug * [MESOS-6002] - The whiteout file cannot be removed correctly using aufs backend. * [MESOS-60

Re: customized IP for health check

2017-01-18 Thread Alex Rukletsov
I'm not sure that exposing a domain will help: do you know the IP of your task upfront, i.e., at the moment when you construct TaskInfo? Isn't your task listening on all interfaces? On Wed, Jan 18, 2017 at 9:54 AM, CmingXu wrote: > the network I am currently used is USER, and each task was assig

[Design Doc] Arbitrary task checks in Mesos

2017-01-05 Thread Alex Rukletsov
We've recently been working on a design for arbitrary task checks [1] in Mesos (currently called probes, but this will likely change). Please have a look and leave comments on the doc or st

Mesos 1.1.1 release dashboard

2016-12-22 Thread Alex Rukletsov
Folks, We are planning to cut the 1.1.1 release early next week. If you have any patches that need to get into 1.1.1, please make sure that either it is already in the 1.1.x branch or the corresponding ticket has a target version including 1.1.1 *by Monday* Dec 26. The release dashboard: https://

Re: Mesos on AWS

2016-12-21 Thread Alex Rukletsov
Kiril— from what you described it does not sound like the problem is the Linux distribution. It may be your AWS configuration. However, if a combination of health checks and heavy loaded agent leads to the agent termination — I would like to investigate this issue. Please come back—with logs!—if y

Re: Proposal: mesosadm, the command to bootstrap the mesos cluster.

2016-12-14 Thread Alex Rukletsov
I have a different opinion on this. Several years ago I came across the concept of "mean wizards" — any helpers that hide away important steps from the user and hence do not give them opportunity to learn how things actually work. (If you're interested it was about projects in Borland IDEs that wer

Re: Can I consider other framework tasks as a resource? Does it make sense?

2016-12-14 Thread Alex Rukletsov
Task dependency is probably too vague to discuss specifically. Mesos currently does not explicitly support arbitrary task dependencies. You mentioned colocation, one type of dependency, so let's look at it. If I understood you correctly, you would like to colocate a task from framework B to the sa

Re: Command healthcheck failed but status KILLED

2016-12-12 Thread Alex Rukletsov
Technically the task has not failed but was killed by the executor (because it failed a health check). On Fri, Dec 9, 2016 at 11:27 AM, Tomek Janiszewski wrote: > Hi > > What is desired behavior when command health check failed? On Mesos 1.0.2 > when health check fails task has state KILLED ins

Re: Duplicate task IDs

2016-12-11 Thread Alex Rukletsov
I'm fine with prohibiting non-unique IDs, but why do you plan to keep the most recent in case of a conflict? I'd expect any duplicate (that we can find out) is rejected / killed / banned / unchurched. On 9 Dec 2016 8:13 pm, "Joris Van Remoortere" wrote: > Hey Neil, > > I concur that using duplic

Re: Quota

2016-12-11 Thread Alex Rukletsov
Granularity in the allocator is a single agent. Hence even though you set quota for 0.0001 CPU, at least one agent is "blocked". This is probably the reason why marathon is not getting offers. You can turn on verbose master logs and check allocator messages to confirm. Alex. On 10 Dec 2016 2:14 am,

Re: healthcheck task?

2016-12-07 Thread Alex Rukletsov
Protos.HealthCheck.html > but not a single example of how to use it > > On Wed, Dec 7, 2016 at 11:22 AM, Alex Rukletsov > wrote: > >> What exactly do you mean under "health check task"? >> >> On Wed, Dec 7, 2016 at 5:09 PM, Victor L wrote: >>

Re: healthcheck task?

2016-12-07 Thread Alex Rukletsov
What exactly do you mean under "health check task"? On Wed, Dec 7, 2016 at 5:09 PM, Victor L wrote: > Can someone recommend simple example of how to add healthcheck task to > java framework? > Thanks, > >

Re: [VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-11-30 Thread Alex Rukletsov
-eu2)/> >> cmake >> [image: Success] >> <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/25/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ub

Re: [VOTE] Release Apache Mesos 0.28.3 (rc1)

2016-11-28 Thread Alex Rukletsov
I see LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem failing on CentOS 7 and Fedora 23, see e.g., [1]. I don't see any backports touching [2], can it be a regression or is this test known to be problematic in 0.28.x? [1] http://pastebin.com/c5PzfGF8 [2] https://github.com/apache/mesos/blob/0

Re: [VOTE] Release Apache Mesos 1.0.2 (rc3)

2016-11-10 Thread Alex Rukletsov
+1 (binding) Tested in internal CI. On Mon, Nov 7, 2016 at 8:24 PM, Vinod Kone wrote: > Hi all, > > > Please vote on releasing the following candidate as Apache Mesos 1.0.2. > > > This is a bug fix release. > > > The CHANGELOG for the release is available at: > > https://git-wip-us.apache.org/r

On increasing visibility into experimental features.

2016-11-01 Thread Alex Rukletsov
Folks, Additionally to the "known bugs" proposal in a parallel thread, we think that maintaining a list of still experimental features for each minor release will significantly help users to adjust their expectations. Our suggestion is to include a new section into the CHANGELOG called "Experimen

Re: default docker stop timeout

2016-11-01 Thread Alex Rukletsov
Note that this flag is deprecated in Mesos 1.0.0 in favor of kill policies. On Wed, Oct 26, 2016 at 3:36 PM, Hendrik Haddorp wrote: > right, that's what I found and it also works. I was just wondering if the > default is a good choice. > > On 26.10.2016 11:45, haosdent wrote: > >> It is because

Transition TASK_KILLING -> TASK_RUNNING

2016-10-31 Thread Alex Rukletsov
We've recently discovered a bug that may lead to a task being transitioned from killing to running state. More information about it in MESOS-6457 [1]. We plan to fix it in 1.2.0 and will backport it to all supported versions. [1] https://issues.apache.org/jira/browse/MESOS-6457

Re: [VOTE] Release Apache Mesos 1.1.0 (rc1)

2016-10-25 Thread Alex Rukletsov
This vote is cancelled. We'll cut RC2 later this week after the blockers are resolved. On Tue, Oct 25, 2016 at 5:48 AM, Zameer Manji wrote: > I'm going to -1 (non binding) for the same reason as David Robinson. > > I would classify the FD leak as serious and a violation of the isolation > that t

On Mesos versioning and deprecation policy

2016-10-12 Thread Alex Rukletsov
Folks, There have been a bunch of online [1, 2] and offline discussions about our deprecation and versioning policy. I found that people—including myself—read the versioning doc [3] differently; moreover some aspects are not captured there. I would like to start a discussion around this topic by s

Re: How to shutdown mesos-agent gracefully?

2016-10-12 Thread Alex Rukletsov
To make sure: you are aware of SIGUSR1? On Tue, Oct 11, 2016 at 5:37 PM, tommy xiao wrote: > Hi Ma, > > could you please input more background, why Maintenance feature is not > best option for your request? > > 2016-10-11 14:47 GMT+08:00 haosdent : > > > gracefully means not affect running task

Re: 1.1.0 release

2016-10-12 Thread Alex Rukletsov
, Alex & Till On Tue, Oct 11, 2016 at 5:30 PM, Alex Rukletsov wrote: > Folks, > > in preparation for Mesos 1.1.0 release we would like to ask people who > have worked on features in 1.1.0 to either: > * update the CHANGELOG and declare the feature implemented or > e

Re: 1.1.0 release

2016-10-11 Thread Alex Rukletsov
Folks, in preparation for Mesos 1.1.0 release we would like to ask people who have worked on features in 1.1.0 to either: * update the CHANGELOG and declare the feature implemented or experimental, make sure documentation is updated as well; * postpone to 1.2 and update the related epic; * promote

Re: forcing framework to re-schedule?

2016-09-22 Thread Alex Rukletsov
Victor, could you please describe your case in detail? I would like to understand why standard mesos health checks won't suit your case. On Wed, Sep 14, 2016 at 8:15 AM, haosdent wrote: > Hi, @Victor taskId is specified in `TaskInfo` when you launchTask. > > On Wed, Sep 14, 2016 at 6:22 AM, Vi

Re: what is the status on this?

2016-09-21 Thread Alex Rukletsov
out having agreement with your shepherd. Joseph Wu is driving the effort, get in touch with him and I'm sure you'll figure out the plan! On Tue, Sep 13, 2016 at 9:41 PM, kant kodali wrote: > @Alex Rukletsov I am sorry I took some time to respond. I am very excited > since the beginning

Re: mesos marathon roles

2016-09-08 Thread Alex Rukletsov
Vincent, role in a "consumed" resource can be "*", but the allocator will account this resource based on the consumer's role. In other words, if your Marathon is registered in role "prod", all "*" resources it consumes will be accounted for "prod" role. Hence yes, you can let everything unreserve

Re: what is the status on this?

2016-09-05 Thread Alex Rukletsov
en comes it consul. It will be great to see > mesos and consul working together in which we would be ready to jump at it > and make a switch for YARN to Mesos. > > Thanks, > Kant > > > > > On Wed, Aug 31, 2016 1:03 AM, Alex Rukletsov a...@mesosphere.com wrote: >

Re: what is the status on this?

2016-08-31 Thread Alex Rukletsov
Kant— mind telling us what is your use case and why this ticket is important for you? It will help us prioritize work. On Fri, Aug 26, 2016 at 2:46 AM, tommy xiao wrote: > Hi guys, i always focus on t his case. but good news is etcd always have > patchs. so the coming consul is very easy, just

Re: what's the difference between mesos and yarn?

2016-08-24 Thread Alex Rukletsov
On Tue, Aug 23, 2016 at 10:04 AM, Yu Wei wrote: > It seems that yarn could also provide functionality to allocate resource > for applications. > > Yarn could be modified to support various resource allocation > requirements. For mesos, frameworks also need to be written to satisfy > customer requ

Re: [VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-12 Thread Alex Rukletsov
+1 (binding) make check on Mac OS 10.11.6 with apple clang-703.0.31. DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky (MESOS-4570), but this does not seem to be a regression or a blocker. On Fri, Aug 12, 2016 at 10:30 PM, Radoslaw Gruchalski wrote: > I am trying to build Mesos 1.0.1 f

Re: [VOTE] Release Apache Mesos 1.0.0 (rc2)

2016-07-15 Thread Alex Rukletsov
Haosdent investigated the issue, and it seems that health checks do work for docker executor. Hence I retract my negative vote. On Fri, Jul 15, 2016 at 12:57 PM, Alex Rukletsov wrote: > -1 (binding): MESOS-5848 > <https://issues.apache.org/jira/browse/MESOS-5848>. The fix is on the

Re: [VOTE] Release Apache Mesos 1.0.0 (rc2)

2016-07-15 Thread Alex Rukletsov
-1 (binding): MESOS-5848. The fix is on the way. On Wed, Jul 13, 2016 at 1:19 AM, Zhitao Li wrote: > +1 (nonbinding) > > Tested by 1)running all tests on Mac OS, 2) perform upgrade and downgrade > on a small test cluster for both master and slav

Re: removed slace "ID": (131.154.96.172): health check timed out

2016-04-18 Thread Alex Rukletsov
I believe it's because slaves are able to connect to the master, but the master is not able to connect to the slaves. That's why you see them connected for some time and gone afterwards. On Mon, Apr 18, 2016 at 6:47 PM, Stefano Bianchi wrote: > Indeed, i dont know why, i am not able to reach all

Re: removed slace "ID": (131.154.96.172): health check timed out

2016-04-18 Thread Alex Rukletsov
Does this also happen when master3 is leading? My guess is that you're not allowing incoming connections from master1 and master2 to slave3. Generally, masters should be able to connect to slaves, not just respond to their requests. On 18 Apr 2016 13:17, "Stefano Bianchi" wrote: > Hi > On opensta

Re: Mesos agents across a WAN?

2016-03-31 Thread Alex Rukletsov
Jeff, regarding 3: we are investigating this: https://issues.apache.org/jira/browse/MESOS-3548 On Thu, Mar 31, 2016 at 3:56 AM, Jeff Schroeder wrote: > Given regional bare metal Mesos clusters on multiple continents, are there > any known issues running some of the agents over the WAN? Is anyon

Re: Executors no longer inherit environment variables from the agent

2016-03-10 Thread Alex Rukletsov
I have two questions. First, does this change include the executor library? We currently use environment variables to propagate various config values from an agent to executors. If it does, what is the alternative? Second, what will be the preferred way to pass config values to executors? It woul

Re: Access to Design Doc

2016-01-14 Thread Alex Rukletsov
I think public docs shared in public JIRA tickets should have "anyone with the link" permission. In my experience, people tend to read design docs months after the corresponding feature has landed. Moreover, since Apache Mesos is an open project it feels right that everyone can at least read publis

Re: mesos-elasticsearch vs Elasticsearch with Marathon

2015-12-28 Thread Alex Rukletsov
We deploy elasticsearch via Marathon and it works great. > > On Mon, Dec 28, 2015 at 2:17 PM, Eric LEMOINE wrote: >> >> On Mon, Dec 28, 2015 at 7:55 PM, Alex Rukletsov >> wrote: >> > Eric— >> > >> > give me a chance to answer that before you fal

Re: mesos-elasticsearch vs Elasticsearch with Marathon

2015-12-28 Thread Alex Rukletsov
Eric— give me a chance to answer that before you fall into frustration : ). Also, you can directly write to framework developers (mesos...@container-solutions.com) and they either confirm or bust my guess. Or maybe one of the authors — Frank — will chime in in this thread. Marathon has no idea ab

Re: Role-related configuration in Mesos

2015-12-28 Thread Alex Rukletsov
An example that clarifies Benjamin's point: quota is set per role indeed, but it may change in the future (I can envision quotas for individual frameworks as well). I think: * It would be great to merge relevant actions into one endpoint and express the difference via http verbs ("/reservation"

Re: Sync Mesos-Master to Slaves

2015-12-28 Thread Alex Rukletsov
ce this comportement. > > When I deploy only on Ubuntu 14 master+slave, the issue disappear … > > Fred > > > > > > > On 09 Dec 2015, at 16:30, Alex Rukletsov wrote: > > Frederic, > > I have skimmed through the logs and they do not seem to be comple

Re: Sync Mesos-Master to Slaves

2015-12-09 Thread Alex Rukletsov
Frederic, I have skimmed through the logs and they do not seem to be complete (especially for master1). Could you please say what task has been killed (id) and which master failover triggered that? I see at least three failovers in the logs : ). Also, could you please share some background abo

Re: Verifying Zero Downtime Upgrade Process For Existing Mesos Cluster

2015-12-07 Thread Alex Rukletsov
Hi Abishek, I would strongly advise not to skip 6 versions. It's hard to say whether there were any changes that will prevent 0.25 masters from talking to 0.19 slaves (my intuition says there were some breaking changes to protobufs). We do *not* support upgrades that skip versions, so please upgrade to

Re: Sync Mesos-Master to Slaves

2015-12-05 Thread Alex Rukletsov
Hey Fred, Logs (master and slave) can be helpful to shed some light on the problem. On 2 Dec 2015 3:01 pm, "Frederic LE BRIS" wrote: > Hi, > > I manage a Mesos Cluster 0.23.0 based on .deb from Mesosphere on Ubuntu > 14.04. > > We deployed 3 zookeeper, 3 Mesos-master, and 3 Marathon : HA Mode >

Re: [VOTE] Release Apache Mesos 0.26.0 (rc3)

2015-12-02 Thread Alex Rukletsov
`make check -j7` — OK `make distcheck -j7` — fails, probably MESOS-3973 , see hints below. Both on Mac OS 10.10.4 I see the following lines in the log: ... libtool: warning: 'libmesos.la' has not been installed in '/Users/alex/Projects/mesos/build

Re: Change roles and weights without restarting Mesos

2015-11-27 Thread Alex Rukletsov
Hey Mario, it's not possible right now, but there are several efforts which intend to fix it in the nearest future. Take a look at [1] and [2]. [1] https://issues.apache.org/jira/browse/MESOS-3988 [2] https://issues.apache.org/jira/browse/MESOS-3177 On Fri, Nov 27, 2015 at 2:24 PM, Mario Pastore

Re: Is it possible to monitor resource usage per-task for the same executor?

2015-11-02 Thread Alex Rukletsov
In mesos, resources are isolated and accounted per container. A task is basically a description, it is up to an executor how to interpret it. In some cases, for example if an executor *just* creates a message in its internal queue for incoming tasks, it is almost impossible to track resource usage

Re: How to trace offers given to services/frameworks

2015-09-29 Thread Alex Rukletsov
The master logs the number of offers it sends to a framework. If you need exact information about offer resources and you use the built-in allocator, run the master with `GLOG_v=2`, which will trigger detailed allocation logging in the built-in allocator. On Tue, Sep 29, 2015 at 10:35 AM, tomm

Re: Fwd: [Breaking Change 0.24 & Upgrade path] ZooKeeper MasterInfo change.

2015-09-25 Thread Alex Rukletsov
James— Marco will correct me if I'm wrong, but my understanding is that this change does *not* impact what ZooKeeper version you can use with Mesos. We have changed the format of the message stored in ZK from protobuf to JSON. This message is needed by frameworks for mesos master leader detection.

Re: Reservations for multiple different agents

2015-09-22 Thread Alex Rukletsov
Rinaldo, or you may try to install or port svn libs and check whether it works. On Tue, Sep 22, 2015 at 2:25 AM, Guangya Liu wrote: > Hi Rinaldo, > > The dynamic reservation endpoint support was introduced in 0.25.0, you may > want to use the latest code to build. > > If build fails on Oracle L

Re: [VOTE] Release Apache Mesos 0.24.0 (rc2)

2015-09-05 Thread Alex Rukletsov
Afaik, Pythontest is flaky on OS X, and should be fine on Ubuntu. On 4 Sep 2015 10:48 pm, "Bernd Mathiske" wrote: > And also Ubuntu 13.10: [ FAILED ] ExamplesTest.PythonFramework, known > flaky test, so still +1 > > On Sep 4, 2015, at 9:11 PM, Bernd Mathiske wrote: > > +1 [binding] > > MacOS X

Re: How does mesos determine how much memory on a node is available for offer?

2015-09-03 Thread Alex Rukletsov
Mesos agent (aka slave) estimates the memory available and advertises all of it minus 1GB. If there is less than 2GB available, only half is advertised [1]. [1]: https://github.com/apache/mesos/blob/master/src/slave/containerizer/containerizer.cpp#L98 On Thu, Sep 3, 2015 at 4:01 AM, Anand Mazumda
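
The rule described in the message above can be sketched roughly as follows. This is only an illustration of the stated arithmetic, not the agent's actual code: the function name `advertised_memory_mb`, the use of megabytes, and the handling of the exact 2GB boundary are assumptions made for the example.

```python
# Illustrative sketch of the memory rule described in the message above.
# The function name, the MB unit, and the boundary handling at exactly 2GB
# are assumptions for this example, not taken from the Mesos sources.

GB_IN_MB = 1024


def advertised_memory_mb(total_mb: int) -> int:
    """Return how much memory (in MB) would be advertised for an agent
    that estimates `total_mb` of available memory."""
    if total_mb < 0:
        raise ValueError("memory cannot be negative")
    if total_mb < 2 * GB_IN_MB:
        # Less than 2GB available: only half is advertised.
        return total_mb // 2
    # Otherwise: everything minus 1GB.
    return total_mb - GB_IN_MB


if __name__ == "__main__":
    for total in (1024, 4096, 16384):
        print(total, "MB available ->", advertised_memory_mb(total), "MB advertised")
```

Under these assumptions, an agent with 16GB would advertise 15GB, while one with 1GB would advertise 512MB.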

Re: mesos-master resource offer details

2015-09-02 Thread Alex Rukletsov
resource offer made available - the cpu's being > offered and I'm stuck there.. > > I really appreciate if you have any suggestions! Thanks. > > On Wed, Sep 2, 2015 at 9:54 AM, Alex Rukletsov > wrote: > >> To what Haosdent said: you cannot get a list of offers f

Re: mesos-master resource offer details

2015-09-02 Thread Alex Rukletsov
To what Haosdent said: you cannot get a list of offers from master logs, but you can get a list of allocations from the built-in allocator if you bump up the log level (GLOG_v=2). On Wed, Sep 2, 2015 at 7:36 AM, haosdent wrote: > If the offer is rejected by your framework, could you find this lo

Re: Use "docker start" rather than "docker run"?

2015-08-28 Thread Alex Rukletsov
>>>> Hi Paul, >>>> >>>> We don't [re]start a container since we assume once the task terminated >>>> the container is no longer reused. In Mesos to allow tasks to reuse the >>>> same executor and handle task logic accordingly people w

Re: Use "docker start" rather than "docker run"?

2015-08-28 Thread Alex Rukletsov
Paul, that component is called DockerContainerizer and it's part of Mesos Agent (check "/Users/alex/Projects/mesos/src/slave/containerizer/docker.hpp"). @Tim, could you answer the "docker start" vs. "docker run" question? On Fri, Aug 28, 2015 at 1:26 PM, Paul Bell wrote: > Hi All, > > I first p

Re: Custom Scheduler: Diagnosing cause of container task failures

2015-08-25 Thread Alex Rukletsov
It looks like we can have a better error message here. @Jay, mind filing a JIRA ticket with description, status update, and your fix attached? Thanks! On Fri, Aug 21, 2015 at 7:36 PM, Jay Taylor wrote: > Eventually I was able to isolate what was going on; in this case the > FrameworkInfo.Us

Re: Are the resource options documented?

2015-08-25 Thread Alex Rukletsov
From Mesos point of view, a resource is just a string, your agents may advertise "gpu", "bananas", "pandas" and so on. However, some resources are known to Mesos, and for them isolation is possible. A good example is a cgroups isolator for "mem" resources, which will invoke OOM killer if necessary

Re: Launching tasks with reserved resources

2015-08-17 Thread Alex Rukletsov
two separate offers > indeed helps here, since each offer comes with a single role. > In any case, I agree it makes sense for a developer to be aware of the > reservation policies. > > > > Regards, > Gidon > > > > > >

Re: Launching tasks with reserved resources

2015-08-17 Thread Alex Rukletsov
Hi Gidon, just to make sure, you mean static reservations on mesos agents (via --resources flag) and not dynamic reservations, right? Let me first try to explain, why you get the TASK_ERROR message. The built-in allocator merges '*' and reserved resources, hinting master to create a single offer.

Re: Lots of master elections

2015-07-07 Thread Alex Rukletsov
Got it. I was confused by your first email where you said you have 2 masters. On Tue, Jul 7, 2015 at 4:40 AM, Ashic Mahtab wrote: > Sure, Alex. > > 3 masters. Quorum is 2. > > -- > Date: Mon, 6 Jul 2015 19:44:28 +0200 > Subject: Re: Lots of master elections > From: a.

Re: Lots of master elections

2015-07-06 Thread Alex Rukletsov
Ashic, great that you solved the issue. Could you please clarify what HA configuration you have: how many masters and what --quorum you use? On Sat, Jul 4, 2015 at 5:09 PM, Ashic Mahtab wrote: > Hi Nikolaos, > I'm using an external zk, so didn't need to restart it. > > I might have jumped the g

Re: [VOTE] Release Apache Mesos 0.23.0 (rc1)

2015-07-06 Thread Alex Rukletsov
-1 Compilation error on Mac OS 10.10.4 with clang 3.5, which is supported according to release notes. More details: https://issues.apache.org/jira/browse/MESOS-2991 On Mon, Jul 6, 2015 at 11:55 AM, Jörg Schad wrote: > P.S. to my prior +1 > Tested on ubuntu-trusty-14.04 including docker. > > On
