Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Cong Wang
On Tue, Mar 15, 2016 at 2:39 PM, Jie Yu  wrote:
> Mesos currently has no notion of long term stable releases (i.e., LTS). I
> think the consensus in the last community sync was to introduce LTS after
> 1.0.


You don't need LTS like the kernel has. Even the short-term stable releases
like 0.27.2 (?) look horrible: I don't see any git tags or branches for
these releases, just a tarball?! Huh...


>
> 0.27.2 has already been released. Looks like we need 0.27.3 if we want to
> backport it.


What determines which patches get backported in the Mesos community?
It doesn't look like every bug fix is evaluated and considered for backporting
after it is merged into the master branch.

>
> I am OK with back porting it. Then the question is that whether we want to
> backport it to other releases as well.
>

It should be backported to whichever releases it applies to and you support.
I don't see that the Mesos community has such a procedure.


Re: [VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Vinod Kone
+1 (binding)

Tested on ASF CI (ubuntu 14.04 w/ gcc and clang).

On Wed, Mar 16, 2016 at 6:07 PM, Vinod Kone  wrote:

>
> On Wed, Mar 16, 2016 at 5:59 PM, Daniel Osborne <
> daniel.osbo...@metaswitch.com> wrote:
>
>> Is this issue a blocker? Are we moving to rc3 or proceeding with 0.28.0?
>>
>
> It was not marked as such, so I'm guessing not. @Jie and @Zhitao, can you
> confirm?
>
> Also, we still need some binding votes for this release to go official.
> @committers: can you please vote?
>


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Jie Yu
I understand your frustration. I am curious which review/ticket you are
talking about, and who the shepherd for your review/ticket is.

The Mesos project has a clear guide on how to contribute to the project; that's
what the community has agreed on:

https://github.com/apache/mesos/blob/master/docs/submitting-a-patch.md#before-you-start-writing-code

"Find a shepherd to collaborate on your patch. A shepherd is a Mesos
committer that will work with you to give you feedback on your proposed
design, and to eventually commit your change into the Mesos source tree."

- Jie




On Wed, Mar 16, 2016 at 2:03 PM, Cong Wang  wrote:

> On Wed, Mar 16, 2016 at 12:18 PM, Jie Yu  wrote:
> >>
> >> like many other review requests are burned or take
> >
> > 6+ months to merge
> >
> >
> > Have you reached out to any shepherd for that ticket/review?
> >
>
> This is exactly where it doesn't work.
>
> You, as qualified as a committer, need to do _your_ work, you have
> to prioritize all the review requests you get, take care of all of them.
> Why? Because you have them all, you should know which of them
> are more important than others therefore should be reviewed earlier
> than later. You can't wait for others to tell you, because you are a
> committer.
>
> Priorities are based on _your_ understanding of the code, rather than
> who you know better or who pings you more than others.
>
> You have 20+ committers, you should be able to handle all of these
> review requests, but apparently some of them are not working at all
> so some of them are overloaded. This is a problem you need to fix,
> rather than being ping'ed.
>


Re: [VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Kapil Arya
+1 (binding).

You can find the links to rpm/deb files for this RC here:

http://open.mesosphere.com/downloads/mesos-rc/

On Thu, Mar 17, 2016 at 12:58 PM, Michael Park  wrote:

> +1 (binding)
>
> Internal CI results with the corresponding JIRA tickets for the failed
> tests:
>
> CentOS 6 (non-SSL):
> CentOS 6 (SSL):
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047)
>
> CentOS 7 (non-SSL):
>   - ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand
> (MESOS-4810)
>
> CentOS 7 (SSL):
>   - LinuxFilesystemIsolatorTest.ROOT_MultipleContainers (Fixed in master)
> (MESOS-4912)
>   - ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand
> (MESOS-4810)
>
> Debian 8 (non-SSL):
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047)
>
> Debian 8 (SSL):
>   - NsTest.ROOT_setns
> (MESOS-3000)
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047)
>
> Ubuntu 12 (non-SSL):
>   - HealthCheckTest.ROOT_DOCKER_DockerHealthStatusChange
> Failed with MESOS-2017
>
> Ubuntu 12 (SSL): Success!
> Ubuntu 14 (non-SSL): Success!
> Ubuntu 14 (SSL): Success!
> Ubuntu 15 (non-SSL): Success!
> Ubuntu 15 (SSL): Success!
>
> On 11 March 2016 at 15:46, Vinod Kone  wrote:
>
> > Hi all,
> >
> > Please vote on releasing the following candidate as Apache Mesos 0.28.0.
> >
> > 0.28.0 includes the following:
> >
> > Release Notes - Mesos - Version 0.28.0
> >
> > This release contains the following new features:
> >
> >   * [MESOS-4343] - A new cgroups isolator for enabling the net_cls
> >     subsystem in Linux. The cgroups/net_cls isolator allows operators to
> >     provide network performance isolation and network segmentation for
> >     containers within a Mesos cluster. To enable the cgroups/net_cls
> >     isolator, append `cgroups/net_cls` to the `--isolation` flag when
> >     starting the slave. Please refer to docs/mesos-containerizer.md for
> >     more details.
> >
> >   * [MESOS-4687] - The implementation of scalar resource values (e.g.,
> >     "2.5 CPUs") has changed. Mesos now reliably supports resources with up
> >     to three decimal digits of precision (e.g., "2.501 CPUs"); resources
> >     with more than three decimal digits of precision will be rounded.
> >     Internally, resource math is now done using a fixed-point format that
> >     supports three decimal digits of precision, and then converted to/from
> >     floating point for input and output, respectively. Frameworks that do
> >     their own resource math and manipulate fractional resources may observe
> >     differences in roundoff error and numerical precision.
> >
> >   * [MESOS-4479] - Reserved resources can now optionally include "labels".
> >     Labels are a set of key-value pairs that can be used to associate
> >     metadata with a reserved resource. For example, frameworks can use this
> >     feature to distinguish between two reservations for the same role at
> >     the same agent that are intended for different purposes.
> >
> >   * [MESOS-2840] - **Experimental** support for container images in Mesos
> >     containerizer (a.k.a. Unified Containerizer). This allows frameworks to
> >     launch Docker/Appc containers using Mesos containerizer without relying
> >     on docker daemon (engine) or rkt. The isolation of the containers is
> >     done using isolators. Please refer to docs/container-image.md for
> >     currently supported features and limitations.
> >
> >   * [MESOS-4793] - **Experimental** support for v1 Executor HTTP API. This
> >     allows executors to send HTTP requests to the /api/v1/executor agent
> >     endpoint without the need for an executor driver. Please refer to
> >     docs/executor-http-api.md for more details.
> >
> >   * [MESOS-4370] Added support for service discovery of Docker containers
> >     that use Docker Remote API v1.21.
> >
> > Additional API Changes:
> >
> >   * [MESOS-4066] - Agent should not return partial state when a request is
> >     made to /state endpoint during recovery.
> >
> >   * [MESOS-4547] - Introduce TASK_KILLING state.
> >
> >   * [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1
> >     Scheduler API.
> >
> >   * [MESOS-4591] - Change the object 

Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 11:56 AM, Joseph Wu  wrote:
> Cong Wang,
>
> The tags are sync'd.  See: https://github.com/apache/mesos/releases
>
> You might not have done: git pull --tags


Yeah, I figured it out by myself too. This is why I hate tags personally,
branches are better since they are fetched without additional parameters.

Any reason why Mesos maintainers picked tags over branches to manage
releases? Just curious...


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 11:57 AM, Zameer Manji  wrote:
> Cong brings up a good point here. Currently Mesos has a very aggressive
> release cadence. This results in several questions as a cluster operator
> and framework author.
>
>    - What is the support from the community/committers for each release?
>    - Do cluster operators and framework authors need to move at the same
>      pace as the community?
>    - Will bugfixes be automatically backported?
>
> The lack of clarity here can result in several issues because it is easy
> for the Mesos PMC to cut releases quickly, but it isn't easy for people
> with existing clusters to upgrade at that pace. An aggressive release
> policy without clear support for older releases can leave several users in
> a bad position where they might need to upgrade Mesos through one (or
> more!) releases just to get a critical bugfix.
>

I think the core reason the Mesos community lacks this is that there is no
central maintainer who only/mostly handles maintenance work,
is responsible for the quality of the whole project, and monitors each
release cycle.

Take the Linux kernel: Linus is obviously the one who cuts the base releases
and monitors the overall project; he only takes pull requests from the
subsystem maintainers. Each subsystem maintainer needs to decide
what to send to Linus during each cycle.

Mesos has 20+ committers, all of whom commit to the master branch,
which makes things worse without a central maintainer. Mesos release
managers are (from my observation) volunteers from the community
rather than dedicated ones, which makes it hard to find someone responsible
for a specific release.

Just my 2 cents.


Re: Unzip should work in non interactive mode

2016-03-19 Thread Jie Yu
I can shepherd it. Do you have a patch ready?

- Jie

On Fri, Mar 18, 2016 at 3:13 AM, Tomek Janiszewski 
wrote:

> Hi
>
> Consider the situation where a deployed zip file is malformed and contains
> duplicated files.
> When the fetcher downloads a malformed zip file that contains duplicated
> files (e.g., dist zips generated by Gradle) and tries to uncompress it, the
> deployment hangs in the staged phase because unzip prompts whether the file
> should be replaced. unzip should either overwrite the file or fail with an
> error. I created an issue for this: MESOS-4885
> It looks like an easy fix; does anyone want to shepherd it?
>
> Best
> Tomek
>
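
For readers following this thread, here is a minimal Python sketch of the two
behaviours Tomek asks for: either fail with an error when an archive contains
duplicated entries, or overwrite without ever prompting. It is only an
illustration using Python's standard zipfile module, not the Mesos fetcher
code; the function names and the `overwrite` parameter are hypothetical.
(With the unzip command line tool itself, the `-o` option overwrites existing
files without prompting.)

```python
import zipfile
from collections import Counter


def duplicate_members(archive):
    """Return the sorted entry names that occur more than once in the archive."""
    with zipfile.ZipFile(archive) as zf:
        counts = Counter(zf.namelist())
    return sorted(name for name, n in counts.items() if n > 1)


def extract_noninteractive(archive, dest, overwrite=False):
    """Extract without prompting: fail fast on duplicates unless overwrite=True."""
    dupes = duplicate_members(archive)
    if dupes and not overwrite:
        # "Break with error" instead of waiting for user input.
        raise ValueError("archive contains duplicated entries: %s" % ", ".join(dupes))
    with zipfile.ZipFile(archive) as zf:
        # extractall never asks for confirmation; existing files are replaced.
        zf.extractall(dest)
```

Either policy avoids the hang, because nothing ever waits on standard input.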


Re: Looking for a shepherd for MESOS-4878

2016-03-19 Thread Jie Yu
Shuai, thanks for the patch. Taking a look at it now.

On Thu, Mar 17, 2016 at 8:30 PM, Shuai Lin  wrote:

> Ping.
>
>
> On Thu, Mar 10, 2016 at 3:55 PM, Shuai Lin  wrote:
>
>> The bug is: if a framework launches a task with the Mesos containerizer and
>> a Docker image, and the slave fails to fetch the image for any reason, the
>> task stays stuck in STAGING forever.
>>
>> https://issues.apache.org/jira/browse/MESOS-4878
>>
>> @jieyu can you shepherd it? Thanks!
>>
>
>


Re: [VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Vinod Kone
On Wed, Mar 16, 2016 at 5:59 PM, Daniel Osborne <
daniel.osbo...@metaswitch.com> wrote:

> Is this issue a blocker? Are we moving to rc3 or proceeding with 0.28.0?
>

It was not marked as such, so I'm guessing not. @Jie and @Zhitao, can you
confirm?

Also, we still need some binding votes for this release to go official.
@committers: can you please vote?


Re: Vote on #MesosCon proposals, deadline Friday March 25

2016-03-19 Thread David Greenberg
Hi Jay,

Thanks for your feedback! The reason we're asking you to rank the
topics is that this will allow us to better understand everyone's relative
preferences. Next, we'll use standard voting algorithms to determine the
schedule, to ensure most people get as many of the talks they want as
possible. We hope you enjoy the program we come up with :)

Thanks,
David

On Sat, Mar 19, 2016 at 12:39 AM Jay JN Guo  wrote:

> Hi,
>
> Thank you for this good work and I'm already looking forward to this
> MesosCon.
>
> One minor suggestion, though: Accept/Reject on a scale of 10 is a
> bit intimidating. Personally, I only have three feelings toward a topic:
> will go/maybe/not interested, whereas quantifying these feelings on a
> scale of 10 for 154 topics is just too much. Maybe we could simplify the
> form in the future. We could take the OpenStack summit voting form as an
> example.
>
> Cheers,
> /J
>
> - Original message -
> From: Kiersten Gaffney 
> To: dev@mesos.apache.org, u...@mesos.apache.org
> Cc: David Greenberg , Dave Lester <
> d...@davelester.org>, Kiersten Gaffney 
> Subject: Vote on #MesosCon proposals, deadline Friday March 25
> Date: Sat, Mar 19, 2016 8:11 AM
>
>
> Please take a few minutes the next few days and review what members of the
> community have submitted!
>
> Voting forms close Friday, March 25, 2016, 11:55 PST
>
> A total of 154 proposals were submitted in time for #MesosCon review, up
> significantly from 63 submitted for last year’s conference. Similar to last
> year, the MesosCon program committee is opening these proposals up for
> community review/feedback to better-inform our decisions about what should
> be included in the program.
>
> In order to make it easier to review a subset of the proposals, we’ve
> segmented them based upon two loose themes: Developers and Users.
>
> Developers: http://bit.ly/1RpZPvj
>
> Talks on how frameworks can be used, developed, and integrate with Mesos.
>
> Users: http://bit.ly/1Mspaxp
>
> A combination of talks that are use cases (how company x uses Mesos), and
> operations-focused (how we deploy x, use Docker, etc).
>
> The forms above also include an opportunity to indicate which sessions you
> didn't see proposed but would like to attend.
>
> Thanks in advance for your participation!
>
> Kiersten, Dave, and David (Program Committee)
>
>


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 11:49 AM, Cong Wang  wrote:
> On Mon, Mar 7, 2016 at 8:29 PM, Michael Park  wrote:
>> Please find the release at:
>> https://dist.apache.org/repos/dist/release/mesos/0.27.2
>>
>> It is recommended to use a mirror to download the release:
>> http://www.apache.org/dyn/closer.cgi
>>
>> The CHANGELOG for the release is available at:
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.2
>>
>> The mesos-0.27.2.jar has been released to:
>> https://repository.apache.org
>>
>
> So why the git tags are not synced to github mirror?
>
> $ git tag -l | grep '0\.27\.2'

Oh, my bad, I keep forgetting to add --tags to git-fetch:

$ git tag -l | grep '0\.27\.2'
0.27.2
0.27.2-rc1


Re: [RESULT][VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Bill Farner
Jake - i think that would be wonderful!

On Thu, Mar 17, 2016 at 11:17 AM, Jake Farrell  wrote:

> I've been maintaining a deb/rpm set for Mesos and for Aurora and Thrift we
> have been using the infra supported Bintray to make it available to the
> community via http://www.apache.org/dist/${project}/${os}
>
> If there is interest I'd be happy to put some time into bringing my patches
> into reviews and helping setup jenkins tests, etc.
>
> -Jake
>
>
>
>
>
>
> On Thu, Mar 17, 2016 at 1:41 PM, Vinod Kone  wrote:
>
> > The project itself doesn't officially release rpms/debs, but the community
> > members do.  For example, Mesosphere is planning to release rpms/debs
> > shortly.
> >
> > On Thu, Mar 17, 2016 at 10:38 AM, craig w  wrote:
> >
> > > Great news. Do the rpm's get automatically built and released or will they
> > > come later this week?
> > >
> > > On Thu, Mar 17, 2016 at 1:28 PM, Vinod Kone  wrote:
> > >
> > > >> Hi all,
> > > >>
> > > >> The vote for Mesos 0.28.0 (rc2) has passed with the following votes.
> > > >>
> > > >> +1 (Binding)
> > > >> --
> > > >> Vinod Kone
> > > >> Michael Park
> > > >> Kapil Arya
> > > >>
> > > >> +1 (Non-binding)
> > > >> --
> > > >> Greg Mann
> > > >> Daniel Osborne
> > > >> Jorg Schad
> > > >> Zhitao Li
> > > >>
> > > >> There were no 0 or -1 votes.
> > > >>
> > > >> Please find the release at:
> > > >> https://dist.apache.org/repos/dist/release/mesos/0.28.0
> > > >>
> > > >> It is recommended to use a mirror to download the release:
> > > >> http://www.apache.org/dyn/closer.cgi
> > > >>
> > > >> The CHANGELOG for the release is available at:
> > > >> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.0
> > > >>
> > > >> The mesos-0.28.0.jar has been released to:
> > > >> https://repository.apache.org
> > > >>
> > > >> The website (http://mesos.apache.org) will be updated shortly to reflect
> > > >> this release.
> > > >>
> > > >> Thanks,
> > > >>
> > >
> > >
> > >
> > > --
> > >
> > > https://github.com/mindscratch
> > > https://www.google.com/+CraigWickesser
> > > https://twitter.com/mind_scratch
> > > https://twitter.com/craig_links
> > >
> > >
> >
>


Re: Contributor request

2016-03-19 Thread Tomek Janiszewski
Hi

Vinod has already added me, and I successfully assigned myself to MESOS-3243.

Thanks
Tomek
On Thu, Mar 17, 2016 at 00:45, Artem Harutyunyan wrote:

> Hi Tomek,
>
> Sorry for the delay. Just tried to add you but couldn't find you. Which
> email did you use (jani...@gmail.com does not appear to be in the list of
> registered users)?
>
> Artem.
>
> On Fri, Mar 11, 2016 at 6:48 AM, Tomek Janiszewski 
> wrote:
>
> > Please add me as contributor in the Mesos JIRA project. My Apache JIRA's
> > username: janisz
> >
> > Thanks!
> >
>


RE: [VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Daniel Osborne
Is this issue a blocker? Are we moving to rc3 or proceeding with 0.28.0?

Sorry if this is a silly question, a bit new to the release / voting process.

Best,
-Dan

From: Zhitao Li [mailto:zhitaoli...@gmail.com]
Sent: Tuesday, March 15, 2016 8:15 AM
To: Jörg Schad 
Cc: u...@mesos.apache.org; dev@mesos.apache.org
Subject: Re: [VOTE] Release Apache Mesos 0.28.0 (rc2)

Marked duplicate. Thanks!

On Tue, Mar 15, 2016 at 5:56 AM, Jörg Schad wrote:
I believe the
ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand issue is
already tracked here: https://issues.apache.org/jira/browse/MESOS-4810
@zhitaio could you check whether this describes your issue (if so could you
close the new issue as duplicate?). Thanks!

On Tue, Mar 15, 2016 at 6:55 AM, Zhitao Li wrote:
Filed https://issues.apache.org/jira/browse/MESOS-4946 to track.

All "OsTest" tests pass under root on my machine.

On Mon, Mar 14, 2016 at 6:30 PM, haosdent wrote:
Filing a ticket at https://issues.apache.org/jira/browse/MESOS would be
more convenient for further discussion. By the way, does "OsTest.User" pass
on your machine? It also calls "os::getgid" during the test.

On Tue, Mar 15, 2016 at 6:57 AM, Zhitao Li wrote:
When running `sudo make check` on Debian 8, I saw the following unaccounted
test failure:

[ FAILED ] ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand
(1129 ms)

It seems to be related to an error message with `Failed to change user to 'root':
Failed to getgid: unknown user`

I've included verbose test log output at
https://gist.github.com/zhitaoli/95436f4ea2df13c4b137.

On Mon, Mar 14, 2016 at 2:59 PM, Daniel Osborne wrote:
+1 (non-binding)

Ran `sudo make check` on Centos 7. All tests passed.

Also ran some runtime tests with unified containerizer launching docker images 
and regular mesos tasks, as well as some tasks using the docker containerizer. 
All working as expected

Cheers,
-Dan

-Original Message-
From: Vinod Kone [mailto:vinodk...@apache.org]
Sent: Friday, March 11, 2016 12:46 PM
To: dev; user
Subject: [VOTE] Release Apache Mesos 0.28.0 (rc2)

Hi all,

Please vote on releasing the following candidate as Apache Mesos 0.28.0.

0.28.0 includes the following:

Release Notes - Mesos - Version 0.28.0

This release contains the following new features:

  * [MESOS-4343] - A new cgroups isolator for enabling the net_cls subsystem in
    Linux. The cgroups/net_cls isolator allows operators to provide network
    performance isolation and network segmentation for containers within a Mesos
    cluster. To enable the cgroups/net_cls isolator, append `cgroups/net_cls` to
    the `--isolation` flag when starting the slave. Please refer to
    docs/mesos-containerizer.md for more details.

  * [MESOS-4687] - The implementation of scalar resource values (e.g., "2.5
    CPUs") has changed. Mesos now reliably supports resources with up to three
    decimal digits of precision (e.g., "2.501 CPUs"); resources with more than
    three decimal digits of precision will be rounded. Internally, resource math
    is now done using a fixed-point format that supports three decimal digits of
    precision, and then converted to/from floating point for input and output,
    respectively. Frameworks that do their own resource math and manipulate
    fractional resources may observe differences in roundoff error and numerical
    precision.

  * [MESOS-4479] - Reserved resources can now optionally include "labels".
    Labels are a set of key-value pairs that can be used to associate metadata
    with a reserved resource. For example, frameworks can use this feature to
    distinguish between two reservations for the same role at the same agent
    that are intended for different purposes.

  * [MESOS-2840] - **Experimental** support for container images in Mesos
    containerizer (a.k.a. Unified Containerizer). This allows frameworks to
    launch Docker/Appc containers using Mesos containerizer without relying on
    docker daemon (engine) or rkt. The isolation of the containers is done using
    isolators. Please refer to docs/container-image.md for currently supported
    features and limitations.

  * [MESOS-4793] - **Experimental** support for v1 Executor HTTP API. This
    allows executors to send HTTP requests to the /api/v1/executor agent
    endpoint without
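
A note on the MESOS-4687 item in the release notes above: the idea of doing
resource math in a fixed-point format with three decimal digits, and converting
to and from floating point only at the edges, can be sketched in a few lines of
Python. This is an illustration of the technique only, not the Mesos C++
implementation; the class name `Scalar` and the constant `SCALE` are
hypothetical.

```python
from dataclasses import dataclass

SCALE = 1000  # three decimal digits of precision


@dataclass(frozen=True)
class Scalar:
    """A scalar resource stored as an integer count of thousandths."""
    millis: int

    @classmethod
    def from_float(cls, value):
        # Input conversion: round to the nearest thousandth.
        return cls(round(value * SCALE))

    def __add__(self, other):
        # Integer arithmetic, so no floating-point roundoff accumulates.
        return Scalar(self.millis + other.millis)

    def __sub__(self, other):
        return Scalar(self.millis - other.millis)

    def to_float(self):
        # Output conversion back to floating point.
        return self.millis / SCALE
```

With plain floats, 0.1 + 0.2 does not compare equal to 0.3; with the sketch
above, Scalar.from_float(0.1) + Scalar.from_float(0.2) equals
Scalar.from_float(0.3). A framework that keeps doing its own math in floats can
therefore see slightly different results, which is the caveat the release note
raises.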

Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-19 Thread Jie Yu
I like the idea of using branches to manage releases.

We can use that to manage point releases and backports as well.

Say we want to cut 0.29.0 now, we fork a branch 0.29.0 and tag RCs in that
branch. Once the RC is accepted, the head of that branch will become the
release.

Then, we immediately fork that branch and create the 0.29.1 branch.

When a new bug fix is committed on the trunk, the committer will decide
whether it affects the old releases (a bounded number; we can decide that
later). If it does, the committer of that patch should also cherry-pick
that patch to the point-release branches (e.g., 0.29.1 in this case). We can do
time-based point releases.

- Jie
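
The branch scheme proposed in this message can be sketched as a small Python
helper that answers "which point-release branches should this fix be
cherry-picked to?". It is a sketch under stated assumptions, not an agreed
Mesos process: the function names, the `max_supported` bound, and the idea of
describing a fix by the minor release lines it affects are all hypothetical.

```python
def next_point_branches(released, max_supported=2):
    """Map each supported minor line to its next point-release branch."""
    latest = {}
    for version in released:
        major, minor, patch = (int(part) for part in version.split("."))
        key = (major, minor)
        latest[key] = max(latest.get(key, -1), patch)
    # Only a bounded number of the newest release lines is maintained.
    supported = sorted(latest)[-max_supported:]
    return {
        "%d.%d" % (major, minor): "%d.%d.%d" % (major, minor, latest[(major, minor)] + 1)
        for major, minor in supported
    }


def backport_targets(affected_lines, released, max_supported=2):
    """Return the branches a fix committed on trunk should be cherry-picked to."""
    branches = next_point_branches(released, max_supported)
    return [branch for line, branch in branches.items() if line in affected_lines]
```

For example, with releases 0.26.0, 0.27.0, 0.27.1, 0.27.2 and 0.28.0 and two
supported lines, a fix that affects 0.26, 0.27 and 0.28 would go to the 0.27.3
and 0.28.1 branches.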

On Fri, Mar 18, 2016 at 1:35 PM, Cong Wang  wrote:

> On Wed, Mar 16, 2016 at 11:56 AM, Joseph Wu  wrote:
> > Cong Wang,
> >
> > The tags are sync'd.  See: https://github.com/apache/mesos/releases
> >
> > You might not have done: git pull --tags
>
>
> Yeah, I figured it out by myself too. This is why I hate tags personally,
> branches are better since they are fetched without additional parameters.
>
> Any reason why Mesos maintainers picked tags over branches to manage
> releases? Just curious...
>


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 5:19 PM, Vinod Kone  wrote:
> Cong, I understand your frustration with the review process and backports.
> I've already created a ticket to track the latter. Would love your
> input/feedback on it.
>
> Regarding the former, we understand the pain. Our use of shepherds is a way
> to tackle the problem. While it's not perfect it has definitely improved the
> situation IMO. As Jie mentioned earlier, if you have some other concrete
> suggestions to improve the process please join us in our community syncs and
> help us! We will be grateful. It is not an easy problem to solve.
>
> As an aside, I feel the tone of this thread has gone from being constructive
> to being attacking and personal. This is not acceptable in the Mesos
> community. Please refer to
> http://www.apache.org/foundation/policies/conduct.html for our code of
> conduct. This might be different from the Linux community.

This conflicts with the previous paragraph.

I understand why you feel attacked when someone just points out your mistakes.
Most human beings do; people don't like to admit their mistakes, me too!!!

The only difference is that I always thank people who point out my mistakes
instead of feeling attacked.

This is exactly why I don't like to join your community syncs. Your reaction
reflects something deep in your culture (because you, as a committer, represent
the community); this is why this community can't be improved.

Think about it, Vinod. Remember that the opposite of love is not hate,
it's indifference. If I really hated your community, I would just keep silent
here and laugh at you in a different place. You should be smart enough to figure
out which way is more helpful for improving your community.

I am _not_ saying my advice is valuable, I am just saying that refusing to listen
hurts your community, especially when you consider pointing out your
mistakes as attacks.


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-19 Thread Joseph Wu
Cong Wang,

The tags are sync'd.  See: https://github.com/apache/mesos/releases

You might not have done: git pull --tags

On Wed, Mar 16, 2016 at 11:49 AM, Cong Wang  wrote:

> On Mon, Mar 7, 2016 at 8:29 PM, Michael Park  wrote:
> > Please find the release at:
> > https://dist.apache.org/repos/dist/release/mesos/0.27.2
> >
> > It is recommended to use a mirror to download the release:
> > http://www.apache.org/dyn/closer.cgi
> >
> > The CHANGELOG for the release is available at:
> >
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.2
> >
> > The mesos-0.27.2.jar has been released to:
> > https://repository.apache.org
> >
>
> So why the git tags are not synced to github mirror?
>
> $ git tag -l | grep '0\.27\.2'
>


0.28.1

2016-03-19 Thread Jie Yu
Hi,

We recently noticed two bugs in 0.28.0 related to the unified containerizer:

Because of that, I propose we cut a point release (0.28.1) once these two
bugs are fixed. I volunteer to be the release manager for this point
release.

In the meantime, if you have any issue that you want to merge into 0.28.1,
please mark the relevant ticket's fix version to be 0.28.1 so that I am
aware of that.

Thanks!
- Jie


Re: [DISCUSS] Fetching Docker Images Requiring User Credentials.

2016-03-19 Thread Avinash Sridharan
This might be a bit far-fetched, but does it make sense to associate these
credential configurations with roles? Roles identify the capability of
frameworks in obtaining resources, so I was wondering if we can use the same
capability to distinguish (and control access to) credentials?

On Wed, Mar 16, 2016 at 1:34 PM, Kevin Klues  wrote:

> On Tue, Mar 15, 2016 at 6:10 PM, Gilbert Song 
> wrote:
> > @Kevin, thanks for writing it down in detail. It sounds good that a more
> > concrete
> > schema is designed to generally solve similar auth problem.
> >
> > Just have two potential issues inlined below:
> >
> > On Tue, Mar 15, 2016 at 5:39 PM, Kevin Klues  wrote:
> >>
> >> Yeah, option 2.
> >>
> >> I was trying to expand on Avinash's suggestion and make it a bit more
> >> concrete in terms of what was being proposed. Needing to reload the
> >> agent just to update the list of credentials it accepts seems
> >> undesirable though.
> >>
> >> Maybe we could have a way to start the agent with a default config (by
> >> iterating on the schema from my previous email), but allow newly
> >> launched frameworks to somehow update the config on the fly through a
> >
> >
> > Will it be too expensive to update all agents every time a new framework
> > joins (handling consensus problem as well)?
>
> Not sure, I haven't thought about it in depth.  What I was picturing
> though was something exactly like what you describe for how the docker
> containerizer currently solves this problem, except instead of using
> docker/config.json directly, use a new credentials.json file which
> follows a schema similar to what I proposed above.
>
> >>
> >> file in their sandbox that follows the same schema.
> >
> >
> > Does that mean the file in sandbox should be exposed to each other?
> >
> >>
> >> On Tue, Mar 15, 2016 at 5:25 PM, Jie Yu  wrote:
> >> > Kevin, are you suggesting option 2 and having a config file like the
> >> > above?
> >> >
> >> > I think another downside of a per-agent config is that it's hard to
> >> > maintain this. What if a new framework joins and has a new credential
> >> > for
> >> > the docker images. Do we need to restart the agent to reload the
> config?
> >> >
> >> > - Jie
> >> >
> >> > On Tue, Mar 15, 2016 at 1:25 PM, Kevin Klues 
> wrote:
> >> >
> >> >> Can we be a bit more concrete here and try to build up a schema for
> >> >> this.
> >> >> Maybe something like:
> >> >>
> >> >> [
> >> >>   {
> >> >>     "service" : "docker",
> >> >>     "registries" :
> >> >>     [
> >> >>       {
> >> >>         "uri" : "",
> >> >>         "default_credentials" :
> >> >>         {
> >> >>           "type" : "",
> >> >>           "credential" :
> >> >>           {
> >> >>             // Custom based on type...
> >> >>           }
> >> >>         },
> >> >>         "image_credentials" :
> >> >>         [
> >> >>           {
> >> >>             "image_name" : "",
> >> >>             "type" : "",
> >> >>             "credential" :
> >> >>             {
> >> >>               // Custom based on type...
> >> >>             }
> >> >>           },
> >> >>           ...
> >> >>         ],
> >> >>         ...
> >> >>       },
> >> >>       ...
> >> >>     ],
> >> >>     ...
> >> >>   },
> >> >>   ...
> >> >> ]
> >> >>
> >> >>
> >> >> On Tue, Mar 15, 2016 at 12:57 PM, Jie Yu 
> wrote:
> >> >> >>
> >> >> >> Yeah I was thinking having the JSON as a dictionary with keys
> being
> >> >> >> the
> >> >> >> registry URI (appc/docker) and the values being credentials (which
> >> >> >> will
> >> >> be
> >> >> >> a dictionary as well I guess).
> >> >> >
> >> >> >
> >> >> > Using registry URI as the key is problematic. Think about the
> public
> >> >> docker
> >> >> > hub. Different frameworks might want to use different credentials
> to
> >> >> access
> >> >> > their docker images.
> >> >> >
> >> >> > - Jie
> >> >> >
> >> >> > On Tue, Mar 15, 2016 at 11:52 AM, Avinash Sridharan <
> >> >> avin...@mesosphere.io
> >> >> >
> >> >> > wrote:
> >> >> >
> >> >> >> On Tue, Mar 15, 2016 at 11:43 AM, Vinod Kone <
> vinodk...@apache.org>
> >> >> wrote:
> >> >> >>
> >> >> >> > moved core@ to *bcc*
> >> >> >> >
> >> >> >> > On Tue, Mar 15, 2016 at 11:18 AM, Avinash Sridharan <
> >> >> >> avin...@mesosphere.io
> >> >> >> > > wrote:
> >> >> >> >
> >> >> >> >> Why not follow option 2, but instead of passing the agent
> >> >> credentials,
> >> >> >> >> pass a location to the flag where credentials for the registry
> >> >> >> >> can be
> >> >> >> found
> >> >> >> >> (in JSON)? The frameworks can set credentials (maybe registry
> >> >> >> >> name or
> >> >> >> URL
> >> >> >> >> to the registry), and the credentials can be learnt from the
> JSON
> >> >> >> config.
> >> >> >> >>
> >> >> >> >
> >> >> >> > What if we need credentials for multiple-registries? Have a JSON
> >> >> >> > with
> >> >> one
> >> >> >> > credential per registry I guess? But if possible, 
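
To make the schema discussed in this thread easier to reason about, here is a
rough Python sketch of how a lookup against it might work: find the service,
then the registry by URI, prefer an image-specific credential, and otherwise
fall back to the registry default. It assumes one well-formed reading of
Kevin's proposal (a top-level list of services, each with a list of registry
objects); the function name and the sample values in the usage note are
hypothetical, and nothing here reflects an implemented Mesos feature.

```python
def resolve_credential(config, service, registry_uri, image_name):
    """Return the credential entry to use for an image, or None if unknown."""
    for svc in config:
        if svc.get("service") != service:
            continue
        for registry in svc.get("registries", []):
            if registry.get("uri") != registry_uri:
                continue
            # An image-specific credential overrides the registry default.
            for entry in registry.get("image_credentials", []):
                if entry.get("image_name") == image_name:
                    return {"type": entry["type"], "credential": entry["credential"]}
            return registry.get("default_credentials")
    return None
```

A per-image override is one way to address Jie's objection that a registry URI
alone is too coarse a key when several frameworks share the public Docker hub
but use different credentials.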

Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 11:58 AM, Jie Yu  wrote:
>
> Currently, it's based on request. We definitely need to improve this part.


It simply doesn't work, like many other review requests are burned or take
6+ months to merge. I am sure you need to improve that too, but after
watching Mesos community for months, I don't see any improvement yet.


Re: [VOTE] Release Apache Mesos 0.26.1 (rc2)

2016-03-19 Thread Benjamin Mahler
Thanks for the hard work! Do we need to backport the rmdir fixes on the
outstanding release candidates?

commit 5278e5cc50544ed7af28b15a1acd2b2e96a15a47
Author: Jojy Varghese 
Date:   Tue Mar 15 17:12:01 2016 -0700

Added support for FTS_SLNONE in rmdir.

Review: https://reviews.apache.org/r/44874/

commit fbe1f37f65fd9f1d4f2c30a3cfd7a50df92ccc2c
Author: Alex Clemmer 
Date:   Tue Mar 1 23:29:21 2016 -0800

Stout:[1/2] Fixed error reporting bug in `os::rmdir`.

Review: https://reviews.apache.org/r/43907/

commit f8b7ac28b1a918864a06b3f99f45b0257c7b6f68
Author: Jojy Varghese 
Date:   Tue Mar 1 14:32:13 2016 -0800

Added FS_DEFAULT case in rmdir.

We currently dont handle special files like device files in rmdir. This
change adds FS_DEFAULT as one of the cases where we try to unlink a
file. Reference: http://man7.org/linux/man-pages/man3/fts.3.html

Review: https://reviews.apache.org/r/44230/
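
The three rmdir commits above concern entries that a recursive delete can trip over: dangling symlinks (FTS_SLNONE) and special files such as device nodes or FIFOs (the case the commit calls FS_DEFAULT; fts(3) documents it as FTS_DEFAULT). What follows is only a rough Python analogue of that behaviour, not the stout `os::rmdir` code. The name `remove_tree` is hypothetical, and Python's `os.walk` stands in for the fts(3) traversal.

```python
import os
import stat


def remove_tree(root):
    """Recursively delete `root`, unlinking anything that is not a real directory.

    Hypothetical helper for illustration; it is not the stout `os::rmdir` code.
    """
    removed = []
    # Walk bottom-up so that directories are empty by the time we rmdir them.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat() inspects the entry itself and never follows symlinks,
            # so a dangling symlink does not raise here.
            mode = os.lstat(path).st_mode
            if stat.S_ISDIR(mode):
                os.rmdir(path)
            else:
                # Regular files, symlinks (dangling or not), FIFOs, sockets
                # and device nodes are all removed with unlink().
                os.unlink(path)
            removed.append(path)
    os.rmdir(root)
    removed.append(root)
    return removed
```

The point of the sketch is the `else` branch: anything that is not a directory is unlinked rather than skipped, which is roughly what the FTS_SLNONE and default cases add to the traversal.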

On Wed, Mar 16, 2016 at 8:21 PM, Vinod Kone  wrote:

> +1 (binding)
>
> Tested on ASF CI.
>
> On Sun, Mar 13, 2016 at 4:33 PM, Michael Park  wrote:
>
> > +1 (binding)
> >
> > Internal CI results with the corresponding JIRA tickets for the failed
> > tests:
> >
> > CentOS 6 (non-SSL):
> >   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
> > (MESOS-3049 )
> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> > (MESOS-4039 )
> >   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> > (MESOS-4035 )
> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> > (MESOS-3215 )
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> > (MESOS-4047 ,
> > MESOS-4053 )
> >
> > CentOS 6 (SSL):
> >   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
> > (MESOS-3049 )
> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> > (MESOS-4039 )
> >   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> > (MESOS-4035 )
> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> > (MESOS-3215 )
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> > (MESOS-4047 ,
> > MESOS-4053 )
> >
> > CentOS 7 (non-SSL):
> >   - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> > (MESOS-4677 )
> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> > (MESOS-4039 )
> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> > (MESOS-3215 )
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> > (MESOS-4047 ,
> > MESOS-4053 )
> >
> > CentOS 7 (SSL):
> >   - FetcherCacheTest.RemoveLRUCacheEntries
> > (MESOS-4156 )
> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> > (MESOS-4039 )
> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> > (MESOS-3215 )
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> > (MESOS-4047 ,
> > MESOS-4053 )
> >
> > Debian 8 (non-SSL): Success!
> > Debian 8 (SSL): Failed with MESOS-2017
> > 
> >
> > Ubuntu 12 (non-SSL):
> > Ubuntu 12 (SSL):
> > Ubuntu 14 (non-SSL):
> > Ubuntu 14 (SSL):
> >   - UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
> >   - UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> > (MESOS-4035 )
> >
> > Ubuntu 15 (non-SSL): Success!
> > Ubuntu 15 (SSL): Success!
> >
> > On 13 March 2016 at 18:43, Michael Park  wrote:
> >
> > > While the vote for this release was open until Fri Mar 11 23:59:59 EST
> > > 2016,
> > > I'm going to give it another 3 days since there has not been any -1
> > votes.
> > >
> > 

Re: Looking for shepherd (MESOS-4355 - Docker Volume Isolator)

2016-03-19 Thread Jie Yu
Guangya, I'd be happy to shepherd this work if no other committers
volunteer for this work.

- Jie

On Thu, Mar 17, 2016 at 6:05 AM, Guangya Liu  wrote:

> Hi,
>
> I am now working on the FS for MESOS-4355 with some EMC guys. Can anyone
> help shepherd this? There are some issues we need to discuss with the
> shepherd.
>
> Thanks,
>
> Guangya
>


Re: [RESULT][VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Vinod Kone
+1

@vinodkone

> On Mar 17, 2016, at 11:27 AM, Bill Farner  wrote:
> 
> Jake - I think that would be wonderful!
> 
>> On Thu, Mar 17, 2016 at 11:17 AM, Jake Farrell  wrote:
>> I've been maintaining a deb/rpm set for Mesos, and for Aurora and Thrift we
>> have been using the infra-supported Bintray to make it available to the
>> community via http://www.apache.org/dist/${project}/${os}
>> 
>> If there is interest I'd be happy to put some time into bringing my patches
>> into reviews and helping set up Jenkins tests, etc.
>> 
>> -Jake
>> 
>> 
>> 
>> 
>> 
>> 
>> On Thu, Mar 17, 2016 at 1:41 PM, Vinod Kone  wrote:
>> 
>> > The project itself doesn't officially release rpms/debs, but the community
>> > members do.  For example, Mesosphere is planning to release rpms/debs
>> > shortly.
>> >
>> > On Thu, Mar 17, 2016 at 10:38 AM, craig w  wrote:
>> >
>> > > Great news. Do the rpm's get automatically built and released or will
>> > they
>> > > come later this week?
>> > >
>> > > On Thu, Mar 17, 2016 at 1:28 PM, Vinod Kone 
>> > wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >>
>> > >> The vote for Mesos 0.28.0 (rc2) has passed with the
>> > >>
>> > >> following votes.
>> > >>
>> > >>
>> > >> +1 (Binding)
>> > >>
>> > >> --
>> > >>
>> > >> Vinod Kone
>> > >>
>> > >> Michael Park
>> > >>
>> > >> Kapil Arya
>> > >>
>> > >>
>> > >> +1 (Non-binding)
>> > >>
>> > >> --
>> > >>
>> > >> Greg Mann
>> > >>
>> > >> Daniel Osborne
>> > >>
>> > >> Jorg Schad
>> > >>
>> > >> Zhitao Li
>> > >>
>> > >>
>> > >> There were no 0 or -1 votes.
>> > >>
>> > >>
>> > >> Please find the release at:
>> > >>
>> > >> https://dist.apache.org/repos/dist/release/mesos/0.28.0
>> > >>
>> > >>
>> > >> It is recommended to use a mirror to download the release:
>> > >>
>> > >> http://www.apache.org/dyn/closer.cgi
>> > >>
>> > >>
>> > >> The CHANGELOG for the release is available at:
>> > >>
>> > >>
>> > >>
>> > https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.0
>> > >>
>> > >>
>> > >> The mesos-0.28.0.jar has been released to:
>> > >>
>> > >> https://repository.apache.org
>> > >>
>> > >>
>> > >> The website (http://mesos.apache.org) will be updated shortly to
>> > reflect
>> > >> this release.
>> > >>
>> > >>
>> > >> Thanks,
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > >
>> > > https://github.com/mindscratch
>> > > https://www.google.com/+CraigWickesser
>> > > https://twitter.com/mind_scratch
>> > > https://twitter.com/craig_links
>> > >
>> > >
>> >
> 


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-19 Thread Cong Wang
On Mon, Mar 7, 2016 at 8:29 PM, Michael Park  wrote:
> Please find the release at:
> https://dist.apache.org/repos/dist/release/mesos/0.27.2
>
> It is recommended to use a mirror to download the release:
> http://www.apache.org/dyn/closer.cgi
>
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.2
>
> The mesos-0.27.2.jar has been released to:
> https://repository.apache.org
>

So why are the git tags not synced to the GitHub mirror?

$ git tag -l | grep '0\.27\.2'
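
The empty result above comes from a local clone, so it can also mean the clone has not fetched tags yet. A minimal sketch of asking the remote directly with `git ls-remote --tags` is given here; the function names are hypothetical, and the remote URL passed in is whatever mirror you want to check.

```python
import subprocess


def parse_remote_tags(ls_remote_output):
    """Return the set of tag names found in `git ls-remote --tags` output."""
    tags = set()
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2 or not parts[1].startswith("refs/tags/"):
            continue
        name = parts[1][len("refs/tags/"):]
        # Annotated tags are listed twice; the peeled entry ends in "^{}".
        if name.endswith("^{}"):
            name = name[:-3]
        tags.add(name)
    return tags


def remote_has_tag(remote_url, tag):
    """Ask the remote directly, so a stale local clone cannot hide a tag."""
    output = subprocess.run(
        ["git", "ls-remote", "--tags", remote_url],
        check=True, capture_output=True, text=True,
    ).stdout
    return tag in parse_remote_tags(output)
```

For example, `remote_has_tag("https://github.com/apache/mesos.git", "0.27.2")` would report whether the mirror itself carries the tag, independent of local state.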


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 2:21 PM, Jie Yu  wrote:
> I understand your frustration. I am curious what review/ticket are you
> talking about, and who is the shepherd for your review/ticket?


Why not check your backlog for your answer? Or do you need me to write
a script to scan all the pending review requests for you?


>
> Mesos project has a clear guide how to contribute to the project, that's
> what the community has agreed on:
>
> https://github.com/apache/mesos/blob/master/docs/submitting-a-patch.md#before-you-start-writing-code
>

I assume this doesn't apply to your committers, at least BenM:

commit 152ac2b13916bcf2bb9e52accc4951c3ce5bfd76
Author: Benjamin Mahler 
Date:   Sun Feb 21 14:22:07 2016 +0100

Log the shutdown duration in the executor driver.

commit 1488f16d283f69b7dc96feaee91b04a09012ca4a
Author: Benjamin Mahler 
Date:   Sat Feb 20 17:35:30 2016 +0100


Added TASK_KILLING to the API changes in the CHANGELOG.


commit 978ccb5dd637f0e1577ecae1e21973f50429b04c
Author: Benjamin Mahler 
Date:   Sat Feb 20 17:28:58 2016 +0100


Added docker executor tests for TASK_KILLING.


commit ee86b13633a9469629dbd79681d0776b6020f76a
Author: Benjamin Mahler 
Date:   Sat Feb 20 16:18:22 2016 +0100


Added command executor tests for TASK_KILLING.


commit 25d303d8743b524c92627d48f7dfb7ac2a921ede
Author: Benjamin Mahler 
Date:   Sat Feb 20 15:31:28 2016 +0100


Fixed health check process leak when shutdown is called without killTask.



> "Find a shepherd to collaborate on your patch. A shepherd is a Mesos
> committer that will work with you to give you feedback on your proposed
> design, and to eventually commit your change into the Mesos source tree."
>

This doesn't work, and it needs to change. I already stated my reason in the
previous reply, which was just ignored, like many other requests.


Re: [RESULT][VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Vinod Kone
The project itself doesn't officially release rpms/debs, but the community
members do.  For example, Mesosphere is planning to release rpms/debs
shortly.

On Thu, Mar 17, 2016 at 10:38 AM, craig w  wrote:

> Great news. Do the rpm's get automatically built and released or will they
> come later this week?
>
> On Thu, Mar 17, 2016 at 1:28 PM, Vinod Kone  wrote:
>
>> Hi all,
>>
>>
>> The vote for Mesos 0.28.0 (rc2) has passed with the
>>
>> following votes.
>>
>>
>> +1 (Binding)
>>
>> --
>>
>> Vinod Kone
>>
>> Michael Park
>>
>> Kapil Arya
>>
>>
>> +1 (Non-binding)
>>
>> --
>>
>> Greg Mann
>>
>> Daniel Osborne
>>
>> Jorg Schad
>>
>> Zhitao Li
>>
>>
>> There were no 0 or -1 votes.
>>
>>
>> Please find the release at:
>>
>> https://dist.apache.org/repos/dist/release/mesos/0.28.0
>>
>>
>> It is recommended to use a mirror to download the release:
>>
>> http://www.apache.org/dyn/closer.cgi
>>
>>
>> The CHANGELOG for the release is available at:
>>
>>
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.0
>>
>>
>> The mesos-0.28.0.jar has been released to:
>>
>> https://repository.apache.org
>>
>>
>> The website (http://mesos.apache.org) will be updated shortly to reflect
>> this release.
>>
>>
>> Thanks,
>>
>
>
>
> --
>
> https://github.com/mindscratch
> https://www.google.com/+CraigWickesser
> https://twitter.com/mind_scratch
> https://twitter.com/craig_links
>
>
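
The tally quoted above can be read against the ASF release-vote rule as I understand it: at least three binding +1 votes, and more binding +1 than binding -1 votes. The sketch below encodes only that reading; the function name is hypothetical and the names are taken from the announcement.

```python
def release_vote_passes(binding_plus, binding_minus):
    """Majority-approval check: >= 3 binding +1 and more +1 than -1 (binding)."""
    return len(binding_plus) >= 3 and len(binding_plus) > len(binding_minus)


# Votes as listed in the 0.28.0 (rc2) result announcement.
binding_plus = ["Vinod Kone", "Michael Park", "Kapil Arya"]
non_binding_plus = ["Greg Mann", "Daniel Osborne", "Jorg Schad", "Zhitao Li"]
binding_minus = []

passed = release_vote_passes(binding_plus, binding_minus)
```

Non-binding votes are recorded but do not change the outcome in this reading, which is why the earlier call for more binding votes mattered.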


Re: [VOTE] Release Apache Mesos 0.26.1 (rc2)

2016-03-19 Thread Benjamin Mahler
These are captured under:
https://issues.apache.org/jira/browse/MESOS-4979

On Thu, Mar 17, 2016 at 5:04 PM, Benjamin Mahler  wrote:

> Thanks for the hard work! Do we need to backport the rmdir fixes on the
> outstanding release candidates?
>
> commit 5278e5cc50544ed7af28b15a1acd2b2e96a15a47
> Author: Jojy Varghese 
> Date:   Tue Mar 15 17:12:01 2016 -0700
>
> Added support for FTS_SLNONE in rmdir.
>
> Review: https://reviews.apache.org/r/44874/
>
> commit fbe1f37f65fd9f1d4f2c30a3cfd7a50df92ccc2c
> Author: Alex Clemmer 
> Date:   Tue Mar 1 23:29:21 2016 -0800
>
> Stout:[1/2] Fixed error reporting bug in `os::rmdir`.
>
> Review: https://reviews.apache.org/r/43907/
>
> commit f8b7ac28b1a918864a06b3f99f45b0257c7b6f68
> Author: Jojy Varghese 
> Date:   Tue Mar 1 14:32:13 2016 -0800
>
> Added FS_DEFAULT case in rmdir.
>
> We currently dont handle special files like device files in rmdir. This
> change adds FS_DEFAULT as one of the cases where we try to unlink a
> file. Reference: http://man7.org/linux/man-pages/man3/fts.3.html
>
> Review: https://reviews.apache.org/r/44230/
>
> On Wed, Mar 16, 2016 at 8:21 PM, Vinod Kone  wrote:
>
>> +1 (binding)
>>
>> Tested on ASF CI.
>>
>> On Sun, Mar 13, 2016 at 4:33 PM, Michael Park  wrote:
>>
>> > +1 (binding)
>> >
>> > Internal CI results with the corresponding JIRA tickets for the failed
>> > tests:
>> >
>> > CentOS 6 (non-SSL):
>> >   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
>> > (MESOS-3049 )
>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>> > (MESOS-4039 )
>> >   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
>> > (MESOS-4035 )
>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>> > (MESOS-3215 )
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>> > (MESOS-4047 ,
>> > MESOS-4053 )
>> >
>> > CentOS 6 (SSL):
>> >   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
>> > (MESOS-3049 )
>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>> > (MESOS-4039 )
>> >   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
>> > (MESOS-4035 )
>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>> > (MESOS-3215 )
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>> > (MESOS-4047 ,
>> > MESOS-4053 )
>> >
>> > CentOS 7 (non-SSL):
>> >   - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
>> > (MESOS-4677 )
>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>> > (MESOS-4039 )
>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>> > (MESOS-3215 )
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>> > (MESOS-4047 ,
>> > MESOS-4053 )
>> >
>> > CentOS 7 (SSL):
>> >   - FetcherCacheTest.RemoveLRUCacheEntries
>> > (MESOS-4156 )
>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>> > (MESOS-4039 )
>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>> > (MESOS-3215 )
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>> > (MESOS-4047 ,
>> > MESOS-4053 )
>> >
>> > Debian 8 (non-SSL): Success!
>> > Debian 8 (SSL): Failed with MESOS-2017
>> > 
>> >
>> > Ubuntu 12 (non-SSL):
>> > Ubuntu 12 (SSL):
>> > Ubuntu 14 (non-SSL):
>> > Ubuntu 14 (SSL):
>> >   - UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
>> >   - UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
>> > (MESOS-4035 )
>> >
>> > Ubuntu 15 (non-SSL): Success!
>> > 

Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Vinod Kone
Formalizing Mesos release strategy and support is something that I've been
thinking about a lot lately. In my mind, this is a blocker for Mesos
reaching 1.0.

I've filed https://issues.apache.org/jira/browse/MESOS-4962 to track this
work and added it to our current sprint; hopefully I/we can come up with a
design doc before end of the next week. If you have any suggestions or
requirements please feel free to chime in on the ticket or reach out on the
mailing list.

Thanks,

On Wed, Mar 16, 2016 at 12:37 PM, Jie Yu  wrote:

> Zhitao, that's a fair point. Can you add an agenda item to the next
> community sync to discuss this? Thanks!
>
> - Jie
>
> On Wed, Mar 16, 2016 at 12:16 PM, Zhitao Li  wrote:
>
>> Maybe we can try to draft a formal guideline about when/how something
>> should be back ported, and making sure interested parties in the community
>> have chance to get their voices heard?
>>
>> I'm also interested in knowing how much work it generates when they cut
>> with back port releases, and how the community could help.
>>
>> On Wed, Mar 16, 2016 at 12:11 PM, Cong Wang 
>> wrote:
>>
>> > On Wed, Mar 16, 2016 at 11:58 AM, Jie Yu  wrote:
>> > >
>> > > Currently, it's based on request. We definitely need to improve this
>> > part.
>> >
>> >
>> > It simply doesn't work, like many other review requests are burned or
>> take
>> > 6+ months to merge. I am sure you need to improve that too, but after
>> > watching Mesos community for months, I don't see any improvement yet.
>> >
>>
>>
>>
>> --
>> Cheers,
>>
>> Zhitao Li
>>
>
>


Re: RFC: RevocableInfo Changes

2016-03-19 Thread connor . p . d
Thanks for the good explanations so far Ben and Klaus.  Apologies if you guys 
already covered these questions in the meeting:

If throttling is tolerable but preemption is not, how would that be expressed? 
(Is that supported?)

How does this work with the QoS controller? Will there be a new correction type 
to indicate throttling, or does throttling happen "behind the agent's back"?

Thanks,
--
Connor

> On Mar 19, 2016, at 04:01, Klaus Ma  wrote:
> 
> @team, in the latest meeting, we agreed to keep the current name, ThrottleInfo.
> 
> If there are any more comments, please let me know.
> 
>> On Wednesday, March 16, 2016 at 9:32:37 PM UTC+8, Guangya Liu wrote:
>> Also, please share any comments you have on the name. The current name is
>> ThrottleInfo. The Kubernetes resource QoS design document uses
>> scavenging as the keyword for such behaviour, so a possible name here could
>> be ScavengeInfo. Please comment on those two names, or
>> propose a new name.
>> 
>> message RevocableInfo {
>>   message ThrottleInfo {}
>> 
>>   // If set, indicates that the resources may be throttled at
>>   // any time. Throttle-able resources can be used for tasks
>>   // that do not have strict performance requirements and are
>>   // capable of handling being throttled.
>>   optional ThrottleInfo throttle_info = 1;
>> }
>> 
>> On Wednesday, March 16, 2016 at 10:24:14 AM UTC+8, Klaus Ma wrote:
>>> 
>>> The patches are updated accordingly; JIRA: MESOS-3888 , RR: 
>>> https://reviews.apache.org/r/40375/ .
>>> 
>>> Thanks
>>> klaus
>>> 
 On Saturday, March 12, 2016 at 11:09:46 AM UTC+8, Benjamin Mahler wrote:
 Hey folks,
 
 In the resource allocation working group we've been looking into a few 
 projects that will make the allocator able to offer out resources as 
 revocable. For example:
 
 -We'll want to eventually allocate resources as revocable _by default_, 
 only allowing non-revocable when there are guarantees put in place (static 
 reservations or quota).
 
 -On the path to revocable by default, we can incrementally start to offer 
 certain resources as revocable. Consider when quota is set but the role 
 isn't using all of the quota. The unallocated quota can be offered to 
 other roles, but it should be revocable because we may revoke them should 
 the quota'ed role want to use the resources. Unused reservations fall into 
 a similar category.
 
 -Going revocable by default also allows us to enforce fairness in a 
 dynamically changing cluster by revoking resources as weights are changed, 
 frameworks are added or removed, etc.
 
 In this context, "revocable" means that the resources may be taken away 
 and the container will be destroyed. The meaning of "revocable" in the 
 context of usage oversubscription includes this, but also the container 
 may experience a throttling (e.g. lower cpu shares, less network priority, 
 etc).
 
 For this reason, and because we internally need to distinguish revocable 
 resources between the those that are generated by usage oversubscription 
 and those that are generated by the allocator, we're thinking of the 
 following change to the API:
 
 
 
 -  message RevocableInfo {}
 +  message RevocableInfo {
 +    message ThrottleInfo {}
 +
 +    // If set, indicates that the resources may be throttled at
 +    // any time. Throttle-able resources can be used for tasks
 +    // that do not have strict performance requirements and are
 +    // capable of handling being throttled.
 +    optional ThrottleInfo throttle_info;
 +  }
 
    // If this is set, the resources are revocable, i.e., any tasks or
 -  // executors launched using these resources could get preempted or
 -  // throttled at any time. This could be used by frameworks to run
 -  // best effort tasks that do not need strict uptime or performance
 +  // executors launched using these resources could be terminated at
 +  // any time. This could be used by frameworks to run
 +  // best effort tasks that do not need strict uptime
    // guarantees. Note that if this is set, 'disk' or 'reservation'
    // cannot be set.
    optional RevocableInfo revocable = 9;
 
 
 
 Essentially we want to distinguish between revocable and revocable + 
 throttle-able. This is because usage-oversubscription generates 
 throttle-able revocable resources, whereas the allocator does not. This 
 also solves our problem of distinguishing between these two kinds of 
 revocable resources internally.
 
 Feedback welcome!
 
 Ben
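
To make the three cases in the proposal concrete, here is a small Python model of the message shapes. It is a sketch only: the class and field names mirror the proposed protobuf, while `Resource` is reduced to a name plus the `revocable` field and `classify` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThrottleInfo:
    """Marker mirroring the empty ThrottleInfo message."""


@dataclass(frozen=True)
class RevocableInfo:
    # Set only for resources produced by usage oversubscription.
    throttle_info: Optional[ThrottleInfo] = None


@dataclass(frozen=True)
class Resource:
    name: str
    revocable: Optional[RevocableInfo] = None


def classify(resource):
    """Name the category a resource falls into under the proposed API."""
    if resource.revocable is None:
        return "non-revocable"
    if resource.revocable.throttle_info is not None:
        return "revocable + throttle-able"
    return "revocable"
```

Because `throttle_info` nests inside `RevocableInfo`, this shape has no way to say "throttle-able but not revocable", which seems to be the case Connor's first question is about.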
 


Re: [VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Zhitao Li
I don't think it's a blocking issue after some initial investigation.

Changing my vote to +1 (nonbinding)

On Wed, Mar 16, 2016 at 6:07 PM, Vinod Kone  wrote:

>
> On Wed, Mar 16, 2016 at 5:59 PM, Daniel Osborne <
> daniel.osbo...@metaswitch.com> wrote:
>
>> Is this issue a blocker? Are we moving to rc3 or proceeding with 0.28.0?
>>
>
> It was not marked as such, so I'm guessing not. @Jie and @Zhitao, can you
> confirm?
>
> Also, we still need some binding votes for this release to go official.
> @committers: can you please vote?
>



-- 
Cheers,

Zhitao Li


Re: [VOTE] Release Apache Mesos 0.26.1 (rc2)

2016-03-19 Thread Michael Park
Ben,

Do I understand correctly that this is not a regression, but a fix
important enough for us to backport?
I'm curious as to what makes it significant. Could you elaborate a little
as to what the consequences are?

Thanks!

MPark

On 18 March 2016 at 16:20, Benjamin Mahler  wrote:

> These are captured under:
> https://issues.apache.org/jira/browse/MESOS-4979
>
> On Thu, Mar 17, 2016 at 5:04 PM, Benjamin Mahler 
> wrote:
>
>> Thanks for the hard work! Do we need to backport the rmdir fixes on the
>> outstanding release candidates?
>>
>> commit 5278e5cc50544ed7af28b15a1acd2b2e96a15a47
>> Author: Jojy Varghese 
>> Date:   Tue Mar 15 17:12:01 2016 -0700
>>
>> Added support for FTS_SLNONE in rmdir.
>>
>> Review: https://reviews.apache.org/r/44874/
>>
>> commit fbe1f37f65fd9f1d4f2c30a3cfd7a50df92ccc2c
>> Author: Alex Clemmer 
>> Date:   Tue Mar 1 23:29:21 2016 -0800
>>
>> Stout:[1/2] Fixed error reporting bug in `os::rmdir`.
>>
>> Review: https://reviews.apache.org/r/43907/
>>
>> commit f8b7ac28b1a918864a06b3f99f45b0257c7b6f68
>> Author: Jojy Varghese 
>> Date:   Tue Mar 1 14:32:13 2016 -0800
>>
>> Added FS_DEFAULT case in rmdir.
>>
>> We currently dont handle special files like device files in rmdir.
>> This
>> change adds FS_DEFAULT as one of the cases where we try to unlink a
>> file. Reference: http://man7.org/linux/man-pages/man3/fts.3.html
>>
>> Review: https://reviews.apache.org/r/44230/
>>
>> On Wed, Mar 16, 2016 at 8:21 PM, Vinod Kone  wrote:
>>
>>> +1 (binding)
>>>
>>> Tested on ASF CI.
>>>
>>> On Sun, Mar 13, 2016 at 4:33 PM, Michael Park  wrote:
>>>
>>> > +1 (binding)
>>> >
>>> > Internal CI results with the corresponding JIRA tickets for the failed
>>> > tests:
>>> >
>>> > CentOS 6 (non-SSL):
>>> >   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
>>> > (MESOS-3049 )
>>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>>> > (MESOS-4039 )
>>> >   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
>>> > (MESOS-4035 )
>>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>>> > (MESOS-3215 )
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>>> > (MESOS-4047 ,
>>> > MESOS-4053 )
>>> >
>>> > CentOS 6 (SSL):
>>> >   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
>>> > (MESOS-3049 )
>>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>>> > (MESOS-4039 )
>>> >   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
>>> > (MESOS-4035 )
>>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>>> > (MESOS-3215 )
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>>> > (MESOS-4047 ,
>>> > MESOS-4053 )
>>> >
>>> > CentOS 7 (non-SSL):
>>> >   - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
>>> > (MESOS-4677 )
>>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>>> > (MESOS-4039 )
>>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>>> > (MESOS-3215 )
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>>> > (MESOS-4047 ,
>>> > MESOS-4053 )
>>> >
>>> > CentOS 7 (SSL):
>>> >   - FetcherCacheTest.RemoveLRUCacheEntries
>>> > (MESOS-4156 )
>>> >   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
>>> > (MESOS-4039 )
>>> >   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
>>> > (MESOS-3215 )
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>>> >   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
>>> > (MESOS-4047 ,
>>> > MESOS-4053 )
>>> >
>>> > Debian 8 (non-SSL): Success!
>>> > Debian 8 

Re: [VOTE] Release Apache Mesos 0.24.2 (rc2)

2016-03-19 Thread Michael Park
As there are insufficient votes on this rc along with a request
from Evan Krall to include additional fixes:
https://www.mail-archive.com/user@mesos.apache.org/msg06205.html,
I'm declaring this rc failed, and will be cutting an rc3 early next
week.

Thanks,

MPark

On 13 March 2016 at 20:57, Michael Park  wrote:

> +1 (binding)
>
> Internal CI results with the corresponding JIRA tickets for the failed
> tests:
>
> CentOS 6 (non-SSL):
> CentOS 6 (SSL):
>   - Failed with MESOS-2017
> 
>
> CentOS 7 (non-SSL):
> CentOS 7 (SSL):
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039 )
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215 )
>   - LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
>   - LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox
>   - LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost
>   - LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint
>   - LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem
> (MESOS-3296 )
>   - MesosContainerizerLaunchTest.ROOT_ChangeRootfs
> (MESOS-3410 )
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047 ,
> MESOS-4053 )
>   - PerfTest.ROOT_SamplePid
> (MESOS-3079 )
>
> Debian 8 (non-SSL):
> Debian 8 (SSL):
>   - Failed with MESOS-3964
> 
>
> Ubuntu 12 (non-SSL):
> Ubuntu 12 (SSL):
> Ubuntu 14 (non-SSL):
> Ubuntu 14 (SSL):
>   - UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
>   - UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> (MESOS-4035 )
>
> Ubuntu 15 (non-SSL):
>   - DockerContainerizerTest.ROOT_DOCKER_Logs
> (MESOS-4676 )
>   - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> (MESOS-4677 )
>
> Ubuntu 15 (SSL): Success!
>
> On 13 March 2016 at 18:42, Michael Park  wrote:
>
>> While the vote for this release was open until Fri Mar 11 23:59:59 EST
>> 2016,
>> I'm going to give it another 3 days since there has not been any -1 votes.
>>
>> The vote is extended until Wed Mar 16 23:59:59 EST 2016.
>>
>> On 10 March 2016 at 12:49, Greg Mann  wrote:
>>
>>> +1 (non-binding)
>>>
>>> Ran `sudo make check` on CentOS 7, using gcc with libevent and SSL
>>> enabled. All tests pass.
>>>
>>> I was also able to successfully test a simple upgrade scenario from
>>> 0.23.1 to 0.24.2-rc2 using the script found here:
>>> https://reviews.apache.org/r/44229/
>>>
>>> Cheers,
>>> Greg
>>>
>>>
>>> On Tue, Mar 8, 2016 at 6:50 PM, Michael Park  wrote:
>>>
 The link to the commit above points to the one on the master branch.
 The following is the one on the `0.24.2-rc2` branch: Fixed compiler
 warning
 in values tests.
 <
 https://github.com/apache/mesos/commit/afb8a0cffaf8bc235ce45087c80bafe87488dcd0
 >

 On 8 March 2016 at 21:21, Michael Park  wrote:

 > Hi all,
 >
 > Please vote on releasing the following candidate as Apache Mesos
 0.24.2.
 >
 >
 > 0.24.2 includes the following:
 >
 >
 
 >
 > The only diff with RC1 is the following: Fixed compiler warning in
 values
 > tests.
 > <
 https://github.com/apache/mesos/commit/bfeb070a2aef52f445eb057076d344fd184eb461
 >

 > As I described in the RC1 [VOTE] thread, even though this is a trivial
 > compile fix,
 > I decided to cut an RC2 in order to avoid breaking those who compile
 Mesos
 > from source.
 >
 > * Improvements
 > - Allocator filter performance
 > - Port Ranges performance
 > - UUID performance
 > - `/state` endpoint performance
 >   - GLOG performance
 >   - Configurable task/framework history
 >   - Offer filter timeout fix for backlogged allocator
 >
 > * Bugs
 >   - SSL
 >   - Libevent
 >   - Fixed point resources math
 >   - HDFS
 >   - Agent upgrade compatibility
 >   - Health checks
 >
 > The CHANGELOG for the release is available at:
 >
 >
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.2-rc2
 >
 >
 
 >
 > The candidate for Mesos 

Re: [VOTE] Release Apache Mesos 0.26.1 (rc2)

2016-03-19 Thread Vinod Kone
+1 (binding)

Tested on ASF CI.

On Sun, Mar 13, 2016 at 4:33 PM, Michael Park  wrote:

> +1 (binding)
>
> Internal CI results with the corresponding JIRA tickets for the failed
> tests:
>
> CentOS 6 (non-SSL):
>   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
> (MESOS-3049 )
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039 )
>   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> (MESOS-4035 )
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215 )
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047 ,
> MESOS-4053 )
>
> CentOS 6 (SSL):
>   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
> (MESOS-3049 )
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039 )
>   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> (MESOS-4035 )
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215 )
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047 ,
> MESOS-4053 )
>
> CentOS 7 (non-SSL):
>   - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> (MESOS-4677 )
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039 )
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215 )
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047 ,
> MESOS-4053 )
>
> CentOS 7 (SSL):
>   - FetcherCacheTest.RemoveLRUCacheEntries
> (MESOS-4156 )
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039 )
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215 )
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047 ,
> MESOS-4053 )
>
> Debian 8 (non-SSL): Success!
> Debian 8 (SSL): Failed with MESOS-2017
> 
>
> Ubuntu 12 (non-SSL):
> Ubuntu 12 (SSL):
> Ubuntu 14 (non-SSL):
> Ubuntu 14 (SSL):
>   - UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
>   - UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> (MESOS-4035 )
>
> Ubuntu 15 (non-SSL): Success!
> Ubuntu 15 (SSL): Success!
>
> On 13 March 2016 at 18:43, Michael Park  wrote:
>
> > While the vote for this release was open until Fri Mar 11 23:59:59 EST
> > 2016,
> > I'm going to give it another 3 days since there has not been any -1
> votes.
> >
> > The vote is extended until Wed Mar 16 23:59:59 EST 2016.
> >
> > On 10 March 2016 at 12:40, Michael Park  wrote:
> >
> >> Thanks Greg!
> >>
> >> On 10 March 2016 at 12:32, Greg Mann  wrote:
> >>
> >>> +1 (non-binding)
> >>>
> >>> Ran `sudo make check` on CentOS 7, using gcc with libevent and SSL
> >>> enabled. All tests pass.
> >>>
> >>> I was also able to successfully test a simple upgrade scenario from
> >>> 0.25.1-rc2 to 0.26.1-rc2 using the script found here:
> >>> https://reviews.apache.org/r/44229/
> >>>
> >>> Cheers,
> >>> Greg
> >>>
> >>>
> >>> On Tue, Mar 8, 2016 at 7:48 PM, Michael Park  wrote:
> >>>
>  Hi all,
> 
>  Please vote on releasing the following candidate as Apache Mesos
> 0.26.1.
> 
> 
>  0.26.1 includes the following:
> 
> 
> 
> 
>  The only diff with RC1 is the following: Fix CGROUPS_ROOT_* tests on
>  systemd platforms.
>  <
> https://github.com/apache/mesos/commit/a896cda4aa8bb9c9bbfba20dda4b68df8dbdf569
> >
>  This patch is necessary in order to make the `systemd` integration
> work
>  correctly.
>  It was part of 

Possibility to work on Apache Mesos for GSoC 2016

2016-03-19 Thread Milindu Sanoj Kumarage
Hi,

I'm Milindu Sanoj Kumarage, an undergraduate at the University of Colombo
School of Computing, majoring in Computer Science. I have just started my
4th year and am working on research based on distributed computing (on
Pub/Sub), so I'm working with tools and technologies related to
distributed computing. I'm writing to ask about the possibility of working
on Apache Mesos for GSoC 2016.

For last year's GSoC I worked with Apache Stratos, building a CLI for
Stratos in Python, and the year before that with the Sahana Software
Foundation, building a GIS module for Sahana Vesuvius. I have considerable
experience with cloud-related tools such as Google Compute Engine and AWS,
good experience with Docker and Kubernetes, and a good knowledge of
cgroups, containers, pods, etc. I also have a decent knowledge of Apache
Mesos and Marathon.

I'm really interested in working with the Apache Mesos project if possible. I
went through the JIRA as well. I'm seeking the community's guidance to
pick a good issue to work on for GSoC 2016. Please help me with this.

-- 
Regards,
Milindu Sanoj Kumarage
LinkedIn | GitHub | agentmilindu.com


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Vinod Kone
Cong, I understand your frustration with the review process and backports.
I've already created a ticket to track the latter. Would love your
input/feedback on it.

Regarding the former, we understand the pain. Our use of shepherds is a way
to tackle the problem. While it's not perfect it has definitely improved
the situation IMO. As Jie mentioned earlier, if you have some other
concrete suggestions to improve the process please join us in our community
syncs and help us! We will be grateful. It is not an easy problem to solve.

As an aside, I feel the tone of this thread has gone from being
constructive to being attacking and personal. This is not acceptable in the
Mesos community. Please refer to
http://www.apache.org/foundation/policies/conduct.html for our code of
conduct. This might be different from the Linux community.

On Wed, Mar 16, 2016 at 3:23 PM, Cong Wang  wrote:

> On Wed, Mar 16, 2016 at 2:50 PM, Jie Yu  wrote:
> >>
> >> Why not check your backlog for your answer? Or do you need me to write
> >> a script to scan all the pending review requests for you?
> >
> >
> > OK, i just looked at your pending patches:
> > https://reviews.apache.org/users/wangcong/?show-closed=0
> >
> > The associated tickets:
> > https://issues.apache.org/jira/browse/MESOS-4740
> > https://issues.apache.org/jira/browse/MESOS-2769
> > https://issues.apache.org/jira/browse/MESOS-2799
> >
> > (Some of the rb request does not have associated tickets)
>
> Even your rb doesn't have one:
> https://reviews.apache.org/r/44922/
>
> Not to mention those commits don't even have a RB at all...
>
> https://github.com/apache/mesos/commit/19dd467500ea31371dbebe73a4acfa0346aa9e40
>
> https://github.com/apache/mesos/commit/8c83b843dfcd08f82a394c29939f3c5940a78027
>
> https://github.com/apache/mesos/commit/2de2e5791a6c119e26e9e0bc35bdea4b2e54bbec
>
> What's your point here?
>
>
> >
> > I don't see a shepherd for MESOS-4740. Looks like Vinod is the shepherd
> for
> > MESOS-2769. MESOS-2799 does not have shepherd as well, but I think that
> > should be me. Are you still interested in shipping those patches?
>
> Whether to ship my patches or not is a trivial problem to fix, the
> bigger problem,
> which you keep ignoring, is why this rule (shepherd, ping etc.) can't be
> improved?
>
>
> >
> > I think you made a valid point that there is some problem regarding:
> > 1) Do we want to work on all created tickets (i.e., how do we decide if
> we
> > want to accept a ticket or not), and who decide that?
>
> Why always need a ticket? Some big feature does need one to track
> the subtickets, I definitely agree, but for things like a typo fix
> apparently not.
>
> It doesn't worth the time at all to create a ticket when you just
> want to fix some indention like the one you did:
>
>
> https://github.com/apache/mesos/commit/19dd467500ea31371dbebe73a4acfa0346aa9e40
>
> (although I think this worth a RB).
>
>
> > 2) Once we accept the ticket, how can we prioritize those tickets? Should
> > PMC members groom the accepted tickets regularly?
>
> Why prioritize tickets rather than just reviews? Code is on review board
> not
> in tickets, you should be able to evaluate the code to decide if it is
> ready to
> merge or not. Linux kernel never uses tickets to track features, once
> all reviews
> are addresses it would be merged.
>
>
> > 3) If no committer is volunteer for the accepted ticket, what's the
> > procedure in that case, should we pick one?
> > 4) What's the procedure of finding another shepherd if the original
> > shepherd does not have time for that anymore.
>
>
> Promote new committers, seriously.
>
> You have 20+ committers, if all of you are working, you should be able to
> handle all the reviews. The problem is apparently some of you are not
> working, so why not promote new ones to replace non-working ones?
>


Re: RFC: RevocableInfo Changes

2016-03-19 Thread Guangya Liu
Please also share any comments on the name. The current name is
*ThrottleInfo*. The Kubernetes resource QoS design document uses
"scavenging" as the keyword for this behaviour, so another possible name
is *ScavengeInfo*. Please comment on these two names, or propose a new
one.

  message RevocableInfo {
    message ThrottleInfo {}

    // If set, indicates that the resources may be throttled at
    // any time. Throttle-able resources can be used for tasks
    // that do not have strict performance requirements and are
    // capable of handling being throttled.
    optional ThrottleInfo throttle_info = 1;
  }

On Wednesday, March 16, 2016 at 10:24:14 AM UTC+8, Klaus Ma wrote:
>
> The patches are updated accordingly; JIRA: MESOS-3888, RR:
> https://reviews.apache.org/r/40375/ .
>
> Thanks
> klaus
>
> On Saturday, March 12, 2016 at 11:09:46 AM UTC+8, Benjamin Mahler wrote:
>>
>> Hey folks,
>>
>> In the resource allocation working group we've been looking into a few 
>> projects that will make the allocator able to offer out resources as 
>> revocable. For example:
>>
>> -We'll want to eventually allocate resources as revocable _by default_, 
>> only allowing non-revocable when there are guarantees put in place (static 
>> reservations or quota).
>>
>> -On the path to revocable by default, we can incrementally start to offer 
>> certain resources as revocable. Consider when quota is set but the role 
>> isn't using all of the quota. The unallocated quota can be offered to other 
>> roles, but it should be revocable because we may revoke them should the 
>> quota'ed role want to use the resources. Unused reservations fall into a 
>> similar category.
>>
>> -Going revocable by default also allows us to enforce fairness in a 
>> dynamically changing cluster by revoking resources as weights are changed, 
>> frameworks are added or removed, etc.
>>
>> In this context, "revocable" means that the resources may be taken away 
>> and the container will be destroyed. The meaning of "revocable" in the 
>> context of usage oversubscription includes this, but also the container may 
>> experience a throttling (e.g. lower cpu shares, less network priority, etc).
>>
>> For this reason, and because we internally need to distinguish revocable 
>> resources between those that are generated by usage oversubscription 
>> and those that are generated by the allocator, we're thinking of the 
>> following change to the API:
>>
>>
>>
>> -  message RevocableInfo {}
>> +  message RevocableInfo {
>> +message ThrottleInfo {}
>> +
>> +// If set, indicates that the resources may be throttled at
>> +// any time. Throttle-able resources can be used for tasks
>> +// that do not have strict performance requirements and are
>> +// capable of handling being throttled.
>> +optional ThrottleInfo throttle_info;
>> +  }
>>
>>// If this is set, the resources are revocable, i.e., any tasks or
>> -  // executors launched using these resources could get preempted or
>> -  // throttled at any time. This could be used by frameworks to run
>> -  // best effort tasks that do not need strict uptime or performance
>> +  // executors launched using these resources could be terminated at
>> +  // any time. This could be used by frameworks to run
>> +  // best effort tasks that do not need strict uptime
>>// guarantees. Note that if this is set, 'disk' or 'reservation'
>>// cannot be set.
>>optional RevocableInfo revocable = 9;
>>
>>
>>
>> Essentially we want to distinguish between revocable and revocable + 
>> throttle-able. This is because usage-oversubscription generates 
>> throttle-able revocable resources, whereas the allocator does not. This 
>> also solves our problem of distinguishing between these two kinds of 
>> revocable resources internally.
>>
>> Feedback welcome!
>>
>> Ben
>>
>>
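
The distinction drawn in the proposal above (revocable versus revocable and throttle-able) can be sketched in plain C++. This is only an illustration under stated assumptions: the structs below are hand-written stand-ins that mirror the proposed message names, not generated protobuf code, and `Resource`, `Kind`, and `classify` are hypothetical names that do not come from Mesos.

```cpp
#include <optional>
#include <string>

// Hypothetical mirror of the proposed nested message (carries no fields).
struct ThrottleInfo {};

// Hypothetical mirror of the proposed RevocableInfo message.
struct RevocableInfo {
  // Set only when the resources may also be throttled at any time.
  std::optional<ThrottleInfo> throttle_info;
};

// Hypothetical, heavily simplified resource; real resources carry more fields.
struct Resource {
  std::string name;
  // Unset means the resource is non-revocable.
  std::optional<RevocableInfo> revocable;
};

enum class Kind { NonRevocable, Revocable, RevocableAndThrottleable };

// Classifies a resource by which optional fields are present.
Kind classify(const Resource& resource) {
  if (!resource.revocable.has_value()) {
    return Kind::NonRevocable;
  }
  if (resource.revocable->throttle_info.has_value()) {
    return Kind::RevocableAndThrottleable;
  }
  return Kind::Revocable;
}
```

Under this reading, resources generated by usage oversubscription would set `throttle_info`, while revocable resources offered by the allocator would leave it unset.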

[RESULT][VOTE] Release Apache Mesos 0.28.0 (rc2)

2016-03-19 Thread Vinod Kone
Hi all,


The vote for Mesos 0.28.0 (rc2) has passed with the following votes.


+1 (Binding)

--

Vinod Kone

Michael Park

Kapil Arya


+1 (Non-binding)

--

Greg Mann

Daniel Osborne

Jorg Schad

Zhitao Li


There were no 0 or -1 votes.


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/0.28.0


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.0


The mesos-0.28.0.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,


Looking for shepherd (MESOS-4355 - Docker Volume Isolator)

2016-03-19 Thread Guangya Liu
Hi,

I am currently working on the FS for MESOS-4355 with some folks from EMC.
Can anyone help shepherd this? There are some issues we need to discuss
with the shepherd.

Thanks,

Guangya


Re: Recent changes to MesosTest helpers

2016-03-19 Thread Joseph Wu
We tried to reduce segfaults of this particular pattern (de-referencing
out-of-scope stack variables), as much as possible.  This means the test
suite shouldn't crash due to flaky tests anymore.  And the test suite
should run to completion each time.

(I also replaced a bunch of CHECK_* statements in the tests with ASSERT_*.)

On Wed, Mar 16, 2016 at 8:27 AM, haosdent  wrote:

> Does it exit like segment when CHECK_xxx failed? Or exit until finish all
> test cases?
> On Mar 16, 2016 11:03 PM, "Joseph Wu"  wrote:
>
> > Hello Devs & Contributors,
> >
> > We recently committed a refactor of the MesosTest suite and underlying
> > "Cluster" abstraction.  This affects almost every existing test and
> future
> > test, so here's a summary of what has changed and what you should be
> aware
> > of:
> >
> >- The purpose of the refactor is to make the entire test suite more
> >resilient to flaky tests.  Before, every test that used the "
> >MesosTest::StartMaster" and "MesosTest::StartSlave" helpers also
> needed
> >to have "Shutdown()" at the end of the test.  If the test failed an
> >assertion or expectation, it would exit before "Shutdown()" and would
> >very likely segfault, or hit a "__cxa_pure_virtual__" and exit with a
> >cryptic stack trace.
> >- The signatures of "MesosTest::StartMaster" and
> "MesosTest::StartSlave"
> >have changed.  Both test helpers now return a "
> >Try "Try".
> >The way to access the "PID" was changed from ".get()" to ".get()->pid".
> >- "Shutdown()" has been removed from MesosTest.  It is no longer
> >necessary.
> >- The MasterDetector has been exposed at the top-level for all slaves.
> >This slave dependency was originally populated by the "Cluster"
> > abstraction
> >(which held both Masters and Slaves).  In most cases, it will be
> > sufficient
> >to create the detector like:
> >
> >Owned<MasterDetector> detector = master->createDetector();
> >- If you need to restart the master in the middle of a test, just
> reset
> >the underlying "Owned" pointer.  i.e:
> >
> >master->reset();
> >master = StartMaster();
> >
> >Note: We can't assign master before resetting the pointer.  This is a
> >limitation related to supporting multiple masters in tests, which is
> >currently not possible.
> >- If you need to restart the slave in the middle of a test, there are
> >several ways:
> >   - To clean up any containers associated with that slave:
> >   slave = StartSlave(...);
> >
> >   Or:
> >   slave.reset();
> >   slave = StartSlave(...);
> >   - To stop a slave without container cleanup (equivalent to the
> >   original "MesosTest::Stop()"), use:
> >   slave->terminate();
> >
> >   Or:
> >   slave->shutdown();
> >
> >   These two methods emulate turning off the slave, but have slightly
> >   different semantics.  "Terminate" generally emulates a crash.
> > "Shutdown"
> >   emulates a graceful exit.
> >
> > If you have any further questions, feel free to ask.  There are still
> quite
> > a few improvements to make, but those will likely be less disruptive.
> >
> > ~Joseph
> >
>
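
The reason an explicit "Shutdown()" is no longer needed can be sketched with ordinary RAII. This is a minimal illustration, not the Mesos test API: `FakeMaster`, `startFakeMaster`, and the event log are hypothetical, and `std::unique_ptr` stands in for the owning pointer that the helpers return.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Records lifecycle events so their order can be inspected afterwards.
using EventLog = std::vector<std::string>;

// Hypothetical stand-in for a master started by a test helper. Cleanup lives
// in the destructor, so it runs even when a test body exits early.
class FakeMaster {
public:
  explicit FakeMaster(EventLog* log) : log_(log) {
    assert(log_ != nullptr);
    log_->push_back("master started");
  }
  ~FakeMaster() { log_->push_back("master stopped"); }

  FakeMaster(const FakeMaster&) = delete;
  FakeMaster& operator=(const FakeMaster&) = delete;

private:
  EventLog* log_;
};

// Hypothetical helper that returns an owning handle to a started master.
std::unique_ptr<FakeMaster> startFakeMaster(EventLog* log) {
  return std::make_unique<FakeMaster>(log);
}

// Simulates a test body that stops early, as a failed ASSERT_* would.
// No explicit shutdown call is made; leaving the scope destroys the handle.
EventLog runTestThatFailsEarly() {
  EventLog log;
  {
    std::unique_ptr<FakeMaster> master = startFakeMaster(&log);
    bool assertionHolds = false;
    if (!assertionHolds) {
      log.push_back("assertion failed");
    }
  }  // `master` is destroyed here and logs "master stopped".
  return log;
}

// Restart in the middle of a test: release the old instance first, then
// start a new one, mirroring the reset-then-assign order described above.
EventLog runRestart() {
  EventLog log;
  std::unique_ptr<FakeMaster> master = startFakeMaster(&log);
  master.reset();                  // The old master stops here.
  master = startFakeMaster(&log);  // The new master starts afterwards.
  master.reset();                  // Stop before the log is returned.
  return log;
}
```

In both functions, every "master started" entry is followed by a matching "master stopped" entry without any shutdown call in the test body, which is the property the refactor is described as providing.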


Re: RFC: RevocableInfo Changes

2016-03-19 Thread Klaus Ma
@team, in the latest meeting we agreed to keep the current name, *ThrottleInfo*.

If there are any more comments, please let me know.

On Wednesday, March 16, 2016 at 9:32:37 PM UTC+8, Guangya Liu wrote:
>
> Also please show your comments if any for the name here, the current name 
> is *ThrottleInfo*, in Kubernetes resources qos design document, they are 
> using scavenging as the key work for such behaviour, so a possible name 
> here could be *ScavengeInfo , *please show your comments if any for those 
> two names or even if you want to propose a new name here.
>
> message RevocableInfo {
> *message ThrottleInfo {}*
>
> *// If set, indicates that the resources may be throttled at*
> *// any time. Throttle-able resoruces can be used for tasks*
> *// that do not have strict performance requirements and are*
> *// capable of handling being throttled.*
> *optional ThrottleInfo throttle_info = 1;*
>   }
>
> On Wednesday, March 16, 2016 at 10:24:14 AM UTC+8, Klaus Ma wrote:
>>
>> The patches are updated accordingly; JIRA: MESOS-3888, RR:
>> https://reviews.apache.org/r/40375/ .
>>
>> Thanks
>> klaus
>>
>> On Saturday, March 12, 2016 at 11:09:46 AM UTC+8, Benjamin Mahler wrote:
>>>
>>> Hey folks,
>>>
>>> In the resource allocation working group we've been looking into a few 
>>> projects that will make the allocator able to offer out resources as 
>>> revocable. For example:
>>>
>>> -We'll want to eventually allocate resources as revocable _by default_, 
>>> only allowing non-revocable when there are guarantees put in place (static 
>>> reservations or quota).
>>>
>>> -On the path to revocable by default, we can incrementally start to 
>>> offer certain resources as revocable. Consider when quota is set but the 
>>> role isn't using all of the quota. The unallocated quota can be offered to 
>>> other roles, but it should be revocable because we may revoke them should 
>>> the quota'ed role want to use the resources. Unused reservations fall into 
>>> a similar category.
>>>
>>> -Going revocable by default also allows us to enforce fairness in a 
>>> dynamically changing cluster by revoking resources as weights are changed, 
>>> frameworks are added or removed, etc.
>>>
>>> In this context, "revocable" means that the resources may be taken away 
>>> and the container will be destroyed. The meaning of "revocable" in the 
>>> context of usage oversubscription includes this, but also the container may 
>>> experience a throttling (e.g. lower cpu shares, less network priority, etc).
>>>
>>> For this reason, and because we internally need to distinguish revocable 
>>> resources between those that are generated by usage oversubscription 
>>> and those that are generated by the allocator, we're thinking of the 
>>> following change to the API:
>>>
>>>
>>>
>>> -  message RevocableInfo {}
>>> +  message RevocableInfo {
>>> +message ThrottleInfo {}
>>> +
>>> +// If set, indicates that the resources may be throttled at
>>> +// any time. Throttle-able resources can be used for tasks
>>> +// that do not have strict performance requirements and are
>>> +// capable of handling being throttled.
>>> +optional ThrottleInfo throttle_info;
>>> +  }
>>>
>>>// If this is set, the resources are revocable, i.e., any tasks or
>>> -  // executors launched using these resources could get preempted or
>>> -  // throttled at any time. This could be used by frameworks to run
>>> -  // best effort tasks that do not need strict uptime or performance
>>> +  // executors launched using these resources could be terminated at
>>> +  // any time. This could be used by frameworks to run
>>> +  // best effort tasks that do not need strict uptime
>>>// guarantees. Note that if this is set, 'disk' or 'reservation'
>>>// cannot be set.
>>>optional RevocableInfo revocable = 9;
>>>
>>>
>>>
>>> Essentially we want to distinguish between revocable and revocable + 
>>> throttle-able. This is because usage-oversubscription generates 
>>> throttle-able revocable resources, whereas the allocator does not. This 
>>> also solves our problem of distinguishing between these two kinds of 
>>> revocable resources internally.
>>>
>>> Feedback welcome!
>>>
>>> Ben
>>>
>>>

Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Cong Wang
On Wed, Mar 16, 2016 at 2:50 PM, Jie Yu  wrote:
>>
>> Why not check your backlog for your answer? Or do you need me to write
>> a script to scan all the pending review requests for you?
>
>
> OK, i just looked at your pending patches:
> https://reviews.apache.org/users/wangcong/?show-closed=0
>
> The associated tickets:
> https://issues.apache.org/jira/browse/MESOS-4740
> https://issues.apache.org/jira/browse/MESOS-2769
> https://issues.apache.org/jira/browse/MESOS-2799
>
> (Some of the rb request does not have associated tickets)

Even your rb doesn't have one:
https://reviews.apache.org/r/44922/

Not to mention those commits don't even have a RB at all...
https://github.com/apache/mesos/commit/19dd467500ea31371dbebe73a4acfa0346aa9e40
https://github.com/apache/mesos/commit/8c83b843dfcd08f82a394c29939f3c5940a78027
https://github.com/apache/mesos/commit/2de2e5791a6c119e26e9e0bc35bdea4b2e54bbec

What's your point here?


>
> I don't see a shepherd for MESOS-4740. Looks like Vinod is the shepherd for
> MESOS-2769. MESOS-2799 does not have shepherd as well, but I think that
> should be me. Are you still interested in shipping those patches?

Whether to ship my patches or not is a trivial problem to fix, the
bigger problem, which you keep ignoring, is why this rule (shepherd,
ping etc.) can't be improved?


>
> I think you made a valid point that there is some problem regarding:
> 1) Do we want to work on all created tickets (i.e., how do we decide if we
> want to accept a ticket or not), and who decide that?

Why is a ticket always needed? A big feature does need one to track
the subtickets, I definitely agree, but a typo fix apparently
does not.

It isn't worth the time at all to create a ticket when you just
want to fix some indentation, like the one you did:

https://github.com/apache/mesos/commit/19dd467500ea31371dbebe73a4acfa0346aa9e40

(although I think this is worth an RB).


> 2) Once we accept the ticket, how can we prioritize those tickets? Should
> PMC members groom the accepted tickets regularly?

Why prioritize tickets rather than just reviews? Code is on review board, not
in tickets; you should be able to evaluate the code to decide whether it is
ready to merge. The Linux kernel never uses tickets to track features; once
all review comments are addressed, a patch is merged.


> 3) If no committer is volunteer for the accepted ticket, what's the
> procedure in that case, should we pick one?
> 4) What's the procedure of finding another shepherd if the original
> shepherd does not have time for that anymore.


Promote new committers, seriously.

You have 20+ committers, if all of you are working, you should be able to
handle all the reviews. The problem is apparently some of you are not
working, so why not promote new ones to replace non-working ones?


RE: Upgrade to clang-format-3.8

2016-03-19 Thread Yong Tang


> Subject: Re: Upgrade to clang-format-3.8
> From: jor...@gmail.com
> Date: Fri, 18 Mar 2016 09:45:22 -0700
> To: dev@mesos.apache.org
> 
> 
> > On Mar 17, 2016, at 10:41 AM, Yong Tang  
> > wrote:
> > 
> > Hi All
> > 
> > 
> > This email is to announce that the default configuration and the 
> > recommended version of the clang-format is being upgraded to 3.8 (from 3.5) 
> > in mesos.
> > 
> > 
> > In clang-format-3.8, the newly introduced option "AlignAfterOpenBracket: 
> > AlwaysBreak" closes the largest gap between ClangFormat and the style guide 
> > in mesos. It avoids  "jaggedness" in function calls and is worth migrating 
> > for.
> > 
> > 
> > Along with the changes in clang-format configuration 
> > (support/clang-format), the documentation (docs/clang-format.md) is also 
> > going to be updated to reflect changes in version and the recommended 
> > installation process.
> > 
> > 
> > More details about this upgrade could be found in MESOS-4906 
> > (https://issues.apache.org/jira/browse/MESOS-4906). By the way, thanks 
> > Michael for the help on this issue.
> 
> This sounds really promising. Is the plan to auto-format everything with 
> clang-format?

Hi James

My understanding is that there are two associated epics. One is the
clang-format integration, which handles formatting of whitespace, line
breaks, etc. The other is the clang-tidy integration, which helps handle
class/function naming conventions, etc.

clang-format integration:
https://issues.apache.org/jira/browse/MESOS-4905
clang-tidy integration:
https://issues.apache.org/jira/browse/MESOS-4907

Thanks
Yong

  

Compute event at Twitter HQ - 03/31

2016-03-19 Thread Ian Downes
Hello everyone,

I'd like to call attention to an event the Compute group at Twitter is
holding at the end of the month where there will be a few
Aurora/Mesos-related talks:

1. David Robinson, one of our SREs, will talk about how our small team
of SREs manages what is possibly the largest Mesos cluster in
existence.
2. David McLaughlin, Aurora committer/PMC member, will talk about
Workflows, an internal tool we've built to orchestrate deployments
across Aurora clusters.
3. David Hagar, Engineering Manager at TellApart, will talk about
running Aurora/Mesos in AWS.

On top of that there will be lots of other great talks about how we
run the entirety of our compute infrastructure.

The event is on the evening of March 31st at Twitter HQ in San
Francisco. I hope to see many of you there!

https://www.eventbrite.com/e/compute-tickets-22811196904

Thanks,

Ian

Note: This is nearly a straight copy of an email that Joshua sent out
to the Aurora mailing lists.


Re: Recent changes to MesosTest helpers

2016-03-19 Thread haosdent
Got it, thank you for the explanation.

On Thu, Mar 17, 2016 at 12:51 AM, Joseph Wu  wrote:

> We tried to reduce segfaults of this particular pattern (de-referencing
> out-of-scope stack variables), as much as possible.  This means the test
> suite shouldn't crash due to flaky tests anymore.  And the test suite
> should run to completion each time.
>
> (I also replaced a bunch of CHECK_* statements in the tests with ASSERT_*.)
>
> On Wed, Mar 16, 2016 at 8:27 AM, haosdent  wrote:
>
> > Does it exit like segment when CHECK_xxx failed? Or exit until finish all
> > test cases?
> > On Mar 16, 2016 11:03 PM, "Joseph Wu"  wrote:
> >
> > > Hello Devs & Contributors,
> > >
> > > We recently committed a refactor of the MesosTest suite and underlying
> > > "Cluster" abstraction.  This affects almost every existing test and
> > future
> > > test, so here's a summary of what has changed and what you should be
> > aware
> > > of:
> > >
> > >- The purpose of the refactor is to make the entire test suite more
> > >resilient to flaky tests.  Before, every test that used the "
> > >MesosTest::StartMaster" and "MesosTest::StartSlave" helpers also
> > needed
> > >to have "Shutdown()" at the end of the test.  If the test failed an
> > >assertion or expectation, it would exit before "Shutdown()" and
> would
> > >very likely segfault, or hit a "__cxa_pure_virtual__" and exit with
> a
> > >cryptic stack trace.
> > >- The signatures of "MesosTest::StartMaster" and
> > "MesosTest::StartSlave"
> > >have changed.  Both test helpers now return a "
> > >Try > "Try".
> > >The way to access the "PID" was changed from ".get()" to
> ".get()->pid".
> > >- "Shutdown()" has been removed from MesosTest.  It is no longer
> > >necessary.
> > >- The MasterDetector has been exposed at the top-level for all
> slaves.
> > >This slave dependency was originally populated by the "Cluster"
> > > abstraction
> > >(which held both Masters and Slaves).  In most cases, it will be
> > > sufficient
> > >to create the detector like:
> > >
> > >Owned<MasterDetector> detector = master->createDetector();
> > >- If you need to restart the master in the middle of a test, just
> > reset
> > >the underlying "Owned" pointer.  i.e:
> > >
> > >master->reset();
> > >master = StartMaster();
> > >
> > >Note: We can't assign master before resetting the pointer.  This is
> a
> > >limitation related to supporting multiple masters in tests, which is
> > >currently not possible.
> > >- If you need to restart the slave in the middle of a test, there
> are
> > >several ways:
> > >   - To clean up any containers associated with that slave:
> > >   slave = StartSlave(...);
> > >
> > >   Or:
> > >   slave.reset();
> > >   slave = StartSlave(...);
> > >   - To stop a slave without container cleanup (equivalent to the
> > >   original "MesosTest::Stop()"), use:
> > >   slave->terminate();
> > >
> > >   Or:
> > >   slave->shutdown();
> > >
> > >   These two methods emulate turning off the slave, but have
> slightly
> > >   different semantics.  "Terminate" generally emulates a crash.
> > > "Shutdown"
> > >   emulates a graceful exit.
> > >
> > > If you have any further questions, feel free to ask.  There are still
> > quite
> > > a few improvements to make, but those will likely be less disruptive.
> > >
> > > ~Joseph
> > >
> >
>



-- 
Best Regards,
Haosdent Huang


Re: Recent changes to MesosTest helpers

2016-03-19 Thread haosdent
Does it exit, as with a segfault, when a CHECK_xxx fails? Or does it exit
only after finishing all test cases?
On Mar 16, 2016 11:03 PM, "Joseph Wu"  wrote:

> Hello Devs & Contributors,
>
> We recently committed a refactor of the MesosTest suite and underlying
> "Cluster" abstraction.  This affects almost every existing test and future
> test, so here's a summary of what has changed and what you should be aware
> of:
>
>- The purpose of the refactor is to make the entire test suite more
>resilient to flaky tests.  Before, every test that used the "
>MesosTest::StartMaster" and "MesosTest::StartSlave" helpers also needed
>to have "Shutdown()" at the end of the test.  If the test failed an
>assertion or expectation, it would exit before "Shutdown()" and would
>very likely segfault, or hit a "__cxa_pure_virtual__" and exit with a
>cryptic stack trace.
>- The signatures of "MesosTest::StartMaster" and "MesosTest::StartSlave"
>have changed.  Both test helpers now return a "
>Try".
>The way to access the "PID" was changed from ".get()" to ".get()->pid".
>- "Shutdown()" has been removed from MesosTest.  It is no longer
>necessary.
>- The MasterDetector has been exposed at the top-level for all slaves.
>This slave dependency was originally populated by the "Cluster"
> abstraction
>(which held both Masters and Slaves).  In most cases, it will be
> sufficient
>to create the detector like:
>
>Owned<MasterDetector> detector = master->createDetector();
>- If you need to restart the master in the middle of a test, just reset
>the underlying "Owned" pointer.  i.e:
>
>master->reset();
>master = StartMaster();
>
>Note: We can't assign master before resetting the pointer.  This is a
>limitation related to supporting multiple masters in tests, which is
>currently not possible.
>- If you need to restart the slave in the middle of a test, there are
>several ways:
>   - To clean up any containers associated with that slave:
>   slave = StartSlave(...);
>
>   Or:
>   slave.reset();
>   slave = StartSlave(...);
>   - To stop a slave without container cleanup (equivalent to the
>   original "MesosTest::Stop()"), use:
>   slave->terminate();
>
>   Or:
>   slave->shutdown();
>
>   These two methods emulate turning off the slave, but have slightly
>   different semantics.  "Terminate" generally emulates a crash.
> "Shutdown"
>   emulates a graceful exit.
>
> If you have any further questions, feel free to ask.  There are still quite
> a few improvements to make, but those will likely be less disruptive.
>
> ~Joseph
>


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-19 Thread Michael Browning
I agree with Kevin -- tags are immutable, so they're naturally suited
for labeling releases, which ought to be immutable too.

On Fri, Mar 18, 2016 at 4:59 PM, Kevin Klues  wrote:
> I respectfully disagree.
>
> The whole purpose of tags is to mark permanent things like releases,
> whereas branches are designed as temporary lines of development that
> come and go (and grow and shrink) dynamically all the time.
>
> On Fri, Mar 18, 2016 at 4:04 PM, Jie Yu  wrote:
>> I like the idea of using branches to manage releases.
>>
>> We can use that to manage point releases and backports as well.
>>
>> Say we want to cut 0.29.0 now, we fork a branch 0.29.0 and tag RCs in that
>> branch. Once the RC is accepted, the head of that branch will become the
>> release.
>>
>> Then, we immediately fork that branch and create a 0.29.1 branch.
>>
>> When a new bug fix is committed on the trunk, the committer will decide
>> whether it'll affect the old releases (a bounded number, we can decide that
>> later). If it does, the committer of that patch should also cherry-pick
>> that patch to the point releases (e.g., 0.29.1 in this case). We can do
>> time-based point releases.
>>
>> - Jie
>>
>> On Fri, Mar 18, 2016 at 1:35 PM, Cong Wang  wrote:
>>
>>> On Wed, Mar 16, 2016 at 11:56 AM, Joseph Wu  wrote:
>>> > Cong Wang,
>>> >
>>> > The tags are sync'd.  See: https://github.com/apache/mesos/releases
>>> >
>>> > You might not have done: git pull --tags
>>>
>>>
>>> Yeah, I figured it out by myself too. This is why I hate tags personally,
>>> branches are better since they are fetched without additional parameters.
>>>
>>> Any reason why Mesos maintainers picked tags over branches to manage
>>> releases? Just curious...
>>>
>
>
>
> --
> ~Kevin


Re: Recent changes to MesosTest helpers

2016-03-19 Thread Vinod Kone
This is awesome! Thanks for working on this.

On Wed, Mar 16, 2016 at 9:53 AM, haosdent  wrote:

> Got it, thank you for explanation.
>
> On Thu, Mar 17, 2016 at 12:51 AM, Joseph Wu  wrote:
>
> > We tried to reduce segfaults of this particular pattern (de-referencing
> > out-of-scope stack variables), as much as possible.  This means the test
> > suite shouldn't crash due to flaky tests anymore.  And the test suite
> > should run to completion each time.
> >
> > (I also replaced a bunch of CHECK_* statements in the tests with
> ASSERT_*.)
> >
> > On Wed, Mar 16, 2016 at 8:27 AM, haosdent  wrote:
> >
> > > Does it exit like segment when CHECK_xxx failed? Or exit until finish
> all
> > > test cases?
> > > On Mar 16, 2016 11:03 PM, "Joseph Wu"  wrote:
> > >
> > > > Hello Devs & Contributors,
> > > >
> > > > > We recently committed a refactor of the MesosTest suite and underlying
> > > > > "Cluster" abstraction.  This affects almost every existing test and
> > > > > future test, so here's a summary of what has changed and what you
> > > > > should be aware of:
> > > >- The purpose of the refactor is to make the entire test suite
> more
> > > >resilient to flaky tests.  Before, every test that used the "
> > > >MesosTest::StartMaster" and "MesosTest::StartSlave" helpers also
> > > needed
> > > >to have "Shutdown()" at the end of the test.  If the test failed
> an
> > > >assertion or expectation, it would exit before "Shutdown()" and
> > would
> > > >very likely segfault, or hit a "__cxa_pure_virtual__" and exit
> with
> > a
> > > >cryptic stack trace.
> > > >- The signatures of "MesosTest::StartMaster" and
> > > "MesosTest::StartSlave"
> > > >have changed.  Both test helpers now return a "
> > > >Try > > "Try".
> > > >To way to access the "PID" was changed from ".get()" to
> > ".get()->pid".
> > > >- "Shutdown()" has been removed from MesosTest.  It is no longer
> > > >necessary.
> > > >- The MasterDetector has been exposed at the top-level for all
> > slaves.
> > > >This slave dependency was originally populated by the "Cluster"
> > > > abstraction
> > > >(which held both Masters and Slaves).  In most cases, it will be
> > > > sufficient
> > > >to create the detector like:
> > > >
> > > >Owned detector = master->createDetector();
> > > >- If you need to restart the master in the middle of a test, just
> > > reset
> > > >the underlying "Owned" pointer.  i.e:
> > > >
> > > >master->reset();
> > > >master = StartMaster();
> > > >
> > > >Note: We can't assign master before resetting the pointer.  This
> is
> > a
> > > >limitation related to supporting multiple masters in tests, which
> is
> > > >currently not possible.
> > > >- If you need to restart the slave in the middle of a test, there
> > are
> > > >several ways:
> > > >   - To clean up any containers associated with that slave:
> > > >   slave = StartSlave(...);
> > > >
> > > >   Or:
> > > >   slave.reset();
> > > >   slave = StartSlave(...);
> > > >   - To stop a slave without container cleanup (equivalent to the
> > > >   original "MesosTest::Stop()"), use:
> > > >   slave->terminate();
> > > >
> > > >   Or:
> > > >   slave->shutdown();
> > > >
> > > >   These two methods emulate turning off the slave, but have
> > slightly
> > > >   different semantics.  "Terminate" generally emulates a crash.
> > > > "Shutdown"
> > > >   emulates a graceful exit.
> > > >
> > > > If you have any further questions, feel free to ask.  There are still
> > > quite
> > > > a few improvements to make, but those will likely be less disruptive.
> > > >
> > > > ~Joseph
> > > >
> > >
> >
>
>
>
> --
> Best Regards,
> Haosdent Huang
>
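
For anyone who wants to see why dropping "Shutdown()" makes failing tests safer, here is a minimal, self-contained sketch of the idea. It does not use the real MesosTest helpers or cluster::Master: FakeMaster, StartFakeMaster, RunTestBody and RestartWithReset are hypothetical names made up for illustration, and std::unique_ptr stands in for Owned. The early "return false" plays the role of a failed ASSERT_*, which returns from the current test function.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a test master; NOT the real cluster::Master.
class FakeMaster {
public:
  explicit FakeMaster(std::vector<std::string>* log) : log_(log) {
    log_->push_back("master started");
  }

  // Cleanup lives in the destructor, so it runs on every exit path.
  ~FakeMaster() { log_->push_back("master stopped"); }

private:
  std::vector<std::string>* log_;
};

// Hypothetical helper; std::unique_ptr stands in for Owned here.
std::unique_ptr<FakeMaster> StartFakeMaster(std::vector<std::string>* log) {
  assert(log != nullptr);
  return std::unique_ptr<FakeMaster>(new FakeMaster(log));
}

// Simulates a test body that returns early, as a failed ASSERT_* would.
bool RunTestBody(std::vector<std::string>* log, bool failEarly) {
  std::unique_ptr<FakeMaster> master = StartFakeMaster(log);
  if (failEarly) {
    return false;  // No explicit shutdown call; the destructor still runs.
  }
  log->push_back("test body finished");
  return true;
}

// Restart with reset-then-assign so the old instance is stopped first.
void RestartWithReset(
    std::unique_ptr<FakeMaster>* master, std::vector<std::string>* log) {
  master->reset();
  *master = StartFakeMaster(log);
}
```

Calling RunTestBody(&log, true) leaves "master started" followed by "master stopped" in the log, even though the body never reached its end. RestartWithReset mirrors the reset-then-assign order from the summary: in this sketch, a plain assignment would construct the new instance before the old one is destroyed, so the two would briefly coexist.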


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Zameer Manji
Cong brings up a good point here. Currently Mesos has a very aggressive
release cadence. This raises several questions for cluster operators
and framework authors.

   - What is the support from the community/committers for each release?
   - Do cluster operators and framework authors need to move at the same
   pace as the community?
   - Will bugfixes be automatically backported?

The lack of clarity here can result in several issues because it is easy
for the Mesos PMC to cut releases quickly, but it isn't easy for people
with existing clusters to upgrade at that pace. An aggressive release
policy without clear support for older releases can leave several users in
a bad position where they might need to upgrade Mesos through one (or
more!) releases just to get a critical bugfix.



On Wed, Mar 16, 2016 at 11:44 AM, Cong Wang  wrote:

> On Tue, Mar 15, 2016 at 2:39 PM, Jie Yu  wrote:
> > Mesos currently has no notion of long term stable releases (i.e., LTS). I
> > think the consensus in the last community sync was to introduce LTS after
> > 1.0.
>
>
> You don't need LTS as kernel, even talking about short term stable releases
> like 0.27.2 (?), they look horrible too, I don't see any git tags or
> branches for
> these releases, just a tar ball?! Huh...
>
>
> >
> > 0.27.2 has already been released. Looks like we need 0.27.3 if we want to
> > backport it.
>
>
> What determines which patches need to backport for Mesos community?
> It doesn't look like every bug fix is evaluated and considered after they
> are merged into master branch.
>
> >
> > I am OK with back porting it. Then the question is that whether we want
> > to backport it to other releases as well.
> >
>
> It should be backported to whichever releases it applies to and you
> support, I don't see Mesos community has such a procedure.
>
> --
> Zameer Manji
>
>


Re: Backport r/44230 to 0.27 branch

2016-03-19 Thread Jie Yu
>
> You don't need LTS as kernel, even talking about short term stable releases
> like 0.27.2 (?), they look horrible too, I don't see any git tags or
> branches for
> these releases, just a tar ball?! Huh...


Jies-MacBook-Pro:mesos jie$ git tag | grep 0.27
0.27.0
0.27.0-rc1
0.27.0-rc2
0.27.1
0.27.1-rc1
0.27.2
0.27.2-rc1

> What determines which patches need to backport for Mesos community?
> It doesn't look like every bug fix is evaluated and considered after they
> are merged into master branch.


Currently, backports are done on request. We definitely need to improve this
part. Note that Mesos is a young, fast-moving project; comparing it to
Linux (20+ years) is not a fair comparison.

On Wed, Mar 16, 2016 at 11:44 AM, Cong Wang  wrote:

> On Tue, Mar 15, 2016 at 2:39 PM, Jie Yu  wrote:
> > Mesos currently has no notion of long term stable releases (i.e., LTS). I
> > think the consensus in the last community sync was to introduce LTS after
> > 1.0.
>
>
> You don't need LTS as kernel, even talking about short term stable releases
> like 0.27.2 (?), they look horrible too, I don't see any git tags or
> branches for
> these releases, just a tar ball?! Huh...
>
>
> >
> > 0.27.2 has already been released. Looks like we need 0.27.3 if we want to
> > backport it.
>
>
> What determines which patches need to backport for Mesos community?
> It doesn't look like every bug fix is evaluated and considered after they
> are merged into master branch.
>
> >
> > I am OK with back porting it. Then the question is that whether we want
> > to backport it to other releases as well.
> >
>
> It should be backported to whichever releases it applies to and you
> support, I don't see Mesos community has such a procedure.
>


Re: shepherd for MESOS-4735, and proposal

2016-03-19 Thread Michael Browning
Bump. This should be a pretty small change, any takers?

On Mon, Mar 14, 2016 at 6:39 PM, Shuai Lin  wrote:
> +1 for the new `filename` field for the URI. It would also be useful for
> implementing base64:// or data:// scheme uri fetcher, so that the scheduler
> can pass arbitary file blobs to the task directly without resorting to a
> custom executor. (https://issues.apache.org/jira/browse/MESOS-4524)
>
> On Tue, Mar 15, 2016 at 8:18 AM, Michael Browning 
> wrote:
>
>> Hi all,
>>
>> Looking for a shepherd for this task:
>>
>> https://issues.apache.org/jira/browse/MESOS-4735
>>
>> As Zhitao mentioned in the ticket, the frequently-encountered
>> inability to extract archives from webhdfs-fetched files due to the
>> inclusion of things like query params in the resulting filename is
>> kind of a blocker for us, and there seems to be interest from others
>> too, q.v.:
>>
>> https://issues.apache.org/jira/browse/MESOS-4779
>>
>> My proposed fix is basically that proposed by Erik in the ticket: add
>> an optional string `filename` field to the CommandInfo.URI protobuf --
>> when omitted, the old behavior will be retained, but when specified
>> the file will be saved to the sandbox with that filename. Any feedback
>> would be appreciated.
>>
>> Thanks,
>> Michael
>>
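
To make the proposal concrete, here is a rough sketch of the naming rule it describes. Everything in it is an assumption for illustration: UriSpec is a hypothetical stand-in for the relevant part of CommandInfo.URI, an empty string stands in for "field not set", and Basename is a simplified guess at the current behavior (take everything after the last '/'), not the fetcher's actual code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the relevant part of CommandInfo.URI.
struct UriSpec {
  std::string value;     // The URI to fetch.
  std::string filename;  // Proposed optional field; empty means "not set".
};

// Simplified guess at the old behavior: everything after the last '/'.
// Query parameters are not stripped, which is the problem in the ticket.
std::string Basename(const std::string& uri) {
  const std::string::size_type slash = uri.find_last_of('/');
  return slash == std::string::npos ? uri : uri.substr(slash + 1);
}

// Name used in the sandbox: the explicit filename if given, else the
// old basename-derived name.
std::string SandboxFilename(const UriSpec& uri) {
  assert(!uri.value.empty());
  return uri.filename.empty() ? Basename(uri.value) : uri.filename;
}
```

With a made-up webhdfs-style URI ending in "app.tar.gz?op=OPEN", leaving filename empty yields "app.tar.gz?op=OPEN", which no longer looks like an archive to extension-based extraction. Setting filename to "app.tar.gz" yields exactly that name, while URIs that omit the field keep the old behavior.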


Re: Compute event at Twitter HQ - 03/31

2016-03-19 Thread haosdent
Will there be a YouTube live link?

On Thu, Mar 17, 2016 at 12:38 AM, Ian Downes  wrote:

> Hello everyone,
>
> I'd like to call attention to an event the Compute group at Twitter is
> holding at the end of the month where there will be a few
> Aurora/Mesos-related talks:
>
> 1. David Robinson, one of our SREs, will talk about how our small team
> of SREs manages what is possibly the largest Mesos cluster in
> existence.
> 2. David McLaughlin, Aurora committer/PMC member, will talk about
> Workflows, an internal tool we've built to orchestrate deployments
> across Aurora clusters.
> 3. David Hagar, Engineering Manager at TellApart, will talk about
> running Aurora/Mesos in AWS.
>
> On top of that there will be lots of other great talks about how we
> run the entirety of our compute infrastructure.
>
> The event is on the evening of March 31st at Twitter HQ in San
> Francisco. I hope to see many of you there!
>
> https://www.eventbrite.com/e/compute-tickets-22811196904
>
> Thanks,
>
> Ian
>
> Note: This is nearly a straight copy of an email that Joshua sent out
> to the Aurora mailing lists.
>



-- 
Best Regards,
Haosdent Huang


Re: Upgrade to clang-format-3.8

2016-03-19 Thread James Peach

> On Mar 17, 2016, at 10:41 AM, Yong Tang  wrote:
> 
> Hi All
> 
> 
> This email is to announce that the default configuration and the recommended 
> version of the clang-format is being upgraded to 3.8 (from 3.5) in mesos.
> 
> 
> In clang-format-3.8, the newly introduced option "AlignAfterOpenBracket: 
> AlwaysBreak" closes the largest gap between ClangFormat and the style guide 
> in mesos. It avoids  "jaggedness" in function calls and is worth migrating 
> for.
> 
> 
> Along with the changes in clang-format configuration (support/clang-format), 
> the documentation (docs/clang-format.md) is also going to be updated to 
> reflect changes in version and the recommended installation process.
> 
> 
> More details about this upgrade could be found in MESOS-4906 
> (https://issues.apache.org/jira/browse/MESOS-4906). By the way, thanks 
> Michael for the help on this issue.

This sounds really promising. Is the plan to auto-format everything with 
clang-format?

Unzip should work in non interactive mode

2016-03-19 Thread Tomek Janiszewski
Hi

Consider a situation where a deployed zip file is malformed and contains
duplicated files (e.g., dist zips generated by gradle). When the fetcher
downloads such a file and tries to uncompress it, the deployment hangs in the
staged phase because unzip prompts whether the file should be replaced. unzip
should either overwrite the file or fail with an error. I created an issue for
this: MESOS-4885.
It looks like an easy fix; does anyone want to shepherd it?

Best
Tomek