[jira] [Assigned] (MESOS-1162) Add a 'Percentage' abstraction.

2017-07-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-1162:
--

Assignee: (was: Marco Massenzio)

> Add a 'Percentage' abstraction.
> ---
>
> Key: MESOS-1162
> URL: https://issues.apache.org/jira/browse/MESOS-1162
> Project: Mesos
>  Issue Type: Improvement
>  Components: stout
>Reporter: Benjamin Mahler
>Priority: Minor
>  Labels: mesosphere
>
> It is currently difficult to add a percentage-based flag, if one desires it 
> to be specified in the "0%"-"100%" form. This requires creating a {{string}} 
> flag and doing all the parsing / validation manually.
> An alternative is to use a {{double}} flag with 0.0-1.0 being the valid 
> range; however, this may not read as intuitively to operators.
> Another alternative is to use a {{double}} flag with 0.0-100.0 as the valid 
> range, with the '%' being implicit.
> However, these two alternative techniques can lead to confusion since it's 
> not clear how we're interpreting the value. Requiring the '%' symbol is nice 
> because it leaves no room for ambiguity.
> I would propose adding a 'Percentage' abstraction in stout that provides the 
> parsing logic for use in flags. Percentages can basically be a wrapper around 
> the underlying {{double}}.
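To make the proposal above concrete, here is a minimal sketch of what such a wrapper could look like. It is an illustration only, not the stout API: the class name `Percentage`, the `parse()` signature and the `percent()` / `fraction()` accessors are all assumptions, and integration with the flags machinery is left out.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical sketch, NOT the stout API: a thin wrapper around a double
// that only accepts input written with an explicit trailing '%'.
class Percentage
{
public:
  Percentage() : value_(0.0) {}

  // Parses strings such as "0%", "42.5%" or "100%" into `result`.
  // Returns false (leaving `result` untouched) for anything else:
  // a missing '%', non-numeric text, or a value outside 0-100.
  static bool parse(const std::string& s, Percentage* result)
  {
    if (s.size() < 2 || s[s.size() - 1] != '%') {
      return false;
    }

    const std::string number = s.substr(0, s.size() - 1);

    // Only plain decimal notation is accepted (no sign, exponent or spaces).
    if (number.find_first_not_of("0123456789.") != std::string::npos) {
      return false;
    }

    char* end = NULL;
    const double value = std::strtod(number.c_str(), &end);

    // Reject input that strtod could not consume completely (e.g. "1.2.3").
    if (end != number.c_str() + number.size()) {
      return false;
    }

    if (value < 0.0 || value > 100.0) {
      return false;
    }

    result->value_ = value;
    return true;
  }

  // The value as written by the operator, e.g. 42.5 for "42.5%".
  double percent() const { return value_; }

  // The same value as a fraction in [0.0, 1.0].
  double fraction() const { return value_ / 100.0; }

private:
  double value_;
};
```

With this shape, `Percentage::parse("50%", &p)` succeeds, while `"50"` and `"0.5"` are both rejected instead of being silently interpreted one way or the other, which is the ambiguity the description is trying to avoid.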



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module

2017-07-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-4253:
--

Assignee: (was: Marco Massenzio)

> Provide a minimalist "runtime context" to an Anonymous Module
> -
>
> Key: MESOS-4253
> URL: https://issues.apache.org/jira/browse/MESOS-4253
> Project: Mesos
>  Issue Type: Improvement
>  Components: modules
>Reporter: Marco Massenzio
>
> Currently, {{Anonymous}} modules only receive at creation a copy of the 
> {{"parameters"}} passed in the JSON configuration file.
> However, at runtime, it would be useful to also have a "runtime context" for 
> the module developer to use, when implementing the functionality.
> I would suggest passing in the {{Flags}} object from the Master/Agent via a 
> {{setRuntimeContext(const Flags&)}}[0] method, called immediately 
> post-{{create(const Parameters&)}}[1].
> Also, I would suggest adding a {{teardown()}} method too, in case the module 
> needs to release resources / conduct cleanup before exiting (there is a TODO 
> in the code to this effect, and adding this in this patch would be close to 
> trivial).
> [0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a 
> different compile-time type - probably use something like variadic templates 
> or something (suggestions appreciated!).
> [1] In fact, the ideal solution would be to add the {{const Flags&}} to 
> {{create()}}, but that would, alas, break everyone's modules; so that's 
> probably a no-go (ideas welcome here too).
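The two hooks proposed above could be sketched roughly as follows. Everything here is hypothetical and simplified: `Parameters`, `RuntimeContext`, `Anonymous` and `ExampleModule` are stand-ins rather than the real Mesos types, and flattening the flags into a name/value map is just one assumed way to sidestep the differing Master/Agent `Flags` types mentioned in [0]. The `work_dir` key is used purely as an example.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for illustration; these are NOT the real Mesos types.
typedef std::map<std::string, std::string> Parameters;
typedef std::map<std::string, std::string> RuntimeContext;  // flag name -> value

class Anonymous
{
public:
  virtual ~Anonymous() {}

  // Proposed hook: called once, immediately after the module is created.
  // The default is a no-op, so existing modules would keep compiling.
  virtual void setRuntimeContext(const RuntimeContext&) {}

  // Proposed hook: called before the Master/Agent exits, so the module
  // can release resources.
  virtual void teardown() {}
};

// Example module that picks one flag of interest out of the context.
class ExampleModule : public Anonymous
{
public:
  explicit ExampleModule(const Parameters& parameters)
    : parameters_(parameters), tornDown_(false) {}

  void setRuntimeContext(const RuntimeContext& context) override
  {
    RuntimeContext::const_iterator it = context.find("work_dir");
    if (it != context.end()) {
      workDir_ = it->second;
    }
  }

  void teardown() override { tornDown_ = true; }

  const Parameters& parameters() const { return parameters_; }
  const std::string& workDir() const { return workDir_; }
  bool tornDown() const { return tornDown_; }

private:
  Parameters parameters_;
  std::string workDir_;
  bool tornDown_;
};
```

Because both hooks have empty default implementations, adding them would not force existing modules to change, unlike altering the signature of `create()` as discussed in [1].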





[jira] [Assigned] (MESOS-2044) Use one IP address per container for network isolation

2017-07-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2044:
--

Assignee: (was: Marco Massenzio)

> Use one IP address per container for network isolation
> --
>
> Key: MESOS-2044
> URL: https://issues.apache.org/jira/browse/MESOS-2044
> Project: Mesos
>  Issue Type: Epic
>Reporter: Cong Wang
>  Labels: mesosphere
>
> If there are enough IP addresses, either IPv4 or IPv6, we should use one IP 
> address per container, instead of the ugly port-range-based solution. One 
> problem with this is IP address management: addresses are usually managed by 
> a DHCP server, so we may need to manage them in the Mesos master/slave.
> Also, consider using macvlan instead of veth for better isolation.





[jira] [Commented] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module

2016-02-28 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171462#comment-15171462
 ] 

Marco Massenzio commented on MESOS-4253:


It would be good if whoever came up with the "security concerns" could clarify 
them further: in particular, when making an assertion about a particular 
feature introducing a "security vulnerability", it is best practice to describe 
a scenario, a potential attacker's capabilities, and the attack vector - 
otherwise, *anything* can be a "security concern."
{quote} 
What this means is that I have to retract the ship-it to discuss it further. 
One of the most important issues was the fact that exposing all Master/Agent 
flags could also mean sharing things like credentials and password info and any 
other information that is part of other modules' module.json parameters.
{quote}
I will be honest and confess that I don't understand the scenario here: please 
bear in mind that the module(s) can *only* be loaded at startup, by using the 
{{--modules}} flag (and associated JSON) by the same person/team/script that is 
launching the Master/Agent.

So, we are really *not* "exposing" the flags: these are already available (by 
definition) to the actor who launched the Agent (or Master), hence this 
facility does not further expand the surface of attack (provided, of course, 
that the module itself is designed according to security principles).

In other words, passing the Flags during module creation is simply a 
convenience compared with writing a "wrapper" script that duplicates the Flags 
of interest into the modules' "Parameters" in the JSON.
Also, it gives the modules access to default values that are not explicitly 
defined: as these are, by definition, "public" there is no increase in 
vulnerability.

Again, the very same person that launches Mesos is loading the module - how 
does that represent a greater security concern?

{quote} 
Having said that, I am not saying that Mesos is completely secure and these 
patches will make it less secure, but we do need to come up with a better plan 
going forward.
{quote}
"Better" can only be defined with respect to a security threat scenario: what is it?

{quote}
On a more detailed note, there are two main avenues that we need to pursue 
here. One, have the modules explicitly request the flags that are needed by 
them in order to work. At which point, the operator can pass in these flags as 
part of Master/Agent commandline and they will be forwarded to the respective 
modules.
{quote}
How would a module "explicitly request the flags"?
This seems rather cumbersome, and only minimally better than just the 
"wrapper" script that duplicates the flags inside the JSON's parameters.

It is also completely contrary to treating your cluster "as herd, not pets."
{quote}
Second, we can come up with a minimal set of Master/Agent flags that we 
consider "safe" and always pass to all modules as part of the `create` call 
along with Parameters. There is already a precedent in the way SSL flags are 
passed on via Master/Agent commandline.
{quote}
This seems to me to be really non-scalable and a bit cumbersome, but probably 
the only viable option, without a clearer definition of what the security 
concerns are.
{quote}
Finally, given the nature of the concerns, I wanted to see if you can join the 
next community sync and discuss it further while involving the whole community? 
After that, we might be able to create a small working group with all 
interested parties to come up with better design decisions.
{quote}

Considering that it's taken two months (of virtually no feedback at all) I 
honestly can't see how this is likely to elicit more interest, but we'll see, I 
guess.

> Provide a minimalist "runtime context" to an Anonymous Module
> -
>
> Key: MESOS-4253
> URL: https://issues.apache.org/jira/browse/MESOS-4253
> Project: Mesos
>  Issue Type: Improvement
>  Components: modules
>Reporter: Marco Massenzio
>Assignee: Marco Massenzio
>
> Currently, {{Anonymous}} modules only receive at creation a copy of the 
> {{"parameters"}} passed in the JSON configuration file.
> However, at runtime, it would be useful to also have a "runtime context" for 
> the module developer to use, when implementing the functionality.
> I would suggest passing in the {{Flags}} object from the Master/Agent via a 
> {{setRuntimeContext(const Flags&)}}[0] method, called immediately 
> post-{{create(const Parameters&)}}[1].
> Also, I would suggest adding a {{teardown()}} method too, in case the module 
> needs to release resources / conduct cleanup before exiting (there is a TODO 
> in the code to this effect, and adding this in this patch would be close to 
> trivial).
> [0] In practice, it won't be this trivial, as 

[jira] [Commented] (MESOS-4582) state.json serving duplicate "active" fields

2016-02-06 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136137#comment-15136137
 ] 

Marco Massenzio commented on MESOS-4582:


Sounds good to me!
I'd suggest documenting the behavior somewhere, with a reference to the 
appropriate standard document, so that people won't make the same mistake I did.

> state.json serving duplicate "active" fields
> 
>
> Key: MESOS-4582
> URL: https://issues.apache.org/jira/browse/MESOS-4582
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.0
>Reporter: Michael Gummelt
>Assignee: Michael Park
>Priority: Blocker
> Attachments: error.json
>
>
> state.json is serving duplicate "active" fields in frameworks.  See the 
> framework "47df96c2-3f85-4bc5-b781-709b2c30c752-" In the attached file



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4582) state.json serving duplicate "active" fields

2016-02-05 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15135137#comment-15135137
 ] 

Marco Massenzio commented on MESOS-4582:


I'm almost sure that duplicate keys are not legal JSON - worth checking the 
standard, but I'd be in favor of keeping the checks and throwing back a 406 (Bad 
Request).

If you want, I can look it up later this weekend and find out what the JSON 
standard says?

Thanks for fixing it!

> state.json serving duplicate "active" fields
> 
>
> Key: MESOS-4582
> URL: https://issues.apache.org/jira/browse/MESOS-4582
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.0
>Reporter: Michael Gummelt
>Assignee: Michael Park
>Priority: Blocker
> Attachments: error.json
>
>
> state.json is serving duplicate "active" fields in frameworks.  See the 
> framework "47df96c2-3f85-4bc5-b781-709b2c30c752-" In the attached file





[jira] [Comment Edited] (MESOS-4582) state.json serving duplicate "active" fields

2016-02-05 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15135137#comment-15135137
 ] 

Marco Massenzio edited comment on MESOS-4582 at 2/5/16 10:17 PM:
-

I'm almost sure that duplicate keys are not legal JSON - worth checking the 
standard, but I'd be in favor of keeping the checks and throwing back a 406 
(Bad Request).

Incidentally, as almost *all* JSON libraries in most languages (I know of Java, 
Python, C++, Scala) model JSON documents with the {{map}} structure, it is 
virtually impossible (or, at best, extremely difficult) to generate a JSON 
document with duplicate keys (even assuming that such a thing is syntactically 
correct).

If you want, I can look it up later this weekend and find out what the JSON 
standard says?

Thanks for fixing it!
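The point above about map-based models can be illustrated with a small sketch. It uses a plain `std::map` and a toy serializer written for this example (`toJson()` and `example()` are hypothetical helpers, not part of any Mesos or stout API); it says nothing about how the duplicate field was actually produced. For reference, RFC 8259 states that the names within an object SHOULD be unique, so duplicates are discouraged rather than a syntax error.

```cpp
#include <map>
#include <sstream>
#include <string>

// Toy serializer for a flat string -> string object. For illustration only:
// it does not escape special characters in keys or values.
std::string toJson(const std::map<std::string, std::string>& object)
{
  std::ostringstream out;
  out << "{";

  bool first = true;
  for (std::map<std::string, std::string>::const_iterator it = object.begin();
       it != object.end();
       ++it) {
    if (!first) {
      out << ",";
    }
    first = false;
    out << "\"" << it->first << "\":\"" << it->second << "\"";
  }

  out << "}";
  return out.str();
}

// Setting the same key twice overwrites the earlier value, so the
// serialized document can only ever contain one "active" field.
std::string example()
{
  std::map<std::string, std::string> framework;
  framework["active"] = "true";
  framework["active"] = "false";  // Replaces the value; no second key.
  framework["name"] = "demo";
  return toJson(framework);
}
```

A writer that appends fields directly to an output stream, without going through a map, has no such protection and can emit the same name twice.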


was (Author: marco-mesos):
I'm almost sure that duplicate keys are not legal JSON - worth checking the 
standard, but I'd be in favor of keepin the checks and throwing back a 406 (Bad 
Request).

Incidentally, as almost *all* JSON libraries in most languages (I know of Java, 
Python, C++, Scala) model JSON documents with the {{map}} structure, it is 
virtually impossible (or, at best, extremely difficult) to generate a JSON 
document with duplicate keys (even assuming that such a thing is syntactically 
correct).

If you want, I can look it up later this weekend and find out what the JSON 
standard says?

Thanks for fixing it!

> state.json serving duplicate "active" fields
> 
>
> Key: MESOS-4582
> URL: https://issues.apache.org/jira/browse/MESOS-4582
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.0
>Reporter: Michael Gummelt
>Assignee: Michael Park
>Priority: Blocker
> Attachments: error.json
>
>
> state.json is serving duplicate "active" fields in frameworks.  See the 
> framework "47df96c2-3f85-4bc5-b781-709b2c30c752-" In the attached file





[jira] [Commented] (MESOS-3914) Make request format consistent across endpoints

2016-01-14 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15100798#comment-15100798
 ] 

Marco Massenzio commented on MESOS-3914:


Totally agree - good point.

> Make request format consistent across endpoints
> ---
>
> Key: MESOS-3914
> URL: https://issues.apache.org/jira/browse/MESOS-3914
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Alexander Rukletsov
>  Labels: http, mesosphere, tech-debt
>
> We are inconsistent with the format of requests we expect for operator 
> endpoints. For example, dynamic reservations take a string 
> "slaveId={{}}={{}}", while maintenance 
> expects a {{JSON}} object representing {{maintenance::Schedule}} protobuf 
> directly.
> We should agree on the input: either we expect a string with key-value pairs, 
> where values can be {{JSON}} objects, or we request {{JSON}} directly.
> Once we agree on the approach, we should document the outcome and convert all 
> nonconformant endpoints via a deprecation cycle.





[jira] [Commented] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module

2016-01-14 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15099254#comment-15099254
 ] 

Marco Massenzio commented on MESOS-4253:


Sure thing, not a problem.

I'd really like to have it in by 1.0, though - as this is an "externally 
facing" API (i.e., one that external developers code against) it would be awesome 
to have it stable by then.

In particular, I'd love to hear your thoughts about 

(a) a {{shutdown()}} method (naming TBD, this would be consistent with the 
current naming for frameworks; although, {{finalize()}} may be more 
appropriate); and

(b) whether we should also "fix" (most likely in a different Jira/RR) the fact 
that currently the module pointers are never deallocated in the {{main()}} 
methods, so the class destructors are never called (AFAICT, anyway).

Thanks! 
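As a rough illustration of point (b), the sketch below shows how holding modules through owning pointers makes their destructors run when the owner goes out of scope. `Module` and `countDestroyedModules()` are hypothetical names invented for this example; the actual code in the Mesos `main()` functions is not reproduced here.

```cpp
#include <memory>
#include <vector>

// Hypothetical module type that records when its destructor runs.
struct Module
{
  explicit Module(int* destroyed) : destroyed_(destroyed) {}
  ~Module() { ++(*destroyed_); }

  int* destroyed_;
};

// Creates two modules owned by std::unique_ptr and returns how many
// destructors ran once the owning container went out of scope.
int countDestroyedModules()
{
  int destroyed = 0;

  {
    std::vector<std::unique_ptr<Module> > modules;
    modules.push_back(std::unique_ptr<Module>(new Module(&destroyed)));
    modules.push_back(std::unique_ptr<Module>(new Module(&destroyed)));
  }  // The vector is destroyed here, and with it both modules.

  return destroyed;
}
```

With raw pointers that are never passed to `delete`, the counter would stay at zero, which matches the behavior described in (b).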

> Provide a minimalist "runtime context" to an Anonymous Module
> -
>
> Key: MESOS-4253
> URL: https://issues.apache.org/jira/browse/MESOS-4253
> Project: Mesos
>  Issue Type: Improvement
>  Components: modules
>Reporter: Marco Massenzio
>Assignee: Marco Massenzio
>
> Currently, {{Anonymous}} modules only receive at creation a copy of the 
> {{"parameters"}} passed in the JSON configuration file.
> However, at runtime, it would be useful to also have a "runtime context" for 
> the module developer to use, when implementing the functionality.
> I would suggest passing in the {{Flags}} object from the Master/Agent via a 
> {{setRuntimeContext(const Flags&)}}[0] method, called immediately 
> post-{{create(const Parameters&)}}[1].
> Also, I would suggest adding a {{teardown()}} method too, in case the module 
> needs to release resources / conduct cleanup before exiting (there is a TODO 
> in the code to this effect, and adding this in this patch would be close to 
> trivial).
> [0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a 
> different compile-time type - probably use something like variadic templates 
> or something (suggestions appreciated!).
> [1] In fact, the ideal solution would be to add the {{const Flags&}} to 
> {{create()}}, but that would, alas, break everyone's modules; so that's 
> probably a no-go (ideas welcome here too).





[jira] [Commented] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module

2016-01-12 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095477#comment-15095477
 ] 

Marco Massenzio commented on MESOS-4253:


Thanks and no worries, it was the holidays!

Also, [~haosd...@gmail.com] and [~jpe...@apache.org] have chimed in, so they 
may want to add their thoughts here.

> Provide a minimalist "runtime context" to an Anonymous Module
> -
>
> Key: MESOS-4253
> URL: https://issues.apache.org/jira/browse/MESOS-4253
> Project: Mesos
>  Issue Type: Improvement
>  Components: modules
>Reporter: Marco Massenzio
>Assignee: Marco Massenzio
>
> Currently, {{Anonymous}} modules only receive at creation a copy of the 
> {{"parameters"}} passed in the JSON configuration file.
> However, at runtime, it would be useful to also have a "runtime context" for 
> the module developer to use, when implementing the functionality.
> I would suggest passing in the {{Flags}} object from the Master/Agent via a 
> {{setRuntimeContext(const Flags&)}}[0] method, called immediately 
> post-{{create(const Parameters&)}}[1].
> Also, I would suggest adding a {{teardown()}} method too, in case the module 
> needs to release resources / conduct cleanup before exiting (there is a TODO 
> in the code to this effect, and adding this in this patch would be close to 
> trivial).
> [0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a 
> different compile-time type - probably use something like variadic templates 
> or something (suggestions appreciated!).
> [1] In fact, the ideal solution would be to add the {{const Flags&}} to 
> {{create()}}, but that would, alas, break everyone's modules; so that's 
> probably a no-go (ideas welcome here too).





[jira] [Commented] (MESOS-3035) As a Developer I would like a standard way to run a Subprocess in libprocess

2016-01-12 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095472#comment-15095472
 ] 

Marco Massenzio commented on MESOS-3035:


Thanks, [~karya].
There has actually been plenty of "activity" on the review itself - but that 
ended up being discarded.

I think [~jieyu] has done something similar in another review (I no longer 
have access to that email thread, I'm afraid) so he may be able to chime in.

For myself, I've implemented the ["wrapper" 
class|https://github.com/massenz/execute-module/blob/develop/src/include/cmdexecute.hpp#L28]
 for {{Subprocess}} in my own module, so the "boilerplate" has been eliminated.

As the original "reporter" for the issue, I'm happy for this to be closed as a 
"won't fix" - or we can keep it "open" if others think it's still worth 
pursuing in the future.

> As a Developer I would like a standard way to run a Subprocess in libprocess
> 
>
> Key: MESOS-3035
> URL: https://issues.apache.org/jira/browse/MESOS-3035
> Project: Mesos
>  Issue Type: Story
>  Components: libprocess
>Reporter: Marco Massenzio
>  Labels: mesosphere, tech-debt
>
> As part of MESOS-2830 and MESOS-2902 I have been researching the ability to 
> run a {{Subprocess}} and capture the {{stdout / stderr}} along with the exit 
> status code.
> {{process::subprocess()}} offers much of the functionality, but in a way that 
> still requires a lot of handiwork on the developer's part; we would like to 
> further abstract away the ability to just pass a string, an optional set of 
> command-line arguments and then collect the output of the command (bonus: 
> without blocking).
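The kind of convenience wrapper described above might look like the sketch below. Note the assumptions: it does not use `process::subprocess()` or anything from libprocess; it swaps in plain POSIX `popen()` / `pclose()` to show the shape of the interface, it blocks until the command exits, and the names `CommandResult` and `execute()` are invented for this example.

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>

// Hypothetical result type: exit code plus everything the command printed.
struct CommandResult
{
  int status;          // Exit code, or -1 if the command could not be run.
  std::string output;  // Combined stdout and stderr.
};

// Runs `command` through the shell and blocks until it exits.
// The appended "2>&1" folds stderr into the captured stream; for a compound
// command ("a; b") it only applies to the last part.
CommandResult execute(const std::string& command)
{
  CommandResult result;
  result.status = -1;

  FILE* pipe = popen((command + " 2>&1").c_str(), "r");
  if (pipe == NULL) {
    return result;
  }

  // Read until end-of-file, i.e. until the command closes its output.
  char buffer[4096];
  size_t n;
  while ((n = fread(buffer, 1, sizeof(buffer), pipe)) > 0) {
    result.output.append(buffer, n);
  }

  // pclose() waits for the child and returns its wait status.
  const int status = pclose(pipe);
  if (status != -1 && WIFEXITED(status)) {
    result.status = WEXITSTATUS(status);
  }

  return result;
}
```

A caller would then write something like `CommandResult r = execute("echo hello");` and inspect `r.status` and `r.output`. The non-blocking variant mentioned as a bonus would need a future-based interface and is not attempted here.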





[jira] [Commented] (MESOS-3972) Framework CPU counters on slave page are always zero

2015-12-31 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15076188#comment-15076188
 ] 

Marco Massenzio commented on MESOS-3972:


I'm afraid I no longer have that particular magic power ;)
You can try and post an email on the dev@ mailing list.

> Framework CPU counters on slave page are always zero
> 
>
> Key: MESOS-3972
> URL: https://issues.apache.org/jira/browse/MESOS-3972
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 0.25.0
>Reporter: Ian Babrou
>
> For reasons unknown, system and user are always zero, while total is not.





[jira] [Updated] (MESOS-3035) As a Developer I would like a standard way to run a Subprocess in libprocess

2015-12-27 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3035:
---
Assignee: (was: Marco Massenzio)

> As a Developer I would like a standard way to run a Subprocess in libprocess
> 
>
> Key: MESOS-3035
> URL: https://issues.apache.org/jira/browse/MESOS-3035
> Project: Mesos
>  Issue Type: Story
>  Components: libprocess
>Reporter: Marco Massenzio
>  Labels: mesosphere, tech-debt
>
> As part of MESOS-2830 and MESOS-2902 I have been researching the ability to 
> run a {{Subprocess}} and capture the {{stdout / stderr}} along with the exit 
> status code.
> {{process::subprocess()}} offers much of the functionality, but in a way that 
> still requires a lot of handiwork on the developer's part; we would like to 
> further abstract away the ability to just pass a string, an optional set of 
> command-line arguments and then collect the output of the command (bonus: 
> without blocking).





[jira] [Commented] (MESOS-3350) Create a protobuf VersionInfo to store mesos version information

2015-12-27 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072378#comment-15072378
 ] 

Marco Massenzio commented on MESOS-3350:


[~vinodkone], [~bmahler]: Do you guys think this is still useful?

If yes, happy to implement it - please let me know what you think.
(if not, I'll just close it).

Thanks.

> Create a protobuf VersionInfo to store mesos version information
> 
>
> Key: MESOS-3350
> URL: https://issues.apache.org/jira/browse/MESOS-3350
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: Marco Massenzio
>  Labels: tech-debt
>
> Currently we use a string to store the Mesos version in protobuf. In 
> [MESOS-1841-reviews|https://reviews.apache.org/r/37024/], [~marco-mesos] 
> thinks it would be better to create a protobuf struct named VersionInfo, 
> like:
> {code}
> message VersionInfo {
>  optional string git_sha = 1;
>  optional string build_user = 2;
>  x
> }
> {code}
> So that we could use this struct everywhere (expose information to the HTTP 
> endpoint, replace the version string in MasterInfo).





[jira] [Created] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module

2015-12-27 Thread Marco Massenzio (JIRA)
Marco Massenzio created MESOS-4253:
--

 Summary: Provide a minimalist "runtime context" to an Anonymous 
Module
 Key: MESOS-4253
 URL: https://issues.apache.org/jira/browse/MESOS-4253
 Project: Mesos
  Issue Type: Improvement
  Components: modules
Reporter: Marco Massenzio
Assignee: Marco Massenzio


Currently, {{Anonymous}} modules only receive at creation a copy of the 
{{"parameters"}} passed in the JSON configuration file.

However, at runtime, it would be useful to also have a "runtime context" for 
the module developer to use, when implementing the functionality.
I would suggest passing in the {{Flags}} object from the Master/Agent via a 
{{setRuntimeContext(const Flags&)}}[0] method, called immediately 
post-{{create(const Parameters&)}}[1].

Also, I would suggest adding a {{teardown()}} method too, in case the module 
needs to release resources / conduct cleanup before exiting (there is a TODO in 
the code to this effect, and adding this in this patch would be close to 
trivial).

[0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a 
different compile-time type - probably use something like variadic templates or 
something (suggestions appreciated!).

[1] In fact, the ideal solution would be to add the {{const Flags&}} to 
{{create()}}, but that would, alas, break everyone's modules; so that's 
probably a no-go (ideas welcome here too).





[jira] [Commented] (MESOS-4253) Provide a minimalist "runtime context" to an Anonymous Module

2015-12-27 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15072372#comment-15072372
 ] 

Marco Massenzio commented on MESOS-4253:


[~karya] - would you mind terribly shepherding this one, please?

> Provide a minimalist "runtime context" to an Anonymous Module
> -
>
> Key: MESOS-4253
> URL: https://issues.apache.org/jira/browse/MESOS-4253
> Project: Mesos
>  Issue Type: Improvement
>  Components: modules
>Reporter: Marco Massenzio
>Assignee: Marco Massenzio
>
> Currently, {{Anonymous}} modules only receive at creation a copy of the 
> {{"parameters"}} passed in the JSON configuration file.
> However, at runtime, it would be useful to also have a "runtime context" for 
> the module developer to use, when implementing the functionality.
> I would suggest passing in the {{Flags}} object from the Master/Agent via a 
> {{setRuntimeContext(const Flags&)}}[0] method, called immediately 
> post-{{create(const Parameters&)}}[1].
> Also, I would suggest adding a {{teardown()}} method too, in case the module 
> needs to release resources / conduct cleanup before exiting (there is a TODO 
> in the code to this effect, and adding this in this patch would be close to 
> trivial).
> [0] In practice, it won't be this trivial, as Master/Agent {{Flags}} are of a 
> different compile-time type - probably use something like variadic templates 
> or something (suggestions appreciated!).
> [1] In fact, the ideal solution would be to add the {{const Flags&}} to 
> {{create()}}, but that would, alas, break everyone's modules; so that's 
> probably a no-go (ideas welcome here too).





[jira] [Assigned] (MESOS-2948) Generalize authorizer interface in order to allow for arbitrary Subjects, Actions and Objects

2015-11-25 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2948:
--

Assignee: Marco Massenzio

> Generalize authorizer interface in order to allow for arbitrary Subjects, 
> Actions and Objects
> -
>
> Key: MESOS-2948
> URL: https://issues.apache.org/jira/browse/MESOS-2948
> Project: Mesos
>  Issue Type: Epic
>  Components: master, security
>Reporter: Alexander Rojas
>Assignee: Marco Massenzio
>  Labels: acl, mesosphere, security
>
> The current 
> [{{mesos::Authorizer}}|https://github.com/apache/mesos/blob/40b596402521be25b93b9ef4edd8f5c727c9d20e/src/authorizer/authorizer.hpp]
>  API has one method for each of the _actions_ supported (Register Framework, 
> Launch Task and Shutdown Framework), and each of these _actions_ themselves 
> define the _objects_ on which they operate.
> Currently, in case a new action needs to be authorized it is necessary to 
> modify the {{mesos::Authorizer}} interface and all its implementations 
> (currently only {{mesos::LocalAuthorizer}}), and add a new nested message to 
> the {{ACL}} message in {{mesos.proto}}.
> An update to the API should allow for new _actions_ and _objects_ to be added 
> without the need to change the {{mesos::Authorizer}} interface while 
> encapsulating implementation details on how the authorization process is 
> performed.





[jira] [Assigned] (MESOS-2297) Add authentication support for HTTP API

2015-11-25 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2297:
--

Assignee: Marco Massenzio  (was: Alexander Rojas)

> Add authentication support for HTTP API
> ---
>
> Key: MESOS-2297
> URL: https://issues.apache.org/jira/browse/MESOS-2297
> Project: Mesos
>  Issue Type: Epic
>Reporter: Vinod Kone
>Assignee: Marco Massenzio
>  Labels: mesosphere, security
>
> Since most of the communication between mesos components will happen through 
> HTTP with the arrival of the [HTTP 
> API|https://issues.apache.org/jira/browse/MESOS-2288], it makes sense to use 
> HTTP standard mechanisms to authenticate this communication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-25 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027271#comment-15027271
 ] 

Marco Massenzio commented on MESOS-3937:


So I read the thread and, honestly, it looks like we're making all this song 
and dance to make a test pass? who cares?
The question, with a failing test, is always the same:
{quote}
Is the test buggy, or are we uncovering a genuine issue in the code?
{quote}

It seems to me that this test does not identify an issue in the code; at best, 
it has highlighted a combination of Ubuntu / Kernel / Docker 
versions/configurations that *may* cause an Executor launched inside a Docker 
container to fail (and, even there, I'm not so sure).

Also, please let's remind ourselves that tests are useful so that, when 
introducing code changes, refactorings, or new features, we can be assured that 
we haven't broken something that was working before: I'm not even sure this 
test achieves that?
(this may be a harsh statement borne out of my ignorance - please, correct me 
if I'm wrong on this one).

Here is my suggestion as to how to solve this issue:

- short-term: we disable this test and remove it as a {{0.26}} blocker (it 
doesn't seem to me that the failure highlights a regression in the code - 
again, correct me if I'm wrong);
- short-term: document the issue and possible workarounds for folks who may 
need to run Docker executors on Ubuntu;
- medium-term: if at all possible, let's find a way for the test to detect the 
conditions under which it is supposed to pass; if they are met on the platform 
where the test runs, it runs normally, and if not, a warning is emitted but no 
failure is reported (or something similar);
- long-term: decide whether to keep the test (possibly modified) or 
discard it.
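
The medium-term item could look roughly like the sketch below. It is only an 
illustration: {{runIfSupported}} and {{Outcome}} are hypothetical names, not 
part of the Mesos test harness, and the actual platform checks are left to the 
caller because the thread has not pinned them down.

```cpp
#include <functional>
#include <iostream>
#include <string>

enum class Outcome { PASSED, FAILED, SKIPPED };

// Hypothetical wrapper: run `body` only when `preconditionsMet` says the
// platform is one on which the test is expected to pass. Otherwise emit a
// warning and report SKIPPED rather than FAILED.
Outcome runIfSupported(
    const std::string& name,
    const std::function<bool()>& preconditionsMet,
    const std::function<bool()>& body)
{
  if (!preconditionsMet()) {
    std::cerr << "WARNING: skipping " << name
              << ": platform preconditions not met" << std::endl;
    return Outcome::SKIPPED;
  }

  return body() ? Outcome::PASSED : Outcome::FAILED;
}
```

The point of the third outcome is that an unsupported platform shows up in the 
test log without turning the build red.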

What do people think?

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" 

[jira] [Updated] (MESOS-3518) Assertions that compare doubles with == can fail due to rounding issues and can crash the master.

2015-11-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3518:
---
Description: 
mesos-0.23.0/src/common/resources.cpp has
  CHECK(result.cpus() == cpus() &&
        result.mem() == mem() &&
        result.disk() == disk() &&
        result.ports() == ports());
at around line 869. Sometimes, rounding errors can cause this check to fail 
because of the cpus() part. One should take the absolute value of the 
difference and compare it with a small value to avoid this problem. The same 
problem may exist in other places; I have not checked yet.
It seems to be present in all versions I checked. I could trigger it by asking 
for a resource value of {{cpus( *):0.2}}

  was:
mesos-0.23.0/src/common/resources.cpp has
  CHECK(result.cpus() == cpus() &&
result.mem() == mem() &&
result.disk() == disk() &&
result.ports() == ports());
at around line 869. Sometimes, rounding errors can trigger this check to fail 
because of the cpus() part. One should take the absolute value of the 
difference and compare with a small value to avoid this problem. The same 
problem could be true in various places, so far I have not yet checked.
Seems to be present in all versions I checked. I could trigger this by asking 
for some resource value of {{cpus(*):0.2}}
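
The tolerance-based comparison suggested in the description can be sketched as 
follows. This is a minimal illustration, not code from resources.cpp: 
{{almostEqual}} is a hypothetical helper and the default epsilon is an 
arbitrary assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: true if a and b differ by at most `epsilon`.
// An absolute tolerance like this only suits quantities of bounded
// magnitude; the default value here is an assumption, not a Mesos constant.
bool almostEqual(double a, double b, double epsilon = 1e-9)
{
  assert(epsilon >= 0.0);
  return std::fabs(a - b) <= epsilon;
}
```

For example, {{0.1 + 0.2 == 0.3}} is false for IEEE 754 doubles, whereas 
{{almostEqual(0.1 + 0.2, 0.3)}} is true.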


> Assertions that compare doubles with == can fail due to rounding issues and 
> can crash the master.
> -
>
> Key: MESOS-3518
> URL: https://issues.apache.org/jira/browse/MESOS-3518
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1
>Reporter: Max Neunhöffer
>Priority: Minor





[jira] [Updated] (MESOS-3518) Assertions that compare doubles with == can fail due to rounding issues and can crash the master.

2015-11-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3518:
---
Description: 
mesos-0.23.0/src/common/resources.cpp has
  CHECK(result.cpus() == cpus() &&
        result.mem() == mem() &&
        result.disk() == disk() &&
        result.ports() == ports());
at around line 869. Sometimes, rounding errors can cause this check to fail 
because of the cpus() part. One should take the absolute value of the 
difference and compare it with a small value to avoid this problem. The same 
problem may exist in other places; I have not checked yet.
It seems to be present in all versions I checked. I could trigger it by asking 
for a resource value of {{cpus(*):0.2}}

  was:
mesos-0.23.0/src/common/resources.cpp has
  CHECK(result.cpus() == cpus() &&
result.mem() == mem() &&
result.disk() == disk() &&
result.ports() == ports());
at around line 869. Sometimes, rounding errors can trigger this check to fail 
because of the cpus() part. One should take the absolute value of the 
difference and compare with a small value to avoid this problem. The same 
problem could be true in various places, so far I have not yet checked.
Seems to be present in all versions I checked. I could trigger this by asking 
for some resource value of cpus(*):0.2


> Assertions that compare doubles with == can fail due to rounding issues and 
> can crash the master.
> -
>
> Key: MESOS-3518
> URL: https://issues.apache.org/jira/browse/MESOS-3518
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1
>Reporter: Max Neunhöffer
>Priority: Minor





[jira] [Commented] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-24 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025816#comment-15025816
 ] 

Marco Massenzio commented on MESOS-3552:


As this problem has been present in Mesos since forever, and the {{0.26}} 
release is in progress, I have removed it as a {{0.26}} blocker.
While it's not great that this slipped the release, I also believe that not 
rushing it through gives us an opportunity to address the problem "properly" 
(e.g., using fixed point for resources) in time for {{0.27}}.
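
One way to read the fixed-point idea is to store resource quantities as 
integer thousandths so that equality is exact. A minimal sketch, assuming three 
decimal digits of precision are enough; {{MilliCpus}} is a hypothetical name, 
not a Mesos type.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical fixed-point CPU quantity: an integer count of thousandths.
struct MilliCpus
{
  int64_t value;

  // Rounds to the nearest thousandth, so 0.1 becomes exactly 100.
  static MilliCpus fromDouble(double cpus)
  {
    assert(std::isfinite(cpus));
    return MilliCpus{static_cast<int64_t>(std::llround(cpus * 1000.0))};
  }

  double toDouble() const { return value / 1000.0; }
};

inline MilliCpus operator+(MilliCpus a, MilliCpus b)
{
  return MilliCpus{a.value + b.value};
}

inline MilliCpus operator-(MilliCpus a, MilliCpus b)
{
  return MilliCpus{a.value - b.value};
}

// Integer comparison: no rounding error can creep in.
inline bool operator==(MilliCpus a, MilliCpus b)
{
  return a.value == b.value;
}
```

With this representation, reserving and then unreserving 0.1 cpus any number of 
times returns exactly the starting value, because all arithmetic is on integers.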

As a halfway measure, we could just remove the {{CHECK( )}} calls that crash 
Mesos, replace them with {{LOG(ERROR)}}, and return an error to the caller. This 
won't solve the cases where the cause is actually a rounding error, but at least 
we don't risk introducing regressions/bugs this close to the release.
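
The halfway measure could be sketched like this. It is an illustration under 
assumptions: {{Status}} and {{checkCpusPreserved}} are hypothetical names, and 
writing to {{std::cerr}} stands in for {{LOG(ERROR)}} so the example has no 
dependencies.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for an error-or-success return value.
struct Status
{
  bool ok;
  std::string error;
};

// Instead of aborting the process when the totals disagree, log the
// mismatch and hand the error back to the caller. The comparison is still
// exact, so a rounding error is reported rather than fixed.
Status checkCpusPreserved(double before, double after)
{
  if (before == after) {
    return Status{true, ""};
  }

  const std::string message =
    "cpus changed across operation: " + std::to_string(before) +
    " vs " + std::to_string(after);

  std::cerr << "ERROR: " << message << std::endl; // Stands in for LOG(ERROR).
  return Status{false, message};
}
```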

Finally, it is my opinion that we should consolidate this ticket and MESOS-1187 
into one (probably by closing this one as a duplicate of that older one) so 
that we don't have a "split brain" conversation. However, I wouldn't want to do 
that and risk losing valuable information in this one. Does anyone have 
suggestions as to how to do this cleanly?
(Or do people feel that simply linking them and closing this one as a duplicate 
would be sufficient?)

Just to be perfectly clear, I fully agree this is an important issue to 
address; I'm just suggesting here that it should not block the release.
If people feel strongly about this, please let's have a conversation either via 
hangout or email.

Thanks, everyone, for looking into this!

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt
>
> The result.cpus() == cpus() check is failing because of a ( double == double ) 
> comparison problem. 
> Root cause: 
> The framework requested a 0.1 cpu reservation for the first task. So far so 
> good. The next Reserve operation led to double arithmetic that produced the 
> following values:
>  result.cpus() : 23.9964472863211995 cpus() : 24
> so the check ( result.cpus() == cpus() ) failed, because 
> ( 23.9964472863211995 == 24 ) is false.





[jira] [Updated] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3552:
---
Labels: mesosphere tech-debt  (was: )

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt





[jira] [Updated] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3552:
---
Issue Type: Improvement  (was: Bug)

> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt





[jira] [Comment Edited] (MESOS-3552) CHECK failure due to floating point precision on reservation request

2015-11-24 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025816#comment-15025816
 ] 

Marco Massenzio edited comment on MESOS-3552 at 11/25/15 12:28 AM:
---

As this problem has been present in Mesos since forever, and the {{0.26}} 
release is in progress, I have removed it as a {{0.26}} blocker.
While it's not great that this slipped the release, I also believe that not 
rushing it through gives us an opportunity to address the problem "properly" 
(e.g., using fixed point for resources) in time for {{0.27}}.

As a halfway measure, we could just remove the {{CHECK( )}} calls that crash 
Mesos, replace them with {{LOG(ERROR)}}, and return an error to the caller. This 
won't solve the cases where the cause is actually a rounding error, but at least 
we don't risk introducing regressions/bugs this close to the release.

Finally, it is my opinion that we should consolidate this ticket and MESOS-1187 
into one (probably by closing this one as a duplicate of that older one) so 
that we don't have a "split brain" conversation. However, I wouldn't want to do 
that and risk losing valuable information in this one. Does anyone have 
suggestions as to how to do this cleanly?
(Or do people feel that simply linking them and closing this one as a duplicate 
would be sufficient?)

Just to be perfectly clear, I fully agree this is an important issue to 
address; I'm just suggesting here that it should not block the release.
If people feel strongly about this, please let's have a conversation either via 
hangout or email.

Thanks, everyone, for looking into this!


> CHECK failure due to floating point precision on reservation request
> 
>
> Key: MESOS-3552
> URL: https://issues.apache.org/jira/browse/MESOS-3552
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Mandeep Chadha
>Assignee: Mandeep Chadha
>  Labels: mesosphere, tech-debt





[jira] [Commented] (MESOS-3024) HTTP endpoint authN is enabled merely by specifying --credentials

2015-11-24 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025742#comment-15025742
 ] 

Marco Massenzio commented on MESOS-3024:


Got it, thanks!

> HTTP endpoint authN is enabled merely by specifying --credentials
> -
>
> Key: MESOS-3024
> URL: https://issues.apache.org/jira/browse/MESOS-3024
> Project: Mesos
>  Issue Type: Bug
>  Components: master, security
>Reporter: Adam B
>Assignee: Marco Massenzio
>  Labels: authentication, http, mesosphere
>
> If I set `--credentials` on the master, framework and slave authentication 
> are allowed, but not required. On the other hand, http authentication is now 
> required for authenticated endpoints (currently only `/shutdown`). That means 
> that I cannot enable framework or slave authentication without also enabling 
> http endpoint authentication. This is undesirable.
> Framework and slave authentication have separate flags (`--authenticate` and 
> `--authenticate_slaves`) to require authentication for each. It would be 
> great if there was also such a flag for http authentication. Or maybe we get 
> rid of these flags altogether and rely on ACLs to determine which 
> unauthenticated principals are even allowed to authenticate for each 
> endpoint/action.





[jira] [Assigned] (MESOS-2044) Use one IP address per container for network isolation

2015-11-24 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-2044:
--

Assignee: Marco Massenzio  (was: Kapil Arya)

> Use one IP address per container for network isolation
> --
>
> Key: MESOS-2044
> URL: https://issues.apache.org/jira/browse/MESOS-2044
> Project: Mesos
>  Issue Type: Epic
>Reporter: Cong Wang
>Assignee: Marco Massenzio
>  Labels: mesosphere
>
> If there are enough IP addresses, either IPv4 or IPv6, we should use one IP 
> address per container, instead of the ugly port-range-based solution. One 
> problem with this is IP address management: usually it is handled by a 
> DHCP server, so maybe we need to manage the addresses in the Mesos 
> master/slave.
> Also, maybe use macvlan instead of veth for better isolation.





[jira] [Assigned] (MESOS-3851) Investigate recent crashes in Command Executor

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-3851:
--

Assignee: Anand Mazumdar  (was: Benjamin Mahler)

> Investigate recent crashes in Command Executor
> --
>
> Key: MESOS-3851
> URL: https://issues.apache.org/jira/browse/MESOS-3851
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>Priority: Blocker
>  Labels: mesosphere
>
> After https://reviews.apache.org/r/38900 (i.e., updating CommandExecutor to 
> support rootfs), some tests seem to crash frequently due to assertion 
> violations.
> {{FetcherCacheTest.SimpleEviction}} failed due to the following log:
> {code}
> I1107 19:36:46.360908 30657 slave.cpp:1793] Sending queued task '3' to 
> executor ''3' of framework 7d94c7fb-8950-4bcf-80c1-46112292dcd6- at 
> executor(1)@172.17.5.200:33871'
> I1107 19:36:46.363682  1236 exec.cpp:297] 
> I1107 19:36:46.373569  1245 exec.cpp:210] Executor registered on slave 
> 7d94c7fb-8950-4bcf-80c1-46112292dcd6-S0
> @ 0x7f9f5a7db3fa  google::LogMessage::Fail()
> I1107 19:36:46.394081  1245 exec.cpp:222] Executor::registered took 395411ns
> @ 0x7f9f5a7db359  google::LogMessage::SendToLog()
> @ 0x7f9f5a7dad6a  google::LogMessage::Flush()
> @ 0x7f9f5a7dda9e  google::LogMessageFatal::~LogMessageFatal()
> @   0x48d00a  _CheckFatal::~_CheckFatal()
> @   0x49c99d  
> mesos::internal::CommandExecutorProcess::launchTask()
> @   0x4b3dd7  
> _ZZN7process8dispatchIN5mesos8internal22CommandExecutorProcessEPNS1_14ExecutorDriverERKNS1_8TaskInfoES5_S6_EEvRKNS_3PIDIT_EEMSA_FvT0_T1_ET2_T3_ENKUlPNS_11ProcessBaseEE_clESL_
> @   0x4c470c  
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal22CommandExecutorProcessEPNS5_14ExecutorDriverERKNS5_8TaskInfoES9_SA_EEvRKNS0_3PIDIT_EEMSE_FvT0_T1_ET2_T3_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x7f9f5a761b1b  std::function<>::operator()()
> @ 0x7f9f5a749935  process::ProcessBase::visit()
> @ 0x7f9f5a74d700  process::DispatchEvent::visit()
> @   0x48e004  process::ProcessBase::serve()
> @ 0x7f9f5a745d21  process::ProcessManager::resume()
> @ 0x7f9f5a742f52  
> _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_
> @ 0x7f9f5a74cf2c  
> _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
> @ 0x7f9f5a74cedc  
> _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_
> @ 0x7f9f5a74ce6e  
> _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
> @ 0x7f9f5a74cdc5  
> _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv
> @ 0x7f9f5a74cd5e  
> _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
> @ 0x7f9f5624f1e0  (unknown)
> @ 0x7f9f564a8df5  start_thread
> @ 0x7f9f559b71ad  __clone
> I1107 19:36:46.551370 30656 containerizer.cpp:1257] Executor for container 
> '6553a617-6b4a-418d-9759-5681f45ff854' has exited
> I1107 19:36:46.551429 30656 containerizer.cpp:1074] Destroying container 
> '6553a617-6b4a-418d-9759-5681f45ff854'
> I1107 19:36:46.553869 30656 containerizer.cpp:1257] Executor for container 
> 'd2c1f924-c92a-453e-82b1-c294d09c4873' has exited
> {code}
> The reason seems to be a race: the executor receives a 
> {{RunTaskMessage}} before the {{ExecutorRegisteredMessage}}, leading to the 
> {{CHECK_SOME(executorInfo)}} failure.
> Link to complete log: 
> https://issues.apache.org/jira/browse/MESOS-2831?focusedCommentId=14995535=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14995535
> Another related failure from {{ExamplesTest.PersistentVolumeFramework}}:
> {code}
> @ 0x7f4f71529cbd  google::LogMessage::SendToLog()
> I1107 13:15:09.949987 31573 slave.cpp:2337] Status update manager 
> successfully handled status update acknowledgement (UUID: 
> 721c7316-5580-4636-a83a-098e3bd4ed1f) for task 
> ad90531f-d3d8-43f6-96f2-c81c4548a12d of framework 
> ac4ea54a-7d19-4e41-9ee3-1a761f8e5b0f-
> @ 0x7f4f715296ce  google::LogMessage::Flush()
> @ 0x7f4f7152c402  google::LogMessageFatal::~LogMessageFatal()
> @   0x48d00a  
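
To make the race described in this ticket concrete (a {{RunTaskMessage}} 
arriving before {{ExecutorRegisteredMessage}}), here is a minimal sketch of one 
defensive pattern: queue tasks until registration completes. 
{{BufferingExecutor}} is a hypothetical class, not the CommandExecutor code, 
and this is not necessarily how the ticket was resolved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical executor-side state: tasks that arrive before registration
// are queued instead of being launched immediately.
class BufferingExecutor
{
public:
  // Called when the registration message arrives; flushes queued tasks.
  void registered()
  {
    isRegistered = true;
    for (const std::string& task : pending) {
      launch(task);
    }
    pending.clear();
  }

  // Called when a run-task message arrives, possibly before registration.
  void runTask(const std::string& task)
  {
    if (!isRegistered) {
      pending.push_back(task);
      return;
    }
    launch(task);
  }

  const std::vector<std::string>& launched() const { return started; }

private:
  void launch(const std::string& task)
  {
    // The invariant the failing check expressed: never launch unregistered.
    assert(isRegistered);
    started.push_back(task);
  }

  bool isRegistered = false;
  std::vector<std::string> pending;
  std::vector<std::string> started;
};
```

With this ordering guard, a task delivered early is launched only once 
{{registered()}} has run, so the invariant holds regardless of message order.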

[jira] [Updated] (MESOS-3939) ubsan error in net::IP::create(sockaddr const&): misaligned address

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3939:
---
Sprint:   (was: Mesosphere Sprint 23)

> ubsan error in net::IP::create(sockaddr const&): misaligned address
> ---
>
> Key: MESOS-3939
> URL: https://issues.apache.org/jira/browse/MESOS-3939
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Neil Conway
>Assignee: Neil Conway
>Priority: Minor
>  Labels: mesosphere, ubsan
>
> Running ubsan from GCC 5.2 on the current Mesos unit tests yields this, among 
> other problems:
> {noformat}
> /mesos/3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp:230:56: 
> runtime error: reference binding to misaligned address 0x0199629c for 
> type 'const struct sockaddr_storage', which requires 8 byte alignment
> 0x0199629c: note: pointer points here
>   00 00 00 00 02 00 00 00  ff ff ff 00 00 00 00 00  00 00 00 00 00 00 00 00  
> 00 00 00 00 00 00 00 00
>   ^
> #0 0x5950cb in net::IP::create(sockaddr const&) 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x5950cb)
> #1 0x5970cd in 
> net::IPNetwork::fromLinkDevice(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int) 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x5970cd)
> #2 0x58e006 in NetTest_LinkDevice_Test::TestBody() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x58e006)
> #3 0x85abd5 in void 
> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x85abd5)
> #4 0x848abc in void 
> testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x848abc)
> #5 0x7e2755 in testing::Test::Run() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x7e2755)
> #6 0x7e44a0 in testing::TestInfo::Run() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x7e44a0)
> #7 0x7e5ffa in testing::TestCase::Run() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x7e5ffa)
> #8 0x7ffe21 in testing::internal::UnitTestImpl::RunAllTests() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x7ffe21)
> #9 0x85d7a5 in bool 
> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool 
> (testing::internal::UnitTestImpl::*)(), char const*) 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x85d7a5)
> #10 0x84b37a in bool 
> testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool 
> (testing::internal::UnitTestImpl::*)(), char const*) 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x84b37a)
> #11 0x7f8a4a in testing::UnitTest::Run() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x7f8a4a)
> #12 0x608a96 in RUN_ALL_TESTS() 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x608a96)
> #13 0x60896b in main 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x60896b)
> #14 0x7fd0f0c7fa3f in __libc_start_main 
> (/lib/x86_64-linux-gnu/libc.so.6+0x20a3f)
> #15 0x4145c8 in _start 
> (/home/vagrant/build-mesos-ubsan/3rdparty/libprocess/3rdparty/stout-tests+0x4145c8)
> {noformat}
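
The report above says a reference was bound to an address that is only 4-byte 
aligned, while {{sockaddr_storage}} requires 8-byte alignment. One common way 
to avoid this class of error is to copy the bytes into a properly aligned 
local with {{memcpy}} instead of casting. A minimal sketch for the IPv4 case 
only; {{ipv4FromSockaddr}} is a hypothetical helper, not the stout API, and 
this is not necessarily how the ticket was resolved.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read the IPv4 address (host byte order) out of a
// possibly misaligned socket address. Every access goes through memcpy
// into an aligned local, so no misaligned reference is ever formed.
// Returns false for any address family other than AF_INET.
bool ipv4FromSockaddr(const void* raw, uint32_t* out)
{
  assert(raw != nullptr && out != nullptr);

  const char* bytes = static_cast<const char*>(raw);

  sa_family_t family;
  std::memcpy(&family, bytes + offsetof(sockaddr, sa_family), sizeof(family));

  if (family != AF_INET) {
    return false;
  }

  sockaddr_in in4;
  std::memcpy(&in4, bytes, sizeof(in4));

  *out = ntohl(in4.sin_addr.s_addr);
  return true;
}
```

The copy costs a few bytes per call, which is usually negligible next to the 
system call that produced the address.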





[jira] [Updated] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3964:
---
Story Points: 2

> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and 
> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
> ---
>
> Key: MESOS-3964
> URL: https://issues.apache.org/jira/browse/MESOS-3964
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, test
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile: see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Greg Mann
>  Labels: mesosphere
>
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs"
> {noformat}
> ...
> F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): 
> Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the 
> CFS cgroups feature.
> {noformat}





[jira] [Updated] (MESOS-3969) Failing 'make distcheck' on Debian 8, somehow SSL-related.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3969:
---
Story Points: 3

> Failing 'make distcheck' on Debian 8, somehow SSL-related.
> --
>
> Key: MESOS-3969
> URL: https://issues.apache.org/jira/browse/MESOS-3969
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile: see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Joseph Wu
>  Labels: build, build-failure, mesosphere
>
> As non-root: make distcheck.
> {noformat}
> /bin/mkdir -p '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin'
> /bin/bash ../libtool --mode=install /usr/bin/install -c mesos-local mesos-log 
> mesos mesos-execute mesos-resolve 
> '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin'
> libtool: install: /usr/bin/install -c .libs/mesos-local 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-local
> libtool: install: /usr/bin/install -c .libs/mesos-log 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-log
> libtool: install: /usr/bin/install -c .libs/mesos 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos
> libtool: install: /usr/bin/install -c .libs/mesos-execute 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-execute
> libtool: install: /usr/bin/install -c .libs/mesos-resolve 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-resolve
> Traceback (most recent call last):
> File "", line 1, in 
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/__init__.py",
>  line 11, in 
> from pip.vcs import git, mercurial, subversion, bazaar # noqa
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/vcs/mercurial.py",
>  line 9, in 
> from pip.download import path_to_url
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/download.py",
>  line 22, in 
> from pip._vendor import requests, six
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/_vendor/requests/__init__.py",
>  line 53, in 
> from .packages.urllib3.contrib import pyopenssl
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py",
>  line 70, in 
> ssl.PROTOCOL_SSLv3: OpenSSL.SSL.SSLv3_METHOD,
> AttributeError: 'module' object has no attribute 'PROTOCOL_SSLv3'
> Traceback (most recent call last):
> File "", line 1, in 
> File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rd
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3973) Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3973:
---
Story Points: 2

> Failing 'make distcheck' on Mac OS X 10.10.5, also 10.11.
> -
>
> Key: MESOS-3973
> URL: https://issues.apache.org/jira/browse/MESOS-3973
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.26.0
> Environment: Mac OS X 10.10.5, Clang 7.0.0.
>Reporter: Bernd Mathiske
>Assignee: Gilbert Song
>  Labels: build, build-failure, mesosphere
>
> Non-root 'make distcheck'.
> {noformat}
> ...
> [--] Global test environment tear-down
> [==] 826 tests from 113 test cases ran. (276624 ms total)
> [  PASSED  ] 826 tests.
>   YOU HAVE 6 DISABLED TESTS
> Making install in .
> make[3]: Nothing to be done for `install-exec-am'.
>  ../install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig'
>  /usr/bin/install -c -m 644 mesos.pc 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/lib/pkgconfig'
> Making install in 3rdparty
> /Applications/Xcode.app/Contents/Developer/usr/bin/make  install-recursive
> Making install in libprocess
> Making install in 3rdparty
> /Applications/Xcode.app/Contents/Developer/usr/bin/make  install-recursive
> Making install in stout
> Making install in .
> make[9]: Nothing to be done for `install-exec-am'.
> make[9]: Nothing to be done for `install-data-am'.
> Making install in include
> make[9]: Nothing to be done for `install-exec-am'.
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include'
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/install-sh -c -d 
> '/Users/bernd/mesos/mesos/build/mesos-0.26.0/_inst/include/stout'
>  /usr/bin/install -c -m 644  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/abort.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/attributes.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/base64.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bits.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/bytes.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/cache.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/dynamiclibrary.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/error.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/exit.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/flags.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/foreach.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/format.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/fs.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gtest.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/gzip.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashmap.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/hashset.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/interval.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/ip.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/lambda.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/linkedhashmap.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/list.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/mac.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multihashmap.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/multimap.hpp
>  ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/none.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/nothing.hpp
>  
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/numify.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/path.hpp 
> ../../../../../../3rdparty/libprocess/3rdparty/stout/include/stout/preprocessor.hpp
>  

[jira] [Updated] (MESOS-3988) Implicit roles

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3988:
---
  Assignee: Neil Conway
Issue Type: Epic  (was: Improvement)

> Implicit roles
> --
>
> Key: MESOS-3988
> URL: https://issues.apache.org/jira/browse/MESOS-3988
> Project: Mesos
>  Issue Type: Epic
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere, roles
>
> At present, Mesos uses a static list of roles that are configured when the 
> master starts up. This places some severe limitations on how roles can be 
> used (e.g., changing the set of roles requires restarting all the masters).
> As an alternative (or a precursor) to implementing full-blown dynamic roles, 
> we could instead relax the concept of roles, so that:
> * frameworks can register with any role (subject to ACLs/authz)
> * reservations can be made for any role
> Open questions, at least to me:
> * This would mean weights cannot be configured dynamically. Is that okay?
> * Is this feature useful enough without dynamic ACL changes?
> * If we implement this (+ dynamic ACLs), do we also need dynamic roles?
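
The difference between the static role list and the proposed implicit roles can be illustrated with a small sketch. This is only an illustration under stated assumptions: `Authorizer`, `canRegisterStatic`, and `canRegisterImplicit` are hypothetical names invented here and are not Mesos APIs, and the deny-list is just a stand-in for whatever ACL/authz mechanism applies.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for an ACL/authz check; not a Mesos API.
struct Authorizer {
  // Entries of the form "principal:role" that are explicitly denied.
  std::set<std::string> denied;

  bool authorized(const std::string& principal,
                  const std::string& role) const {
    return denied.count(principal + ":" + role) == 0;
  }
};

// Static model: the role must appear in the list configured when the
// master starts, so adding a role means restarting the masters.
bool canRegisterStatic(const std::set<std::string>& whitelist,
                       const std::string& role) {
  return whitelist.count(role) > 0;
}

// Implicit model: any role name is accepted, subject only to authorization.
bool canRegisterImplicit(const Authorizer& authz,
                         const std::string& principal,
                         const std::string& role) {
  return authz.authorized(principal, role);
}
```

In the static model a framework asking for a role that is not in the configured list is rejected; in the implicit model the same request succeeds unless the authorization check denies it.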





[jira] [Updated] (MESOS-3916) MasterMaintenanceTest.InverseOffersFilters is flaky

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3916:
---
Story Points: 3

> MasterMaintenanceTest.InverseOffersFilters is flaky
> ---
>
> Key: MESOS-3916
> URL: https://issues.apache.org/jira/browse/MESOS-3916
> Project: Mesos
>  Issue Type: Bug
> Environment: Ubuntu Wily 64 bit
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: flaky-test, maintenance, mesosphere
> Attachments: wily_maintenance_test_verbose.txt
>
>
> Verbose Logs:
> {code}
> [ RUN  ] MasterMaintenanceTest.InverseOffersFilters
> I1113 16:43:58.486469  8728 leveldb.cpp:176] Opened db in 2.360405ms
> I1113 16:43:58.486935  8728 leveldb.cpp:183] Compacted db in 407105ns
> I1113 16:43:58.486995  8728 leveldb.cpp:198] Created db iterator in 16221ns
> I1113 16:43:58.487030  8728 leveldb.cpp:204] Seeked to beginning of db in 
> 10935ns
> I1113 16:43:58.487046  8728 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 999ns
> I1113 16:43:58.487090  8728 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1113 16:43:58.487735  8747 recover.cpp:449] Starting replica recovery
> I1113 16:43:58.488047  8747 recover.cpp:475] Replica is in EMPTY status
> I1113 16:43:58.488977  8745 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (58)@10.0.2.15:45384
> I1113 16:43:58.489452  8746 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1113 16:43:58.489712  8747 recover.cpp:566] Updating replica status to 
> STARTING
> I1113 16:43:58.490706  8742 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 745443ns
> I1113 16:43:58.490739  8742 replica.cpp:323] Persisted replica status to 
> STARTING
> I1113 16:43:58.490859  8742 recover.cpp:475] Replica is in STARTING status
> I1113 16:43:58.491786  8747 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (59)@10.0.2.15:45384
> I1113 16:43:58.492542  8749 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1113 16:43:58.493221  8743 recover.cpp:566] Updating replica status to VOTING
> I1113 16:43:58.493710  8743 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 331874ns
> I1113 16:43:58.493767  8743 replica.cpp:323] Persisted replica status to 
> VOTING
> I1113 16:43:58.493868  8743 recover.cpp:580] Successfully joined the Paxos 
> group
> I1113 16:43:58.494119  8743 recover.cpp:464] Recover process terminated
> I1113 16:43:58.504369  8749 master.cpp:367] Master 
> d59449fc-5462-43c5-b935-e05563fdd4b6 (vagrant-ubuntu-wily-64) started on 
> 10.0.2.15:45384
> I1113 16:43:58.504438  8749 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/ZB7csS/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="25secs" 
> --registry_strict="true" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/ZB7csS/master" 
> --zk_session_timeout="10secs"
> I1113 16:43:58.504717  8749 master.cpp:416] Master allowing unauthenticated 
> frameworks to register
> I1113 16:43:58.504889  8749 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1113 16:43:58.504922  8749 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/ZB7csS/credentials'
> I1113 16:43:58.505497  8749 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1113 16:43:58.505759  8749 master.cpp:495] Authorization enabled
> I1113 16:43:58.507638  8746 master.cpp:1606] The newly elected leader is 
> master@10.0.2.15:45384 with id d59449fc-5462-43c5-b935-e05563fdd4b6
> I1113 16:43:58.507693  8746 master.cpp:1619] Elected as the leading master!
> I1113 16:43:58.507720  8746 master.cpp:1379] Recovering from registrar
> I1113 16:43:58.507946  8749 registrar.cpp:309] Recovering registrar
> I1113 16:43:58.508561  8749 log.cpp:661] Attempting to start the writer
> I1113 16:43:58.510282  8747 replica.cpp:496] Replica received implicit 
> promise request from (60)@10.0.2.15:45384 with proposal 1
> I1113 16:43:58.510867  8747 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 

[jira] [Updated] (MESOS-3851) Investigate recent crashes in Command Executor

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3851:
---
Story Points: 2

> Investigate recent crashes in Command Executor
> --
>
> Key: MESOS-3851
> URL: https://issues.apache.org/jira/browse/MESOS-3851
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>Priority: Blocker
>  Labels: mesosphere
>
> After https://reviews.apache.org/r/38900 (updating CommandExecutor to 
> support rootfs), some tests seem to show frequent crashes due to 
> assert violations.
> {{FetcherCacheTest.SimpleEviction}} failed with the following log:
> {code}
> I1107 19:36:46.360908 30657 slave.cpp:1793] Sending queued task '3' to 
> executor ''3' of framework 7d94c7fb-8950-4bcf-80c1-46112292dcd6- at 
> executor(1)@172.17.5.200:33871'
> I1107 19:36:46.363682  1236 exec.cpp:297] 
> I1107 19:36:46.373569  1245 exec.cpp:210] Executor registered on slave 
> 7d94c7fb-8950-4bcf-80c1-46112292dcd6-S0
> @ 0x7f9f5a7db3fa  google::LogMessage::Fail()
> I1107 19:36:46.394081  1245 exec.cpp:222] Executor::registered took 395411ns
> @ 0x7f9f5a7db359  google::LogMessage::SendToLog()
> @ 0x7f9f5a7dad6a  google::LogMessage::Flush()
> @ 0x7f9f5a7dda9e  google::LogMessageFatal::~LogMessageFatal()
> @   0x48d00a  _CheckFatal::~_CheckFatal()
> @   0x49c99d  
> mesos::internal::CommandExecutorProcess::launchTask()
> @   0x4b3dd7  
> _ZZN7process8dispatchIN5mesos8internal22CommandExecutorProcessEPNS1_14ExecutorDriverERKNS1_8TaskInfoES5_S6_EEvRKNS_3PIDIT_EEMSA_FvT0_T1_ET2_T3_ENKUlPNS_11ProcessBaseEE_clESL_
> @   0x4c470c  
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal22CommandExecutorProcessEPNS5_14ExecutorDriverERKNS5_8TaskInfoES9_SA_EEvRKNS0_3PIDIT_EEMSE_FvT0_T1_ET2_T3_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x7f9f5a761b1b  std::function<>::operator()()
> @ 0x7f9f5a749935  process::ProcessBase::visit()
> @ 0x7f9f5a74d700  process::DispatchEvent::visit()
> @   0x48e004  process::ProcessBase::serve()
> @ 0x7f9f5a745d21  process::ProcessManager::resume()
> @ 0x7f9f5a742f52  
> _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_
> @ 0x7f9f5a74cf2c  
> _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
> @ 0x7f9f5a74cedc  
> _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_
> @ 0x7f9f5a74ce6e  
> _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE
> @ 0x7f9f5a74cdc5  
> _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv
> @ 0x7f9f5a74cd5e  
> _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
> @ 0x7f9f5624f1e0  (unknown)
> @ 0x7f9f564a8df5  start_thread
> @ 0x7f9f559b71ad  __clone
> I1107 19:36:46.551370 30656 containerizer.cpp:1257] Executor for container 
> '6553a617-6b4a-418d-9759-5681f45ff854' has exited
> I1107 19:36:46.551429 30656 containerizer.cpp:1074] Destroying container 
> '6553a617-6b4a-418d-9759-5681f45ff854'
> I1107 19:36:46.553869 30656 containerizer.cpp:1257] Executor for container 
> 'd2c1f924-c92a-453e-82b1-c294d09c4873' has exited
> {code}
> The reason seems to be a race: the executor receives a 
> {{RunTaskMessage}} before the {{ExecutorRegisteredMessage}}, leading to the 
> {{CHECK_SOME(executorInfo)}} failure.
> Link to complete log: 
> https://issues.apache.org/jira/browse/MESOS-2831?focusedCommentId=14995535&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14995535
> Another related failure, from {{ExamplesTest.PersistentVolumeFramework}}:
> {code}
> @ 0x7f4f71529cbd  google::LogMessage::SendToLog()
> I1107 13:15:09.949987 31573 slave.cpp:2337] Status update manager 
> successfully handled status update acknowledgement (UUID: 
> 721c7316-5580-4636-a83a-098e3bd4ed1f) for task 
> ad90531f-d3d8-43f6-96f2-c81c4548a12d of framework 
> ac4ea54a-7d19-4e41-9ee3-1a761f8e5b0f-
> @ 0x7f4f715296ce  google::LogMessage::Flush()
> @ 0x7f4f7152c402  google::LogMessageFatal::~LogMessageFatal()
> @   0x48d00a  _CheckFatal::~_CheckFatal()
> @   0x49c99d  
> 
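
The race described above (a {{RunTaskMessage}} handled before the {{ExecutorRegisteredMessage}}) can be sketched as follows. This is a minimal illustration of one possible mitigation, buffering tasks until registration completes; it is not the fix adopted in Mesos. `SketchExecutor` and the simplified `ExecutorInfo` and `TaskInfo` structs are hypothetical stand-ins for the real classes.

```cpp
#include <queue>
#include <string>
#include <vector>

// Simplified, hypothetical stand-ins for the Mesos protobuf types.
struct ExecutorInfo { std::string id; };
struct TaskInfo { std::string id; };

class SketchExecutor {
public:
  // Called when the registration message arrives: record the executor
  // info, then launch every task that was received too early.
  void registered(const ExecutorInfo& info) {
    executorInfo = info;
    isRegistered = true;
    while (!pending.empty()) {
      launch(pending.front());
      pending.pop();
    }
  }

  // Called when a run-task message arrives. Instead of asserting that the
  // executor info is already set, queue the task if registration has not
  // happened yet.
  void launchTask(const TaskInfo& task) {
    if (!isRegistered) {
      pending.push(task);
      return;
    }
    launch(task);
  }

  const std::vector<std::string>& launched() const { return launchedIds; }

private:
  void launch(const TaskInfo& task) { launchedIds.push_back(task.id); }

  bool isRegistered = false;
  ExecutorInfo executorInfo;
  std::queue<TaskInfo> pending;
  std::vector<std::string> launchedIds;
};
```

With this shape, a task that arrives before registration is held back and launched as soon as `registered()` runs, so no check on the executor info can fire in between.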

[jira] [Updated] (MESOS-3937) Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3937:
---
Story Points: 2

> Test DockerContainerizerTest.ROOT_DOCKER_Launch_Executor fails.
> ---
>
> Key: MESOS-3937
> URL: https://issues.apache.org/jira/browse/MESOS-3937
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.26.0
> Environment: Ubuntu 14.04, gcc 4.8.4, Docker version 1.6.2
> 8 CPUs, 16 GB memory
> Vagrant, libvirt/Virtual Box or VMware
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> {noformat}
> ../configure
> make check
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Launch_Executor" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from DockerContainerizerTest
> I1117 15:08:09.265943 26380 leveldb.cpp:176] Opened db in 3.199666ms
> I1117 15:08:09.267761 26380 leveldb.cpp:183] Compacted db in 1.684873ms
> I1117 15:08:09.267902 26380 leveldb.cpp:198] Created db iterator in 58313ns
> I1117 15:08:09.267966 26380 leveldb.cpp:204] Seeked to beginning of db in 
> 4927ns
> I1117 15:08:09.267997 26380 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 1605ns
> I1117 15:08:09.268156 26380 replica.cpp:780] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1117 15:08:09.270148 26396 recover.cpp:449] Starting replica recovery
> I1117 15:08:09.272105 26396 recover.cpp:475] Replica is in EMPTY status
> I1117 15:08:09.275640 26396 replica.cpp:676] Replica in EMPTY status received 
> a broadcasted recover request from (4)@10.0.2.15:50088
> I1117 15:08:09.276578 26399 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1117 15:08:09.277600 26397 recover.cpp:566] Updating replica status to 
> STARTING
> I1117 15:08:09.279613 26396 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.016098ms
> I1117 15:08:09.279731 26396 replica.cpp:323] Persisted replica status to 
> STARTING
> I1117 15:08:09.280306 26399 recover.cpp:475] Replica is in STARTING status
> I1117 15:08:09.282181 26400 replica.cpp:676] Replica in STARTING status 
> received a broadcasted recover request from (5)@10.0.2.15:50088
> I1117 15:08:09.282552 26400 master.cpp:367] Master 
> 59c600f1-92ff-4926-9c84-073d9b81f68a (vagrant-ubuntu-trusty-64) started on 
> 10.0.2.15:50088
> I1117 15:08:09.283021 26400 master.cpp:369] Flags at startup: --acls="" 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/40AlT8/credentials" 
> --framework_sorter="drf" --help="false" --hostname_lookup="true" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="25secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/40AlT8/master" 
> --zk_session_timeout="10secs"
> I1117 15:08:09.283920 26400 master.cpp:414] Master only allowing 
> authenticated frameworks to register
> I1117 15:08:09.283972 26400 master.cpp:419] Master only allowing 
> authenticated slaves to register
> I1117 15:08:09.284032 26400 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/40AlT8/credentials'
> I1117 15:08:09.282944 26401 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1117 15:08:09.284639 26401 recover.cpp:566] Updating replica status to VOTING
> I1117 15:08:09.285539 26400 master.cpp:458] Using default 'crammd5' 
> authenticator
> I1117 15:08:09.285995 26401 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.075466ms
> I1117 15:08:09.286062 26401 replica.cpp:323] Persisted replica status to 
> VOTING
> I1117 15:08:09.286200 26401 recover.cpp:580] Successfully joined the Paxos 
> group
> I1117 15:08:09.286471 26401 recover.cpp:464] Recover process terminated
> I1117 15:08:09.287303 26400 authenticator.cpp:520] Initializing server SASL
> I1117 15:08:09.289371 26400 master.cpp:495] Authorization enabled
> I1117 15:08:09.296018 26399 master.cpp:1606] The newly elected leader is 
> master@10.0.2.15:50088 with id 59c600f1-92ff-4926-9c84-073d9b81f68a
> I1117 15:08:09.296115 26399 master.cpp:1619] Elected as the leading master!
> I1117 15:08:09.296187 26399 

[jira] [Updated] (MESOS-3949) User CGroup Isolation tests fail on Centos 6.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3949:
---
Story Points: 3

> User CGroup Isolation tests fail on Centos 6.
> -
>
> Key: MESOS-3949
> URL: https://issues.apache.org/jira/browse/MESOS-3949
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation
>Affects Versions: 0.26.0
> Environment: CentOS 6.6, gcc 4.8.1, on vagrant libvirt, 16GB, 8 CPUs,
> ../configure --enable-libevent --enable-ssl
>Reporter: Bernd Mathiske
>Assignee: Alexander Rojas
>  Labels: mesosphere
>
> UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup and 
> UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup fail on CentOS 6.6 with 
> similar output when libevent and SSL are enabled.
> {noformat}
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from UserCgroupIsolatorTest/0, where TypeParam = 
> mesos::internal::slave::CgroupsMemIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
> I1118 16:53:35.273717 30249 mem.cpp:605] Started listening for OOM events for 
> container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.274538 30249 mem.cpp:725] Started listening on low memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.275164 30249 mem.cpp:725] Started listening on medium memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.275784 30249 mem.cpp:725] Started listening on critical memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.276448 30249 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' 
> to 1GB for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.277331 30249 mem.cpp:391] Updated 'memory.limit_in_bytes' to 
> 1GB for container 867a829e-4a26-43f5-86e0-938bf1f47688
> -bash: 
> /sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/cgroup.procs:
>  No such file or directory
> mkdir: cannot create directory 
> `/sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/user': No 
> such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1307: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'mkdir " + 
> path::join(flags.cgroups_hierarchy, userCgroup) + "'")
>   Actual: 256
> Expected: 0
> -bash: 
> /sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/user/cgroup.procs:
>  No such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1316: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'echo $$ >" + 
> path::join(flags.cgroups_hierarchy, userCgroup, "cgroup.procs") + "'")
>   Actual: 256
> Expected: 0
> [  FAILED  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess (149 ms)
> {noformat}
> {noformat}
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from UserCgroupIsolatorTest/1, where TypeParam = 
> mesos::internal::slave::CgroupsCpushareIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> I1118 17:01:00.550706 30357 cpushare.cpp:392] Updated 'cpu.shares' to 1024 
> (cpus 1) for container e57f4343-1a97-4b44-b347-803be47ace80
> -bash: 
> /sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/cgroup.procs:
>  No such file or directory
> mkdir: cannot create directory 
> `/sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/user': No 
> such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1307: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'mkdir " + 
> path::join(flags.cgroups_hierarchy, userCgroup) + "'")
>   Actual: 256
> Expected: 0
> -bash: 
> /sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/user/cgroup.procs:
>  No such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1316: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'echo $$ >" + 
> path::join(flags.cgroups_hierarchy, userCgroup, "cgroup.procs") + "'")
>   Actual: 256
> Expected: 0
> -bash: 
> /sys/fs/cgroup/cpu/mesos/e57f4343-1a97-4b44-b347-803be47ace80/cgroup.procs: 
> No such file or directory
> mkdir: cannot create directory 
> 
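
A note on the `Actual: 256` values in the test output above: assuming `os::system` hands back the raw wait status the way the C `system()` call does, 256 corresponds to an exit code of 1 from the `su - ... -c '...'` command. The sketch below shows the decoding; `exitCodeFromStatus` is a hypothetical helper written for this illustration and is not part of stout.

```cpp
#include <sys/wait.h>

// Decodes a raw wait status, as returned by system() or waitpid().
// Returns the exit code if the child exited normally, or -1 otherwise
// (for example, if it was killed by a signal).
int exitCodeFromStatus(int status) {
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under that assumption, `exitCodeFromStatus(256)` yields 1, which matches the failing `mkdir` and `echo` commands reporting "No such file or directory".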

[jira] [Updated] (MESOS-3988) Implicit roles

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3988:
---
Story Points: 8  (was: 10)

> Implicit roles
> --
>
> Key: MESOS-3988
> URL: https://issues.apache.org/jira/browse/MESOS-3988
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Neil Conway
>  Labels: mesosphere, roles
>
> At present, Mesos uses a static list of roles that are configured when the 
> master starts up. This places some severe limitations on how roles can be 
> used (e.g., changing the set of roles requires restarting all the masters).
> As an alternative (or a precursor) to implementing full-blown dynamic roles, 
> we could instead relax the concept of roles, so that:
> * frameworks can register with any role (subject to ACLs/authz)
> * reservations can be made for any role
> Open questions, at least to me:
> * This would mean weights cannot be configured dynamically. Is that okay?
> * Is this feature useful enough without dynamic ACL changes?
> * If we implement this (+ dynamic ACLs), do we also need dynamic roles?





[jira] [Updated] (MESOS-3975) SSL build of mesos causes flaky testsuite.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3975:
---
Story Points: 5

> SSL build of mesos causes flaky testsuite.
> --
>
> Key: MESOS-3975
> URL: https://issues.apache.org/jira/browse/MESOS-3975
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.26.0
> Environment: CentOS 7.1, Kernel 3.10.0-229.20.1.el7.x86_64, gcc 
> 4.8.3, Docker 1.9
>Reporter: Till Toenshoff
>Assignee: Joris Van Remoortere
>  Labels: mesosphere
>
> When running the tests of an SSL build of Mesos on CentOS 7.1, I see spurious 
> test failures that are, so far, not reproducible.
> The following tests failed for me in complete runs but seemed fine when 
> run individually and repeatedly.
> {noformat}
> DockerTest.ROOT_DOCKER_CheckPortResource
> {noformat}
> {noformat}
> ContainerizerTest.ROOT_CGROUPS_BalloonFramework
> {noformat}
> {noformat}
> [ RUN  ] 
> LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystemCommandExecutor
> 2015-11-20 
> 19:08:38,826:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> + /home/vagrant/mesos/build/src/mesos-containerizer mount --help=false 
> --operation=make-rslave --path=/
> + grep -E 
> /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/.+
>  /proc/self/mountinfo
> + grep -v 2b98025c-74f1-41d2-b35a-ce2cdfae347e
> + cut '-d ' -f5
> + xargs --no-run-if-empty umount -l
> + mount -n --rbind 
> /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/provisioner/containers/2b98025c-74f1-41d2-b35a-ce2cdfae347e/backends/copy/rootfses/bed11080-474b-4c69-8e7f-0ab85e895b0d
>  
> /tmp/LinuxFilesystemIsolatorTest_ROOT_ChangeRootFilesystemCommandExecutor_Tz7P8c/slaves/830e842e-c36a-4e4c-bff4-5b9568d7df12-S0/frameworks/830e842e-c36a-4e4c-bff4-5b9568d7df12-/executors/c735be54-c47f-4645-bfc1-2f4647e2cddb/runs/2b98025c-74f1-41d2-b35a-ce2cdfae347e/.rootfs
> Could not load cert file
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:354: Failure
> Value of: statusRunning.get().state()
>   Actual: TASK_FAILED
> Expected: TASK_RUNNING
> 2015-11-20 
> 19:08:42,164:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> 2015-11-20 
> 19:08:45,501:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> 2015-11-20 
> 19:08:48,837:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> 2015-11-20 
> 19:08:52,174:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:355: Failure
> Failed to wait 15secs for statusFinished
> ../../src/tests/containerizer/filesystem_isolator_tests.cpp:349: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> statusUpdate(, _))...
>  Expected: to be called twice
>Actual: called once - unsatisfied and active
> 2015-11-20 
> 19:08:55,511:21380(0x7fa10d5f2700):ZOO_ERROR@handle_socket_error_msg@1697: 
> Socket [127.0.0.1:53444] zk retcode=-4, errno=111(Connection refused): server 
> refused to accept the client
> *** Aborted at 1448046536 (unix time) try "date -d @1448046536" if you are 
> using GNU date ***
> PC: @0x0 (unknown)
> *** SIGSEGV (@0x0) received by PID 21380 (TID 0x7fa1549e68c0) from PID 0; 
> stack trace: ***
> @ 0x7fa141796fbb (unknown)
> @ 0x7fa14179b341 (unknown)
> @ 0x7fa14f096130 (unknown)
> {noformat}
> Vagrantfile generator:
> {noformat}
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
>   # Disable shared folder to prevent certain kernel module dependencies.
>   config.vm.synced_folder ".", "/vagrant", disabled: true
>   config.vm.hostname = "centos71"
>   config.vm.box = "bento/centos-7.1"
>   config.vm.provider "virtualbox" do |vb|
>     vb.memory = 16384
>     vb.cpus = 8
>   end
>   config.vm.provider "vmware_fusion" do |vb|
>     vb.memory = 9216
>     vb.cpus = 4
>   end
>   config.vm.provision "shell", inline: <<-SHELL
>  sudo yum -y update systemd
>  sudo yum install -y tar wget
>  sudo wget 
> http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo 
> -O 

[jira] [Commented] (MESOS-3949) User CGroup Isolation tests fail on Centos 6.

2015-11-23 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15023507#comment-15023507
 ] 

Marco Massenzio commented on MESOS-3949:


{quote}
even if they didn't ever pass, it is a good idea to check why they didn't.
{quote}

I couldn't agree more!
Thanks for the investigative work, looking forward to learning what you find.

What I meant was, however, related to whether these tests' failure should block 
the {{0.26}} release; regardless of that, we should of course get to the bottom 
of the failure and make a determination as to whether the best course of action 
is to implement a fix (in the actual code and/or the test) or disable them on 
some given platforms.

Thanks again for being "on the ball" :)

> User CGroup Isolation tests fail on Centos 6.
> -
>
> Key: MESOS-3949
> URL: https://issues.apache.org/jira/browse/MESOS-3949
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation
>Affects Versions: 0.26.0
> Environment: CentOS 6.6, gcc 4.8.1, on vagrant libvirt, 16GB, 8 CPUs,
> ../configure --enable-libevent --enable-ssl
>Reporter: Bernd Mathiske
>Assignee: Alexander Rojas
>  Labels: mesosphere
>
> UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup and 
> UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup fail on CentOS 6.6 with 
> similar output when libevent and SSL are enabled.
> {noformat}
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from UserCgroupIsolatorTest/0, where TypeParam = 
> mesos::internal::slave::CgroupsMemIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
> I1118 16:53:35.273717 30249 mem.cpp:605] Started listening for OOM events for 
> container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.274538 30249 mem.cpp:725] Started listening on low memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.275164 30249 mem.cpp:725] Started listening on medium memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.275784 30249 mem.cpp:725] Started listening on critical memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.276448 30249 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' 
> to 1GB for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.277331 30249 mem.cpp:391] Updated 'memory.limit_in_bytes' to 
> 1GB for container 867a829e-4a26-43f5-86e0-938bf1f47688
> -bash: 
> /sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/cgroup.procs:
>  No such file or directory
> mkdir: cannot create directory 
> `/sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/user': No 
> such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1307: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'mkdir " + 
> path::join(flags.cgroups_hierarchy, userCgroup) + "'")
>   Actual: 256
> Expected: 0
> -bash: 
> /sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/user/cgroup.procs:
>  No such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1316: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'echo $$ >" + 
> path::join(flags.cgroups_hierarchy, userCgroup, "cgroup.procs") + "'")
>   Actual: 256
> Expected: 0
> [  FAILED  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess (149 ms)
> {noformat}
> {noformat}
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from UserCgroupIsolatorTest/1, where TypeParam = 
> mesos::internal::slave::CgroupsCpushareIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> I1118 17:01:00.550706 30357 cpushare.cpp:392] Updated 'cpu.shares' to 1024 
> (cpus 1) for container e57f4343-1a97-4b44-b347-803be47ace80
> -bash: 
> /sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/cgroup.procs:
>  No such file or directory
> mkdir: cannot create directory 
> `/sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/user': No 
> such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1307: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'mkdir " + 
> path::join(flags.cgroups_hierarchy, 

[jira] [Commented] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.

2015-11-23 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023475#comment-15023475
 ] 

Marco Massenzio commented on MESOS-3964:


Thanks for the investigation and proposed solution!

As I mentioned earlier today, please let's not confuse "tests passing on 
Configuration X" with "supporting Mesos running on OS Distro X" - those are two 
different (if somewhat overlapping) domains.

If a test consistently fails due to the lack of a kernel feature (or whatever) 
on a given platform, I find it perfectly reasonable to just disable the test. 
We can then have a conversation as to what it would take to make Mesos run on 
that same distro/OS, assuming that it currently doesn't, which does not seem to 
be the case here anyway.

> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and 
> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
> ---
>
> Key: MESOS-3964
> URL: https://issues.apache.org/jira/browse/MESOS-3964
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, test
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile: see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Greg Mann
>Priority: Blocker
>  Labels: mesosphere
>
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs"
> {noformat}
> ...
> F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): 
> Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the 
> CFS cgroups feature.
> {noformat}
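
The failure above comes from a missing `cpu.cfs_quota_us` control file. As a minimal sketch of how a test could probe for the feature before running (rather than aborting on a CHECK), the helper below simply tests whether that file can be opened. The function name `cfsQuotaAvailable` is hypothetical and not part of Mesos, and the caller is assumed to know where the cpu cgroup hierarchy is mounted.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not a Mesos API): returns true if the CFS quota
// control file can be opened under the given cpu cgroup hierarchy.
// The mount point is supplied by the caller; it varies between distros.
bool cfsQuotaAvailable(const std::string& cpuHierarchy)
{
  std::ifstream file(cpuHierarchy + "/cpu.cfs_quota_us");
  return file.good();
}
```

A test fixture could call this once and skip the CFS-specific tests when it returns false.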



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3497) Add implementation for sha256 based file content verification.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3497:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21  
(was: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 22)

> Add implementation for sha256 based file content verification.
> --
>
> Key: MESOS-3497
> URL: https://issues.apache.org/jira/browse/MESOS-3497
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> https://reviews.apache.org/r/38747/





[jira] [Updated] (MESOS-3496) Create interface for digest verifier

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3496:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 23  (was: Mesosphere Sprint 19, Mesosphere Sprint 20, 
Mesosphere Sprint 21)

> Create interface for digest verifier
> 
>
> Key: MESOS-3496
> URL: https://issues.apache.org/jira/browse/MESOS-3496
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Add an interface for a digest verifier so that we can add implementations for 
> digest types such as sha256 and sha512.





[jira] [Updated] (MESOS-3497) Add implementation for sha256 based file content verification.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3497:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 23  (was: Mesosphere Sprint 19, Mesosphere Sprint 20, 
Mesosphere Sprint 21)

> Add implementation for sha256 based file content verification.
> --
>
> Key: MESOS-3497
> URL: https://issues.apache.org/jira/browse/MESOS-3497
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> https://reviews.apache.org/r/38747/





[jira] [Updated] (MESOS-3496) Create interface for digest verifier

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3496:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21  
(was: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 22)

> Create interface for digest verifier
> 
>
> Key: MESOS-3496
> URL: https://issues.apache.org/jira/browse/MESOS-3496
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Add an interface for a digest verifier so that we can add implementations for 
> digest types such as sha256 and sha512.





[jira] [Updated] (MESOS-3024) HTTP endpoint authN is enabled merely by specifying --credentials

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3024:
---
Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 21, Mesosphere Sprint 22)

> HTTP endpoint authN is enabled merely by specifying --credentials
> -
>
> Key: MESOS-3024
> URL: https://issues.apache.org/jira/browse/MESOS-3024
> Project: Mesos
>  Issue Type: Bug
>  Components: master, security
>Reporter: Adam B
>Assignee: Marco Massenzio
>  Labels: authentication, http, mesosphere
>
> If I set `--credentials` on the master, framework and slave authentication 
> are allowed, but not required. On the other hand, HTTP authentication is now 
> required for authenticated endpoints (currently only `/shutdown`). That means 
> that I cannot enable framework or slave authentication without also enabling 
> HTTP endpoint authentication. This is undesirable.
> Framework and slave authentication have separate flags (`\--authenticate` and 
> `\--authenticate_slaves`) to require authentication for each. It would be 
> great if there were also such a flag for HTTP endpoint authentication. Or 
> maybe we get rid of these flags altogether and rely on ACLs to determine which 
> unauthenticated principals are even allowed to authenticate for each 
> endpoint/action.
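
The requested behavior can be sketched as a small decision function. This is an illustration only: the `Flags` struct is a hypothetical subset of the master flags, and `authenticate_http` is an assumed name for the flag the ticket asks for, not something the ticket defines. The sketch also simplifies by treating missing credentials as "nothing is required".

```cpp
// Hypothetical subset of master flags. `authenticate_http` is an assumed
// name for the flag requested in the ticket.
struct Flags
{
  bool credentials;          // True if --credentials was given.
  bool authenticate;         // Require framework authentication.
  bool authenticate_slaves;  // Require slave authentication.
  bool authenticate_http;    // Assumed: require HTTP endpoint authentication.
};

enum class Peer { FRAMEWORK, SLAVE, HTTP_ENDPOINT };

// Credentials make authentication possible; it is only required when the
// flag for that kind of peer is set.
bool authenticationRequired(const Flags& flags, Peer peer)
{
  if (!flags.credentials) {
    return false;  // Simplification: nothing to authenticate against.
  }
  switch (peer) {
    case Peer::FRAMEWORK: return flags.authenticate;
    case Peer::SLAVE: return flags.authenticate_slaves;
    case Peer::HTTP_ENDPOINT: return flags.authenticate_http;
  }
  return false;
}
```

With this shape, setting `--credentials` and `--authenticate` would require framework authentication without forcing it on HTTP endpoints.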





[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3086:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere 
Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, 
Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere 
Sprint 22)

> Create cgroups TasksKiller for non freeze subsystems.
> -
>
> Key: MESOS-3086
> URL: https://issues.apache.org/jira/browse/MESOS-3086
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We have a number of test issues when we cannot remove cgroups (because there 
> are still related tasks running) in cases where the freezer subsystem is not 
> available. 
> In the current code 
> (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we 
> fall back to a very simple mechanism of recursively trying to remove the 
> cgroups, which fails if there are still tasks running. 
> Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely 
> on the freezer subsystem.
> This problem caused issues when running 'sudo make check' during 0.23 release 
> testing, where BenH already provided a better error message with 
> b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604.
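
One possible shape for a killer that does not depend on the freezer is a bounded retry loop: list the tasks in the cgroup, signal each one, and repeat until the cgroup is empty. The sketch below is an assumption about how this could work, not the implementation proposed in the ticket. `killTasks` is a hypothetical name, and the two callbacks stand in for reading the cgroup's task list and sending SIGKILL. Without the freezer, tasks can fork between listing and killing, which is why the loop retries.

```cpp
#include <functional>
#include <vector>

// Hypothetical sketch: repeatedly kill every task listed for a cgroup until
// none remain or `maxAttempts` passes have been made.
//   listTasks: returns the pids currently in the cgroup.
//   kill:      delivers a kill signal to one pid.
// Returns true if the cgroup was observed empty.
bool killTasks(
    const std::function<std::vector<int>()>& listTasks,
    const std::function<void(int)>& kill,
    int maxAttempts)
{
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    const std::vector<int> pids = listTasks();
    if (pids.empty()) {
      return true;
    }
    for (int pid : pids) {
      kill(pid);
    }
  }
  return listTasks().empty();
}
```

Only once this returns true would it be safe to attempt removing the cgroup directory.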





[jira] [Updated] (MESOS-3785) Use URI content modification time to trigger fetcher cache updates.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3785:
---
Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 21, Mesosphere Sprint 22)

> Use URI content modification time to trigger fetcher cache updates.
> ---
>
> Key: MESOS-3785
> URL: https://issues.apache.org/jira/browse/MESOS-3785
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Bernd Mathiske
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> Instead of using checksums to trigger fetcher cache updates, we can for 
> starters use the content modification time (mtime), which is available for a 
> number of download protocols, e.g. HTTP and HDFS.
> Proposal: Instead of just fetching the content size, we fetch both size and 
> mtime together. As before, if there is no size, then caching fails and we 
> fall back on direct downloading to the sandbox. 
> Assuming a size is given, we compare the mtime from the fetch URI with the 
> mtime known to the cache. If it differs, we update the cache. (As a defensive 
> measure, a difference in size should also trigger an update.) 
> Not having an mtime available at the fetch URI is simply treated as a unique 
> valid mtime value that differs from all others. This means that when 
> initially there is no mtime, cache content remains valid until there is one. 
> Thereafter, a new lack of an mtime invalidates the cache once. In other 
> words: any change from no mtime to having one or back is the same as 
> encountering a new mtime.
> Note that this scheme does not require any new protobuf fields.
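
The decision rules in this proposal can be written down as a small pure function. The sketch below is only an illustration of the scheme as described; the type and function names (`MTime`, `UriInfo`, `CacheEntry`, `decide`) are hypothetical and do not come from the fetcher code. A missing mtime is modeled as one more distinct value, so it compares equal only to another missing mtime.

```cpp
#include <cstdint>

// An mtime that may be absent. Absence compares equal only to absence, so it
// behaves like one more distinct mtime value.
struct MTime
{
  bool known;
  std::int64_t seconds;
};

bool operator==(const MTime& a, const MTime& b)
{
  if (!a.known || !b.known) {
    return a.known == b.known;
  }
  return a.seconds == b.seconds;
}

// What the fetch URI reports (hypothetical shape).
struct UriInfo
{
  bool hasSize;
  std::uint64_t size;
  MTime mtime;
};

// What the cache remembers about a previously fetched URI.
struct CacheEntry
{
  std::uint64_t size;
  MTime mtime;
};

enum class Action { DOWNLOAD_TO_SANDBOX, UPDATE_CACHE, USE_CACHE };

// `cached` is nullptr when the URI has no cache entry yet.
Action decide(const UriInfo& uri, const CacheEntry* cached)
{
  if (!uri.hasSize) {
    return Action::DOWNLOAD_TO_SANDBOX;  // No size: caching is not possible.
  }
  if (cached == nullptr) {
    return Action::UPDATE_CACHE;         // First fetch populates the cache.
  }
  if (cached->size != uri.size) {
    return Action::UPDATE_CACHE;         // Defensive: size changed.
  }
  if (!(cached->mtime == uri.mtime)) {
    return Action::UPDATE_CACHE;         // New, lost, or gained mtime.
  }
  return Action::USE_CACHE;
}
```

Because absence is just another value, going from no mtime to having one (or back) triggers exactly one update, matching the behavior described above.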





[jira] [Updated] (MESOS-3862) Authorize quota requests

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3862:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Authorize quota requests
> 
>
> Key: MESOS-3862
> URL: https://issues.apache.org/jira/browse/MESOS-3862
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: acl, mesosphere, security
>
> When quotas are requested they should authorize their roles.
> This ticket will authorize quota requests with ACLs. The existing 
> authorization support that has been implemented in MESOS-1342 will be 
> extended to add a `request_quotas` ACL.





[jira] [Updated] (MESOS-2210) Disallow special characters in role.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2210:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Disallow special characters in role.
> 
>
> Key: MESOS-2210
> URL: https://issues.apache.org/jira/browse/MESOS-2210
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: haosdent
>  Labels: mesosphere, newbie, persistent-volumes
>
> As we introduce persistent volumes in MESOS-1524, we will use roles as 
> directory names on the slave (https://reviews.apache.org/r/28562/). As a 
> result, the master should disallow special characters (like space and slash) 
> in role.
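
Since roles end up as directory names, the check amounts to rejecting anything that is unsafe as a single path component. A minimal sketch follows; the function name `isValidRole` and the exact set of rejected characters are assumptions for illustration, as the ticket only names space and slash as examples.

```cpp
#include <string>

// Hypothetical validator: accepts a role only if it is safe to use as a
// single directory name. The precise character set is an assumption.
bool isValidRole(const std::string& role)
{
  // Empty names and the special directory entries are never valid.
  if (role.empty() || role == "." || role == "..") {
    return false;
  }
  for (char c : role) {
    const unsigned char u = static_cast<unsigned char>(c);
    // Reject whitespace and control characters (<= 0x20), DEL, and
    // path separators.
    if (u <= 0x20 || u == 0x7f || c == '/' || c == '\\') {
      return false;
    }
  }
  return true;
}
```

The master could run such a check when a framework registers and reject the registration on failure.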





[jira] [Updated] (MESOS-3839) Update documentation for FetcherCache mtime-related changes

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3839:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Update documentation for FetcherCache mtime-related changes
> ---
>
> Key: MESOS-3839
> URL: https://issues.apache.org/jira/browse/MESOS-3839
> Project: Mesos
>  Issue Type: Documentation
>  Components: fetcher, slave
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-3476) Refactor Status Update method on Agent to handle HTTP based Executors

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3476:
---
Sprint: Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22, 
Mesosphere Sprint 23  (was: Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 22)

> Refactor Status Update method on Agent to handle HTTP based Executors
> -
>
> Key: MESOS-3476
> URL: https://issues.apache.org/jira/browse/MESOS-3476
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, receiving a status update sent from the slave to itself, 
> {{runTask}}, {{killTask}}, and status updates from executors are handled by 
> the {{Slave::statusUpdate}} method on Slave. The signature of the method is 
> {{void Slave::statusUpdate(StatusUpdate update, const UPID& pid)}}. 
> We need to create another overload of it that can also handle HTTP based 
> executors, which the previous PID based function can also call into. The 
> signature of the new function could be:
> {{void Slave::statusUpdate(StatusUpdate update, Executor* executor)}}
> The HTTP Executor would also call into this new function via 
> {{src/slave/http.cpp}}.
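
The delegation described here can be sketched with stand-in types. Only the two `statusUpdate` signatures come from the ticket; the stripped-down `StatusUpdate`, `UPID`, `Executor` and `Slave` definitions, the `addExecutor` helper and the `handled` log are all hypothetical scaffolding so the example compiles on its own.

```cpp
#include <map>
#include <string>
#include <vector>

// Minimal stand-ins for the real types; only what the sketch needs.
struct StatusUpdate { std::string taskId; std::string state; };
struct UPID { std::string id; };
struct Executor { std::string id; UPID pid; bool http; };

class Slave
{
public:
  // Hypothetical helper: registers a PID based executor for lookup.
  void addExecutor(Executor* executor)
  {
    executorsByPid[executor->pid.id] = executor;
  }

  // Existing PID based entry point: resolve the executor, then delegate.
  void statusUpdate(StatusUpdate update, const UPID& pid)
  {
    std::map<std::string, Executor*>::iterator it =
      executorsByPid.find(pid.id);
    Executor* executor = (it == executorsByPid.end()) ? nullptr : it->second;
    statusUpdate(update, executor);
  }

  // New overload: shared by PID based and HTTP based executors.
  void statusUpdate(StatusUpdate update, Executor* executor)
  {
    const std::string source =
      (executor == nullptr) ? std::string("unknown") : executor->id;
    handled.push_back(source + ":" + update.taskId + ":" + update.state);
  }

  std::vector<std::string> handled;  // Record of processed updates.

private:
  std::map<std::string, Executor*> executorsByPid;
};
```

The HTTP path would call the `Executor*` overload directly, while the PID path keeps its signature and forwards to the same code.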





[jira] [Updated] (MESOS-3065) Add framework authorization for persistent volume

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3065:
---
Sprint: Mesosphere Sprint 16, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 16, Mesosphere Sprint 22)

> Add framework authorization for persistent volume
> -
>
> Key: MESOS-3065
> URL: https://issues.apache.org/jira/browse/MESOS-3065
> Project: Mesos
>  Issue Type: Task
>Reporter: Michael Park
>Assignee: Greg Mann
>  Labels: mesosphere, persistent-volumes
>
> Persistent volume should be authorized with the {{principal}} of the 
> reserving entity (framework or master). The idea is to introduce {{Create}} 
> and {{Destroy}} into the ACL.
> {code}
>   message Create {
>     // Subjects.
>     required Entity principals = 1;
>     // Objects? Perhaps the kind of volume? allowed permissions?
>   }
>   message Destroy {
>     // Subjects.
>     required Entity principals = 1;
>     // Objects.
>     required Entity creator_principals = 2;
>   }
> {code}
> When a framework creates a persistent volume, "create" ACLs are checked to 
> see if the framework (FrameworkInfo.principal) or the operator 
> (Credential.user) is authorized to create persistent volumes. If not 
> authorized, the create operation is rejected.
> When a framework destroys a persistent volume, "destroy" ACLs are checked to 
> see if the framework (FrameworkInfo.principal) or the operator 
> (Credential.user) is authorized to destroy the persistent volume created by a 
> framework or operator (Resource.DiskInfo.principal). If not authorized, the 
> destroy operation is rejected.
> A separate ticket will use the structures created here to enable 
> authorization of the "/create" and "/destroy" HTTP endpoints: 
> https://issues.apache.org/jira/browse/MESOS-3903
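
To make the two checks concrete, here is a simplified sketch in plain C++ structs mirroring the `Create` and `Destroy` messages above. The `Entity` type with `SOME`/`ANY`/`NONE` is modeled loosely on the existing ACL entities, and the function names `matches`, `authorizeCreate` and `authorizeDestroy` are hypothetical. The matching rule used here (allow if any ACL matches) is a deliberate simplification and is not claimed to be the authorizer's actual evaluation order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Entity
{
  enum Type { SOME, ANY, NONE };
  Type type;
  std::vector<std::string> values;  // Only consulted when type == SOME.
};

bool matches(const Entity& entity, const std::string& principal)
{
  switch (entity.type) {
    case Entity::ANY: return true;
    case Entity::NONE: return false;
    case Entity::SOME:
      return std::find(entity.values.begin(), entity.values.end(),
                       principal) != entity.values.end();
  }
  return false;
}

struct Create { Entity principals; };
struct Destroy { Entity principals; Entity creator_principals; };

// Simplified rule: creation is allowed if any ACL names the principal.
bool authorizeCreate(
    const std::vector<Create>& acls, const std::string& principal)
{
  for (const Create& acl : acls) {
    if (matches(acl.principals, principal)) {
      return true;
    }
  }
  return false;
}

// Simplified rule: destruction is allowed if some ACL matches both the
// requesting principal and the principal that created the volume.
bool authorizeDestroy(
    const std::vector<Destroy>& acls,
    const std::string& principal,
    const std::string& creator)
{
  for (const Destroy& acl : acls) {
    if (matches(acl.principals, principal) &&
        matches(acl.creator_principals, creator)) {
      return true;
    }
  }
  return false;
}
```

Here `creator` corresponds to the principal recorded on the volume, which is why the destroy check needs the principal field discussed for `Resource.DiskInfo`.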





[jira] [Updated] (MESOS-3912) Rescind offers in order to satisfy quota

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3912:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Rescind offers in order to satisfy quota
> 
>
> Key: MESOS-3912
> URL: https://issues.apache.org/jira/browse/MESOS-3912
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> When a quota request comes in, we may need to rescind a certain number of 
> outstanding offers in order to satisfy it. Because resources are allocated in 
> the allocator, there can be a race between rescinding and allocating. This 
> race makes it hard to determine the exact number of offers that should be 
> rescinded in the master.
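
Ignoring the race for a moment, the naive selection step could look like the sketch below: walk the outstanding offers and rescind until enough resources have been freed. Everything here is hypothetical (`Offer`, `chooseOffersToRescind`, and the reduction of resources to a single `cpus` scalar). Because the allocator may hand out new offers concurrently, the result is only a snapshot estimate, which is exactly the difficulty the ticket describes.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified offer: one scalar resource only.
struct Offer
{
  std::string id;
  double cpus;
};

// Picks offers, in order, until at least `shortfall` cpus would be freed.
// Returns the ids of the offers to rescind.
std::vector<std::string> chooseOffersToRescind(
    const std::vector<Offer>& outstanding, double shortfall)
{
  std::vector<std::string> rescinded;
  double freed = 0.0;
  for (const Offer& offer : outstanding) {
    if (freed >= shortfall) {
      break;
    }
    rescinded.push_back(offer.id);
    freed += offer.cpus;
  }
  return rescinded;
}
```

If the outstanding offers cannot cover the shortfall, the function simply returns all of them.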





[jira] [Updated] (MESOS-3856) Add mtime-related fetcher tests

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3856:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Add mtime-related fetcher tests
> ---
>
> Key: MESOS-3856
> URL: https://issues.apache.org/jira/browse/MESOS-3856
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-3858) Draft quota limits design document

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3858:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Draft quota limits design document
> --
>
> Key: MESOS-3858
> URL: https://issues.apache.org/jira/browse/MESOS-3858
> Project: Mesos
>  Issue Type: Task
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: mesosphere, quota
>
> In the design documents for Quota 
> (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit#)
>  the proposed MVP does not include quota limits. Quota limits represent an 
> upper bound of resources that a role is allowed to use. The task of this 
> ticket is to outline a design document on how to implement quota limits when 
> the quota MVP is implemented.





[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3073:
---
Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16, Mesosphere Sprint 17, 
Mesosphere Sprint 18, Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere 
Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
15, Mesosphere Sprint 16, Mesosphere Sprint 17, Mesosphere Sprint 18, 
Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere 
Sprint 22)

> Introduce HTTP endpoints for Quota
> --
>
> Key: MESOS-3073
> URL: https://issues.apache.org/jira/browse/MESOS-3073
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> We need to implement the HTTP endpoints for Quota as outlined in the Design 
> Doc: 
> (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I).





[jira] [Updated] (MESOS-3732) Speed up FaultToleranceTest.FrameworkReregister test

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3732:
---
Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 21, Mesosphere Sprint 22)

> Speed up FaultToleranceTest.FrameworkReregister test
> 
>
> Key: MESOS-3732
> URL: https://issues.apache.org/jira/browse/MESOS-3732
> Project: Mesos
>  Issue Type: Improvement
>  Components: test
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere, newbie
>
> FaultToleranceTest.FrameworkReregister test takes more than one second to 
> complete:
> {code}
> [ RUN  ] FaultToleranceTest.FrameworkReregister
> [   OK ] FaultToleranceTest.FrameworkReregister (1056 ms)
> {code}
> There must be a {{1s}} timeout somewhere which we should mitigate via 
> {{Clock::advance()}}.
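
The idea behind advancing a clock instead of waiting can be shown with a tiny manual clock. This is not the libprocess `Clock` API; `ManualClock`, `schedule` and `advance` below are hypothetical names used only to illustrate why a test that controls time does not need to sleep through a one-second timeout.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <utility>

// Hypothetical test clock: time only moves when advance() is called.
class ManualClock
{
public:
  typedef std::chrono::milliseconds Millis;

  // Runs `callback` once the clock has advanced by at least `delay`.
  void schedule(Millis delay, std::function<void()> callback)
  {
    timers.emplace(now + delay, std::move(callback));
  }

  // Moves time forward and fires every timer that is now due, in order.
  void advance(Millis duration)
  {
    now += duration;
    while (!timers.empty() && timers.begin()->first <= now) {
      std::function<void()> callback = std::move(timers.begin()->second);
      timers.erase(timers.begin());
      callback();
    }
  }

private:
  Millis now{0};
  std::multimap<Millis, std::function<void()>> timers;
};
```

A test built this way schedules the timeout, advances the clock by one second, and continues immediately, so the wall-clock cost of the timeout disappears.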





[jira] [Updated] (MESOS-3550) Create a Executor Library based on the new Executor HTTP API

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3550:
---
Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 21, Mesosphere Sprint 22)

> Create a Executor Library based on the new Executor HTTP API
> 
>
> Key: MESOS-3550
> URL: https://issues.apache.org/jira/browse/MESOS-3550
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Similar to the Scheduler Library {{src/scheduler/scheduler.cpp}}, we would 
> need an Executor Library that speaks the new Executor HTTP API. 





[jira] [Updated] (MESOS-3868) Make apply-review.sh use apply-reviews.py

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3868:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Make apply-review.sh use apply-reviews.py
> -
>
> Key: MESOS-3868
> URL: https://issues.apache.org/jira/browse/MESOS-3868
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-3581) License headers show up all over doxygen documentation.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3581:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> License headers show up all over doxygen documentation.
> ---
>
> Key: MESOS-3581
> URL: https://issues.apache.org/jira/browse/MESOS-3581
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.24.1
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
>  Labels: mesosphere
>
> Currently license headers are commented in something resembling Javadoc style,
> {code}
> /**
> * Licensed ...
> {code}
> Since we use Javadoc-style comment blocks for doxygen documentation all 
> license headers appear in the generated documentation, potentially and likely 
> hiding the actual documentation.
> Using {{/*}} to start the comment blocks would be enough to hide them from 
> doxygen, but would likely also result in a largish (though mostly 
> uninteresting) patch.
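
The difference between the two comment openers can be seen side by side in the short illustration below. The function `add` is a made-up example, and the license text is left elided as in the ticket.

```cpp
/*
 * Licensed ...
 */
// The block above opens with a single asterisk, so doxygen treats it as an
// ordinary comment and leaves it out of the generated documentation.

/**
 * Returns the sum of two integers.
 */
// The block above opens with two asterisks (Javadoc style), so doxygen
// attaches it to the declaration that follows.
int add(int a, int b)
{
  return a + b;
}
```

Changing only the first line of each license header from `/**` to `/*` is therefore enough, at the cost of touching every file.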





[jira] [Updated] (MESOS-3339) Implement filtering mechanism for (Scheduler API Events) Testing

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3339:
---
Sprint: Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22, 
Mesosphere Sprint 23  (was: Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 22)

> Implement filtering mechanism for (Scheduler API Events) Testing
> 
>
> Key: MESOS-3339
> URL: https://issues.apache.org/jira/browse/MESOS-3339
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, our testing infrastructure does not have a mechanism for 
> filtering/dropping HTTP events of a particular type from the Scheduler API 
> response stream. We need a {{DROP_HTTP_CALLS}} abstraction that can help us 
> filter a particular event type.
> {code}
> // Enqueues all received events into a libprocess queue.
> ACTION_P(Enqueue, queue)
> {
>   std::queue<Event> events = arg0;
>   while (!events.empty()) {
>     // Note that we currently drop HEARTBEATs because most of these tests
>     // are not designed to deal with heartbeats.
>     // TODO(vinod): Implement DROP_HTTP_CALLS that can filter heartbeats.
>     if (events.front().type() == Event::HEARTBEAT) {
>       VLOG(1) << "Ignoring HEARTBEAT event";
>     } else {
>       queue->put(events.front());
>     }
>     events.pop();
>   }
> }
> {code}
> This helper code is currently duplicated in at least two places: the 
> Scheduler Library tests and the Maintenance Primitives tests. Possible 
> solutions:
> - The solution can be as trivial as moving this helper function to a common 
> test header.
> - Implement a {{DROP_HTTP_CALLS}} similar to what we do for other protobufs 
> via {{DROP_CALLS}}.
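
The first option, a shared helper, could be as small as the sketch below. It is independent of gmock and libprocess so that it stands alone: `Event` is a stand-in struct (the real type is a protobuf with a `type()` accessor), and `dropEvents` is a hypothetical name for the common helper.

```cpp
#include <queue>
#include <vector>

// Stand-in for the scheduler API event; only the type matters here.
struct Event
{
  enum Type { SUBSCRIBED, OFFERS, UPDATE, HEARTBEAT };
  Type type;
};

// Drains `events` and returns them in order, skipping every event whose
// type equals `dropped`.
std::vector<Event> dropEvents(std::queue<Event> events, Event::Type dropped)
{
  std::vector<Event> kept;
  while (!events.empty()) {
    if (events.front().type != dropped) {
      kept.push_back(events.front());
    }
    events.pop();
  }
  return kept;
}
```

Taking the dropped type as a parameter is what separates this from the duplicated code, which hard-codes `HEARTBEAT`.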





[jira] [Updated] (MESOS-3857) Draft Design Doc for first Step External Volume MVP

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3857:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Draft Design Doc for first Step External Volume MVP
> ---
>
> Key: MESOS-3857
> URL: https://issues.apache.org/jira/browse/MESOS-3857
> Project: Mesos
>  Issue Type: Task
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> As part of the overall design doc for global resources we would like to 
> introduce improvements to the Docker Volume Driver isolator module 
> (https://github.com/emccode/mesos-module-dvdi).
> Currently the isolator module is controlled by setting environment variables 
> as follows:
> {code}
> "env": {
>   "DVDI_VOLUME_NAME": "testing",
>   "DVDI_VOLUME_DRIVER": "platform1",
>   "DVDI_VOLUME_OPTS": "size=5,iops=150,volumetype=io1,newfstype=ext4,overwritefs=false",
>   "DVDI_VOLUME_NAME1": "testing2",
>   "DVDI_VOLUME_DRIVER1": "platform2",
>   "DVDI_VOLUME_OPTS1": "size=6,volumetype=gp2,newfstype=xfs,overwritefs=true"
> }
> {code}
> We should develop a more structured way of passing these settings 
> to the isolator module, in line with the overall goal of global 
> resources.
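
To illustrate how unstructured the current format is, here is a minimal sketch of what a consumer has to do today to read a single DVDI_VOLUME_OPTS value. The helper name parseVolumeOpts is hypothetical and is not part of the isolator module; the sketch only assumes the comma-separated key=value layout shown in the example above.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of mesos-module-dvdi): splits a
// comma-separated list of key=value pairs, such as the value of
// DVDI_VOLUME_OPTS, into a map.
std::map<std::string, std::string> parseVolumeOpts(const std::string& opts)
{
  std::map<std::string, std::string> result;
  std::istringstream stream(opts);
  std::string pair;

  while (std::getline(stream, pair, ',')) {
    const std::string::size_type eq = pair.find('=');
    if (eq == std::string::npos || eq == 0) {
      continue;  // Skip malformed entries that lack a key or an '='.
    }
    result[pair.substr(0, eq)] = pair.substr(eq + 1);
  }

  return result;
}
```

Every value comes back as an untyped string, and the numeric suffix on the variable names (DVDI_VOLUME_OPTS1) still has to be matched up by hand, which is the kind of ad hoc handling a structured format would avoid.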





[jira] [Updated] (MESOS-3064) Add 'principal' field to 'Resource.DiskInfo.Persistence'

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3064:
---
Sprint: Mesosphere Sprint 16, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 16, Mesosphere Sprint 22)

> Add 'principal' field to 'Resource.DiskInfo.Persistence'
> 
>
> Key: MESOS-3064
> URL: https://issues.apache.org/jira/browse/MESOS-3064
> Project: Mesos
>  Issue Type: Task
>Reporter: Michael Park
>Assignee: Greg Mann
>  Labels: mesosphere, persistent-volumes
>
> In order to support authorization for persistent volumes, we should add the 
> {{principal}} to {{Resource.DiskInfo}}, analogous to 
> {{Resource.ReservationInfo.principal}}.





[jira] [Updated] (MESOS-3859) Add github support to apply-reviews.py.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3859:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Add github support to apply-reviews.py.
> ---
>
> Key: MESOS-3859
> URL: https://issues.apache.org/jira/browse/MESOS-3859
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-3515) Support Subscribe Call for HTTP based Executors

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3515:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 19, 
Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22)

> Support Subscribe Call for HTTP based Executors
> ---
>
> Key: MESOS-3515
> URL: https://issues.apache.org/jira/browse/MESOS-3515
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> We need to add a {{subscribe(...)}} method in {{src/slave/slave.cpp}} to 
> introduce the ability for HTTP based executors to subscribe and then receive 
> events on the persistent HTTP connection. Most of the functionality needed 
> would be similar to {{Master::subscribe}} in {{src/master/master.cpp}}.





[jira] [Updated] (MESOS-3035) As a Developer I would like a standard way to run a Subprocess in libprocess

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3035:
---
Sprint: Mesosphere Sprint 14, Mesosphere Sprint 16, Mesosphere Sprint 19, 
Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 14, 
Mesosphere Sprint 16, Mesosphere Sprint 19, Mesosphere Sprint 22)

> As a Developer I would like a standard way to run a Subprocess in libprocess
> 
>
> Key: MESOS-3035
> URL: https://issues.apache.org/jira/browse/MESOS-3035
> Project: Mesos
>  Issue Type: Story
>  Components: libprocess
>Reporter: Marco Massenzio
>Assignee: Marco Massenzio
>  Labels: mesosphere, tech-debt
>
> As part of MESOS-2830 and MESOS-2902 I have been researching the ability to 
> run a {{Subprocess}} and capture the {{stdout / stderr}} along with the exit 
> status code.
> {{process::subprocess()}} offers much of the functionality, but in a way that 
> still requires a lot of handiwork on the developer's part; we would like to 
> further abstract away the ability to just pass a string, an optional set of 
> command-line arguments and then collect the output of the command (bonus: 
> without blocking).
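
As a rough sketch of the kind of interface being asked for (pass a command string, get back the output and the exit status), the following uses plain POSIX popen() rather than {{process::subprocess()}}. The names CommandResult and runCommand are hypothetical, and unlike the "bonus" goal above this version blocks until the command exits.

```cpp
#include <cassert>
#include <stdio.h>
#include <string>
#include <sys/wait.h>

// Hypothetical result type: not part of libprocess.
struct CommandResult
{
  int status;          // Exit status, or -1 if it could not be determined.
  std::string output;  // Combined stdout and stderr of the command.
};

// Runs `command` through the shell and collects its output.
// This call blocks until the command has exited.
CommandResult runCommand(const std::string& command)
{
  CommandResult result;
  result.status = -1;

  // Run in a subshell and redirect stderr into stdout so that both
  // streams are captured by the single pipe that popen() provides.
  FILE* pipe = popen(("(" + command + ") 2>&1").c_str(), "r");
  if (pipe == nullptr) {
    return result;
  }

  char buffer[4096];
  size_t n;
  while ((n = fread(buffer, 1, sizeof(buffer), pipe)) > 0) {
    result.output.append(buffer, n);
  }

  const int raw = pclose(pipe);
  if (raw != -1 && WIFEXITED(raw)) {
    result.status = WEXITSTATUS(raw);
  }

  return result;
}
```

A non-blocking variant would presumably return a future of the result instead, which is where the existing libprocess primitives would come in.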





[jira] [Updated] (MESOS-3861) Authenticate quota requests

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3861:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Authenticate quota requests
> ---
>
> Key: MESOS-3861
> URL: https://issues.apache.org/jira/browse/MESOS-3861
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: mesosphere, security
>
> Quota requests need to be authenticated.
> This ticket will authenticate quota requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.





[jira] [Updated] (MESOS-3718) Implement Quota support in allocator

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3718:
---
Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22, Mesosphere Sprint 23  
(was: Mesosphere Sprint 21, Mesosphere Sprint 22)

> Implement Quota support in allocator
> 
>
> Key: MESOS-3718
> URL: https://issues.apache.org/jira/browse/MESOS-3718
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> The built-in Hierarchical DRF allocator should support Quota. This includes 
> (but is not limited to): adding, updating, removing and satisfying quota; 
> avoiding both overcommitting resources and handing them to non-quota'ed roles 
> in the presence of master failover.
> A [design doc for Quota support in 
> Allocator|https://issues.apache.org/jira/browse/MESOS-2937] provides an 
> overview of the feature set that needs to be implemented.





[jira] [Updated] (MESOS-3911) Add a `--force` flag to disable sanity check in quota

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3911:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Add a `--force` flag to disable sanity check in quota
> -
>
> Key: MESOS-3911
> URL: https://issues.apache.org/jira/browse/MESOS-3911
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Joerg Schad
>  Labels: mesosphere
>
> There are use cases when an operator may want to disable the sanity check for 
> quota endpoints (MESOS-3074), even if this renders the cluster under quota. 
> For example, an operator sets quota before adding more agents in order to 
> make sure that no non-quota allocations from new agents are made. 





[jira] [Updated] (MESOS-3899) Wrong syntax and inconsistent formatting of JSON examples in flag documentation

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3899:
---
Sprint: Mesosphere Sprint 22, Mesosphere Sprint 23  (was: Mesosphere Sprint 
22)

> Wrong syntax and inconsistent formatting of JSON examples in flag 
> documentation
> ---
>
> Key: MESOS-3899
> URL: https://issues.apache.org/jira/browse/MESOS-3899
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>Priority: Minor
>  Labels: documentation, easyfix, mesosphere
>
> The JSON examples in the documentation of the command-line flags 
> ({{mesos-master.sh --help}} and {{mesos-slave.sh --help}}) are not 
> formatted consistently. Furthermore, some examples aren't even valid JSON 
> because they have trailing commas where they shouldn't.





[jira] [Updated] (MESOS-3035) As a Developer I would like a standard way to run a Subprocess in libprocess

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3035:
---
Sprint: Mesosphere Sprint 14, Mesosphere Sprint 16, Mesosphere Sprint 19, 
Mesosphere Sprint 22  (was: Mesosphere Sprint 14, Mesosphere Sprint 16, 
Mesosphere Sprint 19, Mesosphere Sprint 22, Mesosphere Sprint 23)

> As a Developer I would like a standard way to run a Subprocess in libprocess
> 
>
> Key: MESOS-3035
> URL: https://issues.apache.org/jira/browse/MESOS-3035
> Project: Mesos
>  Issue Type: Story
>  Components: libprocess
>Reporter: Marco Massenzio
>Assignee: Marco Massenzio
>  Labels: mesosphere, tech-debt
>
> As part of MESOS-2830 and MESOS-2902 I have been researching the ability to 
> run a {{Subprocess}} and capture the {{stdout / stderr}} along with the exit 
> status code.
> {{process::subprocess()}} offers much of the functionality, but in a way that 
> still requires a lot of handiwork on the developer's part; we would like to 
> further abstract away the ability to just pass a string, an optional set of 
> command-line arguments and then collect the output of the command (bonus: 
> without blocking).





[jira] [Updated] (MESOS-3024) HTTP endpoint authN is enabled merely by specifying --credentials

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3024:
---
Sprint: Mesosphere Sprint 21, Mesosphere Sprint 22  (was: Mesosphere Sprint 
21, Mesosphere Sprint 22, Mesosphere Sprint 23)

> HTTP endpoint authN is enabled merely by specifying --credentials
> -
>
> Key: MESOS-3024
> URL: https://issues.apache.org/jira/browse/MESOS-3024
> Project: Mesos
>  Issue Type: Bug
>  Components: master, security
>Reporter: Adam B
>Assignee: Marco Massenzio
>  Labels: authentication, http, mesosphere
>
> If I set `--credentials` on the master, framework and slave authentication 
> are allowed, but not required. On the other hand, HTTP authentication is now 
> required for authenticated endpoints (currently only `/shutdown`). That means 
> that I cannot enable framework or slave authentication without also enabling 
> HTTP endpoint authentication. This is undesirable.
> Framework and slave authentication have separate flags (`--authenticate` and 
> `--authenticate_slaves`) to require authentication for each. It would be 
> great if there were also such a flag for HTTP endpoint authentication. Or 
> maybe we get rid of these flags altogether and rely on ACLs to determine which 
> unauthenticated principals are even allowed to authenticate for each 
> endpoint/action.





[jira] [Updated] (MESOS-3581) License headers show up all over doxygen documentation.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3581:
---
Story Points: 2

> License headers show up all over doxygen documentation.
> ---
>
> Key: MESOS-3581
> URL: https://issues.apache.org/jira/browse/MESOS-3581
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.24.1
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Minor
>  Labels: mesosphere
>
> Currently license headers are commented in something resembling Javadoc style,
> {code}
> /**
> * Licensed ...
> {code}
> Since we use Javadoc-style comment blocks for doxygen documentation, all 
> license headers appear in the generated documentation, likely hiding the 
> actual documentation.
> Using {{/*}} to start the comment blocks would be enough to hide them from 
> doxygen, but would likely also result in a largish (though mostly 
> uninteresting) patch.
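
A small sketch of the proposed layout: the license header opens with a plain /* so doxygen skips it, while the Javadoc-style /** block that follows is still picked up as documentation. The function add is a hypothetical placeholder, and the license text is elided here just as it is in the report.

```cpp
#include <cassert>

/*
 * Licensed ...
 */

/**
 * Returns the sum of two integers.
 *
 * This block opens with two asterisks, so doxygen treats it as the
 * documentation for `add`. The license block above opens with a single
 * asterisk and is treated as an ordinary comment.
 */
int add(int a, int b)
{
  return a + b;
}
```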





[jira] [Updated] (MESOS-3065) Add framework authorization for persistent volume

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3065:
---
Sprint: Mesosphere Sprint 16, Mesosphere Sprint 22  (was: Mesosphere Sprint 
16, Mesosphere Sprint 22, Mesosphere Sprint 23)

> Add framework authorization for persistent volume
> -
>
> Key: MESOS-3065
> URL: https://issues.apache.org/jira/browse/MESOS-3065
> Project: Mesos
>  Issue Type: Task
>Reporter: Michael Park
>Assignee: Greg Mann
>  Labels: mesosphere, persistent-volumes
>
> Persistent volumes should be authorized with the {{principal}} of the 
> reserving entity (framework or master). The idea is to introduce {{Create}} 
> and {{Destroy}} into the ACL.
> {code}
>   message Create {
>     // Subjects.
>     required Entity principals = 1;
>     // Objects? Perhaps the kind of volume? allowed permissions?
>   }
>   message Destroy {
>     // Subjects.
>     required Entity principals = 1;
>     // Objects.
>     required Entity creator_principals = 2;
>   }
> {code}
> When a framework creates a persistent volume, "create" ACLs are checked to 
> see if the framework (FrameworkInfo.principal) or the operator 
> (Credential.user) is authorized to create persistent volumes. If not 
> authorized, the create operation is rejected.
> When a framework destroys a persistent volume, "destroy" ACLs are checked to 
> see if the framework (FrameworkInfo.principal) or the operator 
> (Credential.user) is authorized to destroy the persistent volume created by a 
> framework or operator (Resource.DiskInfo.principal). If not authorized, the 
> destroy operation is rejected.
> A separate ticket will use the structures created here to enable 
> authorization of the "/create" and "/destroy" HTTP endpoints: 
> https://issues.apache.org/jira/browse/MESOS-3903
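
The destroy check described above could look roughly like the sketch below. The structs are hypothetical in-memory stand-ins for the proposed protobuf messages, authorizedToDestroy is an illustrative name, and rejecting when no ACL matches is an assumption; the actual authorizer semantics may differ.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the ACL `Entity` message.
struct Entity
{
  bool any = false;              // Matches every principal when true.
  std::set<std::string> values;  // Explicitly listed principals.

  bool matches(const std::string& principal) const
  {
    return any || values.count(principal) > 0;
  }
};

// Hypothetical stand-in for the proposed `Destroy` message.
struct DestroyACL
{
  Entity principals;          // Subjects: who may destroy volumes.
  Entity creator_principals;  // Objects: whose volumes may be destroyed.
};

// Returns true if some ACL allows `principal` to destroy a volume that
// was created by `creatorPrincipal` (Resource.DiskInfo.principal).
// Assumption: the operation is rejected when no ACL matches.
bool authorizedToDestroy(
    const std::vector<DestroyACL>& acls,
    const std::string& principal,
    const std::string& creatorPrincipal)
{
  for (const DestroyACL& acl : acls) {
    if (acl.principals.matches(principal) &&
        acl.creator_principals.matches(creatorPrincipal)) {
      return true;
    }
  }

  return false;
}
```

The create check would be analogous but would only need to match the subject, since the object side of {{Create}} is still an open question in the message definition.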





[jira] [Updated] (MESOS-3339) Implement filtering mechanism for (Scheduler API Events) Testing

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3339:
---
Sprint: Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22  
(was: Mesosphere Sprint 20, Mesosphere Sprint 21, Mesosphere Sprint 22, 
Mesosphere Sprint 23)

> Implement filtering mechanism for (Scheduler API Events) Testing
> 
>
> Key: MESOS-3339
> URL: https://issues.apache.org/jira/browse/MESOS-3339
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, our testing infrastructure does not have a mechanism for 
> filtering/dropping HTTP events of a particular type from the Scheduler API 
> response stream.  We need a {{DROP_HTTP_CALLS}} abstraction that can help us 
> filter out a particular event type.
> {code}
> // Enqueues all received events into a libprocess queue.
> ACTION_P(Enqueue, queue)
> {
>   std::queue<Event> events = arg0;
>   while (!events.empty()) {
>     // Note that we currently drop HEARTBEATs because most of these tests
>     // are not designed to deal with heartbeats.
>     // TODO(vinod): Implement DROP_HTTP_CALLS that can filter heartbeats.
>     if (events.front().type() == Event::HEARTBEAT) {
>       VLOG(1) << "Ignoring HEARTBEAT event";
>     } else {
>       queue->put(events.front());
>     }
>     events.pop();
>   }
> }
> {code}
> This helper code is currently duplicated in at least two places: the 
> Scheduler Library tests and the Maintenance Primitives tests. 
> - The solution can be as trivial as moving this helper function to a common 
> test-header.
> - Implement a {{DROP_HTTP_CALLS}} similar to what we do for other protobufs 
> via {{DROP_CALLS}}.
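
For the first option, the shared helper could be generalized so that the dropped event type is a parameter instead of being hard-coded to HEARTBEAT. The sketch below is self-contained and therefore uses a hypothetical stand-in for the scheduler API Event and a plain std::queue in place of the libprocess queue; dropEvents is an illustrative name only.

```cpp
#include <cassert>
#include <queue>

// Hypothetical stand-in for the scheduler API `Event` protobuf.
struct Event
{
  enum Type { SUBSCRIBED, OFFERS, HEARTBEAT };
  Type type;
};

// Returns the events of `events` in their original order, minus all
// events whose type equals `dropped`.
std::queue<Event> dropEvents(std::queue<Event> events, Event::Type dropped)
{
  std::queue<Event> kept;

  while (!events.empty()) {
    if (events.front().type != dropped) {
      kept.push(events.front());
    }
    events.pop();
  }

  return kept;
}
```

In the real tests the surviving events would be forwarded with queue->put(...) as in the Enqueue action above, rather than returned.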





[jira] [Updated] (MESOS-3859) Add github support to apply-reviews.py.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3859:
---
Sprint: Mesosphere Sprint 22  (was: Mesosphere Sprint 22, Mesosphere Sprint 
23)

> Add github support to apply-reviews.py.
> ---
>
> Key: MESOS-3859
> URL: https://issues.apache.org/jira/browse/MESOS-3859
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-3868) Make apply-review.sh use apply-reviews.py

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3868:
---
Sprint: Mesosphere Sprint 22  (was: Mesosphere Sprint 22, Mesosphere Sprint 
23)

> Make apply-review.sh use apply-reviews.py
> -
>
> Key: MESOS-3868
> URL: https://issues.apache.org/jira/browse/MESOS-3868
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-3496) Create interface for digest verifier

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3496:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21  
(was: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 23)

> Create interface for digest verifier
> 
>
> Key: MESOS-3496
> URL: https://issues.apache.org/jira/browse/MESOS-3496
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Add an interface for the digest verifier so that we can add implementations 
> for digest types such as sha256 and sha512.
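
One possible shape for such an interface is sketched below. All names are hypothetical, and the concrete class uses 32-bit FNV-1a purely as a short, non-cryptographic stand-in so that the example stays self-contained; a real implementation would plug in sha256 or sha512 behind the same interface.

```cpp
#include <cassert>
#include <cstdint>
#include <stdio.h>
#include <string>

// Hypothetical interface: one subclass per digest type.
class DigestVerifier
{
public:
  virtual ~DigestVerifier() {}

  // Name of the digest type, e.g. "sha256".
  virtual std::string type() const = 0;

  // Computes the hex-encoded digest of `content`.
  virtual std::string digest(const std::string& content) const = 0;

  // Returns true if `content` hashes to `expected`.
  bool verify(const std::string& content, const std::string& expected) const
  {
    return digest(content) == expected;
  }
};

// Toy implementation using 32-bit FNV-1a. This is NOT a cryptographic
// digest; it only demonstrates how an implementation plugs in.
class Fnv1aVerifier : public DigestVerifier
{
public:
  std::string type() const override { return "fnv1a32"; }

  std::string digest(const std::string& content) const override
  {
    uint32_t hash = 2166136261u;  // FNV-1a 32-bit offset basis.
    for (unsigned char c : content) {
      hash ^= c;
      hash *= 16777619u;  // FNV 32-bit prime.
    }

    char hex[9];
    snprintf(hex, sizeof(hex), "%08x", static_cast<unsigned int>(hash));
    return hex;
  }
};
```

Keeping verify() in the base class means each new digest type only has to supply type() and digest().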





[jira] [Updated] (MESOS-3497) Add implementation for sha256 based file content verification.

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3497:
---
Sprint: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21  
(was: Mesosphere Sprint 19, Mesosphere Sprint 20, Mesosphere Sprint 21, 
Mesosphere Sprint 23)

> Add implementation for sha256 based file content verification.
> --
>
> Key: MESOS-3497
> URL: https://issues.apache.org/jira/browse/MESOS-3497
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> https://reviews.apache.org/r/38747/





[jira] [Updated] (MESOS-3934) Libprocess: Unify the initialization of the MetricsProcess and ReaperProcess

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3934:
---
Sprint:   (was: Mesosphere Sprint 23)

> Libprocess: Unify the initialization of the MetricsProcess and ReaperProcess
> 
>
> Key: MESOS-3934
> URL: https://issues.apache.org/jira/browse/MESOS-3934
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess, test
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> Related to this 
> [TODO|https://github.com/apache/mesos/blob/aa0cd7ed4edf1184cbc592b5caa2429a8373e813/3rdparty/libprocess/src/process.cpp#L949-L950].
> The {{MetricsProcess}} and {{ReaperProcess}} are global processes 
> (singletons) which are initialized upon first use.  The two processes could 
> be initialized alongside the {{gc}}, {{help}}, {{logging}}, {{profiler}}, and 
> {{system}} (statistics) processes inside {{process::initialize}}.
> This is also necessary for libprocess re-initialization.





[jira] [Updated] (MESOS-3910) Libprocess: Implement cleanup of the SocketManager in process::finalize

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3910:
---
Sprint:   (was: Mesosphere Sprint 23)

> Libprocess: Implement cleanup of the SocketManager in process::finalize
> ---
>
> Key: MESOS-3910
> URL: https://issues.apache.org/jira/browse/MESOS-3910
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess, test
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> The {{socket_manager}} and {{process_manager}} are intricately tied together. 
>  Currently, only the {{process_manager}} is cleaned up by 
> {{process::finalize}}.
> To clean up the {{socket_manager}}, we must close all sockets and deallocate 
> any existing {{HttpProxy}} or {{Encoder}} objects.  And we should prevent 
> further objects from being created/tracked by the {{socket_manager}}.
> *Proposal*
> # Clean up all processes other than {{gc}}.  This will clear all links and 
> delete all {{HttpProxy}} s while {{socket_manager}} still exists.
> # Close all sockets via {{SocketManager::close}}.  All of {{socket_manager}} 
> 's state is cleaned up via {{SocketManager::close}}, including termination of 
> {{HttpProxy}} (termination is idempotent, meaning that killing {{HttpProxy}} 
> s via {{process_manager}} is safe).
> # At this point, {{socket_manager}} should be empty and only the {{gc}} 
> process should be running.  (Since we're finalizing, assume there are no 
> threads trying to spawn processes.)  {{socket_manager}} can be deleted.
> # {{gc}} can be deleted.  This is currently a leaked pointer, so we'll also 
> need to track and delete that.
> # {{process_manager}} should be devoid of processes, so we can proceed with 
> cleanup (join threads, stop the {{EventLoop}}, etc).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3740) LIBPROCESS_IP not passed to Docker containers

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3740:
---
Sprint: Mesosphere Sprint 21  (was: Mesosphere Sprint 21, Mesosphere Sprint 
23)

> LIBPROCESS_IP not passed to Docker containers
> -
>
> Key: MESOS-3740
> URL: https://issues.apache.org/jira/browse/MESOS-3740
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.25.0
> Environment: Mesos 0.24.1
>Reporter: Cody Maloney
>  Labels: mesosphere
>
> Docker containers aren't currently passed all the same environment variables 
> that Mesos Containerizer tasks are. See: 
> https://github.com/apache/mesos/blob/master/src/slave/containerizer/containerizer.cpp#L254
>  for all the environment variables explicitly set for mesos containers.
> While some of them don't necessarily make sense for Docker containers, when 
> the Docker container runs a libprocess process (e.g. a Mesos framework 
> scheduler) and uses {{--net=host}}, the task needs to have LIBPROCESS_IP 
> set. Otherwise the same sort of problems that happen because of MESOS-3553 
> can occur (libprocess will try to guess the machine's IP address, with likely 
> bad results in a number of operating environments).





[jira] [Updated] (MESOS-3820) Refactor libprocess initialization to allow for test-only reinitialization of the server socket

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3820:
---
Sprint:   (was: Mesosphere Sprint 23)

> Refactor libprocess initialization to allow for test-only reinitialization of 
> the server socket
> ---
>
> Key: MESOS-3820
> URL: https://issues.apache.org/jira/browse/MESOS-3820
> Project: Mesos
>  Issue Type: Story
>  Components: libprocess, test
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: mesosphere
>
> *Background*
> Libprocess initialization includes the spawning of a variety of global 
> processes and the creation of the server socket which listens for incoming 
> requests.  Some properties of the server socket are configured via 
> environment variables, such as the IP and port or the SSL configuration.
> In the case of tests, libprocess is initialized once per test binary.  This 
> means that testing different configurations (SSL in particular) is cumbersome 
> as a separate process would be needed for every test case.
> *Proposal* (Still under investigation)
> # Investigate using {{process::finalize}} to completely clean up libprocess.  
> See [MESOS-3863].
> # Add a test-only {{process::reinitialize}} function, which should be roughly 
> equivalent to a first-time run of {{process::initialize}}.
> -*Proposal to swap out server socket*- (Does not work)
> # Follow the [example of the SSL 
> library|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/openssl.cpp#L280]
>  and allow tests to declare an internal function for re-initializing a 
> portion of libprocess.
> # Move the [existing creation of the server 
> socket|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L852-L856]
>  into a {{reinitialize_server_socket}} function.
> # Add any necessary cleanup for swapping server sockets.
> # Consider whether any additional locking is required in the 
> {{reinitialize_server_socket}} function.





[jira] [Updated] (MESOS-3793) Cannot start mesos local on a Debian GNU/Linux 8 docker machine

2015-11-23 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3793:
---
Story Points: 3

> Cannot start mesos local on a Debian GNU/Linux 8 docker machine
> ---
>
> Key: MESOS-3793
> URL: https://issues.apache.org/jira/browse/MESOS-3793
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.25.0
> Environment: Debian GNU/Linux 8 docker machine
>Reporter: Matthias Veit
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> We updated the Mesos version to 0.25.0 in our Marathon Docker image, which 
> runs our integration tests.
> We use mesos local for those tests. This fails with the following message:
> {noformat}
> root@a06e4b4eb776:/marathon# mesos local
> I1022 18:42:26.852485   136 leveldb.cpp:176] Opened db in 6.103258ms
> I1022 18:42:26.853302   136 leveldb.cpp:183] Compacted db in 765740ns
> I1022 18:42:26.853343   136 leveldb.cpp:198] Created db iterator in 9001ns
> I1022 18:42:26.853355   136 leveldb.cpp:204] Seeked to beginning of db in 
> 1287ns
> I1022 18:42:26.853366   136 leveldb.cpp:273] Iterated through 0 keys in the 
> db in ns
> I1022 18:42:26.853406   136 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I1022 18:42:26.853775   141 recover.cpp:449] Starting replica recovery
> I1022 18:42:26.853862   141 recover.cpp:475] Replica is in EMPTY status
> I1022 18:42:26.854751   138 replica.cpp:641] Replica in EMPTY status received 
> a broadcasted recover request
> I1022 18:42:26.854856   140 recover.cpp:195] Received a recover response from 
> a replica in EMPTY status
> I1022 18:42:26.855002   140 recover.cpp:566] Updating replica status to 
> STARTING
> I1022 18:42:26.855655   138 master.cpp:376] Master 
> a3f39818-1bda-4710-b96b-2a60ed4d12b8 (a06e4b4eb776) started on 
> 172.17.0.14:5050
> I1022 18:42:26.855680   138 master.cpp:378] Flags at startup: 
> --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="false" 
> --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" 
> --registry_strict="false" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" 
> --work_dir="/tmp/mesos/local/AK0XpG" --zk_session_timeout="10secs"
> I1022 18:42:26.855790   138 master.cpp:425] Master allowing unauthenticated 
> frameworks to register
> I1022 18:42:26.855803   138 master.cpp:430] Master allowing unauthenticated 
> slaves to register
> I1022 18:42:26.855815   138 master.cpp:467] Using default 'crammd5' 
> authenticator
> W1022 18:42:26.855829   138 authenticator.cpp:505] No credentials provided, 
> authentication requests will be refused
> I1022 18:42:26.855840   138 authenticator.cpp:512] Initializing server SASL
> I1022 18:42:26.856442   136 containerizer.cpp:143] Using isolation: 
> posix/cpu,posix/mem,filesystem/posix
> I1022 18:42:26.856943   140 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 1.888185ms
> I1022 18:42:26.856987   140 replica.cpp:323] Persisted replica status to 
> STARTING
> I1022 18:42:26.857115   140 recover.cpp:475] Replica is in STARTING status
> I1022 18:42:26.857270   140 replica.cpp:641] Replica in STARTING status 
> received a broadcasted recover request
> I1022 18:42:26.857312   140 recover.cpp:195] Received a recover response from 
> a replica in STARTING status
> I1022 18:42:26.857368   140 recover.cpp:566] Updating replica status to VOTING
> I1022 18:42:26.857781   140 leveldb.cpp:306] Persisting metadata (8 bytes) to 
> leveldb took 371121ns
> I1022 18:42:26.857841   140 replica.cpp:323] Persisted replica status to 
> VOTING
> I1022 18:42:26.857895   140 recover.cpp:580] Successfully joined the Paxos 
> group
> I1022 18:42:26.857928   140 recover.cpp:464] Recover process terminated
> I1022 18:42:26.862455   137 master.cpp:1603] The newly elected leader is 
> master@172.17.0.14:5050 with id a3f39818-1bda-4710-b96b-2a60ed4d12b8
> I1022 18:42:26.862498   137 master.cpp:1616] Elected as the leading master!
> I1022 18:42:26.862511   137 master.cpp:1376] Recovering from registrar
> I1022 18:42:26.862560   137 registrar.cpp:309] Recovering registrar
> Failed to create a containerizer: Could not create MesosContainerizer: Failed 
> to create launcher: Failed to create Linux launcher: Failed to mount cgroups 
> 

[jira] [Commented] (MESOS-3948) Authorize /roles request

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018215#comment-15018215
 ] 

Marco Massenzio commented on MESOS-3948:


Same comment as in MESOS-3947, this should be part of a broader effort around 
implementing the {{HttpAuthorizer}} interface.

See MESOS-2945

> Authorize /roles request
> 
>
> Key: MESOS-3948
> URL: https://issues.apache.org/jira/browse/MESOS-3948
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> When /roles are requested it should authorize the updated role.
> This ticket will authorize /roles requests with ACLs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-3953) DockerTest.ROOT_DOCKER_CheckPortResource fails.

2015-11-20 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-3953:
--

Assignee: Timothy Chen

Release 0.26 blocker.

As for the others, can you please coordinate with [~jojy] and [~greggomann] and 
verify whether this is a regression and/or the root cause?

thanks!

> DockerTest.ROOT_DOCKER_CheckPortResource fails.
> ---
>
> Key: MESOS-3953
> URL: https://issues.apache.org/jira/browse/MESOS-3953
> Project: Mesos
>  Issue Type: Bug
> Environment: CentOS Linux release 7.1.1503 (Core),
> gcc (GCC) 4.8.3,
> Docker version 1.9.0, build 76d6bc9
>Reporter: Till Toenshoff
>Assignee: Timothy Chen
>
> The following is happening on my CentOS 7 installation (100% reproducible).
> {noformat}
> [ RUN  ] DockerTest.ROOT_DOCKER_CheckPortResource
> I1118 08:18:50.336110 20979 docker.cpp:684] Running docker -H 
> unix:///var/run/docker.sock rm -f -v mesos-docker-port-resource-test
> I1118 08:18:50.413763 20979 resources.cpp:474] Parsing resources as JSON 
> failed: ports:[9998-];ports:[10001-11000]
> Trying semicolon-delimited string format instead
> I1118 08:18:50.414670 20979 resources.cpp:474] Parsing resources as JSON 
> failed: ports:[9998-];ports:[1-11000]
> Trying semicolon-delimited string format instead
> I1118 08:18:50.415073 20979 docker.cpp:564] Running docker -H 
> unix:///var/run/docker.sock run -e MESOS_SANDBOX=/mnt/mesos/sandbox -e 
> MESOS_CONTAINER_NAME=mesos-docker-port-resource-test -v 
> /tmp/DockerTest_ROOT_DOCKER_CheckPortResource_4e34OB:/mnt/mesos/sandbox --net 
> bridge -p 1:80 --name mesos-docker-port-resource-test busybox true
> ../../src/tests/containerizer/docker_tests.cpp:338: Failure
> (run).failure(): Container exited on error: exited with status 1
> I1118 08:18:50.717136 20979 docker.cpp:842] Running docker -H 
> unix:///var/run/docker.sock ps -a
> I1118 08:18:50.819042 20999 docker.cpp:723] Running docker -H 
> unix:///var/run/docker.sock inspect mesos-docker-port-resource-test
> I1118 08:18:50.924579 20979 docker.cpp:684] Running docker -H 
> unix:///var/run/docker.sock rm -f -v 
> 67781b79c7641a6450c3ddb4ba13112b6f5a50060eac3f65cac3ad57a2a527ea
> [  FAILED  ] DockerTest.ROOT_DOCKER_CheckPortResource
> {noformat}





[jira] [Comment Edited] (MESOS-3935) mesos-master crashes when a scheduler with an unresolvable hostname attempts to connect

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15009153#comment-15009153
 ] 

Marco Massenzio edited comment on MESOS-3935 at 11/20/15 4:44 PM:
--

This looks like a name resolution issue for the *master's* hostname and not the 
scheduler's hostname. If looking up the hostname via IP doesn't work in your 
network, you can use the {{--hostname}} or {{--hostname_lookup}} flags.


was (Author: vinodkone):
This looks like a name resolution issue for the *master's* hostname and not the 
scheduler's hostname. If looking up the hostname via ip doesn't work in your 
network, you can use "--hostname" or "--hostname_lookup" flags.

> mesos-master crashes when a scheduler with an unresolvable hostname attempts 
> to connect
> ---
>
> Key: MESOS-3935
> URL: https://issues.apache.org/jira/browse/MESOS-3935
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.25.0
>Reporter: Hamza Faran
>
> {code}
> $ sudo mesos-master --ip=10.8.0.5 --work_dir=work_dir --authenticate 
> --authenticate_slaves --credentials=credentials --port=5045   
>
> I1117 07:05:15.371150  5852 main.cpp:229] Build: 2015-10-12 21:00:09 by root  
>   
>   
> I1117 07:05:15.371314  5852 main.cpp:231] Version: 0.25.0 
>   
>   
> I1117 07:05:15.371340  5852 main.cpp:234] Git tag: 0.25.0 
>   
>   
> I1117 07:05:15.371366  5852 main.cpp:238] Git SHA: 
> 2dd7f7ee115fe00b8e098b0a10762a4fa8f4600f  
>   
>
> I1117 07:05:15.371439  5852 main.cpp:252] Using 'HierarchicalDRF' allocator   
>   
>   
> I1117 07:05:15.373845  5852 leveldb.cpp:176] Opened db in 2.267831ms  
>   
>   
> I1117 07:05:15.374606  5852 leveldb.cpp:183] Compacted db in 678911ns 
>   
>   
> I1117 07:05:15.374668  5852 leveldb.cpp:198] Created db iterator in 19310ns   
>   
>   
> I1117 07:05:15.374775  5852 leveldb.cpp:204] Seeked to beginning of db in 
> 79269ns   
>   
> I1117 07:05:15.374882  5852 leveldb.cpp:273] Iterated through 3 keys in the 
> db in 79949ns 
> 
> I1117 07:05:15.374953  5852 replica.cpp:744] Replica recovered with log 
> positions 91 -> 92 with 0 holes and 0 unlearned   
> 
> I1117 07:05:15.375820  5852 main.cpp:465] Starting Mesos master   
>   
>   
> I1117 07:05:15.376049  5856 recover.cpp:449] Starting replica recovery
>   
>   
> I1117 07:05:15.376188  5858 recover.cpp:475] Replica is in VOTING status  
>   
>   
> I1117 07:05:15.376332  5858 recover.cpp:464] Recover process terminated   
>   
>   
> F1117 07:05:43.398336  5852 master.cpp:330] Failed to get hostname: Temporary 
> failure in name resolution
>   
> *** Check failure stack trace: ***   

[jira] [Commented] (MESOS-3964) LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018223#comment-15018223
 ] 

Marco Massenzio commented on MESOS-3964:


As with the other ones, it would be great to know whether these are {{0.26}} 
regressions or have been failing for a while (or even whether they ever passed).

Thanks, this is a release blocker.

> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs and 
> LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota fail on Debian 8.
> ---
>
> Key: MESOS-3964
> URL: https://issues.apache.org/jira/browse/MESOS-3964
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, test
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile: see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Greg Mann
>  Labels: mesosphere
>
> sudo ./bin/mesos-test.sh 
> --gtest_filter="LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs"
> {noformat}
> ...
> F1119 14:34:52.514742 30706 isolator_tests.cpp:455] CHECK_SOME(isolator): 
> Failed to find 'cpu.cfs_quota_us'. Your kernel might be too old to use the 
> CFS cgroups feature.
> {noformat}





[jira] [Commented] (MESOS-3959) Executor page of mesos ui does not show slave hostname

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018317#comment-15018317
 ] 

Marco Massenzio commented on MESOS-3959:


It seems you have made your review request "private".
Also, did you add the {{mesos}} group to the reviewers?

As with other tickets, please make sure you have a shepherd for your changes.

> Executor page of mesos ui does not show slave hostname
> --
>
> Key: MESOS-3959
> URL: https://issues.apache.org/jira/browse/MESOS-3959
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Ian Babrou
>
> This is not really convenient.





[jira] [Commented] (MESOS-3947) Authenticate /roles request

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018212#comment-15018212
 ] 

Marco Massenzio commented on MESOS-3947:


This obviously makes a lot of sense (although I'm not sure why the {{GET}} 
should not be authenticated, so it'd be great if you could elaborate on that 
point).

However, I believe, this should be part of a broader activity involving *all* 
endpoints, within the scope of the {{HttpAuthorizer}} effort.

Please see MESOS-2297 for more details.

> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles requests except GET method need to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.





[jira] [Assigned] (MESOS-3949) User CGroup Isolation tests fail on Centos 6.

2015-11-20 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-3949:
--

Assignee: Timothy Chen  (was: Joris Van Remoortere)

Can you please coordinate with [~jojy] and [~greggomann] and figure out whether 
these tests ever passed, and/or what the issue may be here?

I have a suspicion these tests keep on failing every release, but can't be sure.

Thanks!

> User CGroup Isolation tests fail on Centos 6.
> -
>
> Key: MESOS-3949
> URL: https://issues.apache.org/jira/browse/MESOS-3949
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation
>Affects Versions: 0.26.0
> Environment: CentOS 6.6, gcc 4.8.1, on vagrant libvirt, 16GB, 8 CPUs,
> ../configure --enable-libevent --enable-ssl
>Reporter: Bernd Mathiske
>Assignee: Timothy Chen
>  Labels: mesosphere
>
> UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup and 
> UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup fail on CentOS 6.6 with 
> similar output when libevent and SSL are enabled.
> {noformat}
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from UserCgroupIsolatorTest/0, where TypeParam = 
> mesos::internal::slave::CgroupsMemIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
> I1118 16:53:35.273717 30249 mem.cpp:605] Started listening for OOM events for 
> container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.274538 30249 mem.cpp:725] Started listening on low memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.275164 30249 mem.cpp:725] Started listening on medium memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.275784 30249 mem.cpp:725] Started listening on critical memory 
> pressure events for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.276448 30249 mem.cpp:356] Updated 'memory.soft_limit_in_bytes' 
> to 1GB for container 867a829e-4a26-43f5-86e0-938bf1f47688
> I1118 16:53:35.277331 30249 mem.cpp:391] Updated 'memory.limit_in_bytes' to 
> 1GB for container 867a829e-4a26-43f5-86e0-938bf1f47688
> -bash: 
> /sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/cgroup.procs:
>  No such file or directory
> mkdir: cannot create directory 
> `/sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/user': No 
> such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1307: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'mkdir " + 
> path::join(flags.cgroups_hierarchy, userCgroup) + "'")
>   Actual: 256
> Expected: 0
> -bash: 
> /sys/fs/cgroup/memory/mesos/867a829e-4a26-43f5-86e0-938bf1f47688/user/cgroup.procs:
>  No such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1316: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'echo $$ >" + 
> path::join(flags.cgroups_hierarchy, userCgroup, "cgroup.procs") + "'")
>   Actual: 256
> Expected: 0
> [  FAILED  ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where 
> TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess (149 ms)
> {noformat}
> {noformat}
> sudo ./bin/mesos-tests.sh 
> --gtest_filter="UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup" --verbose
> {noformat}
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from UserCgroupIsolatorTest/1, where TypeParam = 
> mesos::internal::slave::CgroupsCpushareIsolatorProcess
> userdel: user 'mesos.test.unprivileged.user' does not exist
> [ RUN  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> I1118 17:01:00.550706 30357 cpushare.cpp:392] Updated 'cpu.shares' to 1024 
> (cpus 1) for container e57f4343-1a97-4b44-b347-803be47ace80
> -bash: 
> /sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/cgroup.procs:
>  No such file or directory
> mkdir: cannot create directory 
> `/sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/user': No 
> such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1307: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'mkdir " + 
> path::join(flags.cgroups_hierarchy, userCgroup) + "'")
>   Actual: 256
> Expected: 0
> -bash: 
> /sys/fs/cgroup/cpuacct/mesos/e57f4343-1a97-4b44-b347-803be47ace80/user/cgroup.procs:
>  No such file or directory
> ../../src/tests/containerizer/isolator_tests.cpp:1316: Failure
> Value of: os::system( "su - " + UNPRIVILEGED_USERNAME + " -c 'echo $$ >" + 
> 

[jira] [Updated] (MESOS-3971) Running task counters in mesos UI is racy.

2015-11-20 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3971:
---
Summary: Running task counters in mesos UI is racy.  (was: Runing task 
counters in mesos UI are racy)

> Running task counters in mesos UI is racy.
> --
>
> Key: MESOS-3971
> URL: https://issues.apache.org/jira/browse/MESOS-3971
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 0.25.0
>Reporter: Ian Babrou
>
> On slave's page task counters in the left panel are racy.
> The reason for that is: in src/webui/master/static/js/controllers.js 
> properties like "staged_tasks" are populated from both master and slave 
> states.





[jira] [Assigned] (MESOS-3969) Failing 'make distcheck' on Debian 8, somehow SSL-related.

2015-11-20 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-3969:
--

Assignee: Joris Van Remoortere

This is a release blocker (presumably, assuming that this ever worked).

Can you please find out (a) what the root cause is and (b) whether this is a 
regression (or maybe just a configuration issue on the test rig)?

Thanks!

> Failing 'make distcheck' on Debian 8, somehow SSL-related.
> --
>
> Key: MESOS-3969
> URL: https://issues.apache.org/jira/browse/MESOS-3969
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.26.0
> Environment: Debian 8, gcc 4.9.2, Docker 1.9.0, vagrant, libvirt
> Vagrantfile see MESOS-3957
>Reporter: Bernd Mathiske
>Assignee: Joris Van Remoortere
>  Labels: build, build-failure, mesosphere
>
> As non-root: make distcheck.
> {noformat}
> /bin/mkdir -p '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin'
> /bin/bash ../libtool --mode=install /usr/bin/install -c mesos-local mesos-log 
> mesos mesos-execute mesos-resolve 
> '/home/vagrant/mesos/build/mesos-0.26.0/_inst/bin'
> libtool: install: /usr/bin/install -c .libs/mesos-local 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-local
> libtool: install: /usr/bin/install -c .libs/mesos-log 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-log
> libtool: install: /usr/bin/install -c .libs/mesos 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos
> libtool: install: /usr/bin/install -c .libs/mesos-execute 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-execute
> libtool: install: /usr/bin/install -c .libs/mesos-resolve 
> /home/vagrant/mesos/build/mesos-0.26.0/_inst/bin/mesos-resolve
> Traceback (most recent call last):
> File "", line 1, in 
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/build/3rdparty/pip-1.5.6/pip/__init_.py",
>  line 11, in 
> from pip.vcs import git, mercurial, subversion, bazaar # noqa
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/vcs/mercurial.py",
>  line 9, in 
> from pip.download import path_to_url
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/download.py",
>  line 22, in 
> from pip._vendor import requests, six
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/build/3rdparty/pip-1.5.6/pip/_vendor/requests/__init_.py",
>  line 53, in 
> from .packages.urllib3.contrib import pyopenssl
> File 
> "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rdparty/pip-1.5.6/pip/_vendor/requests/packages/urllib3/contrib/pyopenssl.py",
>  line 70, in 
> ssl.PROTOCOL_SSLv3: OpenSSL.SSL.SSLv3_METHOD,
> AttributeError: 'module' object has no attribute 'PROTOCOL_SSLv3'
> Traceback (most recent call last):
> File "", line 1, in 
> File "/home/vagrant/mesos/build/mesos-0.26.0/_build/3rd
> {noformat}





[jira] [Commented] (MESOS-3936) Document possible task state transitions for framework authors

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018308#comment-15018308
 ] 

Marco Massenzio commented on MESOS-3936:


Great suggestion!
Would love to do this together, as I really want to understand this in much 
greater detail.

> Document possible task state transitions for framework authors
> --
>
> Key: MESOS-3936
> URL: https://issues.apache.org/jira/browse/MESOS-3936
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Neil Conway
>  Labels: documentation,, mesosphere
>
> We should document the possible ways in which the state of a task can evolve 
> over time; what happens when an agent is partitioned from the master; and 
> more generally, how we recommend that framework authors develop 
> fault-tolerant schedulers and do task state reconciliation.





[jira] [Assigned] (MESOS-3952) Prevent use of outdated test-executor Docker image.

2015-11-20 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-3952:
--

Assignee: Timothy Chen

[~tnachen]:

Assigning to you so you can re-assign accordingly, thanks!

> Prevent use of outdated test-executor Docker image.
> ---
>
> Key: MESOS-3952
> URL: https://issues.apache.org/jira/browse/MESOS-3952
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Till Toenshoff
>Assignee: Timothy Chen
>
> The test-executor docker image (tnachen/test-executor) contains mesos proto 
> files / generated go code (mesos-go). We need to make sure that any update on 
> those protos within mesos does force a Docker pull on this image instead of 
> relying on an outdated, cached version.





[jira] [Comment Edited] (MESOS-3947) Authenticate /roles request

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018212#comment-15018212
 ] 

Marco Massenzio edited comment on MESOS-3947 at 11/20/15 4:13 PM:
--

This obviously makes a lot of sense (although I'm not sure why the {{GET}} 
should not be authenticated, so it'd be great if you could elaborate on that 
point).

However, I believe, this should be part of a broader activity involving *all* 
endpoints, within the scope of the {{HttpAuthenticator}} effort.

Please see MESOS-2297 for more details.


was (Author: marco-mesos):
This makes obviously a lot of sense (although, I'm not sure why the {{GET}} 
should not be authenticated, so it'd be great if you could elaborate that 
point).

However, I believe, this should be part of a broader activity involving *all* 
endpoints, within the scope of the {{HttpAuthorizer}} effort.

Please see MESOS-2297 for more details.

> Authenticate /roles request
> ---
>
> Key: MESOS-3947
> URL: https://issues.apache.org/jira/browse/MESOS-3947
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> /roles requests except GET method need to be authenticated.
> This ticket will authenticate /roles requests using credentials provided by 
> the `Authorization` field of the HTTP request. This is similar to how 
> authentication is implemented in `Master::Http`.





[jira] [Commented] (MESOS-2000) Support libprocess tracing

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018226#comment-15018226
 ] 

Marco Massenzio commented on MESOS-2000:


[~jpe...@apache.org] - yes, we have something working as part of a "hackathon" 
project; this ticket is about getting the code into shape and releasing it.

> Support libprocess tracing 
> ---
>
> Key: MESOS-2000
> URL: https://issues.apache.org/jira/browse/MESOS-2000
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Timothy Chen
>
> Adding the ability to trace libprocess calls opens up a lot of opportunities 
> for easier debugging, understanding how Mesos works, profiling, and 
> visualization.
> Ideally, we also want to include timing information.





[jira] [Assigned] (MESOS-3961) Consider equality behavior for DiskInfo resource

2015-11-20 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio reassigned MESOS-3961:
--

Assignee: Greg Mann

> Consider equality behavior for DiskInfo resource
> 
>
> Key: MESOS-3961
> URL: https://issues.apache.org/jira/browse/MESOS-3961
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Greg Mann
>Priority: Minor
>  Labels: mesosphere, persistent-volumes
>
> Relevant code:
> {code}
> bool operator==(const Resource::DiskInfo& left, const Resource::DiskInfo& 
> right)
> {
>   // NOTE: We ignore 'volume' inside DiskInfo when doing comparison
>   // because it describes how this resource will be used which has
>   // nothing to do with the Resource object itself. A framework can
>   // use this resource and specify different 'volume' every time it
>   // uses it.
>   if (left.has_persistence() != right.has_persistence()) {
> return false;
>   }
>   if (left.has_persistence()) {
> return left.persistence().id() == right.persistence().id();
>   }
>   return true;
> }
> {code}
> A consequence of this behavior is that if you pass the wrong path to a 
> `destroy-volume` request (but there is a persistent volume that otherwise 
> matches the request), the path will be ignored and the volume will be 
> destroyed. Not clear if that is undesirable, but it does seem surprising.
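
The consequence described above can be illustrated with a small, self-contained sketch. It assumes C++17 and uses hypothetical, simplified structs (`Persistence`, `Volume`, `DiskInfo`) in place of the real protobuf messages; `equalIncludingVolume` is likewise a hypothetical stricter comparison shown only for contrast, not a proposed fix.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified stand-ins for the protobuf messages.
struct Persistence { std::string id; };
struct Volume { std::string container_path; };

struct DiskInfo
{
  std::optional<Persistence> persistence;
  std::optional<Volume> volume;
};

// Mirrors the comparison quoted above: 'volume' is deliberately ignored,
// so only the presence of persistence and its ID are compared.
bool operator==(const DiskInfo& left, const DiskInfo& right)
{
  if (left.persistence.has_value() != right.persistence.has_value()) {
    return false;
  }

  if (left.persistence.has_value()) {
    return left.persistence->id == right.persistence->id;
  }

  return true;
}

// Hypothetical stricter comparison that also takes the volume's
// container path into account.
bool equalIncludingVolume(const DiskInfo& left, const DiskInfo& right)
{
  if (!(left == right)) {
    return false;
  }

  if (left.volume.has_value() != right.volume.has_value()) {
    return false;
  }

  if (left.volume.has_value()) {
    return left.volume->container_path == right.volume->container_path;
  }

  return true;
}
```

With these definitions, two `DiskInfo` values that share a persistence ID but carry different container paths compare equal under `operator==` and unequal under `equalIncludingVolume`, which is the behavior the ticket calls surprising for `destroy-volume` requests.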





[jira] [Commented] (MESOS-3946) Test for role management

2015-11-20 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15018209#comment-15018209
 ] 

Marco Massenzio commented on MESOS-3946:


Can you please add more details as to what you are proposing here, what's 
missing, etc.?

thanks.

> Test for role management
> 
>
> Key: MESOS-3946
> URL: https://issues.apache.org/jira/browse/MESOS-3946
> Project: Mesos
>  Issue Type: Task
>Reporter: Yong Qiao Wang
>Assignee: Yong Qiao Wang
>
> Add test for role dynamic configuration.




