[jira] [Updated] (MESOS-5473) Enable Docker and HDFS on Windows

2016-05-27 Thread Daniel Pravat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Pravat updated MESOS-5473:
-
Summary: Enable Docker and HDFS on Windows  (was: Enable 
downloadWithHadoopClient on Windows)

> Enable Docker and HDFS on Windows
> -
>
> Key: MESOS-5473
> URL: https://issues.apache.org/jira/browse/MESOS-5473
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Daniel Pravat
>  Labels: Windows, hdfs
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5425) Consider using IntervalSet for Port range resource math

2016-05-27 Thread Yanyan Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305167#comment-15305167
 ] 

Yanyan Hu commented on MESOS-5425:
--

Sure, I will be very glad to post my existing work. I will read the following 
guide to understand how to submit a patch. Thanks!

http://mesos.apache.org/documentation/latest/submitting-a-patch/

> Consider using IntervalSet for Port range resource math
> ---
>
> Key: MESOS-5425
> URL: https://issues.apache.org/jira/browse/MESOS-5425
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Joseph Wu
>  Labels: mesosphere
>
> Follow-up JIRA for comments raised in MESOS-3051 (see comments there).
> We should consider utilizing 
> [{{IntervalSet}}|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/3rdparty/stout/include/stout/interval.hpp]
>  in [Port range resource 
> math|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/src/common/values.cpp#L143].
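
To make the suggestion concrete, here is a minimal Python sketch of interval-based range subtraction, the kind of operation port resource math needs. It is not the stout {{IntervalSet}} API or the code in values.cpp; the helper names `normalize` and `subtract` are hypothetical, and ranges are assumed to be inclusive (begin, end) pairs.

```python
def normalize(ranges):
    """Sort inclusive (begin, end) ranges and merge overlapping or adjacent ones."""
    merged = []
    for begin, end in sorted(ranges):
        if merged and begin <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return merged


def subtract(left, right):
    """Return the ports in `left` that are not in `right`, as inclusive ranges."""
    result = []
    right = normalize(right)
    for begin, end in normalize(left):
        cursor = begin  # first port of this range not yet accounted for
        for r_begin, r_end in right:
            if r_end < cursor or r_begin > end:
                continue  # no overlap with the remaining part of this range
            if r_begin > cursor:
                result.append((cursor, r_begin - 1))
            cursor = max(cursor, r_end + 1)
            if cursor > end:
                break
        if cursor <= end:
            result.append((cursor, end))
    return result


# Hypothetical port numbers, chosen only for illustration.
offered = [(31000, 32000)]
used = [(31000, 31005), (31500, 31500)]
remaining = subtract(offered, used)
print(remaining)
```

Keeping the ranges sorted and merged means each operation walks the ranges once instead of expanding them into individual ports.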



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-5473) Enable downloadWithHadoopClient on Windows

2016-05-27 Thread Daniel Pravat (JIRA)
Daniel Pravat created MESOS-5473:


 Summary: Enable downloadWithHadoopClient on Windows
 Key: MESOS-5473
 URL: https://issues.apache.org/jira/browse/MESOS-5473
 Project: Mesos
  Issue Type: Improvement
Reporter: Daniel Pravat






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-27 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305063#comment-15305063
 ] 

Joseph Wu commented on MESOS-4642:
--

Looks like the protobuf response in the V1 operator API will neatly side-step 
this issue.  (By effectively creating a new endpoint.)

The response protobuf in the document is:
{code}
message FileContents {
  repeated byte bytes = 1;
}
{code}

The {{byte}} type (presumably protobuf's {{bytes}}) becomes a base64-encoded string 
in the JSON output, which will always be valid JSON.
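
As a rough illustration of why base64 side-steps the problem, the Python sketch below takes a few of the bytes from the hexdump in the issue description, shows that they are not valid UTF-8, and then wraps them in JSON via base64. The field name "bytes" is borrowed from the message above; everything else is an assumption made for the example and is not Mesos code.

```python
import base64
import json

# Bytes similar to those in the issue's hexdump: a newline, control bytes,
# and 0xac, which is not a valid UTF-8 start byte.
raw = bytes([0x0A, 0x01, 0x00, 0x00, 0xAC]) + b"Wed, 10"

# Copying the raw bytes into a JSON string is what breaks consumers:
# the data cannot even be decoded as UTF-8.
try:
    raw.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# Base64 output is plain ASCII, so the resulting document is valid JSON.
encoded = base64.b64encode(raw).decode("ascii")
document = json.dumps({"bytes": encoded})

# A client reverses the encoding to get the original bytes back.
round_tripped = base64.b64decode(json.loads(document)["bytes"])
print(decodable, round_tripped == raw)
```

The cost is that clients must decode the field before displaying it, and the payload grows by roughly a third.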

> Mesos Agent Json API can dump binary data from log files out as invalid JSON
> 
>
> Key: MESOS-4642
> URL: https://issues.apache.org/jira/browse/MESOS-4642
> Project: Mesos
>  Issue Type: Bug
>  Components: json api, slave
>Affects Versions: 0.27.0
>Reporter: Steven Schlansker
>Priority: Critical
> Fix For: 1.0.0
>
>
> One of our tasks accidentally started logging binary data to stderr.  This 
> was not intentional and generally should not happen -- however, it causes 
> severe problems with the Mesos Agent "files/read.json" API, since it gladly 
> dumps this binary data out as invalid JSON.
> {code}
> # hexdump -C /path/to/task/stderr | tail
> 0003d1f0  6f 6e 6e 65 63 74 69 6f  6e 0a 4e 45 54 3a 20 31  |onnection.NET: 1|
> 0003d200  20 6f 6e 72 65 61 64 20  45 4e 4f 45 4e 54 20 32  | onread ENOENT 2|
> 0003d210  39 35 34 35 36 20 32 35  31 20 32 39 35 37 30 37  |95456 251 295707|
> 0003d220  0a 01 00 00 00 00 00 00  ac 57 65 64 2c 20 31 30  |.........Wed, 10|
> 0003d230  20 55 6e 72 65 63 6f 67  6e 69 7a 65 64 20 69 6e  | Unrecognized in|
> 0003d240  70 75 74 20 68 65 61 64  65 72 0a |put header.|
> {code}
> {code}
> # curl 
> 'http://agent-host:5051/files/read.json?path=/path/to/task/stderr=220443=9='
>  | hexdump -C
> 00007970  6e 65 63 74 69 6f 6e 5c  6e 4e 45 54 3a 20 31 20  |nection\nNET: 1 |
> 00007980  6f 6e 72 65 61 64 20 45  4e 4f 45 4e 54 20 32 39  |onread ENOENT 29|
> 00007990  35 34 35 36 20 32 35 31  20 32 39 35 37 30 37 5c  |5456 251 295707\|
> 000079a0  6e 5c 75 30 30 30 31 5c  75 30 30 30 30 5c 75 30  |n\u0001\u0000\u0|
> 000079b0  30 30 30 5c 75 30 30 30  30 5c 75 30 30 30 30 5c  |000\u0000\u0000\|
> 000079c0  75 30 30 30 30 5c 75 30  30 30 30 ac 57 65 64 2c  |u0000\u0000.Wed,|
> 000079d0  20 31 30 20 55 6e 72 65  63 6f 67 6e 69 7a 65 64  | 10 Unrecognized|
> 000079e0  20 69 6e 70 75 74 20 68  65 61 64 65 72 5c 6e 22  | input header\n"|
> 000079f0  2c 22 6f 66 66 73 65 74  22 3a 32 32 30 34 34 33  |,"offset":220443|
> 00007a00  7d |}|
> {code}
> This causes downstream sadness:
> {code}
> ERROR [2016-02-10 18:55:12,303] 
> io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 
> 0ee749630f8b26f1
> ! com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0xac
> !  at [Source: org.jboss.netty.buffer.ChannelBufferInputStream@6d69ee8; line: 
> 1, column: 31181]
> ! at 
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1487) 
> ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:518)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3339)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2360)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2287)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:286)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:29)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:12)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:523)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:381)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1073)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> 

[jira] [Commented] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-27 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305052#comment-15305052
 ] 

Vinod Kone commented on MESOS-4642:
---

Can we do the right thing in v1 API instead of adding a new JSON endpoint?


[jira] [Commented] (MESOS-5350) Add asynchronous hook for validating docker containerizer tasks

2016-05-27 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305032#comment-15305032
 ] 

Jie Yu commented on MESOS-5350:
---

commit f5b3f2a6c4795d2c4e7effbb594749cdc9b8ea4e
Author: Joseph Wu 
Date:   Fri May 27 17:19:43 2016 -0700

Wired up the new docker environment hook.

Modifies the code path for docker executors.

Docker command executors are now launched with an additional flag
that is filled in by a hook.  The --task_environment flag tells the
command executor to pass some specified mapping of environment
variables to the task.

Custom executors are launched with the environment variables directly.
It is up to custom executors to pass these variables into tasks.

Review: https://reviews.apache.org/r/47216/

commit 8486e9829435d9f09ad0c13de8a4c14257d8a988
Author: Joseph Wu 
Date:   Fri May 27 17:19:30 2016 -0700

Implemented new asynchronous docker pre-launch hook.

Introduces, but does not fully wire up a new hook.

The new hook, "slavePreLaunchDockerEnvironmentDecorator",
has divergent semantics compared with existing hooks:

* The hook is asynchronous,
* can prevent a task from launching if it errors,
* can overwrite environment variables.

The new hook is intended to be a strictly-superior
replacement for the existing hook "slavePreLaunchDockerHook".

Review: https://reviews.apache.org/r/47150/
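
The three semantics listed in the commit above can be sketched in Python with asyncio. This is only an illustration under assumptions: `launch`, `add_secret`, and `reject` are hypothetical names, and the real hook is a C++ module interface whose signature is not shown in this thread.

```python
import asyncio


async def launch(task_env, hook):
    """Await the hook, abort on failure, and let hook variables win."""
    try:
        # Asynchronous: the launch path waits on the hook's result.
        hook_env = await hook(task_env)
    except Exception as error:
        # A failing hook prevents the task from launching.
        return {"launched": False, "reason": str(error)}
    # Later keys win in a dict merge, so the hook overwrites the task's values.
    return {"launched": True, "environment": {**task_env, **hook_env}}


async def add_secret(task_env):
    """Hypothetical hook that injects (and overwrites) a variable."""
    return {"TOKEN": "from-hook"}


async def reject(task_env):
    """Hypothetical hook that fails validation."""
    raise RuntimeError("validation failed")


ok = asyncio.run(launch({"TOKEN": "from-task", "HOME": "/root"}, add_secret))
denied = asyncio.run(launch({"HOME": "/root"}, reject))
print(ok)
print(denied)
```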

commit cd46db8073fd1ee3d8bd63d3bfedab2e7a522bd7
Author: Joseph Wu 
Date:   Fri May 27 17:19:26 2016 -0700

Changed the dockerized docker command executor CommandInfo usage.

This changes how we override the `CommandInfo` when launching a
dockerized executor: from `shell = true` to `shell = false`.
This means that flags are now passed directly as separate arguments
rather than as one long string.

i.e.
From: 'mesos-docker-executor --foo="bar" --some="thing"'
To: [ 'mesos-docker-executor', '--foo=bar', '--some=thing' ]

Review: https://reviews.apache.org/r/47215/
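
The difference between the two forms in the commit above can be shown with Python's standard `shlex` module, which approximates POSIX shell word splitting. This is only an analogy for how a shell would parse the string form; it is not how Mesos builds the command, and the `--foo=bar baz` value is made up for the example.

```python
import shlex

# shell = true: one string that a shell must parse (quoting rules apply).
as_shell_string = 'mesos-docker-executor --foo="bar" --some="thing"'

# shell = false: the arguments are already split and are passed as-is.
as_argv = ["mesos-docker-executor", "--foo=bar", "--some=thing"]

# shlex.split performs shell-like splitting and quote removal, so it
# approximates what a shell would hand to the executor.
parsed = shlex.split(as_shell_string)
print(parsed == as_argv)

# A value containing a space shows why the argv form is less fragile: in the
# string form it only survives if quoted, in the argv form no quoting is needed.
argv_with_space = ["mesos-docker-executor", "--foo=bar baz"]
print(shlex.join(argv_with_space))
```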

commit 9b054cc46d462ad5c8c5074b8b5c9e7eeac3dabf
Author: Joseph Wu 
Date:   Fri May 27 17:19:22 2016 -0700

Removed duplicate call to containerizer::executorEnvironment.

In this code path, where the task uses the default command executor,
and the agent is not dockerized
(i.e. `taskInfo.isSome() && flags.docker_mesos_image.isNone()`),
the `executorEnvironment` function is called twice.

The first call is inside the `Container*` constructor called by
`Container::create`.  Since `Container::create` passes `None`
for the `environment` field, the constructor will call
`executorEnvironment` to populate the `environment` field.
The populated field is then accessible by `launchExecutorProcess`.

Review: https://reviews.apache.org/r/47212/

commit 82029372c3eb0a12218fd9864cc0f5da38f5b108
Author: Joseph Wu 
Date:   Fri May 27 17:19:15 2016 -0700

Added optional environment variable argument to mesos-docker-executor.

This flag opens up a way for hooks to specify environment variables for
docker tasks.  Existing hooks can only affect the environment variables
of docker executors.

Review: https://reviews.apache.org/r/47205/

commit 21a3ab6fbb89945fbd7b2ea773fff67894bf24bb
Author: Joseph Wu 
Date:   Fri May 27 17:19:10 2016 -0700

Split DockerContainerizerProcess::launch into two functions.

This prepares the `::launch` method for an asynchronous hook.

Review: https://reviews.apache.org/r/47149/

> Add asynchronous hook for validating docker containerizer tasks
> ---
>
> Key: MESOS-5350
> URL: https://issues.apache.org/jira/browse/MESOS-5350
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker, modules
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>Priority: Minor
>  Labels: containerizer, hooks, mesosphere
>
> It is possible to plug in custom validation logic for the MesosContainerizer 
> via an {{Isolator}} module, but the same is not true of the 
> DockerContainerizer.
> Basic logic can be plugged into the DockerContainerizer via {{Hooks}}, but 
> this has some notable differences compared to isolators:
> * Hooks are synchronous.
> * Modifications to tasks via Hooks have lower priority compared to the task 
> itself.  i.e. If both the {{TaskInfo}} and 
> {{slaveExecutorEnvironmentDecorator}} define the same environment variable, 
> the {{TaskInfo}} wins.
> * Hooks have no effect if they fail (short of segfaulting)
> i.e. The {{slavePreLaunchDockerHook}} has a return type of {{Try}}:
> 

[jira] [Commented] (MESOS-5412) Support CNI_ARGS

2016-05-27 Thread Dan Osborne (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304992#comment-15304992
 ] 

Dan Osborne commented on MESOS-5412:


[~hartem] Not planning on submitting a patch by then, so feel free to bump this 
out.

I'm no longer convinced that CNI args would be the right place to inject 
network policy definitions for a Task. Shall we leave this issue open as a 
backlog item until a more pressing need / more defined use case for CNI_ARGS 
arises?

> Support CNI_ARGS
> 
>
> Key: MESOS-5412
> URL: https://issues.apache.org/jira/browse/MESOS-5412
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Dan Osborne
>
> Mesos-CNI should support the 
> [CNI_ARGS|https://github.com/containernetworking/cni/blob/master/SPEC.md#parameters]
>  field.
> This would allow CNI plugins to be able to implement advanced networking 
> capabilities without needing modifications to Mesos. Current use case I am 
> facing: Allowing users to specify policy for their CNI plugin. 
> I'm proposing the following implementation: Pass a task's [NetworkInfo 
> Labels|https://github.com/apache/mesos/blob/b7e50fe8b20c96cda5546db5f2c2f47bee461edb/include/mesos/mesos.proto#L1732]
>  to the CNI plugin as CNI_ARGS. CNI args are simply key-value pairs split by 
> a '=', e.g. "FOO=BAR;ABC=123", which could be easily generated from the 
> NetworkInfo's key-value labels.
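
The proposed mapping from labels to CNI_ARGS can be sketched as follows. This is a minimal Python illustration, not the Mesos implementation; `labels_to_cni_args` is a hypothetical helper, labels are assumed to arrive as (key, value) pairs, and rejecting reserved characters rather than escaping them is a choice made for this sketch.

```python
def labels_to_cni_args(labels):
    """Join key-value labels into a CNI_ARGS string such as "FOO=BAR;ABC=123"."""
    parts = []
    for key, value in labels:
        # ';' separates pairs and '=' separates key from value, so neither can
        # appear in a key; this sketch rejects such labels instead of escaping.
        if ";" in key or "=" in key or ";" in value:
            raise ValueError(f"label cannot be represented in CNI_ARGS: {key!r}")
        parts.append(f"{key}={value}")
    return ";".join(parts)


cni_args = labels_to_cni_args([("FOO", "BAR"), ("ABC", "123")])
print(cni_args)
```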





[jira] [Commented] (MESOS-5354) Update "driver" as optional for DockerVolume.

2016-05-27 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304969#comment-15304969
 ] 

Jie Yu commented on MESOS-5354:
---

commit 183fb0431ceb185cd29ea34578415883c2db29cc
Author: Guangya Liu 
Date:   Fri May 27 16:08:34 2016 -0700

Made "driver" as optional for DockerVolume.

Review: https://reviews.apache.org/r/45377/

> Update "driver" as optional for DockerVolume.
> -
>
> Key: MESOS-5354
> URL: https://issues.apache.org/jira/browse/MESOS-5354
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>Priority: Blocker
> Fix For: 0.29.0
>
>
> After some testing with the Docker API, I found that when using "docker run" to 
> create a container, the volume name is required but the volume driver is 
> optional. When using "dvdcli", both name and driver are required. We currently 
> define "driver" as required; we should change "driver" to optional so that the 
> DockerContainerizer still works even if the user did not specify a driver when 
> creating a container with a volume.
> {code}
> message DockerVolume {
>   // Driver of the volume, it can be flocker, convoy, rexray etc.
>   required string driver = 1;
>   // Name of the volume.
>   required string name = 2;
>   // Volume driver specific options.
>   optional Parameters driver_options = 3;
> }
> {code}





[jira] [Commented] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-27 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304965#comment-15304965
 ] 

Joseph Wu commented on MESOS-4642:
--

There isn't a straightforward solution for this one.  Our options are to make 
a small breaking change, omit data from the file, or create a new analogous 
endpoint (and have frameworks use that one instead).


[jira] [Commented] (MESOS-5453) CNI should not store subnet of address in NetworkInfo

2016-05-27 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304958#comment-15304958
 ] 

Jie Yu commented on MESOS-5453:
---

Thanks for contributing! Nope, nothing is left to do. I've already resolved this ticket.

> CNI should not store subnet of address in NetworkInfo
> -
>
> Key: MESOS-5453
> URL: https://issues.apache.org/jira/browse/MESOS-5453
> Project: Mesos
>  Issue Type: Bug
>Reporter: Dan Osborne
>Assignee: Dan Osborne
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> When the CNI isolator executes the CNI plugin, that CNI plugin will return an 
> IP Address and Subnet (192.168.0.1/32). Mesos should strip the subnet before 
> storing the address in the Task.NetworkInfo.IPAddress.
> Reason being - most current mesos components are not expecting a subnet in 
> the Task's NetworkInfo.IPAddress, and instead expect just the IP address. 
> This can cause errors in those components, such as Mesos-DNS failing to 
> return a NetworkInfo address (and instead defaulting to the next configured 
> IPSource), and Marathon generating invalid links to tasks (as it includes /32 
> in the link)
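
The stripping step described above can be sketched with Python's standard `ipaddress` module. The helper name `strip_subnet` is hypothetical and this is not the isolator's code; it only shows the intended transformation on the example address from the description.

```python
import ipaddress


def strip_subnet(cni_address):
    """Return only the IP address from a CNI result such as "192.168.0.1/32"."""
    # ip_interface accepts "address/prefix" (and a bare address) for both
    # IPv4 and IPv6, and raises ValueError on malformed input.
    return str(ipaddress.ip_interface(cni_address).ip)


print(strip_subnet("192.168.0.1/32"))
```

Parsing the value instead of splitting on '/' also rejects malformed input early.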





[jira] [Commented] (MESOS-5453) CNI should not store subnet of address in NetworkInfo

2016-05-27 Thread Dan Osborne (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304948#comment-15304948
 ] 

Dan Osborne commented on MESOS-5453:


First time submitting - Is there something left for me to do to close this out?



[jira] [Updated] (MESOS-4609) Subprocess should be more intelligent about setting/inheriting libprocess environment variables

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4609:
-
Fix Version/s: 1.0.0

> Subprocess should be more intelligent about setting/inheriting libprocess 
> environment variables 
> 
>
> Key: MESOS-4609
> URL: https://issues.apache.org/jira/browse/MESOS-4609
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.0
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: mesosphere
> Fix For: 1.0.0
>
>
> Mostly copied from [this 
> comment|https://issues.apache.org/jira/browse/MESOS-4598?focusedCommentId=15133497=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15133497]
> A subprocess inheriting the environment variables {{LIBPROCESS_*}} may run 
> into some accidental fatalities:
> | || Subprocess uses libprocess || Subprocess is something else ||
> || Subprocess sets/inherits the same {{PORT}} by accident | Bind failure -> 
> exit | Nothing happens (?) |
> || Subprocess sets a different {{PORT}} on purpose | Bind success (?) | 
> Nothing happens (?) |
> (?) = means this is usually the case, but not 100%.
> A complete fix would look something like:
> * If the {{subprocess}} call gets {{environment = None()}}, we should 
> automatically remove {{LIBPROCESS_PORT}} from the inherited environment.  
> * The parts of 
> [{{executorEnvironment}}|https://github.com/apache/mesos/blame/master/src/slave/containerizer/containerizer.cpp#L265]
>  dealing with libprocess & libmesos should be refactored into libprocess as a 
> helper.  We would use this helper for the Containerizer, Fetcher, and 
> ContainerLogger module.
> * If the {{subprocess}} call is given {{LIBPROCESS_PORT == 
> os::getenv("LIBPROCESS_PORT")}}, we can LOG(WARN) and unset the env var 
> locally.
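
The first and third bullets of the proposed fix can be sketched in Python as below. This is an illustration only: `child_environment` is a hypothetical helper, environments are modelled as plain dictionaries, and the warning is collected in a list rather than logged. The refactoring described in the second bullet is not covered.

```python
def child_environment(parent_env, requested_env=None):
    """Build the environment for a subprocess.

    requested_env=None models `environment = None()`: inherit from the
    parent, but drop LIBPROCESS_PORT so the child does not bind the same port.
    """
    if requested_env is None:
        env = dict(parent_env)
        env.pop("LIBPROCESS_PORT", None)
        return env, []

    env = dict(requested_env)
    warnings = []
    parent_port = parent_env.get("LIBPROCESS_PORT")
    if parent_port is not None and env.get("LIBPROCESS_PORT") == parent_port:
        # Same port as the parent: warn and unset rather than fail on bind.
        warnings.append("LIBPROCESS_PORT matches the parent's port; unsetting it")
        del env["LIBPROCESS_PORT"]
    return env, warnings


# Hypothetical parent environment, for illustration only.
parent = {"LIBPROCESS_PORT": "5051", "PATH": "/usr/bin"}
inherited, _ = child_environment(parent)
explicit, notes = child_environment(parent, {"LIBPROCESS_PORT": "5051"})
print(inherited, explicit, notes)
```

A port that differs from the parent's is passed through untouched, which matches the "different port on purpose" row of the table.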





[jira] [Commented] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-27 Thread Artem Harutyunyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304933#comment-15304933
 ] 

Artem Harutyunyan commented on MESOS-4642:
--

[~kaysoky] can you take a look at this one please? 


[jira] [Updated] (MESOS-4642) Mesos Agent Json API can dump binary data from log files out as invalid JSON

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4642:
-
Fix Version/s: 1.0.0

> Mesos Agent Json API can dump binary data from log files out as invalid JSON
> 
>
> Key: MESOS-4642
> URL: https://issues.apache.org/jira/browse/MESOS-4642
> Project: Mesos
>  Issue Type: Bug
>  Components: json api, slave
>Affects Versions: 0.27.0
>Reporter: Steven Schlansker
>Priority: Critical
> Fix For: 1.0.0
>
>
> One of our tasks accidentally started logging binary data to stderr.  This 
> was not intentional and generally should not happen -- however, it causes 
> severe problems with the Mesos Agent "files/read.json" API, since it gladly 
> dumps this binary data out as invalid JSON.
> {code}
> # hexdump -C /path/to/task/stderr | tail
> 0003d1f0  6f 6e 6e 65 63 74 69 6f  6e 0a 4e 45 54 3a 20 31  |onnection.NET: 1|
> 0003d200  20 6f 6e 72 65 61 64 20  45 4e 4f 45 4e 54 20 32  | onread ENOENT 2|
> 0003d210  39 35 34 35 36 20 32 35  31 20 32 39 35 37 30 37  |95456 251 295707|
> 0003d220  0a 01 00 00 00 00 00 00  ac 57 65 64 2c 20 31 30  |.........Wed, 10|
> 0003d230  20 55 6e 72 65 63 6f 67  6e 69 7a 65 64 20 69 6e  | Unrecognized in|
> 0003d240  70 75 74 20 68 65 61 64  65 72 0a |put header.|
> {code}
> {code}
> # curl 
> 'http://agent-host:5051/files/read.json?path=/path/to/task/stderr=220443=9='
>  | hexdump -C
> 7970  6e 65 63 74 69 6f 6e 5c  6e 4e 45 54 3a 20 31 20  |nection\nNET: 1 |
> 7980  6f 6e 72 65 61 64 20 45  4e 4f 45 4e 54 20 32 39  |onread ENOENT 29|
> 7990  35 34 35 36 20 32 35 31  20 32 39 35 37 30 37 5c  |5456 251 295707\|
> 79a0  6e 5c 75 30 30 30 31 5c  75 30 30 30 30 5c 75 30  |n\u0001\u0000\u0|
> 79b0  30 30 30 5c 75 30 30 30  30 5c 75 30 30 30 30 5c  |000\u0000\u0000\|
> 79c0  75 30 30 30 30 5c 75 30  30 30 30 ac 57 65 64 2c  |u0000\u0000.Wed,|
> 79d0  20 31 30 20 55 6e 72 65  63 6f 67 6e 69 7a 65 64  | 10 Unrecognized|
> 79e0  20 69 6e 70 75 74 20 68  65 61 64 65 72 5c 6e 22  | input header\n"|
> 79f0  2c 22 6f 66 66 73 65 74  22 3a 32 32 30 34 34 33  |,"offset":220443|
> 7a00  7d|}|
> {code}
> This causes downstream sadness:
> {code}
> ERROR [2016-02-10 18:55:12,303] 
> io.dropwizard.jersey.errors.LoggingExceptionMapper: Error handling a request: 
> 0ee749630f8b26f1
> ! com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0xac
> !  at [Source: org.jboss.netty.buffer.ChannelBufferInputStream@6d69ee8; line: 
> 1, column: 31181]
> ! at 
> com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1487) 
> ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:518)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3339)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidChar(UTF8StreamJsonParser.java:)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2360)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2287)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:286)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:29)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:12)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:523)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:381)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1073)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.module.afterburner.deser.SuperSonicBeanDeserializer.deserializeFromObject(SuperSonicBeanDeserializer.java:196)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:142)
>  ~[singularity-0.4.9.jar:0.4.9]
> ! at 
> 

[jira] [Updated] (MESOS-5188) docker executor thinks task is failed when docker container was stopped

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5188:
-
Assignee: (was: Jie Yu)

> docker executor thinks task is failed when docker container was stopped
> ---
>
> Key: MESOS-5188
> URL: https://issues.apache.org/jira/browse/MESOS-5188
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.0
>Reporter: Liqiang Lin
> Fix For: 1.0.0
>
>
> Test cases:
> 1. Launch a task with Swarm (on Mesos).
> {code}
> # docker -H 192.168.56.110:54375 run -d --cpu-shares 1 ubuntu sleep 300
> {code}
> 2. Then stop the docker container.
> {code}
> # docker -H 192.168.56.110:54375 ps
> CONTAINER ID   IMAGE    COMMAND       CREATED         STATUS         PORTS   NAMES
> b4813ba3ed4d   ubuntu   "sleep 300"   9 seconds ago   Up 8 seconds           mesos1/mesos-2cd5576e-6260-4262-a62c-b0dc45c86c45-S1.1595e79b-aef2-44b6-a313-ad4ff8626958
> # docker -H 192.168.56.110:54375 stop b4813ba3ed4d
> b4813ba3ed4d
> {code}
> 3. The task is reported as failed. See the Mesos slave log:
> {code}
> I0407 09:10:57.606552 32307 slave.cpp:1508] Got assigned task 99ee7dc74861 
> for framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c-
> I0407 09:10:57.608230 32307 slave.cpp:1627] Launching task 99ee7dc74861 for 
> framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c-
> I0407 09:10:57.609979 32307 paths.cpp:528] Trying to chown 
> '/var/lib/mesos/slaves/2cd5576e-6260-4262-a62c-b0dc45c86c45-S0/frameworks/5b84aad8-dd60-40b3-84c2-93be6b7aa81c-/executors/99ee7dc74861/runs/250a169f-7aba-474d-a4f5-cd24ecf0e7d9'
>  to user 'root'
> I0407 09:10:57.615881 32307 slave.cpp:5586] Launching executor 99ee7dc74861 
> of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/var/lib/mesos/slaves/2cd5576e-6260-4262-a62c-b0dc45c86c45-S0/frameworks/5b84aad8-dd60-40b3-84c2-93be6b7aa81c-/executors/99ee7dc74861/runs/250a169f-7aba-474d-a4f5-cd24ecf0e7d9'
> I0407 09:12:18.458449 32307 slave.cpp:1845] Queuing task '99ee7dc74861' for 
> executor '99ee7dc74861' of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c-
> I0407 09:12:18.459092 32307 slave.cpp:3711] No pings from master received 
> within 75secs
> I0407 09:12:18.460212 32307 slave.cpp:4593] Current disk usage 56.53%. Max 
> allowed age: 2.342613645432778days
> I0407 09:12:18.463484 32307 slave.cpp:928] Re-detecting master
> I0407 09:12:18.463969 32307 slave.cpp:975] Detecting new master
> I0407 09:12:18.464501 32307 slave.cpp:939] New master detected at 
> master@192.168.56.110:5050
> I0407 09:12:18.464848 32307 slave.cpp:964] No credentials provided. 
> Attempting to register without authentication
> I0407 09:12:18.465237 32307 slave.cpp:975] Detecting new master
> I0407 09:12:18.463611 32312 status_update_manager.cpp:174] Pausing sending 
> status updates
> I0407 09:12:18.465744 32312 status_update_manager.cpp:174] Pausing sending 
> status updates
> I0407 09:12:18.472323 32313 docker.cpp:1011] Starting container 
> '250a169f-7aba-474d-a4f5-cd24ecf0e7d9' for task '99ee7dc74861' (and executor 
> '99ee7dc74861') of framework '5b84aad8-dd60-40b3-84c2-93be6b7aa81c-'
> I0407 09:12:18.588739 32313 slave.cpp:1218] Re-registered with master 
> master@192.168.56.110:5050
> I0407 09:12:18.588927 32313 slave.cpp:1254] Forwarding total oversubscribed 
> resources
> I0407 09:12:18.589320 32313 slave.cpp:2395] Updating framework 
> 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- pid to 
> scheduler(1)@192.168.56.110:53375
> I0407 09:12:18.592079 32308 status_update_manager.cpp:181] Resuming sending 
> status updates
> I0407 09:12:18.592842 32313 slave.cpp:2534] Updated checkpointed resources 
> from  to
> I0407 09:12:18.592793 32308 status_update_manager.cpp:181] Resuming sending 
> status updates
> I0407 09:12:20.582041 32307 slave.cpp:2836] Got registration for executor 
> '99ee7dc74861' of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- from 
> executor(1)@192.168.56.110:40725
> I0407 09:12:20.584446 32307 docker.cpp:1308] Ignoring updating container 
> '250a169f-7aba-474d-a4f5-cd24ecf0e7d9' with resources passed to update is 
> identical to existing resources
> I0407 09:12:20.585093 32307 slave.cpp:2010] Sending queued task 
> '99ee7dc74861' to executor '99ee7dc74861' of framework 
> 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- at executor(1)@192.168.56.110:40725
> I0407 09:12:21.307077 32312 slave.cpp:3195] Handling status update 
> TASK_RUNNING (UUID: a7098650-cbf6-4445-8216-b5f658d2f5f4) for task 
> 99ee7dc74861 of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- from 
> executor(1)@192.168.56.110:40725
> I0407 09:12:21.308820 32308 

[jira] [Updated] (MESOS-5188) docker executor thinks task is failed when docker container was stopped

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5188:
-
Fix Version/s: 1.0.0

> docker executor thinks task is failed when docker container was stopped
> ---
>
> Key: MESOS-5188
> URL: https://issues.apache.org/jira/browse/MESOS-5188
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.0
>Reporter: Liqiang Lin
>Assignee: Jie Yu
> Fix For: 1.0.0
>
>
> Test cases:
> 1. Launch a task with Swarm (on Mesos).
> {code}
> # docker -H 192.168.56.110:54375 run -d --cpu-shares 1 ubuntu sleep 300
> {code}
> 2. Then stop the docker container.
> {code}
> # docker -H 192.168.56.110:54375 ps
> CONTAINER ID   IMAGE    COMMAND       CREATED         STATUS         PORTS   NAMES
> b4813ba3ed4d   ubuntu   "sleep 300"   9 seconds ago   Up 8 seconds           mesos1/mesos-2cd5576e-6260-4262-a62c-b0dc45c86c45-S1.1595e79b-aef2-44b6-a313-ad4ff8626958
> # docker -H 192.168.56.110:54375 stop b4813ba3ed4d
> b4813ba3ed4d
> {code}
> 3. The task is reported as failed. See the Mesos slave log:
> {code}
> I0407 09:10:57.606552 32307 slave.cpp:1508] Got assigned task 99ee7dc74861 
> for framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c-
> I0407 09:10:57.608230 32307 slave.cpp:1627] Launching task 99ee7dc74861 for 
> framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c-
> I0407 09:10:57.609979 32307 paths.cpp:528] Trying to chown 
> '/var/lib/mesos/slaves/2cd5576e-6260-4262-a62c-b0dc45c86c45-S0/frameworks/5b84aad8-dd60-40b3-84c2-93be6b7aa81c-/executors/99ee7dc74861/runs/250a169f-7aba-474d-a4f5-cd24ecf0e7d9'
>  to user 'root'
> I0407 09:10:57.615881 32307 slave.cpp:5586] Launching executor 99ee7dc74861 
> of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/var/lib/mesos/slaves/2cd5576e-6260-4262-a62c-b0dc45c86c45-S0/frameworks/5b84aad8-dd60-40b3-84c2-93be6b7aa81c-/executors/99ee7dc74861/runs/250a169f-7aba-474d-a4f5-cd24ecf0e7d9'
> I0407 09:12:18.458449 32307 slave.cpp:1845] Queuing task '99ee7dc74861' for 
> executor '99ee7dc74861' of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c-
> I0407 09:12:18.459092 32307 slave.cpp:3711] No pings from master received 
> within 75secs
> I0407 09:12:18.460212 32307 slave.cpp:4593] Current disk usage 56.53%. Max 
> allowed age: 2.342613645432778days
> I0407 09:12:18.463484 32307 slave.cpp:928] Re-detecting master
> I0407 09:12:18.463969 32307 slave.cpp:975] Detecting new master
> I0407 09:12:18.464501 32307 slave.cpp:939] New master detected at 
> master@192.168.56.110:5050
> I0407 09:12:18.464848 32307 slave.cpp:964] No credentials provided. 
> Attempting to register without authentication
> I0407 09:12:18.465237 32307 slave.cpp:975] Detecting new master
> I0407 09:12:18.463611 32312 status_update_manager.cpp:174] Pausing sending 
> status updates
> I0407 09:12:18.465744 32312 status_update_manager.cpp:174] Pausing sending 
> status updates
> I0407 09:12:18.472323 32313 docker.cpp:1011] Starting container 
> '250a169f-7aba-474d-a4f5-cd24ecf0e7d9' for task '99ee7dc74861' (and executor 
> '99ee7dc74861') of framework '5b84aad8-dd60-40b3-84c2-93be6b7aa81c-'
> I0407 09:12:18.588739 32313 slave.cpp:1218] Re-registered with master 
> master@192.168.56.110:5050
> I0407 09:12:18.588927 32313 slave.cpp:1254] Forwarding total oversubscribed 
> resources
> I0407 09:12:18.589320 32313 slave.cpp:2395] Updating framework 
> 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- pid to 
> scheduler(1)@192.168.56.110:53375
> I0407 09:12:18.592079 32308 status_update_manager.cpp:181] Resuming sending 
> status updates
> I0407 09:12:18.592842 32313 slave.cpp:2534] Updated checkpointed resources 
> from  to
> I0407 09:12:18.592793 32308 status_update_manager.cpp:181] Resuming sending 
> status updates
> I0407 09:12:20.582041 32307 slave.cpp:2836] Got registration for executor 
> '99ee7dc74861' of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- from 
> executor(1)@192.168.56.110:40725
> I0407 09:12:20.584446 32307 docker.cpp:1308] Ignoring updating container 
> '250a169f-7aba-474d-a4f5-cd24ecf0e7d9' with resources passed to update is 
> identical to existing resources
> I0407 09:12:20.585093 32307 slave.cpp:2010] Sending queued task 
> '99ee7dc74861' to executor '99ee7dc74861' of framework 
> 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- at executor(1)@192.168.56.110:40725
> I0407 09:12:21.307077 32312 slave.cpp:3195] Handling status update 
> TASK_RUNNING (UUID: a7098650-cbf6-4445-8216-b5f658d2f5f4) for task 
> 99ee7dc74861 of framework 5b84aad8-dd60-40b3-84c2-93be6b7aa81c- from 
> executor(1)@192.168.56.110:40725
> I0407 

[jira] [Updated] (MESOS-5195) Docker executor: task logs lost on shutdown

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5195:
-
Fix Version/s: 1.0.0

> Docker executor: task logs lost on shutdown
> ---
>
> Key: MESOS-5195
> URL: https://issues.apache.org/jira/browse/MESOS-5195
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.27.2
> Environment: Linux 4.4.2 "Ubuntu 14.04.2 LTS"
>Reporter: Steven Schlansker
> Fix For: 1.0.0
>
>
> When you try to kill a task running in the Docker executor (in our case via 
> Singularity), the task shuts down cleanly but the last logs to standard out / 
> standard error are lost in teardown.
> For example, we run dumb-init.  With debugging on, you can see it should 
> write:
> {noformat}
> DEBUG("Forwarded signal %d to children.\n", signum);
> {noformat}
> If you attach strace to the process, you can see it clearly writes the text 
> to stderr.  But that message is lost and is never written to the sandbox 
> 'stderr' file.
> We believe the issue starts here, in Docker executor.cpp:
> {code}
>   void shutdown(ExecutorDriver* driver)
>   {
> cout << "Shutting down" << endl;
> if (run.isSome() && !killed) {
>   // The docker daemon might still be in progress starting the
>   // container, therefore we kill both the docker run process
>   // and also ask the daemon to stop the container.
>   // Making a mutable copy of the future so we can call discard.
>   Future(run.get()).discard();
>   stop = docker->stop(containerName, stopTimeout);
>   killed = true;
> }
>   }
> {code}
> Notice how the "run" future is discarded *before* the Docker daemon is told 
> to stop -- now what will discarding it do?
> {code}
> void commandDiscarded(const Subprocess& s, const string& cmd)
> {
>   VLOG(1) << "'" << cmd << "' is being discarded";
>   os::killtree(s.pid(), SIGKILL);
> }
> {code}
> Oops, just sent SIGKILL to the entire process tree...
> Another (harmless?) side effect shows up in the Docker daemon logs. The 
> daemon never gets a chance to kill the task:
> {noformat}
> ERROR Handler for DELETE 
> /v1.22/containers/mesos-f3bb39fe-8fd9-43d2-80a6-93df6a76807e-S2.0c509380-c326-4ff7-bb68-86a37b54f233
>  returned error: No such container: 
> mesos-f3bb39fe-8fd9-43d2-80a6-93df6a76807e-S2.0c509380-c326-4ff7-bb68-86a37b54f233
> {noformat}
> I suspect that the fix is to wait for 'docker->stop()' to complete before 
> discarding the 'run' future.
> Happy to provide more information if necessary.
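
The ordering concern in the report above can be illustrated with a toy model. This is only a sketch under stated assumptions: `FakeContainer`, `gracefulStop`, `forceKill`, and the two shutdown functions are hypothetical names, not Mesos or Docker APIs. The model assumes that buffered output is flushed on a graceful stop and lost on a forced kill, and that the graceful stop delivers signal 15 (SIGTERM).

```cpp
#include <string>
#include <vector>

// Toy stand-in for a containerized task whose last log lines are only
// written out when it is stopped gracefully. Hypothetical type, not Mesos.
struct FakeContainer {
  std::vector<std::string> pending;  // Lines not yet written to the sandbox.
  std::vector<std::string> sandbox;  // Lines that reached the 'stderr' file.
  bool running = true;

  // Models a graceful stop: the task logs the forwarded signal and flushes.
  // Signal 15 (SIGTERM) is an assumption made for this illustration.
  void gracefulStop() {
    if (!running) return;
    pending.push_back("Forwarded signal 15 to children.");
    sandbox.insert(sandbox.end(), pending.begin(), pending.end());
    pending.clear();
    running = false;
  }

  // Models SIGKILL on the whole process tree: nothing gets flushed.
  void forceKill() {
    if (!running) return;
    pending.clear();
    running = false;
  }
};

// Order described in the report: discard (kill) first, then request a stop.
void shutdownKillFirst(FakeContainer& c) {
  c.forceKill();
  c.gracefulStop();  // No effect: the task is already gone.
}

// Order suggested in the report: stop first, discard only afterwards.
void shutdownStopFirst(FakeContainer& c) {
  c.gracefulStop();
  c.forceKill();  // No effect: the task already exited cleanly.
}
```

With the kill-first order the model's sandbox stays empty. With the stop-first order the pending line and the signal message both reach it. The real executor is asynchronous, so an actual fix would have to wait on the stop future rather than call two functions in sequence.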





[jira] [Updated] (MESOS-5224) buffer overflow error in slave upon processing malformed UUIDs

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5224:
-
Fix Version/s: 1.0.0

> buffer overflow error in slave upon processing malformed UUIDs
> --
>
> Key: MESOS-5224
> URL: https://issues.apache.org/jira/browse/MESOS-5224
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.0
> Environment: {code}
> $ dpkg -l|grep -e mesos
> ii  mesos   0.28.0-2.0.16.ubuntu1404 
> amd64Cluster resource manager with efficient resource isolation
> $ uname -a
> Linux node-3 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 
> x86_64 x86_64 x86_64 GNU/Linux
> {code}
>Reporter: James DeFelice
>Assignee: deshna jain
>  Labels: mesosphere
> Fix For: 1.0.0
>
>
> I am implementing support for the executor HTTP v1 API in mesos-go:next, and 
> my executor can't send status updates because the slave dies upon receiving 
> them. The protobufs were generated from 0.28.1.
> From syslog:
> {code}
> Apr 17 17:53:53 node-1 mesos-slave[4462]: I0417 17:53:53.121467  4489 
> http.cpp:190] HTTP POST for /slave(1)/api/v1/executor from 10.2.0.5:51800 
> with User-Agent='Go-http-client/1.1'
> Apr 17 17:53:53 node-1 mesos-slave[4462]: *** buffer overflow detected ***: 
> /usr/sbin/mesos-slave terminated
> Apr 17 17:53:53 node-1 mesos-slave[4462]: === Backtrace: =
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7fc53064e38f]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fc5306e5c9c]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /lib/x86_64-linux-gnu/libc.so.6(+0x109b60)[0x7fc5306e4b60]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(_ZN5mesos8internallsERSoRKNS0_12StatusUpdateE+0x16a)[0x7fc531cc617a]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(_ZN5mesos8internal5slave5Slave12statusUpdateENS0_12StatusUpdateERK6OptionIN7process4UPIDEE+0xe7)[0x7fc531d71837]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(_ZNK5mesos8internal5slave5Slave4Http8executorERKN7process4http7RequestE+0xb52)[0x7fc531d302a2]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(+0xc754a3)[0x7fc531d4d4a3]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(+0x1295aa8)[0x7fc53236daa8]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(_ZN7process14ProcessManager6resumeEPNS_11ProcessBaseE+0x2d1)[0x7fc532375a71]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/local/lib/libmesos-0.28.0.so(+0x129dd77)[0x7fc532375d77]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb1bf0)[0x7fc530e85bf0]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fc5309a8182]
> Apr 17 17:53:53 node-1 mesos-slave[4462]: 
> /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc5306d547d]
> ...
> Apr 17 17:53:53 node-1 mesos-slave[4462]: *** Aborted at 1460915633 (unix 
> time) try "date -d @1460915633" if you are using GNU date ***
> Apr 17 17:53:53 node-1 mesos-slave[4462]: PC: @ 0x7fc530611cc9 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: *** SIGABRT (@0x116e) received by 
> PID 4462 (TID 0x7fc5275f5700) from PID 4462; stack trace: ***
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5309b0340 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc530611cc9 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306150d8 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc53064e394 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306e5c9c (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc5306e4b60 (unknown)
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531cc617a 
> mesos::internal::operator<<()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d71837 
> mesos::internal::slave::Slave::statusUpdate()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d302a2 
> mesos::internal::slave::Slave::Http::executor()
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc531d4d4a3 
> _ZNSt17_Function_handlerIFN7process6FutureINS0_4http8ResponseEEERKNS2_7RequestEEZN5mesos8internal5slave5Slave10initializeEvEUlS7_E19_E9_M_invokeERKSt9_Any_dataS7_
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 0x7fc53236daa8 
> _ZZN7process11ProcessBase5visitERKNS_9HttpEventEENKUlRKNS_6FutureI6OptionINS_4http14authentication20AuthenticationResultE0_clESC_
> Apr 17 17:53:53 node-1 mesos-slave[4462]: @ 

[jira] [Updated] (MESOS-5064) Remove default value for the agent `work_dir`

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5064:
-
Priority: Blocker  (was: Major)

> Remove default value for the agent `work_dir`
> -
>
> Key: MESOS-5064
> URL: https://issues.apache.org/jira/browse/MESOS-5064
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Greg Mann
>Priority: Blocker
> Fix For: 0.29.0
>
>
> Following a crash report from a user, we need to be more explicit about the 
> dangers of using {{/tmp}} as the agent {{work_dir}}. In addition, we can remove 
> the default value for the {{\-\-work_dir}} flag, forcing users to explicitly 
> set the work directory for the agent.
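
A flag without a default could behave roughly as sketched below. This is an illustration only: `WorkDirResult` and `requireWorkDir` are hypothetical names, and the code does not use the Mesos flags library. It assumes the flag is passed as `--work_dir=<path>`.

```cpp
#include <string>
#include <vector>

// Outcome of looking for the work directory flag. Hypothetical type.
struct WorkDirResult {
  bool ok = false;
  std::string workDir;
  std::string error;    // Set when the flag is missing or empty.
  std::string warning;  // Set when the chosen directory lives under /tmp.
};

// Requires an explicit --work_dir=<path> argument instead of silently
// falling back to a default location.
WorkDirResult requireWorkDir(const std::vector<std::string>& args) {
  const std::string prefix = "--work_dir=";
  WorkDirResult result;

  for (const std::string& arg : args) {
    if (arg.compare(0, prefix.size(), prefix) == 0) {
      result.workDir = arg.substr(prefix.size());  // Last occurrence wins.
    }
  }

  if (result.workDir.empty()) {
    result.error = "Missing required flag --work_dir";
    return result;
  }

  // Still allow /tmp, but be explicit about the danger.
  if (result.workDir == "/tmp" || result.workDir.compare(0, 5, "/tmp/") == 0) {
    result.warning = "work_dir is under /tmp and may be cleaned up by the system";
  }

  result.ok = true;
  return result;
}
```

The sketch warns about `/tmp` rather than rejecting it, since the issue asks for the danger to be made explicit rather than for the path to be forbidden.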





[jira] [Commented] (MESOS-5457) Create a small testing doc for the v1 Scheduler/Executor API

2016-05-27 Thread Artem Harutyunyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304882#comment-15304882
 ] 

Artem Harutyunyan commented on MESOS-5457:
--

[~anandmazumdar] can you please resolve this one after you're done with tests?

> Create a small testing doc for the v1 Scheduler/Executor API
> 
>
> Key: MESOS-5457
> URL: https://issues.apache.org/jira/browse/MESOS-5457
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>Assignee: Jay Guo
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> This is a follow-up JIRA based on the comments from MESOS-3302 around testing 
> the v1 Scheduler/Executor API. I created a small document that has the 
> details of the manual testing I did. The intent of this issue is to 
> track all the details on this ticket rather than on the epic.
> Link to the doc: 
> https://docs.google.com/document/d/1Z8_8pn-x-VYInm12_En-1oP-FxkLzpG8EgC1qQ0eDRY/edit





[jira] [Commented] (MESOS-2386) Provide full filesystem isolation as a native mesos isolator

2016-05-27 Thread Charles Allen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304879#comment-15304879
 ] 

Charles Allen commented on MESOS-2386:
--

It still isn't :(

> Provide full filesystem isolation as a native mesos isolator
> 
>
> Key: MESOS-2386
> URL: https://issues.apache.org/jira/browse/MESOS-2386
> Project: Mesos
>  Issue Type: Epic
>  Components: isolation
>Affects Versions: 0.22.1
>Reporter: Dominic Hamon
>Assignee: Ian Downes
>  Labels: mesosphere, twitter
>
> Design
> https://docs.google.com/a/twitter.com/document/d/1Fx5TS0LytV7u5MZExQS0-g-gScX2yKCKQg9UPFzhp6U/edit?usp=sharing





[jira] [Updated] (MESOS-5426) Relax version compatibility requirement for some modules

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5426:
-
Fix Version/s: (was: 0.29.0)

> Relax version compatibility requirement for some modules
> 
>
> Key: MESOS-5426
> URL: https://issues.apache.org/jira/browse/MESOS-5426
> Project: Mesos
>  Issue Type: Task
>  Components: modules
>Affects Versions: 0.29.0
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere, security
>
> Some module interfaces, such as authenticatee, have not changed for a while, 
> so we should be able to relax their version compatibility checks. This 
> needs to be done on a case-by-case basis.
> I also hope this change will provide a framework for updating the 
> version requirement for other modules as we move towards a stable module API.
> [cc: [~adam-mesos] [~tillt] ]





[jira] [Updated] (MESOS-5452) Agent modules should be initialized before all components except firewall.

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5452:
-
Fix Version/s: (was: 0.29.0)

> Agent modules should be initialized before all components except firewall.
> --
>
> Key: MESOS-5452
> URL: https://issues.apache.org/jira/browse/MESOS-5452
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
>
> On Mesos agents, Anonymous modules should not, by design, depend on any 
> other Mesos components. This implies that Anonymous modules should be 
> initialized before all other Mesos components except `Firewall`. The 
> dependency on `Firewall` exists primarily to enforce any policies that 
> secure endpoints that might be owned by the Anonymous module.





[jira] [Commented] (MESOS-5452) Agent modules should be initialized before all components except firewall.

2016-05-27 Thread Artem Harutyunyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304877#comment-15304877
 ] 

Artem Harutyunyan commented on MESOS-5452:
--

[~avin...@mesosphere.io] is there a patch for this one somewhere?

> Agent modules should be initialized before all components except firewall.
> --
>
> Key: MESOS-5452
> URL: https://issues.apache.org/jira/browse/MESOS-5452
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> On Mesos agents, Anonymous modules should not, by design, depend on any 
> other Mesos components. This implies that Anonymous modules should be 
> initialized before all other Mesos components except `Firewall`. The 
> dependency on `Firewall` exists primarily to enforce any policies that 
> secure endpoints that might be owned by the Anonymous module.





[jira] [Updated] (MESOS-5456) Master anonymous modules should initialized before any other components.

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5456:
-
Fix Version/s: (was: 0.29.0)

> Master anonymous modules should initialized before any other components.
> 
>
> Key: MESOS-5456
> URL: https://issues.apache.org/jira/browse/MESOS-5456
> Project: Mesos
>  Issue Type: Improvement
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
>
> Anonymous modules on the Master are, by design, supposed to be independent of 
> any Mesos components. However, there might be a dependency in the reverse 
> direction. For example, Anonymous modules might want to influence the 
> behavior of Mesos components (say, by generating configuration that the 
> components consume later).
> The Anonymous modules on the Master therefore need to be initialized before 
> other Mesos components.
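
The initialization order described above can be sketched as follows. This is a minimal illustration, not Mesos code: `Component` and `initializeInOrder` are hypothetical names, and each unit is reduced to a name plus an init callback.

```cpp
#include <functional>
#include <string>
#include <vector>

// A named unit with an initialization hook. Hypothetical type.
struct Component {
  std::string name;
  std::function<void()> init;
};

// Runs every anonymous module before any regular component, so that
// anything a module produces (for example generated configuration) is
// already in place when the components start. Returns the order used.
std::vector<std::string> initializeInOrder(
    const std::vector<Component>& anonymousModules,
    const std::vector<Component>& components) {
  std::vector<std::string> order;

  for (const Component& module : anonymousModules) {
    if (module.init) module.init();
    order.push_back("module:" + module.name);
  }

  for (const Component& component : components) {
    if (component.init) component.init();
    order.push_back("component:" + component.name);
  }

  return order;
}
```

On the agent, the same idea would apply with the firewall initialized ahead of the modules, as described in MESOS-5452.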





[jira] [Updated] (MESOS-5265) Update mesos-execute to support docker volume isolator.

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5265:
-
Fix Version/s: (was: 0.29.0)

> Update mesos-execute to support docker volume isolator.
> ---
>
> Key: MESOS-5265
> URL: https://issues.apache.org/jira/browse/MESOS-5265
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> mesos-execute needs to be updated to support the docker volume isolator.





[jira] [Updated] (MESOS-5267) Check dvdcli version when create the DriverClient

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5267:
-
Fix Version/s: (was: 0.29.0)

> Check dvdcli version when create the DriverClient
> -
>
> Key: MESOS-5267
> URL: https://issues.apache.org/jira/browse/MESOS-5267
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> The dvdcli version needs to be checked when creating the DriverClient, since 
> only 0.1.0 is currently supported.
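
A version check of this kind might look like the sketch below. The names `Version`, `parseVersion`, and `isSupportedDvdcliVersion` are hypothetical, and the sketch assumes the version string has already been obtained from dvdcli in plain `X.Y.Z` form; how dvdcli actually reports its version is not covered here.

```cpp
#include <sstream>
#include <string>

// A three-part version number. Hypothetical type.
struct Version {
  int majorVersion = 0;
  int minorVersion = 0;
  int patchVersion = 0;
};

// Parses "X.Y.Z" (surrounding whitespace allowed). Returns false on
// anything else, including suffixes such as "-rc1".
bool parseVersion(const std::string& text, Version* out) {
  std::istringstream in(text);
  Version v;
  char dot1 = 0;
  char dot2 = 0;

  if (!(in >> v.majorVersion >> dot1 >> v.minorVersion >> dot2 >>
        v.patchVersion)) {
    return false;
  }
  if (dot1 != '.' || dot2 != '.') return false;

  in >> std::ws;  // Skip trailing whitespace, if any.
  if (!in.eof()) return false;

  *out = v;
  return true;
}

// Only 0.1.0 is accepted, matching the constraint stated in the issue.
bool isSupportedDvdcliVersion(const std::string& text) {
  Version v;
  if (!parseVersion(text, &v)) return false;
  return v.majorVersion == 0 && v.minorVersion == 1 && v.patchVersion == 0;
}
```

Parsing into numeric parts rather than comparing raw strings makes it easier to widen the check to a range later on.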





[jira] [Updated] (MESOS-5341) Enabled docker volume support for DockerContainerizer

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5341:
-
Fix Version/s: (was: 0.29.0)

> Enabled docker volume support for DockerContainerizer
> -
>
> Key: MESOS-5341
> URL: https://issues.apache.org/jira/browse/MESOS-5341
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> When a user specifies Volume.Source, we need to prepare the `docker run` 
> command accordingly to support that. The {{DockerInfo.volume_driver}} can be 
> retired now.





[jira] [Updated] (MESOS-5296) Split Resource and Inverse offer protobufs for V1 API

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5296:
-
Priority: Blocker  (was: Major)

> Split Resource and Inverse offer protobufs for V1 API
> -
>
> Key: MESOS-5296
> URL: https://issues.apache.org/jira/browse/MESOS-5296
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joris Van Remoortere
>Assignee: Joris Van Remoortere
>Priority: Blocker
> Fix For: 0.29.0
>
>
> The protobufs for the V1 API regarding inverse offers initially re-used the 
> existing offer / rescind / accept / decline messages for regular offers.
> We should split these out to be more explicit, and provide the ability to 
> augment the messages with particulars specific to either resource or inverse 
> offers.





[jira] [Updated] (MESOS-5123) Docker task may fail if path to agent work_dir is relative.

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5123:
-
Fix Version/s: (was: 0.29.0)

> Docker task may fail if path to agent work_dir is relative. 
> 
>
> Key: MESOS-5123
> URL: https://issues.apache.org/jira/browse/MESOS-5123
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Affects Versions: 0.28.0, 0.29.0
>Reporter: Alexander Rukletsov
>Assignee: Klaus Ma
>  Labels: docker, documentation, mesosphere
>
> When a relative path is specified for the agent’s {{\-\-work_dir}} (e.g., 
> {{\-\-work_dir=w/s}}), docker complains that there are forbidden symbols in a 
> *local* volume name. Specifying an absolute path (e.g., 
> {{\-\-work_dir=/tmp}}) solves the problem.
> Docker error observed:
> {noformat}
> docker: Error response from daemon: create 
> w/s/slaves/33b8fe47-e9e0-468a-83a6-98c1e3537e59-S1/frameworks/33b8fe47-e9e0-468a-83a6-98c1e3537e59-0001/executors/docker-test/runs/3cc5cb04-d0a9-490e-94d5-d446b66c97cc:
>  volume name invalid: 
> "w/s/slaves/33b8fe47-e9e0-468a-83a6-98c1e3537e59-S1/frameworks/33b8fe47-e9e0-468a-83a6-98c1e3537e59-0001/executors/docker-test/runs/3cc5cb04-d0a9-490e-94d5-d446b66c97cc"
>  includes invalid characters for a local volume name, only 
> "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed.
> {noformat}
> First off, it is not obvious that Mesos always creates a volume for the 
> sandbox. We may want to document it.
> Second, it's hard to understand that a relative {{work_dir}} can trigger a 
> forbidden-symbols error in docker. Does it make sense to check it during agent 
> launch if the docker containerizer is enabled? Or to reject docker tasks during 
> task validation?
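
The launch-time check suggested above could look roughly like the sketch below. It is only an illustration: `validate_work_dir` and `looks_like_volume_name` are hypothetical names, not Mesos functions, and the regular expression is one reading of the character class quoted in the Docker error (a leading alphanumeric followed by any of the listed characters).

```python
import posixpath
import re

# Character class quoted in the Docker error message. Reading it as "one
# leading alphanumeric, then any number of the listed characters" is an
# assumption made for this sketch.
LOCAL_VOLUME_NAME = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_.-]*$")


def looks_like_volume_name(host_path: str) -> bool:
    """Return True if the value would pass as a local volume name."""
    return LOCAL_VOLUME_NAME.match(host_path) is not None


def validate_work_dir(work_dir: str) -> None:
    """Hypothetical agent-launch check: refuse a relative work_dir early."""
    if not posixpath.isabs(work_dir):
        raise ValueError(
            f"--work_dir={work_dir!r} is relative; the sandbox path derived "
            "from it would be rejected by docker as an invalid local volume "
            "name, so an absolute path is required")
```

With such a check, `validate_work_dir("w/s")` fails at agent start instead of at task launch, while `validate_work_dir("/tmp")` passes. A sandbox path such as `w/s/slaves/...` fails `looks_like_volume_name` because of the `/` characters, which matches the error shown in the report.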





[jira] [Commented] (MESOS-5405) Make fields in authorization::Request protobuf optional.

2016-05-27 Thread Artem Harutyunyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304858#comment-15304858
 ] 

Artem Harutyunyan commented on MESOS-5405:
--

[~adam-mesos] Can you take a look at this one please? It's marked as a blocker 
for the release.

> Make fields in authorization::Request protobuf optional.
> 
>
> Key: MESOS-5405
> URL: https://issues.apache.org/jira/browse/MESOS-5405
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Till Toenshoff
>Priority: Blocker
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> Currently the {{authorization::Request}} protobuf declares {{subject}} and 
> {{object}} as required fields. However, in the codebase we do not always set 
> them, which leaves the message in an uninitialized state, for example:
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/common/http.cpp#L603
>  * 
> https://github.com/apache/mesos/blob/0bfd6999ebb55ddd45e2c8566db17ab49bc1ffec/src/master/http.cpp#L2057
> I believe the reason we don't see issues related to this is that we never 
> send authz requests over the wire, i.e., never serialize/deserialize 
> them. However, they are still invalid protobuf messages. Moreover, some 
> external authorizers may serialize these messages.
> We can either ensure all required fields are set or make both {{subject}} and 
> {{object}} fields optional. This will also require updating the local 
> authorizer, which should properly handle the situation when these fields are 
> absent. We may also want to notify authors of external authorizers to update 
> their code accordingly.
> It looks like no deprecation is necessary, mainly because we 
> already—erroneously!—treat these fields as optional.





[jira] [Commented] (MESOS-5412) Support CNI_ARGS

2016-05-27 Thread Artem Harutyunyan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304855#comment-15304855
 ] 

Artem Harutyunyan commented on MESOS-5412:
--

Hey [~djosborne], we will be cutting a release next Monday (05.30.2016). Are 
you planning on submitting a patch for this?

> Support CNI_ARGS
> 
>
> Key: MESOS-5412
> URL: https://issues.apache.org/jira/browse/MESOS-5412
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Dan Osborne
>
> Mesos-CNI should support the 
> [CNI_ARGS|https://github.com/containernetworking/cni/blob/master/SPEC.md#parameters]
>  field.
> This would allow CNI plugins to be able to implement advanced networking 
> capabilities without needing modifications to Mesos. Current use case I am 
> facing: Allowing users to specify policy for their CNI plugin. 
> I'm proposing the following implementation: Pass a task's [NetworkInfo 
> Labels|https://github.com/apache/mesos/blob/b7e50fe8b20c96cda5546db5f2c2f47bee461edb/include/mesos/mesos.proto#L1732]
>  to the CNI plugin as CNI_ARGS. CNI args are simply key-value pairs, each 
> joined by '=' and separated by ';', e.g. "FOO=BAR;ABC=123", which could be 
> easily generated from the NetworkInfo's key-value labels.
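
A minimal sketch of the proposed label-to-CNI_ARGS conversion follows. The function name and the choice to reject keys or values containing the delimiters are assumptions made for illustration; the proposal itself only specifies the "KEY=VALUE;KEY=VALUE" form.

```python
from typing import Iterable, Tuple


def labels_to_cni_args(labels: Iterable[Tuple[str, str]]) -> str:
    """Join (key, value) label pairs into a CNI_ARGS-style string."""
    parts = []
    for key, value in labels:
        # Assumption: a key or value containing a delimiter cannot be
        # represented unambiguously, so it is rejected rather than escaped.
        if not key or "=" in key or ";" in key or ";" in value:
            raise ValueError(f"label {key!r}={value!r} cannot be encoded")
        parts.append(f"{key}={value}")
    return ";".join(parts)
```

For the labels `FOO: BAR` and `ABC: 123`, this yields the string from the example above, `FOO=BAR;ABC=123`.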





[jira] [Updated] (MESOS-5412) Support CNI_ARGS

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5412:
-
Fix Version/s: (was: 0.29.0)

> Support CNI_ARGS
> 
>
> Key: MESOS-5412
> URL: https://issues.apache.org/jira/browse/MESOS-5412
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Dan Osborne
>
> Mesos-CNI should support the 
> [CNI_ARGS|https://github.com/containernetworking/cni/blob/master/SPEC.md#parameters]
>  field.
> This would allow CNI plugins to be able to implement advanced networking 
> capabilities without needing modifications to Mesos. Current use case I am 
> facing: Allowing users to specify policy for their CNI plugin. 
> I'm proposing the following implementation: Pass a task's [NetworkInfo 
> Labels|https://github.com/apache/mesos/blob/b7e50fe8b20c96cda5546db5f2c2f47bee461edb/include/mesos/mesos.proto#L1732]
>  to the CNI plugin as CNI_ARGS. CNI args are simply key-value pairs, each 
> joined by '=' and separated by ';', e.g. "FOO=BAR;ABC=123", which could be 
> easily generated from the NetworkInfo's key-value labels.





[jira] [Updated] (MESOS-5061) process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is not connected

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5061:
-
Fix Version/s: (was: 0.29.0)

> process.cpp:1966] Failed to shutdown socket with fd x: Transport endpoint is 
> not connected
> --
>
> Key: MESOS-5061
> URL: https://issues.apache.org/jira/browse/MESOS-5061
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, modules
>Affects Versions: 0.27.0, 0.27.1, 0.28.0, 0.27.2
> Environment: Centos 7.1
>Reporter: Zogg
>
> When launching a task through Marathon and asking for an IP to be assigned to 
> the task (using Calico networking):
> {noformat}
> {
>   "id": "/calico-apps",
>   "apps": [
>     {
>       "id": "hello-world-1",
>       "cmd": "ip addr && sleep 3",
>       "cpus": 0.1,
>       "mem": 64.0,
>       "ipAddress": {
>         "groups": ["calico-k8s-network"]
>       }
>     }
>   ]
> }
> {noformat}
> The Mesos slave fails to launch the task, which stays in the STAGING state 
> forever, with this error:
> {noformat}
> [centos@rtmi-worker-001 mesos]$ tail mesos-slave.INFO
> I0325 20:35:43.420171 13495 slave.cpp:2642] Got registration for executor 
> 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework 
> 23b404e4-700a-4348-a7c0-226239348981- from executor(1)@10.0.0.10:33443
> I0325 20:35:43.422652 13495 slave.cpp:1862] Sending queued task 
> 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' to executor 
> 'calico-apps_hello-world-1.23ff72e9-f2c9-11e5-bb22-be052ff413d3' of framework 
> 23b404e4-700a-4348-a7c0-226239348981- at executor(1)@10.0.0.10:33443
> E0325 20:35:43.423159 13502 process.cpp:1966] Failed to shutdown socket with 
> fd 22: Transport endpoint is not connected
> I0325 20:35:43.423316 13501 slave.cpp:3481] executor(1)@10.0.0.10:33443 exited
> {noformat}
> However, when deploying a task without the ipAddress field, the Mesos slave 
> launches the task successfully. 
> Tested with various Mesos/Marathon/Calico versions. 





[jira] [Updated] (MESOS-5081) Posix disk isolator allows unrestricted sandbox disk usage if the executor/task doesn't specify disk resource

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5081:
-
Fix Version/s: (was: 0.29.0)

> Posix disk isolator allows unrestricted sandbox disk usage if the 
> executor/task doesn't specify disk resource
> -
>
> Key: MESOS-5081
> URL: https://issues.apache.org/jira/browse/MESOS-5081
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Yan Xu
>  Labels: mesosphere
>
> This is the case even if {{flags.enforce_container_disk_quota}} is true. When 
> a task/executor doesn't specify a disk resource, it still gets to write to 
> the container sandbox. However, the posix disk isolator doesn't limit it.
> Even though a task always has access to the sandbox, it should be able to 
> write zero bytes if it doesn't have any {{disk}} resource (it can still touch 
> files). This will likely cause tasks to fail immediately due to 
> stdout/stderr/executor download, etc., but it should be the correct behavior 
> (when {{flags.enforce_container_disk_quota}} is true).
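
The behavior argued for above can be summarized in a small sketch. The function names are hypothetical and this is not the isolator's code; it only encodes the rule "no {{disk}} resource plus enforcement means a quota of zero bytes", assuming disk resources are expressed in megabytes.

```python
from typing import Optional

BYTES_PER_MB = 1024 * 1024


def disk_quota_bytes(disk_mb: Optional[float], enforce: bool) -> Optional[int]:
    """Return the sandbox quota in bytes, or None when nothing is enforced.

    disk_mb is None when the task/executor specified no disk resource.
    """
    if not enforce:
        return None  # enforcement disabled: usage is not limited
    if disk_mb is None:
        return 0  # proposed behavior: no disk resource means zero bytes
    return int(disk_mb * BYTES_PER_MB)


def exceeds_quota(usage_bytes: int, quota: Optional[int]) -> bool:
    """True if the measured sandbox usage is over the quota."""
    return quota is not None and usage_bytes > quota
```

Under this rule an empty file (zero bytes of usage) stays within a zero quota, which matches the remark that such a task can still touch files.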





[jira] [Updated] (MESOS-5179) Enhance the error message for Duration flag.

2016-05-27 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5179:
-
Fix Version/s: (was: 0.29.0)

> Enhance the error message for Duration flag.
> 
>
> Key: MESOS-5179
> URL: https://issues.apache.org/jira/browse/MESOS-5179
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>Priority: Minor
>
> Enhance the error message for  
> https://github.com/apache/mesos/blob/4dfa91fc21f80204f5125b2e2f35c489f8fb41d8/3rdparty/libprocess/3rdparty/stout/include/stout/duration.hpp#L70
>  to list all of the supported duration units.
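
As an illustration of the kind of message being requested, here is a sketch of a duration parser whose errors enumerate the accepted units. It is not a port of `duration.hpp`: the unit table below is an assumption and should be checked against the linked source before relying on it.

```python
import re

# Assumed unit table (nanoseconds per unit); verify against duration.hpp.
UNITS_NS = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "secs": 1_000_000_000,
    "mins": 60 * 1_000_000_000,
    "hrs": 3_600 * 1_000_000_000,
    "days": 86_400 * 1_000_000_000,
    "weeks": 7 * 86_400 * 1_000_000_000,
}


def parse_duration(text: str) -> int:
    """Parse '<number><unit>' into nanoseconds, with a helpful error."""
    supported = ", ".join(UNITS_NS)
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z]+)\s*", text)
    if match is None:
        raise ValueError(
            f"Invalid duration {text!r}: expected '<number><unit>' "
            f"where <unit> is one of: {supported}")
    number, unit = match.groups()
    if unit not in UNITS_NS:
        raise ValueError(
            f"Unknown duration unit {unit!r} in {text!r}; "
            f"supported units are: {supported}")
    return int(float(number) * UNITS_NS[unit])
```

A user who passes `10seconds` then sees the full list of accepted units in the error, rather than only being told that the unit is unknown.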





[jira] [Updated] (MESOS-5182) mesos-executor (CommandScheduler) does not accept offer with revocable resources

2016-05-27 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5182:
--
Fix Version/s: (was: 0.29.0)

> mesos-executor (CommandScheduler) does not accept offer with revocable 
> resources
> 
>
> Key: MESOS-5182
> URL: https://issues.apache.org/jira/browse/MESOS-5182
> Project: Mesos
>  Issue Type: Bug
>  Components: framework
>Affects Versions: 0.28.0
>Reporter: Liqiang Lin
>  Labels: easyfix
>
> Currently mesos-executor (CommandScheduler) does not accept offers with 
> revocable resources. As a result, this example framework cannot be used to 
> verify cases that launch tasks using revocable resources.





[jira] [Assigned] (MESOS-338) Mesos 1.0

2016-05-27 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone reassigned MESOS-338:


Assignee: Vinod Kone

> Mesos 1.0
> -
>
> Key: MESOS-338
> URL: https://issues.apache.org/jira/browse/MESOS-338
> Project: Mesos
>  Issue Type: Task
>Reporter: Benjamin Mahler
>Assignee: Vinod Kone
>Priority: Critical
>  Labels: mesosphere
> Fix For: 1.0.0
>
>
> This ticket tracks the Mesos 1.0 road map. Specifically, the blockers, a.k.a. 
> roadmap items, for 1.0 are linked to this ticket.





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304771#comment-15304771
 ] 

Vinod Kone commented on MESOS-5430:
---

Thanks guys. This is awesome!

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304609#comment-15304609
 ] 

haosdent commented on MESOS-5430:
-

[~jmanalus] Thanks a lot for your reviews!

Hi [~vinodkone], let me refactor/reorganize the code and post it on Review 
Board. 

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread Jonathan Manalus (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304602#comment-15304602
 ] 

Jonathan Manalus commented on MESOS-5430:
-

[~haosd...@gmail.com] It looks perfect.

Let's ship it out.

[~vinodkone] - It's ready to become the Mesos Homepage 

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-970) Upgrade bundled leveldb to 1.18

2016-05-27 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304593#comment-15304593
 ] 

haosdent commented on MESOS-970:


[~vinodkone] [~janisz] [~chenzhiwei] [~bingli1000] I have finished the benchmark 
test cases as well. You can comment in 
https://docs.google.com/document/d/1fv2OMvH6hVm6waacOejSrTJwUuDQeXlqqPDZjBmbcKU/edit#
 so that I can rerun or add new test cases for this issue. Thank you in 
advance.

> Upgrade bundled leveldb to 1.18
> ---
>
> Key: MESOS-970
> URL: https://issues.apache.org/jira/browse/MESOS-970
> Project: Mesos
>  Issue Type: Improvement
>  Components: replicated log
>Reporter: Benjamin Mahler
>Assignee: Tomasz Janiszewski
>
> We currently bundle leveldb 1.4, and the latest version is leveldb 1.18.
> Upgrading to 1.18 could solve the problems encountered when building Mesos on 
> some non-x86 CPU architectures.





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304520#comment-15304520
 ] 

haosdent commented on MESOS-5430:
-

[~jmanalus] Nice catch, I didn't notice the tablet issues either. Could you help 
review http://blog.haosdent.me/mesos-site-demo/source/ again? Thank you in advance.

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread Jonathan Manalus (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304489#comment-15304489
 ] 

Jonathan Manalus commented on MESOS-5430:
-

Okay, a few tablet issues I didn't notice earlier.

- The menu bar on tablets is pushed to a second line http://cl.ly/111k3g2s0I2d
- On tablets, can we display the rows of points instead of the mobile view? 
http://cl.ly/1c0Z0u210j2M

Otherwise everything else is perfect.

Thanks again for building the new landing page [~haosd...@gmail.com]


> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304480#comment-15304480
 ] 

haosdent commented on MESOS-5430:
-

Sure! Just updated it in http://blog.haosdent.me/mesos-site-demo/source/ as 
well.

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Updated] (MESOS-5472) Hadoop-free S3 fetcher

2016-05-27 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-5472:
-

We will consider adding an {{S3}} plugin once we finish moving the 
{{mesos-fetcher}} to the URI fetcher (MESOS-3918).

> Hadoop-free S3 fetcher
> --
>
> Key: MESOS-5472
> URL: https://issues.apache.org/jira/browse/MESOS-5472
> Project: Mesos
>  Issue Type: Wish
>  Components: fetcher
>Reporter: Marc Villacorta
>Priority: Minor
>
> My mesos agents are running on systems without Hadoop.
> I would like to fetch _S3_ uris into my sandboxes.
> How about using the _'awscli'_?





[jira] [Commented] (MESOS-5430) Design the improvement of the home page of mesos.apache.org

2016-05-27 Thread Jonathan Manalus (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304374#comment-15304374
 ] 

Jonathan Manalus commented on MESOS-5430:
-

[~haosd...@gmail.com]

Last issue I was able to find, and then I believe we can ship the page.

Can you bump the font weight up to 200 for the sub-header, on mobile only? 
http://cl.ly/37452u3G2B3g

> Design the improvement of the home page of mesos.apache.org
> ---
>
> Key: MESOS-5430
> URL: https://issues.apache.org/jira/browse/MESOS-5430
> Project: Mesos
>  Issue Type: Improvement
>  Components: project website
>Reporter: Vinod Kone
>Assignee: Jonathan Manalus
> Attachments: page_1.png, page_2.png
>
>
> The idea is to come up with a minimal improvement for the design of the home 
> page of mesos.apache.org.
> Proposed Redesign: https://invis.io/CV7DZF1JW#/159898819_Mesos-apache-org





[jira] [Commented] (MESOS-5468) Add logic in long-lived-framework to handle network partitions.

2016-05-27 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304360#comment-15304360
 ] 

Jay Guo commented on MESOS-5468:


What is your iptables command? I can consistently reproduce the problem on the 
latest build.

* How long does it take for the master to disconnect the framework after the 
network partition ({{iptables command issued}})?

* Do the TCP sockets go into the FIN_WAIT_1 state?

I think the point is: how does the master notice a network partition? IIUC, it 
relies on the TCP socket timeout, which is typically 13-30 min on a Linux box 
(manpage of tcp), and that is the duration I experienced between disconnect and 
give-up. At this point, the TCP socket informs the user (mesos-master) of the 
broken link while remaining ESTABLISHED. It is then up to the app to handle this 
failure, and I suspect that libprocess does not properly close the socket here. 
I'll need to do some more investigation.

I see other users experiencing the {{Transport endpoint is not connected}} 
error, and I have personally seen it many times as well. So I think we should 
definitely take a serious look into that.

Another question: why didn't we use a mature HTTP library from the very 
beginning, instead of having our own implementation?

Cheers,
/J

> Add logic in long-lived-framework to handle network partitions.
> ---
>
> Key: MESOS-5468
> URL: https://issues.apache.org/jira/browse/MESOS-5468
> Project: Mesos
>  Issue Type: Task
>  Components: framework, master
>Reporter: Jay Guo
>
> Currently long-lived-framework does not handle network partitions, i.e., it 
> does not explicitly try to {{reconnect}} with the master upon not receiving 
> {{HEARTBEAT}} events for a prolonged amount of time. If the master 
> disconnects a framework without the framework being aware of it (one-way 
> partition), the framework should explicitly issue a {{reconnect}} request via 
> the scheduler library after a certain period of time.
> *On the other hand*, should we close the TCP socket on the master side when 
> tearing down a framework? Currently the TCP socket is left alive even after 
> the framework has been deactivated. This results in the framework sending an 
> invalid {{Call}} to the master and in re-detection.
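
The heartbeat-based reconnect described above could be sketched as a small watchdog. This is an assumption-laden illustration, not the scheduler library's API: the class name, the injected clock values, and the "five missed heartbeats" threshold are all hypothetical.

```python
class HeartbeatWatchdog:
    """Tracks HEARTBEAT arrivals and decides when to force a reconnect."""

    def __init__(self, interval_secs: float, now: float,
                 missed_allowed: int = 5) -> None:
        # Assumption: tolerate a few missed heartbeats before giving up.
        self.timeout_secs = interval_secs * missed_allowed
        self.last_seen = now  # subscription time counts as the first beat

    def on_heartbeat(self, now: float) -> None:
        """Record that a HEARTBEAT event arrived at time `now`."""
        self.last_seen = now

    def should_reconnect(self, now: float) -> bool:
        """True once no heartbeat has been seen for longer than the timeout."""
        return now - self.last_seen > self.timeout_secs
```

A framework would call `on_heartbeat` from its event handler and poll `should_reconnect` from a timer; when it returns true, the framework issues the {{reconnect}} request instead of waiting for the TCP layer to report the broken link. Passing the clock value in explicitly keeps the logic testable without sleeping.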





[jira] [Commented] (MESOS-5425) Consider using IntervalSet for Port range resource math

2016-05-27 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304353#comment-15304353
 ] 

Joseph Wu commented on MESOS-5425:
--

[~yanyanhu], can you post your existing work on Reviewboard?  The performance 
improvements look promising and I'd be happy to help review.  

> Consider using IntervalSet for Port range resource math
> ---
>
> Key: MESOS-5425
> URL: https://issues.apache.org/jira/browse/MESOS-5425
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Joseph Wu
>  Labels: mesosphere
>
> Follow-up JIRA for comments raised in MESOS-3051 (see comments there).
> We should consider utilizing 
> [{{IntervalSet}}|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/3rdparty/stout/include/stout/interval.hpp]
>  in [Port range resource 
> math|https://github.com/apache/mesos/blob/a0b798d2fac39445ce0545cfaf05a682cd393abe/src/common/values.cpp#L143].
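
To make the idea concrete, here is a sketch of interval-based arithmetic on port ranges. It only illustrates the technique (sorted, merged, inclusive intervals); it is not stout's {{IntervalSet}} API, and the function names are made up for this example.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (begin, end), like a port range


def normalize(ranges: List[Range]) -> List[Range]:
    """Sort ranges and merge those that overlap or are adjacent."""
    merged: List[Range] = []
    for begin, end in sorted(ranges):
        if merged and begin <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return merged


def subtract(left: List[Range], right: List[Range]) -> List[Range]:
    """Remove every port covered by `right` from `left`."""
    result: List[Range] = []
    holes = normalize(right)
    for begin, end in normalize(left):
        cursor = begin  # first port not yet emitted or removed
        for hole_begin, hole_end in holes:
            if hole_end < cursor:
                continue  # hole lies entirely before the remaining part
            if hole_begin > end:
                break  # hole lies entirely after this range
            if hole_begin > cursor:
                result.append((cursor, hole_begin - 1))
            cursor = hole_end + 1
            if cursor > end:
                break
        if cursor <= end:
            result.append((cursor, end))
    return result
```

For example, subtracting port 31500 and the range 31990-32100 from 31000-32000 leaves 31000-31499 and 31501-31989, computed per interval rather than per port.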





[jira] [Created] (MESOS-5472) Hadoop-free S3 fetcher

2016-05-27 Thread Marc Villacorta (JIRA)
Marc Villacorta created MESOS-5472:
--

 Summary: Hadoop-free S3 fetcher
 Key: MESOS-5472
 URL: https://issues.apache.org/jira/browse/MESOS-5472
 Project: Mesos
  Issue Type: Wish
  Components: fetcher
Reporter: Marc Villacorta
Priority: Minor


My mesos agents are running on systems without Hadoop.
I would like to fetch _S3_ uris into my sandboxes.
How about using the _'awscli'_?
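
For illustration, an awscli-based fetch could boil down to building an `aws s3 cp` command for each URI. The helper below is a hypothetical sketch and not part of the Mesos fetcher; it assumes the `aws` CLI is installed and has credentials on the agent, and it only constructs the argument list rather than running it.

```python
import posixpath
from typing import List
from urllib.parse import urlparse


def s3_fetch_command(uri: str, sandbox_dir: str) -> List[str]:
    """Build the `aws s3 cp` argv that copies an S3 object into the sandbox."""
    parsed = urlparse(uri)
    filename = posixpath.basename(parsed.path)
    if parsed.scheme != "s3" or not parsed.netloc or not filename:
        raise ValueError(f"not a fetchable S3 object URI: {uri!r}")
    destination = posixpath.join(sandbox_dir, filename)
    return ["aws", "s3", "cp", uri, destination]
```

The resulting list can be handed to `subprocess.run(..., check=True)` by whatever component performs the fetch.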





[jira] [Commented] (MESOS-5468) Add logic in long-lived-framework to handle network partitions.

2016-05-27 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304189#comment-15304189
 ] 

Anand Mazumdar commented on MESOS-5468:
---

If for some reason a framework gets disconnected from the master, the master 
gives it {{failover_timeout}} to re-register before removing it completely. 
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L231

We currently don't specify a timeout value for the example long lived framework, 
so it defaults to 0ns, i.e., the framework is removed as soon as it 
disconnects.

{noformat}
I0527 05:48:45.583395 13101 master.cpp:1396] Giving framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++)) 0ns to 
failover
{noformat}

I wasn't able to reproduce the socket closure issue on my end, i.e., the socket 
is closed as soon as the master disconnects the long-lived-framework. 

Can you look at the reproduction steps on the JIRA and let me know if any steps 
are missing?

{noformat}
$  ~  netstat -tpn | grep -i 5050
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.1.1:5050          127.0.0.1:45226         ESTABLISHED 32402/lt-mesos-mast
tcp        0      0 127.0.0.1:45224         127.0.1.1:5050          ESTABLISHED 961/lt-long-lived-f
tcp        0      0 127.0.0.1:45226         127.0.1.1:5050          ESTABLISHED 961/lt-long-lived-f
tcp        0      0 127.0.1.1:5050          127.0.0.1:45224         ESTABLISHED 32402/lt-mesos-mast
{noformat}

After following the steps on the JIRA, i.e., once the long-running framework 
gets disconnected:

{noformat}
$ ~  netstat -tpn | grep -i 5050
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:45224         127.0.1.1:5050          TIME_WAIT   -
tcp        0      0 127.0.0.1:45226         127.0.1.1:5050          TIME_WAIT   -
{noformat}


> Add logic in long-lived-framework to handle network partitions.
> ---
>
> Key: MESOS-5468
> URL: https://issues.apache.org/jira/browse/MESOS-5468
> Project: Mesos
>  Issue Type: Task
>  Components: framework, master
>Reporter: Jay Guo
>
> Currently long-lived-framework does not handle network partitions, i.e., it 
> does not explicitly try to {{reconnect}} with the master upon not receiving 
> {{HEARTBEAT}} events for a prolonged amount of time. If the master 
> disconnects a framework without the framework being aware of it (one-way 
> partition), the framework should explicitly issue a {{reconnect}} request via 
> the scheduler library after a certain period of time.
> *On the other hand*, should we close the TCP socket on the master side when 
> tearing down a framework? Currently the TCP socket is left alive even after 
> the framework has been deactivated. This results in the framework sending an 
> invalid {{Call}} to the master and in re-detection.





[jira] [Commented] (MESOS-2043) framework auth fail with timeout error and never get authenticated

2016-05-27 Thread Kevin Cox (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304120#comment-15304120
 ] 

Kevin Cox commented on MESOS-2043:
--

A patch release would be great because it really sucks to be afraid to upgrade 
my cluster.

> framework auth fail with timeout error and never get authenticated
> --
>
> Key: MESOS-2043
> URL: https://issues.apache.org/jira/browse/MESOS-2043
> Project: Mesos
>  Issue Type: Bug
>  Components: master, scheduler driver, security, slave
>Affects Versions: 0.21.0
>Reporter: Bhuvan Arumugam
>Priority: Critical
>  Labels: mesosphere, security
> Attachments: aurora-scheduler.20141104-1606-1706.log, master.log, 
> mesos-master.20141104-1606-1706.log, slave.log
>
>
> I'm facing this issue in master as of 
> https://github.com/apache/mesos/commit/74ea59e144d131814c66972fb0cc14784d3503d4
> As [~adam-mesos] mentioned in IRC, this sounds similar to MESOS-1866. I'm 
> running 1 master and 1 scheduler (aurora). Framework authentication fails 
> due to a timeout.
> Error on the Mesos master:
> {code}
> I1104 19:37:17.741449  8329 master.cpp:3874] Authenticating 
> scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083
> I1104 19:37:17.741585  8329 master.cpp:3885] Using default CRAM-MD5 
> authenticator
> I1104 19:37:17.742106  8336 authenticator.hpp:169] Creating new server SASL 
> connection
> W1104 19:37:22.742959  8329 master.cpp:3953] Authentication timed out
> W1104 19:37:22.743548  8329 master.cpp:3930] Failed to authenticate 
> scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083: 
> Authentication discarded
> {code}
> scheduler error:
> {code}
> I1104 19:38:57.885486 49012 sched.cpp:283] Authenticating with master 
> master@MASTER_IP:PORT
> I1104 19:38:57.885928 49002 authenticatee.hpp:133] Creating new client SASL 
> connection
> I1104 19:38:57.890581 49007 authenticatee.hpp:224] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out
> I1104 19:39:02.891850 49018 sched.cpp:338] Failed to authenticate with master 
> master@MASTER_IP:PORT: Authentication discarded
> {code}
> It looks like two instances, {{scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94}} and 
> {{scheduler-d2d4437b-d375-4467-a583-362152fe065a}}, of the same framework are 
> trying to authenticate and failing.
> {code}
> W1104 19:36:30.769420  8319 master.cpp:3930] Failed to authenticate 
> scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94@SCHEDULER_IP:8083: Failed to 
> communicate with authenticatee
> I1104 19:36:42.701441  8328 master.cpp:3860] Queuing up authentication 
> request from scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 
> because authentication is still in progress
> {code}
> Restarting the master and scheduler didn't fix it. 
> This particular issue happens with 1 master and 1 scheduler after MESOS-1866 
> is fixed.





[jira] [Issue Comment Deleted] (MESOS-5064) Remove default value for the agent `work_dir`

2016-05-27 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-5064:
-
Comment: was deleted

(was: Reviews here:
https://reviews.apache.org/r/47078/
https://reviews.apache.org/r/46003/
https://reviews.apache.org/r/46004/
https://reviews.apache.org/r/45562/)

> Remove default value for the agent `work_dir`
> -
>
> Key: MESOS-5064
> URL: https://issues.apache.org/jira/browse/MESOS-5064
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Greg Mann
> Fix For: 0.29.0
>
>
> Following a crash report from the user we need to be more explicit about the 
> dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove 
> the default value for the {{\-\-work_dir}} flag, forcing users to explicitly 
> set the work directory for the agent.





[jira] [Commented] (MESOS-5064) Remove default value for the agent `work_dir`

2016-05-27 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303975#comment-15303975
 ] 

Greg Mann commented on MESOS-5064:
--

Reviews here:
https://reviews.apache.org/r/47078/
https://reviews.apache.org/r/46003/
https://reviews.apache.org/r/46004/
https://reviews.apache.org/r/45562/
https://reviews.apache.org/r/47952/

> Remove default value for the agent `work_dir`
> -
>
> Key: MESOS-5064
> URL: https://issues.apache.org/jira/browse/MESOS-5064
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Greg Mann
> Fix For: 0.29.0
>
>
> Following a crash report from the user we need to be more explicit about the 
> dangers of using {{/tmp}} as agent {{work_dir}}. In addition, we can remove 
> the default value for the {{\-\-work_dir}} flag, forcing users to explicitly 
> set the work directory for the agent.





[jira] [Commented] (MESOS-5153) Sandboxes contents should be protected from unauthorized users

2016-05-27 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303836#comment-15303836
 ] 

Adam B commented on MESOS-5153:
---

Still reviewing: ACCESS_MESOS_LOGS
  https://reviews.apache.org/r/47921/

In addition, we'll need to update the files endpoint help (and autogenerated 
endpoint docs), and perhaps authorization.md.

> Sandboxes contents should be protected from unauthorized users
> --
>
> Key: MESOS-5153
> URL: https://issues.apache.org/jira/browse/MESOS-5153
> Project: Mesos
>  Issue Type: Bug
>  Components: security, slave
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> MESOS-4956 introduced authentication support for the sandboxes. However, 
> authentication can only go as far as to tell whether a user is known to 
> Mesos or not. An additional step is necessary to verify whether the 
> known user is allowed to execute the requested operation on the sandbox 
> (browse, read, download, debug).





[jira] [Commented] (MESOS-5153) Sandboxes contents should be protected from unauthorized users

2016-05-27 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303834#comment-15303834
 ] 

Adam B commented on MESOS-5153:
---

commit bcdc1d151a0423593ea39411519165a1b6e900ff
Author: Alexander Rojas 
Date:   Fri May 27 01:00:09 2016 -0700

Enabled authorization for sandboxes.

Enables authorization of the sandboxes using the callback function
parameter of `Files::attach()`.

It also adds relevant ACLs and support on the authorizer interface.

Review: https://reviews.apache.org/r/47795/

commit 62150e441540c93e3f7dcbaed98679bf81c14c94
Author: Alexander Rojas 
Date:   Fri May 27 00:49:20 2016 -0700

Added authorization support for mesos::internal::Files.

Adds an optional parameter to the `mesos::internal::Files::attach()`
method. The type of this parameter is a callable object which returns
a future to a boolean and takes as parameter an optional string
representing a principal name.

The parameter is called, if set, whenever one of the routed endpoints
of the `Files` object is accessed through HTTP. If the callable object
returns a false boolean, then processing of the request is aborted
and a `403 Forbidden` response is returned.

Review: https://reviews.apache.org/r/47794/
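
A minimal sketch of the callback contract described in the commit message above, under stated assumptions. `Authorizer`, `Response`, and `handleRequest` are hypothetical names and not part of `mesos::internal::Files`. To keep the sketch dependency-free, the callback returns a plain `bool` where the real one returns a future to a boolean, and a null pointer stands in for an absent (optional) principal.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for an HTTP response; not a libprocess type.
struct Response
{
  int code;
  std::string status;
};

// Hypothetical callback type. A null `principal` means that no principal
// is known for the request. The real callback returns a future to a
// boolean; a plain bool is used here for simplicity.
using Authorizer = std::function<bool(const std::string* principal)>;

// Runs the optional authorization callback before serving a request.
// If the callback is set and returns false, processing is aborted with
// `403 Forbidden`; if it is unset, the request is served as before.
Response handleRequest(
    const Authorizer& authorized,
    const std::string* principal)
{
  if (authorized && !authorized(principal)) {
    return Response{403, "Forbidden"};
  }

  return Response{200, "OK"};
}
```

Keeping the callback optional means that callers which never pass one see no change in behavior, which matches the "if set" wording of the commit message.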


> Sandboxes contents should be protected from unauthorized users
> --
>
> Key: MESOS-5153
> URL: https://issues.apache.org/jira/browse/MESOS-5153
> Project: Mesos
>  Issue Type: Bug
>  Components: security, slave
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> MESOS-4956 introduced authentication support for the sandboxes. However, 
> authentication can only go as far as to tell whether a user is known to 
> Mesos or not. An additional step is necessary to verify whether the 
> known user is allowed to execute the requested operation on the sandbox 
> (browse, read, download, debug).





[jira] [Updated] (MESOS-5384) Improve error message for missing resources file

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5384:
--
Labels: easyfix newbie  (was: easyfix)

> Improve error message for missing resources file
> 
>
> Key: MESOS-5384
> URL: https://issues.apache.org/jira/browse/MESOS-5384
> Project: Mesos
>  Issue Type: Bug
>  Components: general
>Affects Versions: 0.28.1
> Environment: Centos 7
>Reporter: John Yost
>Priority: Minor
>  Labels: easyfix, newbie
>
> Attempting to specify resources file via 
> --resources=/etc/mesos-slave/small-slave-config.json threw the following 
> error:
> Failed to determine slave resources: Bad value for resources, missing or 
> extra ':' in /etc/mesos-slave/small-slave-config.json
> I confirmed I had valid JSON: 
> [
>   {
>     "name": "cpus",
>     "type": "SCALAR",
>     "scalar": {
>       "value": 0.5
>     }
>   },
>   {
>     "name": "mem",
>     "type": "SCALAR",
>     "scalar": {
>       "value": 512
>     }
>   }
> ]
> In actuality, I misread the docs regarding the file pattern. Once I changed to 
> resources=file:///etc/mesos-slave/small-slave-config.json the mesos slave 
> started up fine. Just need a missing file check and corresponding error 
> message to fix this.
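
One possible shape for the friendlier error requested above, as a sketch only. `resourcesHint` is a hypothetical helper and not existing Mesos code; it assumes that a value starting with `/` and lacking the `file://` prefix was meant to be a file path.

```cpp
#include <string>

// Hypothetical helper: returns a hint when the value passed to
// `--resources` looks like an absolute path without the `file://`
// prefix, and an empty string otherwise.
std::string resourcesHint(const std::string& value)
{
  const std::string prefix = "file://";

  // Already a file URI: nothing to suggest.
  if (value.compare(0, prefix.size(), prefix) == 0) {
    return "";
  }

  // An absolute path was most likely meant to be read as a file.
  if (!value.empty() && value[0] == '/') {
    return "'" + value + "' looks like a file path; did you mean '" +
           prefix + value + "'?";
  }

  // Anything else is treated as an inline resources string.
  return "";
}
```

The hint could be appended to the existing "Bad value for resources" message, so that the parse error and the likely cause appear together.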





[jira] [Created] (MESOS-5471) Enable `Option` to handle string literals gracefully

2016-05-27 Thread Greg Mann (JIRA)
Greg Mann created MESOS-5471:


 Summary: Enable `Option` to handle string literals gracefully
 Key: MESOS-5471
 URL: https://issues.apache.org/jira/browse/MESOS-5471
 Project: Mesos
  Issue Type: Improvement
Reporter: Greg Mann


In {{FlagsBase::add}}, MESOS-5064 begins making use of template function 
parameters like {{T2*}} for the default flag value rather than {{Option&}}. 
This is because in some places in the code base, we pass string literals for 
this argument. If an {{Option}} type is used, the compiler infers a {{char 
[x]}} type for {{T2}}, which breaks {{Option::getOrElse}}, which attempts to 
return that same type, since returning arrays is disallowed.

To fix this, we could employ {{std::decay}}, which would convert a return type 
of {{char [x]}} into {{const char *}}.
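
A small sketch of the {{std::decay}} idea, not the actual stout code. `fallbackValue` is a hypothetical function template standing in for the return path of {{Option::getOrElse}}; it takes a forwarding reference, so a string literal is deduced as a reference to `const char[N]` and decays to `const char*`.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// A string literal such as "abc" has the array type `const char[4]`.
// A function cannot return an array by value, but std::decay maps the
// (reference to an) array type to a pointer to its element type.
static_assert(
    std::is_same<std::decay<const char(&)[4]>::type, const char*>::value,
    "a decayed string literal type is 'const char*'");

// Hypothetical stand-in for the fallback branch of `getOrElse`: the
// return type is the decayed argument type, so passing a string literal
// compiles and yields a `const char*`.
template <typename U>
typename std::decay<U>::type fallbackValue(U&& fallback)
{
  return std::forward<U>(fallback);
}
```

For non-array arguments such as `std::string`, the decayed type is the plain value type, so the function returns a copy (or a moved value for rvalues).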





[jira] [Updated] (MESOS-5384) Improve error message for missing resources file

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5384:
--
Fix Version/s: (was: 0.29.0)

> Improve error message for missing resources file
> 
>
> Key: MESOS-5384
> URL: https://issues.apache.org/jira/browse/MESOS-5384
> Project: Mesos
>  Issue Type: Bug
>  Components: general
>Affects Versions: 0.28.1
> Environment: Centos 7
>Reporter: John Yost
>Priority: Minor
>  Labels: easyfix
>
> Attempting to specify resources file via 
> --resources=/etc/mesos-slave/small-slave-config.json threw the following 
> error:
> Failed to determine slave resources: Bad value for resources, missing or 
> extra ':' in /etc/mesos-slave/small-slave-config.json
> I confirmed I had valid JSON: 
> [
>   {
>     "name": "cpus",
>     "type": "SCALAR",
>     "scalar": {
>       "value": 0.5
>     }
>   },
>   {
>     "name": "mem",
>     "type": "SCALAR",
>     "scalar": {
>       "value": 512
>     }
>   }
> ]
> In actuality, I misread the docs regarding the file pattern. Once I changed to 
> resources=file:///etc/mesos-slave/small-slave-config.json the mesos slave 
> started up fine. Just need a missing file check and corresponding error 
> message to fix this.





[jira] [Commented] (MESOS-5197) Log executor commands w/o verbose logs enabled

2016-05-27 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303766#comment-15303766
 ] 

haosdent commented on MESOS-5197:
-

I think that for run/rm/create/pull, it is still useful for the Docker containerizer.

> Log executor commands w/o verbose logs enabled
> --
>
> Key: MESOS-5197
> URL: https://issues.apache.org/jira/browse/MESOS-5197
> Project: Mesos
>  Issue Type: Task
>Reporter: Michael Gummelt
>Assignee: Yong Tang
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> To debug executors, it's often necessary to know the command that ran the 
> executor.  For example, when Spark executors fail, I'd like to know the 
> command used to invoke the executor (Spark uses the command executor in a 
> docker container).  Currently, it's only output if GLOG_v is enabled, but I 
> don't think this should be a "verbose" output.  It's a common debugging need.
> https://github.com/apache/mesos/blob/2e76199a3dd977152110fbb474928873f31f7213/src/docker/docker.cpp#L677
> cc [~kaysoky]





[jira] [Commented] (MESOS-5197) Log executor commands w/o verbose logs enabled

2016-05-27 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303752#comment-15303752
 ] 

Guangya Liu commented on MESOS-5197:


I posted a patch here https://reviews.apache.org/r/37989/ for MESOS-5348. The 
solution is to set {{GLOG_v=1}} if the agent starts without a GLOG_v configuration; 
this makes sure the docker-command-executor always logs messages at 
{{GLOG_v=1}} to the sandbox.

> Log executor commands w/o verbose logs enabled
> --
>
> Key: MESOS-5197
> URL: https://issues.apache.org/jira/browse/MESOS-5197
> Project: Mesos
>  Issue Type: Task
>Reporter: Michael Gummelt
>Assignee: Yong Tang
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> To debug executors, it's often necessary to know the command that ran the 
> executor.  For example, when Spark executors fail, I'd like to know the 
> command used to invoke the executor (Spark uses the command executor in a 
> docker container).  Currently, it's only output if GLOG_v is enabled, but I 
> don't think this should be a "verbose" output.  It's a common debugging need.
> https://github.com/apache/mesos/blob/2e76199a3dd977152110fbb474928873f31f7213/src/docker/docker.cpp#L677
> cc [~kaysoky]





[jira] [Updated] (MESOS-4643) PortMappingIsolatorTest fail when no namespaces are set.

2016-05-27 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-4643:

Priority: Major  (was: Minor)

> PortMappingIsolatorTest fail when no namespaces are set.
> 
>
> Key: MESOS-4643
> URL: https://issues.apache.org/jira/browse/MESOS-4643
> Project: Mesos
>  Issue Type: Bug
> Environment: Linux Kernel 3.19.0-49-generic,
> libnl-3.2.27
>Reporter: Till Toenshoff
>
> Currently our network isolator tests fail with the following output on an 
> Ubuntu 14.04 VM.
> {noformat}
> [02:10:15][Step 8/8] [ RUN  ] 
> PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP
> [02:10:15][Step 8/8] 
> ../../src/tests/containerizer/port_mapping_tests.cpp:164: Failure
> [02:10:15][Step 8/8] entries: Failed to opendir '/var/run/netns': No such 
> file or directory
> [02:10:15][Step 8/8] 
> ../../src/tests/containerizer/port_mapping_tests.cpp:164: Failure
> [02:10:15][Step 8/8] entries: Failed to opendir '/var/run/netns': No such 
> file or directory
> [02:10:15][Step 8/8] [  FAILED  ] 
> PortMappingIsolatorTest.ROOT_NC_ContainerToContainerTCP (4 ms)
> {noformat}
> The machine has no network namespaces set, hence {{/var/run/netns}} does not 
> exist. 
> We should help users understand this prerequisite, or maybe even set these 
> things up in a fixture.
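
A sketch of what such a prerequisite check could look like, assuming a POSIX system. `isDirectory` and `netnsPrerequisiteError` are hypothetical helpers and not part of the Mesos test suite; a fixture could call the latter and skip or fail with the returned message.

```cpp
#include <sys/stat.h>

#include <string>

// Returns true if `path` exists and is a directory.
bool isDirectory(const std::string& path)
{
  struct stat s;
  return ::stat(path.c_str(), &s) == 0 && S_ISDIR(s.st_mode);
}

// Hypothetical prerequisite check: returns an empty string when the
// network namespace directory exists, and an explanatory message
// otherwise.
std::string netnsPrerequisiteError(const std::string& netnsDir)
{
  if (isDirectory(netnsDir)) {
    return "";
  }

  return "Network namespace directory '" + netnsDir +
         "' does not exist; the port mapping isolator tests require it";
}
```

Reporting the missing directory up front would replace the repeated "Failed to opendir" failures with a single message that names the prerequisite.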





[jira] [Updated] (MESOS-4843) Authorize Master Operator Endpoints

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-4843:
--
Shepherd: Adam B

> Authorize Master Operator Endpoints
> ---
>
> Key: MESOS-4843
> URL: https://issues.apache.org/jira/browse/MESOS-4843
> Project: Mesos
>  Issue Type: Epic
>  Components: master, security
>Reporter: Adam B
>Assignee: Joerg Schad
>  Labels: authorization, mesosphere, security
> Fix For: 0.29.0
>
>
> In a secure, multi-tenant cluster, the operator doesn't want to give every 
> user access to read or modify cluster state/config, nor to perform 
> administrative actions. As such, we need to make sure that all such endpoints 
> are authenticated and authorized.
> We've already added authorization to some operator endpoints (/teardown, 
> /reserve, etc.), but many remain unsecured.
> - /roles, /observe, /registrar, /state-summary
> - /maintenance, /machine,
> - /logging, /profiler, /metrics, /flags, /system/stats.json
> - Leave open? /redirect, /health, /version
> See http://mesos.apache.org/documentation/latest/endpoints/ for a more 
> complete list. Some endpoints (e.g. state.json) will need a finer-grained 
> authz.





[jira] [Updated] (MESOS-4843) Authorize Master Operator Endpoints

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-4843:
--
Fix Version/s: 0.29.0

> Authorize Master Operator Endpoints
> ---
>
> Key: MESOS-4843
> URL: https://issues.apache.org/jira/browse/MESOS-4843
> Project: Mesos
>  Issue Type: Epic
>  Components: master, security
>Reporter: Adam B
>Assignee: Joerg Schad
>  Labels: authorization, mesosphere, security
> Fix For: 0.29.0
>
>
> In a secure, multi-tenant cluster, the operator doesn't want to give every 
> user access to read or modify cluster state/config, nor to perform 
> administrative actions. As such, we need to make sure that all such endpoints 
> are authenticated and authorized.
> We've already added authorization to some operator endpoints (/teardown, 
> /reserve, etc.), but many remain unsecured.
> - /roles, /observe, /registrar, /state-summary
> - /maintenance, /machine,
> - /logging, /profiler, /metrics, /flags, /system/stats.json
> - Leave open? /redirect, /health, /version
> See http://mesos.apache.org/documentation/latest/endpoints/ for a more 
> complete list. Some endpoints (e.g. state.json) will need a finer-grained 
> authz.





[jira] [Comment Edited] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-27 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303652#comment-15303652
 ] 

Adam B edited comment on MESOS-5379 at 5/27/16 6:51 AM:


Untargeting from 0.29, since we don't have time/assignee to work on it. Also 
downgraded from a Blocker, but I doubt it's even Critical. [~bbannier], can you 
explain why this is a "Blocker"? Or I guess [~alexr] upgraded it..


was (Author: adam-mesos):
Untargeting from 0.29, since we don't have time/assignee to work on it. Also 
downgraded from a Blocker, but I doubt it's even Critical. [~bbannier], can you 
explain why this is a "Blocker"?

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Critical
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently also use the function 
> {{AUTHENTICATION}}, which injects the following into the help string,
> {code}
> This endpoint requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through Mesos 
> masters/agents, but possibly not if they are exposed through other executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoint supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Commented] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-27 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303652#comment-15303652
 ] 

Adam B commented on MESOS-5379:
---

Untargeting from 0.29, since we don't have time/assignee to work on it. Also 
downgraded from a Blocker, but I doubt it's even Critical. [~bbannier], can you 
explain why this is a "Blocker"?

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Critical
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently also use the function 
> {{AUTHENTICATION}}, which injects the following into the help string,
> {code}
> This endpoint requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through Mesos 
> masters/agents, but possibly not if they are exposed through other executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoint supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5379:
--
Priority: Critical  (was: Blocker)

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Critical
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently also use the function 
> {{AUTHENTICATION}}, which injects the following into the help string,
> {code}
> This endpoint requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through Mesos 
> masters/agents, but possibly not if they are exposed through other executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoint supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5379:
--
Fix Version/s: (was: 0.29.0)

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently also use the function 
> {{AUTHENTICATION}}, which injects the following into the help string,
> {code}
> This endpoint requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through Mesos 
> masters/agents, but possibly not if they are exposed through other executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoint supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Commented] (MESOS-5357) Add a function to extract HTTP endpoints from an URL.

2016-05-27 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303645#comment-15303645
 ] 

Adam B commented on MESOS-5357:
---

Untargeting this from 0.29 since no progress has been made.
[~nfnt], did you still want to work on this for the next release? If not, 
please unassign yourself.

> Add a function to extract HTTP endpoints from an URL.
> -
>
> Key: MESOS-5357
> URL: https://issues.apache.org/jira/browse/MESOS-5357
> Project: Mesos
>  Issue Type: Improvement
>  Components: libprocess
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: libprocess, mesosphere, newbie, security
>
> HTTP endpoints in Mesos receive a {{process::http::Request}} that includes a 
> {{process::http::URL}}. The {{path}} member of the URL instance is of the 
> form {{/master/endpoint}} or {{/slave\(n\)/endpoint}}. We want to implement 
> authorization of endpoints and need to extract the endpoint from that path; 
> the extraction function should be accessible to masters as well as agents.
> This can be done by adding a method to {{process::http::URL}} that implements 
> the extraction logic.
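
A minimal sketch of the extraction logic, assuming the path always has the form `/<process id>/<endpoint>` described above. `extractEndpoint` is a hypothetical free function, whereas the ticket proposes a method on {{process::http::URL}}; keeping the leading slash in the result is a choice made for this sketch.

```cpp
#include <string>

// Hypothetical helper: strips the first path component (for example
// `/master` or `/slave(1)`) and returns the remainder, including its
// leading slash. Returns an empty string if the path does not contain
// an endpoint after the first component.
std::string extractEndpoint(const std::string& path)
{
  if (path.empty() || path[0] != '/') {
    return "";
  }

  // Find the slash that terminates the first component.
  const std::string::size_type second = path.find('/', 1);
  if (second == std::string::npos) {
    return "";
  }

  return path.substr(second);
}
```

Because the function only looks at the position of the second slash, it treats `/master/...` and `/slave(n)/...` identically, which is what lets masters and agents share it.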



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4992) sandbox uri does not work outisde mesos http server

2016-05-27 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303643#comment-15303643
 ] 

Adam B commented on MESOS-4992:
---

No time/assignee for this left in 0.29, but we'll try to at least get 
containerId reported in ContainerStatus soon.

> sandbox uri does not work outisde mesos http server
> ---
>
> Key: MESOS-4992
> URL: https://issues.apache.org/jira/browse/MESOS-4992
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 0.27.1
>Reporter: Stavros Kontopoulos
>  Labels: mesosphere
>
> The sandbox URI of a framework does not work if I just copy and paste it into the 
> browser.
> For example, the following sandbox URI:
> http://172.17.0.1:5050/#/slaves/50f87c73-79ef-4f2a-95f0-b2b4062b2de6-S0/frameworks/50f87c73-79ef-4f2a-95f0-b2b4062b2de6-0009/executors/driver-20160321155016-0001/browse
> should redirect to:
> http://172.17.0.1:5050/#/slaves/50f87c73-79ef-4f2a-95f0-b2b4062b2de6-S0/browse?path=%2Ftmp%2Fmesos%2Fslaves%2F50f87c73-79ef-4f2a-95f0-b2b4062b2de6-S0%2Fframeworks%2F50f87c73-79ef-4f2a-95f0-b2b4062b2de6-0009%2Fexecutors%2Fdriver-20160321155016-0001%2Fruns%2F60533483-31fb-4353-987d-f3393911cc80
> yet it fails with the message:
> "Failed to find slaves.
> Navigate to the slave's sandbox via the Mesos UI."
> and redirects to:
> http://172.17.0.1:5050/#/
> It is an issue for me because I'm working on extending the Mesos Spark UI with 
> the sandbox URI. The other option is to get the slave info, parse the JSON 
> file there, and get the executor paths, which is not as straightforward or elegant.
> Moreover, I don't see the runs/container_id in the Mesos Proto API. I guess 
> this is hidden info; it is the piece needed to rewrite the URI 
> without redirection.





[jira] [Updated] (MESOS-5357) Add a function to extract HTTP endpoints from an URL.

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5357:
--
Fix Version/s: (was: 0.29.0)

> Add a function to extract HTTP endpoints from an URL.
> -
>
> Key: MESOS-5357
> URL: https://issues.apache.org/jira/browse/MESOS-5357
> Project: Mesos
>  Issue Type: Improvement
>  Components: libprocess
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: libprocess, mesosphere, newbie, security
>
> HTTP endpoints in Mesos receive a {{process::http::Request}} that includes a 
> {{process::http::URL}}. The {{path}} member of the URL instance is of the 
> form {{/master/endpoint}} or {{/slave\(n\)/endpoint}}. We want to implement 
> authorization of endpoints and need to extract the endpoint from that path; 
> the extraction function should be accessible to masters as well as agents.
> This can be done by adding a method to {{process::http::URL}} that implements 
> the extraction logic.





[jira] [Updated] (MESOS-5343) Behavior of custom HTTP authenticators with disabled HTTP authentication is inconsistent between master and agent

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5343:
--
Fix Version/s: (was: 0.29.0)

> Behavior of custom HTTP authenticators with disabled HTTP authentication is 
> inconsistent between master and agent
> -
>
> Key: MESOS-5343
> URL: https://issues.apache.org/jira/browse/MESOS-5343
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Minor
>  Labels: mesosphere, security
>
> When setting a custom authenticator with {{http_authenticators}} and also 
> specifying {{authenticate_http=false}} currently agents refuse to start with
> {code}
> A custom HTTP authenticator was specified with the '--http_authenticators' 
> flag, but HTTP authentication was not enabled via '--authenticate_http'
> {code}
> Masters on the other hand accept this setting.
> Having differing behavior between master and agents is confusing, and we 
> should decide on whether we want to accept these settings or not, and make 
> the implementations consistent.
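
One way to make the behavior consistent would be a single validation routine used by both master and agent. The sketch below assumes the stricter agent behavior is the one to keep, which the ticket leaves open; `validateHttpAuthentication` is a hypothetical name, and the error text is the agent message quoted above.

```cpp
#include <string>

// Hypothetical shared check: returns an error message if a custom HTTP
// authenticator was configured while HTTP authentication is disabled,
// and an empty string if the combination is acceptable.
std::string validateHttpAuthentication(
    bool authenticateHttp,
    bool customAuthenticatorSet)
{
  if (customAuthenticatorSet && !authenticateHttp) {
    return "A custom HTTP authenticator was specified with the "
           "'--http_authenticators' flag, but HTTP authentication was "
           "not enabled via '--authenticate_http'";
  }

  return "";
}
```

If the decision goes the other way and the combination is accepted, the same routine could return a warning instead, so that both binaries still share one code path.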





[jira] [Updated] (MESOS-2043) framework auth fail with timeout error and never get authenticated

2016-05-27 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2043:
--
Fix Version/s: (was: 0.29.0)

> framework auth fail with timeout error and never get authenticated
> --
>
> Key: MESOS-2043
> URL: https://issues.apache.org/jira/browse/MESOS-2043
> Project: Mesos
>  Issue Type: Bug
>  Components: master, scheduler driver, security, slave
>Affects Versions: 0.21.0
>Reporter: Bhuvan Arumugam
>Priority: Critical
>  Labels: mesosphere, security
> Attachments: aurora-scheduler.20141104-1606-1706.log, master.log, 
> mesos-master.20141104-1606-1706.log, slave.log
>
>
> I'm facing this issue in master as of 
> https://github.com/apache/mesos/commit/74ea59e144d131814c66972fb0cc14784d3503d4
> As [~adam-mesos] mentioned in IRC, this sounds similar to MESOS-1866. I'm 
> running 1 master and 1 scheduler (aurora). The framework authentication fails 
> due to a timeout.
> Error on the Mesos master:
> {code}
> I1104 19:37:17.741449  8329 master.cpp:3874] Authenticating 
> scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083
> I1104 19:37:17.741585  8329 master.cpp:3885] Using default CRAM-MD5 
> authenticator
> I1104 19:37:17.742106  8336 authenticator.hpp:169] Creating new server SASL 
> connection
> W1104 19:37:22.742959  8329 master.cpp:3953] Authentication timed out
> W1104 19:37:22.743548  8329 master.cpp:3930] Failed to authenticate 
> scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083: 
> Authentication discarded
> {code}
> Error on the scheduler:
> {code}
> I1104 19:38:57.885486 49012 sched.cpp:283] Authenticating with master 
> master@MASTER_IP:PORT
> I1104 19:38:57.885928 49002 authenticatee.hpp:133] Creating new client SASL 
> connection
> I1104 19:38:57.890581 49007 authenticatee.hpp:224] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out
> I1104 19:39:02.891850 49018 sched.cpp:338] Failed to authenticate with master 
> master@MASTER_IP:PORT: Authentication discarded
> {code}
> It looks like 2 instances of the same framework, 
> {{scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94}} and 
> {{scheduler-d2d4437b-d375-4467-a583-362152fe065a}}, are trying to 
> authenticate and failing.
> {code}
> W1104 19:36:30.769420  8319 master.cpp:3930] Failed to authenticate 
> scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94@SCHEDULER_IP:8083: Failed to 
> communicate with authenticatee
> I1104 19:36:42.701441  8328 master.cpp:3860] Queuing up authentication 
> request from scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 
> because authentication is still in progress
> {code}
> Restarting the master and the scheduler didn't fix it. 
> This particular issue happens with 1 master and 1 scheduler even after 
> MESOS-1866 was fixed.
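
The glog timestamps in the master log above show how long the master waited before giving up. The sketch below parses the timestamp prefix of the "Authenticating" and "Authentication timed out" lines and subtracts them. The regular expression and the helper name `time_of_day` are illustrative assumptions, and the result only describes this particular run, not a documented Mesos default.

```python
import re
from datetime import timedelta

# glog prefix: severity letter, MMDD, then HH:MM:SS.microseconds
GLOG_PREFIX = re.compile(r"^[IWEF](\d{2})(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{6})")


def time_of_day(line: str) -> timedelta:
    """Return the time of day encoded in a glog line as a timedelta."""
    match = GLOG_PREFIX.match(line)
    if match is None:
        raise ValueError("not a glog line: " + line)
    _month, _day, hours, minutes, seconds, micros = (int(g) for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, microseconds=micros)


start = "I1104 19:37:17.741449  8329 master.cpp:3874] Authenticating"
timed_out = "W1104 19:37:22.742959  8329 master.cpp:3953] Authentication timed out"

# Both lines are from the same day, so comparing the time of day is enough.
elapsed = time_of_day(timed_out) - time_of_day(start)
print(elapsed.total_seconds())
```

For the two master lines quoted above, the difference is a little over five seconds, which matches the gap between the corresponding scheduler lines.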





[jira] [Commented] (MESOS-5468) Add logic in long-lived-framework to handle network partitions.

2016-05-27 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303589#comment-15303589
 ] 

Jay Guo commented on MESOS-5468:


Another question: how long do we wait before timing out a framework? I don't 
see such an option in the configuration. Or do we use some mechanism other 
than a timeout to invalidate a framework?

> Add logic in long-lived-framework to handle network partitions.
> ---
>
> Key: MESOS-5468
> URL: https://issues.apache.org/jira/browse/MESOS-5468
> Project: Mesos
>  Issue Type: Task
>  Components: framework, master
>Reporter: Jay Guo
>
> Currently long-lived-framework does not handle network partitions, i.e., it 
> does not explicitly try to {{reconnect}} with the master after receiving no 
> {{HEARTBEAT}} events for a prolonged amount of time. If the master 
> disconnects a framework without the framework being aware of it (a one-way 
> partition), the framework should explicitly issue a {{reconnect}} request via 
> the scheduler library after a certain period of time.
> *On the other hand*, should we close the TCP socket on the master side when 
> tearing down a framework? Currently the TCP socket is left alive even after 
> the framework has been deactivated. This results in the framework sending an 
> invalid {{Call}} to the master, and in re-detection.
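
The reconnect-on-missed-heartbeats idea described above can be sketched as a small watchdog. This is not the long-lived-framework code or the scheduler library API: the class `HeartbeatWatchdog`, the `max_missed` threshold, and the 15-second interval used in the example are all assumptions chosen for illustration.

```python
import time


class HeartbeatWatchdog:
    """Tracks HEARTBEAT arrivals and reports when a reconnect looks necessary."""

    def __init__(self, interval_secs, max_missed=3, clock=time.monotonic):
        self.interval_secs = interval_secs  # expected gap between heartbeats
        self.max_missed = max_missed        # tolerated number of missed heartbeats
        self.clock = clock                  # injectable so the logic can be tested
        self.last_heartbeat = clock()

    def on_heartbeat(self):
        # Call this whenever a HEARTBEAT event is received.
        self.last_heartbeat = self.clock()

    def should_reconnect(self):
        # True once the silence exceeds max_missed heartbeat intervals.
        silence = self.clock() - self.last_heartbeat
        return silence > self.interval_secs * self.max_missed


# Example with a fake clock instead of real time.
now = [0.0]
watchdog = HeartbeatWatchdog(interval_secs=15.0, max_missed=3, clock=lambda: now[0])
now[0] = 30.0
print(watchdog.should_reconnect())  # two intervals of silence: keep waiting
now[0] = 46.0
print(watchdog.should_reconnect())  # more than three intervals: reconnect
```

A framework would poll `should_reconnect()` from a timer and issue the {{reconnect}} request when it returns true. Tolerating a few missed heartbeats avoids reconnecting on a single delayed event.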





[jira] [Commented] (MESOS-5468) Add logic in long-lived-framework to handle network partitions.

2016-05-27 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303587#comment-15303587
 ] 

Jay Guo commented on MESOS-5468:


See steps to reproduce in my first comment.






[jira] [Commented] (MESOS-5468) Add logic in long-lived-framework to handle network partitions.

2016-05-27 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303581#comment-15303581
 ] 

Jay Guo commented on MESOS-5468:


[~anandmazumdar]
The socket is NOT successfully closed and is still left in the ESTABLISHED 
state (this can be observed with {{netstat}}). I suspect it somehow happens 
before the master explicitly issues a close. Here's the log:
{code:title=master.log}
E0527 05:48:45.564194 13105 process.cpp:2033] Failed to shutdown socket with fd 
33: Transport endpoint is not connected
I0527 05:48:45.573005 13101 master.cpp:1383] Framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++)) 
disconnected
I0527 05:48:45.573212 13101 master.cpp:2792] Disconnecting framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++))
I0527 05:48:45.573431 13101 master.cpp:2816] Deactivating framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++))
W0527 05:48:45.574806 13101 master.hpp:1846] Master attempted to send message 
to disconnected framework 61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived 
Framework (C++))
I0527 05:48:45.575145 13100 hierarchical.cpp:375] Deactivated framework 
61100b89-f964-4aa2-b084-e1089d205b83-
W0527 05:48:45.580201 13101 master.hpp:1852] Unable to send event to framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++)): 
connection closed
W0527 05:48:45.581838 13101 master.hpp:1846] Master attempted to send message 
to disconnected framework 61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived 
Framework (C++))
W0527 05:48:45.582034 13101 master.hpp:1852] Unable to send event to framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++)): 
connection closed
W0527 05:48:45.583015 13101 master.hpp:1846] Master attempted to send message 
to disconnected framework 61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived 
Framework (C++))
W0527 05:48:45.583124 13101 master.hpp:1852] Unable to send event to framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++)): 
connection closed
I0527 05:48:45.583395 13101 master.cpp:1396] Giving framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++)) 0ns to 
failover
I0527 05:48:45.585503 13102 master.cpp:5516] Framework failover timeout, 
removing framework 61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived 
Framework (C++))
I0527 05:48:45.585793 13102 master.cpp:6246] Removing framework 
61100b89-f964-4aa2-b084-e1089d205b83- (Long Lived Framework (C++))
I0527 05:48:45.588471 13102 master.cpp:6761] Updating the state of task 2 of 
framework 61100b89-f964-4aa2-b084-e1089d205b83- (latest state: 
TASK_FINISHED, status update state: TASK_KILLED)
I0527 05:48:45.589534 13102 master.cpp:6827] Removing task 2 with resources 
cpus(*):0.001; mem(*):1 of framework 61100b89-f964-4aa2-b084-e1089d205b83- 
on agent af46d7b0-4e75-443d-9e11-e89d5605f012-S2 at slave(1)@10.11.13.10:5051 
(agent-3.novalocal)
I0527 05:48:45.590454 13102 master.cpp:6856] Removing executor 'default' with 
resources cpus(*):0.1; mem(*):32 of framework 
61100b89-f964-4aa2-b084-e1089d205b83- on agent 
af46d7b0-4e75-443d-9e11-e89d5605f012-S2 at slave(1)@10.11.13.10:5051 
(agent-3.novalocal)
I0527 05:48:45.592897 13100 hierarchical.cpp:326] Removed framework 
61100b89-f964-4aa2-b084-e1089d205b83-
W0527 05:48:50.662726 13098 master.cpp:5199] Ignoring unknown exited executor 
'default' of framework 61100b89-f964-4aa2-b084-e1089d205b83- on agent 
af46d7b0-4e75-443d-9e11-e89d5605f012-S2 at slave(1)@10.11.13.10:5051 
(agent-3.novalocal)
{code}

The build is not super fresh (built within the last week), so you may find 
that line numbers do not match the latest code.
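
For context on the first line of the log: "Transport endpoint is not connected" is the Linux error text for ENOTCONN, which shutdown() reports when it is called on a socket that is not connected. The standalone sketch below reproduces that error with a plain TCP socket. It is not the Mesos code path, and it does not by itself explain why the connection still shows as ESTABLISHED.

```python
import errno
import socket

# A freshly created TCP socket that was never connected.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
error = None
try:
    # Shutting down a socket with no peer fails instead of closing anything.
    sock.shutdown(socket.SHUT_RDWR)
except OSError as exc:
    error = exc
finally:
    sock.close()

if error is not None:
    print(errno.errorcode.get(error.errno), "-", error.strerror)
else:
    print("shutdown succeeded")
```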



