[jira] [Assigned] (MESOS-7300) Mesos failed to build on Windows due to error C2440: 'return': cannot convert from 'Error' to 'bool'

2017-03-24 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu reassigned MESOS-7300:


   Resolution: Fixed
 Assignee: Andrew Schwartzmeyer
Fix Version/s: 1.3.0

{code}
commit 1de39e676a0dc5f78eeff303cb0eba5467168b9f
Author: Andrew Schwartzmeyer 
Date:   Fri Mar 24 21:07:30 2017 -0700

Windows: Fixed return of bad types in stat.hpp.

Commit 5f159cdcb introduced `return Error(...)` logic to functions
which return `bool`, not `Try`, which broke the Windows build.

Furthermore, in the instances of `isdir` and `isfile`, erroring
when asked to not follow a symlink is not correct. The semantics
of symlinks provide clear answers to `isdir` and `isfile` when the
target is a link, and is not being followed (it is neither a regular
file nor a directory).

We explicitly match the POSIX semantics for `isfile` where `S_IFREG`
returns `false` for symbolic links.

For the functions `mode` and `dev`, which return types wrapped by `Try`,
we should only error if asked not to follow symlinks, and the target is
actually a symlink. If it is not a symlink to begin with, we should not
prematurely error. If it is a symlink, we should error because there is
no equivalent of `lstat` on Windows to obtain `st_mode` or `st_dev` of a
symlink itself.

Review: https://reviews.apache.org/r/57926/
{code}
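The distinction the commit message draws can be illustrated with a minimal, self-contained sketch. The `Error` and `Try` types below are simplified stand-ins for the stout classes, and `isSymlink` and `targetIsDirectory` are hypothetical stubs added only so the example runs anywhere; none of this is the code from the patch itself.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for stout's Error (assumption, not the real class).
struct Error
{
  explicit Error(const std::string& _message) : message(_message) {}
  const std::string message;
};

// Simplified stand-in for stout's Try<T>: holds either a value or an error.
template <typename T>
class Try
{
public:
  Try(const T& value) : value_(value), failed_(false) {}
  Try(const Error& error)
    : value_(), failed_(true), message_(error.message) {}

  bool isError() const { return failed_; }
  const T& get() const { return value_; }
  const std::string& errorMessage() const { return message_; }

private:
  T value_;
  bool failed_;
  std::string message_;
};

enum FollowSymlink { DO_NOT_FOLLOW_SYMLINK, FOLLOW_SYMLINK };

// Hypothetical stub: treat any path that starts with "link:" as a symlink.
inline bool isSymlink(const std::string& path)
{
  return path.compare(0, 5, "link:") == 0;
}

// Hypothetical stub: every path resolves to a directory.
inline bool targetIsDirectory(const std::string&) { return true; }

// Returns `bool`, so it cannot `return Error(...)`. A symlink that is not
// followed is neither a directory nor a regular file, so the answer is false.
inline bool isdir(
    const std::string& path,
    FollowSymlink follow = FOLLOW_SYMLINK)
{
  if (follow == DO_NOT_FOLLOW_SYMLINK && isSymlink(path)) {
    return false;
  }
  return targetIsDirectory(path);
}

// Returns `Try<int>`, so an error is expressible. It errors only when the
// caller declines to follow symlinks and the path really is a symlink.
inline Try<int> mode(
    const std::string& path,
    FollowSymlink follow = FOLLOW_SYMLINK)
{
  if (follow == DO_NOT_FOLLOW_SYMLINK && isSymlink(path)) {
    return Error("Non-following stat not supported for '" + path + "'");
  }
  return 0755; // Placeholder mode value for the sketch.
}
```

With these stand-ins, `isdir("link:x", DO_NOT_FOLLOW_SYMLINK)` answers `false` rather than failing, while `mode("link:x", DO_NOT_FOLLOW_SYMLINK)` reports an error and `mode("x", DO_NOT_FOLLOW_SYMLINK)` does not.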

> Mesos failed to build on Windows due to error C2440: 'return': cannot convert 
> from 'Error' to 'bool'
> 
>
> Key: MESOS-7300
> URL: https://issues.apache.org/jira/browse/MESOS-7300
> Project: Mesos
>  Issue Type: Bug
>  Components: build
> Environment: Windows Server 2012 R2 + VS2015 Update 3
>Reporter: Karen Huang
>Assignee: Andrew Schwartzmeyer
>Priority: Blocker
> Fix For: 1.3.0
>
>
> I tried to build Mesos (master branch revision 322300f) with VS2015 Update 3 on 
> Windows. The build failed with the following errors:
> D:\Mesos\src\3rdparty\stout\include\stout/os/windows/stat.hpp(41): error 
> C2440: 'return': cannot convert from 'Error' to 'bool' (compiling source file 
> D:\Mesos\src\3rdparty\libprocess\src\time.cpp) 
> [D:\Mesos\build_x64\3rdparty\libprocess\src\process-0.0.1.vcxproj]
> D:\Mesos\src\3rdparty\stout\include\stout/os/windows/stat.hpp(59): error 
> C2440: 'return': cannot convert from 'Error' to 'bool' (compiling source file 
> D:\Mesos\src\3rdparty\libprocess\src\time.cpp) 
> [D:\Mesos\build_x64\3rdparty\libprocess\src\process-0.0.1.vcxproj]
> This issue has been reproducible since master branch revision "82e4077" 
> (https://github.com/apache/mesos/commit/82e4077ceb40e84c2796be43f1448eec0bfd7c69#diff-76a72473075f57f8d0d3b3bf6f150672).
> I presume this is an issue in your source code. The function "inline bool 
> isdir(const std::string& path, const FollowSymlink follow = FOLLOW_SYMLINK)"
> must return a bool value, but the type of 
> "Error("Non-following stat not supported for '" + path + "'")" is not bool.
> In D:\Mesos\src\3rdparty\stout\include\stout\os\windows\stat.hpp file:
> inline bool isdir(
>     const std::string& path,
>     const FollowSymlink follow = FOLLOW_SYMLINK)
> {
>   struct _stat s;
>   if (follow == DO_NOT_FOLLOW_SYMLINK) {
>     return Error("Non-following stat not supported for '" + path + "'");
>   }
>   if (::_stat(path.c_str(), &s) < 0) {
>     return false;
>   }
>   return S_ISDIR(s.st_mode);
> }
> In D:\Mesos\src\3rdparty\stout\include\stout\errorbase.hpp file:
> class Error
> {
> public:
>   explicit Error(const std::string& _message) : message(_message) {}
>   const std::string message;
> };
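Why the compiler rejects the `return` can be checked in isolation. The sketch below reuses the `Error` definition quoted above and the standard `std::is_convertible` trait; the helper name `errorConvertsToBool` is hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Error
{
public:
  explicit Error(const std::string& _message) : message(_message) {}
  const std::string message;
};

// Error defines no conversion operator, so no implicit conversion to bool
// exists. This is what C2440 reports for `return Error(...)` inside a
// function declared to return bool.
static_assert(
    !std::is_convertible<Error, bool>::value,
    "Error must not be implicitly convertible to bool");

// Hypothetical helper exposing the same fact at run time.
constexpr bool errorConvertsToBool()
{
  return std::is_convertible<Error, bool>::value;
}
```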



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (MESOS-7311) CopyFetcherPluginTest.FetchExistingFile

2017-03-24 Thread Andrew Schwartzmeyer (JIRA)
Andrew Schwartzmeyer created MESOS-7311:
---

 Summary: CopyFetcherPluginTest.FetchExistingFile
 Key: MESOS-7311
 URL: https://issues.apache.org/jira/browse/MESOS-7311
 Project: Mesos
  Issue Type: Bug
  Components: fetcher
 Environment: Windows 10
Reporter: Andrew Schwartzmeyer


The CopyFetcherPluginTest.FetchExistingFile unit test (from mesos-tests) is 
routinely failing on Windows.





[jira] [Updated] (MESOS-7310) Implement a separate python client library for the new cli

2017-03-24 Thread Eric Chung (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Chung updated MESOS-7310:
--
Description: 
cli_new in its current form is very difficult to package for the following 
reasons:
1. src/cli_new/lib/mesos imports plugins using relative imports, which fails if 
it is built into a pip package
2. there is no setup.py script which defines what should be installed
3. plugins/tests are unnecessarily included in the package, which are things 
consumers of the package shouldn’t be able to import

Having such a package will allow external consumers to add 
application-specific wrappers on top of it, e.g. integration with ACL libraries of 
their choice.

The plan as discussed is to create a `mesos` package under `src/python/lib`, 
potentially including a `setup.py` for building the package into a PyPI package.


> Implement a separate python client library for the new cli
> --
>
> Key: MESOS-7310
> URL: https://issues.apache.org/jira/browse/MESOS-7310
> Project: Mesos
>  Issue Type: Task
>  Components: cli
>Affects Versions: 1.3.0
>Reporter: Eric Chung
>Assignee: Eric Chung
>
> cli_new in its current form is very difficult to package for the following 
> reasons:
> 1. src/cli_new/lib/mesos imports plugins using relative imports, which fails 
> if it is built into a pip package
> 2. there is no setup.py script which defines what should be installed
> 3. plugins/tests are unnecessarily included in the package, which are things 
> consumers of the package shouldn’t be able to import
> Having such a package will allow external consumers to add 
> application-specific wrappers on top of it, e.g. integration with ACL libraries of 
> their choice.
> The plan as discussed is to create a `mesos` package under `src/python/lib`, 
> potentially including a `setup.py` for building the package into a PyPI 
> package.





[jira] [Created] (MESOS-7310) Implement a separate python client library for the new cli

2017-03-24 Thread Eric Chung (JIRA)
Eric Chung created MESOS-7310:
-

 Summary: Implement a separate python client library for the new cli
 Key: MESOS-7310
 URL: https://issues.apache.org/jira/browse/MESOS-7310
 Project: Mesos
  Issue Type: Task
  Components: cli
Affects Versions: 1.3.0
Reporter: Eric Chung
Assignee: Eric Chung








[jira] [Created] (MESOS-7309) Support specifying devices for a container.

2017-03-24 Thread Jie Yu (JIRA)
Jie Yu created MESOS-7309:
-

 Summary: Support specifying devices for a container.
 Key: MESOS-7309
 URL: https://issues.apache.org/jira/browse/MESOS-7309
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Jie Yu


Some containers require certain devices to be available in the container (e.g., 
/dev/fuse). Currently, the default devices are hard-coded if a rootfs image 
is specified for the container.

We should allow frameworks to specify additional devices that will be made 
available to the container. Besides bind mounting the device file, the devices 
cgroup needs to be configured properly to allow access to that device.
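As a rough illustration of the cgroup side, the cgroup v1 devices controller accepts whitelist entries of the form `type major:minor access`, written to its `devices.allow` file. The sketch below only formats such an entry; the `DeviceAccess` struct and `formatEntry` function are hypothetical names, not Mesos APIs. The `10:229` pair used in the usage note is the conventional character device number of `/dev/fuse` on Linux.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical description of one device a framework might request.
struct DeviceAccess
{
  char type;                // 'c' for character devices, 'b' for block devices.
  unsigned int majorNumber; // Device major number.
  unsigned int minorNumber; // Device minor number.
  std::string access;       // Any combination of 'r', 'w', and 'm' (mknod).
};

// Formats an entry in the syntax accepted by the cgroup v1 `devices.allow`
// file, e.g. "c 10:229 rwm".
inline std::string formatEntry(const DeviceAccess& device)
{
  std::ostringstream out;
  out << device.type << ' ' << device.majorNumber << ':'
      << device.minorNumber << ' ' << device.access;
  return out.str();
}
```

For `/dev/fuse`, `formatEntry(DeviceAccess{'c', 10, 229, "rwm"})` yields the string that would be written to `devices.allow` for the container's cgroup.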





[jira] [Created] (MESOS-7308) Race condition in `updateAllocation()` on DESTROY of a shared volume.

2017-03-24 Thread Anindya Sinha (JIRA)
Anindya Sinha created MESOS-7308:


 Summary: Race condition in `updateAllocation()` on DESTROY of a 
shared volume.
 Key: MESOS-7308
 URL: https://issues.apache.org/jira/browse/MESOS-7308
 Project: Mesos
  Issue Type: Bug
  Components: general
Reporter: Anindya Sinha
Assignee: Anindya Sinha


When a {{DESTROY}} (for a shared volume) is processed in the master actor, we 
rescind pending offers that already include the volume to be destroyed. 
Before the allocator executes the {{updateAllocation()}} API, offers with the 
same shared volume can still be sent to frameworks, since the destroyed shared 
volume is not removed from {{slaves.total}} until {{updateAllocation()}} 
completes. As a result, the following check can fail:
{code}
  CHECK_EQ(
      frameworkAllocation.flatten().createStrippedScalarQuantity(),
      updatedFrameworkAllocation.flatten().createStrippedScalarQuantity());
{code}

We need to address this condition by not failing the {{CHECK_EQ}}, and also 
ensuring that the master's state is restored to honor the {{DESTROY}} of the 
shared volume.





[jira] [Created] (MESOS-7307) Fix Windows build break by stat.hpp changes

2017-03-24 Thread Andrew Schwartzmeyer (JIRA)
Andrew Schwartzmeyer created MESOS-7307:
---

 Summary: Fix Windows build break by stat.hpp changes
 Key: MESOS-7307
 URL: https://issues.apache.org/jira/browse/MESOS-7307
 Project: Mesos
  Issue Type: Bug
 Environment: Windows 10
Reporter: Andrew Schwartzmeyer
Assignee: Andrew Schwartzmeyer


Commit 5f159cdcb introduced `return Error(...)` logic to functions which return 
`bool`, not `Try` in stat.hpp, which broke the Windows build.





[jira] [Created] (MESOS-7306) Support mount propagation for Volumes.

2017-03-24 Thread Jie Yu (JIRA)
Jie Yu created MESOS-7306:
-

 Summary: Support mount propagation for Volumes.
 Key: MESOS-7306
 URL: https://issues.apache.org/jira/browse/MESOS-7306
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Jie Yu


Currently, all mounts in a container are marked as 'slave' by default. However, 
in some cases, we may want mounts under a certain directory in a container to be 
propagated back to the root mount namespace. This is useful when 
we want the mounts to survive container failures.

See more documentation about mount propagation in:
https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt
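On Linux, the propagation type of an existing mount point is changed with a `mount(2)` call that passes one of `MS_SHARED`, `MS_SLAVE`, or `MS_PRIVATE`, optionally combined with `MS_REC`. The sketch below only computes those flags; the `Propagation` enum and `propagationFlags` function are hypothetical names used for illustration and say nothing about how Mesos would expose this.

```cpp
#include <cassert>
#include <sys/mount.h> // Linux-specific: MS_SHARED, MS_SLAVE, MS_PRIVATE, MS_REC.

// Hypothetical per-volume propagation setting.
enum class Propagation { SHARED, SLAVE, PRIVATE };

// Maps a propagation mode to the flags for a propagation-changing mount(2)
// call. `recursive` applies the change to all mounts under the target.
inline unsigned long propagationFlags(Propagation mode, bool recursive)
{
  unsigned long flags = 0;

  switch (mode) {
    case Propagation::SHARED:
      flags = MS_SHARED;
      break;
    case Propagation::SLAVE:
      flags = MS_SLAVE;
      break;
    case Propagation::PRIVATE:
      flags = MS_PRIVATE;
      break;
  }

  return recursive ? (flags | MS_REC) : flags;
}

// Usage (requires privileges, so it is shown only as a comment):
//   ::mount(nullptr, "/some/dir", nullptr,
//           propagationFlags(Propagation::SHARED, true), nullptr);
```

A mount has to be shared, not slave, for mount events under it to propagate back to its peer group, which is why the default described above does not cover this use case.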





[jira] [Updated] (MESOS-5995) Protobuf JSON deserialisation does not accept numbers formatted as strings

2017-03-24 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-5995:
---
Priority: Critical  (was: Minor)

> Protobuf JSON deserialisation does not accept numbers formatted as strings
> -
>
> Key: MESOS-5995
> URL: https://issues.apache.org/jira/browse/MESOS-5995
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 1.0.0
>Reporter: Tomasz Janiszewski
>Assignee: Tomasz Janiszewski
>Priority: Critical
>
> Proto2 does not specify JSON mappings, but 
> [Proto3|https://developers.google.com/protocol-buffers/docs/proto3#json] does, 
> and it recommends mapping 64-bit numbers to strings. Unfortunately, Mesos does 
> not accept strings in place of uint64 and returns a 400 Bad Request error:
> {quote}
> Failed to convert JSON into Call protobuf: Not expecting a JSON 
> string for field 'value'.
> {quote}
> Is this on purpose, or is this a bug?
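The Proto3 JSON mapping writes 64-bit integers as strings and lets parsers accept either a number or a string. A lenient reader along those lines could look like the sketch below. `parseUint64` is a hypothetical helper operating on a raw JSON token, not the Mesos or protobuf parser, and it handles only plain decimal digits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Parses a uint64 from either a bare JSON number token (123) or a JSON
// string token ("123"). Returns false on anything else, including overflow.
inline bool parseUint64(const std::string& token, std::uint64_t* result)
{
  std::string digits = token;

  // Strip one pair of surrounding double quotes, if present.
  if (digits.size() >= 2 && digits.front() == '"' && digits.back() == '"') {
    digits = digits.substr(1, digits.size() - 2);
  }

  if (digits.empty()) {
    return false;
  }

  const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t value = 0;

  for (char c : digits) {
    if (c < '0' || c > '9') {
      return false; // Signs, whitespace, and exponents are rejected.
    }

    const std::uint64_t digit = static_cast<std::uint64_t>(c - '0');
    if (value > (max - digit) / 10) {
      return false; // Appending this digit would overflow 64 bits.
    }

    value = value * 10 + digit;
  }

  *result = value;
  return true;
}
```

Accepting the quoted form matters because clients written in languages whose native number type is a double cannot represent every uint64 value exactly as a JSON number.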





[jira] [Updated] (MESOS-7303) Support Isolator capabilities.

2017-03-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7303:
--
Description: Currently, isolators have one capability: whether they support 
nesting or not. To support launching containers that are not tied to Mesos 
tasks or executors (standalone containers), we need to add another capability 
to the Isolator interface so that we can avoid invoking isolators that 
do not yet support it when launching standalone containers.  (was: 
Currently, isolators have one capability: whether they support nesting
or not. To support launching containers that are not tied to Mesos tasks or 
executors, we need to add another capability to the Isolator interface so that 
we can avoid invoking isolators that do not yet support it when 
launching such containers.)

> Support Isolator capabilities.
> --
>
> Key: MESOS-7303
> URL: https://issues.apache.org/jira/browse/MESOS-7303
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jie Yu
>
> Currently, isolators have one capability: whether they support nesting or not. 
> To support launching containers that are not tied to Mesos tasks or executors 
> (standalone containers), we need to add another capability to the Isolator 
> interface so that we can avoid invoking isolators that do not yet 
> support it when launching standalone containers.
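The idea can be sketched as a second capability query next to the existing nesting one. Everything below is a simplified stand-in: the `Isolator` class is not the real Mesos interface, the method name `supportsStandalone` is an assumption about what such a capability might be called, and the two example isolators are invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for the Isolator interface (assumption).
class Isolator
{
public:
  virtual ~Isolator() {}

  virtual std::string name() const = 0;

  // Existing capability described in the ticket.
  virtual bool supportsNesting() const { return false; }

  // Hypothetical new capability; defaults to "not supported" so that
  // isolators which have not been updated are skipped.
  virtual bool supportsStandalone() const { return false; }
};

// Invented example: an isolator that has been updated.
class ExampleCapableIsolator : public Isolator
{
public:
  std::string name() const override { return "capable"; }
  bool supportsNesting() const override { return true; }
  bool supportsStandalone() const override { return true; }
};

// Invented example: an isolator that relies on the defaults.
class ExampleLegacyIsolator : public Isolator
{
public:
  std::string name() const override { return "legacy"; }
};

// Selects the isolators to invoke when launching a standalone container.
inline std::vector<std::string> isolatorsForStandalone(
    const std::vector<std::shared_ptr<Isolator>>& isolators)
{
  std::vector<std::string> names;
  for (const std::shared_ptr<Isolator>& isolator : isolators) {
    if (isolator->supportsStandalone()) {
      names.push_back(isolator->name());
    }
  }
  return names;
}
```

Defaulting the new capability to `false` keeps existing isolators out of the standalone launch path until each one opts in.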





[jira] [Updated] (MESOS-7302) Support launching standalone containers.

2017-03-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7302:
--
Epic Name: Standalone Container  (was: Taskless and Executorless Containers)

> Support launching standalone containers.
> 
>
> Key: MESOS-7302
> URL: https://issues.apache.org/jira/browse/MESOS-7302
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
>Reporter: Jie Yu
>
> Containerizer should support launching containers (both top level and nested) 
> that are not tied to a particular Mesos task or executor. This is for the 
> case where the agent wants to launch some system containers (e.g., for CSI 
> plugin) that will be managed by Mesos.
> More specifically, the Containerizer interfaces should be refactored so that 
> they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
> interface depends on them. Instead, we should consistently use ContainerInfo 
> and CommandInfo in Containerizer and isolators.
> This is also one necessary step towards running MesosContainerizer in 
> standalone mode.





[jira] [Created] (MESOS-7305) Adjust the recover logic of MesosContainerizer to allow standalone containers.

2017-03-24 Thread Jie Yu (JIRA)
Jie Yu created MESOS-7305:
-

 Summary: Adjust the recover logic of MesosContainerizer to allow 
standalone containers.
 Key: MESOS-7305
 URL: https://issues.apache.org/jira/browse/MESOS-7305
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Jie Yu


The current recovery logic in MesosContainerizer assumes that all top level 
containers are tied to some Mesos executors. Adding standalone containers will 
invalidate this assumption. The recovery logic must be changed to accommodate that.





[jira] [Updated] (MESOS-7302) Support launching standalone containers.

2017-03-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7302:
--
Summary: Support launching standalone containers.  (was: Support launching 
containers that are not tied to Mesos tasks or executors.)

> Support launching standalone containers.
> 
>
> Key: MESOS-7302
> URL: https://issues.apache.org/jira/browse/MESOS-7302
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
>Reporter: Jie Yu
>
> Containerizer should support launching containers (both top level and nested) 
> that are not tied to a particular Mesos task or executor. This is for the 
> case where the agent wants to launch some system containers (e.g., for CSI 
> plugin) that will be managed by Mesos.
> More specifically, the Containerizer interfaces should be refactored so that 
> they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
> interface depends on them. Instead, we should consistently use ContainerInfo 
> and CommandInfo in Containerizer and isolators.
> This is also one necessary step towards running MesosContainerizer in 
> standalone mode.





[jira] [Updated] (MESOS-7302) Support launching containers that are not tied to Mesos tasks or executors.

2017-03-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7302:
--
Description: 
Containerizer should support launching containers (both top level and nested) 
that are not tied to a particular Mesos task or executor. This is for the case 
where the agent wants to launch some system containers (e.g., for CSI plugin) 
that will be managed by Mesos.

More specifically, the Containerizer interfaces should be refactored so that 
they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
interface depends on them. Instead, we should consistently use ContainerInfo 
and CommandInfo in Containerizer and isolators.

This is also one necessary step towards running MesosContainerizer in 
standalone mode.

  was:
Containerizer should support launching containers (both top level and nested) 
that are not tied to a particular Mesos task or executor. This is for the case 
where the agent wants to launch some system containers (e.g., for CSI plugin) 
that will be managed by Mesos.

More specifically, the Containerizer interfaces should be refactored so that 
they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
interface depends on them. Instead, we should consistently use ContainerInfo 
and CommandInfo in Containerizer and isolators.


> Support launching containers that are not tied to Mesos tasks or executors.
> ---
>
> Key: MESOS-7302
> URL: https://issues.apache.org/jira/browse/MESOS-7302
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
>Reporter: Jie Yu
>
> Containerizer should support launching containers (both top level and nested) 
> that are not tied to a particular Mesos task or executor. This is for the 
> case where the agent wants to launch some system containers (e.g., for CSI 
> plugin) that will be managed by Mesos.
> More specifically, the Containerizer interfaces should be refactored so that 
> they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
> interface depends on them. Instead, we should consistently use ContainerInfo 
> and CommandInfo in Containerizer and isolators.
> This is also one necessary step towards running MesosContainerizer in 
> standalone mode.





[jira] [Created] (MESOS-7304) Fetcher should not depend on SlaveID.

2017-03-24 Thread Jie Yu (JIRA)
Jie Yu created MESOS-7304:
-

 Summary: Fetcher should not depend on SlaveID.
 Key: MESOS-7304
 URL: https://issues.apache.org/jira/browse/MESOS-7304
 Project: Mesos
  Issue Type: Task
  Components: containerization, fetcher
Reporter: Jie Yu


Currently, various Fetcher interfaces depend on SlaveID, which is an 
unnecessary coupling. For instance:
{code}
Try<Nothing> Fetcher::recover(const SlaveID& slaveId, const Flags& flags);

Future<Nothing> Fetcher::fetch(
    const ContainerID& containerId,
    const CommandInfo& commandInfo,
    const string& sandboxDirectory,
    const Option<string>& user,
    const SlaveID& slaveId,
    const Flags& flags);
{code}

It looks like the only reason we need a SlaveID is to calculate 
the fetcher cache directory based on it. We should calculate the fetcher 
cache directory in the caller and pass that directory to Fetcher.
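A minimal sketch of the proposed decoupling follows. The names `fetcherCacheDirectory` and `ExampleFetcher` are hypothetical, the agent ID is modelled as a plain string, and the `root/agentId` layout is an assumption made for the example and not taken from the Mesos sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper on the caller's side: derives the per-agent cache
// directory so that the fetcher never needs to see the agent ID.
inline std::string fetcherCacheDirectory(
    const std::string& cacheRoot,
    const std::string& agentId)
{
  if (!cacheRoot.empty() && cacheRoot.back() == '/') {
    return cacheRoot + agentId;
  }
  return cacheRoot + "/" + agentId;
}

// Simplified stand-in for Fetcher: it receives the directory, not a SlaveID.
class ExampleFetcher
{
public:
  void recover(const std::string& cacheDirectory)
  {
    cacheDirectory_ = cacheDirectory;
  }

  const std::string& cacheDirectory() const { return cacheDirectory_; }

private:
  std::string cacheDirectory_;
};
```

The caller would then invoke something like `fetcher.recover(fetcherCacheDirectory(root, agentId))`, which keeps the knowledge of how the directory is derived in one place outside the fetcher.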





[jira] [Commented] (MESOS-6127) Implement support for HTTP/2

2017-03-24 Thread Aaron Wood (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940982#comment-15940982
 ] 

Aaron Wood commented on MESOS-6127:
---

Current design doc is here 
https://docs.google.com/document/d/1vD8x2l5X6LzrHMCIOcWnx8ky_nT_Y2UIrQUXQB7DdLw/edit
I don't have the bandwidth at work to take this on right now but I believe 
[~ipronin] was interested in taking this further.

> Implement support for HTTP/2
> -
>
> Key: MESOS-6127
> URL: https://issues.apache.org/jira/browse/MESOS-6127
> Project: Mesos
>  Issue Type: Epic
>  Components: HTTP API, libprocess
>Reporter: Aaron Wood
>  Labels: performance
>
> HTTP/2 will allow us to take advantage of connection multiplexing, header 
> compression, streams, server push, etc. Add support for communication over 
> HTTP/2 between masters and agents, framework endpoints, etc.
> Should we support HTTP/2 without TLS? The spec allows for this but most major 
> browser vendors, libraries, and implementations aren't supporting it unless 
> TLS is used. If we do require TLS, what can be done to reduce the performance 
> hit of the TLS handshake? Might need to change more code to make sure that we 
> are taking advantage of connection sharing so that we can (ideally) only ever 
> have a one-time TLS handshake per shared connection.
> Some ideas for libs:
> https://nghttp2.org/documentation/package_README.html - Has encoders/decoders 
> supporting HPACK https://nghttp2.org/documentation/tutorial-hpack.html
> https://nghttp2.org/documentation/libnghttp2_asio.html - Currently marked as 
> experimental by the nghttp2 docs





[jira] [Created] (MESOS-7303) Support Isolator capabilities.

2017-03-24 Thread Jie Yu (JIRA)
Jie Yu created MESOS-7303:
-

 Summary: Support Isolator capabilities.
 Key: MESOS-7303
 URL: https://issues.apache.org/jira/browse/MESOS-7303
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Jie Yu


Currently, isolators have one capability: whether they support nesting
or not. To support launching containers that are not tied to Mesos tasks or 
executors, we need to add another capability to the Isolator interface so that 
we can avoid invoking isolators that do not yet support it when 
launching such containers.





[jira] [Updated] (MESOS-7302) Support launching containers that are not tied to Mesos tasks or executors.

2017-03-24 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7302:
--
Description: 
Containerizer should support launching containers (both top level and nested) 
that are not tied to a particular Mesos task or executor. This is for the case 
where the agent wants to launch some system containers (e.g., for CSI plugin) 
that will be managed by Mesos.

More specifically, the Containerizer interfaces should be refactored so that 
they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
interface depends on them. Instead, we should consistently use ContainerInfo 
and CommandInfo in Containerizer and isolators.

  was:
Containerizer should support launching containers (both top level and 
nested) that are not tied to a particular Mesos task or executor. This
is for the case where the agent wants to launch some system containers
(e.g., for CSI plugin) that will be managed by Mesos.

More specifically, the Containerizer interfaces should be refactored
so that they do not depend on TaskInfo or ExecutorInfo. Currently, the 
`launch` interface depends on them. Instead, we should consistently
use ContainerInfo and CommandInfo in Containerizer and isolators.


> Support launching containers that are not tied to Mesos tasks or executors.
> ---
>
> Key: MESOS-7302
> URL: https://issues.apache.org/jira/browse/MESOS-7302
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
>Reporter: Jie Yu
>
> Containerizer should support launching containers (both top level and nested) 
> that are not tied to a particular Mesos task or executor. This is for the 
> case where the agent wants to launch some system containers (e.g., for CSI 
> plugin) that will be managed by Mesos.
> More specifically, the Containerizer interfaces should be refactored so that 
> they do not depend on TaskInfo or ExecutorInfo. Currently, the `launch` 
> interface depends on them. Instead, we should consistently use ContainerInfo 
> and CommandInfo in Containerizer and isolators.





[jira] [Created] (MESOS-7302) Support launching containers that are not tied to Mesos tasks or executors.

2017-03-24 Thread Jie Yu (JIRA)
Jie Yu created MESOS-7302:
-

 Summary: Support launching containers that are not tied to Mesos 
tasks or executors.
 Key: MESOS-7302
 URL: https://issues.apache.org/jira/browse/MESOS-7302
 Project: Mesos
  Issue Type: Epic
  Components: containerization
Reporter: Jie Yu


Containerizer should support launching containers (both top level and 
nested) that are not tied to a particular Mesos task or executor. This
is for the case where the agent wants to launch some system containers
(e.g., for CSI plugin) that will be managed by Mesos.

More specifically, the Containerizer interfaces should be refactored
so that they do not depend on TaskInfo or ExecutorInfo. Currently, the 
`launch` interface depends on them. Instead, we should consistently
use ContainerInfo and CommandInfo in Containerizer and isolators.





[jira] [Commented] (MESOS-7279) State Diagrams for V1 Schedulers & Executors

2017-03-24 Thread Chun-Hung Hsiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940901#comment-15940901
 ] 

Chun-Hung Hsiao commented on MESOS-7279:


Is it more clear now?

> State Diagrams for V1 Schedulers & Executors
> 
>
> Key: MESOS-7279
> URL: https://issues.apache.org/jira/browse/MESOS-7279
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>Priority: Minor
>  Labels: documentation
>
> State diagrams for schedulers' and executors' finite state machines in the 
> Mesos V1 Framework API, showing which events they would receive in which 
> situations. For example, when a scheduler is in a certain state, it would 
> only receive certain events from the master or connected/disconnected events, 
> and the diagram can show what action it should take in each scenario. 
> This would be useful for framework developers.
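The kind of state machine such a diagram would describe can be sketched in code. The states, events, and transitions below are a deliberately reduced, hypothetical model of a scheduler's connection lifecycle, not the authoritative V1 API behaviour.

```cpp
#include <cassert>

// Hypothetical, reduced set of scheduler states.
enum class State { DISCONNECTED, CONNECTED, SUBSCRIBED };

// Hypothetical, reduced set of events a scheduler may observe.
enum class Event { CONNECTED, DISCONNECTED, SUBSCRIBED, OFFERS };

// Returns the next state. Events that are not expected in the current
// state leave it unchanged.
inline State transition(State state, Event event)
{
  switch (event) {
    case Event::DISCONNECTED:
      return State::DISCONNECTED; // Possible from any state.
    case Event::CONNECTED:
      return state == State::DISCONNECTED ? State::CONNECTED : state;
    case Event::SUBSCRIBED:
      return state == State::CONNECTED ? State::SUBSCRIBED : state;
    case Event::OFFERS:
      return state; // Carries data but does not change the state.
  }
  return state;
}
```

A full diagram would enumerate the remaining events and, for each state, the action the scheduler is expected to take.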





[jira] [Commented] (MESOS-7279) State Diagrams for V1 Schedulers & Executors

2017-03-24 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940767#comment-15940767
 ] 

Anand Mazumdar commented on MESOS-7279:
---

It's not immediately clear from the ticket what type of state diagrams we are 
looking for. Can we modify the description accordingly?

> State Diagrams for V1 Schedulers & Executors
> 
>
> Key: MESOS-7279
> URL: https://issues.apache.org/jira/browse/MESOS-7279
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>Priority: Minor
>  Labels: documentation
>
> State diagrams for schedulers and executors in the Mesos V1 Framework API to 
> show which events they would receive in which situations. This would be 
> useful for framework developers.





[jira] [Commented] (MESOS-7279) State Diagrams for V1 Schedulers & Executors

2017-03-24 Thread Chun-Hung Hsiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940750#comment-15940750
 ] 

Chun-Hung Hsiao commented on MESOS-7279:


Sure but I'll put it at a lower priority ;)

> State Diagrams for V1 Schedulers & Executors
> 
>
> Key: MESOS-7279
> URL: https://issues.apache.org/jira/browse/MESOS-7279
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>Priority: Minor
>  Labels: documentation
>
> State diagrams for schedulers and executors in the Mesos V1 Framework API to 
> show which events they would receive in which situations. This would be 
> useful for framework developers.





[jira] [Assigned] (MESOS-7279) State Diagrams for V1 Schedulers & Executors

2017-03-24 Thread Chun-Hung Hsiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun-Hung Hsiao reassigned MESOS-7279:
--

Assignee: Chun-Hung Hsiao

> State Diagrams for V1 Schedulers & Executors
> 
>
> Key: MESOS-7279
> URL: https://issues.apache.org/jira/browse/MESOS-7279
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>Priority: Minor
>  Labels: documentation
>
> State diagrams for schedulers and executors in the Mesos V1 Framework API to 
> show which events they would receive in which situations. This would be 
> useful for framework developers.





[jira] [Commented] (MESOS-7271) JNI SIGSEGV failed when connecting spark to mesos master

2017-03-24 Thread Michael Gummelt (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940744#comment-15940744
 ] 

Michael Gummelt commented on MESOS-7271:


I don't know, but I've been running Spark 2.1 against Mesos 1.2 w/o any 
problems, so I can't repro this.

> JNI SIGSEGV failed when connecting spark to mesos master
> 
>
> Key: MESOS-7271
> URL: https://issues.apache.org/jira/browse/MESOS-7271
> Project: Mesos
>  Issue Type: Bug
>  Components: java api
>Affects Versions: 1.1.0, 1.2.0
> Environment: Ubuntu 16.04, OpenJDK 8, Spark 2.1.1
>Reporter: Qi Cui
>
> Run starting. Expected test count is: 1
> SampleDataFrameTest:
> 17/03/20 11:53:16 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0320 11:53:19.775842  4679 process.cpp:1071] libprocess is initialized on 
> 192.168.0.99:38293 with 8 worker threads
> I0320 11:53:19.775975  4679 logging.cpp:199] Logging to STDERR
> I0320 11:53:19.789871  4725 sched.cpp:226] Version: 1.1.0
> I0320 11:53:19.832826  4717 sched.cpp:330] New master detected at 
> master@192.168.0.50:5050
> I0320 11:53:19.838253  4717 sched.cpp:341] No credentials provided. 
> Attempting to register without authentication
> I0320 11:53:19.838337  4717 sched.cpp:820] Sending SUBSCRIBE call to 
> master@192.168.0.50:5050
> I0320 11:53:19.840265  4717 sched.cpp:853] Will retry registration in 
> 32.354951ms if necessary
> I0320 11:53:19.844734  4717 sched.cpp:743] Framework registered with 
> 6e147824-5d88-411b-9c09-a7137565c309-0001
> I0320 11:53:19.864850  4717 sched.cpp:757] Scheduler::registered took 
> 20.022604ms
> ERROR: exception pending on entry to FindMesosClass()
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7ffa06fea4a6, pid=4677, tid=0x7ff9a1a46700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_121-b13) (build 
> 1.8.0_121-8u121-b13-0ubuntu1.16.04.2-b13)
> # Java VM: OpenJDK 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 
> compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x6744a6]
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /media/sf_G_DRIVE/src/spark-testing-base/hs_err_pid4677.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (MESOS-7277) General checker does not support command checks via agent.

2017-03-24 Thread Gastón Kleiman (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940605#comment-15940605
 ] 

Gastón Kleiman commented on MESOS-7277:
---

https://reviews.apache.org/r/57912/

> General checker does not support command checks via agent.
> --
>
> Key: MESOS-7277
> URL: https://issues.apache.org/jira/browse/MESOS-7277
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Gastón Kleiman
>  Labels: health-check, mesosphere
>
> Command checks via agent are necessary for executors that launch their tasks 
> via agent, e.g., the default executor. The general checker should support launching 
> commands as nested containers via agent in order to be used by such executors.





[jira] [Created] (MESOS-7301) CommandExecutorTest.NoTransitionFromKillingToRunning is flaky.

2017-03-24 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-7301:
--

 Summary: CommandExecutorTest.NoTransitionFromKillingToRunning is 
flaky.
 Key: MESOS-7301
 URL: https://issues.apache.org/jira/browse/MESOS-7301
 Project: Mesos
  Issue Type: Bug
  Components: test
Affects Versions: 1.3.0
 Environment: Mac Mini with Mac OS 10.11.6 with SSL enabled
Reporter: Alexander Rukletsov


I see {{CommandExecutorTest.NoTransitionFromKillingToRunning}} failing often on 
Mac. According to the logs, the task is not transitioning to {{KILLED}} from 
{{KILLING}}. The reason is however unclear at first glance.
From a single `make check` session:
  * "good" run http://pastebin.com/88Ar34Lz
  * "bad" run http://pastebin.com/RKMmYV8z

I've seen both versions of the test (with the HTTP and the non-HTTP command 
executor) fail.





[jira] [Comment Edited] (MESOS-7290) make fails at protobuf stage

2017-03-24 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940442#comment-15940442
 ] 

Till Toenshoff edited comment on MESOS-7290 at 3/24/17 2:42 PM:


[~rharnasch] the compilation error {{g++: internal compiler error: Killed 
(program cc1plus)}} is indeed a common problem when using too many compilation 
processes for the available RAM on your build machine.

You were obviously hitting two problems and the internal error should be 
entirely unrelated to the python egg / proto problem.

I do not have exact, current numbers, but generally g++ needs more memory than 
clang++. As a rule of thumb, make sure that you have about 2 GB per compilation 
process for clang++ and 2.5 GB for g++. You can adjust the number of processes by 
providing the {{-j N}} flag to {{make}}.
So for 16 GB of RAM with clang++ you may go all the way up to {{-j 8}}, 
depending also on the number of cores/threads of the build machine's 
CPU. Numbers may vary for your system.
If it fails again, reduce {{N}}.
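
The rule of thumb above can be turned into a quick estimate. This is only a 
sketch: the helper name suggest_jobs is hypothetical, and the per-process 
figures are the rough numbers quoted above, not measurements.

```python
import os

# Approximate memory per compilation process in GB, taken from the rule of
# thumb above; these are rough guesses, not measured values.
GB_PER_PROCESS = {"clang++": 2.0, "g++": 2.5}


def suggest_jobs(ram_gb, compiler, cores=None):
    """Return a conservative N for `make -j N` (hypothetical helper)."""
    if cores is None:
        cores = os.cpu_count() or 1
    by_memory = int(ram_gb // GB_PER_PROCESS[compiler])
    # Never exceed the core count and never go below one job.
    return max(1, min(cores, by_memory))
```

With 16 GB and 8 cores this suggests 8 jobs for clang++ and 6 for g++; if the 
build still gets killed, lower the result by hand.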

The python egg version comparison issue, to me, indeed sounds like a bug that 
should get triaged and fixed. 


was (Author: tillt):
[~rharnasch] the compilation error {{g++: internal compiler error: Killed 
(program cc1plus)}} is indeed a common problem when using too many compilation 
processes for the available RAM on your build machine.

I do not have exact, current numbers, but generally g++ needs more memory than 
clang++. As a rule of thumb, make sure that you have about 2 GB per compilation 
process for clang++ and 2.5 GB for g++. You can adjust the number of processes by 
providing the {{-j N}} flag to {{make}}.
So for 16 GB of RAM with clang++ you may go all the way up to {{-j 8}}, 
depending also on the number of cores/threads of the build machine's 
CPU. Numbers may vary for your system.
If it fails again, reduce {{N}}.

> make fails at protobuf stage
> 
>
> Key: MESOS-7290
> URL: https://issues.apache.org/jira/browse/MESOS-7290
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.2.0
> Environment: CentOS 7.3 (built from 1611 image)
>Reporter: Raul Harnasch
>Assignee: Kapil Arya
>
> {noformat}
> Building protobuf Python egg ...
> cd ../3rdparty/protobuf-2.6.1/python && \
>   CC="gcc" \
>   CXX="g++" \
>   CFLAGS="-g1 -O0 -Wno-unused-local-typedefs" \
>   CXXFLAGS="-g1 -O0 -Wno-unused-local-typedefs -std=c++11" \
>   PYTHONPATH=/opt/mesos/build/3rdparty/setuptools-20.9.0 \
>   /bin/python setup.py build bdist_egg
> Installed /opt/mesos/build/3rdparty/protobuf-2.6.1/python/.eggs/google_apputils-0.4.2-py2.7.egg
> Traceback (most recent call last):
>   File "setup.py", line 200, in <module>
>     "Protocol Buffers are Google's data interchange format.",
>   File "/usr/lib64/python2.7/distutils/core.py", line 112, in setup
>     _setup_distribution = dist = klass(attrs)
>   File "/opt/mesos/build/3rdparty/setuptools-20.9.0/setuptools/dist.py", line 269, in __init__
>     self.fetch_build_eggs(attrs['setup_requires'])
>   File "/opt/mesos/build/3rdparty/setuptools-20.9.0/setuptools/dist.py", line 313, in fetch_build_eggs
>     replace_conflicting=True,
>   File "/opt/mesos/build/3rdparty/setuptools-20.9.0/pkg_resources/__init__.py", line 826, in resolve
>     dist = best[req.key] = env.best_match(req, ws, installer)
>   File "/opt/mesos/build/3rdparty/setuptools-20.9.0/pkg_resources/__init__.py", line 1085, in best_match
>     dist = working_set.find(req)
>   File "/opt/mesos/build/3rdparty/setuptools-20.9.0/pkg_resources/__init__.py", line 695, in find
>     raise VersionConflict(dist, req)
> pkg_resources.VersionConflict: (pytz 2012d (/usr/lib/python2.7/site-packages), Requirement.parse('pytz>=2010'))
> make[2]: *** [../3rdparty/protobuf-2.6.1/python/dist/protobuf-2.6.1-py2.7.egg] Error 1
> make[2]: *** Waiting for unfinished jobs
> make[2]: Leaving directory `/opt/mesos/build/src'
> make[1]: *** [all] Error 2
> make[1]: Leaving directory `/opt/mesos/build/src'
> make: *** [all-recursive] Error 1
> {noformat}
> Looks like a dependency issue, but as the error suggests, I have pytz 2012d 
> when the minimum requirement is 2010. Pip confirms this:
> {noformat}
> $ pip freeze | grep pytz
> pytz===2012d
> {noformat}





[jira] [Commented] (MESOS-7290) make fails at protobuf stage

2017-03-24 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940442#comment-15940442
 ] 

Till Toenshoff commented on MESOS-7290:
---

[~rharnasch] the compilation error {{g++: internal compiler error: Killed 
(program cc1plus)}} is indeed a common problem when using too many compilation 
processes for the available RAM on your build machine.

I do not have exact, current numbers, but generally g++ needs more memory than 
clang++. As a rule of thumb, make sure that you have about 2 GB per compilation 
process for clang++ and 2.5 GB for g++. You can adjust the number of processes by 
providing the {{-j N}} flag to {{make}}.
So for 16 GB of RAM with clang++ you may go all the way up to {{-j 8}}, 
depending also on the number of cores/threads of the build machine's 
CPU. Numbers may vary for your system.
If it fails again, reduce {{N}}.






[jira] [Commented] (MESOS-7290) make fails at protobuf stage

2017-03-24 Thread Raul Harnasch (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15940418#comment-15940418
 ] 

Raul Harnasch commented on MESOS-7290:
--

Yes.  I'm telling you, the error has to do with a comparison in the version 
numbers.

{noformat}
pkg_resources.VersionConflict: (pytz 2012d (/usr/lib/python2.7/site-packages), 
Requirement.parse('pytz>=2010'))
{noformat}

That error -- at least to me -- looks like it wants a version >= 2010.  But, 
obviously, I have 2012d.  The comparison doesn't like the {{d}} in {{2012d}}.  
The fact that I can change the version number listed in the egg-info 
file within the pytz library, after which Mesos builds, tells me that the issue 
has nothing to do with which dependencies I have installed and everything to do 
with the version checking that's going on.  It failed without reason.
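
One possible explanation, offered as an assumption rather than a confirmed 
diagnosis: a version parser that follows PEP 440 may treat a string such as 
{{2012d}} as non-standard and order it below every standard version, so that 
{{2012d >= 2010}} comes out false. The toy model below is not the pkg_resources 
code; sort_key and satisfies_minimum are hypothetical names, and the pattern 
covers only plain dotted release numbers.

```python
import re

# Simplified stand-in for a standard release number: digits separated by dots.
_RELEASE = re.compile(r"^\d+(\.\d+)*$")


def sort_key(version):
    """Rank standard versions numerically and everything else below them."""
    if _RELEASE.match(version):
        return (1, tuple(int(part) for part in version.split(".")))
    # Non-standard strings such as "2012d" sort below all standard versions.
    return (0, version)


def satisfies_minimum(installed, minimum):
    """Model of a `>=` requirement check under the ordering above."""
    return sort_key(installed) >= sort_key(minimum)
```

Under this model {{2012}} satisfies {{pytz>=2010}} while {{2012d}} does not, which 
would be consistent with the build passing once the {{d}} is removed from the 
egg-info version.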






[jira] [Created] (MESOS-7300) Mesos failed to build on Windows due to error C2440: 'return': cannot convert from 'Error' to 'bool'

2017-03-24 Thread Karen Huang (JIRA)
Karen Huang created MESOS-7300:
--

 Summary: Mesos failed to build on Windows due to error C2440: 
'return': cannot convert from 'Error' to 'bool'
 Key: MESOS-7300
 URL: https://issues.apache.org/jira/browse/MESOS-7300
 Project: Mesos
  Issue Type: Bug
  Components: build
 Environment: Windows Server 2012 R2 + VS2015 Update 3
Reporter: Karen Huang
Priority: Blocker


I tried to build Mesos (master branch revision 322300f) with VS2015 Update 3 on 
Windows. The build failed with the following errors:
D:\Mesos\src\3rdparty\stout\include\stout/os/windows/stat.hpp(41): error C2440: 
'return': cannot convert from 'Error' to 'bool' (compiling source file 
D:\Mesos\src\3rdparty\libprocess\src\time.cpp) 
[D:\Mesos\build_x64\3rdparty\libprocess\src\process-0.0.1.vcxproj]
D:\Mesos\src\3rdparty\stout\include\stout/os/windows/stat.hpp(59): error C2440: 
'return': cannot convert from 'Error' to 'bool' (compiling source file 
D:\Mesos\src\3rdparty\libprocess\src\time.cpp) 
[D:\Mesos\build_x64\3rdparty\libprocess\src\process-0.0.1.vcxproj]

This issue starts to reproduce from master branch revision "82e4077" 
(https://github.com/apache/mesos/commit/82e4077ceb40e84c2796be43f1448eec0bfd7c69#diff-76a72473075f57f8d0d3b3bf6f150672)

I presume this is an issue in your source code. The function "inline bool isdir( 
 const std::string& path, const FollowSymlink follow = FOLLOW_SYMLINK)"
must return a bool, but the type of the returned expression 
"Error("Non-following stat not supported for '" + path + "'")" is Error, not bool.

In D:\Mesos\src\3rdparty\stout\include\stout\os\windows\stat.hpp file:
inline bool isdir(
    const std::string& path,
    const FollowSymlink follow = FOLLOW_SYMLINK)
{
  struct _stat s;
  if (follow == DO_NOT_FOLLOW_SYMLINK) {
    return Error("Non-following stat not supported for '" + path + "'");
  }
  if (::_stat(path.c_str(), &s) < 0) {
    return false;
  }
  return S_ISDIR(s.st_mode);
}

In D:\Mesos\src\3rdparty\stout\include\stout\errorbase.hpp file:
class Error
{
public:
  explicit Error(const std::string& _message) : message(_message) {}

  const std::string message;
};
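
For comparison, a bool-returning predicate can answer a non-following query 
without signalling an error at all. The sketch below is a POSIX analogue in 
Python, not the Windows stout code; it assumes that a symlink which is not 
followed counts as neither a regular file nor a directory, and the 
follow_symlink parameter is named here only for illustration.

```python
import os
import stat


def isdir(path, follow_symlink=True):
    """Return True only if `path` is a directory; missing paths yield False."""
    try:
        # os.stat resolves symlinks, os.lstat inspects the link itself.
        st = os.stat(path) if follow_symlink else os.lstat(path)
    except OSError:
        return False
    return stat.S_ISDIR(st.st_mode)
```

Every code path returns a bool, so the predicate never needs an error type in 
its return value.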





[jira] [Commented] (MESOS-7181) Stale frameworks seen on Mesos, but not known to scheduler

2017-03-24 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939922#comment-15939922
 ] 

Yan Xu commented on MESOS-7181:
---

Yeah so I meant that on the receiving end the process manager doesn't know 
whether an actor is being {{link}}ed or not, so it has to send 
{{TargetPIDExited}} in all situations; this is different from the current local 
{{PID}} behavior. Also this message is sent not when the actor dies but when a 
message arrives, so I guess if a framework dies when it's suppressed and with 
no pending status updates, the master will not find out about it because it 
doesn't send messages?

Perhaps we can have a {{Link}} message sent to the linkee based on which it can 
send a special {{Exited}} message to the sender when the actor terminates?

> Stale frameworks seen on Mesos, but not known to scheduler
> --
>
> Key: MESOS-7181
> URL: https://issues.apache.org/jira/browse/MESOS-7181
> Project: Mesos
>  Issue Type: Bug
>  Components: general
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>
> Using a scheduler which launches multiple frameworks using scheduler driver, 
> we observe occasionally that a framework exists on Mesos which is not known 
> to the scheduler. Since there is no entity that acts on the offers, this 
> framework ends up hogging all the offers leading to starvation in the cluster.
> This particular scenario is as follows:
> 1) Scheduler does a driver.start() which results in the 1st SUBSCRIBE sent to 
> master.
> 2) The scheduler driver resends the SUBSCRIBE (since the framework has not 
> yet registered) which is a result of the exponential backoff.
> 3) Framework is registered based on the 1st SUBSCRIBE, but the scheduler 
> issues a driver.stop() immediately which results in a TEARDOWN sent to the 
> master.
> 4) Master processes the TEARDOWN which removes the framework.
> 5) Master now processes the 2nd SUBSCRIBE (after authorization) and tries to 
> add this framework. This succeeds and a new framework id is generated (since 
> the original framework is no longer registered after the TEARDOWN) but the 
> Scheduler driver by now has already terminated once the scheduler issued the 
> driver.stop(). So, the master continues to send offers to this 2nd framework, 
> which holds on to them until the offer timeout.
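
The ordering described in steps 1) to 5) can be replayed with a toy model. This 
is a sketch under simplifying assumptions, not the Mesos master: ToyMaster, 
replay_race and the framework id format are hypothetical, and authorization, 
backoff and offers are left out.

```python
import itertools


class ToyMaster:
    """Toy model of the race above; not the Mesos master implementation."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.frameworks = {}  # framework id -> scheduler name

    def subscribe(self, scheduler):
        # A retried SUBSCRIBE for a still-registered scheduler is idempotent.
        for framework_id, name in self.frameworks.items():
            if name == scheduler:
                return framework_id
        framework_id = "framework-%d" % next(self._next_id)
        self.frameworks[framework_id] = scheduler
        return framework_id

    def teardown(self, framework_id):
        self.frameworks.pop(framework_id, None)


def replay_race():
    master = ToyMaster()
    first = master.subscribe("scheduler-a")   # 1st SUBSCRIBE is processed
    master.teardown(first)                    # TEARDOWN removes the framework
    second = master.subscribe("scheduler-a")  # retried SUBSCRIBE arrives late
    return first, second, dict(master.frameworks)
```

After the replay the second framework id is still registered although the 
scheduler driver that sent the retry has already stopped, which is the stale 
framework described above.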





[jira] [Updated] (MESOS-7263) User supplied task environment variables cause warnings in sandbox stdout.

2017-03-24 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7263:
---
Story Points: 3  (was: 5)

> User supplied task environment variables cause warnings in sandbox stdout.
> --
>
> Key: MESOS-7263
> URL: https://issues.apache.org/jira/browse/MESOS-7263
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, executor
>Affects Versions: 1.2.0
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>  Labels: mesosphere
> Fix For: 1.2.1, 1.3.0
>
>
> The default executor causes task/command environment variables to get 
> duplicated internally, causing warnings in the resulting sandbox {{stdout}}.
> {noformat}
> $ ./src/mesos-execute --name="test" --env='{"key1":"value1"}' 
> --command='sleep 1000' --master=127.0.0.1:5050
> {noformat}
> Result in {{stdout}} of the sandbox:
> {noformat}
> Overwriting environment variable 'key1', original: 'value1', new: 'value1'
> {noformat}
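
A minimal sketch of how a duplicated entry could produce this message, assuming 
the environment is folded from a list of name/value pairs. merge_environment is 
a hypothetical helper, not the executor's code.

```python
def merge_environment(variables):
    """Fold (name, value) pairs into a dict, recording a warning per repeat."""
    env, warnings = {}, []
    for name, value in variables:
        if name in env:
            warnings.append(
                "Overwriting environment variable '%s', original: '%s', new: '%s'"
                % (name, env[name], value))
        env[name] = value
    return env, warnings
```

Feeding the pair ("key1", "value1") twice yields one warning even though the 
value does not change, so removing the duplicate before the merge would silence 
it.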





[jira] [Updated] (MESOS-6951) Docker containerizer: mangled environment when env value contains LF byte.

2017-03-24 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6951:
---
Summary: Docker containerizer: mangled environment when env value contains 
LF byte.  (was: Docker containerizer: mangled environment when env value 
contains LF byte)

> Docker containerizer: mangled environment when env value contains LF byte.
> --
>
> Key: MESOS-6951
> URL: https://issues.apache.org/jira/browse/MESOS-6951
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Jan-Philip Gehrcke
>Assignee: Till Toenshoff
>  Labels: mesosphere
> Fix For: 1.2.1, 1.3.0
>
>
> Consider this Marathon app definition:
> {code}
> {
>   "id": "/testapp",
>   "cmd": "env && tail -f /dev/null",
>   "env":{
> "TESTVAR":"line1\nline2"
>   },
>   "cpus": 0.1,
>   "mem": 10,
>   "instances": 1,
>   "container": {
> "type": "DOCKER",
> "docker": {
>   "image": "alpine"
> }
>   }
> }
> {code}
> The JSON-encoded newline in the value of the {{TESTVAR}} environment variable 
> leads to a corrupted task environment. What follows is a subset of the 
> resulting task environment (as printed via {{env}}, i.e. in key=value 
> notation):
> {code}
> line2=
> TESTVAR=line1
> {code}
> That is, the trailing part of the intended value ended up being interpreted 
> as variable name, and only the leading part of the intended value was used as 
> actual value for {{TESTVAR}}.
> Common application scenarios that would badly break with that involve 
> pretty-printed JSON documents or YAML documents passed along via the 
> environment.
> Following the code and information flow led to the conclusion that Docker's 
> {{--env-file}} command line interface is the weak point in the flow. It is 
> currently used in Mesos' Docker containerizer for passing the environment to 
> the container:
> {code}
>   argv.push_back("--env-file");
>   argv.push_back(environmentFile);
> {code}
> (Ref: 
> [code|https://github.com/apache/mesos/blob/c0aee8cc10b1d1f4b2db5ff12b771372fdd5b1f3/src/docker/docker.cpp#L584])
> Docker's {{--env-file}} argument behavior is documented via
> {quote}
> The --env-file flag takes a filename as an argument
> and expects each line to be in the VAR=VAL format,
> {quote}
> (Ref: https://docs.docker.com/engine/reference/commandline/run/)
> That is, Docker identifies individual environment variable key/value pair 
> definitions based on newline bytes in that file which explains the observed 
> environment variable value fragmentation. Notably, Docker does not provide a 
> mechanism for escaping newline bytes in the values specified in this 
> environment file.
> I think it is important to understand that Docker's {{--env-file}} mechanism 
> is ill-posed in the sense that it is not capable of transmitting the whole 
> range of environment variable values allowed by POSIX. That's what the Single 
> UNIX Specification, Version 3 has to say about environment variable values:
> {quote}
> the value shall be composed of characters from the
> portable character set (except NUL and as indicated below). 
> {quote}
> (Ref: http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html)
> About "The portable character set": 
> http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap06.html#tagtcjh_3
> It includes (among others) the LF byte. Understandably, the current Docker 
> {{--env-file}} behavior will not change, so this is not an issue that can be 
> deferred to Docker: https://github.com/docker/docker/issues/12997
> Notably, the {{--env-file}} method for communicating environment variables to 
> Docker containers was just recently introduced to Mesos as of 
> https://issues.apache.org/jira/browse/MESOS-6566, for not leaking secrets 
> through the process listing. Previously, we specified env key/value pairs on 
> the command line which leaked secrets to the process list and probably also 
> did not support the full range of valid environment variable values.
> We need a solution that
> 1) does not leak sensitive values (i.e. is compliant with MESOS-6566).
> 2) allows for passing arbitrary environment variable values.
> It seems that Docker's {{--env}} method can be used for that. It can be used 
> to define _just the names of the environment variables_ to-be-passed-along, 
> in which case the docker binary will read the corresponding values from its 
> own environment, which we can clearly prepare appropriately when we invoke 
> the corresponding child process. This method would still leak environment 
> variable _names_ to the process listing, but (especially if documented) this 
> should be fine.
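
The fragmentation and the proposed alternative can be sketched as follows. The 
parser is a simplified line-based model written for illustration, not Docker's 
implementation, and render_env_file, parse_env_file and build_docker_run are 
hypothetical names. The only Docker behavior relied on is that {{--env NAME}} 
without a value takes the value from the docker client's own environment.

```python
def render_env_file(env):
    """Write one NAME=VALUE entry per line, as a line-based env file expects."""
    return "".join("%s=%s\n" % (name, value) for name, value in env.items())


def parse_env_file(text):
    """Simplified line-based reader; an assumption, not Docker's parser."""
    parsed = {}
    for line in text.splitlines():
        if line:
            name, _, value = line.partition("=")
            parsed[name] = value
    return parsed


def build_docker_run(image, task_env, base_env):
    """Put only variable names on the command line; values go to the child env."""
    argv = ["docker", "run"]
    for name in sorted(task_env):
        argv += ["--env", name]
    argv.append(image)
    # The values travel in the environment of the docker client process.
    child_env = dict(base_env)
    child_env.update(task_env)
    return argv, child_env
```

Round-tripping {"TESTVAR": "line1\nline2"} through the file model gives 
TESTVAR=line1 plus a stray line2 entry, as in the report. With build_docker_run 
the value never appears in argv; the pair could then be handed to something 
like subprocess.Popen(argv, env=child_env).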





[jira] [Updated] (MESOS-7265) Containerizer startup may cause sensitive data to leak into sandbox logs.

2017-03-24 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7265:
---
Fix Version/s: 1.2.1

> Containerizer startup may cause sensitive data to leak into sandbox logs.
> -
>
> Key: MESOS-7265
> URL: https://issues.apache.org/jira/browse/MESOS-7265
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, executor
>Affects Versions: 1.2.0
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>  Labels: mesosphere
> Fix For: 1.2.1, 1.3.0
>
>
> The task sandbox log shows the invocation of the containerizer launch 
> with all of its flags.
> This is not safe, since we do not want to leak sensitive data 
> into the sandbox log.
> Example:
> {noformat}
> Received SUBSCRIBED event
> Subscribed executor on lobomacpro2.fritz.box
> Received LAUNCH event
> Starting task test
> /Users/till/Development/mesos-private/build/src/mesos-containerizer launch 
> --help="false" 
> --launch_info="{"command":{"environment":{"variables":[{"name":"key1","type":"VALUE","value":"value1"}]},"shell":true,"value":"sleep
>  
> 1000"},"environment":{"variables":[{"name":"BIN_SH","type":"VALUE","value":"xpg4"},{"name":"DUALCASE","type":"VALUE","value":"1"},{"name":"DYLD_LIBRARY_PATH","type":"VALUE","value":"\/Users\/till\/Development\/mesos-private\/build\/src\/.libs"},{"name":"LIBPROCESS_PORT","type":"VALUE","value":"0"},{"name":"MESOS_AGENT_ENDPOINT","type":"VALUE","value":"192.168.178.20:5051"},{"name":"MESOS_CHECKPOINT","type":"VALUE","value":"0"},{"name":"MESOS_DIRECTORY","type":"VALUE","value":"\/tmp\/mesos\/slaves\/816619b6-f5ce-42d6-ad6b-2ef2001adc0a-S0\/frameworks\/4c8a82d4-8a5b-47f5-a660-5fef15da71a5-\/executors\/test\/runs\/b4bd0251-b42a-4ab3-9f02-60ede75bf3b1"},{"name":"MESOS_EXECUTOR_ID","type":"VALUE","value":"test"},{"name":"MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD","type":"VALUE","value":"5secs"},{"name":"MESOS_FRAMEWORK_ID","type":"VALUE","value":"4c8a82d4-8a5b-47f5-a660-5fef15da71a5-"},{"name":"MESOS_HTTP_COMMAND_EXECUTOR","type":"VALUE","value":"0"},{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/mesos\/slaves\/816619b6-f5ce-42d6-ad6b-2ef2001adc0a-S0\/frameworks\/4c8a82d4-8a5b-47f5-a660-5fef15da71a5-\/executors\/test\/runs\/b4bd0251-b42a-4ab3-9f02-60ede75bf3b1"},{"name":"MESOS_SLAVE_ID","type":"VALUE","value":"816619b6-f5ce-42d6-ad6b-2ef2001adc0a-S0"},{"name":"MESOS_SLAVE_PID","type":"VALUE","value":"slave(1)@192.168.178.20:5051"},{"name":"PATH","type":"VALUE","value":"\/usr\/local\/sbin:\/usr\/local\/bin:\/usr\/sbin:\/usr\/bin:\/sbin:\/bin"},{"name":"PWD","type":"VALUE","value":"\/private\/tmp\/mesos\/slaves\/816619b6-f5ce-42d6-ad6b-2ef2001adc0a-S0\/frameworks\/4c8a82d4-8a5b-47f5-a660-5fef15da71a5-\/executors\/test\/runs\/b4bd0251-b42a-4ab3-9f02-60ede75bf3b1"},{"name":"SHLVL","type":"VALUE","value":"0"},{"name":"__CF_USER_TEXT_ENCODING","type":"VALUE","value":"0x1F5:0x0:0x0"},{"name":"key1","type":"VALUE","value":"value1"},{"name":"key1","type":"VALUE","value":"value1"}]}}"
> Forked command at 16329
> {noformat}





[jira] [Updated] (MESOS-7263) User supplied task environment variables cause warnings in sandbox stdout.

2017-03-24 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7263:
---
Fix Version/s: 1.3.0

> User supplied task environment variables cause warnings in sandbox stdout.
> --
>
> Key: MESOS-7263
> URL: https://issues.apache.org/jira/browse/MESOS-7263
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, executor
>Affects Versions: 1.2.0
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>  Labels: mesosphere
> Fix For: 1.3.0
>
>
> The default executor causes task/command environment variables to get 
> duplicated internally, causing warnings in the resulting sandbox {{stdout}}.
> {noformat}
> $ ./src/mesos-execute --name="test" --env='{"key1":"value1"}' 
> --command='sleep 1000' --master=127.0.0.1:5050
> {noformat}
> Result in {{stdout}} of the sandbox:
> {noformat}
> Overwriting environment variable 'key1', original: 'value1', new: 'value1'
> {noformat}





[jira] [Commented] (MESOS-7263) User supplied task environment variables cause warnings in sandbox stdout.

2017-03-24 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15939880#comment-15939880
 ] 

Alexander Rukletsov commented on MESOS-7263:


{noformat}
Commit: 71a9feffd768d8857da34e2d6d06cd765403ccbc [71a9fef]
Author: Till Toenshoff 
Date: 24 March 2017 at 06:57:42 GMT+1
Committer: Alexander Rukletsov 
Commit Date: 24 March 2017 at 07:05:25 GMT+1

Fixed environment duplication in command executor.

Review: https://reviews.apache.org/r/57762/
{noformat}

> User supplied task environment variables cause warnings in sandbox stdout.
> --
>
> Key: MESOS-7263
> URL: https://issues.apache.org/jira/browse/MESOS-7263
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, executor
>Affects Versions: 1.2.0
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>  Labels: mesosphere
> Fix For: 1.3.0
>
>
> The default executor causes task/command environment variables to get 
> duplicated internally, causing warnings in the resulting sandbox {{stdout}}.
> {noformat}
> $ ./src/mesos-execute --name="test" --env='{"key1":"value1"}' 
> --command='sleep 1000' --master=127.0.0.1:5050
> {noformat}
> Result in {{stdout}} of the sandbox:
> {noformat}
> Overwriting environment variable 'key1', original: 'value1', new: 'value1'
> {noformat}


