[jira] [Created] (MESOS-9108) Test `ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskWithFileURI` is flaky.

2018-07-23 Thread Meng Zhu (JIRA)
Meng Zhu created MESOS-9108:
---

 Summary: Test 
`ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskWithFileURI` 
is flaky.
 Key: MESOS-9108
 URL: https://issues.apache.org/jira/browse/MESOS-9108
 Project: Mesos
  Issue Type: Bug
Reporter: Meng Zhu
Assignee: Meng Zhu
 Attachments: DefaultExecutorTest_TaskWithFileURI_badrun.txt

The test is flaky and segfaults on the CI ubuntu-16.04-SSL configuration; log attached.

This appears to be due to a race condition during the test destruction sequence. 
The relevant part of the test:

{noformat}
  Future<v1::scheduler::Event::Update> startingUpdate;
  Future<v1::scheduler::Event::Update> runningUpdate;
  Future<v1::scheduler::Event::Update> finishedUpdate;
  EXPECT_CALL(*scheduler, update(_, _))
    .WillOnce(
        DoAll(
            FutureArg<1>(&startingUpdate),
            v1::scheduler::SendAcknowledge(frameworkId, agentId)))
    .WillOnce(
        DoAll(
            FutureArg<1>(&runningUpdate),
            v1::scheduler::SendAcknowledge(frameworkId, agentId)))
    // The acknowledgment below (for TASK_FINISHED) is the one that can
    // race with the scheduler destruction.
    .WillOnce(
        DoAll(
            FutureArg<1>(&finishedUpdate),
            v1::scheduler::SendAcknowledge(frameworkId, agentId)));

  mesos.send(
      v1::createCallAccept(
          frameworkId,
          offer,
          {v1::LAUNCH_GROUP(
              executorInfo, v1::createTaskGroupInfo({taskInfo}))}));

  AWAIT_READY(startingUpdate);
  ASSERT_EQ(v1::TASK_STARTING, startingUpdate->status().state());
  ASSERT_EQ(taskInfo.task_id(), startingUpdate->status().task_id());

  AWAIT_READY(runningUpdate);
  ASSERT_EQ(v1::TASK_RUNNING, runningUpdate->status().state());
  ASSERT_EQ(taskInfo.task_id(), runningUpdate->status().task_id());

  AWAIT_READY(finishedUpdate);
  ASSERT_EQ(v1::TASK_FINISHED, finishedUpdate->status().state());
  ASSERT_EQ(taskInfo.task_id(), finishedUpdate->status().task_id());
}
{noformat}

Sending the acknowledgment for the last task status update (TASK_FINISHED) can 
race with the scheduler destruction. Removing the last ack should fix the test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks

2018-07-23 Thread Ilya Pronin (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16543695#comment-16543695
 ] 

Ilya Pronin edited comment on MESOS-9007 at 7/24/18 12:43 AM:
--

Review requests:
https://reviews.apache.org/r/67915/
https://reviews.apache.org/r/67914/
https://reviews.apache.org/r/68029/


was (Author: ipronin):
Review requests:
https://reviews.apache.org/r/67915/
https://reviews.apache.org/r/67914/

> XFS disk isolator doesn't clean up project ID from symlinks
> ---
>
> Key: MESOS-9007
> URL: https://issues.apache.org/jira/browse/MESOS-9007
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.5.0
>Reporter: Ilya Pronin
>Assignee: Ilya Pronin
>Priority: Minor
>
> Upon container destruction, its project ID is unallocated by the isolator and 
> removed from the container work directory. However, the removal function 
> skips symbolic links, so the project ID stays set on them until the 
> container directory is garbage collected. If the project ID is reused for a 
> new container, any lingering symlinks that still carry that project ID will 
> count toward the disk usage of the new container. Symlinks typically don't take 
> much space, but this still makes disk usage accounting inaccurate.





[jira] [Commented] (MESOS-7947) Add GC capability to nested containers

2018-07-23 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553625#comment-16553625
 ] 

Joseph Wu commented on MESOS-7947:
--

In terms of GC-ing container sandboxes created via the LAUNCH_CONTAINER APIs, I 
think it will be relatively neat to pass the Agent's GarbageCollector to the 
Containerizer.  The Containerizer is the one with direct access to the sandbox 
directories (held within the checkpointed {{ContainerConfig}} protobufs) and 
can schedule GC whenever a container exits, or during recovery.  In future, if 
we provide a GCPolicy, that information would presumably be checkpointed into 
the {{ContainerConfig}} too; so it would be better to give the Containerizer 
access to the GarbageCollector.

This implementation should cover both nested containers and standalone 
containers.  And it would protect against the case where the user/executor 
forgets to call REMOVE_CONTAINER.

For now, the plan is to defer making framework changes.  Instead of adding a 
boolean or protobuf GCPolicy, I'll add an agent flag to tell the agent to GC 
non-executor sandboxes by default.  I don't have a nice name for this flag yet 
(currently {{--gc_non_executor_container_sandboxes}}).

---

Additionally, since the default executor (and custom executors) can be 
long-lived and run many tasks in its lifetime, we'll need to prune some of the 
Task metadata.  This is limited to directories like 
{{/meta/slaves//frameworks//executors//runs//tasks/}}.
  This metadata GC will happen for all tasks, and frameworks shouldn't need to 
change how this works.

> Add GC capability to nested containers
> --
>
> Key: MESOS-7947
> URL: https://issues.apache.org/jira/browse/MESOS-7947
> Project: Mesos
>  Issue Type: Improvement
>  Components: executor
>Reporter: Chun-Hung Hsiao
>Assignee: Joseph Wu
>Priority: Major
>
> We should extend the existing API or add a new API for nested containers for 
> an executor to tell the Mesos agent that a nested container is no longer 
> needed and can be scheduled for GC.





[jira] [Commented] (MESOS-7211) Document SUPPRESS HTTP call

2018-07-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553138#comment-16553138
 ] 

ASF GitHub Bot commented on MESOS-7211:
---

Github user asfgit closed the pull request at:

https://github.com/apache/mesos/pull/301


> Document SUPPRESS HTTP call
> ---
>
> Key: MESOS-7211
> URL: https://issues.apache.org/jira/browse/MESOS-7211
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 1.1.0
>Reporter: Bruce Merry
>Priority: Minor
>  Labels: mesosphere, newbie
>
> The documentation at 
> http://mesos.apache.org/documentation/latest/scheduler-http-api/ doesn't list 
> the SUPPRESS call as one of the call types, but it does seem to be 
> implemented.





[jira] [Created] (MESOS-9107) mesos-execute cannot use reserved resources

2018-07-23 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-9107:
---

 Summary: mesos-execute cannot use reserved resources
 Key: MESOS-9107
 URL: https://issues.apache.org/jira/browse/MESOS-9107
 Project: Mesos
  Issue Type: Bug
Reporter: Benjamin Bannier


{{mesos-execute}} cannot be used to run tasks on reserved resources, e.g., an 
invocation
{code}
$ mesos-execute \
--resources='disk(some/role):32;cpus:0.1;mem:32' \
--command=true \
--name= \
--role=some/role \
--master=
{code}

will not be able to use {{disk}} resources reserved to {{some/role}}.

This is due to the way {{mesos-execute}} performs its offer matching, e.g., as 
of {{9af920c75d1}}

{code}
if (!launched && offered.toUnreserved().contains(requiredResources)) {
{code}

Any reservations are stripped from offers before matching against the required 
resources.

We should update {{mesos-execute}} so it can make use of reserved resources.
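
The mismatch can be reproduced with a toy model. The sketch below does not use the real {{Resources}} class; {{ToyResources}}, {{exampleOffer}} and {{exampleRequired}} are hypothetical stand-ins, the role "*" is assumed to mark unreserved resources, and the offered amounts are made up.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy stand-in for Mesos `Resources`: (name, role) -> amount.
// The role "*" is assumed to mean "unreserved".
using Key = std::pair<std::string, std::string>;
using ToyResources = std::map<Key, double>;

// Mimics the effect of `toUnreserved()`: every reservation is dropped.
ToyResources toUnreserved(const ToyResources& resources)
{
  ToyResources result;
  for (const auto& item : resources) {
    result[Key(item.first.first, "*")] += item.second;
  }
  return result;
}

// True if `offered` holds at least the required amount for every
// (name, role) pair in `required`.
bool contains(const ToyResources& offered, const ToyResources& required)
{
  for (const auto& item : required) {
    auto it = offered.find(item.first);
    if (it == offered.end() || it->second < item.second) {
      return false;
    }
  }
  return true;
}

// An offer with disk reserved to some/role (amounts are illustrative).
ToyResources exampleOffer()
{
  ToyResources offer;
  offer[Key("disk", "some/role")] = 64;
  offer[Key("cpus", "*")] = 1;
  offer[Key("mem", "*")] = 128;
  return offer;
}

// What `disk(some/role):32;cpus:0.1;mem:32` asks for.
ToyResources exampleRequired()
{
  ToyResources required;
  required[Key("disk", "some/role")] = 32;
  required[Key("cpus", "*")] = 0.1;
  required[Key("mem", "*")] = 32;
  return required;
}
```

In this model {{contains(toUnreserved(exampleOffer()), exampleRequired())}} is false because the reserved disk no longer appears under {{some/role}}, while matching against the unmodified offer succeeds.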





[jira] [Commented] (MESOS-9106) Add seccomp filter into containerizer launcher.

2018-07-23 Thread Andrei Budnik (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552909#comment-16552909
 ] 

Andrei Budnik commented on MESOS-9106:
--

[https://reviews.apache.org/r/68022/]

> Add seccomp filter into containerizer launcher.
> ---
>
> Key: MESOS-9106
> URL: https://issues.apache.org/jira/browse/MESOS-9106
> Project: Mesos
>  Issue Type: Task
>Reporter: Andrei Budnik
>Assignee: Andrei Budnik
>Priority: Major
>  Labels: mesosphere
>
> The containerizer launcher should create an instance of the `SeccompFilter` 
> class, which will be used to set up and load Seccomp filter rules using the 
> given `ContainerSeccompProfile` message.





[jira] [Commented] (MESOS-9035) Implement `linux/seccomp` isolator

2018-07-23 Thread Andrei Budnik (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552908#comment-16552908
 ] 

Andrei Budnik commented on MESOS-9035:
--

[https://reviews.apache.org/r/68021/]

> Implement `linux/seccomp` isolator
> --
>
> Key: MESOS-9035
> URL: https://issues.apache.org/jira/browse/MESOS-9035
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Andrei Budnik
>Assignee: Andrei Budnik
>Priority: Major
>  Labels: mesosphere
>
> The main purpose of this isolator is to prepare a `ContainerSeccompProfile` for 
> a containerizer launcher. The `ContainerSeccompProfile` message is generated by 
> the isolator from a JSON file that declares the Seccomp filter rules.
> In addition, the seccomp isolator should check that the Linux kernel supports 
> Seccomp.





[jira] [Commented] (MESOS-9034) Implement a wrapper class for `libseccomp` API

2018-07-23 Thread Andrei Budnik (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552905#comment-16552905
 ] 

Andrei Budnik commented on MESOS-9034:
--

[https://reviews.apache.org/r/68018/]

> Implement a wrapper class for `libseccomp` API
> --
>
> Key: MESOS-9034
> URL: https://issues.apache.org/jira/browse/MESOS-9034
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Andrei Budnik
>Assignee: Andrei Budnik
>Priority: Major
>  Labels: mesosphere
>
> The main purpose of this class is to translate the `SeccompProfile` 
> protobuf into invocations of the `libseccomp` API. The main user of this class is 
> a containerizer launcher.





[jira] [Created] (MESOS-9106) Add seccomp filter into containerizer launcher.

2018-07-23 Thread Andrei Budnik (JIRA)
Andrei Budnik created MESOS-9106:


 Summary: Add seccomp filter into containerizer launcher.
 Key: MESOS-9106
 URL: https://issues.apache.org/jira/browse/MESOS-9106
 Project: Mesos
  Issue Type: Task
Reporter: Andrei Budnik
Assignee: Andrei Budnik


The containerizer launcher should create an instance of the `SeccompFilter` class, 
which will be used to set up and load Seccomp filter rules using the given 
`ContainerSeccompProfile` message.





[jira] [Created] (MESOS-9105) Implement Docker Seccomp profile parser.

2018-07-23 Thread Andrei Budnik (JIRA)
Andrei Budnik created MESOS-9105:


 Summary: Implement Docker Seccomp profile parser.
 Key: MESOS-9105
 URL: https://issues.apache.org/jira/browse/MESOS-9105
 Project: Mesos
  Issue Type: Task
Reporter: Andrei Budnik
Assignee: Andrei Budnik


The parser should translate a Docker seccomp profile into the 
`ContainerSeccompProfile` protobuf message.





[jira] [Commented] (MESOS-8990) Build failure of the google-test dependency on Windows using MSVC.

2018-07-23 Thread PhoebeHui (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552619#comment-16552619
 ] 

PhoebeHui commented on MESOS-8990:
--

To verify, install the latest release of Visual Studio and follow these steps:
 # Open the x64 Native Tools Command Prompt
 # git clone -c core.autocrlf=true https://github.com/apache/mesos D:\mesos\src
 # set _CL_=/D_HAS_AUTO_PTR_ETC=1 /D_HAS_TR1_NAMESPACE=1 /D_SILENCE_TR1_NAMESPACE_DEPRECATION_WARNING /std:c++latest
 # cd d:\Mesos\src
 # .\bootstrap.bat
 # mkdir build_x64 && pushd build_x64
 # cmake ..\src -G "Visual Studio 15 2017 Win64" -DCMAKE_SYSTEM_VERSION=10.0.17134.0 -DENABLE_LIBEVENT=1 -DHAS_AUTHENTICATION=0 -DPATCHEXE_PATH="C:\gnuwin32\bin" -T host=x64
 # msbuild Mesos.sln /p:Configuration=Debug /p:Platform=x64 /maxcpucount:4 /t:Rebuild

> Build failure of the google-test dependency on Windows using MSVC. 
> ---
>
> Key: MESOS-8990
> URL: https://issues.apache.org/jira/browse/MESOS-8990
> Project: Mesos
>  Issue Type: Task
>  Components: build
>Reporter: PhoebeHui
>Assignee: Andrew Schwartzmeyer
>Priority: Blocker
>  Labels: agent, build, dependency, windows
> Attachments: googletest-release-f66ab00.patch
>
>
> Building Mesos with MSVC on Windows is currently blocked by the following 
> issue. It has been fixed in Googletest; could you help pick up the fix in Mesos?
> The next release of the MSVC toolset will have this behavior.
> See the background in https://github.com/google/googletest/issues/1616 and 
> the fix in https://github.com/google/googletest/pull/1620.
> The failures look like:
> {noformat}
> d:\mesos\build_x64\3rdparty\googletest-1.8.0\src\googletest-1.8.0\googletest\include\gtest\gtest-printers.h(249,1): error C2593:  'operator <<' is ambiguous [D:\Mesos\build_x64\src\tests\mesos-tests.vcxproj]
> d:\mesos\build_x64\3rdparty\googletest-1.8.0\src\googletest-1.8.0\googletest\include\gtest\gtest-printers.h(249,1): error C2593:   *os << value; [D:\Mesos\build_x64\src\tests\mesos-tests.vcxproj]
> d:\mesos\build_x64\3rdparty\googletest-1.8.0\src\googletest-1.8.0\googletest\include\gtest\gtest-printers.h(249,1): error C2593: ^ (compiling source file D:\Mesos\src\src\tests\command_executor_tests.cpp) [D:\Mesos\build_x64\src\tests\mesos-tests.vcxproj]
> {noformat}





[jira] [Assigned] (MESOS-9082) Avoid two trips through the master mailbox for state.json requests.

2018-07-23 Thread Benno Evers (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benno Evers reassigned MESOS-9082:
--

Assignee: Benno Evers

> Avoid two trips through the master mailbox for state.json requests.
> ---
>
> Key: MESOS-9082
> URL: https://issues.apache.org/jira/browse/MESOS-9082
> Project: Mesos
>  Issue Type: Task
>Reporter: Alexander Rukletsov
>Assignee: Benno Evers
>Priority: Major
>  Labels: mesosphere, performance
>
> Currently, a state.json request travels through the master's mailbox twice: 
> before authorization and after. This increases the overall state.json 
> response time by around 30%.
> To remove one mailbox trip, we can perform the initial portion (validation 
> and authorization) of state and /state off the master actor by using a 
> top-level {{Route}}, then dispatch onto the master actor only for json / 
> protobuf serialization. This should drop the authorization time down to near 
> 0 if it's indeed mostly queuing delay.
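
As a rough illustration of the two request paths described above, the toy model below counts mailbox trips. It does not use libprocess; {{ToyActor}}, {{toyAuthorize}}, {{twoTrips}} and {{oneTrip}} are hypothetical names, and authorization is reduced to a placeholder that always succeeds.

```cpp
#include <functional>
#include <queue>
#include <utility>

// Toy single-threaded actor: work only runs once it has been dequeued
// from the mailbox, so every dispatch pays the queuing delay.
class ToyActor
{
public:
  void dispatch(std::function<void()> work)
  {
    mailbox.push(std::move(work));
    ++trips;
  }

  void run()
  {
    while (!mailbox.empty()) {
      std::function<void()> work = std::move(mailbox.front());
      mailbox.pop();
      work();
    }
  }

  int trips = 0;

private:
  std::queue<std::function<void()>> mailbox;
};

// Placeholder for validation and authorization; always succeeds here.
bool toyAuthorize() { return true; }

// Current flow: one trip to start authorization, and a second trip to
// serialize the response once authorization has completed.
int twoTrips(ToyActor& master)
{
  master.dispatch([&master]() {
    if (toyAuthorize()) {
      master.dispatch([]() { /* serialize state */ });
    }
  });
  master.run();
  return master.trips;
}

// Proposed flow: authorize off the actor, then dispatch once for
// serialization only.
int oneTrip(ToyActor& master)
{
  if (toyAuthorize()) {
    master.dispatch([]() { /* serialize state */ });
  }
  master.run();
  return master.trips;
}
```

Calling {{twoTrips}} on a fresh actor records two dispatches and {{oneTrip}} records one. The model ignores concurrency and the cost of authorization itself; it only shows where the extra queuing step comes from.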


