[jira] [Created] (MESOS-9108) Test `ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskWithFileURI` is flaky.
Meng Zhu created MESOS-9108:
---

Summary: Test `ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskWithFileURI` is flaky.
Key: MESOS-9108
URL: https://issues.apache.org/jira/browse/MESOS-9108
Project: Mesos
Issue Type: Bug
Reporter: Meng Zhu
Assignee: Meng Zhu
Attachments: DefaultExecutorTest_TaskWithFileURI_badrun.txt

The test is flaky and segfaults on the ubuntu-16.04-SSL CI configuration; a log is attached. This looks like a race condition during the test's destruction sequence.

The test:

{noformat}
  Future startingUpdate;
  Future runningUpdate;
  Future finishedUpdate;

  EXPECT_CALL(*scheduler, update(_, _))
    .WillOnce(
        DoAll(
            FutureArg<1>(&startingUpdate),
            v1::scheduler::SendAcknowledge(frameworkId, agentId)))
    .WillOnce(
        DoAll(
            FutureArg<1>(&runningUpdate),
            v1::scheduler::SendAcknowledge(frameworkId, agentId)))
    .WillOnce(
        DoAll(
            FutureArg<1>(&finishedUpdate),
            v1::scheduler::SendAcknowledge(frameworkId, agentId)));

  mesos.send(
      v1::createCallAccept(
          frameworkId,
          offer,
          {v1::LAUNCH_GROUP(
              executorInfo, v1::createTaskGroupInfo({taskInfo}))}));

  AWAIT_READY(startingUpdate);
  ASSERT_EQ(v1::TASK_STARTING, startingUpdate->status().state());
  ASSERT_EQ(taskInfo.task_id(), startingUpdate->status().task_id());

  AWAIT_READY(runningUpdate);
  ASSERT_EQ(v1::TASK_RUNNING, runningUpdate->status().state());
  ASSERT_EQ(taskInfo.task_id(), runningUpdate->status().task_id());

  AWAIT_READY(finishedUpdate);
  ASSERT_EQ(v1::TASK_FINISHED, finishedUpdate->status().state());
  ASSERT_EQ(taskInfo.task_id(), finishedUpdate->status().task_id());
}
{noformat}

Sending the acknowledgment for the last task status update (TASK_FINISHED) can race with the destruction of the scheduler. Removing the last ack should fix the test.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
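The shape of the proposed fix can be sketched with a small stand-alone model. This is not Mesos or gmock code: `FakeScheduler`, `Handler`, `makeHandlers`, and `deliver` are hypothetical names introduced only for illustration. The first two updates are recorded and acknowledged; with `ackLast == false` the final update is only recorded, so nothing is sent while the scheduler is being torn down.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the test's scheduler; it only counts acks.
struct FakeScheduler {
  int acksSent = 0;
  std::vector<std::string> seen;
};

using Handler = std::function<void(FakeScheduler&, const std::string&)>;

// Mirrors the three `.WillOnce(...)` clauses of the test. With
// `ackLast == false` the final handler records the update but sends no ack.
std::vector<Handler> makeHandlers(bool ackLast) {
  Handler recordAndAck = [](FakeScheduler& s, const std::string& state) {
    s.seen.push_back(state);
    ++s.acksSent;
  };
  Handler recordOnly = [](FakeScheduler& s, const std::string& state) {
    s.seen.push_back(state);
  };
  return {recordAndAck, recordAndAck, ackLast ? recordAndAck : recordOnly};
}

// Delivers the three status updates in order, one handler per update.
FakeScheduler deliver(const std::vector<Handler>& handlers) {
  const std::vector<std::string> states = {
      "TASK_STARTING", "TASK_RUNNING", "TASK_FINISHED"};
  assert(handlers.size() == states.size());

  FakeScheduler scheduler;
  for (std::size_t i = 0; i < states.size(); ++i) {
    handlers[i](scheduler, states[i]);
  }
  return scheduler;
}
```

In this model, `deliver(makeHandlers(false))` still observes all three updates but sends only two acknowledgments, which is the behavior the ticket proposes for the test.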
[jira] [Comment Edited] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
[ https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16543695#comment-16543695 ]

Ilya Pronin edited comment on MESOS-9007 at 7/24/18 12:43 AM:
--

Review requests:
https://reviews.apache.org/r/67915/
https://reviews.apache.org/r/67914/
https://reviews.apache.org/r/68029/

was (Author: ipronin):
Review requests:
https://reviews.apache.org/r/67915/
https://reviews.apache.org/r/67914/

> XFS disk isolator doesn't clean up project ID from symlinks
> ---
>
> Key: MESOS-9007
> URL: https://issues.apache.org/jira/browse/MESOS-9007
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.5.0
> Reporter: Ilya Pronin
> Assignee: Ilya Pronin
> Priority: Minor
>
> Upon container destruction, its project ID is unallocated by the isolator and
> removed from the container work directory. However, the removal function
> skips symbolic links, so the project still exists until the
> container directory is garbage collected. If the project ID is reused for a
> new container, any lingering symlinks that still have that project ID will
> contribute to the disk usage of the new container. Symlinks typically don't take
> much space, but this still makes disk space usage accounting inaccurate.
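The difference between a walk that skips symlinks and one that includes them can be sketched with C++17 `std::filesystem`. This is only an illustration under stated assumptions: `entriesToClear` and `exampleSymlinkWalk` are hypothetical names, the sketch does not touch XFS project IDs at all, and it is not the isolator's actual code. It only shows which paths each variant would visit.

```cpp
#include <cassert>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Collects the entries under `root` that a clean-up pass would visit.
// Symlinks are never followed (the default for recursive_directory_iterator);
// `includeSymlinks` decides whether the links themselves are reported.
std::vector<fs::path> entriesToClear(const fs::path& root, bool includeSymlinks) {
  std::vector<fs::path> result;
  for (const fs::directory_entry& entry : fs::recursive_directory_iterator(root)) {
    if (entry.is_symlink() && !includeSymlinks) {
      continue;  // The behavior described in the ticket: links are skipped.
    }
    result.push_back(entry.path());
  }
  return result;
}

// Usage example: a scratch directory with one subdirectory and one symlink.
inline void exampleSymlinkWalk() {
  const fs::path root = fs::temp_directory_path() / "xfs_symlink_sketch";
  fs::remove_all(root);
  fs::create_directories(root / "dir");
  fs::create_directory_symlink("dir", root / "link");

  assert(entriesToClear(root, false).size() == 1);  // only "dir"
  assert(entriesToClear(root, true).size() == 2);   // "dir" and "link"

  fs::remove_all(root);
}
```

A clean-up pass built on the second variant would also reach the links, which is the case the ticket says is currently missed.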
[jira] [Commented] (MESOS-7947) Add GC capability to nested containers
[ https://issues.apache.org/jira/browse/MESOS-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553625#comment-16553625 ]

Joseph Wu commented on MESOS-7947:
--

In terms of GC-ing container sandboxes created via the LAUNCH_CONTAINER APIs, I think it will be relatively neat to pass the Agent's GarbageCollector to the Containerizer. The Containerizer is the one with direct access to the sandbox directories (held within the checkpointed {{ContainerConfig}} protobufs) and can schedule GC whenever a container exits, or during recovery. In the future, if we provide a GCPolicy, that information would presumably be checkpointed into the {{ContainerConfig}} too, so it would be better to give the Containerizer access to the GarbageCollector.

This implementation should cover both nested containers and standalone containers. It would also protect against the case where the user/executor forgets to call REMOVE_CONTAINER.

For now, the plan is to defer making framework changes. Instead of adding a boolean or protobuf GCPolicy, I'll add an agent flag to tell the agent to GC non-executor sandboxes by default. I don't have a nice name for this flag yet (currently {{--gc_non_executor_container_sandboxes}}).

---

Additionally, since the default executor (and custom executors) can be long-lived and run many tasks in its lifetime, we'll need to prune some of the Task metadata. This is limited to directories like {{/meta/slaves//frameworks//executors//runs//tasks/}}. This metadata GC will happen for all tasks, and frameworks shouldn't need to change how this works.
> Add GC capability to nested containers
> --
>
> Key: MESOS-7947
> URL: https://issues.apache.org/jira/browse/MESOS-7947
> Project: Mesos
> Issue Type: Improvement
> Components: executor
> Reporter: Chun-Hung Hsiao
> Assignee: Joseph Wu
> Priority: Major
>
> We should extend the existing API or add a new API for nested containers for
> an executor to tell the Mesos agent that a nested container is no longer
> needed and can be scheduled for GC.
[jira] [Commented] (MESOS-7211) Document SUPPRESS HTTP call
[ https://issues.apache.org/jira/browse/MESOS-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553138#comment-16553138 ]

ASF GitHub Bot commented on MESOS-7211:
---

Github user asfgit closed the pull request at:

https://github.com/apache/mesos/pull/301

> Document SUPPRESS HTTP call
> ---
>
> Key: MESOS-7211
> URL: https://issues.apache.org/jira/browse/MESOS-7211
> Project: Mesos
> Issue Type: Documentation
> Components: documentation
> Affects Versions: 1.1.0
> Reporter: Bruce Merry
> Priority: Minor
> Labels: mesosphere, newbie
>
> The documentation at
> http://mesos.apache.org/documentation/latest/scheduler-http-api/ doesn't list
> the SUPPRESS call as one of the call types, but it does seem to be
> implemented.
[jira] [Created] (MESOS-9107) mesos-execute cannot use reserved resources
Benjamin Bannier created MESOS-9107:
---

Summary: mesos-execute cannot use reserved resources
Key: MESOS-9107
URL: https://issues.apache.org/jira/browse/MESOS-9107
Project: Mesos
Issue Type: Bug
Reporter: Benjamin Bannier

{{mesos-execute}} cannot be used to run tasks on reserved resources, e.g., an invocation

{code}
$ mesos-execute \
    --resources='disk(some/role):32;cpus:0.1;mem:32' \
    --command=true \
    --name= \
    --role=some/role \
    --master=
{code}

will not be able to use {{disk}} resources reserved to {{some/role}}.

This is due to the way {{mesos-execute}} performs its offer matching, e.g., as of {{9af920c75d1}}

{code}
if (!launched && offered.toUnreserved().contains(requiredResources)) {
{code}

Any reservations are stripped from offers before matching against the required resources.

We should update {{mesos-execute}} so it can make use of reserved resources.
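Why the match fails can be sketched with a toy resource model. This is not the Mesos `Resources` class: `Resource`, `available`, and the simplified `toUnreserved` and `contains` below are hypothetical stand-ins that track only a name, a role, and an amount. Once reservations are stripped from the offer, nothing in it carries the role `some/role`, so a requirement for reserved disk can no longer be satisfied.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy resource: role "*" means unreserved.
struct Resource {
  std::string name;
  std::string role;
  double amount;
};

using Resources = std::vector<Resource>;

// Strips reservations by moving every resource to the "*" role.
Resources toUnreserved(Resources resources) {
  for (Resource& r : resources) {
    r.role = "*";
  }
  return resources;
}

// Total amount of `name` offered under exactly `role`.
double available(const Resources& offered,
                 const std::string& name,
                 const std::string& role) {
  double total = 0.0;
  for (const Resource& r : offered) {
    if (r.name == name && r.role == role) {
      total += r.amount;
    }
  }
  return total;
}

// True if every required (name, role) pair is covered by the offer.
bool contains(const Resources& offered, const Resources& required) {
  for (const Resource& r : required) {
    if (available(offered, r.name, r.role) < r.amount) {
      return false;
    }
  }
  return true;
}

// Usage example based on the invocation in the ticket.
inline void exampleReservedMatch() {
  const Resources offered = {
      {"disk", "some/role", 32.0}, {"cpus", "*", 0.1}, {"mem", "*", 32.0}};
  const Resources required = offered;

  assert(!contains(toUnreserved(offered), required));  // reserved disk is lost
  assert(contains(offered, required));                 // matches when kept
}
```

In this model, matching against the offer with its reservations intact succeeds, which is the direction the ticket suggests for {{mesos-execute}}.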
[jira] [Commented] (MESOS-9106) Add seccomp filter into containerizer launcher.
[ https://issues.apache.org/jira/browse/MESOS-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552909#comment-16552909 ]

Andrei Budnik commented on MESOS-9106:
--

[https://reviews.apache.org/r/68022/]

> Add seccomp filter into containerizer launcher.
> ---
>
> Key: MESOS-9106
> URL: https://issues.apache.org/jira/browse/MESOS-9106
> Project: Mesos
> Issue Type: Task
> Reporter: Andrei Budnik
> Assignee: Andrei Budnik
> Priority: Major
> Labels: mesosphere
>
> The containerizer launcher should create an instance of the `SeccompFilter`
> class, which will be used to set up and load Seccomp filter rules using the
> given `ContainerSeccompProfile` message.
[jira] [Commented] (MESOS-9035) Implement `linux/seccomp` isolator
[ https://issues.apache.org/jira/browse/MESOS-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552908#comment-16552908 ]

Andrei Budnik commented on MESOS-9035:
--

[https://reviews.apache.org/r/68021/]

> Implement `linux/seccomp` isolator
> --
>
> Key: MESOS-9035
> URL: https://issues.apache.org/jira/browse/MESOS-9035
> Project: Mesos
> Issue Type: Task
> Components: containerization
> Reporter: Andrei Budnik
> Assignee: Andrei Budnik
> Priority: Major
> Labels: mesosphere
>
> The main purpose of this isolator is to prepare a `ContainerSeccompProfile` for
> the containerizer launcher. The `ContainerSeccompProfile` message is generated by
> the isolator from a JSON file that contains a declaration of Seccomp filter
> rules.
> In addition, the seccomp isolator should check that the Linux kernel supports
> Seccomp.
[jira] [Commented] (MESOS-9034) Implement a wrapper class for `libseccomp` API
[ https://issues.apache.org/jira/browse/MESOS-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552905#comment-16552905 ]

Andrei Budnik commented on MESOS-9034:
--

[https://reviews.apache.org/r/68018/]

> Implement a wrapper class for `libseccomp` API
> --
>
> Key: MESOS-9034
> URL: https://issues.apache.org/jira/browse/MESOS-9034
> Project: Mesos
> Issue Type: Task
> Components: containerization
> Reporter: Andrei Budnik
> Assignee: Andrei Budnik
> Priority: Major
> Labels: mesosphere
>
> The main purpose of this class is to translate the `SeccompProfile`
> protobuf into invocations of the `libseccomp` API. The main user of this class is
> the containerizer launcher.
[jira] [Created] (MESOS-9106) Add seccomp filter into containerizer launcher.
Andrei Budnik created MESOS-9106:

Summary: Add seccomp filter into containerizer launcher.
Key: MESOS-9106
URL: https://issues.apache.org/jira/browse/MESOS-9106
Project: Mesos
Issue Type: Task
Reporter: Andrei Budnik
Assignee: Andrei Budnik

The containerizer launcher should create an instance of the `SeccompFilter` class, which will be used to set up and load Seccomp filter rules using the given `ContainerSeccompProfile` message.
[jira] [Created] (MESOS-9105) Implement Docker Seccomp profile parser.
Andrei Budnik created MESOS-9105:

Summary: Implement Docker Seccomp profile parser.
Key: MESOS-9105
URL: https://issues.apache.org/jira/browse/MESOS-9105
Project: Mesos
Issue Type: Task
Reporter: Andrei Budnik
Assignee: Andrei Budnik

The parser should translate a Docker seccomp profile into the `ContainerSeccompProfile` protobuf message.
[jira] [Commented] (MESOS-8990) Build failure of the google-test dependency on Windows using MSVC.
[ https://issues.apache.org/jira/browse/MESOS-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552619#comment-16552619 ]

PhoebeHui commented on MESOS-8990:
--

You can install the latest release of Visual Studio and follow these steps to verify:

# Open x64 native command tool
# git clone -c core.autocrlf=true https://github.com/apache/mesos D:\mesos\src
# set _CL_=/D_HAS_AUTO_PTR_ETC=1 /D_HAS_TR1_NAMESPACE=1 /D_SILENCE_TR1_NAMESPACE_DEPRECATION_WARNING /std:c++latest
# cd d:\Mesos\src
# .\bootstrap.bat
# mkdir build_x64 && pushd build_x64
# cmake ..\src -G "Visual Studio 15 2017 Win64" -DCMAKE_SYSTEM_VERSION=10.0.17134.0 -DENABLE_LIBEVENT=1 -DHAS_AUTHENTICATION=0 -DPATCHEXE_PATH="C:\gnuwin32\bin" -T host=x64
# msbuild Mesos.sln /p:Configuration=Debug /p:Platform=x64 /maxcpucount:4 /t:Rebuild

> Build failure of the google-test dependency on Windows using MSVC.
> ---
>
> Key: MESOS-8990
> URL: https://issues.apache.org/jira/browse/MESOS-8990
> Project: Mesos
> Issue Type: Task
> Components: build
> Reporter: PhoebeHui
> Assignee: Andrew Schwartzmeyer
> Priority: Blocker
> Labels: agent, build, dependency, windows
> Attachments: googletest-release-f66ab00.patch
>
> Building Mesos with MSVC on Windows is currently blocked by the following issue.
> It has been fixed in Googletest; could you help pick the fix up in Mesos?
> The next release of the MSVC toolset will have this behavior.
> See the background in https://github.com/google/googletest/issues/1616 and
> the fix in https://github.com/google/googletest/pull/1620.
>
> The failures look like:
> {noformat}
> d:\mesos\build_x64\3rdparty\googletest-1.8.0\src\googletest-1.8.0\googletest\include\gtest\gtest-printers.h(249,1): error C2593: 'operator <<' is ambiguous [D:\Mesos\build_x64\src\tests\mesos-tests.vcxproj]
> d:\mesos\build_x64\3rdparty\googletest-1.8.0\src\googletest-1.8.0\googletest\include\gtest\gtest-printers.h(249,1): error C2593: *os << value; [D:\Mesos\build_x64\src\tests\mesos-tests.vcxproj]
> d:\mesos\build_x64\3rdparty\googletest-1.8.0\src\googletest-1.8.0\googletest\include\gtest\gtest-printers.h(249,1): error C2593: ^ (compiling source file D:\Mesos\src\src\tests\command_executor_tests.cpp) [D:\Mesos\build_x64\src\tests\mesos-tests.vcxproj]
> {noformat}
[jira] [Assigned] (MESOS-9082) Avoid two trips through the master mailbox for state.json requests.
[ https://issues.apache.org/jira/browse/MESOS-9082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benno Evers reassigned MESOS-9082:
--

Assignee: Benno Evers

> Avoid two trips through the master mailbox for state.json requests.
> ---
>
> Key: MESOS-9082
> URL: https://issues.apache.org/jira/browse/MESOS-9082
> Project: Mesos
> Issue Type: Task
> Reporter: Alexander Rukletsov
> Assignee: Benno Evers
> Priority: Major
> Labels: mesosphere, performance
>
> Currently, a state.json request travels through the master's mailbox twice:
> before authorization and after. This increases the overall state.json
> response time by around 30%.
> To remove one mailbox trip, we can perform the initial portion (validation
> and authorization) of state and /state off the master actor by using a
> top-level {{Route}}, then dispatch onto the master actor only for json /
> protobuf serialization. This should drop the authorization time down to near
> 0 if it's indeed mostly queuing delay.
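The one-trip versus two-trip flow can be sketched with a toy, single-threaded mailbox. This is not libprocess or the Mesos master: `Mailbox`, `authorize`, `stateCurrent`, and `stateProposed` are hypothetical names, and authorization is a placeholder that always succeeds. The sketch only counts how often a request is queued on the actor.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Toy actor mailbox: work is queued, then drained in FIFO order.
class Mailbox {
public:
  void dispatch(std::function<void()> work) {
    ++trips_;
    queue_.push(std::move(work));
  }

  void drain() {
    while (!queue_.empty()) {
      std::function<void()> work = std::move(queue_.front());
      queue_.pop();
      work();  // May dispatch follow-up work onto the same mailbox.
    }
  }

  int trips() const { return trips_; }

private:
  std::queue<std::function<void()>> queue_;
  int trips_ = 0;
};

// Placeholder authorization; always allows the request.
bool authorize() { return true; }

// Current flow: the request is queued once before authorization and
// once more afterwards for serialization.
void stateCurrent(Mailbox& master, bool& served) {
  master.dispatch([&master, &served] {
    if (!authorize()) return;
    master.dispatch([&served] { served = true; });
  });
}

// Proposed flow: validate and authorize off the actor, then queue
// only the serialization step.
void stateProposed(Mailbox& master, bool& served) {
  if (!authorize()) return;
  master.dispatch([&served] { served = true; });
}

// Usage example: both flows serve the request, with different trip counts.
inline void exampleMailboxTrips() {
  Mailbox current, proposed;
  bool servedCurrent = false, servedProposed = false;

  stateCurrent(current, servedCurrent);
  current.drain();
  stateProposed(proposed, servedProposed);
  proposed.drain();

  assert(servedCurrent && servedProposed);
  assert(current.trips() == 2);
  assert(proposed.trips() == 1);
}
```

If the observed overhead is mostly queuing delay, as the ticket suspects, removing the first `dispatch` is where the saving would come from.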