[jira] [Updated] (MESOS-3074) Validate and Persist Quota Request
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3074: --- Sprint: Mesosphere Sprint 15 Validate and Persist Quota Request -- Key: MESOS-3074 URL: https://issues.apache.org/jira/browse/MESOS-3074 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to validate and persist quota requests in the Mesos Master as outlined in the Design Doc: https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2200) bogus docker images result in bad error message to scheduler
[ https://issues.apache.org/jira/browse/MESOS-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2200: --- Sprint: (was: Mesosphere Sprint 15) bogus docker images result in bad error message to scheduler Key: MESOS-2200 URL: https://issues.apache.org/jira/browse/MESOS-2200 Project: Mesos Issue Type: Bug Components: containerization, docker Reporter: Jay Buffington Assignee: Joerg Schad Labels: mesosphere When a scheduler specifies a bogus image in ContainerInfo mesos doesn't tell the scheduler that the docker pull failed or why. This error is logged in the mesos-slave log, but it isn't given to the scheduler (as far as I can tell): {noformat} E1218 23:50:55.406230 8123 slave.cpp:2730] Container '8f70784c-3e40-4072-9ca2-9daed23f15ff' for executor 'thermos-1418946354013-xxx-xxx-curl-0-f500cc41-dd0a-4338-8cbc-d631cb588bb1' of framework '20140522-213145-1749004561-5050-29512-' failed to start: Failed to 'docker pull docker-registry.example.com/doesntexist/hello1.1:latest': exit status = exited with status 1 stderr = 2014/12/18 23:50:55 Error: image doesntexist/hello1.1 not found {noformat} If the docker image is not in the registry, the scheduler should give the user an error message. If docker pull failed because of networking issues, it should be retried. Mesos should give the scheduler enough information to be able to make that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
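The retry-or-fail decision described in MESOS-2200 could be sketched roughly as follows. This is only an illustration under stated assumptions: the `PullFailure` enum and both functions are hypothetical and not part of Mesos, and matching on the text "not found" is based solely on the single log line quoted in the report, so a real implementation would need a more reliable signal.

```cpp
#include <string>

// Hypothetical failure categories for illustration; Mesos does not define
// this enum.
enum class PullFailure { IMAGE_NOT_FOUND, POSSIBLY_TRANSIENT };

// Classifies the stderr text of a failed `docker pull`.
// ASSUMPTION: a missing image is recognizable by the substring "not found",
// as in the log line quoted in the report. Anything else is treated as
// possibly transient (for example a networking problem).
inline PullFailure classifyPullFailure(const std::string& stderrOutput)
{
  if (stderrOutput.find("not found") != std::string::npos) {
    return PullFailure::IMAGE_NOT_FOUND;
  }
  return PullFailure::POSSIBLY_TRANSIENT;
}

// A scheduler would retry only failures that might succeed on a later
// attempt, and report a missing image to the user instead.
inline bool shouldRetry(PullFailure failure)
{
  return failure == PullFailure::POSSIBLY_TRANSIENT;
}
```

With a category like this attached to the failure, the scheduler rather than the slave decides whether to retry or to surface the error to the user.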
[jira] [Updated] (MESOS-2660) ROOT_CGROUPS_Listen test is flaky
[ https://issues.apache.org/jira/browse/MESOS-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-2660: -- Assignee: Artem Harutyunyan (was: Chi Zhang) ROOT_CGROUPS_Listen test is flaky - Key: MESOS-2660 URL: https://issues.apache.org/jira/browse/MESOS-2660 Project: Mesos Issue Type: Bug Reporter: Jie Yu Assignee: Artem Harutyunyan Fix For: 0.23.0 [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest [ RUN ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen Failed to allocate RSS memory: Failed to lock memory, mlock: Resource temporarily unavailable../../../mesos/src/tests/cgroups_tests.cpp:571: Failure Failed to wait 15secs for future [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen (15121 ms) [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest (15121 ms total) [--] Global test environment tear-down [==] 1 test from 1 test case ran. (15174 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2660) ROOT_CGROUPS_Listen test is flaky
[ https://issues.apache.org/jira/browse/MESOS-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635362#comment-14635362 ] Bernd Mathiske commented on MESOS-2660: --- This still fails on Ubuntu 12.04, and not intermittently: it fails every time. ROOT_CGROUPS_Listen test is flaky - Key: MESOS-2660 URL: https://issues.apache.org/jira/browse/MESOS-2660 Project: Mesos Issue Type: Bug Reporter: Jie Yu Assignee: Chi Zhang Fix For: 0.23.0 [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest [ RUN ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen Failed to allocate RSS memory: Failed to lock memory, mlock: Resource temporarily unavailable../../../mesos/src/tests/cgroups_tests.cpp:571: Failure Failed to wait 15secs for future [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen (15121 ms) [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest (15121 ms total) [--] Global test environment tear-down [==] 1 test from 1 test case ran. (15174 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2216) The configure phase breaks with the IBM JVM.
[ https://issues.apache.org/jira/browse/MESOS-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635204#comment-14635204 ] Tony Reix commented on MESOS-2216: -- I've just noticed that, on my Linux Ubuntu Intel PC, I'm using a i386 version of Linux/Java : 32 bits. Maybe it has some impact ? However, trying on a x86_64 machine and JVM with your ykrips github, I've got the same : configure: error: failed to determine linker flags for using Java (bad JAVA_HOME or missing support for your architecture?) reixt@dorado-vm3:~/MESOS/mesos/build$ echo $JAVA_HOME /usr/lib/jvm/ibm-java-x86_64-71 reixt@dorado-vm3:~/MESOS/mesos/build$ java -version java version 1.7.0 Java(TM) SE Runtime Environment (build pxa6470_27sr3-20150415_01(SR3)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux amd64-64 Compressed References 20150407_243189 (JIT enabled, AOT enabled) J9VM - R27_Java727_SR3_20150407_1831_B243189 JIT - tr.r13.java_20150406_89182 GC - R27_Java727_SR3_20150407_1831_B243189_CMPRSS J9CL - 20150407_243189) JCL - 20150414_01 based on Oracle 7u79-b14 The configure phase breaks with the IBM JVM. -- Key: MESOS-2216 URL: https://issues.apache.org/jira/browse/MESOS-2216 Project: Mesos Issue Type: Bug Affects Versions: 0.20.1, 1.0.0 Environment: Ubuntu / x86_64 Reporter: Tony Reix Attachments: MESOS-2216_1.patch, config.log, jniport.h ./configure does not work with IBM JVM, since it looks for a directory: /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server x86_64 /usr/lib/jvm/ibm-java-ppc64le-71/jre/lib/ppc64le/serverPPC64 LE that does not exist for the IBM JVM. Though this directory does exist for Oracle JVM and Open JDK: /usr/lib/jvm/jdk1.7.0_71/jre/lib/amd64/server Oracle JVM /usr/lib/jvm/java-1.7.0-openjdk-amd64/jre/lib/amd64/server OpenJDK However, the files: libjsig.so libjvm.so (3 versions) do exist for IBM JVM. 
Anyway, creating the server directory and copying the files (tried with the 3 versions of libjvm.so) does not fix the issue: checking whether or not we can build with JNI... /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlopen' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlclose' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlerror' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlsym' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dladdr' Something (dlopen, dlclose, dlerror, dlsym, dladdr) is missing in IBM JVM. So, either the configure step relies on a feature that is not in the Java standard but only in the Oracle JVM and OpenJDK, or the IBM JVM lacks part of the Java standard. I'm not an expert about this. So, I'd like Mesos people to experiment with IBM JVM. Maybe there is another solution for this step of the Mesos configure that would work with all 3 JVMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3086: --- Sprint: Mesosphere Sprint 15 Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2216) The configure phase breaks with the IBM JVM.
[ https://issues.apache.org/jira/browse/MESOS-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635219#comment-14635219 ] Tony Reix commented on MESOS-2216: -- However, still on this x86_64 machine, patching the 0.21.0 version from tar.gz DID work ! Attaching the trace to this defect. So, maybe the issue deals with i386 vs x86_64 ? The configure phase breaks with the IBM JVM. -- Key: MESOS-2216 URL: https://issues.apache.org/jira/browse/MESOS-2216 Project: Mesos Issue Type: Bug Affects Versions: 0.20.1, 1.0.0 Environment: Ubuntu / x86_64 Reporter: Tony Reix Attachments: MESOS-2216_1.patch, config.log, jniport.h ./configure does not work with IBM JVM, since it looks for a directory: /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server x86_64 /usr/lib/jvm/ibm-java-ppc64le-71/jre/lib/ppc64le/serverPPC64 LE that does not exist for the IBM JVM. Though this directory does exist for Oracle JVM and Open JDK: /usr/lib/jvm/jdk1.7.0_71/jre/lib/amd64/server Oracle JVM /usr/lib/jvm/java-1.7.0-openjdk-amd64/jre/lib/amd64/server OpenJDK However, the files: libjsig.so libjvm.so (3 versions) do exist for IBM JVM. Anyway, creating the server directory and copying the files (tried with the 3 versions of libjvm.so) does not fix the issue: checking whether or not we can build with JNI... /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlopen' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlclose' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlerror' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlsym' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dladdr' Something (dlopen, dlclose, dlerror, dlsym, dladdr) is missing in IBM JVM. 
So, either the configure step relies on a feature that is not in the Java standard but only in the Oracle JVM and OpenJDK, or the IBM JVM lacks part of the Java standard. I'm not an expert about this. So, I'd like Mesos people to experiment with IBM JVM. Maybe there is another solution for this step of the Mesos configure that would work with all 3 JVMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota
[ https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3073: --- Sprint: Mesosphere Sprint 15 Introduce HTTP endpoints for Quota -- Key: MESOS-3073 URL: https://issues.apache.org/jira/browse/MESOS-3073 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to implement the HTTP endpoints for Quota as outlined in the Design Doc (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2552) C++ Scheduler library should send HTTP Calls to master
[ https://issues.apache.org/jira/browse/MESOS-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-2552: -- Sprint: (was: Mesosphere Sprint 15) C++ Scheduler library should send HTTP Calls to master -- Key: MESOS-2552 URL: https://issues.apache.org/jira/browse/MESOS-2552 Project: Mesos Issue Type: Bug Reporter: Vinod Kone Assignee: Anand Mazumdar Labels: mesosphere Once the scheduler library sends Call messages, we should update it to send Calls as HTTP requests to /call endpoint on master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2660) ROOT_CGROUPS_Listen test is flaky
[ https://issues.apache.org/jira/browse/MESOS-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635363#comment-14635363 ] Bernd Mathiske commented on MESOS-2660: --- [~hartem] contributed this patch that he is working on: https://reviews.apache.org/r/36627/ ROOT_CGROUPS_Listen test is flaky - Key: MESOS-2660 URL: https://issues.apache.org/jira/browse/MESOS-2660 Project: Mesos Issue Type: Bug Reporter: Jie Yu Assignee: Chi Zhang Fix For: 0.23.0 [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest [ RUN ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen Failed to allocate RSS memory: Failed to lock memory, mlock: Resource temporarily unavailable../../../mesos/src/tests/cgroups_tests.cpp:571: Failure Failed to wait 15secs for future [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen (15121 ms) [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest (15121 ms total) [--] Global test environment tear-down [==] 1 test from 1 test case ran. (15174 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2673) Follow Google Style Guide for header file include order completely.
[ https://issues.apache.org/jira/browse/MESOS-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2673: --- Sprint: Mesosphere Sprint 10, Mesosphere Sprint 11 (was: Mesosphere Sprint 10, Mesosphere Sprint 11, Mesosphere Sprint 15) Follow Google Style Guide for header file include order completely. --- Key: MESOS-2673 URL: https://issues.apache.org/jira/browse/MESOS-2673 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Priority: Minor Labels: mesosphere Fix For: 0.24.0 The Mesos header include order follows the Google Style Guide but omits step 1, and the Mesos style guide does not mention this exception. This proposal suggests adopting the full include order from the Google Style Guide, i.e., including each header first in the .cpp file that implements it. A gist of the proposal can be found here: https://gist.github.com/joerg84/65cb9611d24b2e35b69b The corresponding Review Board review can be found here: https://reviews.apache.org/r/33646/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2216) The configure phase breaks with the IBM JVM.
[ https://issues.apache.org/jira/browse/MESOS-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tony Reix updated MESOS-2216: - Attachment: x86_64_traces Traces from Ubuntu x86_64. The configure phase breaks with the IBM JVM. -- Key: MESOS-2216 URL: https://issues.apache.org/jira/browse/MESOS-2216 Project: Mesos Issue Type: Bug Affects Versions: 0.20.1, 1.0.0 Environment: Ubuntu / x86_64 Reporter: Tony Reix Attachments: MESOS-2216_1.patch, config.log, jniport.h, x86_64_traces ./configure does not work with IBM JVM, since it looks for a directory: /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server x86_64 /usr/lib/jvm/ibm-java-ppc64le-71/jre/lib/ppc64le/serverPPC64 LE that does not exist for the IBM JVM. Though this directory does exist for Oracle JVM and Open JDK: /usr/lib/jvm/jdk1.7.0_71/jre/lib/amd64/server Oracle JVM /usr/lib/jvm/java-1.7.0-openjdk-amd64/jre/lib/amd64/server OpenJDK However, the files: libjsig.so libjvm.so (3 versions) do exist for IBM JVM. Anyway, creating the server directory and copying the files (tried with the 3 versions of libjvm.so) does not fix the issue: checking whether or not we can build with JNI... /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlopen' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlclose' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlerror' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dlsym' /usr/lib/jvm/ibm-java-x86_64-71/jre/lib/amd64/server/libjvm.so: undefined reference to `dladdr' Something (dlopen, dlclose, dlerror, dlsym, dladdr) is missing in IBM JVM. So, either the configure step relies on a feature that is not in the Java standard but only in the Oracle JVM and OpenJDK, or the IBM JVM lacks part of the Java standard. I'm not an expert about this. So, I'd like Mesos people to experiment with IBM JVM. 
Maybe there is another solution for this step of the Mesos configure that would work with all 3 JVMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3074) Validate and Persist Quota Request
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-3074: -- Assignee: Joerg Schad Validate and Persist Quota Request -- Key: MESOS-3074 URL: https://issues.apache.org/jira/browse/MESOS-3074 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to validate and persist quota requests in the Mesos Master as outlined in the Design Doc: https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota
[ https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-3073: -- Assignee: Joerg Schad Introduce HTTP endpoints for Quota -- Key: MESOS-3073 URL: https://issues.apache.org/jira/browse/MESOS-3073 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to implement the HTTP endpoints for Quota as outlined in the Design Doc (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3107) Define CMake style guide
[ https://issues.apache.org/jira/browse/MESOS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3107: Sprint: Mesosphere Sprint 15 Story Points: 3 Define CMake style guide Key: MESOS-3107 URL: https://issues.apache.org/jira/browse/MESOS-3107 Project: Mesos Issue Type: Task Components: build Reporter: Alex Clemmer Assignee: Alex Clemmer Labels: build, cmake
The short story is that it is important to be principled about how the CMake build system is maintained, because the CMake language makes it difficult to statically verify that a configuration is correct. CMake is not unique in this regard (make is arguably even worse), but it is something that's important to make sure we get right.
The longer story is that CMake's language is dynamically scoped and often has somewhat odd defaults for variable values (_e.g._, IIRC, target names passed to ExternalProject_Add default to PREFIX instead of erroring out). This means that it is rare to get a configuration-time error (_i.e._, CMake usually doesn't say something like "hey, this variable isn't defined"), and in large projects, this can make it very difficult to know where definitions come from, or whether it's important that one config routine runs before another. Dynamic scoping also makes it particularly easy to write spaghetti code, which is clearly undesirable for something as important as a build system.
Thus, it is particularly important that we lay down our expectations for how the CMake system is to be structured. This might include:
* Function naming (_e.g._, making it easy to tell whether a function was defined by us, and where it was defined; so we might say that we want our functions to begin with an underscore and then the name of the package they come from, like libprocess, so that we know where to look for the definition.)
* What assertions we want to check variable values against, so that we can replace subtle errors (_e.g._, a library is accidentally named something silly like PREFIX.0.0.1) with obvious ones (_e.g._, "You have failed to define your target name, so CMake has defaulted to 'PREFIX'; please check your configuration routines").
* Decisions of what goes where. (_e.g._, the most complex parts of the CMake MVP are in the configuration routines, like `MesosConfigure.cmake`; to curb this, we should have strict rules about what goes in that file vs other files, and how we know what is to be run before what. Part of this should probably be prominent comments explaining the structure of the project, so that people aren't confused!)
* And so on.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3106) Extend CMake build system to support building against third-party libraries from either the system or the local Mesos rebundling
[ https://issues.apache.org/jira/browse/MESOS-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-3106: Sprint: Mesosphere Sprint 15 Story Points: 5 Extend CMake build system to support building against third-party libraries from either the system or the local Mesos rebundling Key: MESOS-3106 URL: https://issues.apache.org/jira/browse/MESOS-3106 Project: Mesos Issue Type: Task Components: build Reporter: Alex Clemmer Assignee: Alex Clemmer Labels: build, cmake Currently Mesos has third-party dependencies of two types: (1) those that are expected to be on the system (such as APR, libsvn, _etc_.), and (2) those that have been historically bundled as tarballs inside the Mesos repository, and are not expected to be on the system when Mesos is installed (these are located in the `3rdparty/` directory, and include things like boost and glog). For type (2), the MVP of the CMake-based build system will always pull down a fresh tarball from an external source, instead of using the bundled tarballs in the `3rdparty/` folder. However, many CI systems do not have Internet access, so in the long term, we need to provide several options for getting these dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-898) Introduce CMake as an alternative build system.
[ https://issues.apache.org/jira/browse/MESOS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Hindman updated MESOS-898: --- Labels: build mesosphere (was: build) Introduce CMake as an alternative build system. --- Key: MESOS-898 URL: https://issues.apache.org/jira/browse/MESOS-898 Project: Mesos Issue Type: Epic Components: build Reporter: Timothy St. Clair Assignee: Alex Clemmer Labels: build, mesosphere
This is a rather substantial undertaking, so I would want upstream debate+buy-in prior to full commitment. The basic premise is: upstream rebundles several of its dependencies, in part to tightly control its stack. This is not out of the norm, but in order to be picked up by distribution channels it needs to be built against system dependencies, and rebundling is strictly forbidden. Given that the primary target platforms for Mesos are data-center distributions such as RHEL/CENTOS/SL, it makes sense to still have bundling support for those who do not have dependencies in their channels yet. This is where cmake can be a win with its uber macros (http://www.cmake.org/cmake/help/v2.8.8/cmake.html#module:ExternalProject). I do not know of any equivalent in the autotools world, other than to brew your own solution. I've done this type of work in the past, and completely transformed condor, and would leverage a lot of the work that was done there. I currently have a tracking branch where I've started this work, but before I go off into the woods, it makes sense to have a debate in public.
The primary benefits are:
1. Enable downstream channels to easily distro without carrying large patch sets.
2. Still support existing non-proper distribution methods.
3. Harden / future proof dependent interfaces.
Side Benefits: Audit current build mechanics.
- Presently the language specific bindings are not installed. (.py .jar)
- make -jX currently fails
- optionally look into arm support.
Costs:
1. Time
2. Potential temporary destabilization
3. Infrastructure around build+test may need to change.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3114) Simplify JSON::* by providing jsonify along the lines of stringify
Kapil Arya created MESOS-3114: - Summary: Simplify JSON::* by providing jsonify along the lines of stringify Key: MESOS-3114 URL: https://issues.apache.org/jira/browse/MESOS-3114 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya We want to be able to do things like:
{code}
JSON::Value number1 = 25;
JSON::Number number2 = 26;
EXPECT_NE(number1, number2);
EXPECT_EQ(jsonify(25), number1);
EXPECT_EQ(jsonify(26), number2);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
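One way to picture the request is a set of `jsonify` overloads, one per input type, in the same spirit as `stringify`. The sketch below is not stout's implementation: the `json` namespace, its `Value` struct, and every overload are hypothetical stand-ins chosen only to keep the example self-contained.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a JSON value; NOT stout's JSON::Value.
namespace json {

struct Value
{
  enum class Kind { Number, String, Boolean };

  Kind kind;
  double number;       // Meaningful only when kind == Kind::Number.
  std::string string;  // Meaningful only when kind == Kind::String.
  bool boolean;        // Meaningful only when kind == Kind::Boolean.
};

// Two values are equal when they have the same kind and the same payload.
inline bool operator==(const Value& a, const Value& b)
{
  if (a.kind != b.kind) return false;
  switch (a.kind) {
    case Value::Kind::Number:  return a.number == b.number;
    case Value::Kind::String:  return a.string == b.string;
    case Value::Kind::Boolean: return a.boolean == b.boolean;
  }
  return false;
}

inline bool operator!=(const Value& a, const Value& b) { return !(a == b); }

} // namespace json

// One overload per supported input type, mirroring how stringify is used.
inline json::Value jsonify(double d)
{
  return json::Value{json::Value::Kind::Number, d, "", false};
}

inline json::Value jsonify(int i) { return jsonify(static_cast<double>(i)); }

inline json::Value jsonify(bool b)
{
  return json::Value{json::Value::Kind::Boolean, 0.0, "", b};
}

inline json::Value jsonify(const std::string& s)
{
  return json::Value{json::Value::Kind::String, 0.0, s, false};
}

inline json::Value jsonify(const char* s) { return jsonify(std::string(s)); }
```

The separate `const char*` overload matters: without it, a string literal would convert to `bool` before it converts to `std::string`, and `jsonify("x")` would silently produce a boolean.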
[jira] [Commented] (MESOS-3098) Implement WindowsContainerizer and WindowsDockerContainerizer
[ https://issues.apache.org/jira/browse/MESOS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635466#comment-14635466 ] Alex Clemmer commented on MESOS-3098: - Depends what you mean by the Windows Container API. The Windows folks have actually ported over the Docker engine itself. The goal of that API is to be identical; whether they accomplished that, we will see. As for the magic the DockerContainerizer does to the cgroups, I didn't realize this was so involved. If we want to have something to show people before MesosCon, we should probably loop you in with the Windows Container people immediately to see if there are going to be any problems. Implement WindowsContainerizer and WindowsDockerContainerizer - Key: MESOS-3098 URL: https://issues.apache.org/jira/browse/MESOS-3098 Project: Mesos Issue Type: Task Components: containerization Reporter: Joseph Wu Assignee: Alex Clemmer Labels: mesosphere The MVP for Windows support is a containerizer that (1) runs on Windows, and (2) runs and passes all the tests that are relevant to the Windows platform (_e.g._, not the tests that involve cgroups). To do this we require at least a `WindowsContainerizer` (to be implemented alongside the `MesosContainerizer`), which provides no meaningful (_e.g._) process namespacing (much like the default unix containerizer). In the long term (hopefully before MesosCon) we want to support also the Windows container API. This will require implementing a separate containerizer, maybe called `WindowsDockerContainerizer`. Since the Windows container API is actually officially supported through the Docker interface (_i.e._, MSFT actually ported the Docker engine to Windows, and that is the official API), the interfaces (like the fetcher) shouldn't change much. The tests probably will have to change, as we don't have access to any isolation primitives like cgroups for those tests. 
Outstanding TODO([~hausdorff]): Flesh out this description when more details are available, regarding: * The container API for Windows (when we know them) * The nuances of Windows vs Linux (when we know them) * etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3115) Convert mesos::slave::{Limitation,ExecutorRunState} into protobufs.
[ https://issues.apache.org/jira/browse/MESOS-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3115: -- Labels: mesosphere (was: ) Convert mesos::slave::{Limitation,ExecutorRunState} into protobufs. --- Key: MESOS-3115 URL: https://issues.apache.org/jira/browse/MESOS-3115 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2960) Configure DiscoveryInfo and Visibility per port
[ https://issues.apache.org/jira/browse/MESOS-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2960: -- Fix Version/s: (was: 0.24.0) Configure DiscoveryInfo and Visibility per port --- Key: MESOS-2960 URL: https://issues.apache.org/jira/browse/MESOS-2960 Project: Mesos Issue Type: Improvement Components: general Affects Versions: 0.22.1 Reporter: Frank Scholten Labels: mesosphere For Mesos Elasticsearch I would like to use DiscoveryInfo to advertise the client port (9200) with Visibility=EXTERNAL so it can be discovered by load balancers, while advertising the transport port (9300) as Visibility=FRAMEWORK because it is used by nodes to talk to each other and should not be load balanced. However, I can only set one DiscoveryInfo and one visibility, instead of one per port. I suggest allowing multiple DiscoveryInfos to be configured, each with its own visibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
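The per-port visibility requested in MESOS-2960 might look something like the sketch below. The `PortDiscovery` struct and both functions are hypothetical and do not reflect the actual Mesos DiscoveryInfo protobuf; only the two visibility values and the two port numbers come from the report.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration; NOT the Mesos DiscoveryInfo protobuf.
// Only the two visibility values named in the report are modeled here.
enum class Visibility { FRAMEWORK, EXTERNAL };

// One entry per advertised port, each carrying its own visibility.
struct PortDiscovery
{
  std::string name;
  std::uint32_t number;
  Visibility visibility;
};

// Returns the port numbers advertised with the requested visibility, e.g.
// what a load balancer interested only in EXTERNAL ports would pick up.
inline std::vector<std::uint32_t> portsWithVisibility(
    const std::vector<PortDiscovery>& ports,
    Visibility wanted)
{
  std::vector<std::uint32_t> result;
  for (const PortDiscovery& port : ports) {
    if (port.visibility == wanted) {
      result.push_back(port.number);
    }
  }
  return result;
}

// The Elasticsearch case from the report: client port 9200 is external,
// transport port 9300 is framework-internal.
inline std::vector<PortDiscovery> elasticsearchPorts()
{
  return {
    {"client", 9200, Visibility::EXTERNAL},
    {"transport", 9300, Visibility::FRAMEWORK},
  };
}
```

Filtering `elasticsearchPorts()` by `Visibility::EXTERNAL` leaves only port 9200, so the transport port never reaches a load balancer.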
[jira] [Updated] (MESOS-3061) Expose docker container IP in Master's state.json
[ https://issues.apache.org/jira/browse/MESOS-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3061: -- Summary: Expose docker container IP in Master's state.json (was: Expose docker inspect output in Master's state.json) Expose docker container IP in Master's state.json - Key: MESOS-3061 URL: https://issues.apache.org/jira/browse/MESOS-3061 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Priority: Critical Labels: mesosphere We want to expose docker container IP to Mesos-DNS. One potential solution is to make it available via Master's state.json. We can set a label docker_inspect in TaskStatus message (when it is sent the first time with TASK_RUNNING status). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3061) Expose docker container IP in Master's state.json
[ https://issues.apache.org/jira/browse/MESOS-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3061: -- Description: We want to expose docker container IP to Mesos-DNS. One potential solution is to make it available via Master's state.json. We can set a label Docker.NetworkSettings.IPAddress in TaskStatus message (when it is sent the first time with TASK_RUNNING status). (was: We want to expose docker container IP to Mesos-DNS. One potential solution is to make it available via Master's state.json. We can set a label docker_inspect in TaskStatus message (when it is sent the first time with TASK_RUNNING status).) Expose docker container IP in Master's state.json - Key: MESOS-3061 URL: https://issues.apache.org/jira/browse/MESOS-3061 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Priority: Critical Labels: mesosphere We want to expose docker container IP to Mesos-DNS. One potential solution is to make it available via Master's state.json. We can set a label Docker.NetworkSettings.IPAddress in TaskStatus message (when it is sent the first time with TASK_RUNNING status). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3093) Support HTTPS requests in libprocess
[ https://issues.apache.org/jira/browse/MESOS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jojy Varghese updated MESOS-3093: - Shepherd: Timothy Chen Sprint: Mesosphere Sprint 15 Story Points: 3 Support HTTPS requests in libprocess Key: MESOS-3093 URL: https://issues.apache.org/jira/browse/MESOS-3093 Project: Mesos Issue Type: Improvement Reporter: Lily Chen Assignee: Jojy Varghese Labels: mesosphere In order to pull images from Docker registries, https calls are needed to securely communicate with the registry hosts. Currently, only http requests are supported through libprocess. Now that SSL sockets are available through libprocess, support for https can be added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3115) Convert mesos::slave::{Limitation,ExecutorRunState} into protobufs.
Kapil Arya created MESOS-3115: - Summary: Convert mesos::slave::{Limitation,ExecutorRunState} into protobufs. Key: MESOS-3115 URL: https://issues.apache.org/jira/browse/MESOS-3115 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3117) Pass ContainerId into `slaveExecutorEnvironmentDecorator` hook
[ https://issues.apache.org/jira/browse/MESOS-3117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3117: -- Labels: mesosphere (was: ) Pass ContainerId into `slaveExecutorEnvironmentDecorator` hook -- Key: MESOS-3117 URL: https://issues.apache.org/jira/browse/MESOS-3117 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-3086: --- Description: We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. was: We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2673) Follow Google Style Guide for header file include order completely.
[ https://issues.apache.org/jira/browse/MESOS-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2673: -- Target Version/s: 0.24.0 Fix Version/s: (was: 0.24.0) Follow Google Style Guide for header file include order completely. --- Key: MESOS-2673 URL: https://issues.apache.org/jira/browse/MESOS-2673 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Priority: Minor Labels: mesosphere The header include order for Mesos actually follows the Google Styleguide but omits step 1 without mentioning this exception in the Mesos styleguide. This proposal suggests to adapt to the include order explained in the Google Styleguide i.e. include the direct headers first in the .cpp files implementing them. A gist of the proposal can be found here: https://gist.github.com/joerg84/65cb9611d24b2e35b69b The corresponding Review Board review can be found here: https://reviews.apache.org/r/33646/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3113) Add resource usage section to containerizer documentation
Niklas Quarfot Nielsen created MESOS-3113: - Summary: Add resource usage section to containerizer documentation Key: MESOS-3113 URL: https://issues.apache.org/jira/browse/MESOS-3113 Project: Mesos Issue Type: Documentation Reporter: Niklas Quarfot Nielsen Currently, the containerizer documentation doesn't touch upon the usage() API and how to interpret the collected statistics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633171#comment-14633171 ] Joerg Schad edited comment on MESOS-3086 at 7/21/15 5:23 PM: - Prior cleanUp: https://reviews.apache.org/r/36612/ Adding of TasksKiller: https://reviews.apache.org/r/36620/ was (Author: js84): https://reviews.apache.org/r/36612/ Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3039) Allow executors binding IP to be different than Slave binding IP
[ https://issues.apache.org/jira/browse/MESOS-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3039: -- Description: Currently, the Slave will bind either to the loopback IP (127.0.0.1) or to the IP passed via the '--ip' flag. When it launches a containerized executor (e.g., via Mesos Containerizer), the executor inherits the binding IP of the Slave. This is due to the fact that the '--ip' flag sets the environment variable `LIBPROCESS_IP` to the passed IP. The executor then inherits this environment variable and is forced to bind to the Slave IP. If an executor is running in its own containerized environment, with a separate IP than that of the Slave, currently there is no way of forcing it to bind to its own IP. A potential solution is to use the executor environment decorator hooks to update LIBPROCESS_IP environment variable for the executor. was: Currently, the Slave will bind either to the loopback IP (127.0.0.1) or to the IP passed via the `--ip` flag. When it launches a containerized executor (e.g., via Mesos Containerizer), the executor inherits the binding IP of the Slave. This is due to the fact that the `--ip` flag sets the environment variable `LIBPROCESS_IP` to the passed IP. The executor then inherits this environment variable and is forced to bind to the Slave IP. If an executor is running in its own containerized environment, with a separate IP than that of the Slave, currently there is no way of forcing it to bind to its own IP. A potential solution is to use the executor environment decorator hooks to update LIBPROCESS_IP environment variable for the executor. Allow executors binding IP to be different than Slave binding IP Key: MESOS-3039 URL: https://issues.apache.org/jira/browse/MESOS-3039 Project: Mesos Issue Type: Bug Reporter: Kapil Arya Assignee: Kapil Arya Priority: Critical Currently, the Slave will bind either to the loopback IP (127.0.0.1) or to the IP passed via the '--ip' flag. 
When it launches a containerized executor (e.g., via Mesos Containerizer), the executor inherits the binding IP of the Slave. This is due to the fact that the '--ip' flag sets the environment variable `LIBPROCESS_IP` to the passed IP. The executor then inherits this environment variable and is forced to bind to the Slave IP. If an executor is running in its own containerized environment, with a separate IP than that of the Slave, currently there is no way of forcing it to bind to its own IP. A potential solution is to use the executor environment decorator hooks to update LIBPROCESS_IP environment variable for the executor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
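The potential solution described in MESOS-3039 above could look roughly like the following. This is only a minimal sketch: `decorateExecutorEnvironment` and its parameters are hypothetical names chosen for illustration and are not the actual Mesos hook signature. It models the executor environment as a plain map and overrides `LIBPROCESS_IP` when a container IP is known.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an executor environment decorator hook (not the
// real Mesos hook API). Takes the environment the executor would inherit from
// the Slave and returns a copy in which LIBPROCESS_IP is the container's IP.
std::map<std::string, std::string> decorateExecutorEnvironment(
    std::map<std::string, std::string> environment,
    const std::string& containerIp)
{
  // Only override when the container actually has its own IP; otherwise the
  // executor keeps the value inherited from the Slave.
  if (!containerIp.empty()) {
    environment["LIBPROCESS_IP"] = containerIp;
  }
  return environment;
}
```

With this shape, an executor in a container with its own address would bind to that address, while executors without one keep the Slave's value unchanged.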
[jira] [Commented] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635457#comment-14635457 ] Vinod Kone commented on MESOS-3086: --- I don't follow the description. Freezer should be able to freeze any running tasks allowing us to remove the cgroups. Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3116) Pass ExecutorInfo argument into Isolator::isolate().
Kapil Arya created MESOS-3116: - Summary: Pass ExecutorInfo argument into Isolator::isolate(). Key: MESOS-3116 URL: https://issues.apache.org/jira/browse/MESOS-3116 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Some isolators need to lookup the executor environment variables to customize their isolation needs. Currently, one has to use the prepare() call to cache the executor-info to use it later during isolate() call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3114) Simplify JSON::* by providing jsonify along the lines of stringify
[ https://issues.apache.org/jira/browse/MESOS-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3114: -- Labels: mesosphere (was: ) Simplify JSON::* by providing jsonify along the lines of stringify -- Key: MESOS-3114 URL: https://issues.apache.org/jira/browse/MESOS-3114 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere We want to be able to do things like: {code} JSON::Value number1 = 25; JSON::Number number2 = 26; EXPECT_NE(number1, number2); EXPECT_EQ(jsonify(12), number1); EXPECT_EQ(jsonify(12), number2); {/code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635490#comment-14635490 ] Joerg Schad edited comment on MESOS-3086 at 7/21/15 5:57 PM: - Yes as long as the freezer subcomponent is available all is good. Just in case that is not present we fall back to a very simple mechanism which fails if there are still tasks present (see https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728). was (Author: js84): Yes as long as the freezer subcomponent is available all is good. Just in case that is not present we fall back to a very simple mechanism which fails if there are still tasks present (see https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp#L1725). Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. In the current code (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we will fall back to a very simple mechanism of recursively trying to remove the cgroups, which fails if there are still tasks running. Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely on the freezer subsystem. This problem caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2995) Standardize use of Path
[ https://issues.apache.org/jira/browse/MESOS-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2995: -- Fix Version/s: (was: 0.24.0) Standardize use of Path Key: MESOS-2995 URL: https://issues.apache.org/jira/browse/MESOS-2995 Project: Mesos Issue Type: Improvement Components: stout Reporter: Joseph Wu Assignee: Joseph Wu Priority: Minor Labels: mesosphere, newbie, stout As per the discussion in MESOS-2965, the use of the Path object should be standardized: * Functions which effectively use Paths (as strings) should instead take Paths. * Functions which modify and return Paths (as strings) should instead return Paths. * Extraneous uses of Path.value should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3098) Implement WindowsContainerizer and WindowsDockerContainerizer
[ https://issues.apache.org/jira/browse/MESOS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635413#comment-14635413 ] Timothy Chen commented on MESOS-3098: - Is the Windows Container API 100% compatible with the Docker remote API? Docker remote API changes from time to time so not sure what's being supported. But basically DockerContainerizer goes behind the Dockerd to update cgroup limits directly, and we need similar capability from Windows Container API as well. And I think it's good to map out all the calls we make from Mesos - Docker and make sure they are all there, such as logs, kill, remove, ps -a, ideally also labels, etc. Implement WindowsContainerizer and WindowsDockerContainerizer - Key: MESOS-3098 URL: https://issues.apache.org/jira/browse/MESOS-3098 Project: Mesos Issue Type: Task Components: containerization Reporter: Joseph Wu Assignee: Alex Clemmer Labels: mesosphere The MVP for Windows support is a containerizer that (1) runs on Windows, and (2) runs and passes all the tests that are relevant to the Windows platform (_e.g._, not the tests that involve cgroups). To do this we require at least a `WindowsContainerizer` (to be implemented alongside the `MesosContainerizer`), which provides no meaningful (_e.g._) process namespacing (much like the default unix containerizer). In the long term (hopefully before MesosCon) we want to support also the Windows container API. This will require implementing a separate containerizer, maybe called `WindowsDockerContainerizer`. Since the Windows container API is actually officially supported through the Docker interface (_i.e._, MSFT actually ported the Docker engine to Windows, and that is the official API), the interfaces (like the fetcher) shouldn't change much. The tests probably will have to change, as we don't have access to any isolation primitives like cgroups for those tests. 
Outstanding TODO([~hausdorff]): Flesh out this description when more details are available, regarding: * The container API for Windows (when we know them) * The nuances of Windows vs Linux (when we know them) * etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-3086: --- Description: We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. was:We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running). Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3117) Pass ContainerId into `slaveExecutorEnvironmentDecorator` hook
[ https://issues.apache.org/jira/browse/MESOS-3117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3117: -- Story Points: 1 Pass ContainerId into `slaveExecutorEnvironmentDecorator` hook -- Key: MESOS-3117 URL: https://issues.apache.org/jira/browse/MESOS-3117 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad updated MESOS-3086: --- Description: We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. In the current code (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we will fall back to a very simple mechanism of recursively trying to remove the cgroups, which fails if there are still tasks running. Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely on the freezer subsystem. This problem caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. was: We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. In the current code (https://github.com/apache/mesos/blob/0.22.1/src/linux/cgroups.cpp#L1728) we will fall back to a very simple mechanism of recursively trying to remove the cgroups, which fails if there are still tasks running. 
Therefore we need an additional (NonFreeze)TasksKiller which doesn't rely on the freezer subsystem. This problem caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
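One way a killer that does not rely on the freezer might work is sketched below. This is an assumption-laden illustration, not the implementation under review: `parsePids` and `killTasksWithoutFreezer` are hypothetical names, and the sketch assumes the cgroup exposes a `tasks` file with one PID per line. Because the tasks are not frozen, they can fork between a read and the kill, so the sketch re-reads the file a bounded number of times instead of trusting a single pass.

```cpp
#include <signal.h>
#include <sys/types.h>

#include <chrono>
#include <fstream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Parses the contents of a cgroup `tasks` file: whitespace-separated PIDs,
// normally one per line. Parsing stops at the first non-numeric token.
std::vector<pid_t> parsePids(const std::string& contents)
{
  std::vector<pid_t> pids;
  std::istringstream in(contents);
  long pid;
  while (in >> pid) {
    pids.push_back(static_cast<pid_t>(pid));
  }
  return pids;
}

// Hypothetical freezer-free killer. Reads `tasksPath` up to `maxAttempts`
// times, sending SIGKILL to every listed PID after each read. Returns true
// once a read shows no tasks; returns false if the file cannot be opened or
// tasks were still listed on the last read.
bool killTasksWithoutFreezer(const std::string& tasksPath, int maxAttempts)
{
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    std::ifstream file(tasksPath);
    if (!file) {
      return false;
    }

    std::stringstream buffer;
    buffer << file.rdbuf();

    const std::vector<pid_t> pids = parsePids(buffer.str());
    if (pids.empty()) {
      return true;
    }

    for (pid_t pid : pids) {
      ::kill(pid, SIGKILL);  // Errors (e.g. ESRCH) are ignored in this sketch.
    }

    // Give the kernel a moment to reap the tasks before re-reading.
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
  return false;
}
```

Only once this returns true would the recursive cgroup removal be expected to succeed; a real implementation would also need to handle nested cgroups and report errors.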
[jira] [Created] (MESOS-3117) Pass ContainerId into `slaveExecutorEnvironmentDecorator` hook
Kapil Arya created MESOS-3117: - Summary: Pass ContainerId into `slaveExecutorEnvironmentDecorator` hook Key: MESOS-3117 URL: https://issues.apache.org/jira/browse/MESOS-3117 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3086) Create cgroups TasksKiller for non freeze subsystems.
[ https://issues.apache.org/jira/browse/MESOS-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635490#comment-14635490 ] Joerg Schad commented on MESOS-3086: Yes as long as the freezer subcomponent is available all is good. Just in case that is not present we fall back to a very simple mechanism which fails if there are still tasks present (see https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp#L1725). Create cgroups TasksKiller for non freeze subsystems. - Key: MESOS-3086 URL: https://issues.apache.org/jira/browse/MESOS-3086 Project: Mesos Issue Type: Bug Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We have a number of test issues when we cannot remove cgroups (in case there are still related tasks running) in cases where the freezer subsystem is not available. Therefore we need a TasksKiller which doesn't rely on the freezer subsystem. This caused issues when running 'sudo make check' during 0.23 release testing, where BenH provided already a better error message with b1a23d6a52c31b8c5c840ab01902dbe00cb1feef / https://reviews.apache.org/r/36604. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2061) Add InverseOffer protobuf message.
[ https://issues.apache.org/jira/browse/MESOS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636348#comment-14636348 ] Adam B commented on MESOS-2061: --- This protobuf message seems heavy with fields we won't use yet. Can we strip this down to the minimum number of fields to support the basic feature? Seems like Unavailability should be optional, and executor_ids should be repeated. I don't think we should even include task_ids/executor_ids until we have a real Preemption API/design. Add InverseOffer protobuf message. -- Key: MESOS-2061 URL: https://issues.apache.org/jira/browse/MESOS-2061 Project: Mesos Issue Type: Task Reporter: Benjamin Mahler Assignee: Joseph Wu Labels: mesosphere InverseOffer was defined as part of the maintenance work in MESOS-1474, design doc here: https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit?usp=sharing {code} // A request to deallocate or return any resources already // being consumed by the framework. message InverseOffer { required OfferID id = 1; required FrameworkID framework_id = 2; repeated Resource resources = 3; // The slave ID if the resources need to be released on a particular slave. optional SlaveID slave_id = 4; // The executor and task IDs if the resources need to be released on specific // executors and/or tasks. optional ExecutorID executor_id = 5; repeated TaskID task_ids = 6; // The resources specified in this offer will become unavailable // at the specified start time and for the specified duration. Any // tasks running using these resources might get killed when // these resources become unavailable. required Unavailability unavailability = 7; } {code} This ticket is to capture the addition of the InverseOffer protobuf to mesos.proto, the necessary API changes for Event/Call and the language bindings will be tracked separately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2578) Add '{' on newline for function declarations in style checker
[ https://issues.apache.org/jira/browse/MESOS-2578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14636175#comment-14636175 ] José Guilherme Vanz commented on MESOS-2578: Hi! I would like to start contributing to the project. I was thinking about beginning with this issue. What do you think? Add '{' on newline for function declarations in style checker - Key: MESOS-2578 URL: https://issues.apache.org/jira/browse/MESOS-2578 Project: Mesos Issue Type: Improvement Reporter: Niklas Quarfot Nielsen Priority: Trivial Similar to MESOS-2577; another common style mistake is to not move curly braces to a new line for function and class declarations:
{code}
class Foo {
  void bar() {
    ...
  }
};
{code}
vs
{code}
class Foo
{
  void bar()
  {
    ...
  }
};
{code}
This should be easy to check with our style checker too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
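The check proposed in MESOS-2578 could be approximated line by line with regular expressions. The sketch below is not the project's style checker; `braceShouldBeOnNewline` is a hypothetical helper, and the patterns are a rough heuristic that can misfire on lambdas or multi-line declarations. It flags class, struct, and function declarations that end with `{` on the same line, while leaving control-flow statements alone.

```cpp
#include <regex>
#include <string>

// Returns true when `line` looks like a class/struct or function declaration
// that keeps its opening brace on the same line. Heuristic only.
bool braceShouldBeOnNewline(const std::string& line)
{
  // Control-flow statements are allowed to keep the brace on the same line.
  static const std::regex control(
      R"(^\s*(\}\s*)?(if|else|for|while|switch|do|try|catch)\b.*)");

  // Matches e.g. `class Foo {` or `struct Foo : public Bar {`.
  static const std::regex type(R"(^\s*(class|struct)\s+\w+[^;{]*\{\s*$)");

  // Matches e.g. `void bar() {` or `int Foo::baz(int x) const {`.
  static const std::regex function(
      R"(^\s*[\w:<>~*&,\s]+\([^;]*\)\s*(const)?\s*\{\s*$)");

  if (std::regex_match(line, control)) {
    return false;
  }

  return std::regex_match(line, type) || std::regex_match(line, function);
}
```

A checker built on this would run the function over every source line and report the line numbers where it returns true.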
[jira] [Updated] (MESOS-2660) ROOT_CGROUPS_Listen test is flaky
[ https://issues.apache.org/jira/browse/MESOS-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-2660: - Sprint: Twitter Q2 Sprint 2, Mesosphere Sprint 15 (was: Twitter Q2 Sprint 2) Story Points: 3 (was: 1) Labels: mesosphere (was: ) ROOT_CGROUPS_Listen test is flaky - Key: MESOS-2660 URL: https://issues.apache.org/jira/browse/MESOS-2660 Project: Mesos Issue Type: Bug Reporter: Jie Yu Assignee: Artem Harutyunyan Labels: mesosphere Fix For: 0.23.0 [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest [ RUN ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen Failed to allocate RSS memory: Failed to lock memory, mlock: Resource temporarily unavailable../../../mesos/src/tests/cgroups_tests.cpp:571: Failure Failed to wait 15secs for future [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen (15121 ms) [--] 1 test from CgroupsAnyHierarchyWithCpuMemoryTest (15121 ms total) [--] Global test environment tear-down [==] 1 test from 1 test case ran. (15174 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3101) Standardize separation of Windows/Linux-specific OS code
[ https://issues.apache.org/jira/browse/MESOS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3101: --- Sprint: Mesosphere Sprint 15 Standardize separation of Windows/Linux-specific OS code Key: MESOS-3101 URL: https://issues.apache.org/jira/browse/MESOS-3101 Project: Mesos Issue Type: Task Components: stout Reporter: Joseph Wu Assignee: Joseph Wu Labels: mesosphere There are 50+ files that must be touched to separate OS-specific code. First, we will standardize the changes by using stout/abort.hpp as an example. The review/discussion can be found here: https://reviews.apache.org/r/36625/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3039) Allow executors binding IP to be different than Slave binding IP
[ https://issues.apache.org/jira/browse/MESOS-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635612#comment-14635612 ] Timothy Chen commented on MESOS-3039: - Sounds like this is a useful option to be in Mesos proper, and perhaps without using env variables at all. Seems like a good isolator candidate, what do you think? Allow executors binding IP to be different than Slave binding IP Key: MESOS-3039 URL: https://issues.apache.org/jira/browse/MESOS-3039 Project: Mesos Issue Type: Bug Reporter: Kapil Arya Assignee: Kapil Arya Priority: Critical Currently, the Slave will bind either to the loopback IP (127.0.0.1) or to the IP passed via the '--ip' flag. When it launches a containerized executor (e.g., via Mesos Containerizer), the executor inherits the binding IP of the Slave. This is due to the fact that the '--ip' flag sets the environment variable `LIBPROCESS_IP` to the passed IP. The executor then inherits this environment variable and is forced to bind to the Slave IP. If an executor is running in its own containerized environment, with a separate IP than that of the Slave, currently there is no way of forcing it to bind to its own IP. A potential solution is to use the executor environment decorator hooks to update LIBPROCESS_IP environment variable for the executor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3114) Simplify JSON::* by providing jsonify along the lines of stringify
[ https://issues.apache.org/jira/browse/MESOS-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3114: -- Description: We want to be able to do things like: {code} JSON::Value number1 = 25; JSON::Number number2 = 26; EXPECT_NE(number1, number2); EXPECT_EQ(jsonify(12), number1); EXPECT_EQ(jsonify(12), number2); {code} was: We want to be able to do things like: {code} JSON::Value number1 = 25; JSON::Number number2 = 26; EXPECT_NE(number1, number2); EXPECT_EQ(jsonify(12), number1); EXPECT_EQ(jsonify(12), number2); {/code} Simplify JSON::* by providing jsonify along the lines of stringify -- Key: MESOS-3114 URL: https://issues.apache.org/jira/browse/MESOS-3114 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere We want to be able to do things like: {code} JSON::Value number1 = 25; JSON::Number number2 = 26; EXPECT_NE(number1, number2); EXPECT_EQ(jsonify(12), number1); EXPECT_EQ(jsonify(12), number2); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3026) ProcessTest.Cache fails and hangs
[ https://issues.apache.org/jira/browse/MESOS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3026: Labels: libprocess mesosphere tests (was: libprocess tests) ProcessTest.Cache fails and hangs - Key: MESOS-3026 URL: https://issues.apache.org/jira/browse/MESOS-3026 Project: Mesos Issue Type: Bug Components: libprocess Environment: ubuntu 15.04/ ubuntu 14.04.2 clang-3.6 / gcc 4.8.2 Reporter: Joris Van Remoortere Assignee: Alexander Rojas Priority: Blocker Labels: libprocess, mesosphere, tests {code} [ RUN ] ProcessTest.Cache ../../../3rdparty/libprocess/src/tests/process_tests.cpp:1726: Failure Value of: response.get().status Actual: 200 OK Expected: 304 Not Modified [ FAILED ] ProcessTest.Cache (1 ms) {code} The tests then finish running, but the gtest framework fails to terminate and uses 100% CPU. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2973) SSL tests don't work with --gtest_repeat
[ https://issues.apache.org/jira/browse/MESOS-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-2973: Labels: mesosphere ssl testing (was: ssl testing) SSL tests don't work with --gtest_repeat Key: MESOS-2973 URL: https://issues.apache.org/jira/browse/MESOS-2973 Project: Mesos Issue Type: Bug Components: libprocess Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere Labels: mesosphere, ssl, testing Fix For: 0.23.0 commit bfa89f22e9d6a3f365113b32ee1cac5208a0456f Author: Joris Van Remoortere joris.van.remoort...@gmail.com Date: Wed Jul 1 16:16:52 2015 -0700 MESOS-2973: Allow SSL tests to run using gtest_repeat. The SSL ctx object carried some settings between reinitialize() calls. Re-construct the object to avoid this state transition. Review: https://reviews.apache.org/r/36074 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3121) Always disable SSLV2
Joris Van Remoortere created MESOS-3121: --- Summary: Always disable SSLV2 Key: MESOS-3121 URL: https://issues.apache.org/jira/browse/MESOS-3121 Project: Mesos Issue Type: Bug Components: libprocess Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere The SSL protocol mismatch tests are failing on Centos7 when matching SSLV2 with SSLV2. Since this version of the protocol is highly discouraged anyway, let's disable it completely unless requested otherwise. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-588) Add option to master to send stdout and stderr to syslog
[ https://issues.apache.org/jira/browse/MESOS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-588: - Labels: logging mesosphere syslog twitter (was: logging syslog twitter) Add option to master to send stdout and stderr to syslog Key: MESOS-588 URL: https://issues.apache.org/jira/browse/MESOS-588 Project: Mesos Issue Type: Story Components: master, slave Reporter: Jason Dusek Labels: logging, mesosphere, syslog, twitter Original Estimate: 49h 7m Remaining Estimate: 49h 7m Sending worker logs to Syslog simplifies implementation of retention policies and makes it easier for teams to devise their own approach to log indexing and authorized use of logs (since many of these solutions exist for Syslog already). In Bash, we can write a wrapper that uses redirection and process substitution to ensure a command invocation's STDOUT and STDERR are written to syslog. https://gist.github.com/solidsnack/6090947 It would be nice if Mesos offered such a facility, or provided a way to plug in a wrapper script like the one above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
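A minimal Python sketch of the wrapper idea described in MESOS-588 above, under stated assumptions: `run_and_forward` and its `sink` parameter are hypothetical names for illustration, not a Mesos facility and not the linked Bash gist. It merges a command's STDOUT and STDERR into one pipe and hands each line to a callable.

```python
import subprocess
import sys
from typing import Callable, List


def run_and_forward(argv: List[str], sink: Callable[[str], None]) -> int:
    """Run argv and pass every output line (stdout and stderr merged) to sink.

    Returns the command's exit code. `sink` is a hypothetical hook: any
    callable that accepts one string.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe as stdout
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        sink(line.rstrip("\n"))
    return proc.wait()


if __name__ == "__main__":
    # Demo: collect the lines in a list instead of sending them to syslog.
    collected: List[str] = []
    child = "import sys; print('out'); print('err', file=sys.stderr)"
    code = run_and_forward([sys.executable, "-c", child], collected.append)
    print(code, sorted(collected))
```

On Unix, passing the standard library's `syslog.syslog` as the sink would route the lines to syslog; the sink is kept injectable here so the sketch can be exercised without a syslog daemon.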
[jira] [Updated] (MESOS-2795) Introduce filesystem provisioner abstraction to Mesos containerizer
[ https://issues.apache.org/jira/browse/MESOS-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2795: -- Sprint: Twitter Q2 Sprint 3, Twitter Mesos Q2 Sprint 5, Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2 (was: Twitter Q2 Sprint 3, Twitter Mesos Q2 Sprint 5, Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1) Introduce filesystem provisioner abstraction to Mesos containerizer --- Key: MESOS-2795 URL: https://issues.apache.org/jira/browse/MESOS-2795 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Ian Downes Labels: twitter Optional filesystem provisioner component for the Mesos containerizer that can provision per-container filesystems. This differs from a filesystem isolator because it only provisions a root filesystem for a container and doesn't actually do any isolation (e.g., through a mount namespace + pivot or chroot). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3098) Implement WindowsContainerizer and WindowsDockerContainerizer
[ https://issues.apache.org/jira/browse/MESOS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635626#comment-14635626 ] Timothy Chen commented on MESOS-3098: - Sounds good; we should chat and try out the Docker API on Windows with Mesos to see what we can do. Implement WindowsContainerizer and WindowsDockerContainerizer - Key: MESOS-3098 URL: https://issues.apache.org/jira/browse/MESOS-3098 Project: Mesos Issue Type: Task Components: containerization Reporter: Joseph Wu Assignee: Alex Clemmer Labels: mesosphere The MVP for Windows support is a containerizer that (1) runs on Windows, and (2) runs and passes all the tests that are relevant to the Windows platform (_e.g._, not the tests that involve cgroups). To do this we require at least a `WindowsContainerizer` (to be implemented alongside the `MesosContainerizer`), which provides no meaningful (_e.g._) process namespacing (much like the default unix containerizer). In the long term (hopefully before MesosCon) we also want to support the Windows container API. This will require implementing a separate containerizer, maybe called `WindowsDockerContainerizer`. Since the Windows container API is actually officially supported through the Docker interface (_i.e._, MSFT actually ported the Docker engine to Windows, and that is the official API), the interfaces (like the fetcher) shouldn't change much. The tests probably will have to change, as we don't have access to any isolation primitives like cgroups for those tests. Outstanding TODO([~hausdorff]): Flesh out this description when more details are available, regarding: * The container API for Windows (when we know them) * The nuances of Windows vs Linux (when we know them) * etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2794) Implement filesystem isolators
[ https://issues.apache.org/jira/browse/MESOS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2794: -- Sprint: Twitter Q2 Sprint 3, Twitter Mesos Q2 Sprint 5, Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2 (was: Twitter Q2 Sprint 3, Twitter Mesos Q2 Sprint 5, Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1) Implement filesystem isolators -- Key: MESOS-2794 URL: https://issues.apache.org/jira/browse/MESOS-2794 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Ian Downes Labels: twitter Move persistent volume support from Mesos containerizer to separate filesystem isolators, including support for container rootfs, where possible. Use symlinks for POSIX systems without container rootfs. Use bind mounts for Linux with/without container rootfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3039) Allow executors binding IP to be different than Slave binding IP
[ https://issues.apache.org/jira/browse/MESOS-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3039: -- Labels: mesosphere (was: ) Allow executors binding IP to be different than Slave binding IP Key: MESOS-3039 URL: https://issues.apache.org/jira/browse/MESOS-3039 Project: Mesos Issue Type: Bug Reporter: Kapil Arya Assignee: Kapil Arya Priority: Critical Labels: mesosphere Currently, the Slave will bind either to the loopback IP (127.0.0.1) or to the IP passed via the '--ip' flag. When it launches a containerized executor (e.g., via Mesos Containerizer), the executor inherits the binding IP of the Slave. This is because the '--ip' flag sets the environment variable `LIBPROCESS_IP` to the passed IP. The executor then inherits this environment variable and is forced to bind to the Slave IP. If an executor is running in its own containerized environment, with a different IP from that of the Slave, there is currently no way of forcing it to bind to its own IP. A potential solution is to use the executor environment decorator hooks to update the LIBPROCESS_IP environment variable for the executor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
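The inheritance problem in MESOS-3039 above, and the proposed override, can be illustrated with a small Python sketch. This is only a model of the idea: `executor_environment` is a hypothetical helper, not the Mesos hook API, and the only name taken from the ticket is the `LIBPROCESS_IP` variable.

```python
from typing import Dict, Optional


def executor_environment(slave_env: Dict[str, str],
                         container_ip: Optional[str] = None) -> Dict[str, str]:
    """Build the environment handed to an executor (hypothetical helper).

    By default the executor inherits everything from the slave, including
    LIBPROCESS_IP, so it binds to the slave's address. If the container has
    its own IP, the variable is overridden so the executor binds there.
    """
    env = dict(slave_env)  # copy, so the slave's own environment is untouched
    if container_ip is not None:
        env["LIBPROCESS_IP"] = container_ip
    return env


if __name__ == "__main__":
    slave = {"LIBPROCESS_IP": "10.0.0.5", "PATH": "/usr/bin"}
    print(executor_environment(slave))                 # inherits the slave IP
    print(executor_environment(slave, "192.168.1.7"))  # binds to its own IP
```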
[jira] [Updated] (MESOS-3021) Implement Docker Image Provisioner Tag Store
[ https://issues.apache.org/jira/browse/MESOS-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lily Chen updated MESOS-3021: - Labels: mesosphere (was: ) Implement Docker Image Provisioner Tag Store Key: MESOS-3021 URL: https://issues.apache.org/jira/browse/MESOS-3021 Project: Mesos Issue Type: Improvement Reporter: Lily Chen Assignee: Lily Chen Labels: mesosphere Create a comprehensive store to look up an image and tag's associated image layer ID. Implement add, remove, save, and update images and their associated tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
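As a rough illustration of the operations listed in MESOS-3021 above (add, remove, save, update, and lookup of an image and tag's layer ID), here is a Python sketch. The class name `TagStore`, its method signatures, and the JSON layout are all assumptions made for this example; the ticket does not specify a data model or storage format.

```python
import json
from typing import Dict, Optional, Tuple


class TagStore:
    """Maps (image name, tag) to an image layer ID. Hypothetical sketch."""

    def __init__(self) -> None:
        self._layers: Dict[Tuple[str, str], str] = {}

    def add(self, image: str, tag: str, layer_id: str) -> None:
        key = (image, tag)
        if key in self._layers:
            raise KeyError(f"{image}:{tag} already present; use update()")
        self._layers[key] = layer_id

    def update(self, image: str, tag: str, layer_id: str) -> None:
        key = (image, tag)
        if key not in self._layers:
            raise KeyError(f"{image}:{tag} not found; use add()")
        self._layers[key] = layer_id

    def remove(self, image: str, tag: str) -> None:
        del self._layers[(image, tag)]

    def lookup(self, image: str, tag: str) -> Optional[str]:
        return self._layers.get((image, tag))

    def save(self) -> str:
        """Serialize to JSON text (assumed format: a list of records)."""
        records = [
            {"image": image, "tag": tag, "layer_id": layer_id}
            for (image, tag), layer_id in sorted(self._layers.items())
        ]
        return json.dumps(records)

    @classmethod
    def load(cls, text: str) -> "TagStore":
        store = cls()
        for record in json.loads(text):
            store.add(record["image"], record["tag"], record["layer_id"])
        return store
```

Records are stored as separate `image` and `tag` fields rather than a joined `image:tag` string, because an image name may itself contain a colon (for example a registry port).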
[jira] [Updated] (MESOS-2721) Architecture document for per-container IP assignment, enforcement and isolation
[ https://issues.apache.org/jira/browse/MESOS-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-2721: -- Labels: mesosphere (was: ) Architecture document for per-container IP assignment, enforcement and isolation Key: MESOS-2721 URL: https://issues.apache.org/jira/browse/MESOS-2721 Project: Mesos Issue Type: Task Reporter: Niklas Quarfot Nielsen Assignee: Kapil Arya Labels: mesosphere There are many ways in which we can go about wiring up per-container IPs in Mesos. As there are multiple underlying mechanisms and systems for keeping track of IP pools, we probably need to aim for a very flexible architecture, similar to the oversubscription project. There are a couple of folks, companies and vendors interested in getting this capability into Mesos asap to provide a stronger networking story (https://www.mail-archive.com/dev@mesos.apache.org/msg32353.html). So let's start discussing and architecting this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3015) Add hooks for Slave exits
[ https://issues.apache.org/jira/browse/MESOS-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3015: -- Labels: mesosphere (was: ) Add hooks for Slave exits - Key: MESOS-3015 URL: https://issues.apache.org/jira/browse/MESOS-3015 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere The hook will be triggered on slave exits. A master hook module can use this to do Slave-specific cleanups. In our particular use case, the hook would trigger cleanup of IPs assigned to the given Slave (see the [design doc | https://docs.google.com/document/d/17mXtAmdAXcNBwp_JfrxmZcQrs7EO6ancSbejrqjLQ0g/edit#]). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2119) Add Socket tests
[ https://issues.apache.org/jira/browse/MESOS-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-2119: Sprint: Mesosphere Q4 Sprint 3 - 12/7, Mesosphere Q1 Sprint 1 - 1/23, Mesosphere Q1 Sprint 2 - 2/6, Mesosphere Q1 Sprint 3 - 2/20, Mesosphere Q1 Sprint 4 - 3/6, Mesosphere Q1 Sprint 5 - 3/20, Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Sprint 10, Mesosphere Sprint 11 (was: Mesosphere Q4 Sprint 3 - 12/7, Mesosphere Q1 Sprint 1 - 1/23, Mesosphere Q1 Sprint 2 - 2/6, Mesosphere Q1 Sprint 3 - 2/20, Mesosphere Q1 Sprint 4 - 3/6, Mesosphere Q1 Sprint 5 - 3/20, Mesosphere Q1 Sprint 6 - 4/3, Mesosphere Q1 Sprint 7 - 4/17, Mesosphere Q2 Sprint 8 - 5/1, Mesosphere Sprint 10, Mesosphere Sprint 11, Mesosphere Sprint 15) Add Socket tests Key: MESOS-2119 URL: https://issues.apache.org/jira/browse/MESOS-2119 Project: Mesos Issue Type: Task Components: libprocess Reporter: Niklas Quarfot Nielsen Assignee: Joris Van Remoortere Labels: mesosphere Add more Socket-specific tests to get coverage during the move from libev to libevent (with and without SSL). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3120) Remove pthread specific code from Mesos
Joris Van Remoortere created MESOS-3120: --- Summary: Remove pthread specific code from Mesos Key: MESOS-3120 URL: https://issues.apache.org/jira/browse/MESOS-3120 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3095) PoC running command executor with image provisioner
[ https://issues.apache.org/jira/browse/MESOS-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-3095: Labels: mesosphere (was: ) PoC running command executor with image provisioner --- Key: MESOS-3095 URL: https://issues.apache.org/jira/browse/MESOS-3095 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere This is to implement a PoC of the alternative design choices with MESOS-3004 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3092) Configure Jenkins to run Docker tests
[ https://issues.apache.org/jira/browse/MESOS-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-3092: Story Points: 2 Configure Jenkins to run Docker tests - Key: MESOS-3092 URL: https://issues.apache.org/jira/browse/MESOS-3092 Project: Mesos Issue Type: Improvement Components: docker Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere Add a Jenkins job to run the Docker tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2742) Architecture doc on global resources
[ https://issues.apache.org/jira/browse/MESOS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2742: --- Sprint: (was: Mesosphere Sprint 15) Architecture doc on global resources Key: MESOS-2742 URL: https://issues.apache.org/jira/browse/MESOS-2742 Project: Mesos Issue Type: Task Reporter: Niklas Quarfot Nielsen Assignee: Joerg Schad Labels: mesosphere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2727) 0.23.0 Release
[ https://issues.apache.org/jira/browse/MESOS-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-2727: -- Issue Type: Task (was: Epic) 0.23.0 Release -- Key: MESOS-2727 URL: https://issues.apache.org/jira/browse/MESOS-2727 Project: Mesos Issue Type: Task Components: release Reporter: Adam B Assignee: Adam B Labels: mesosphere Please add links to Epics and Major features this release is blocked by. Check out the [Mesos 0.23 Release Dashboard|https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12326227] We can also track release management tasks as subtasks of this 0.23 Epic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2727) 0.23.0 Release
[ https://issues.apache.org/jira/browse/MESOS-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-2727: -- Sprint: Mesosphere Sprint 15 0.23.0 Release -- Key: MESOS-2727 URL: https://issues.apache.org/jira/browse/MESOS-2727 Project: Mesos Issue Type: Task Components: release Reporter: Adam B Assignee: Adam B Labels: mesosphere Please add links to Epics and Major features this release is blocked by. Check out the [Mesos 0.23 Release Dashboard|https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12326227] We can also track release management tasks as subtasks of this 0.23 Epic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2044) Use one IP address per container for network isolation
[ https://issues.apache.org/jira/browse/MESOS-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2044: --- Labels: mesosphere (was: ) Use one IP address per container for network isolation -- Key: MESOS-2044 URL: https://issues.apache.org/jira/browse/MESOS-2044 Project: Mesos Issue Type: Epic Reporter: Cong Wang Assignee: Kapil Arya Labels: mesosphere If there are enough IP addresses, either IPv4 or IPv6, we should use one IP address per container, instead of the ugly port range based solution. One problem with this is IP address management: addresses are usually managed by a DHCP server, so we may need to manage them in the mesos master/slave. Also, maybe use macvlan instead of veth for better isolation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
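To make the address-management concern in MESOS-2044 above concrete, here is a small Python sketch of a per-container IP pool built on the standard `ipaddress` module. The class `IPPool`, its methods, and the container IDs are hypothetical; the ticket does not say how or where (master, slave, or DHCP) addresses would actually be managed.

```python
import ipaddress
from typing import Dict, List


class IPPool:
    """Hands out one address per container from a CIDR block (sketch only)."""

    def __init__(self, cidr: str) -> None:
        # hosts() yields the usable host addresses of the network lazily,
        # which matters for large IPv6 ranges.
        self._hosts = iter(ipaddress.ip_network(cidr).hosts())
        self._released: List[str] = []
        self._assigned: Dict[str, str] = {}

    def allocate(self, container_id: str) -> str:
        if container_id in self._assigned:
            return self._assigned[container_id]  # idempotent per container
        if self._released:
            address = self._released.pop()       # reuse a freed address first
        else:
            candidate = next(self._hosts, None)
            if candidate is None:
                raise RuntimeError("IP pool exhausted")
            address = str(candidate)
        self._assigned[container_id] = address
        return address

    def release(self, container_id: str) -> None:
        address = self._assigned.pop(container_id, None)
        if address is not None:
            self._released.append(address)
```

The exhaustion error is the case the ticket hedges with "if there are enough IP addresses"; a real design would need a fallback when the pool runs out.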
[jira] [Updated] (MESOS-3050) Failing ROOT_ tests in 0.23.0-rc3 on CentOS 7.1
[ https://issues.apache.org/jira/browse/MESOS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3050: -- Labels: mesosphere (was: ) Failing ROOT_ tests in 0.23.0-rc3 on CentOS 7.1 --- Key: MESOS-3050 URL: https://issues.apache.org/jira/browse/MESOS-3050 Project: Mesos Issue Type: Bug Components: containerization, docker, test Affects Versions: 0.23.0 Environment: CentOS Linux release 7.1.1503 0.23.0-rc3 Reporter: Adam B Assignee: Timothy Chen Labels: mesosphere Running `sudo make check` on CentOS 7.1 for Mesos 0.23.0-rc3 causes several failures/errors: {code} [ RUN ] DockerTest.ROOT_DOCKER_CheckPortResource ../../src/tests/docker_tests.cpp:303: Failure (run).failure(): Container exited on error: exited with status 1 [ FAILED ] DockerTest.ROOT_DOCKER_CheckPortResource (709 ms) {code} ... {code} [ RUN ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample ../../src/tests/isolator_tests.cpp:837: Failure isolator: Failed to create PerfEvent isolator, invalid events: { cycles, task-clock } [ FAILED ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample (9 ms) [--] 1 test from PerfEventIsolatorTest (9 ms total) [--] 2 tests from SharedFilesystemIsolatorTest [ RUN ] SharedFilesystemIsolatorTest.ROOT_RelativeVolume + mount -n --bind /tmp/SharedFilesystemIsolatorTest_ROOT_RelativeVolume_4yTEAC/var/tmp /var/tmp + touch /var/tmp/492407e1-5dec-4b34-8f2f-130430f41aac ../../src/tests/isolator_tests.cpp:1001: Failure Value of: os::exists(file) Actual: true Expected: false [ FAILED ] SharedFilesystemIsolatorTest.ROOT_RelativeVolume (92 ms) [ RUN ] SharedFilesystemIsolatorTest.ROOT_AbsoluteVolume + mount -n --bind /tmp/SharedFilesystemIsolatorTest_ROOT_AbsoluteVolume_OwYrXK /var/tmp + touch /var/tmp/7de712aa-52eb-4976-b0f9-32b6a006418d ../../src/tests/isolator_tests.cpp:1086: Failure Value of: os::exists(path::join(containerPath, filename)) Actual: true Expected: false [ FAILED ] SharedFilesystemIsolatorTest.ROOT_AbsoluteVolume (100 ms) {code} ... 
{code} [--] 1 test from UserCgroupIsolatorTest/0, where TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess userdel: user 'mesos.test.unprivileged.user' does not exist [ RUN ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup -bash: /sys/fs/cgroup/blkio/user.slice/cgroup.procs: Permission denied mkdir: cannot create directory ‘/sys/fs/cgroup/blkio/user.slice/user’: Permission denied ../../src/tests/isolator_tests.cpp:1274: Failure Value of: os::system( su - + UNPRIVILEGED_USERNAME + -c 'mkdir + path::join(flags.cgroups_hierarchy, userCgroup) + ') Actual: 256 Expected: 0 -bash: /sys/fs/cgroup/blkio/user.slice/user/cgroup.procs: No such file or directory ../../src/tests/isolator_tests.cpp:1283: Failure Value of: os::system( su - + UNPRIVILEGED_USERNAME + -c 'echo $$ + path::join(flags.cgroups_hierarchy, userCgroup, cgroup.procs) + ') Actual: 256 Expected: 0 -bash: /sys/fs/cgroup/memory/mesos/bbf8c8f0-3d67-40df-a269-b3dc6a9597aa/cgroup.procs: Permission denied -bash: /sys/fs/cgroup/cpuacct,cpu/user.slice/cgroup.procs: No such file or directory mkdir: cannot create directory ‘/sys/fs/cgroup/cpuacct,cpu/user.slice/user’: No such file or directory ../../src/tests/isolator_tests.cpp:1274: Failure Value of: os::system( su - + UNPRIVILEGED_USERNAME + -c 'mkdir + path::join(flags.cgroups_hierarchy, userCgroup) + ') Actual: 256 Expected: 0 -bash: /sys/fs/cgroup/cpuacct,cpu/user.slice/user/cgroup.procs: No such file or directory ../../src/tests/isolator_tests.cpp:1283: Failure Value of: os::system( su - + UNPRIVILEGED_USERNAME + -c 'echo $$ + path::join(flags.cgroups_hierarchy, userCgroup, cgroup.procs) + ') Actual: 256 Expected: 0 -bash: /sys/fs/cgroup/name=systemd/user.slice/user-2004.slice/session-3865.scope/cgroup.procs: No such file or directory mkdir: cannot create directory ‘/sys/fs/cgroup/name=systemd/user.slice/user-2004.slice/session-3865.scope/user’: No such file or directory ../../src/tests/isolator_tests.cpp:1274: Failure Value of: os::system( su - + 
UNPRIVILEGED_USERNAME + -c 'mkdir + path::join(flags.cgroups_hierarchy, userCgroup) + ') Actual: 256 Expected: 0 -bash: /sys/fs/cgroup/name=systemd/user.slice/user-2004.slice/session-3865.scope/user/cgroup.procs: No such file or directory ../../src/tests/isolator_tests.cpp:1283: Failure Value of: os::system( su - + UNPRIVILEGED_USERNAME + -c 'echo $$ + path::join(flags.cgroups_hierarchy, userCgroup, cgroup.procs) + ') Actual: 256 Expected: 0 [ FAILED ] UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsMemIsolatorProcess (1034
[jira] [Commented] (MESOS-3062) Add authorization for dynamic reservation
[ https://issues.apache.org/jira/browse/MESOS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635680#comment-14635680 ] Adam B commented on MESOS-3062: --- I agree with using the framework principal, but for the operator endpoints, it should be authorized against the authenticated http user, not just the master. Add authorization for dynamic reservation - Key: MESOS-3062 URL: https://issues.apache.org/jira/browse/MESOS-3062 Project: Mesos Issue Type: Task Components: master Reporter: Michael Park Assignee: Michael Park Dynamic reservations should be authorized with the {{principal}} of the reserving entity (framework or master). The idea is to introduce {{Reserve}} and {{Unreserve}} into the ACL. {code} message Reserve { // Subjects. required Entity principals = 1; // Objects. MVP: Only possible values = ANY, NONE required Entity resources = 2; } message Unreserve { // Subjects. required Entity principals = 1; // Objects. required Entity reserver_principals = 2; } {code} When a framework/operator reserves resources, reserve ACLs are checked to see if the framework ({{FrameworkInfo.principal}}) or the operator ({{Credential.user}}) is authorized to reserve the specified resources. If not authorized, the reserve operation is rejected. When a framework/operator unreserves resources, unreserve ACLs are checked to see if the framework ({{FrameworkInfo.principal}}) or the operator ({{Credential.user}}) is authorized to unreserve the resources reserved by a framework or operator ({{Resource.ReservationInfo.principal}}). If not authorized, the unreserve operation is rejected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
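The unreserve check described in MESOS-3062 above can be sketched in Python as follows. This is a model under loud assumptions: the `Entity` kinds (`SOME`, `ANY`, `NONE`), the first-match-wins evaluation, and the `permissive` default are choices made for this example and are not stated in the ticket; `may_unreserve` is a hypothetical function, not the Mesos authorizer API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Entity:
    kind: str = "SOME"                     # assumed kinds: SOME, ANY, NONE
    values: FrozenSet[str] = frozenset()   # only consulted when kind is SOME

    def matches(self, value: str) -> bool:
        if self.kind == "ANY":
            return True
        if self.kind == "NONE":
            return False
        return value in self.values


@dataclass(frozen=True)
class UnreserveACL:
    principals: Entity            # subjects: who is asking to unreserve
    reserver_principals: Entity   # objects: whose reservations may be removed


def may_unreserve(acls: Iterable[UnreserveACL], principal: str,
                  reserver_principal: str, permissive: bool = False) -> bool:
    """Assumed rule: the first ACL whose subject matches decides the outcome.

    If no ACL mentions the principal, fall back to `permissive`.
    """
    for acl in acls:
        if acl.principals.matches(principal):
            return acl.reserver_principals.matches(reserver_principal)
    return permissive


if __name__ == "__main__":
    acls = [
        UnreserveACL(Entity("SOME", frozenset({"ops"})), Entity("ANY")),
        UnreserveACL(Entity("ANY"), Entity("NONE")),
    ]
    print(may_unreserve(acls, "ops", "framework-a"))          # allowed
    print(may_unreserve(acls, "framework-b", "framework-a"))  # rejected
```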
[jira] [Updated] (MESOS-3007) Support systemd with Mesos containerizer
[ https://issues.apache.org/jira/browse/MESOS-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van Remoortere updated MESOS-3007: Labels: mesosphere (was: ) Support systemd with Mesos containerizer Key: MESOS-3007 URL: https://issues.apache.org/jira/browse/MESOS-3007 Project: Mesos Issue Type: Epic Reporter: Artem Harutyunyan Assignee: Joris Van Remoortere Labels: mesosphere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3118) Remove pthread specific code from Stout
Joris Van Remoortere created MESOS-3118: --- Summary: Remove pthread specific code from Stout Key: MESOS-3118 URL: https://issues.apache.org/jira/browse/MESOS-3118 Project: Mesos Issue Type: Improvement Components: stout Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-2562) 0.24.0 release
[ https://issues.apache.org/jira/browse/MESOS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone reassigned MESOS-2562: - Assignee: Vinod Kone 0.24.0 release -- Key: MESOS-2562 URL: https://issues.apache.org/jira/browse/MESOS-2562 Project: Mesos Issue Type: Epic Reporter: Kapil Arya Assignee: Vinod Kone -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3119) Remove pthread specific code from Libprocess
Joris Van Remoortere created MESOS-3119: --- Summary: Remove pthread specific code from Libprocess Key: MESOS-3119 URL: https://issues.apache.org/jira/browse/MESOS-3119 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1929) Create a Getting Started with Mesos using JIRA document
[ https://issues.apache.org/jira/browse/MESOS-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-1929: --- Labels: mesosphere (was: documentation) Create a Getting Started with Mesos using JIRA document -- Key: MESOS-1929 URL: https://issues.apache.org/jira/browse/MESOS-1929 Project: Mesos Issue Type: Documentation Components: general Reporter: John Pampuch Assignee: Marco Massenzio Priority: Minor Labels: mesosphere Create a quick start guide for contributors to understand the process used to move issues through to commits, explaining the states of issues, the Agile boards, sprints, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2806) Jira workflow appears inconsistent
[ https://issues.apache.org/jira/browse/MESOS-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2806: --- Labels: mesosphere (was: workflow) Jira workflow appears inconsistent -- Key: MESOS-2806 URL: https://issues.apache.org/jira/browse/MESOS-2806 Project: Mesos Issue Type: Task Reporter: Marco Massenzio Assignee: Marco Massenzio Labels: mesosphere Attachments: Mesos Workflow.png, Screen Shot 2015-06-09 at 1.38.34 PM.png, accepted.png, jira-stopped.png, jira-workflow-in-progress.png See attached screenshot - the story is in the {{Accepted}} state, so it should now have a {{Start Progress}} button, but it has a {{Stop Progress}} one instead. Also, when in the {{In Progress}} state it has an {{Accept}} button (I think) or something similar; other states also appear inconsistent. This Story is about first looking at the workflow; ensuring the stories and their statuses are consistent; ensuring that buttons in the UI are consistently applied; and then correcting any issues that may have been identified. The assumption here is that the workflow is: {noformat} Open Accepted Progress Reviewable Resolved Closed Accept Start Ready Resolve Close {noformat} and, at each stage, it can be moved back by one ({{Unaccept}}, {{Stop Progress}}, {{Unresolve}}) and that, at any stage, it can be moved to {{Closed}} (for whatever reason). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
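The workflow assumed in MESOS-2806 above can be written down as a small state machine, which makes the reported inconsistency easy to check. This Python sketch uses only the states and action names that appear in the ticket; the table layout and the functions `available_actions` and `apply_action` are hypothetical, and reading "Progress" as the {{In Progress}} state is an assumption. The ticket names only three backward actions, so no backward step out of Reviewable is modelled.

```python
from typing import Dict, List, Tuple

# action -> (state it applies in, state it leads to)
FORWARD: Dict[str, Tuple[str, str]] = {
    "Accept": ("Open", "Accepted"),
    "Start": ("Accepted", "In Progress"),
    "Ready": ("In Progress", "Reviewable"),
    "Resolve": ("Reviewable", "Resolved"),
    "Close": ("Resolved", "Closed"),
}
BACKWARD: Dict[str, Tuple[str, str]] = {
    "Unaccept": ("Accepted", "Open"),
    "Stop Progress": ("In Progress", "Accepted"),
    "Unresolve": ("Resolved", "Reviewable"),
}


def available_actions(state: str) -> List[str]:
    """Buttons the UI should show for an issue in `state`."""
    actions = {
        name
        for table in (FORWARD, BACKWARD)
        for name, (source, _) in table.items()
        if source == state
    }
    if state != "Closed":
        actions.add("Close")  # any stage can be moved to Closed
    return sorted(actions)


def apply_action(state: str, action: str) -> str:
    """Return the next state, or raise ValueError for an invalid button."""
    if action == "Close" and state != "Closed":
        return "Closed"
    for table in (FORWARD, BACKWARD):
        if action in table and table[action][0] == state:
            return table[action][1]
    raise ValueError(f"{action!r} is not valid in state {state!r}")


if __name__ == "__main__":
    print(available_actions("Accepted"))
    print(apply_action("Accepted", "Start"))
```

Under this model an issue in Accepted offers Start, Unaccept and Close, so a {{Stop Progress}} button in that state is exactly the kind of inconsistency the ticket reports.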
[jira] [Created] (MESOS-3122) Add configurable UNIMPLEMENTED macro to stout
Joris Van Remoortere created MESOS-3122: --- Summary: Add configurable UNIMPLEMENTED macro to stout Key: MESOS-3122 URL: https://issues.apache.org/jira/browse/MESOS-3122 Project: Mesos Issue Type: Improvement Components: stout Reporter: Joris Van Remoortere Assignee: Joris Van Remoortere During the transition to support for Windows, it would be great if we had the ability to use a macro that marks functions as unimplemented. To make it easy to find all the unimplemented functions at compile time, while still being able to run the tests, we can add a configuration flag that controls whether this macro aborts or expands to a static assertion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3095) PoC running command executor with image provisioner
[ https://issues.apache.org/jira/browse/MESOS-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-3095: Sprint: Mesosphere Sprint 15 PoC running command executor with image provisioner --- Key: MESOS-3095 URL: https://issues.apache.org/jira/browse/MESOS-3095 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere This is to implement a PoC of the alternative design choices with MESOS-3004 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2066) Add optional 'Unavailability' to resource offers to provide maintenance awareness.
[ https://issues.apache.org/jira/browse/MESOS-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2066: --- Sprint: Mesosphere Sprint 15 Add optional 'Unavailability' to resource offers to provide maintenance awareness. -- Key: MESOS-2066 URL: https://issues.apache.org/jira/browse/MESOS-2066 Project: Mesos Issue Type: Task Reporter: Benjamin Mahler Assignee: Joseph Wu Labels: mesosphere, twitter In order to inform frameworks about upcoming maintenance on offered resources, per MESOS-1474, we'd like to add optional 'Unavailability' information to offers: {code} message Unavailability { required Time start = 1; // The approximate duration of the unavailability, // if this is a transient unavailability. optional Duration duration = 2; } message Offer { required OfferID id = 1; required FrameworkID framework_id = 2; required SlaveID slave_id = 3; required string hostname = 4; repeated Resource resources = 5; repeated Attribute attributes = 7; repeated ExecutorID executor_ids = 6; // The resources specified in this offer will become unavailable // at the specified start time and for the specified duration. Any // tasks launched using these resources might get killed when // these resources become unavailable. optional Unavailability unavailability = 8; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
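One way a framework might use the `Unavailability` information proposed in MESOS-2066 above is to decline offers on which a task could not finish before maintenance starts. The Python sketch below mirrors the two fields of the proposed message; the function `fits_before_maintenance` and the idea of an expected task runtime are assumptions for illustration, not part of the proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Unavailability:
    start: datetime
    # Approximate length of the outage; None if it is not known to be transient.
    duration: Optional[timedelta] = None


def fits_before_maintenance(unavailability: Optional[Unavailability],
                            now: datetime,
                            expected_runtime: timedelta) -> bool:
    """True if a task started now should finish before the resources go away.

    An offer without unavailability information is always considered safe.
    """
    if unavailability is None:
        return True
    return now + expected_runtime <= unavailability.start


if __name__ == "__main__":
    now = datetime(2015, 7, 21, 12, 0)
    outage = Unavailability(start=now + timedelta(hours=2),
                            duration=timedelta(minutes=30))
    print(fits_before_maintenance(outage, now, timedelta(hours=1)))
    print(fits_before_maintenance(outage, now, timedelta(hours=3)))
```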
[jira] [Updated] (MESOS-3066) Replicated registry does not have a representation of maintenance schedules
[ https://issues.apache.org/jira/browse/MESOS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3066: --- Sprint: Mesosphere Sprint 15 Replicated registry does not have a representation of maintenance schedules --- Key: MESOS-3066 URL: https://issues.apache.org/jira/browse/MESOS-3066 Project: Mesos Issue Type: Task Components: master, replicated log Reporter: Joseph Wu Assignee: Joseph Wu Labels: mesosphere In order to persist maintenance schedules across failovers of the master, the schedule information must be kept in the replicated registry. This means adding an additional key in src/master/registry.proto. The status of each individual slave's maintenance will also be persisted in this way. {code} message Maintenance { message HostStatus { required string hostname = 1; // True if the slave is deactivated for maintenance. // False if the slave is draining in preparation for maintenance. required bool is_down = 2; } message Schedule { // The set of affected slave(s). repeated HostStatus hosts = 1; // Interval during which this set of slaves is expected to be down. optional Unavailability interval = 2; } message Schedules { repeated Schedule schedules = 1; } optional Schedules schedules = 1; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2562) 0.24.0 release
[ https://issues.apache.org/jira/browse/MESOS-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2562: -- Issue Type: Task (was: Epic) 0.24.0 release -- Key: MESOS-2562 URL: https://issues.apache.org/jira/browse/MESOS-2562 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Vinod Kone -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2129) Enable managing mesos without having to be able to connect to each slave
[ https://issues.apache.org/jira/browse/MESOS-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2129: -- Target Version/s: (was: 0.24.0) Enable managing mesos without having to be able to connect to each slave Key: MESOS-2129 URL: https://issues.apache.org/jira/browse/MESOS-2129 Project: Mesos Issue Type: Epic Reporter: Cody Maloney Labels: mesosphere Ideally we want to use the full mesos WebUI from an office, which is firewalled off from the vast majority of hosts in the datacenter (mesos slaves). It also becomes burdensome to manage a precise firewall for additional hosts: unless we allow blanket access to the slave port, we have to add or remove firewall rules every time a slave comes or goes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2962) Slave fails with Abort stacktrace when DNS cannot resolve hostname
[ https://issues.apache.org/jira/browse/MESOS-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2962: --- Labels: mesosphere (was: tech-debt) Slave fails with Abort stacktrace when DNS cannot resolve hostname -- Key: MESOS-2962 URL: https://issues.apache.org/jira/browse/MESOS-2962 Project: Mesos Issue Type: Bug Components: slave Affects Versions: 0.22.1 Reporter: Marco Massenzio Assignee: Marco Massenzio Labels: mesosphere Fix For: 0.23.0 If the DNS cannot resolve the hostname-to-IP for a slave node, we correctly return an {{Error}} object, but we then fail with a segfault. This code adds a more user-friendly message and exits normally (with an {{EXIT_FAILURE}} code). For example, forcing {{net::getIp()}} to always return an {{Error}}, now causes the slave to exit like this: {noformat} $ ./bin/mesos-slave.sh --master=10.10.1.121:5405 WARNING: Logging before InitGoogleLogging() is written to STDERR E0630 11:31:45.777465 1944417024 process.cpp:899] Could not obtain the IP address for stratos.local; the DNS service may not be able to resolve it: Marco was here!!! $ echo $? 1 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
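The behaviour described in MESOS-2962 above (report a readable message and exit with a failure code instead of crashing when the hostname cannot be resolved) follows a common pattern that can be sketched in Python. This is not the Mesos C++ change itself: `resolve_or_exit_code` and its injectable `resolver` parameter are hypothetical, and only the wording of the message follows the log excerpt in the ticket.

```python
import socket
import sys
from typing import Callable

EXIT_FAILURE = 1


def resolve_or_exit_code(hostname: str,
                         resolver: Callable[[str], str] = socket.gethostbyname) -> int:
    """Return 0 if hostname resolves, EXIT_FAILURE (with a message) if not.

    `resolver` is injectable so a failure can be forced, much as the ticket
    forces net::getIp() to return an Error.
    """
    try:
        resolver(hostname)
    except OSError as error:  # socket.gaierror is a subclass of OSError
        print(
            f"Could not obtain the IP address for {hostname}; "
            f"the DNS service may not be able to resolve it: {error}",
            file=sys.stderr,
        )
        return EXIT_FAILURE
    return 0


if __name__ == "__main__":
    def always_fails(name: str) -> str:
        raise socket.gaierror("forced failure for demonstration")

    sys.exit(resolve_or_exit_code("stratos.local", resolver=always_fails))
```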
[jira] [Updated] (MESOS-3077) Registry recovery does not recover the maintenance object.
[ https://issues.apache.org/jira/browse/MESOS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-3077: - Labels: mesosphere (was: ) Registry recovery does not recover the maintenance object. -- Key: MESOS-3077 URL: https://issues.apache.org/jira/browse/MESOS-3077 Project: Mesos Issue Type: Task Components: master, replicated log Reporter: Joseph Wu Assignee: Joseph Wu Labels: mesosphere Persisted info is fetched from the registry when a master is elected or after failover. Currently, this process involves 3 steps: * Fetch the registry. * Start an operation to add the new master to the fetched registry. * Check the success of the operation and finish recovering. These methods can be found in src/master/registrar.cpp {code}RegistrarProcess::recover, ::_recover, ::__recover{code} Since the maintenance schedule is stored in a separate key, the recover process must also fetch a new maintenance object. This object needs to be passed along to the master along with the existing registry object. Possible test(s): * src/tests/registrar_tests.cpp ** Change the Recovery test to include checks for the new object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2111) Add build instructions for OSX in getting started
[ https://issues.apache.org/jira/browse/MESOS-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2111: --- Labels: mesosphere (was: newbie) Add build instructions for OSX in getting started - Key: MESOS-2111 URL: https://issues.apache.org/jira/browse/MESOS-2111 Project: Mesos Issue Type: Improvement Components: documentation Reporter: Timothy Chen Assignee: Marco Massenzio Labels: mesosphere Fix For: 0.23.0 getting started doc in the docs folder -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-2860) Create the basic infrastructure to handle /call endpoint
[ https://issues.apache.org/jira/browse/MESOS-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607993#comment-14607993 ] Isabel Jimenez edited comment on MESOS-2860 at 7/21/15 6:43 PM: submitted: https://reviews.apache.org/r/36040/ https://reviews.apache.org/r/36360/ https://reviews.apache.org/r/36328/ https://reviews.apache.org/r/35934/ https://reviews.apache.org/r/35939/ https://reviews.apache.org/r/36073/ https://reviews.apache.org/r/36072/ reviewable: https://reviews.apache.org/r/36402/ https://reviews.apache.org/r/36040/ discarded or split: https://reviews.apache.org/r/36217/ https://reviews.apache.org/r/36037/ was (Author: ijimenez): https://reviews.apache.org/r/36037/ https://reviews.apache.org/r/36040/ https://reviews.apache.org/r/35934/ https://reviews.apache.org/r/35939/ https://reviews.apache.org/r/36073/ https://reviews.apache.org/r/36072/ Create the basic infrastructure to handle /call endpoint Key: MESOS-2860 URL: https://issues.apache.org/jira/browse/MESOS-2860 Project: Mesos Issue Type: Story Components: master Reporter: Marco Massenzio Assignee: Isabel Jimenez Labels: mesosphere This is the first basic step in ensuring the basic {{/call}} functionality: processing a {noformat} POST /call {noformat} and returning: - {{202}} if all goes well; - {{401}} if not authorized; and - {{403}} if the request is malformed. We'll get more sophisticated as the work progresses (e.g., supporting {{415}} if the content-type is not of the right kind). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
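The status mapping listed in MESOS-2860 could look roughly like the sketch below. Request and callStatus are hypothetical names used only for illustration, the order in which the checks run is an assumption, and the codes are taken as written in the ticket.

```cpp
#include <string>

// Hypothetical, heavily simplified view of an incoming request.
struct Request {
  std::string method;  // e.g. "POST"
  std::string path;    // e.g. "/call"
  bool authorized;     // result of an authorization check done elsewhere
  bool wellFormed;     // result of body validation done elsewhere
};

// Maps a request to the status codes listed in the ticket.
// Assumption: authorization is checked before the body is validated.
int callStatus(const Request& request) {
  if (!request.authorized) {
    return 401;  // not authorized
  }
  if (!request.wellFormed) {
    return 403;  // malformed request, as stated in the ticket
  }
  return 202;  // all went well
}
```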
[jira] [Updated] (MESOS-3095) PoC running command executor with image provisioner
[ https://issues.apache.org/jira/browse/MESOS-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-3095: Story Points: 3 PoC running command executor with image provisioner --- Key: MESOS-3095 URL: https://issues.apache.org/jira/browse/MESOS-3095 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere This is to implement a PoC of the alternative design choices with MESOS-3004 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3079) `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too)
[ https://issues.apache.org/jira/browse/MESOS-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3079: --- Sprint: Mesosphere Sprint 15 `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too) - Key: MESOS-3079 URL: https://issues.apache.org/jira/browse/MESOS-3079 Project: Mesos Issue Type: Bug Affects Versions: 0.23.0 Reporter: Marco Massenzio Assignee: Adam B Priority: Blocker Labels: mesosphere, tests Attachments: test-results.log Running tests as root causes a large number of failures. {noformat} $ lsb_release -a LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:cxx-3.0-amd64:cxx-3.0-noarch:cxx-3.1-amd64:cxx-3.1-noarch:cxx-3.2-amd64:cxx-3.2-noarch:cxx-4.0-amd64:cxx-4.0-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-3.1-amd64:desktop-3.1-noarch:desktop-3.2-amd64:desktop-3.2-noarch:desktop-4.0-amd64:desktop-4.0-noarch:desktop-4.1-amd64:desktop-4.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.0-amd64:graphics-3.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch:graphics-4.1-amd64:graphics-4.1-noarch:languages-3.2-amd64:languages-3.2-noarch:languages-4.0-amd64:languages-4.0-noarch:languages-4.1-amd64:languages-4.1-noarch:multimedia-3.2-amd64:multimedia-3.2-noarch:multimedia-4.0-amd64:multimedia-4.0-noarch:multimedia-4.1-amd64:multimedia-4.1-noarch:printing-3.2-amd64:printing-3.2-noarch:printing-4.0-amd64:printing-4.0-noarch:printing-4.1-amd64:printing-4.1-noarch:qt4-3.1-amd64:qt4-3.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch Distributor ID: Ubuntu Description:Ubuntu 14.04.2 LTS Release:14.04 Codename: trusty $ sudo make -j12 V=0 check [==] 712 tests from 116 test cases ran. (318672 ms total) [ PASSED ] 676 tests. 
[ FAILED ] 36 tests, listed below: [ FAILED ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess [ FAILED ] SlaveRecoveryTest/0.RecoverSlaveState, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverStatusUpdateManager, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconnectExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverUnregisteredExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverTerminatedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.CleanupExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RemoveNonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.NonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.KillTask, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.GCExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RegisterDisconnectedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileKillTask, where TypeParam = 
mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileShutdownFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileTasksMissingFromSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.SchedulerFailover, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.PartitionedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.MasterFailover, where TypeParam = mesos::internal::slave::MesosContainerizer [
[jira] [Updated] (MESOS-2066) Add optional 'Unavailability' to resource offers to provide maintenance awareness.
[ https://issues.apache.org/jira/browse/MESOS-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2066: -- Labels: mesosphere (was: mesosphere twitter) Add optional 'Unavailability' to resource offers to provide maintenance awareness. -- Key: MESOS-2066 URL: https://issues.apache.org/jira/browse/MESOS-2066 Project: Mesos Issue Type: Task Reporter: Benjamin Mahler Assignee: Joseph Wu Labels: mesosphere In order to inform frameworks about upcoming maintenance on offered resources, per MESOS-1474, we'd like to add an optional 'Unavailability' information to offers: {code} message Unavailability { required Time start = 1; // The approximate duration of the unavailability, // if this is a transient unavailability. optional Duration duration = 2; } message Offer { required OfferID id = 1; required FrameworkID framework_id = 2; required SlaveID slave_id = 3; required string hostname = 4; repeated Resource resources = 5; repeated Attribute attributes = 7; repeated ExecutorID executor_ids = 6; // The resources specified in this offer will become unavailable // at the specified start time and for the specified duration. Any // tasks launched using these resources might get killed when // these resources become unavailable. optional Unavailability unavailability = 8; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
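To illustrate how a framework might use the proposed 'Unavailability' information from MESOS-2066, here is a minimal sketch. The plain struct mirrors the proposed message with nanosecond integers in place of Time and Duration, and treating a missing duration as an open-ended window is an assumption made here, not something the ticket states.

```cpp
#include <cstdint>

// Hypothetical plain-struct mirror of the proposed Unavailability message.
struct Unavailability {
  int64_t startNanos;     // stands in for `required Time start`
  bool hasDuration;       // whether `optional Duration duration` is set
  int64_t durationNanos;  // only meaningful when hasDuration is true
};

// Returns true if a task expected to run over [taskStart, taskEnd) would
// overlap the unavailability window. Assumption: a window without a
// duration never ends.
bool overlaps(const Unavailability& window, int64_t taskStart, int64_t taskEnd) {
  if (taskEnd <= window.startNanos) {
    return false;  // the task finishes before the window opens
  }
  if (!window.hasDuration) {
    return true;  // open-ended window: anything running past its start overlaps
  }
  return taskStart < window.startNanos + window.durationNanos;
}
```

A scheduler could decline an offer, or launch only short tasks on it, when overlaps() returns true for the task's expected runtime.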
[jira] [Updated] (MESOS-3079) `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too)
[ https://issues.apache.org/jira/browse/MESOS-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3079: --- Story Points: 2 Target Version/s: 0.23.0 `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too) - Key: MESOS-3079 URL: https://issues.apache.org/jira/browse/MESOS-3079 Project: Mesos Issue Type: Bug Affects Versions: 0.23.0 Reporter: Marco Massenzio Assignee: Adam B Priority: Blocker Labels: mesosphere, tests Attachments: test-results.log Running tests as root causes a large number of failures. {noformat} $ lsb_release -a LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:cxx-3.0-amd64:cxx-3.0-noarch:cxx-3.1-amd64:cxx-3.1-noarch:cxx-3.2-amd64:cxx-3.2-noarch:cxx-4.0-amd64:cxx-4.0-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-3.1-amd64:desktop-3.1-noarch:desktop-3.2-amd64:desktop-3.2-noarch:desktop-4.0-amd64:desktop-4.0-noarch:desktop-4.1-amd64:desktop-4.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.0-amd64:graphics-3.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch:graphics-4.1-amd64:graphics-4.1-noarch:languages-3.2-amd64:languages-3.2-noarch:languages-4.0-amd64:languages-4.0-noarch:languages-4.1-amd64:languages-4.1-noarch:multimedia-3.2-amd64:multimedia-3.2-noarch:multimedia-4.0-amd64:multimedia-4.0-noarch:multimedia-4.1-amd64:multimedia-4.1-noarch:printing-3.2-amd64:printing-3.2-noarch:printing-4.0-amd64:printing-4.0-noarch:printing-4.1-amd64:printing-4.1-noarch:qt4-3.1-amd64:qt4-3.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch Distributor ID: Ubuntu Description:Ubuntu 14.04.2 LTS Release:14.04 Codename: trusty $ sudo make -j12 V=0 check [==] 712 tests from 116 test cases ran. (318672 ms total) [ PASSED ] 676 tests. 
[ FAILED ] 36 tests, listed below: [ FAILED ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess [ FAILED ] SlaveRecoveryTest/0.RecoverSlaveState, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverStatusUpdateManager, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconnectExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverUnregisteredExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverTerminatedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.CleanupExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RemoveNonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.NonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.KillTask, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.GCExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RegisterDisconnectedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileKillTask, where TypeParam = 
mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileShutdownFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileTasksMissingFromSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.SchedulerFailover, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.PartitionedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.MasterFailover, where TypeParam =
[jira] [Updated] (MESOS-3062) Add authorization for dynamic reservation
[ https://issues.apache.org/jira/browse/MESOS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3062: Labels: mesosphere (was: ) Add authorization for dynamic reservation - Key: MESOS-3062 URL: https://issues.apache.org/jira/browse/MESOS-3062 Project: Mesos Issue Type: Task Components: master Reporter: Michael Park Assignee: Michael Park Labels: mesosphere Dynamic reservations should be authorized with the {{principal}} of the reserving entity (framework or master). The idea is to introduce {{Reserve}} and {{Unreserve}} into the ACL. {code} message Reserve { // Subjects. required Entity principals = 1; // Objects. MVP: Only possible values = ANY, NONE required Entity resources = 2; } message Unreserve { // Subjects. required Entity principals = 1; // Objects. required Entity reserver_principals = 2; } {code} When a framework/operator reserves resources, reserve ACLs are checked to see if the framework ({{FrameworkInfo.principal}}) or the operator ({{Credential.user}}) is authorized to reserve the specified resources. If not authorized, the reserve operation is rejected. When a framework/operator unreserves resources, unreserve ACLs are checked to see if the framework ({{FrameworkInfo.principal}}) or the operator ({{Credential.user}}) is authorized to unreserve the resources reserved by a framework or operator ({{Resource.ReservationInfo.principal}}). If not authorized, the unreserve operation is rejected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
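The unreserve check described in MESOS-3062 can be sketched as below. This is a simplified illustration, not the Mesos authorizer: Entity, UnreserveAcl, matches and mayUnreserve are hypothetical names, only a single ACL entry is evaluated, and the SOME/ANY/NONE semantics are assumed from the ticket's comments.

```cpp
#include <set>
#include <string>

// Hypothetical mirror of an ACL Entity: a set of values, ANY, or NONE.
struct Entity {
  enum Type { SOME, ANY, NONE };
  Type type;
  std::set<std::string> values;  // consulted only when type == SOME
};

// Returns true if the entity covers the given value.
bool matches(const Entity& entity, const std::string& value) {
  switch (entity.type) {
    case Entity::ANY:
      return true;
    case Entity::NONE:
      return false;
    case Entity::SOME:
      return entity.values.count(value) > 0;
  }
  return false;
}

// Hypothetical mirror of the proposed Unreserve message.
struct UnreserveAcl {
  Entity principals;          // subjects: who is asking to unreserve
  Entity reserverPrincipals;  // objects: whose reservations may be removed
};

// A request is allowed when both the requesting principal and the
// principal that made the reservation are covered by the ACL entry.
bool mayUnreserve(const UnreserveAcl& acl,
                  const std::string& principal,
                  const std::string& reserverPrincipal) {
  return matches(acl.principals, principal) &&
         matches(acl.reserverPrincipals, reserverPrincipal);
}
```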
[jira] [Created] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails crashes
Adam B created MESOS-3123: - Summary: DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails crashes Key: MESOS-3123 URL: https://issues.apache.org/jira/browse/MESOS-3123 Project: Mesos Issue Type: Bug Components: docker, test Affects Versions: 0.23.0 Environment: CentOS 7.1, or Ubuntu 14.04 Mesos 0.23.0-rc4 or today's master Reporter: Adam B Assignee: Timothy Chen Fails the test and then fails to shutdown the slaves. {code} [ RUN ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged ../../src/tests/docker_containerizer_tests.cpp:618: Failure Value of: statusRunning.get().state() Actual: TASK_LOST Expected: TASK_RUNNING ../../src/tests/docker_containerizer_tests.cpp:619: Failure Failed to wait 1mins for statusFinished ../../src/tests/docker_containerizer_tests.cpp:610: Failure Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(driver, _))... Expected: to be called twice Actual: called once - unsatisfied and active F0721 21:59:54.950773 30622 logging.cpp:57] RAW: Pure virtual method called @ 0x7f3915347a02 google::LogMessage::Fail() @ 0x7f391534cee4 google::RawLog__() @ 0x7f3914890312 __cxa_pure_virtual @ 0x88c3ae mesos::internal::tests::Cluster::Slaves::shutdown() @ 0x88c176 mesos::internal::tests::Cluster::Slaves::~Slaves() @ 0x88dc16 mesos::internal::tests::Cluster::~Cluster() @ 0x88dc87 mesos::internal::tests::MesosTest::~MesosTest() @ 0xa529ab mesos::internal::tests::DockerContainerizerTest::~DockerContainerizerTest() @ 0xa8125f mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test() @ 0xa8128e mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test() @ 0x1218b4e testing::Test::DeleteSelf_() @ 0x1221909 testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x121cb38 
testing::internal::HandleExceptionsInMethodIfSupported() @ 0x1205713 testing::TestInfo::Run() @ 0x1205c4e testing::TestCase::Run() @ 0x120a9ca testing::internal::UnitTestImpl::RunAllTests() @ 0x122277b testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x121d81b testing::internal::HandleExceptionsInMethodIfSupported() @ 0x120987a testing::UnitTest::Run() @ 0xcfbf0c main @ 0x7f391097caf5 __libc_start_main @ 0x882089 (unknown) make[3]: *** [check-local] Aborted (core dumped) make[3]: Leaving directory `/home/me/mesos/build/src' make[2]: *** [check-am] Error 2 make[2]: Leaving directory `/home/me/mesos/build/src' make[1]: *** [check] Error 2 make[1]: Leaving directory `/home/me/mesos/build/src' make: *** [check-recursive] Error 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (MESOS-2736) Upgrade the design of MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2736: --- Comment: was deleted (was: Following from some email conversation, I'm reporting here a proposal to add {{VersionInfo}} to the {{MasterInfo}} PB: {quote} Regarding version - we have {{mesos/version.hpp}} (apparently generated from {{configure.ac}}: {noformat} AC_INIT([mesos], [0.23.0]) {noformat}) which we could easily use in {{createMasterInfo()}} to assemble the version. The problem with a string-based version is that it's virtually impossible to come up with anything that will allow us to definitively decide whether version-x < version-y -- which is the very problem that [Semantic Versioning|http://semver.org] was meant to deal with. Something like: {code} message MasterInfo { message Version { message SemanticVersion { required string major = 1; required string minor = 2; optional string patch = 3; } required SemanticVersion semanticVersion = 1; optional string build = 2; } // ... other stuff optional Version versionInfo = 98; optional string version = 99; } {code} and we build a full version by concatenating the fields, with the optional build: {code} std::string version(const MasterInfo masterInfo) { MasterInfo::Version::SemanticVersion semanticVersion = masterInfo.versionInfo().semanticVersion(); return semanticVersion.major() + "." + semanticVersion.minor() + (semanticVersion.hasPatch() ? "." + semanticVersion.patch() : "") + (masterInfo.versionInfo().hasBuild() ? "-" + masterInfo.versionInfo().build() : ""); } {code} you get the idea :) This way, we use canonically the semanticVersion to decide precedence (and, ideally, distance too) but folks can add whichever decoration they choose to enrich the Mesos version number. 
{quote}) Upgrade the design of MasterInfo Key: MESOS-2736 URL: https://issues.apache.org/jira/browse/MESOS-2736 Project: Mesos Issue Type: Improvement Reporter: Marco Massenzio Assignee: Marco Massenzio Labels: mesosphere Currently, the {{MasterInfo}} PB only supports an {{ip}} field as an {{int32}}. Beyond making it harder (and opaque; open to subtle bugs) for languages other than C/C++ to decode into an IPv4 octets, this does not allow Mesos to support IPv6 Master nodes. We should consider ways to upgrade it in ways that permit us to support both IPv4 / IPv6 nodes, and, possibly, in a way that makes it easy for languages such as Java/Python that already have PB support, so could easily deserialize this information. See also MESOS-2709 for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
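The precedence decision mentioned in the deleted comment can be sketched as a field-by-field comparison. Note the assumptions: the proposal stores the components as strings, whereas this illustration uses integers so that numeric ordering is well defined, and SemanticVersion and precedes are hypothetical names.

```cpp
#include <tuple>

// Hypothetical numeric mirror of the proposed SemanticVersion message.
// Assumption: a missing patch component is treated as 0.
struct SemanticVersion {
  int majorVersion;
  int minorVersion;
  int patchVersion;
};

// Returns true if `a` has lower precedence than `b`, comparing the major,
// minor and patch components in that order.
bool precedes(const SemanticVersion& a, const SemanticVersion& b) {
  return std::tie(a.majorVersion, a.minorVersion, a.patchVersion) <
         std::tie(b.majorVersion, b.minorVersion, b.patchVersion);
}
```

Comparing the strings "0.9.0" and "0.23.0" lexicographically would order them incorrectly, which is the kind of problem the comment alludes to.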
[jira] [Updated] (MESOS-2736) Upgrade the design of MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2736: --- Sprint: Mesosphere Sprint 15 Target Version/s: 0.24.0 Upgrade the design of MasterInfo Key: MESOS-2736 URL: https://issues.apache.org/jira/browse/MESOS-2736 Project: Mesos Issue Type: Improvement Reporter: Marco Massenzio Assignee: Marco Massenzio Labels: mesosphere Currently, the {{MasterInfo}} PB only supports an {{ip}} field as an {{int32}}. Beyond making it harder (and opaque; open to subtle bugs) for languages other than C/C++ to decode into an IPv4 octets, this does not allow Mesos to support IPv6 Master nodes. We should consider ways to upgrade it in ways that permit us to support both IPv4 / IPv6 nodes, and, possibly, in a way that makes it easy for languages such as Java/Python that already have PB support, so could easily deserialize this information. See also MESOS-2709 for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
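To show why an integer {{ip}} field is awkward to decode, here is a small sketch that turns a 32-bit value into dotted-quad octets. It assumes the value has the same in-memory byte layout as a network-order IPv4 address (first octet in the lowest-addressed byte); whether {{MasterInfo.ip}} is encoded exactly this way is an assumption here, and packIPv4 and formatIPv4 are hypothetical helpers.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Packs four octets into a 32-bit value whose bytes in memory are
// a.b.c.d, i.e. network byte order regardless of host endianness.
uint32_t packIPv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
  const uint8_t bytes[4] = {a, b, c, d};
  uint32_t ip = 0;
  std::memcpy(&ip, bytes, sizeof(ip));
  return ip;
}

// Formats such a value as a dotted quad by reading its bytes in memory
// order, which avoids any dependence on the host's endianness.
std::string formatIPv4(uint32_t ip) {
  uint8_t bytes[4];
  std::memcpy(bytes, &ip, sizeof(ip));
  return std::to_string(static_cast<unsigned>(bytes[0])) + "." +
         std::to_string(static_cast<unsigned>(bytes[1])) + "." +
         std::to_string(static_cast<unsigned>(bytes[2])) + "." +
         std::to_string(static_cast<unsigned>(bytes[3]));
}
```

A client in a language without this byte-level view has to know the byte order and the signedness of the field to get the same result, which is the opacity the ticket complains about.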
[jira] [Commented] (MESOS-905) Remove Framework.id in favor of FrameworkInfo.id
[ https://issues.apache.org/jira/browse/MESOS-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635808#comment-14635808 ] Adam B commented on MESOS-905: -- Added previous Epic children tasks as links. Remove Framework.id in favor of FrameworkInfo.id Key: MESOS-905 URL: https://issues.apache.org/jira/browse/MESOS-905 Project: Mesos Issue Type: Story Components: framework Reporter: Adam B Assignee: Kapil Arya Labels: mesosphere Framework.id currently holds the correct FrameworkId, but Framework also contains a FrameworkInfo, and the FrameworkInfo.id is not necessarily set. I propose that we eliminate the Framework.id member variable and replace it with a Framework.id() accessor that references Framework.FrameworkInfo.id and ensure that it is correctly set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
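The change proposed in MESOS-905 amounts to replacing a duplicated member with an accessor over the embedded info. A minimal sketch, assuming simplified stand-in types (the real FrameworkInfo is a protobuf message and its id is a FrameworkID, not a string):

```cpp
#include <string>

// Simplified stand-in for the FrameworkInfo protobuf.
struct FrameworkInfo {
  std::string id;
};

// Instead of keeping a separate `id` member that can drift out of sync,
// Framework exposes an accessor that reads the id from its FrameworkInfo.
class Framework {
 public:
  explicit Framework(const FrameworkInfo& info) : info_(info) {}

  // Single source of truth: the id stored in the embedded FrameworkInfo.
  const std::string& id() const { return info_.id; }

 private:
  FrameworkInfo info_;
};
```

With this shape, the id only has to be set correctly in one place, the FrameworkInfo, before the Framework is constructed.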
[jira] [Updated] (MESOS-905) Remove Framework.id in favor of FrameworkInfo.id
[ https://issues.apache.org/jira/browse/MESOS-905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-905: - Target Version/s: (was: 0.24.0) Remove Framework.id in favor of FrameworkInfo.id Key: MESOS-905 URL: https://issues.apache.org/jira/browse/MESOS-905 Project: Mesos Issue Type: Story Components: framework Reporter: Adam B Assignee: Kapil Arya Labels: mesosphere Framework.id currently holds the correct FrameworkId, but Framework also contains a FrameworkInfo, and the FrameworkInfo.id is not necessarily set. I propose that we eliminate the Framework.id member variable and replace it with a Framework.id() accessor that references Framework.FrameworkInfo.id and ensure that it is correctly set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-905) Remove Framework.id in favor of FrameworkInfo.id
[ https://issues.apache.org/jira/browse/MESOS-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635810#comment-14635810 ] Adam B commented on MESOS-905: -- Untargeted this Story from 0.24. We should try to get MESOS-2559 done in 0.24, and then we can do MESOS-2560 in 0.25, and then this Story/Epic will be complete. Remove Framework.id in favor of FrameworkInfo.id Key: MESOS-905 URL: https://issues.apache.org/jira/browse/MESOS-905 Project: Mesos Issue Type: Story Components: framework Reporter: Adam B Assignee: Kapil Arya Labels: mesosphere Framework.id currently holds the correct FrameworkId, but Framework also contains a FrameworkInfo, and the FrameworkInfo.id is not necessarily set. I propose that we eliminate the Framework.id member variable and replace it with a Framework.id() accessor that references Framework.FrameworkInfo.id and ensure that it is correctly set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3123) DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged fails crashes
[ https://issues.apache.org/jira/browse/MESOS-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-3123: -- Description: Fails the test and then crashes while trying to shutdown the slaves. {code} [ RUN ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged ../../src/tests/docker_containerizer_tests.cpp:618: Failure Value of: statusRunning.get().state() Actual: TASK_LOST Expected: TASK_RUNNING ../../src/tests/docker_containerizer_tests.cpp:619: Failure Failed to wait 1mins for statusFinished ../../src/tests/docker_containerizer_tests.cpp:610: Failure Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(driver, _))... Expected: to be called twice Actual: called once - unsatisfied and active F0721 21:59:54.950773 30622 logging.cpp:57] RAW: Pure virtual method called @ 0x7f3915347a02 google::LogMessage::Fail() @ 0x7f391534cee4 google::RawLog__() @ 0x7f3914890312 __cxa_pure_virtual @ 0x88c3ae mesos::internal::tests::Cluster::Slaves::shutdown() @ 0x88c176 mesos::internal::tests::Cluster::Slaves::~Slaves() @ 0x88dc16 mesos::internal::tests::Cluster::~Cluster() @ 0x88dc87 mesos::internal::tests::MesosTest::~MesosTest() @ 0xa529ab mesos::internal::tests::DockerContainerizerTest::~DockerContainerizerTest() @ 0xa8125f mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test() @ 0xa8128e mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test() @ 0x1218b4e testing::Test::DeleteSelf_() @ 0x1221909 testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x121cb38 testing::internal::HandleExceptionsInMethodIfSupported() @ 0x1205713 testing::TestInfo::Run() @ 0x1205c4e testing::TestCase::Run() @ 0x120a9ca testing::internal::UnitTestImpl::RunAllTests() @ 0x122277b 
testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x121d81b testing::internal::HandleExceptionsInMethodIfSupported() @ 0x120987a testing::UnitTest::Run() @ 0xcfbf0c main @ 0x7f391097caf5 __libc_start_main @ 0x882089 (unknown) make[3]: *** [check-local] Aborted (core dumped) make[3]: Leaving directory `/home/me/mesos/build/src' make[2]: *** [check-am] Error 2 make[2]: Leaving directory `/home/me/mesos/build/src' make[1]: *** [check] Error 2 make[1]: Leaving directory `/home/me/mesos/build/src' make: *** [check-recursive] Error 1 {code} was: Fails the test and then fails to shutdown the slaves. {code} [ RUN ] DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged ../../src/tests/docker_containerizer_tests.cpp:618: Failure Value of: statusRunning.get().state() Actual: TASK_LOST Expected: TASK_RUNNING ../../src/tests/docker_containerizer_tests.cpp:619: Failure Failed to wait 1mins for statusFinished ../../src/tests/docker_containerizer_tests.cpp:610: Failure Actual function call count doesn't match EXPECT_CALL(sched, statusUpdate(driver, _))... 
Expected: to be called twice Actual: called once - unsatisfied and active F0721 21:59:54.950773 30622 logging.cpp:57] RAW: Pure virtual method called @ 0x7f3915347a02 google::LogMessage::Fail() @ 0x7f391534cee4 google::RawLog__() @ 0x7f3914890312 __cxa_pure_virtual @ 0x88c3ae mesos::internal::tests::Cluster::Slaves::shutdown() @ 0x88c176 mesos::internal::tests::Cluster::Slaves::~Slaves() @ 0x88dc16 mesos::internal::tests::Cluster::~Cluster() @ 0x88dc87 mesos::internal::tests::MesosTest::~MesosTest() @ 0xa529ab mesos::internal::tests::DockerContainerizerTest::~DockerContainerizerTest() @ 0xa8125f mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test() @ 0xa8128e mesos::internal::tests::DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test::~DockerContainerizerTest_ROOT_DOCKER_Launch_Executor_Bridged_Test() @ 0x1218b4e testing::Test::DeleteSelf_() @ 0x1221909 testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x121cb38 testing::internal::HandleExceptionsInMethodIfSupported() @ 0x1205713 testing::TestInfo::Run() @ 0x1205c4e testing::TestCase::Run() @ 0x120a9ca testing::internal::UnitTestImpl::RunAllTests() @ 0x122277b testing::internal::HandleSehExceptionsInMethodIfSupported() @
[jira] [Commented] (MESOS-2736) Upgrade the design of MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635741#comment-14635741 ] Marco Massenzio commented on MESOS-2736: We have added the {{version}} field to {{MasterInfo}}: {noformat} commit b5ba769c6619ba878d60c1bad1b5dfbafa71360b Author: Marco Massenzio ma...@mesosphere.io Date: Wed Jul 1 15:21:17 2015 -0700 Added `version` string to MasterInfo. Jira: MESOS-2957 Adds an optional version string to the `MasterInfo` protobuf, to simplify handling versioning in HTTP API calls. The version string is taken from mesos/version.hpp which is the same used when starting up Master (master/main.cpp). I've added a simple test on the back of the `MasterDetector` test, as this was the place where we would expect the `MasterInfo` to have been fully populated by real production code (as opposed to other places, where it is actually handled by mocks). Review: https://reviews.apache.org/r/36036 {noformat} Upgrade the design of MasterInfo Key: MESOS-2736 URL: https://issues.apache.org/jira/browse/MESOS-2736 Project: Mesos Issue Type: Improvement Reporter: Marco Massenzio Assignee: Marco Massenzio Labels: mesosphere Currently, the {{MasterInfo}} PB only supports an {{ip}} field as an {{int32}}. Beyond making it harder (and opaque; open to subtle bugs) for languages other than C/C++ to decode into an IPv4 octets, this does not allow Mesos to support IPv6 Master nodes. We should consider ways to upgrade it in ways that permit us to support both IPv4 / IPv6 nodes, and, possibly, in a way that makes it easy for languages such as Java/Python that already have PB support, so could easily deserialize this information. See also MESOS-2709 for more info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3066) Replicated registry does not have a representation of maintenance schedules
[ https://issues.apache.org/jira/browse/MESOS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Wu updated MESOS-3066:
-
Description:
In order to persist maintenance schedules across failovers of the master, the schedule information must be kept in the replicated registry. This means adding an additional key in src/master/registry.proto. The status of each individual slave's maintenance will also be persisted in this way.

{code}
message Maintenance {
  message HostStatus {
    required string hostname = 1;

    // True if the slave is deactivated for maintenance.
    // False if the slave is draining in preparation for maintenance.
    required bool is_down = 2;
  }

  message Schedule {
    // The set of affected slave(s).
    repeated HostStatus hosts = 1;

    // Interval during which this set of slaves is expected to be down.
    optional Unavailability interval = 2;
  }

  message Schedules {
    repeated Schedule schedules;
  }

  optional Schedules schedules = 1;
}
{code}

Note: There can be multiple SlaveIDs attached to a single hostname.

was:
In order to persist maintenance schedules across failovers of the master, the schedule information must be kept in the replicated registry. This means adding an additional key in src/master/registry.proto. The status of each individual slave's maintenance will also be persisted in this way.

{code}
message Maintenance {
  message HostStatus {
    required string hostname = 1;

    // True if the slave is deactivated for maintenance.
    // False if the slave is draining in preparation for maintenance.
    required bool is_down = 2;
  }

  message Schedule {
    // The set of affected slave(s).
    repeated HostStatus hosts = 1;

    // Interval during which this set of slaves is expected to be down.
    optional Unavailability interval = 2;
  }

  message Schedules {
    repeated Schedule schedules;
  }

  optional Schedules schedules = 1;
}
{code}

Replicated registry does not have a representation of maintenance schedules
---

Key: MESOS-3066
URL: https://issues.apache.org/jira/browse/MESOS-3066
Project: Mesos
Issue Type: Task
Components: master, replicated log
Reporter: Joseph Wu
Assignee: Joseph Wu
Labels: mesosphere

In order to persist maintenance schedules across failovers of the master, the schedule information must be kept in the replicated registry. This means adding an additional key in src/master/registry.proto. The status of each individual slave's maintenance will also be persisted in this way.

{code}
message Maintenance {
  message HostStatus {
    required string hostname = 1;

    // True if the slave is deactivated for maintenance.
    // False if the slave is draining in preparation for maintenance.
    required bool is_down = 2;
  }

  message Schedule {
    // The set of affected slave(s).
    repeated HostStatus hosts = 1;

    // Interval during which this set of slaves is expected to be down.
    optional Unavailability interval = 2;
  }

  message Schedules {
    repeated Schedule schedules;
  }

  optional Schedules schedules = 1;
}
{code}

Note: There can be multiple SlaveIDs attached to a single hostname.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
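To make the shape of the proposed data easier to see, here is an illustrative Python mirror of the messages in MESOS-3066. It is a sketch, not the registry code. The class names follow the ticket, but the `Unavailability` fields (`start`, `duration`), the helper `hosts_down`, and the sample hostnames are assumptions, since the ticket does not define them. The `Schedules` wrapper message is flattened into a plain list for brevity. Note also that `repeated Schedule schedules;` as written in the ticket carries no field number, which proto2 requires.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Unavailability:
    # Assumed fields; the ticket does not define this message.
    start: float     # seconds since the epoch
    duration: float  # seconds


@dataclass
class HostStatus:
    hostname: str
    # True: slave is deactivated for maintenance.
    # False: slave is draining in preparation for maintenance.
    is_down: bool


@dataclass
class Schedule:
    hosts: List[HostStatus] = field(default_factory=list)
    interval: Optional[Unavailability] = None


@dataclass
class Maintenance:
    # The ticket wraps this list in a `Schedules` message; flattened here.
    schedules: List[Schedule] = field(default_factory=list)


def hosts_down(maintenance: Maintenance) -> List[str]:
    """Return hostnames marked as down, in schedule order."""
    return [
        host.hostname
        for schedule in maintenance.schedules
        for host in schedule.hosts
        if host.is_down
    ]


# Hypothetical sample data.
maintenance = Maintenance(
    schedules=[
        Schedule(
            hosts=[HostStatus("host-a", True), HostStatus("host-b", False)],
            interval=Unavailability(start=0.0, duration=3600.0),
        )
    ]
)
down = hosts_down(maintenance)
```

Status is keyed by hostname rather than by SlaveID, which matches the ticket's note that several SlaveIDs can share one hostname.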
[jira] [Commented] (MESOS-2736) Upgrade the design of MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635742#comment-14635742 ]

Marco Massenzio commented on MESOS-2736:

Now adding the {{ip_address}} string:

{code}
optional string ip_address = 6;
{code}

Upgrade the design of MasterInfo

Key: MESOS-2736
URL: https://issues.apache.org/jira/browse/MESOS-2736
Project: Mesos
Issue Type: Improvement
Reporter: Marco Massenzio
Assignee: Marco Massenzio
Labels: mesosphere

Currently, the {{MasterInfo}} PB only supports an {{ip}} field as an {{int32}}. Beyond making it harder (and opaque; open to subtle bugs) for languages other than C/C++ to decode into IPv4 octets, this does not allow Mesos to support IPv6 Master nodes.

We should consider ways to upgrade it that permit us to support both IPv4 and IPv6 nodes and, possibly, make it easy for languages such as Java/Python, which already have PB support, to deserialize this information.

See also MESOS-2709 for more info.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
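A string field such as {{ip_address}} can be consumed with a standard IP library in most languages. As a hedged Python sketch, the standard `ipaddress` module accepts both address families. The helper name `parse_master_ip` and the sample addresses are hypothetical, and the ticket does not specify the exact textual format stored in the field.

```python
import ipaddress


def parse_master_ip(text: str):
    """Parse a textual address and report its IP version (4 or 6)."""
    # ip_address() raises ValueError on malformed input.
    addr = ipaddress.ip_address(text)
    return addr, addr.version


# Hypothetical sample values from the documentation address ranges.
v4, v4_version = parse_master_ip("192.0.2.10")
v6, v6_version = parse_master_ip("2001:db8::1")
```

No byte-order decision is needed on the client side, and the same code path handles IPv4 and IPv6.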