[jira] [Assigned] (MESOS-3170) 0.23 Build fails when compiling against -lsasl2 which has been statically linked
[ https://issues.apache.org/jira/browse/MESOS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff reassigned MESOS-3170: - Assignee: Till Toenshoff 0.23 Build fails when compiling against -lsasl2 which has been statically linked Key: MESOS-3170 URL: https://issues.apache.org/jira/browse/MESOS-3170 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Chris Heller Assignee: Till Toenshoff Priority: Minor Labels: easyfix Fix For: 0.24.0 If the sasl library has been statically linked the check from CRAM-MD5 can fail, due to missing symbols. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2502) Enforce disk quota in Docker Containerizer
[ https://issues.apache.org/jira/browse/MESOS-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652035#comment-14652035 ] Deshi Xiao commented on MESOS-2502: --- Any update on this? Docker has not yet implemented a fully supported disk quota feature. Enforce disk quota in Docker Containerizer -- Key: MESOS-2502 URL: https://issues.apache.org/jira/browse/MESOS-2502 Project: Mesos Issue Type: Improvement Components: docker Reporter: Timothy Chen Priority: Minor Labels: gsoc2015 Currently we enforce disk quota with the Mesos containerizer, but we can also enforce it for Docker containers, so that when a container goes over its disk limit we can take a limiting action such as killing the container.
[jira] [Updated] (MESOS-3013) Extend DiscoveryInfo to include NetworkRequirement message
[ https://issues.apache.org/jira/browse/MESOS-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kapil Arya updated MESOS-3013: -- Sprint: (was: Mesosphere Sprint 15) Extend DiscoveryInfo to include NetworkRequirement message Key: MESOS-3013 URL: https://issues.apache.org/jira/browse/MESOS-3013 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere As per the [design doc|https://docs.google.com/document/d/17mXtAmdAXcNBwp_JfrxmZcQrs7EO6ancSbejrqjLQ0g], we need to enable frameworks to specify network requirements. The proposed message could be along the lines of:
{code}
message NetworkRequirement {
  enum Protocol { IPv4, IPv6 }
  required Protocol protocol;

  // A netgroup is the name given to a set of logically-related IPs that are
  // allowed to communicate among themselves. For example, one might want
  // to create separate netgroups for dev, testing, qa and prod deployment
  // environments.
  repeated string netgroups;

  // Sticky IPs allow a framework to re-launch a task with the same IP on a
  // different Slave/Node.
  optional bool sticky [default = false];

  // A unique id that the framework uses to tag the assigned IP. This tag
  // can be later used to reclaim the IP while relaunching the task.
  optional string id;
};
{code}
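For readers who want to experiment with the shape of the proposed message, the fields can be mirrored in a plain Python data class. This is only a sketch of the fields listed in the ticket, not generated protobuf code and not a Mesos API; the numeric enum values and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Protocol(Enum):
    """Mirrors the proposed Protocol enum; the numeric values are assumed."""
    IPv4 = 1
    IPv6 = 2


@dataclass
class NetworkRequirement:
    """Illustrative stand-in for the proposed message, not protobuf output."""
    protocol: Protocol                                   # required
    netgroups: List[str] = field(default_factory=list)   # repeated
    sticky: bool = False                                 # optional, default = false
    id: Optional[str] = None                             # optional tag used to reclaim the IP


# Hypothetical request: an IPv4 address in the "dev" netgroup that should
# follow the task when it is relaunched elsewhere.
example = NetworkRequirement(
    protocol=Protocol.IPv4, netgroups=["dev"], sticky=True, id="web-1")
```

Only `protocol` has to be supplied; the other fields fall back to the defaults named in the proposal.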
[jira] [Updated] (MESOS-3193) Implement AppC image discovery.
[ https://issues.apache.org/jira/browse/MESOS-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-3193: -- Issue Type: Task (was: Bug) Implement AppC image discovery. --- Key: MESOS-3193 URL: https://issues.apache.org/jira/browse/MESOS-3193 Project: Mesos Issue Type: Task Reporter: Yan Xu https://reviews.apache.org/r/34139/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3197) MemIsolatorTest/{0,1}.MemUsage fails on OS X
[ https://issues.apache.org/jira/browse/MESOS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-3197: - Summary: MemIsolatorTest/{0,1}.MemUsage fails on OS X (was: MemIsolaterTest/{0,1}.MemUsage fails on OS X) MemIsolatorTest/{0,1}.MemUsage fails on OS X Key: MESOS-3197 URL: https://issues.apache.org/jira/browse/MESOS-3197 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Michael Park Labels: mesosphere Looks like this is due to {{mlockall}} being unimplemented on OS X. {noformat} [--] 1 test from MemIsolatorTest/0, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE [ RUN ] MemIsolatorTest/0.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40558 [ FAILED ] MemIsolatorTest/0.MemUsage, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE (56 ms) [--] 1 test from MemIsolatorTest/0 (57 ms total) [--] 1 test from MemIsolatorTest/1, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE [ RUN ] MemIsolatorTest/1.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40572 [ FAILED ] MemIsolatorTest/1.MemUsage, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE (50 ms) [--] 1 test from MemIsolatorTest/1 (50 ms total) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1010) Python extension build is broken if gflags-dev is installed
[ https://issues.apache.org/jira/browse/MESOS-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652740#comment-14652740 ] Greg Mann commented on MESOS-1010: -- I'm working on this ticket for the current Mesosphere sprint, so I'd like to revive the discussion on the preferred solution. I was able to apply [~tillt]'s previous patch, https://reviews.apache.org/r/18723/, and successfully run this test with gflags installed; this corresponds to solution A from his previous comment, patching glog's configure to avoid detection of gflags. This solution seems satisfactory to me, but it does mean that some work will be required if we shift away from bundled libraries in the future; is this an eventuality we should be planning for? Python extension build is broken if gflags-dev is installed --- Key: MESOS-1010 URL: https://issues.apache.org/jira/browse/MESOS-1010 Project: Mesos Issue Type: Bug Components: build, python api Environment: Fedora 20, amd64. GCC: 4.8.2. Reporter: Nikita Vetoshkin Assignee: Greg Mann Labels: flaky-test, mesosphere In my environment, a Mesos build from master results in a broken Python API module {{_mesos.so}}:
{noformat}
nekto0n@ya-darkstar ~/workspace/mesos/src/python $ PYTHONPATH=build/lib.linux-x86_64-2.7/ python -c "import _mesos"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /home/nekto0n/workspace/mesos/src/python/build/lib.linux-x86_64-2.7/_mesos.so: undefined symbol: _ZN6google14FlagRegistererC1EPKcS2_S2_S2_PvS3_
{noformat}
The demangled version of the symbol looks like this:
{noformat}
google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, char const*, void*, void*)
{noformat}
During the {{./configure}} step, {{glog}} finds the {{gflags}} development files and starts using them, thus *implicitly* adding a dependency on {{libgflags.so}}. This breaks the Python extension module and may break other Mesos subsystems when moved to hosts without {{gflags}} installed.
[jira] [Updated] (MESOS-2970) Support container image caching
[ https://issues.apache.org/jira/browse/MESOS-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-2970: Assignee: (was: Timothy Chen) Support container image caching Key: MESOS-2970 URL: https://issues.apache.org/jira/browse/MESOS-2970 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Timothy Chen Labels: mesosphere Each image provisioner needs to implement its own storing and fetching of images, and to some degree needs to implement caching and concurrent downloads of the same layer/image. We already have the fetcher cache, and we should consider whether we can reuse it. If not, we should still have some primitives around caching that all the provisioners can reuse.
[jira] [Assigned] (MESOS-2971) Implement OverlayFS based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mei Wan reassigned MESOS-2971: -- Assignee: Mei Wan Implement OverlayFS based provisioner backend - Key: MESOS-2971 URL: https://issues.apache.org/jira/browse/MESOS-2971 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Mei Wan Labels: mesosphere Part of the image provisioning process is to call a backend to create a root filesystem based on the image's on-disk layout. The problem with the copy backend is that it wastes both IO and space, and the bind backend can only deal with one layer. An OverlayFS backend allows us to use the filesystem to merge multiple filesystems into one efficiently.
[jira] [Updated] (MESOS-2968) Implement shared copy based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-2968: Summary: Implement shared copy based provisioner backend (was: Implement copy based provisioner backend) Implement shared copy based provisioner backend --- Key: MESOS-2968 URL: https://issues.apache.org/jira/browse/MESOS-2968 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere Currently Appc and Docker each implement their own copy backend, but most of the logic is the same, since the input is just an image name with its dependencies. We can refactor both so that a single implementation is shared between the two provisioners, so Appc and Docker can reuse the shared copy backend.
[jira] [Updated] (MESOS-2971) Implement OverlayFS based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Chen updated MESOS-2971: Assignee: (was: Timothy Chen) Implement OverlayFS based provisioner backend - Key: MESOS-2971 URL: https://issues.apache.org/jira/browse/MESOS-2971 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Labels: mesosphere Part of the image provisioning process is to call a backend to create a root filesystem based on the image's on-disk layout. The problem with the copy backend is that it wastes both IO and space, and the bind backend can only deal with one layer. An OverlayFS backend allows us to use the filesystem to merge multiple filesystems into one efficiently.
[jira] [Created] (MESOS-3198) mesos.native could not found
pugna created MESOS-3198: Summary: mesos.native could not found Key: MESOS-3198 URL: https://issues.apache.org/jira/browse/MESOS-3198 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Environment: Ubuntu14.04 Mesos0.23.0 Reporter: pugna I deployed Apache Mesos 0.23 on Ubuntu 14.04. This error comes from the last step: # Run Python framework (Exits after successfully running some tasks.). $ ./src/examples/python/test-framework 127.0.0.1:5050 Mesos/src/examples/python/test_framework.py line 25, mesos.native could not found Can anyone help me solve this problem?
[jira] [Updated] (MESOS-3198) mesos.native could not found
[ https://issues.apache.org/jira/browse/MESOS-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Klaus Ma updated MESOS-3198: Attachment: Re mesos.native could not found.msg Attached the email thread for background. mesos.native could not found Key: MESOS-3198 URL: https://issues.apache.org/jira/browse/MESOS-3198 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Environment: Ubuntu14.04 Mesos0.23.0 Reporter: pugna Labels: build, newbie Attachments: Re mesos.native could not found.msg I deployed Apache Mesos 0.23 on Ubuntu 14.04. This error comes from the last step: # Run Python framework (Exits after successfully running some tasks.). $ ./src/examples/python/test-framework 127.0.0.1:5050 Mesos/src/examples/python/test_framework.py line 25, mesos.native could not found Can anyone help me solve this problem?
[jira] [Created] (MESOS-3197) MemIsolaterTest/{0,1}.MemUsage fails on OS X
Michael Park created MESOS-3197: --- Summary: MemIsolaterTest/{0,1}.MemUsage fails on OS X Key: MESOS-3197 URL: https://issues.apache.org/jira/browse/MESOS-3197 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Michael Park Looks like this is due to {{mlockall}} being unimplemented on OS X. {noformat} [--] 1 test from MemIsolatorTest/0, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE [ RUN ] MemIsolatorTest/0.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40558 [ FAILED ] MemIsolatorTest/0.MemUsage, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE (56 ms) [--] 1 test from MemIsolatorTest/0 (57 ms total) [--] 1 test from MemIsolatorTest/1, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE [ RUN ] MemIsolatorTest/1.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40572 [ FAILED ] MemIsolatorTest/1.MemUsage, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE (50 ms) [--] 1 test from MemIsolatorTest/1 (50 ms total) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1010) Python extension build is broken if gflags-dev is installed
[ https://issues.apache.org/jira/browse/MESOS-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-1010: - Description: In my environment, a Mesos build from master results in a broken Python API module {{_mesos.so}}:
{noformat}
nekto0n@ya-darkstar ~/workspace/mesos/src/python $ PYTHONPATH=build/lib.linux-x86_64-2.7/ python -c "import _mesos"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /home/nekto0n/workspace/mesos/src/python/build/lib.linux-x86_64-2.7/_mesos.so: undefined symbol: _ZN6google14FlagRegistererC1EPKcS2_S2_S2_PvS3_
{noformat}
The demangled version of the symbol looks like this:
{noformat}
google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, char const*, void*, void*)
{noformat}
During the {{./configure}} step, {{glog}} finds the {{gflags}} development files and starts using them, thus *implicitly* adding a dependency on {{libgflags.so}}. This breaks the Python extension module and may break other Mesos subsystems when moved to hosts without {{gflags}} installed. This task is done when the ExamplesTest.PythonFramework test passes on a system with gflags installed. was: In my environment, a Mesos build from master results in a broken Python API module {{_mesos.so}}:
{noformat}
nekto0n@ya-darkstar ~/workspace/mesos/src/python $ PYTHONPATH=build/lib.linux-x86_64-2.7/ python -c "import _mesos"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /home/nekto0n/workspace/mesos/src/python/build/lib.linux-x86_64-2.7/_mesos.so: undefined symbol: _ZN6google14FlagRegistererC1EPKcS2_S2_S2_PvS3_
{noformat}
The demangled version of the symbol looks like this:
{noformat}
google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, char const*, void*, void*)
{noformat}
During the {{./configure}} step, {{glog}} finds the {{gflags}} development files and starts using them, thus *implicitly* adding a dependency on {{libgflags.so}}. This breaks the Python extension module and may break other Mesos subsystems when moved to hosts without {{gflags}} installed.
Python extension build is broken if gflags-dev is installed --- Key: MESOS-1010 URL: https://issues.apache.org/jira/browse/MESOS-1010 Project: Mesos Issue Type: Bug Components: build, python api Environment: Fedora 20, amd64. GCC: 4.8.2. Reporter: Nikita Vetoshkin Assignee: Greg Mann Labels: flaky-test, mesosphere In my environment, a Mesos build from master results in a broken Python API module {{_mesos.so}}:
{noformat}
nekto0n@ya-darkstar ~/workspace/mesos/src/python $ PYTHONPATH=build/lib.linux-x86_64-2.7/ python -c "import _mesos"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /home/nekto0n/workspace/mesos/src/python/build/lib.linux-x86_64-2.7/_mesos.so: undefined symbol: _ZN6google14FlagRegistererC1EPKcS2_S2_S2_PvS3_
{noformat}
The demangled version of the symbol looks like this:
{noformat}
google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, char const*, void*, void*)
{noformat}
During the {{./configure}} step, {{glog}} finds the {{gflags}} development files and starts using them, thus *implicitly* adding a dependency on {{libgflags.so}}. This breaks the Python extension module and may break other Mesos subsystems when moved to hosts without {{gflags}} installed. This task is done when the ExamplesTest.PythonFramework test passes on a system with gflags installed.
[jira] [Commented] (MESOS-2968) Implement shared copy based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652859#comment-14652859 ] Yan Xu commented on MESOS-2968: --- [~tnachen] If we define the Backend as not provisioning images but rather provisioning a list of {{rootfs}}, then we can define the API as
{code}
class Backend {
  ...
  virtual process::Future<Nothing> provision(
      const std::vector<Path> roots,
      const Path directory) = 0;
}
{code}
The caller is responsible for figuring out how the layers should be resolved and ordered in this list. This way the {{Backend}} can be unified for AppC and Docker. How does that sound? BTW I think {{Backend::provision()}} can be confused with {{Provisioner::provision()}} and the word {{Backend}} is not as self-documenting as {{Installer::installer()}}. What do you think? /cc [~idownes] Implement shared copy based provisioner backend --- Key: MESOS-2968 URL: https://issues.apache.org/jira/browse/MESOS-2968 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere Currently Appc and Docker each implement their own copy backend, but most of the logic is the same, since the input is just an image name with its dependencies. We can refactor both so that a single implementation is shared between the two provisioners, so Appc and Docker can reuse the shared copy backend.
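To make the proposed split of responsibilities concrete, here is a rough Python analogue of a backend that receives an already ordered list of layer directories and materialises them into one root filesystem by copying. It is a sketch under several assumptions: it is synchronous rather than returning a future, `CopyBackend` and `demo` are hypothetical names, and it ignores image-format details such as whiteout files. It is not the Mesos implementation.

```python
import abc
import os
import shutil
import tempfile
from typing import List


class Backend(abc.ABC):
    """Hypothetical Python analogue of the interface proposed in the comment."""

    @abc.abstractmethod
    def provision(self, roots: List[str], directory: str) -> None:
        """Build a root filesystem in `directory` from the ordered `roots`."""


class CopyBackend(Backend):
    """Copies each layer in order, so later layers overwrite earlier ones."""

    def provision(self, roots: List[str], directory: str) -> None:
        os.makedirs(directory, exist_ok=True)
        for root in roots:
            # dirs_exist_ok requires Python 3.8 or newer.
            shutil.copytree(root, directory, dirs_exist_ok=True)


def demo() -> str:
    """Provision two throwaway layers and return the merged file's content."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "base")
        top = os.path.join(tmp, "top")
        rootfs = os.path.join(tmp, "rootfs")
        for layer, text in ((base, "base"), (top, "top")):
            os.makedirs(layer)
            with open(os.path.join(layer, "motd"), "w") as f:
                f.write(text)
        # The caller decides the layer order; the backend just applies it.
        CopyBackend().provision([base, top], rootfs)
        with open(os.path.join(rootfs, "motd")) as f:
            return f.read()
```

Because the backend never sees image names, the same class could in principle serve any provisioner that can resolve its image into such a list.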
[jira] [Created] (MESOS-3196) Always set TaskStatus.executor_id when sending a status update message from Executor
Kapil Arya created MESOS-3196: - Summary: Always set TaskStatus.executor_id when sending a status update message from Executor Key: MESOS-3196 URL: https://issues.apache.org/jira/browse/MESOS-3196 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Currently, the Executor doesn't always set TaskStatus.executor_id. This prevents the Slave TaskStatus label decorator hook from knowing the executor id. An appropriate place to automatically fill in the executor_id is ExecutorProcess::sendStatusUpdate(), since we are already filling in some other information there.
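The idea of filling in the executor id at the point where the update is sent can be sketched in a few lines of Python. This is an illustration only: the dictionary stands in for a TaskStatus message, `fill_executor_id` is a hypothetical helper, and leaving an already present value untouched is an assumption rather than something the ticket specifies.

```python
from typing import Dict


def fill_executor_id(status: Dict[str, str], executor_id: str) -> Dict[str, str]:
    """Return a copy of `status` with `executor_id` set if it was missing."""
    updated = dict(status)
    if not updated.get("executor_id"):
        # The sender knows its own id, so it can supply it on behalf of
        # executors that did not set the field themselves.
        updated["executor_id"] = executor_id
    return updated
```

For example, `fill_executor_id({"task_id": "t1"}, "e1")` yields a status that carries both the task id and the executor id.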
[jira] [Commented] (MESOS-3004) Design support running the command executor with provisioned image for running a task in a container
[ https://issues.apache.org/jira/browse/MESOS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652863#comment-14652863 ] Jie Yu commented on MESOS-3004: --- My proposal for solving this issue: https://docs.google.com/document/d/1n2emC2ruTMur5nURvLgGYJxuP-tgBwgLDrmg_17QSmA/edit# The main idea is to allow 'Volume' to specify an 'Image' as the source. The provisioner is going to prepare the rootfs according to that. Mesos command line executor will run under the host filesystem (i.e., not specify image in ContainerInfo). The task image will be moved to a volume. The Mesos command line executor will perform the root pivoting itself right before exec-ing the user process. Design support running the command executor with provisioned image for running a task in a container Key: MESOS-3004 URL: https://issues.apache.org/jira/browse/MESOS-3004 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere Mesos Containerizer uses the command executor to actually launch the user defined command, and the command executor then can communicate with the slave about the process lifecycle. When we provision a new container with the user specified image, we also need to be able to run the command executor in the container to support the same semantics. One approach is to dynamically mount in a static binary of the command executor with all its dependencies in a special directory so it doesn't interfere with the provisioned root filesystem and configure the mesos containerizer to run the command executor in that directory.
[jira] [Commented] (MESOS-3196) Always set TaskStatus.executor_id when sending a status update message from Executor
[ https://issues.apache.org/jira/browse/MESOS-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652985#comment-14652985 ] Vinod Kone commented on MESOS-3196: --- Why not have the slave set it? Always set TaskStatus.executor_id when sending a status update message from Executor Key: MESOS-3196 URL: https://issues.apache.org/jira/browse/MESOS-3196 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere Currently, the Executor doesn't always set TaskStatus.executor_id. This prevents the Slave TaskStatus label decorator hook from knowing the executor id. An appropriate place to automatically fill in the executor_id is ExecutorProcess::sendStatusUpdate(), since we are already filling in some other information there.
[jira] [Updated] (MESOS-3198) mesos.native could not found
[ https://issues.apache.org/jira/browse/MESOS-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Peach updated MESOS-3198: --- Labels: build newbie python (was: build newbie) mesos.native could not found Key: MESOS-3198 URL: https://issues.apache.org/jira/browse/MESOS-3198 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Environment: Ubuntu14.04 Mesos0.23.0 Reporter: pugna Labels: build, newbie, python Attachments: Re mesos.native could not found.msg I deployed Apache Mesos 0.23 on Ubuntu 14.04. This error comes from the last step: # Run Python framework (Exits after successfully running some tasks.). $ ./src/examples/python/test-framework 127.0.0.1:5050 Mesos/src/examples/python/test_framework.py line 25, mesos.native could not found Can anyone help me solve this problem?
[jira] [Commented] (MESOS-3198) mesos.native could not found
[ https://issues.apache.org/jira/browse/MESOS-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653059#comment-14653059 ] James Peach commented on MESOS-3198: FWIW, this is broken in my environment too. The mesos build seems to sprinkle python packages into both {{/usr/lib/python2.7/site-packages}} and {{/usr/libexec/mesos/python}}. At least in the default install, the packages in {{site-packages}} end up missing dependencies (e.g. protobuf). A separate but related bug is that the python tools do not install using the requested python version. We force the use of python 2.7 by setting {{PYTHON_VERSION}} in the build, but various tools like {{mesos-ps}} just use {{/usr/bin/env python}}. mesos.native could not found Key: MESOS-3198 URL: https://issues.apache.org/jira/browse/MESOS-3198 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Environment: Ubuntu14.04 Mesos0.23.0 Reporter: pugna Labels: build, newbie, python Attachments: Re mesos.native could not found.msg I deployed Apache Mesos 0.23 on Ubuntu 14.04. This error comes from the last step: # Run Python framework (Exits after successfully running some tasks.). $ ./src/examples/python/test-framework 127.0.0.1:5050 Mesos/src/examples/python/test_framework.py line 25, mesos.native could not found Can anyone help me solve this problem?
[jira] [Assigned] (MESOS-3197) MemIsolatorTest/{0,1}.MemUsage fails on OS X
[ https://issues.apache.org/jira/browse/MESOS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan reassigned MESOS-3197: Assignee: Artem Harutyunyan MemIsolatorTest/{0,1}.MemUsage fails on OS X Key: MESOS-3197 URL: https://issues.apache.org/jira/browse/MESOS-3197 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Michael Park Assignee: Artem Harutyunyan Labels: mesosphere Looks like this is due to {{mlockall}} being unimplemented on OS X. {noformat} [--] 1 test from MemIsolatorTest/0, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE [ RUN ] MemIsolatorTest/0.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40558 [ FAILED ] MemIsolatorTest/0.MemUsage, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE (56 ms) [--] 1 test from MemIsolatorTest/0 (57 ms total) [--] 1 test from MemIsolatorTest/1, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE [ RUN ] MemIsolatorTest/1.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40572 [ FAILED ] MemIsolatorTest/1.MemUsage, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE (50 ms) [--] 1 test from MemIsolatorTest/1 (50 ms total) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3197) MemIsolatorTest/{0,1}.MemUsage fails on OS X
[ https://issues.apache.org/jira/browse/MESOS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3197: - Shepherd: Michael Park Sprint: Mesosphere Sprint 16 Story Points: 2 MemIsolatorTest/{0,1}.MemUsage fails on OS X Key: MESOS-3197 URL: https://issues.apache.org/jira/browse/MESOS-3197 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Michael Park Labels: mesosphere Looks like this is due to {{mlockall}} being unimplemented on OS X. {noformat} [--] 1 test from MemIsolatorTest/0, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE [ RUN ] MemIsolatorTest/0.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40558 [ FAILED ] MemIsolatorTest/0.MemUsage, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE (56 ms) [--] 1 test from MemIsolatorTest/0 (57 ms total) [--] 1 test from MemIsolatorTest/1, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE [ RUN ] MemIsolatorTest/1.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40572 [ FAILED ] MemIsolatorTest/1.MemUsage, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE (50 ms) [--] 1 test from MemIsolatorTest/1 (50 ms total) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3197) MemIsolatorTest/{0,1}.MemUsage fails on OS X
[ https://issues.apache.org/jira/browse/MESOS-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653074#comment-14653074 ] Artem Harutyunyan commented on MESOS-3197: -- https://reviews.apache.org/r/37065/ MemIsolatorTest/{0,1}.MemUsage fails on OS X Key: MESOS-3197 URL: https://issues.apache.org/jira/browse/MESOS-3197 Project: Mesos Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Michael Park Labels: mesosphere Looks like this is due to {{mlockall}} being unimplemented on OS X. {noformat} [--] 1 test from MemIsolatorTest/0, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE [ RUN ] MemIsolatorTest/0.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40558 [ FAILED ] MemIsolatorTest/0.MemUsage, where TypeParam = N5mesos8internal5slave23PosixMemIsolatorProcessE (56 ms) [--] 1 test from MemIsolatorTest/0 (57 ms total) [--] 1 test from MemIsolatorTest/1, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE [ RUN ] MemIsolatorTest/1.MemUsage Failed to allocate RSS memory: Failed to make pages to be mapped unevictable: Function not implemented../../src/tests/containerizer/isolator_tests.cpp:812: Failure helper.increaseRSS(allocation): Failed to sync with the subprocess ../../src/tests/containerizer/isolator_tests.cpp:815: Failure (usage).failure(): Failed to get usage: No process found at 40572 [ FAILED ] MemIsolatorTest/1.MemUsage, where TypeParam = N5mesos8internal5tests6ModuleINS_5slave8IsolatorELNS1_8ModuleIDE0EEE (50 ms) [--] 1 test from MemIsolatorTest/1 (50 ms total) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3198) mesos.native could not found
[ https://issues.apache.org/jira/browse/MESOS-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653059#comment-14653059 ] James Peach edited comment on MESOS-3198 at 8/4/15 5:16 AM:
FWIW, this is broken in my environment too. The mesos build seems to sprinkle python packages into both {{/usr/lib/python2.7/site-packages}} and {{/usr/libexec/mesos/python}}. At least in the default install, the packages in {{site-packages}} end up missing dependencies (e.g. protobuf). You end up with mesos packages in 2 separate paths and python seems to only look at the first one:
{code}
$ rpm -ql mesos | grep python
/usr/lib/python2.7/site-packages/mesos
...
/usr/libexec/mesos/python/mesos
...
$ PYTHONPATH=/usr/libexec/mesos/python python2.7
>>> import mesos
>>> dir(mesos)
['__doc__', '__name__', '__path__']
>>> mesos.__path__
['/usr/lib/python2.7/site-packages/mesos']
>>> import sys
>>> sys.path
['', '/usr/libexec/mesos/python', '/usr/lib64/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7/site-packages', '/usr/lib/python2.7/site-packages']
{code}
A separate but related bug is that the python tools do not install using the requested python version. We force the use of python 2.7 in the build by setting {{PYTHON_VERSION}} in the build, but various tools like {{mesos-ps}} just use {{/usr/bin/env python}}.
was (Author: jamespeach): FWIW, this is broken in my environment too. The mesos build seems to sprinkle python packages into both {{/usr/lib/python2.7/site-packages}} and {{/usr/libexec/mesos/python}}. At least in the default install, the packages in {{site-packages}} end up missing dependencies (e.g. protobuf). A separate but related bug is that the python tools do not install using the requested python version. We force the use of python 2.7 in the build by setting {{PYTHON_VERSION}} in the build, but various tools like {{mesos-ps}} just use {{/usr/bin/env python}}.
mesos.native could not found Key: MESOS-3198 URL: https://issues.apache.org/jira/browse/MESOS-3198 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Environment: Ubuntu14.04 Mesos0.23.0 Reporter: pugna Labels: build, newbie, python Attachments: Re mesos.native could not found.msg
I deployed Apache Mesos 0.23 on Ubuntu 14.04. This error comes from the last step:
# Run Python framework (Exits after successfully running some tasks.).
$ ./src/examples/python/test-framework 127.0.0.1:5050
Mesos/src/examples/python/test_framework.py line 25, mesos.native could not found
Can anyone help me solve this problem?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
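To check whether a package is installed under more than one directory on the import path, as described in the comment above, something like the following may help. This is only a diagnostic sketch: `find_package_dirs` is a hypothetical helper, not part of Mesos, and it only looks for plain package directories (it ignores `.pth` files, zip imports and namespace-package machinery).

```python
import os
import sys


def find_package_dirs(package, search_path=None):
    """Return every directory named `package` found on the search path.

    The result keeps the order of the search path, so the first entry is
    the one a plain directory lookup would reach first.
    """
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        base = entry or os.getcwd()  # an empty entry means the current directory
        candidate = os.path.join(base, package)
        if os.path.isdir(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # More than one line of output means the package exists in several places.
    for path in find_package_dirs("mesos"):
        print(path)
```

Comparing this list with `mesos.__path__` shows which copy the interpreter actually picked up.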
[jira] [Updated] (MESOS-3064) Add 'principal' field to 'Resource.DiskInfo'
[ https://issues.apache.org/jira/browse/MESOS-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3064: Shepherd: Jie Yu Assignee: Michael Park Sprint: Mesosphere Sprint 16 Story Points: 1 Labels: mesosphere (was: ) Add 'principal' field to 'Resource.DiskInfo' Key: MESOS-3064 URL: https://issues.apache.org/jira/browse/MESOS-3064 Project: Mesos Issue Type: Task Reporter: Michael Park Assignee: Michael Park Labels: mesosphere In order to support authorization for persistent volumes, we should add the {{principal}} to {{Resource.DiskInfo}}, analogous to {{Resource.ReservationInfo.principal}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3021) Implement Docker Image Provisioner Reference Store
[ https://issues.apache.org/jira/browse/MESOS-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3021: --- Sprint: Mesosphere Sprint 14, Mesosphere Sprint 16 (was: Mesosphere Sprint 14) Implement Docker Image Provisioner Reference Store -- Key: MESOS-3021 URL: https://issues.apache.org/jira/browse/MESOS-3021 Project: Mesos Issue Type: Improvement Reporter: Lily Chen Assignee: Lily Chen Labels: mesosphere Create a comprehensive store to look up an image and tag's associated image layer ID. Implement add, remove, save, and update images and their associated tags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3065) Add authorization for persistent volume
[ https://issues.apache.org/jira/browse/MESOS-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3065: Shepherd: Jie Yu Assignee: Michael Park Sprint: Mesosphere Sprint 16 Labels: mesosphere (was: ) Add authorization for persistent volume --- Key: MESOS-3065 URL: https://issues.apache.org/jira/browse/MESOS-3065 Project: Mesos Issue Type: Task Reporter: Michael Park Assignee: Michael Park Labels: mesosphere
Persistent volumes should be authorized with the {{principal}} of the reserving entity (framework or master). The idea is to introduce {{Create}} and {{Destroy}} into the ACL.
{code}
message Create {
  // Subjects.
  required Entity principals = 1;

  // Objects? Perhaps the kind of volume? allowed permissions?
}

message Destroy {
  // Subjects.
  required Entity principals = 1;

  // Objects.
  required Entity creator_principals = 2;
}
{code}
When a framework/operator creates a persistent volume, create ACLs are checked to see if the framework (FrameworkInfo.principal) or the operator (Credential.user) is authorized to create persistent volumes. If not authorized, the create operation is rejected.
When a framework/operator destroys a persistent volume, destroy ACLs are checked to see if the framework (FrameworkInfo.principal) or the operator (Credential.user) is authorized to destroy the persistent volume created by a framework or operator (Resource.DiskInfo.principal). If not authorized, the destroy operation is rejected.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
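The create and destroy checks described in this ticket can be modelled roughly as below. This is a sketch under assumptions, not the Mesos authorizer: the classes `CreateAcl` and `DestroyAcl`, the `ANY` wildcard, and the deny-by-default behaviour when no ACL matches are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

ANY = "ANY"  # hypothetical wildcard meaning "every principal"


@dataclass
class CreateAcl:
    principals: set  # subjects allowed to create volumes


@dataclass
class DestroyAcl:
    principals: set          # subjects allowed to destroy volumes
    creator_principals: set  # whose volumes they may destroy


def _matches(entity, principal):
    """True if the entity set names this principal or the wildcard."""
    return ANY in entity or principal in entity


def may_create(acls, principal):
    """Allow the create operation if any create ACL names the principal."""
    return any(_matches(acl.principals, principal) for acl in acls)


def may_destroy(acls, principal, creator_principal):
    """Allow destroy only if one ACL matches both subject and creator."""
    return any(
        _matches(acl.principals, principal)
        and _matches(acl.creator_principals, creator_principal)
        for acl in acls
    )
```

Here `principal` stands for FrameworkInfo.principal or Credential.user, and `creator_principal` for Resource.DiskInfo.principal of the volume being destroyed.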
[jira] [Updated] (MESOS-2455) Add operator endpoint to destroy persistent volumes.
[ https://issues.apache.org/jira/browse/MESOS-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-2455: Assignee: Michael Park Sprint: Mesosphere Sprint 16 Story Points: 3 Labels: mesosphere (was: ) Add operator endpoint to destroy persistent volumes. Key: MESOS-2455 URL: https://issues.apache.org/jira/browse/MESOS-2455 Project: Mesos Issue Type: Task Reporter: Jie Yu Assignee: Michael Park Priority: Critical Labels: mesosphere Persistent volumes will not be released automatically. So we probably need an endpoint for operators to forcefully release persistent volumes. We probably need to add principal to Persistence struct and use ACLs to control who can release what. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2497) Create synchronous validations for Calls
[ https://issues.apache.org/jira/browse/MESOS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2497: --- Story Points: 8 (was: 3) Create synchronous validations for Calls Key: MESOS-2497 URL: https://issues.apache.org/jira/browse/MESOS-2497 Project: Mesos Issue Type: Bug Reporter: Isabel Jimenez Assignee: Isabel Jimenez Labels: HTTP, mesosphere The /call endpoint returns a 202 Accepted code, but it has to perform some basic validation first. If validation fails, it returns a 4xx code. We have to create a mechanism that validates the 'request' and sends back the appropriate code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
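A rough sketch of such a validate-then-accept step follows. The function `validate_call`, the field names `type` and `framework_id`, the set of call types and the choice of 400 for every failure are assumptions made for illustration; the ticket does not specify which checks are needed.

```python
import json

# Illustrative subset of call types; an assumption, not the full list.
KNOWN_TYPES = {"SUBSCRIBE", "ACCEPT", "DECLINE", "KILL"}


def validate_call(body):
    """Return (status_code, message) for a raw request body.

    Only cheap, synchronous checks happen here; 202 means the call was
    accepted for later processing, not that it succeeded.
    """
    try:
        call = json.loads(body)
    except ValueError:
        return 400, "body is not valid JSON"
    if not isinstance(call, dict):
        return 400, "body must be a JSON object"
    call_type = call.get("type")
    if not isinstance(call_type, str) or call_type not in KNOWN_TYPES:
        return 400, "missing or unknown 'type'"
    # In this sketch every call except SUBSCRIBE must name its framework.
    if call_type != "SUBSCRIBE" and "framework_id" not in call:
        return 400, "'framework_id' is required for this call type"
    return 202, "accepted"
```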
[jira] [Updated] (MESOS-2067) Add HTTP API to the master for maintenance operations.
[ https://issues.apache.org/jira/browse/MESOS-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2067: --- Sprint: (was: Mesosphere Sprint 16) Add HTTP API to the master for maintenance operations. -- Key: MESOS-2067 URL: https://issues.apache.org/jira/browse/MESOS-2067 Project: Mesos Issue Type: Task Components: master Reporter: Benjamin Mahler Assignee: Artem Harutyunyan Labels: mesosphere, twitter
Based on MESOS-1474, we'd like to provide an HTTP API on the master for the maintenance primitives in mesos. For the MVP, we'll want something like this for manipulating the schedule:
{code}
/maintenance
  GET - returns the schedule, which will include the various maintenance windows.
  POST - create or update the schedule with a JSON blob (see below).

/maintenance/status (Note: the slash might not be usable)
  GET - returns a list of machines and their maintenance mode.
  POST - change the mode of a list of machines. The request would have a query parameter to express the action of starting or stopping maintenance.
{code}
A schedule might look like:
{code}
{
  "windows" : [
    {
      "machines" : [ "192.168.0.1", "localhost", "foo.bar.com", ... ],
      "unavailability" : {
        "start" : 12345, // Epoch seconds.
        "duration" : 1000 // Seconds.
      }
    },
    ...
  ]
}
{code}
There should be firewall settings such that only those with access to master can use these endpoints.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
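For reference, a schedule of the shape proposed in this ticket could be assembled as follows before being POSTed. This is a sketch only: `make_window` and `make_schedule` are hypothetical helpers, and the sanity checks (non-empty machine list, non-negative start, positive duration) are assumptions rather than rules stated in the ticket.

```python
import json


def make_window(machines, start, duration):
    """Build one maintenance window.

    `start` is in epoch seconds and `duration` in seconds, following the
    comments in the example schedule.
    """
    if not machines:
        raise ValueError("a window needs at least one machine")
    if start < 0 or duration <= 0:
        raise ValueError("start must be >= 0 and duration must be > 0")
    return {
        "machines": list(machines),
        "unavailability": {"start": start, "duration": duration},
    }


def make_schedule(windows):
    """Serialize a list of windows into the JSON blob for the schedule."""
    return json.dumps({"windows": list(windows)}, indent=2)


if __name__ == "__main__":
    window = make_window(["192.168.0.1", "localhost", "foo.bar.com"], 12345, 1000)
    print(make_schedule([window]))
```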
[jira] [Updated] (MESOS-1841) Mesos components should expose their version on an endpoint.
[ https://issues.apache.org/jira/browse/MESOS-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-1841: --- Shepherd: Benjamin Mahler Mesos components should expose their version on an endpoint. Key: MESOS-1841 URL: https://issues.apache.org/jira/browse/MESOS-1841 Project: Mesos Issue Type: Improvement Reporter: David Robinson Assignee: haosdent Priority: Minor Labels: twitter The libraries (used by scheduler and executor) should expose their version on an endpoint. {noformat:title=stats exposed by the scheduler driver} # netstat -dnatp | grep LISTEN | grep java tcp0 0 0.0.0.0:48747 0.0.0.0:* LISTEN 1699/java tcp0 0 :::8081 :::* LISTEN 1699/java # curl -s http://localhost:48747/metrics/snapshot | python2.7 -m json.tool { scheduler/event_queue_messages: 0, system/cpus_total: 24, system/load_15min: 0.18, system/load_1min: 0.29, system/load_5min: 0.2, system/mem_free_bytes: 7850483712, system/mem_total_bytes: 33534873600 } {noformat} Likewise, for the master and slave, a lightweight endpoint is desired, see MESOS-2644 as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3165) Persist and recover quota to/from Registry
[ https://issues.apache.org/jira/browse/MESOS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3165: --- Story Points: 3 (was: 4) Persist and recover quota to/from Registry -- Key: MESOS-3165 URL: https://issues.apache.org/jira/browse/MESOS-3165 Project: Mesos Issue Type: Task Components: master, replicated log Reporter: Alexander Rukletsov Labels: mesosphere
To persist quotas across failovers, the Master should save them in the registry. To support this, we shall:
* Introduce a Quota state variable in registry.proto;
* Extend the Operation interface so that it supports a ‘Quota’ accumulator (see src/master/registrar.hpp);
* Introduce AddQuota / RemoveQuota operations;
* Recover quotas from the registry on failover to the Master’s internal::master::Role struct;
* Extend RegistrarTest with quota-specific tests.
NOTE: The registry variable can be rather big for production clusters (see MESOS-2075). While it should be fine for the MVP to add quota information to the registry, we should consider storing Quota separately, as it does not need to be in sync with slave updates. However, adding more variables is currently not supported by the registrar.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3074) Validate quota requests in Master
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3074: --- Story Points: 3 (was: 4) Validate quota requests in Master - Key: MESOS-3074 URL: https://issues.apache.org/jira/browse/MESOS-3074 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to validate quota requests in the Mesos Master as outlined in the Design Doc: https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I This ticket aims to validate satisfiability (in terms of available resources) of a quota request using a heuristic algorithm in the Mesos Master, rather than validating the syntax of the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3074) Check satisfiability of quota requests in Master
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3074: --- Summary: Check satisfiability of quota requests in Master (was: Validate quota requests in Master) Check satisfiability of quota requests in Master Key: MESOS-3074 URL: https://issues.apache.org/jira/browse/MESOS-3074 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to validate quota requests in the Mesos Master as outlined in the Design Doc: https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I This ticket aims to validate satisfiability (in terms of available resources) of a quota request using a heuristic algorithm in the Mesos Master, rather than validating the syntax of the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3073) Introduce HTTP endpoints for Quota
[ https://issues.apache.org/jira/browse/MESOS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3073: --- Story Points: 5 (was: 3) Introduce HTTP endpoints for Quota -- Key: MESOS-3073 URL: https://issues.apache.org/jira/browse/MESOS-3073 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Labels: mesosphere We need to implement the HTTP endpoints for Quota as outlined in the Design Doc: (https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I). This also includes validating quota requests in terms of syntax correctness, updating Master bookkeeping structures, persisting quota requests in the {{Registry}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-2937) Create a design document for Quota support in Allocator
[ https://issues.apache.org/jira/browse/MESOS-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov reassigned MESOS-2937: -- Assignee: Alexander Rukletsov Create a design document for Quota support in Allocator --- Key: MESOS-2937 URL: https://issues.apache.org/jira/browse/MESOS-2937 Project: Mesos Issue Type: Documentation Components: documentation Reporter: Alexander Rukletsov Assignee: Alexander Rukletsov Labels: mesosphere Create a design document for the Quota feature support in the built-in Hierarchical DRF allocator to be shared with the Mesos community. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3074) Check satisfiability of quota requests in Master
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov reassigned MESOS-3074: -- Assignee: Alexander Rukletsov (was: Joerg Schad) Check satisfiability of quota requests in Master Key: MESOS-3074 URL: https://issues.apache.org/jira/browse/MESOS-3074 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Alexander Rukletsov Labels: mesosphere We need to validate quota requests in the Mesos Master as outlined in the Design Doc: https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I This ticket aims to validate satisfiability (in terms of available resources) of a quota request using a heuristic algorithm in the Mesos Master, rather than validating the syntax of the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
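The ticket does not say which heuristic is meant. As a deliberately naive stand-in, the sketch below treats a quota request as satisfiable when the cluster-wide sum of free resources covers it. The function `is_satisfiable` and the dictionary-based resource model are assumptions for illustration; a real check would also have to account for per-agent fragmentation and for quotas already granted.

```python
def is_satisfiable(request, agents):
    """Naive capacity check for a quota request.

    `request` maps resource names to requested amounts, for example
    {"cpus": 4, "mem": 2048}. `agents` is a list of such mappings, one
    per agent, describing the resources currently free on that agent.
    """
    totals = {}
    for free in agents:
        for name, amount in free.items():
            totals[name] = totals.get(name, 0) + amount
    # Every requested resource must be covered by the cluster-wide total.
    return all(totals.get(name, 0) >= amount for name, amount in request.items())
```

Summing across agents is optimistic: a request for 4 cpus passes even if no single agent has more than 1 free, which is why this can only serve as a first filter.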
[jira] [Updated] (MESOS-3165) Persist and recover quota to/from Registry
[ https://issues.apache.org/jira/browse/MESOS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-3165: --- Story Points: 5 (was: 3) Persist and recover quota to/from Registry -- Key: MESOS-3165 URL: https://issues.apache.org/jira/browse/MESOS-3165 Project: Mesos Issue Type: Task Components: master, replicated log Reporter: Alexander Rukletsov Labels: mesosphere
To persist quotas across failovers, the Master should save them in the registry. To support this, we shall:
* Introduce a Quota state variable in registry.proto;
* Extend the Operation interface so that it supports a ‘Quota’ accumulator (see src/master/registrar.hpp);
* Introduce AddQuota / RemoveQuota operations;
* Recover quotas from the registry on failover to the Master’s internal::master::Role struct;
* Extend RegistrarTest with quota-specific tests.
NOTE: The registry variable can be rather big for production clusters (see MESOS-2075). While it should be fine for the MVP to add quota information to the registry, we should consider storing Quota separately, as it does not need to be in sync with slave updates. However, adding more variables is currently not supported by the registrar.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2937) Create a design document for Quota support in Allocator
[ https://issues.apache.org/jira/browse/MESOS-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-2937: --- Story Points: 8 Create a design document for Quota support in Allocator --- Key: MESOS-2937 URL: https://issues.apache.org/jira/browse/MESOS-2937 Project: Mesos Issue Type: Documentation Components: documentation Reporter: Alexander Rukletsov Labels: mesosphere Create a design document for the Quota feature support in the built-in Hierarchical DRF allocator to be shared with the Mesos community. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3023) Factoring out the pattern for URL generation
[ https://issues.apache.org/jira/browse/MESOS-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651586#comment-14651586 ] Bernd Mathiske commented on MESOS-3023: --- This latest patch first factors out the URL generation, then inserts white space:
commit 9c905880abc02e064f0430afab12afa6073549c8
Author: Artem Harutyunyan ar...@mesosphere.io
Date: Mon Aug 3 10:22:15 2015 +0200
Factored out the pattern for URL generation in (another) fetcher test.
Review: https://reviews.apache.org/r/36946
Factoring out the pattern for URL generation - Key: MESOS-3023 URL: https://issues.apache.org/jira/browse/MESOS-3023 Project: Mesos Issue Type: Task Reporter: Artem Harutyunyan Assignee: Klaus Ma Priority: Minor Labels: beginner, mesosphere, newbie
fetcher_test.cpp uses the following code for generating URLs:
string url = "http://" + net::getHostname(process.self().address.ip).get() + ":" + stringify(process.self().address.port) + "/" + process.self().id
It would be good to isolate that code in a function, and replace the code above with something like:
string url = "http://" + endpoint_url(process, uri_test);
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3170) 0.23 Build fails when compiling against -lsasl2 which has been statically linked
[ https://issues.apache.org/jira/browse/MESOS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-3170: -- Shepherd: Till Toenshoff 0.23 Build fails when compiling against -lsasl2 which has been statically linked Key: MESOS-3170 URL: https://issues.apache.org/jira/browse/MESOS-3170 Project: Mesos Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Chris Heller Assignee: Chris Heller Priority: Minor Labels: easyfix Fix For: 0.24.0 If the sasl library has been statically linked the check from CRAM-MD5 can fail, due to missing symbols. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2289) Design doc for the HTTP API
[ https://issues.apache.org/jira/browse/MESOS-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652154#comment-14652154 ] Craig W commented on MESOS-2289: Will this be published to the mesos site? Design doc for the HTTP API --- Key: MESOS-2289 URL: https://issues.apache.org/jira/browse/MESOS-2289 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Vinod Kone Fix For: 0.23.0 This tracks the design of the HTTP API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3187) Docker cli option support
[ https://issues.apache.org/jira/browse/MESOS-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Khanduja reassigned MESOS-3187: --- Assignee: Vaibhav Khanduja Docker cli option support - Key: MESOS-3187 URL: https://issues.apache.org/jira/browse/MESOS-3187 Project: Mesos Issue Type: Improvement Components: docker, slave Reporter: Vaibhav Khanduja Assignee: Vaibhav Khanduja Priority: Minor The Mesos slave today supports Docker as a container environment. The docker CLI supports many more options than the Mesos slave does. The slave command-line options should be enhanced to support such parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2688) Slave should kill revocable tasks if oversubscription is disabled
[ https://issues.apache.org/jira/browse/MESOS-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2688: -- Assignee: Jie Yu Slave should kill revocable tasks if oversubscription is disabled - Key: MESOS-2688 URL: https://issues.apache.org/jira/browse/MESOS-2688 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Jie Yu Labels: twitter If oversubscription is disabled on a restarted slave (that had it previously enabled), it should kill revocable tasks. Slave knows this information from the Resources of a container that it checkpoints and recovers. Add a new reason OVERSUBSCRIPTION_DISABLED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2688) Slave should kill revocable tasks if oversubscription is disabled
[ https://issues.apache.org/jira/browse/MESOS-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2688: -- Sprint: Twitter Mesos Q3 Sprint 3 Slave should kill revocable tasks if oversubscription is disabled - Key: MESOS-2688 URL: https://issues.apache.org/jira/browse/MESOS-2688 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Jie Yu Labels: twitter If oversubscription is disabled on a restarted slave (that had it previously enabled), it should kill revocable tasks. Slave knows this information from the Resources of a container that it checkpoints and recovers. Add a new reason OVERSUBSCRIPTION_DISABLED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2796) Implement AppC image provisioner.
[ https://issues.apache.org/jira/browse/MESOS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2796: -- Summary: Implement AppC image provisioner. (was: Implement a filesystem provisioner for AppC images (aci).) Implement AppC image provisioner. - Key: MESOS-2796 URL: https://issues.apache.org/jira/browse/MESOS-2796 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Yan Xu Labels: twitter Implement a filesystem provisioner that can provision container images compliant with the Application Container Image (aci) [specification|https://github.com/appc/spec]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2340) Add ability to decode JSON serialized MasterInfo from ZK
[ https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652241#comment-14652241 ] Marco Massenzio commented on MESOS-2340: Please don't use this ticket to file errors - it's much better to ask for help in the dev@ mailing list. Having said that, what you are attaching here is most definitely *not* the error cause: this log in fact, confirms that the Master was able to successfully decode the MasterInfo in Zookeeper (you probably still have a 0.23 Master running?). What error do you see? what is the symptom? and you may want to upload the full log output (ideally, as an attachment). Finally, if you really have installed mesosphere (did you mean DCOS?) and not Apache Mesos, you may want to file a request support there instead. FYI - this is what a 0.24 Master emits when starting up: {noformat} $ ./bin/mesos-master.sh --zk=zk://localhost:2181/test/report --quorum=1 --work_dir=/tmp/report ... I0803 11:33:41.037178 222056448 group.cpp:331] Group process (group(2)@10.0.77.243:5050) connected to ZooKeeper I0803 11:33:41.037196 220446720 group.cpp:331] Group process (group(1)@10.0.77.243:5050) connected to ZooKeeper I0803 11:33:41.037204 222056448 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (1, 0, 0) I0803 11:33:41.037217 220446720 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I0803 11:33:41.037250 220446720 group.cpp:403] Trying to create path '/test/report/log_replicas' in ZooKeeper I0803 11:33:41.037250 222056448 group.cpp:403] Trying to create path '/test/report/log_replicas' in ZooKeeper 2015-08-03 11:33:41,037:3806(0x10d8f3000):ZOO_INFO@check_events@1750: session establishment complete on server [::1:2181], sessionId=0x14ed7154df50023, negotiated timeout=1 I0803 11:33:41.038208 220983296 group.cpp:331] Group process (group(3)@10.0.77.243:5050) connected to ZooKeeper I0803 11:33:41.038229 220983296 
group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I0803 11:33:41.038240 220983296 group.cpp:403] Trying to create path '/test/report' in ZooKeeper 2015-08-03 11:33:41,038:3806(0x10d976000):ZOO_INFO@check_events@1750: session establishment complete on server [::1:2181], sessionId=0x14ed7154df50024, negotiated timeout=1 I0803 11:33:41.038566 223666176 group.cpp:331] Group process (group(4)@10.0.77.243:5050) connected to ZooKeeper I0803 11:33:41.038595 223666176 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I0803 11:33:41.038614 223666176 group.cpp:403] Trying to create path '/test/report' in ZooKeeper I0803 11:33:41.058315 222593024 contender.cpp:149] Joining the ZK group I0803 11:33:41.078806 222593024 contender.cpp:265] New candidate (id='1') has entered the contest for leadership I0803 11:33:41.079265 221519872 detector.cpp:156] Detected a new leader: (id='1') I0803 11:33:41.079429 223129600 group.cpp:674] Trying to get '/test/report/json.info_01' in ZooKeeper I0803 11:33:41.079447 219910144 network.hpp:415] ZooKeeper group memberships changed I0803 11:33:41.079527 222593024 group.cpp:674] Trying to get '/test/report/log_replicas/00' in ZooKeeper I0803 11:33:41.081306 220983296 detector.cpp:481] A new leading master (UPID=master@10.0.77.243:5050) is detected I0803 11:33:41.081464 219910144 network.hpp:463] ZooKeeper group PIDs: { log-replica(1)@10.0.77.243:5050 } I0803 11:33:41.081482 223129600 master.cpp:1495] The newly elected leader is master@10.0.77.243:5050 with id 20150803-113340-4081909770-5050-3806 I0803 11:33:41.082784 223129600 master.cpp:1508] Elected as the leading master! 
I0803 11:33:41.083602 223129600 master.cpp:1278] Recovering from registrar I0803 11:33:41.085345 222056448 registrar.cpp:313] Recovering registrar {noformat} the next Master detects it: {noformat} $ ./bin/mesos-master.sh --zk=zk://localhost:2181/test/report --quorum=1 --work_dir=/tmp/report2 --port=5051 ... I0803 11:37:27.356122 317771776 group.cpp:331] Group process (group(4)@10.0.77.243:5051) connected to ZooKeeper I0803 11:37:27.356145 317771776 group.cpp:805] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) I0803 11:37:27.356158 317771776 group.cpp:403] Trying to create path '/test/report' in ZooKeeper I0803 11:37:27.360528 316162048 network.hpp:415] ZooKeeper group memberships changed I0803 11:37:27.360651 315088896 group.cpp:674] Trying to get '/test/report/log_replicas/00' in ZooKeeper I0803 11:37:27.360689 317771776 detector.cpp:156] Detected a new leader: (id='1') I0803 11:37:27.360949 317771776 group.cpp:674] Trying to get '/test/report/json.info_01' in ZooKeeper I0803 11:37:27.361369 315088896 group.cpp:674] Trying to get '/test/report/log_replicas/01' in ZooKeeper I0803 11:37:27.362244 317235200 network.hpp:463] ZooKeeper
[jira] [Updated] (MESOS-2795) Introduce filesystem provisioner abstraction
[ https://issues.apache.org/jira/browse/MESOS-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2795: -- Summary: Introduce filesystem provisioner abstraction (was: Introduce filesystem provisioner abstraction to Mesos containerizer) Introduce filesystem provisioner abstraction Key: MESOS-2795 URL: https://issues.apache.org/jira/browse/MESOS-2795 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Ian Downes Labels: twitter Fix For: 0.24.0 Optional filesystem provisioner component for the Mesos containerizer that can provision per-container filesystems. This differs from a filesystem isolator because it just provisions a root filesystem for a container and doesn't actually do any isolation (e.g., through a mount namespace + pivot or chroot). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3097) OS-specific code touched by the containerizer tests is not Windows compatible
[ https://issues.apache.org/jira/browse/MESOS-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Wu updated MESOS-3097: - Assignee: Alex Clemmer (was: Joseph Wu) OS-specific code touched by the containerizer tests is not Windows compatible - Key: MESOS-3097 URL: https://issues.apache.org/jira/browse/MESOS-3097 Project: Mesos Issue Type: Story Components: libprocess, stout Reporter: Joseph Wu Assignee: Alex Clemmer Labels: mesosphere In the process of adding the Cmake build system, [~hausdorff] noted and stubbed out all OS-specific code. That sweep (mostly of libprocess and stout) is here: https://github.com/hausdorff/mesos/commit/b862f66c6ff58c115a009513621e5128cb734d52 Instead of having inline {{#if defined(...)}}, the OS-specific code will be separated into directories. The Windows code will be stubbed out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2695) Add master flag to enable/disable oversubscription
[ https://issues.apache.org/jira/browse/MESOS-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2695: -- Sprint: Twitter Mesos Q3 Sprint 3 Add master flag to enable/disable oversubscription -- Key: MESOS-2695 URL: https://issues.apache.org/jira/browse/MESOS-2695 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Jie Yu Labels: twitter This flag lets an operator control cluster level oversubscription. The master should send revocable offers to framework iff this flag is enabled and the framework opts in to receive them. Master should ignore revocable resources from slaves if the flag is disabled. Need tests for all these scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3037) Add a SUPPRESS call to the scheduler
[ https://issues.apache.org/jira/browse/MESOS-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3037: -- Sprint: Twitter Mesos Q3 Sprint 2 (was: Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3) Add a SUPPRESS call to the scheduler Key: MESOS-3037 URL: https://issues.apache.org/jira/browse/MESOS-3037 Project: Mesos Issue Type: Improvement Reporter: Vinod Kone Assignee: Vinod Kone SUPPRESS call is the complement to the current REVIVE call i.e., it will inform Mesos to stop sending offers to the framework. For the scheduler driver to send only Call messages (MESOS-2913), DeactivateFrameworkMessage needs to be converted to Call(s). We can implement this by having the driver send a SUPPRESS call followed by a DECLINE call for outstanding offers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3188) Add tuple-awareness to Future callbacks.
[ https://issues.apache.org/jira/browse/MESOS-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652212#comment-14652212 ] Benjamin Mahler commented on MESOS-3188: Sounds good! Add tuple-awareness to Future callbacks. Key: MESOS-3188 URL: https://issues.apache.org/jira/browse/MESOS-3188 Project: Mesos Issue Type: Improvement Components: libprocess Reporter: Benjamin Mahler Future is currently single-valued and so the only way to create a multi-value Future is to use a custom struct or a tuple. Since Future is not currently tuple-aware, continuations have to take a tuple as well, and manually unpack:
{code}
{
  // Subprocess example.
  await(s.get().status(),
        io::read(s.get().out().get()),
        io::read(s.get().err().get()))
    .then(defer(self(), Self::continue, lambda::_1));
}

void continue(
    std::tuple<Future<Option<int>>, Future<string>, Future<string>> results)
{
  Future<Option<int>> status = std::get<0>(results);
  Future<string> output = std::get<1>(results);
  Future<string> error = std::get<2>(results);
}
{code}
Since multi-value Future (i.e. Future<T1, T2, ...>) seems to be a bad design choice, being tuple aware can improve the code cleanliness by unpacking the tuple automatically for the continuation:
{code}
{
  // Subprocess example.
  await(s.get().status(),
        io::read(s.get().out().get()),
        io::read(s.get().err().get()))
    .then(defer(self(), Self::continue, lambda::_1));
}

void continue(
    Future<Option<int>> status,
    Future<string> output,
    Future<string> error)
{
  ...
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
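The automatic unpacking proposed in MESOS-3188 can be illustrated outside of libprocess. The following is only a sketch under stated assumptions: it uses std::apply from the C++17 standard library rather than any libprocess facility, it passes plain values instead of Future<...>, and the names unpacked, summarize and unpackExample are hypothetical and do not come from the ticket.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical helper (not part of libprocess): wraps a callable that takes
// N separate arguments so that it can be invoked with a single std::tuple.
template <typename F>
auto unpacked(F f)
{
  return [f = std::move(f)](auto&& tuple) {
    // std::apply (C++17) expands the tuple elements into the argument list.
    return std::apply(f, std::forward<decltype(tuple)>(tuple));
  };
}

// Stand-in for the continuation in the ticket. It takes plain values instead
// of Future<...> so that the sketch compiles without libprocess.
inline std::string summarize(
    int status,
    const std::string& output,
    const std::string& error)
{
  return std::to_string(status) + "|" + output + "|" + error;
}

// Usage check: the wrapped continuation accepts the tuple directly.
inline void unpackExample()
{
  auto continuation = unpacked(summarize);
  auto results = std::make_tuple(0, std::string("out"), std::string("err"));
  assert(continuation(results) == "0|out|err");
}
```

The continuation keeps its natural three-argument signature, and only the adapter knows about the tuple. A tuple-aware Future would presumably apply an equivalent adapter inside then(), but that part is not shown here.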
[jira] [Created] (MESOS-3189) TimeTest.Now fails with --enable-libevent
Joris Van Remoortere created MESOS-3189: --- Summary: TimeTest.Now fails with --enable-libevent Key: MESOS-3189 URL: https://issues.apache.org/jira/browse/MESOS-3189 Project: Mesos Issue Type: Bug Components: libprocess Affects Versions: 0.23.0 Reporter: Joris Van Remoortere [ RUN ] TimeTest.Now ../../../3rdparty/libprocess/src/tests/time_tests.cpp:50: Failure Expected: (Microseconds(10)) < (Clock::now() - t1), actual: 8-byte object <10-27 00-00 00-00 00-00> vs 0ns [ FAILED ] TimeTest.Now (0 ms) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2968) Implement copy based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2968: -- Summary: Implement copy based provisioner backend (was: Implement Shared Copy backend) Implement copy based provisioner backend Key: MESOS-2968 URL: https://issues.apache.org/jira/browse/MESOS-2968 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere Currently, the Appc and Docker provisioners each implement their own copy backend, but most of the logic is the same: the input is just an image name with its dependencies. We can refactor both so that there is a single implementation shared between the two provisioners, letting Appc and Docker reuse the shared copy backend. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3190) Implement bind mount based provisioner backend
Jie Yu created MESOS-3190: - Summary: Implement bind mount based provisioner backend Key: MESOS-3190 URL: https://issues.apache.org/jira/browse/MESOS-3190 Project: Mesos Issue Type: Bug Reporter: Jie Yu This is a specialized backend that may be useful for deployments using large (multi-GB) single-layer images and where more recent kernel features such as overlayfs are not available. For small images (10's to 100's of MB) the Copy backend may be sufficient. 1) It supports only a single layer. Multi-layer images will fail to provision and the container will fail to launch! 2) The filesystem is read-only because all containers using this image share the source. Select writable areas can be achieved by mounting read-write volumes to places like /tmp, /var/tmp, /home, etc. using the ContainerInfo. These can be relative to the executor work directory. 3) It relies on the image persisting in the store. 4) It's fast because the bind mount requires (nearly) zero IO. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2289) Design doc for the HTTP API
[ https://issues.apache.org/jira/browse/MESOS-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652172#comment-14652172 ] Vinod Kone commented on MESOS-2289: --- We currently do not publish design docs to the website (though there has been recent discussion around this). Note that this API is not ready for public consumption yet. You can follow the epic MESOS-2288 to track its progress. Once that epic completes, we will definitely have a doc that explains the new HTTP API, up on the website. Design doc for the HTTP API --- Key: MESOS-2289 URL: https://issues.apache.org/jira/browse/MESOS-2289 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Vinod Kone Fix For: 0.23.0 This tracks the design of the HTTP API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3178) Perform a self bind mount of rootfs itself in fs::chroot::enter.
[ https://issues.apache.org/jira/browse/MESOS-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-3178: -- Fix Version/s: 0.24.0 Perform a self bind mount of rootfs itself in fs::chroot::enter. Key: MESOS-3178 URL: https://issues.apache.org/jira/browse/MESOS-3178 Project: Mesos Issue Type: Bug Reporter: Jie Yu Assignee: Jie Yu Fix For: 0.24.0 The 'pivot_root' syscall requires that the old and the new root are not in the same filesystem. Otherwise, the caller will receive a 'Device or resource busy' error. Currently, we rely on the provisioner to prepare the rootfs and do a proper bind mount if needed so that pivot_root can succeed. The drawback of this approach is that it potentially pollutes the host mount table, which requires cleanup logic. For instance, in the test, we create a test rootfs by copying the host files. We need to do a self bind mount so that we can pivot_root on it. That pollutes the host mount table, and it might leak mounts if the test crashes before we do the lazy umount: https://github.com/apache/mesos/blob/master/src/tests/containerizer/launch_tests.cpp#L96-L102 What I propose is that we always perform a recursive self bind mount of rootfs itself in fs::chroot::enter (after entering the new mount namespace). It seems that this is also done in libcontainer: https://github.com/opencontainers/runc/blob/master/libcontainer/rootfs_linux.go#L402 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2697) Add a /teardown endpoint on master to teardown a framework
[ https://issues.apache.org/jira/browse/MESOS-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652144#comment-14652144 ] Craig W commented on MESOS-2697: Is there a page that documents the HTTP API? Add a /teardown endpoint on master to teardown a framework -- Key: MESOS-2697 URL: https://issues.apache.org/jira/browse/MESOS-2697 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Vinod Kone Fix For: 0.23.0 We plan to rename /shutdown endpoint to /teardown to be compatible with the new API. /shutdown will be deprecated in 0.23.0 or later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2697) Add a /teardown endpoint on master to teardown a framework
[ https://issues.apache.org/jira/browse/MESOS-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652151#comment-14652151 ] Vinod Kone commented on MESOS-2697: --- Yes. See https://issues.apache.org/jira/browse/MESOS-2289 Add a /teardown endpoint on master to teardown a framework -- Key: MESOS-2697 URL: https://issues.apache.org/jira/browse/MESOS-2697 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Vinod Kone Fix For: 0.23.0 We plan to rename /shutdown endpoint to /teardown to be compatible with the new API. /shutdown will be deprecated in 0.23.0 or later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3179) Create a test abstraction for preparing test rootfs.
[ https://issues.apache.org/jira/browse/MESOS-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3179: -- Sprint: Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q3 Sprint 2) Create a test abstraction for preparing test rootfs. Key: MESOS-3179 URL: https://issues.apache.org/jira/browse/MESOS-3179 Project: Mesos Issue Type: Task Reporter: Jie Yu Assignee: Jie Yu Several tests need this abstraction, so it's better to unify them. For example, src/tests/containerizer/launch_tests.cpp needs to create a test rootfs. We also need that to test filesystem isolators. The test rootfs can be created by copying files/directories from host file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2794) Implement filesystem isolators
[ https://issues.apache.org/jira/browse/MESOS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2794: -- Assignee: Jie Yu (was: Ian Downes) Implement filesystem isolators -- Key: MESOS-2794 URL: https://issues.apache.org/jira/browse/MESOS-2794 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Jie Yu Labels: twitter Move persistent volume support from Mesos containerizer to separate filesystem isolators, including support for container rootfs, where possible. Use symlinks for posix systems without container rootfs. Use bind mounts for Linux with/without container rootfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3182) Make Master::registerFramework() and Master::reregisterFramework() call into Master::subscribe()
[ https://issues.apache.org/jira/browse/MESOS-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3182: -- Sprint: Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q3 Sprint 2) Make Master::registerFramework() and Master::reregisterFramework() call into Master::subscribe() Key: MESOS-3182 URL: https://issues.apache.org/jira/browse/MESOS-3182 Project: Mesos Issue Type: Improvement Reporter: Vinod Kone Assignee: Vinod Kone Currently Master::subscribe() calls into Master::registerFramework() and Master::reregisterFramework(). We should do it the other way around to be consistent with how we did all the other calls. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3167) Design doc for versioning the HTTP API
[ https://issues.apache.org/jira/browse/MESOS-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3167: -- Sprint: Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q3 Sprint 2) Design doc for versioning the HTTP API -- Key: MESOS-3167 URL: https://issues.apache.org/jira/browse/MESOS-3167 Project: Mesos Issue Type: Documentation Reporter: Vinod Kone Assignee: Vinod Kone In concert with the release of the HTTP API, we would also like to come up with a versioning strategy. This enables us to do a meaningful 1.0 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2794) Implement filesystem isolators
[ https://issues.apache.org/jira/browse/MESOS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2794: -- Sprint: Twitter Q2 Sprint 3, Twitter Mesos Q2 Sprint 5, Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Q2 Sprint 3, Twitter Mesos Q2 Sprint 5, Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2) Implement filesystem isolators -- Key: MESOS-2794 URL: https://issues.apache.org/jira/browse/MESOS-2794 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Ian Downes Labels: twitter Move persistent volume support from Mesos containerizer to separate filesystem isolators, including support for container rootfs, where possible. Use symlinks for posix systems without container rootfs. Use bind mounts for Linux with/without container rootfs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2834) Support different perf output formats
[ https://issues.apache.org/jira/browse/MESOS-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2834: -- Sprint: Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q2 Sprint 6, Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2) Support different perf output formats - Key: MESOS-2834 URL: https://issues.apache.org/jira/browse/MESOS-2834 Project: Mesos Issue Type: Improvement Components: isolation Reporter: Ian Downes Assignee: Paul Brett Labels: twitter The output format of perf changes in 3.14 (inserting an additional field) and again in 4.1 (appending additional fields). See kernel commits: 410136f5dd96b6013fe6d1011b523b1c247e1ccb d73515c03c6a2706e088094ff6095a3abefd398b Update the perf::parse() function to understand all these formats. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
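One way to make a parser tolerate all three perf formats from MESOS-2834 is to choose the layout from the number of comma-separated fields. The sketch below is not the actual perf::parse() implementation. The ticket only states that one field is inserted in 3.14 and that fields are appended in 4.1, so the concrete layouts used here (a unit in second position, then running time and ratio at the end) are assumptions that should be checked against the kernel commits listed in the ticket. Sample, split, parseLine and perfExample are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// The fields a caller typically needs from one line of perf output.
struct Sample
{
  std::string value;
  std::string event;
  std::string cgroup;
};

// Splits `line` on `separator`, keeping empty fields.
inline std::vector<std::string> split(const std::string& line, char separator)
{
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  while (true) {
    const std::string::size_type end = line.find(separator, start);
    if (end == std::string::npos) {
      fields.push_back(line.substr(start));
      return fields;
    }
    fields.push_back(line.substr(start, end - start));
    start = end + 1;
  }
}

// Hypothetical parser: picks a layout from the number of fields.
inline std::optional<Sample> parseLine(const std::string& line)
{
  const std::vector<std::string> f = split(line, ',');
  switch (f.size()) {
    case 3:  // Assumed oldest layout: value,event,cgroup
      return Sample{f[0], f[1], f[2]};
    case 4:  // Assumed 3.14+ layout: value,unit,event,cgroup
    case 6:  // Assumed 4.1+ layout: value,unit,event,cgroup,running,ratio
      return Sample{f[0], f[2], f[3]};
    default:  // Unknown layout: report it rather than guess.
      return std::nullopt;
  }
}

// Usage check covering the three assumed layouts.
inline void perfExample()
{
  assert(parseLine("1234,cycles,mesos/abc")->event == "cycles");
  assert(parseLine("1234,,cycles,mesos/abc")->event == "cycles");
  assert(parseLine("1234,,cycles,mesos/abc,5000,100.00")->event == "cycles");
}
```

Dispatching on the field count avoids having to query the perf version, at the cost of failing on any future layout that happens to reuse one of these counts. Checking the version explicitly would be the stricter alternative.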
[jira] [Updated] (MESOS-2757) Add -> operator for Option<T>, Try<T>, Result<T>, Future<T>.
[ https://issues.apache.org/jira/browse/MESOS-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2757: -- Sprint: Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q3 Sprint 2) Add -> operator for Option<T>, Try<T>, Result<T>, Future<T>. Key: MESOS-2757 URL: https://issues.apache.org/jira/browse/MESOS-2757 Project: Mesos Issue Type: Improvement Components: libprocess, stout Reporter: Joris Van Remoortere Assignee: Benjamin Mahler Labels: c++11, option, stout, twitter Let's add operator overloads to Option<T> to allow access to the underlying T using the `->` operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
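To make the MESOS-2757 request concrete, here is a minimal sketch of what such an overload could look like. The Option class below is a simplified stand-in written for this example, not stout's actual Option<T>: it requires T to be default-constructible and models only isSome(), isNone() and get().

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for stout's Option<T>; only what the sketch needs.
// Assumption: T is default-constructible (the real class need not be).
template <typename T>
class Option
{
public:
  Option() : some(false), value() {}
  Option(const T& t) : some(true), value(t) {}

  bool isSome() const { return some; }
  bool isNone() const { return !some; }

  const T& get() const { assert(isSome()); return value; }
  T& get() { assert(isSome()); return value; }

  // The proposed overloads: forward member access to the underlying T.
  const T* operator->() const { return &get(); }
  T* operator->() { return &get(); }

private:
  bool some;
  T value;
};
```

With these overloads, option.get().size() can be written as option->size(). As with get(), using the operator on an empty Option is a programming error, which this sketch catches with an assert.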
[jira] [Updated] (MESOS-3037) Add a SUPPRESS call to the scheduler
[ https://issues.apache.org/jira/browse/MESOS-3037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3037: -- Sprint: Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q3 Sprint 2) Add a SUPPRESS call to the scheduler Key: MESOS-3037 URL: https://issues.apache.org/jira/browse/MESOS-3037 Project: Mesos Issue Type: Improvement Reporter: Vinod Kone Assignee: Vinod Kone SUPPRESS call is the complement to the current REVIVE call i.e., it will inform Mesos to stop sending offers to the framework. For the scheduler driver to send only Call messages (MESOS-2913), DeactivateFrameworkMessage needs to be converted to Call(s). We can implement this by having the driver send a SUPPRESS call followed by a DECLINE call for outstanding offers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2994) Design doc for creating user namespaces inside containers
[ https://issues.apache.org/jira/browse/MESOS-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2994: -- Sprint: Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2, Twitter Mesos Q3 Sprint 3 (was: Twitter Mesos Q3 Sprint 1, Twitter Mesos Q3 Sprint 2) Design doc for creating user namespaces inside containers - Key: MESOS-2994 URL: https://issues.apache.org/jira/browse/MESOS-2994 Project: Mesos Issue Type: Improvement Reporter: Paul Brett Assignee: Paul Brett Labels: twitter -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-2796) Implement a filesystem provisioner for AppC images (aci).
[ https://issues.apache.org/jira/browse/MESOS-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-2796: - Assignee: Yan Xu (was: Ian Downes) Implement a filesystem provisioner for AppC images (aci). - Key: MESOS-2796 URL: https://issues.apache.org/jira/browse/MESOS-2796 Project: Mesos Issue Type: Improvement Components: isolation Affects Versions: 0.22.1 Reporter: Ian Downes Assignee: Yan Xu Labels: twitter Implement a filesystem provisioner that can provision container images compliant with the Application Container Image (aci) [specification|https://github.com/appc/spec]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2695) Add master flag to enable/disable oversubscription
[ https://issues.apache.org/jira/browse/MESOS-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2695: -- Assignee: Jie Yu Summary: Add master flag to enable/disable oversubscription (was: Add master flag to enable oversubscription) Add master flag to enable/disable oversubscription -- Key: MESOS-2695 URL: https://issues.apache.org/jira/browse/MESOS-2695 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Jie Yu Labels: twitter This flag lets an operator control cluster level oversubscription. The master should send revocable offers to framework iff this flag is enabled and the framework opts in to receive them. Master should ignore revocable resources from slaves if the flag is disabled. Need tests for all these scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2971) Implement OverlayFS based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2971: -- Summary: Implement OverlayFS based provisioner backend (was: Add OverlayFS based provisioner backend) Implement OverlayFS based provisioner backend - Key: MESOS-2971 URL: https://issues.apache.org/jira/browse/MESOS-2971 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere Part of the image provisioning process is to call a backend to create a root filesystem based on the image's on-disk layout. The problem with the copy backend is that it wastes both IO and space, and the bind backend can only deal with one layer. The OverlayFS backend allows us to utilize the filesystem to merge multiple filesystems into one efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2846) Refactor passing and instantiating multiple provisioners
[ https://issues.apache.org/jira/browse/MESOS-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652331#comment-14652331 ] Jie Yu commented on MESOS-2846: --- Can someone provide more details on this ticket? cc [~tnachen], [~idownes] Refactor passing and instantiating multiple provisioners Key: MESOS-2846 URL: https://issues.apache.org/jira/browse/MESOS-2846 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Ian Downes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3193) Implement AppC image discovery.
Yan Xu created MESOS-3193: - Summary: Implement AppC image discovery. Key: MESOS-3193 URL: https://issues.apache.org/jira/browse/MESOS-3193 Project: Mesos Issue Type: Bug Reporter: Yan Xu https://reviews.apache.org/r/34139/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2066) Add optional 'Unavailability' to resource offers to provide maintenance awareness.
[ https://issues.apache.org/jira/browse/MESOS-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2066: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Add optional 'Unavailability' to resource offers to provide maintenance awareness. -- Key: MESOS-2066 URL: https://issues.apache.org/jira/browse/MESOS-2066 Project: Mesos Issue Type: Task Reporter: Benjamin Mahler Assignee: Joseph Wu Labels: mesosphere In order to inform frameworks about upcoming maintenance on offered resources, per MESOS-1474, we'd like to add optional 'Unavailability' information to offers:
{code}
message Interval {
  optional double start = 1; // Time, in seconds since the Epoch.
  optional double duration = 2; // Time, in seconds.
}

message Offer {
  // Existing fields
  ...

  // Signifies that the resources in this Offer are part of a planned
  // maintenance schedule in the specified window. Any tasks launched
  // using these resources may be killed when the window arrives.
  // This field gives additional information about the maintenance.
  // The maintenance may not necessarily start exactly at this interval,
  // nor last for exactly the duration of this interval.
  optional Interval unavailability = 9;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2949) Design generalized Authorizer interface
[ https://issues.apache.org/jira/browse/MESOS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2949: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Design generalized Authorizer interface --- Key: MESOS-2949 URL: https://issues.apache.org/jira/browse/MESOS-2949 Project: Mesos Issue Type: Task Components: master, security Reporter: Alexander Rojas Assignee: Alexander Rojas Labels: acl, mesosphere, security As mentioned in MESOS-2948 the current {{mesos::Authorizer}} interface is rather inflexible if new _Actions_ or _Objects_ need to be added. A new API needs to be designed in a way that allows for arbitrary _Actions_ and _Objects_ to be added to the authorization mechanism without having to recompile mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3079) `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too)
[ https://issues.apache.org/jira/browse/MESOS-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3079: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too) - Key: MESOS-3079 URL: https://issues.apache.org/jira/browse/MESOS-3079 Project: Mesos Issue Type: Bug Affects Versions: 0.23.0 Reporter: Marco Massenzio Priority: Blocker Labels: mesosphere, tests Attachments: test-results.log Running tests as root causes a large number of failures. {noformat} $ lsb_release -a LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:cxx-3.0-amd64:cxx-3.0-noarch:cxx-3.1-amd64:cxx-3.1-noarch:cxx-3.2-amd64:cxx-3.2-noarch:cxx-4.0-amd64:cxx-4.0-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-3.1-amd64:desktop-3.1-noarch:desktop-3.2-amd64:desktop-3.2-noarch:desktop-4.0-amd64:desktop-4.0-noarch:desktop-4.1-amd64:desktop-4.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.0-amd64:graphics-3.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch:graphics-4.1-amd64:graphics-4.1-noarch:languages-3.2-amd64:languages-3.2-noarch:languages-4.0-amd64:languages-4.0-noarch:languages-4.1-amd64:languages-4.1-noarch:multimedia-3.2-amd64:multimedia-3.2-noarch:multimedia-4.0-amd64:multimedia-4.0-noarch:multimedia-4.1-amd64:multimedia-4.1-noarch:printing-3.2-amd64:printing-3.2-noarch:printing-4.0-amd64:printing-4.0-noarch:printing-4.1-amd64:printing-4.1-noarch:qt4-3.1-amd64:qt4-3.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch Distributor ID: Ubuntu Description:Ubuntu 14.04.2 LTS Release:14.04 Codename: trusty $ sudo make -j12 V=0 check [==] 712 tests from 116 test cases ran. 
(318672 ms total) [ PASSED ] 676 tests. [ FAILED ] 36 tests, listed below: [ FAILED ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess [ FAILED ] SlaveRecoveryTest/0.RecoverSlaveState, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverStatusUpdateManager, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconnectExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverUnregisteredExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverTerminatedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.CleanupExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RemoveNonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.NonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.KillTask, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.GCExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RegisterDisconnectedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] 
SlaveRecoveryTest/0.ReconcileKillTask, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileShutdownFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileTasksMissingFromSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.SchedulerFailover, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.PartitionedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.MasterFailover, where TypeParam =
[jira] [Updated] (MESOS-3095) PoC running command executor with image provisioner
[ https://issues.apache.org/jira/browse/MESOS-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3095: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) PoC running command executor with image provisioner --- Key: MESOS-3095 URL: https://issues.apache.org/jira/browse/MESOS-3095 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Assignee: Timothy Chen Labels: mesosphere This is to implement a PoC of the alternative design choices with MESOS-3004 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3164) Introduce QuotaInfo message
[ https://issues.apache.org/jira/browse/MESOS-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3164: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Introduce QuotaInfo message --- Key: MESOS-3164 URL: https://issues.apache.org/jira/browse/MESOS-3164 Project: Mesos Issue Type: Task Components: master Reporter: Alexander Rukletsov Assignee: Joerg Schad Labels: mesosphere A {{QuotaInfo}} protobuf message is the internal representation of quota-related information (e.g. for persisting quota). The protobuf message should be extendable for future needs and allow for easy aggregation across roles and operator principals. It may also be used to pass quota information to allocators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3114) Simplify JSON::* by providing jsonify along the lines of stringify
[ https://issues.apache.org/jira/browse/MESOS-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3114: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Simplify JSON::* by providing jsonify along the lines of stringify -- Key: MESOS-3114 URL: https://issues.apache.org/jira/browse/MESOS-3114 Project: Mesos Issue Type: Task Reporter: Kapil Arya Assignee: Kapil Arya Labels: mesosphere We want to be able to do things like:
{code}
JSON::Value number1 = 25;
JSON::Number number2 = 26;
EXPECT_NE(number1, number2);
EXPECT_EQ(jsonify(12), number1);
EXPECT_EQ(jsonify(12), number2);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2849) Implement Docker local image store
[ https://issues.apache.org/jira/browse/MESOS-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2849: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Implement Docker local image store -- Key: MESOS-2849 URL: https://issues.apache.org/jira/browse/MESOS-2849 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Timothy Chen Assignee: Lily Chen Labels: mesosphere, unified-prototype Given a local Docker image name and path to the image or image tarball, fetches the image's dependent layers, untarring if necessary. It will also parse the image layers' configuration json and place the layers and image into persistent store. Done when a Docker image can be successfully stored and retrieved using 'put' and 'get' methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3074) Check satisfiability of quota requests in Master
[ https://issues.apache.org/jira/browse/MESOS-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3074: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Check satisfiability of quota requests in Master Key: MESOS-3074 URL: https://issues.apache.org/jira/browse/MESOS-3074 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Alexander Rukletsov Labels: mesosphere We need to validate quota requests in the Mesos Master as outlined in the Design Doc: https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I This ticket aims to validate satisfiability (in terms of available resources) of a quota request using a heuristic algorithm in the Mesos Master, rather than validating the syntax of the request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2200) bogus docker images result in bad error message to scheduler
[ https://issues.apache.org/jira/browse/MESOS-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2200: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) bogus docker images result in bad error message to scheduler Key: MESOS-2200 URL: https://issues.apache.org/jira/browse/MESOS-2200 Project: Mesos Issue Type: Bug Components: containerization, docker Reporter: Jay Buffington Assignee: Joerg Schad Labels: mesosphere When a scheduler specifies a bogus image in ContainerInfo mesos doesn't tell the scheduler that the docker pull failed or why. This error is logged in the mesos-slave log, but it isn't given to the scheduler (as far as I can tell): {noformat} E1218 23:50:55.406230 8123 slave.cpp:2730] Container '8f70784c-3e40-4072-9ca2-9daed23f15ff' for executor 'thermos-1418946354013-xxx-xxx-curl-0-f500cc41-dd0a-4338-8cbc-d631cb588bb1' of framework '20140522-213145-1749004561-5050-29512-' failed to start: Failed to 'docker pull docker-registry.example.com/doesntexist/hello1.1:latest': exit status = exited with status 1 stderr = 2014/12/18 23:50:55 Error: image doesntexist/hello1.1 not found {noformat} If the docker image is not in the registry, the scheduler should give the user an error message. If docker pull failed because of networking issues, it should be retried. Mesos should give the scheduler enough information to be able to make that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2061) Add InverseOffer protobuf message.
[ https://issues.apache.org/jira/browse/MESOS-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2061: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Add InverseOffer protobuf message. -- Key: MESOS-2061 URL: https://issues.apache.org/jira/browse/MESOS-2061 Project: Mesos Issue Type: Task Reporter: Benjamin Mahler Assignee: Joseph Wu Labels: mesosphere InverseOffer was defined as part of the maintenance work in MESOS-1474, design doc here: https://docs.google.com/document/d/16k0lVwpSGVOyxPSyXKmGC-gbNmRlisNEe4p-fAUSojk/edit?usp=sharing {code} /** * A request to return some resources occupied by a framework. */ message InverseOffer { required OfferID id = 1; required FrameworkID framework_id = 2; // A list of resources being requested back from the framework. repeated Resource resources = 3; // Specified if the resources need to be released from a particular slave. optional SlaveID slave_id = 4; // The resources in this InverseOffer are part of a planned maintenance // schedule in the specified window. Any tasks running using these // resources may be killed when the window arrives. optional Interval unavailability = 5; } {code} This ticket is to capture the addition of the InverseOffer protobuf to mesos.proto, the necessary API changes for Event/Call and the language bindings will be tracked separately. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2947) Authorizer Module: Implementation, Integration Tests
[ https://issues.apache.org/jira/browse/MESOS-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2947: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Authorizer Module: Implementation, Integration Tests -- Key: MESOS-2947 URL: https://issues.apache.org/jira/browse/MESOS-2947 Project: Mesos Issue Type: Improvement Reporter: Till Toenshoff Assignee: Alexander Rojas Labels: mesosphere, module, security h4.Motivation Provide an example authorizer module based on the {{LocalAuthorizer}} implementation. Make sure that such authorizer module can be fully unit- and integration- tested within the mesos test suite. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3062) Add authorization for dynamic reservation
[ https://issues.apache.org/jira/browse/MESOS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3062: Story Points: 2 Add authorization for dynamic reservation - Key: MESOS-3062 URL: https://issues.apache.org/jira/browse/MESOS-3062 Project: Mesos Issue Type: Task Components: master Reporter: Michael Park Assignee: Michael Park Labels: mesosphere Dynamic reservations should be authorized with the {{principal}} of the reserving entity (framework or master). The idea is to introduce {{Reserve}} and {{Unreserve}} into the ACL. {code} message Reserve { // Subjects. required Entity principals = 1; // Objects. MVP: Only possible values = ANY, NONE required Entity resources = 2; } message Unreserve { // Subjects. required Entity principals = 1; // Objects. required Entity reserver_principals = 2; } {code} When a framework/operator reserves resources, reserve ACLs are checked to see if the framework ({{FrameworkInfo.principal}}) or the operator ({{Credential.user}}) is authorized to reserve the specified resources. If not authorized, the reserve operation is rejected. When a framework/operator unreserves resources, unreserve ACLs are checked to see if the framework ({{FrameworkInfo.principal}}) or the operator ({{Credential.user}}) is authorized to unreserve the resources reserved by a framework or operator ({{Resource.ReservationInfo.principal}}). If not authorized, the unreserve operation is rejected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
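The reserve and unreserve checks described above can be sketched as follows. This is a minimal illustration, assuming ACLs are evaluated in order with the first matching subject deciding the outcome and a configurable default when nothing matches; both of those rules, and all names in the code, are assumptions rather than behaviour confirmed by the ticket.

```python
from dataclasses import dataclass
from typing import List

ANY = "ANY"
NONE = "NONE"


@dataclass
class ReserveACL:
    principals: List[str]   # subjects; may contain ANY
    resources: str          # objects; the ticket's MVP allows only ANY or NONE


@dataclass
class UnreserveACL:
    principals: List[str]           # subjects
    reserver_principals: List[str]  # objects: who made the reservation


def matches(entity: List[str], value: str) -> bool:
    """True if the entity is ANY or explicitly lists the value."""
    return ANY in entity or value in entity


def authorize_reserve(acls: List[ReserveACL], principal: str,
                      permissive: bool = False) -> bool:
    for acl in acls:
        if matches(acl.principals, principal):
            # First matching subject decides (assumed rule).
            return acl.resources == ANY
    return permissive  # assumed default when no ACL matches


def authorize_unreserve(acls: List[UnreserveACL], principal: str,
                        reserver_principal: str,
                        permissive: bool = False) -> bool:
    for acl in acls:
        if matches(acl.principals, principal):
            return matches(acl.reserver_principals, reserver_principal)
    return permissive
```

Here `principal` stands for either `FrameworkInfo.principal` or `Credential.user`, and `reserver_principal` for `Resource.ReservationInfo.principal`, as named in the ticket.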
[jira] [Assigned] (MESOS-3079) `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too)
[ https://issues.apache.org/jira/browse/MESOS-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio reassigned MESOS-3079: -- Assignee: Marco Massenzio `sudo make distcheck` fails on Ubuntu 14.04 (and possibly other OSes too) - Key: MESOS-3079 URL: https://issues.apache.org/jira/browse/MESOS-3079 Project: Mesos Issue Type: Bug Affects Versions: 0.23.0 Reporter: Marco Massenzio Assignee: Marco Massenzio Priority: Blocker Labels: mesosphere, tests Attachments: test-results.log Running tests as root causes a large number of failures. {noformat} $ lsb_release -a LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:cxx-3.0-amd64:cxx-3.0-noarch:cxx-3.1-amd64:cxx-3.1-noarch:cxx-3.2-amd64:cxx-3.2-noarch:cxx-4.0-amd64:cxx-4.0-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-3.1-amd64:desktop-3.1-noarch:desktop-3.2-amd64:desktop-3.2-noarch:desktop-4.0-amd64:desktop-4.0-noarch:desktop-4.1-amd64:desktop-4.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.0-amd64:graphics-3.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch:graphics-4.1-amd64:graphics-4.1-noarch:languages-3.2-amd64:languages-3.2-noarch:languages-4.0-amd64:languages-4.0-noarch:languages-4.1-amd64:languages-4.1-noarch:multimedia-3.2-amd64:multimedia-3.2-noarch:multimedia-4.0-amd64:multimedia-4.0-noarch:multimedia-4.1-amd64:multimedia-4.1-noarch:printing-3.2-amd64:printing-3.2-noarch:printing-4.0-amd64:printing-4.0-noarch:printing-4.1-amd64:printing-4.1-noarch:qt4-3.1-amd64:qt4-3.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch Distributor ID: Ubuntu Description:Ubuntu 14.04.2 LTS Release:14.04 Codename: trusty $ sudo make -j12 V=0 check [==] 712 tests from 116 test cases ran. (318672 ms total) [ PASSED ] 676 tests. 
[ FAILED ] 36 tests, listed below: [ FAILED ] PerfEventIsolatorTest.ROOT_CGROUPS_Sample [ FAILED ] UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsPerfEventIsolatorProcess [ FAILED ] SlaveRecoveryTest/0.RecoverSlaveState, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverStatusUpdateManager, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconnectExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverUnregisteredExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverTerminatedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.CleanupExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RemoveNonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.NonCheckpointingFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.KillTask, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.Reboot, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.GCExecutor, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ShutdownSlaveSIGUSR1, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.RegisterDisconnectedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileKillTask, where TypeParam = 
mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileShutdownFramework, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.ReconcileTasksMissingFromSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.SchedulerFailover, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.PartitionedSlave, where TypeParam = mesos::internal::slave::MesosContainerizer [ FAILED ] SlaveRecoveryTest/0.MasterFailover, where TypeParam =
[jira] [Updated] (MESOS-3069) Registry operations do not exist for manipulating maintanence schedules
[ https://issues.apache.org/jira/browse/MESOS-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3069: --- Sprint: (was: Mesosphere Sprint 16) Registry operations do not exist for manipulating maintanence schedules --- Key: MESOS-3069 URL: https://issues.apache.org/jira/browse/MESOS-3069 Project: Mesos Issue Type: Task Components: master, replicated log Reporter: Joseph Wu Assignee: Joseph Wu Labels: mesosphere In order to modify the maintenance schedule in the replicated registry, we will need Operations (src/master/registrar.hpp). The operations will likely correspond to the HTTP API: * UpdateMaintenance: Given a blob representing a maintenance schedule, write the blob to the registry. Possibly perform some verification on the blob. * UpdateSlaveMaintenanceStatus: Given a set of machines and a status (action), change the machines' status in the maintenance schedule. Possible test(s): * UpdateMaintenance: ** Add a schedule with 1 slave, 2+ slaves, and 0 slaves. ** Add multiple schedules (different intervals). ** Delete schedules (empty schedule). * UpdateSlaveMaintenanceStatus: ** Add schedule. ** Change a slave's status. ** Change a slave's status, given a slave that is not in the schedule (slave should be added to the schedule). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3069) Registry operations do not exist for manipulating maintanence schedules
[ https://issues.apache.org/jira/browse/MESOS-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3069: --- Story Points: 8 (was: 5) Registry operations do not exist for manipulating maintanence schedules --- Key: MESOS-3069 URL: https://issues.apache.org/jira/browse/MESOS-3069 Project: Mesos Issue Type: Task Components: master, replicated log Reporter: Joseph Wu Assignee: Joseph Wu Labels: mesosphere In order to modify the maintenance schedule in the replicated registry, we will need Operations (src/master/registrar.hpp). The operations will likely correspond to the HTTP API: * UpdateMaintenance: Given a blob representing a maintenance schedule, write the blob to the registry. Possibly perform some verification on the blob. * UpdateSlaveMaintenanceStatus: Given a set of machines and a status (action), change the machines' status in the maintenance schedule. Possible test(s): * UpdateMaintenance: ** Add a schedule with 1 slave, 2+ slaves, and 0 slaves. ** Add multiple schedules (different intervals). ** Delete schedules (empty schedule). * UpdateSlaveMaintenanceStatus: ** Add schedule. ** Change a slave's status. ** Change a slave's status, given a slave that is not in the schedule (slave should be added to the schedule). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
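The two operations proposed in MESOS-3069 can be modelled in a few lines. This is only a sketch under stated assumptions: the registry is an in-memory object rather than the replicated log, the status strings are invented, and "added to the schedule" is interpreted as "gains an entry in the status map". None of the names below come from `src/master/registrar.hpp`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Window:
    start: int            # start of the interval (seconds, assumed unit)
    duration: int         # length of the interval
    machines: List[str]   # machines covered by this window


@dataclass
class Registry:
    windows: List[Window] = field(default_factory=list)
    status: Dict[str, str] = field(default_factory=dict)


def update_maintenance(registry: Registry, windows: List[Window]) -> None:
    """Replace the whole schedule after a basic sanity check."""
    for window in windows:
        if window.duration <= 0:
            raise ValueError("maintenance window must have positive duration")
    registry.windows = list(windows)
    scheduled = {m for w in windows for m in w.machines}
    # Keep the known status of machines that stay scheduled; drop the rest.
    registry.status = {m: registry.status.get(m, "SCHEDULED")
                       for m in scheduled}


def update_machine_status(registry: Registry, machines: List[str],
                          status: str) -> None:
    """Set the status of each machine, adding machines not yet tracked."""
    for machine in machines:
        registry.status[machine] = status
```

An empty schedule passed to `update_maintenance` clears everything, which corresponds to the "Delete schedules (empty schedule)" test case listed in the ticket.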
[jira] [Updated] (MESOS-1010) Python extension build is broken if gflags-dev is installed
[ https://issues.apache.org/jira/browse/MESOS-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-1010: --- Story Points: 3 Python extension build is broken if gflags-dev is installed --- Key: MESOS-1010 URL: https://issues.apache.org/jira/browse/MESOS-1010 Project: Mesos Issue Type: Bug Components: build, python api Environment: Fedora 20, amd64. GCC: 4.8.2. Reporter: Nikita Vetoshkin Assignee: Greg Mann Labels: flaky-test, mesosphere In my environment mesos build from master results in broken python api module {{_mesos.so}}: {noformat} nekto0n@ya-darkstar ~/workspace/mesos/src/python $ PYTHONPATH=build/lib.linux-x86_64-2.7/ python -c "import _mesos" Traceback (most recent call last): File "<string>", line 1, in <module> ImportError: /home/nekto0n/workspace/mesos/src/python/build/lib.linux-x86_64-2.7/_mesos.so: undefined symbol: _ZN6google14FlagRegistererC1EPKcS2_S2_S2_PvS3_ {noformat} Unmangled version of symbol looks like this: {noformat} google::FlagRegisterer::FlagRegisterer(char const*, char const*, char const*, char const*, void*, void*) {noformat} During {{./configure}} step {{glog}} finds {{gflags}} development files and starts using them, thus *implicitly* adding dependency on {{libgflags.so}}. This breaks Python extensions module and perhaps can break other mesos subsystems when moved to hosts without {{gflags}} installed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-830) ExamplesTest.JavaFramework is flaky
[ https://issues.apache.org/jira/browse/MESOS-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-830: -- Story Points: 8 ExamplesTest.JavaFramework is flaky --- Key: MESOS-830 URL: https://issues.apache.org/jira/browse/MESOS-830 Project: Mesos Issue Type: Bug Components: test Reporter: Vinod Kone Assignee: Greg Mann Labels: flaky, mesosphere [ RUN ] ExamplesTest.JavaFramework Using temporary directory '/tmp/ExamplesTest_JavaFramework_wSc7u8' Enabling authentication for the framework I1120 15:13:39.820032 1681264640 master.cpp:285] Master started on 172.25.133.171:52576 I1120 15:13:39.820180 1681264640 master.cpp:299] Master ID: 201311201513-2877626796-52576-3234 I1120 15:13:39.820194 1681264640 master.cpp:302] Master only allowing authenticated frameworks to register! I1120 15:13:39.821197 1679654912 slave.cpp:112] Slave started on 1)@172.25.133.171:52576 I1120 15:13:39.821795 1679654912 slave.cpp:212] Slave resources: cpus(*):4; mem(*):7168; disk(*):481998; ports(*):[31000-32000] I1120 15:13:39.822855 1682337792 slave.cpp:112] Slave started on 2)@172.25.133.171:52576 I1120 15:13:39.823652 1682337792 slave.cpp:212] Slave resources: cpus(*):4; mem(*):7168; disk(*):481998; ports(*):[31000-32000] I1120 15:13:39.825330 1679118336 master.cpp:744] The newly elected leader is master@172.25.133.171:52576 I1120 15:13:39.825445 1679118336 master.cpp:748] Elected as the leading master! 
I1120 15:13:39.825907 1681264640 state.cpp:33] Recovering state from '/tmp/ExamplesTest_JavaFramework_wSc7u8/0/meta' I1120 15:13:39.826127 1681264640 status_update_manager.cpp:180] Recovering status update manager I1120 15:13:39.826331 1681801216 process_isolator.cpp:317] Recovering isolator I1120 15:13:39.826738 1682874368 slave.cpp:2743] Finished recovery I1120 15:13:39.827747 1682337792 state.cpp:33] Recovering state from '/tmp/ExamplesTest_JavaFramework_wSc7u8/1/meta' I1120 15:13:39.827945 1680191488 slave.cpp:112] Slave started on 3)@172.25.133.171:52576 I1120 15:13:39.828415 1682337792 status_update_manager.cpp:180] Recovering status update manager I1120 15:13:39.828608 1680728064 sched.cpp:260] Authenticating with master master@172.25.133.171:52576 I1120 15:13:39.828606 1680191488 slave.cpp:212] Slave resources: cpus(*):4; mem(*):7168; disk(*):481998; ports(*):[31000-32000] I1120 15:13:39.828680 1682874368 slave.cpp:497] New master detected at master@172.25.133.171:52576 I1120 15:13:39.828765 1682337792 process_isolator.cpp:317] Recovering isolator I1120 15:13:39.829828 1680728064 sched.cpp:229] Detecting new master I1120 15:13:39.830288 1679654912 authenticatee.hpp:100] Initializing client SASL I1120 15:13:39.831635 1680191488 state.cpp:33] Recovering state from '/tmp/ExamplesTest_JavaFramework_wSc7u8/2/meta' I1120 15:13:39.831991 1679118336 status_update_manager.cpp:158] New master detected at master@172.25.133.171:52576 I1120 15:13:39.832042 1682874368 slave.cpp:524] Detecting new master I1120 15:13:39.832314 1682337792 slave.cpp:2743] Finished recovery I1120 15:13:39.832309 1681264640 master.cpp:1266] Attempting to register slave on vkone.local at slave(1)@172.25.133.171:52576 I1120 15:13:39.832929 1680728064 status_update_manager.cpp:180] Recovering status update manager I1120 15:13:39.833371 1681801216 slave.cpp:497] New master detected at master@172.25.133.171:52576 I1120 15:13:39.833273 1681264640 master.cpp:2513] Adding slave 
201311201513-2877626796-52576-3234-0 at vkone.local with cpus(*):4; mem(*):7168; disk(*):481998; ports(*):[31000-32000] I1120 15:13:39.833595 1680728064 process_isolator.cpp:317] Recovering isolator I1120 15:13:39.833859 1681801216 slave.cpp:524] Detecting new master I1120 15:13:39.833861 1682874368 status_update_manager.cpp:158] New master detected at master@172.25.133.171:52576 I1120 15:13:39.834092 1680191488 slave.cpp:542] Registered with master master@172.25.133.171:52576; given slave ID 201311201513-2877626796-52576-3234-0 I1120 15:13:39.834486 1681264640 master.cpp:1266] Attempting to register slave on vkone.local at slave(2)@172.25.133.171:52576 I1120 15:13:39.834549 1681264640 master.cpp:2513] Adding slave 201311201513-2877626796-52576-3234-1 at vkone.local with cpus(*):4; mem(*):7168; disk(*):481998; ports(*):[31000-32000] I1120 15:13:39.834750 1680191488 slave.cpp:555] Checkpointing SlaveInfo to '/tmp/ExamplesTest_JavaFramework_wSc7u8/0/meta/slaves/201311201513-2877626796-52576-3234-0/slave.info' I1120 15:13:39.834875 1682874368 hierarchical_allocator_process.hpp:445] Added slave 201311201513-2877626796-52576-3234-0 (vkone.local) with cpus(*):4; mem(*):7168; disk(*):481998; ports(*):[31000-32000] (and cpus(*):4; mem(*):7168;
[jira] [Created] (MESOS-3195) Fix master metrics for scheduler calls
Vinod Kone created MESOS-3195: - Summary: Fix master metrics for scheduler calls Key: MESOS-3195 URL: https://issues.apache.org/jira/browse/MESOS-3195 Project: Mesos Issue Type: Bug Reporter: Vinod Kone Currently the master increments metrics for old style messages from the driver but not when it receives Calls. Since the driver is now sending Calls, master should update metrics correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3042) Master/Allocator does not send InverseOffers to resources to be maintained
[ https://issues.apache.org/jira/browse/MESOS-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio reassigned MESOS-3042: -- Assignee: Joris Van Remoortere (was: Artem Harutyunyan) Master/Allocator does not send InverseOffers to resources to be maintained -- Key: MESOS-3042 URL: https://issues.apache.org/jira/browse/MESOS-3042 Project: Mesos Issue Type: Task Components: allocation, master Reporter: Joseph Wu Assignee: Joris Van Remoortere Labels: mesosphere Offers are currently sent from master/allocator to framework via ResourceOffersMessages. InverseOffers, which are roughly equivalent to negative Offers, can be sent in the same package. In src/messages/messages.proto {code} message ResourceOffersMessage { repeated Offer offers = 1; repeated string pids = 2; // New field with InverseOffers repeated InverseOffer inverseOffers = 3; } {code} Sent InverseOffers can be tracked in the master's local state: i.e. In src/master/master.hpp: {code} struct Slave { ... // Existing fields. // Active InverseOffers on this slave. // Similar pattern to the offers field hashset<InverseOffer*> inverseOffers; } {code} One actor (master or allocator) should populate the new InverseOffers field. * In master (src/master/master.cpp) ** Master::offer is where the ResourceOffersMessage and Offer object are constructed. ** The same method could also check for maintenance and send InverseOffers. * In the allocator (src/master/allocator/mesos/hierarchical.hpp) ** HierarchicalAllocatorProcess::allocate is where slave resources are aggregated and sent off to the frameworks. ** InverseOffers (i.e. negative resources) allocation could be calculated in this method. ** A change to Master::offer (i.e. the offerCallback) may be necessary to account for the negative resources. Possible test(s): * InverseOfferTest ** Start master, slave, framework. ** Accept resource offer, start task. ** Set maintenance schedule to the future. 
** Check that InverseOffer(s) are sent to the framework. ** Decline InverseOffer. ** Check that more InverseOffer(s) are sent. ** Accept InverseOffer. ** Check that more InverseOffer(s) are sent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3042) Master/Allocator does not send InverseOffers to resources to be maintained
[ https://issues.apache.org/jira/browse/MESOS-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-3042: --- Sprint: Mesosphere Sprint 16 Master/Allocator does not send InverseOffers to resources to be maintained -- Key: MESOS-3042 URL: https://issues.apache.org/jira/browse/MESOS-3042 Project: Mesos Issue Type: Task Components: allocation, master Reporter: Joseph Wu Assignee: Artem Harutyunyan Labels: mesosphere Offers are currently sent from master/allocator to framework via ResourceOffersMessages. InverseOffers, which are roughly equivalent to negative Offers, can be sent in the same package. In src/messages/messages.proto {code} message ResourceOffersMessage { repeated Offer offers = 1; repeated string pids = 2; // New field with InverseOffers repeated InverseOffer inverseOffers = 3; } {code} Sent InverseOffers can be tracked in the master's local state: i.e. In src/master/master.hpp: {code} struct Slave { ... // Existing fields. // Active InverseOffers on this slave. // Similar pattern to the offers field hashset<InverseOffer*> inverseOffers; } {code} One actor (master or allocator) should populate the new InverseOffers field. * In master (src/master/master.cpp) ** Master::offer is where the ResourceOffersMessage and Offer object are constructed. ** The same method could also check for maintenance and send InverseOffers. * In the allocator (src/master/allocator/mesos/hierarchical.hpp) ** HierarchicalAllocatorProcess::allocate is where slave resources are aggregated and sent off to the frameworks. ** InverseOffers (i.e. negative resources) allocation could be calculated in this method. ** A change to Master::offer (i.e. the offerCallback) may be necessary to account for the negative resources. Possible test(s): * InverseOfferTest ** Start master, slave, framework. ** Accept resource offer, start task. ** Set maintenance schedule to the future. ** Check that InverseOffer(s) are sent to the framework. 
** Decline InverseOffer. ** Check that more InverseOffer(s) are sent. ** Accept InverseOffer. ** Check that more InverseOffer(s) are sent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
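As a rough illustration of the "check for maintenance and send InverseOffers" step described in MESOS-3042, the sketch below builds one InverseOffer per framework that holds resources on a slave with a scheduled maintenance window. It assumes allocations and the schedule are available as plain dictionaries; the function name, the id format and the data layout are hypothetical, and the real logic would live in `Master::offer` or the allocator.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (start, duration) of the planned maintenance; an assumed stand-in for
# the Interval type used by the InverseOffer protobuf.
Interval = Tuple[int, int]


@dataclass
class InverseOffer:
    id: str
    framework_id: str
    slave_id: str
    resources: Dict[str, float]
    unavailability: Interval


def build_inverse_offers(
        allocations: Dict[str, Dict[str, Dict[str, float]]],
        maintenance: Dict[str, Interval]) -> List[InverseOffer]:
    """allocations maps slave id -> framework id -> resources in use."""
    offers: List[InverseOffer] = []
    for slave_id in sorted(maintenance):
        window = maintenance[slave_id]
        by_framework = allocations.get(slave_id, {})
        for framework_id in sorted(by_framework):
            offers.append(InverseOffer(
                id=f"inverse-{len(offers)}",  # placeholder id scheme
                framework_id=framework_id,
                slave_id=slave_id,
                resources=dict(by_framework[framework_id]),
                unavailability=window,
            ))
    return offers
```

Re-sending offers after a decline or an accept, as the proposed InverseOfferTest expects, would be a matter of calling this again on the next allocation cycle; that loop is left out of the sketch.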
[jira] [Commented] (MESOS-2981) Allow MesosContainerizer to support modifying launch based on image execution configuration
[ https://issues.apache.org/jira/browse/MESOS-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652336#comment-14652336 ] Jie Yu commented on MESOS-2981: --- Can you elaborate? Allow MesosContainerizer to support modifying launch based on image execution configuration --- Key: MESOS-2981 URL: https://issues.apache.org/jira/browse/MESOS-2981 Project: Mesos Issue Type: Improvement Reporter: Timothy Chen Labels: mesosphere -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3192) ContainerInfo::Image::AppC::id should be optional
Yan Xu created MESOS-3192: - Summary: ContainerInfo::Image::AppC::id should be optional Key: MESOS-3192 URL: https://issues.apache.org/jira/browse/MESOS-3192 Project: Mesos Issue Type: Bug Reporter: Yan Xu As I commented here: https://reviews.apache.org/r/34136/ Currently ContainerInfo::Image::Appc is defined as the following {noformat:title=} message AppC { required string name = 1; required string id = 2; optional Labels labels = 3; } {noformat} In which the {{id}} is a required field. When users specify the image in tasks they likely will not use an image id and we should change it to be optional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3194) Implement AppC Image Store
Yan Xu created MESOS-3194: - Summary: Implement AppC Image Store Key: MESOS-3194 URL: https://issues.apache.org/jira/browse/MESOS-3194 Project: Mesos Issue Type: Task Reporter: Yan Xu https://reviews.apache.org/r/34140/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners
[ https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2840: -- Labels: mesosphere twitter (was: mesosphere) MesosContainerizer support multiple image provisioners -- Key: MESOS-2840 URL: https://issues.apache.org/jira/browse/MESOS-2840 Project: Mesos Issue Type: Epic Components: containerization, docker Affects Versions: 0.23.0 Reporter: Marco Massenzio Assignee: Timothy Chen Labels: mesosphere, twitter We want to use the Appc integration interfaces to enable MesosContainerizer to support multiple image formats. This allows our future work on isolators to support any container image format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-2860) Create the basic infrastructure to handle /call endpoint
[ https://issues.apache.org/jira/browse/MESOS-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14607993#comment-14607993 ] Isabel Jimenez edited comment on MESOS-2860 at 8/3/15 8:20 PM: --- submitted: https://reviews.apache.org/r/36040/ https://reviews.apache.org/r/36360/ https://reviews.apache.org/r/36328/ https://reviews.apache.org/r/35934/ https://reviews.apache.org/r/35939/ https://reviews.apache.org/r/36073/ https://reviews.apache.org/r/36072/ reviewable: https://reviews.apache.org/r/36402/ https://reviews.apache.org/r/36624/ https://reviews.apache.org/r/36040/ discarded or split: https://reviews.apache.org/r/36217/ https://reviews.apache.org/r/36037/ was (Author: ijimenez): submitted: https://reviews.apache.org/r/36040/ https://reviews.apache.org/r/36360/ https://reviews.apache.org/r/36328/ https://reviews.apache.org/r/35934/ https://reviews.apache.org/r/35939/ https://reviews.apache.org/r/36073/ https://reviews.apache.org/r/36072/ reviewable: https://reviews.apache.org/r/36402/ https://reviews.apache.org/r/36040/ discarded or split: https://reviews.apache.org/r/36217/ https://reviews.apache.org/r/36037/ Create the basic infrastructure to handle /call endpoint Key: MESOS-2860 URL: https://issues.apache.org/jira/browse/MESOS-2860 Project: Mesos Issue Type: Story Components: master Reporter: Marco Massenzio Assignee: Isabel Jimenez Labels: mesosphere This is the first basic step in ensuring the basic {{/call}} functionality: processing a {noformat} POST /call {noformat} and returning: - {{202}} if all goes well; - {{401}} if not authorized; and - {{403}} if the request is malformed. We'll get more sophisticated as the work progresses (e.g., supporting {{415}} if the content-type is not of the right kind). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2879) Random recursive_mutex errors in when running make check
[ https://issues.apache.org/jira/browse/MESOS-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2879: --- Sprint: Mesosphere Sprint 15 (was: Mesosphere Sprint 15, Mesosphere Sprint 16) Random recursive_mutex errors in when running make check Key: MESOS-2879 URL: https://issues.apache.org/jira/browse/MESOS-2879 Project: Mesos Issue Type: Bug Components: libprocess Reporter: Alexander Rojas Assignee: Joris Van Remoortere Labels: mesosphere, tech-debt While running make check on OS X, from time to time {{recursive_mutex}} errors appear after running all the tests successfully. Only one of the observed messages actually stops {{make check}} and reports an error. The following error messages have been observed: {code} libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument *** Aborted at 1434553937 (unix time) try date -d @1434553937 if you are using GNU date *** {code} {code} libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument *** Aborted at 1434557001 (unix time) try date -d @1434557001 if you are using GNU date *** libc++abi.dylib: PC: @ 0x7fff93855286 __pthread_kill libc++abi.dylib: *** SIGABRT (@0x7fff93855286) received by 
PID 88060 (TID 0x10fc4) stack trace: *** @ 0x7fff8e1d6f1a _sigtramp libc++abi.dylib: @0x10fc3f1a8 (unknown) libc++abi.dylib: @ 0x7fff979deb53 abort libc++abi.dylib: libc++abi.dylib: libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentterminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argumentMaking check in include {code} {code} Assertion failed: (e == 0), function ~recursive_mutex, file /SourceCache/libcxx/libcxx-120/src/mutex.cpp, line 82. *** Aborted at 1434555685 (unix time) try date -d @1434555685 if you are using GNU date *** PC: @ 0x7fff93855286 __pthread_kill *** SIGABRT (@0x7fff93855286) received by PID 60235 (TID 0x7fff7ebdc300) stack trace: *** @ 0x7fff8e1d6f1a _sigtramp @0x10b512350 google::CheckNotNull() @ 0x7fff979deb53 abort @ 0x7fff979a6c39 __assert_rtn @ 0x7fff9bffdcc9 std::__1::recursive_mutex::~recursive_mutex() @0x10b881928 process::ProcessManager::~ProcessManager() @0x10b874445 process::ProcessManager::~ProcessManager() @0x10b874418 process::finalize() @0x10b2f7aec main @ 0x7fff98edc5c9 start make[5]: *** [check-local] Abort trap: 6 make[4]: *** [check-am] Error 2 make[3]: *** [check-recursive] Error 1 make[2]: *** [check-recursive] Error 1 make[1]: *** [check] Error 2 make: *** [check-recursive] Error 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2860) Create the basic infrastructure to handle /call endpoint
[ https://issues.apache.org/jira/browse/MESOS-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Massenzio updated MESOS-2860: --- Sprint: Mesosphere Sprint 15, Mesosphere Sprint 16 (was: Mesosphere Sprint 15) Create the basic infrastructure to handle /call endpoint Key: MESOS-2860 URL: https://issues.apache.org/jira/browse/MESOS-2860 Project: Mesos Issue Type: Story Components: master Reporter: Marco Massenzio Assignee: Isabel Jimenez Labels: mesosphere This is the first basic step in ensuring the basic {{/call}} functionality: processing a {noformat} POST /call {noformat} and returning: - {{202}} if all goes well; - {{401}} if not authorized; and - {{403}} if the request is malformed. We'll get more sophisticated as the work progresses (e.g., supporting {{415}} if the content-type is not of the right kind). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
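The status-code mapping listed in MESOS-2860 can be written down as a small pure function. This is a sketch only: it assumes a JSON body with a top-level `type` field counts as well-formed and that authorization is checked before parsing. Neither assumption is stated in the ticket, and `handle_call` is a hypothetical name, not the master's handler.

```python
import json


def handle_call(authorized: bool, body: str) -> int:
    """Return the HTTP status for a POST /call, per the ticket's mapping."""
    if not authorized:
        return 401
    try:
        call = json.loads(body)
    except ValueError:
        # Not parseable at all: the ticket maps malformed requests to 403.
        return 403
    # Assumed minimal shape check: a JSON object carrying a "type" field.
    if not isinstance(call, dict) or "type" not in call:
        return 403
    return 202
```

The later refinement mentioned in the ticket, returning 415 for an unsupported content type, would add one more check ahead of the parse step.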