[jira] [Created] (MESOS-8355) "expression with side effects has no effect in an unevaluated context" on Ubuntu 16.04
Armand Grillet created MESOS-8355:

Summary: "expression with side effects has no effect in an unevaluated context" on Ubuntu 16.04
Key: MESOS-8355
URL: https://issues.apache.org/jira/browse/MESOS-8355
Project: Mesos
Issue Type: Bug
Reporter: Armand Grillet
Attachments: ubuntu-16.04-clang.txt

Following https://reviews.apache.org/r/62287/ building Mesos on Ubuntu 16.04 with Clang does not work:
{code}
00:13:42 creating build/bdist.linux-x86_64/wheel/mesos.scheduler-1.5.0.dist-info/WHEEL
00:13:46 make dynamic-reservation-framework test-http-framework test-framework test-executor test-http-executor long-lived-framework long-lived-executor no-executor-framework docker-no-executor-framework balloon-framework balloon-executor load-generator-framework persistent-volume-framework disk-full-framework test-helper mesos-tests examples/java/test-executor examples/java/test-exception-framework examples/java/test-framework examples/java/test-log examples/java/test-multiple-executors-framework examples/java/v1-test-framework examples/python/test_executor.py examples/python/test-executor examples/python/test_framework.py examples/python/test-framework \
00:13:46 tests/balloon_framework_test.sh tests/disk_full_framework_test.sh tests/dynamic_reservation_framework_test.sh tests/java_exception_test.sh tests/java_framework_test.sh tests/java_log_test.sh tests/java_v0_framework_test.sh tests/java_v1_framework_test.sh tests/no_executor_framework_test.sh tests/persistent_volume_framework_test.sh tests/python_framework_test.sh tests/test_http_framework_test.sh tests/test_framework_test.sh
00:13:47 make[3]: Entering directory '/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:13:47 CXXLD dynamic-reservation-framework
00:13:47 CXXLD test-http-framework
00:13:49 CXXLD test-framework
00:13:49 CXXLD test-executor
00:13:51 CXXLD test-http-executor
00:13:51 CXXLD long-lived-framework
00:13:52 CXXLD long-lived-executor
00:13:53 CXXLD no-executor-framework
00:13:54 CXXLD docker-no-executor-framework
00:13:54 CXXLD balloon-framework
00:13:56 CXXLD balloon-executor
00:13:56 CXXLD load-generator-framework
00:13:58 CXXLD persistent-volume-framework
00:13:58 CXXLD disk-full-framework
00:14:00 CXX tests/test_helper-active_user_test_helper.o
00:14:00 CXX tests/test_helper-flags.o
00:14:00 CXX tests/test_helper-http_server_test_helper.o
00:14:00 CXX tests/test_helper-kill_policy_test_helper.o
00:14:00 CXX tests/test_helper-resources_utils.o
00:14:00 CXX tests/test_helper-test_helper_main.o
00:14:00 CXX tests/test_helper-utils.o
00:14:00 CXX tests/containerizer/test_helper-memory_test_helper.o
00:14:00 CXX tests/containerizer/test_helper-capabilities_test_helper.o
00:14:00 CXX tests/containerizer/test_helper-setns_test_helper.o
00:14:00 CXX tests/mesos_tests-log_tests.o
00:14:01 CXX tests/mesos_tests-master_authorization_tests.o
00:14:27 ../../src/tests/log_tests.cpp:2439:120: error: expression with side effects has no effect in an unevaluated context [-Werror,-Wunevaluated-expression]
00:14:27 switch (0) case 0: default: if (const ::testing::AssertionResult gtest_ar = (::testing::internal:: EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(stringify(position++))) == 1)>::Compare("stringify(position++)", "entry.data", stringify(position++), entry.data))) ; else ::testing::internal::AssertHelper(::testing::TestPartResult::kNonFatalFailure, "../../src/tests/log_tests.cpp", 2439, gtest_ar.failure_message()) = ::testing::Message();
00:14:27 ^
00:14:27 1 error generated.
00:14:27 Makefile:10317: recipe for target 'tests/mesos_tests-log_tests.o' failed
00:14:27 make[3]: *** [tests/mesos_tests-log_tests.o] Error 1
00:14:27 make[3]: *** Waiting for unfinished jobs
00:14:54 make[3]: Leaving directory '/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:14:54 Makefile:13776: recipe for target 'check-am' failed
00:14:54 make[2]: *** [check-am] Error 2
00:14:54 make[2]: Leaving directory '/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:14:54 Makefile:13780: recipe for target 'check' failed
00:14:54 make[1]: *** [check] Error 2
00:14:54 make[1]: Leaving directory '/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:14:54 Makefile:774: recipe for target 'check-recursive' failed
00:14:54 make: *** [check-recursive] Error 1
00:14:55 Build step 'Conditional step (single)' marked build as failure
00:14:55
[jira] [Updated] (MESOS-8355) "expression with side effects has no effect in an unevaluated context" when building Mesos on Ubuntu 16.04 (Clang)
[ https://issues.apache.org/jira/browse/MESOS-8355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Armand Grillet updated MESOS-8355:
Summary: "expression with side effects has no effect in an unevaluated context" when building Mesos on Ubuntu 16.04 (Clang) (was: "expression with side effects has no effect in an unevaluated context" on Ubuntu 16.04)
[jira] [Updated] (MESOS-7550) Publish Local Resource Provider resources in the agent before container launch or update.
[ https://issues.apache.org/jira/browse/MESOS-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-7550:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Publish Local Resource Provider resources in the agent before container launch or update.
>
> Key: MESOS-7550
> URL: https://issues.apache.org/jira/browse/MESOS-7550
> Project: Mesos
> Issue Type: Task
> Reporter: Jie Yu
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> The agent will ask the resource provider (RP) manager to publish the resources before a container can start to use them. SLRP (storage local resource provider) will be responsible for making sure the CSI volume is made available on the host. This will involve calling the `ControllerPublishVolume` and `NodePublishVolume` RPCs of the CSI plugin.
>
> This will happen when a workload (i.e., task/executor) that uses a CSI volume as a persistent volume is being launched on the agent. During the creation of a CSI volume, the SLRP will generate a fixed mount point under the agent's work directory based on the ID of the CSI volume, and store the mount point in the `Resource.disk.source.path.root` or `Resource.disk.source.path.mount` fields. Prior to a workload launch, SLRP will mount the CSI volume to the same path, then the Docker containerizer or the Mesos containerizer will again bind-mount the volume into the container of the workload. Since the containerizers know nothing about the resource providers, they would extract the mount point of the CSI volume from the `Resource.disk.source.path.root` or `Resource.disk.source.path.mount` fields.
>
> For a storage local resource provider, the agent's work directory is known during the creation of the CSI volume since it will be created and used on the same agent. However, in the case of a storage external resource provider, where a CSI volume might be created on one agent X and published on another agent Y, the work directory of agent Y might not be known at the creation of a CSI volume on X. To support it in the future, we introduce new semantics for `Resource.disk.source.path.root` and `Resource.disk.source.path.mount`, such that if these fields are set to relative paths, they are relative to the agent's work directory, so the containerizer can extract the mount point by prefixing the relative paths with the agent's work directory.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
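The relative-path semantics proposed in MESOS-7550 can be sketched as follows. This is only an illustration, not Mesos code: the function name `resolve_mount_point` and the example paths are hypothetical, and it assumes POSIX-style paths.

```python
import posixpath


def resolve_mount_point(work_dir: str, source_path: str) -> str:
    """Return the host mount point for a disk source path.

    Hypothetical helper: absolute paths are used as-is, while relative
    paths are interpreted relative to the agent's work directory, as the
    ticket proposes.
    """
    if posixpath.isabs(source_path):
        return posixpath.normpath(source_path)
    # Prefix the relative path with the agent's work directory.
    return posixpath.normpath(posixpath.join(work_dir, source_path))


# Example paths are made up for illustration.
print(resolve_mount_point("/var/lib/mesos", "csi/volumes/vol-1"))
print(resolve_mount_point("/var/lib/mesos", "/mnt/vol-1"))
```

With this convention, a volume created on agent X can carry a relative path and still resolve to the right location on agent Y, whatever Y's work directory is.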
[jira] [Updated] (MESOS-8265) Add state recovery for storage local resource provider.
[ https://issues.apache.org/jira/browse/MESOS-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8265:
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Add state recovery for storage local resource provider.
>
> Key: MESOS-8265
> URL: https://issues.apache.org/jira/browse/MESOS-8265
> Project: Mesos
> Issue Type: Task
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> The storage local resource provider needs to checkpoint its total resources and pending operations atomically, and recover them after failing over.
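One common way to get the atomicity MESOS-8265 asks for is to write the whole state to a temporary file and rename it over the checkpoint. The sketch below assumes that technique; it is not how Mesos implements checkpointing, and the JSON layout and the names `checkpoint` and `recover` are illustrative only.

```python
import json
import os
import tempfile


def checkpoint(path, total_resources, pending_operations):
    """Write both pieces of state in one atomic step (write temp + rename)."""
    state = {
        "total_resources": total_resources,
        "pending_operations": pending_operations,
    }
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the file in atomically on the same filesystem,
        # so a reader sees either the old state or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def recover(path):
    """Return (total_resources, pending_operations); empty if no checkpoint."""
    if not os.path.exists(path):
        return [], []
    with open(path) as f:
        state = json.load(f)
    return state["total_resources"], state["pending_operations"]
```

Keeping both fields in a single file is what makes the pair consistent after a failover; two separate files could be torn by a crash between the writes.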
[jira] [Updated] (MESOS-8032) Launch CSI plugins in storage local resource provider.
[ https://issues.apache.org/jira/browse/MESOS-8032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8032:
Sprint: Mesosphere Sprint 64, Mesosphere Sprint 65, Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 64, Mesosphere Sprint 65, Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Launch CSI plugins in storage local resource provider.
>
> Key: MESOS-8032
> URL: https://issues.apache.org/jira/browse/MESOS-8032
> Project: Mesos
> Issue Type: Task
> Components: storage
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> Launching a CSI plugin requires the following steps:
> 1. Verify the configuration.
> 2. Prepare a directory in the work directory of the resource provider where the socket file should be placed, and construct the path of the socket file.
> 3. If the socket file already exists and the plugin is already running, we should not launch another plugin instance.
> 4. Otherwise, launch a standalone container to run the plugin and connect to it through the socket file.
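The four launch steps in MESOS-8032 can be sketched as a single function. Everything concrete here is an assumption made for illustration: the config keys `name` and `command`, the `plugins/<name>/endpoint.sock` layout, and the `is_running` and `launch_container` callbacks, which stand in for the liveness check and the standalone-container launch.

```python
import os
from typing import Callable


def launch_plugin(config: dict,
                  work_dir: str,
                  is_running: Callable[[str], bool],
                  launch_container: Callable[[dict, str], None]) -> str:
    """Return the socket path of a running plugin, launching it if needed."""
    # 1. Verify the configuration (hypothetical required keys).
    if not config.get("name") or not config.get("command"):
        raise ValueError("plugin config needs 'name' and 'command'")

    # 2. Prepare the socket directory and construct the socket path.
    socket_dir = os.path.join(work_dir, "plugins", config["name"])
    os.makedirs(socket_dir, exist_ok=True)
    socket_path = os.path.join(socket_dir, "endpoint.sock")

    # 3. Do not launch a second instance if one is already serving.
    if os.path.exists(socket_path) and is_running(socket_path):
        return socket_path

    # 4. Otherwise launch a container; the caller then connects
    #    to the plugin through the socket file.
    launch_container(config, socket_path)
    return socket_path
```

Passing the two callbacks in keeps the sketch testable without a real agent or plugin.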
[jira] [Updated] (MESOS-8244) Add operator API to reload local resource providers.
[ https://issues.apache.org/jira/browse/MESOS-8244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8244:
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Add operator API to reload local resource providers.
>
> Key: MESOS-8244
> URL: https://issues.apache.org/jira/browse/MESOS-8244
> Project: Mesos
> Issue Type: Task
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> To add, remove and update local resource providers on the fly more conveniently and without restarting agents, we would like to introduce a new operator API that adds new config files to the resource provider config directory and triggers a reload of the resource provider.
[jira] [Updated] (MESOS-8291) Add documentation about fault domains
[ https://issues.apache.org/jira/browse/MESOS-8291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8291:
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 70)

> Add documentation about fault domains
>
> Key: MESOS-8291
> URL: https://issues.apache.org/jira/browse/MESOS-8291
> Project: Mesos
> Issue Type: Documentation
> Reporter: Vinod Kone
> Assignee: Benno Evers
>
> We need some user docs for fault domains.
[jira] [Updated] (MESOS-8143) Publish and unpublish storage local resources through CSI plugins.
[ https://issues.apache.org/jira/browse/MESOS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8143:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Publish and unpublish storage local resources through CSI plugins.
>
> Key: MESOS-8143
> URL: https://issues.apache.org/jira/browse/MESOS-8143
> Project: Mesos
> Issue Type: Task
> Components: storage
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> The storage local resource provider needs to call the following CSI APIs to publish CSI volumes for tasks to use:
> 1. ControllerPublishVolume (optional)
> 2. NodePublishVolume
>
> Although we don't need to unpublish CSI volumes after tasks are completed, we still need to unpublish them for DESTROY_VOLUME or DESTROY_BLOCK:
> 1. NodeUnpublishVolume
> 2. ControllerUnpublishVolume (optional)
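The call ordering listed in MESOS-8143 can be written down as two small functions that return the RPC names in order. This is a sketch only: the flag `has_controller_publish` is a hypothetical stand-in for whatever tells the provider that the optional controller calls are supported, and no RPC is actually issued.

```python
from typing import List


def publish_calls(has_controller_publish: bool) -> List[str]:
    """CSI calls needed before a task can use a volume, in order."""
    calls = []
    if has_controller_publish:
        # Optional: only when the plugin supports controller publishing.
        calls.append("ControllerPublishVolume")
    calls.append("NodePublishVolume")
    return calls


def unpublish_calls(has_controller_publish: bool) -> List[str]:
    """CSI calls needed for DESTROY_VOLUME or DESTROY_BLOCK, in order."""
    calls = ["NodeUnpublishVolume"]
    if has_controller_publish:
        calls.append("ControllerUnpublishVolume")
    return calls


print(publish_calls(True))
print(unpublish_calls(True))
```

Unpublishing mirrors publishing in reverse: the node-level call is undone first, then the optional controller-level one.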
[jira] [Updated] (MESOS-8101) Import resources from CSI plugins in storage local resource provider.
[ https://issues.apache.org/jira/browse/MESOS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8101:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Import resources from CSI plugins in storage local resource provider.
>
> Key: MESOS-8101
> URL: https://issues.apache.org/jira/browse/MESOS-8101
> Project: Mesos
> Issue Type: Task
> Components: storage
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> The following lists the steps to import resources from a CSI plugin:
> 1. Launch the node plugin
> 1.1 GetSupportedVersions
> 1.2 GetPluginInfo
> 1.3 ProbeNode
> 1.4 GetNodeCapabilities
> 2. Launch the controller plugin
> 2.1 GetSupportedVersions
> 2.2 GetPluginInfo
> 2.3 GetControllerCapabilities
> 3. GetCapacity
> 4. ListVolumes
> 5. Report to the resource provider through UPDATE_TOTAL_RESOURCES
[jira] [Updated] (MESOS-7790) Design hierarchical quota allocation.
[ https://issues.apache.org/jira/browse/MESOS-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-7790:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Design hierarchical quota allocation.
>
> Key: MESOS-7790
> URL: https://issues.apache.org/jira/browse/MESOS-7790
> Project: Mesos
> Issue Type: Task
> Components: allocation
> Reporter: Benjamin Mahler
> Assignee: Michael Park
> Labels: multitenancy
>
> When quota is assigned in the role hierarchy (see MESOS-6375), it's possible for there to be "undelegated" quota for a role. For example:
> {noformat}
>                 ^
>               /   \
>             /       \
>    eng (90 cpus)   sales (10 cpus)
>          ^
>        /   \
>      /       \
> ads (50 cpus)  build (10 cpus)
> {noformat}
> Here, the "eng" role has 60 of its 90 cpus of quota delegated to its children, and 30 cpus remain undelegated. We need to design how to allocate these 30 undelegated cpus. Are they allocated entirely to the "eng" role? Are they allocated to the "eng" role tree? If so, how do we determine how much is allocated to each role in the "eng" tree (i.e. "eng", "eng/ads", "eng/build")?
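The arithmetic behind the "undelegated" quota in MESOS-7790 can be checked with a few lines. The sketch assumes quotas are kept in a flat mapping keyed by role path (for example `eng/ads`), which is a representation chosen here for illustration; it only computes the undelegated amount and does not answer the design question of how to allocate it.

```python
def undelegated_quota(quotas: dict, role: str) -> float:
    """Quota of `role` that is not delegated to its direct children."""
    prefix = role + "/"
    children = [
        name for name in quotas
        # Direct children only: "eng/ads" but not "eng/ads/x".
        if name.startswith(prefix) and "/" not in name[len(prefix):]
    ]
    return quotas[role] - sum(quotas[child] for child in children)


# The hierarchy from the ticket, in cpus.
quotas = {"eng": 90, "sales": 10, "eng/ads": 50, "eng/build": 10}
print(undelegated_quota(quotas, "eng"))    # 90 - (50 + 10) = 30
print(undelegated_quota(quotas, "sales"))  # no children, so 10
```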
[jira] [Updated] (MESOS-8240) Add an option to build the new CLI and run unit tests.
[ https://issues.apache.org/jira/browse/MESOS-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8240:
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 70)

> Add an option to build the new CLI and run unit tests.
>
> Key: MESOS-8240
> URL: https://issues.apache.org/jira/browse/MESOS-8240
> Project: Mesos
> Issue Type: Improvement
> Reporter: Armand Grillet
> Assignee: Armand Grillet
>
> An update of the discarded https://reviews.apache.org/r/52543/
[jira] [Updated] (MESOS-8190) Update the master to accept OfferOperationIDs from frameworks.
[ https://issues.apache.org/jira/browse/MESOS-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8190:
Sprint: Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 69, Mesosphere Sprint 70)

> Update the master to accept OfferOperationIDs from frameworks.
>
> Key: MESOS-8190
> URL: https://issues.apache.org/jira/browse/MESOS-8190
> Project: Mesos
> Issue Type: Task
> Reporter: Gastón Kleiman
> Assignee: Greg Mann
> Labels: mesosphere
>
> Master’s {{ACCEPT}} handler should send failed operation updates when a framework sets the {{OfferOperationID}} on an operation destined for an agent without the {{RESOURCE_PROVIDER}} capability.
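The check described in MESOS-8190 can be sketched as a filter over the operations of one {{ACCEPT}} call. The `Operation` class and `split_operations` function are hypothetical and only model the decision; how the failed update is built and delivered to the framework is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Operation:
    type: str
    operation_id: Optional[str] = None  # set by the framework, if at all


def split_operations(
    operations: List[Operation],
    agent_capabilities: Set[str],
) -> Tuple[List[Operation], List[str]]:
    """Return (operations to forward, IDs that need a failed update)."""
    forward, failed_ids = [], []
    for op in operations:
        has_id = op.operation_id is not None
        if has_id and "RESOURCE_PROVIDER" not in agent_capabilities:
            # The agent cannot handle operation IDs: report a failure
            # to the framework instead of forwarding the operation.
            failed_ids.append(op.operation_id)
        else:
            forward.append(op)
    return forward, failed_ids
```

Operations without an ID are forwarded either way, so frameworks that never set the ID are unaffected by the agent's capabilities.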
[jira] [Updated] (MESOS-5333) GET /master/maintenance/schedule/ produces 404.
[ https://issues.apache.org/jira/browse/MESOS-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-5333:
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 70)

> GET /master/maintenance/schedule/ produces 404.
>
> Key: MESOS-5333
> URL: https://issues.apache.org/jira/browse/MESOS-5333
> Project: Mesos
> Issue Type: Bug
> Components: HTTP API, libprocess
> Reporter: Nathan Handler
> Assignee: Alexander Rukletsov
> Priority: Minor
> Labels: mesosphere
>
> Attempts to make a GET request to /master/maintenance/schedule/ result in a 404. However, if I make a GET request to /master/maintenance/schedule (without the trailing /), it works. My current (untested) theory is that this might be related to the fact that there is also a /master/maintenance/schedule/status endpoint (an endpoint built on top of a functioning endpoint), as requests to /help and /help/ (with and without the trailing slash) produce the same functioning result.
[jira] [Updated] (MESOS-8108) Process offer operations in storage local resource provider
[ https://issues.apache.org/jira/browse/MESOS-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8108:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Process offer operations in storage local resource provider
>
> Key: MESOS-8108
> URL: https://issues.apache.org/jira/browse/MESOS-8108
> Project: Mesos
> Issue Type: Task
> Components: storage
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: storage
>
> The storage local resource provider receives offer operations for reservations and resource conversions, and invokes the proper CSI calls to implement these operations.
[jira] [Updated] (MESOS-8102) Add a test CSI plugin for storage local resource provider.
[ https://issues.apache.org/jira/browse/MESOS-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8102:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Add a test CSI plugin for storage local resource provider.
>
> Key: MESOS-8102
> URL: https://issues.apache.org/jira/browse/MESOS-8102
> Project: Mesos
> Issue Type: Task
> Components: test
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: mesosphere, storage
>
> We need a dummy CSI plugin for testing storage local resource providers. The test CSI plugin would just create subdirectories under its working directory to mimic the behavior of creating volumes, then bind-mount those volumes to mimic publishing.
[jira] [Updated] (MESOS-8115) Add a master flag to disallow agents that are not configured with fault domain
[ https://issues.apache.org/jira/browse/MESOS-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8115:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Add a master flag to disallow agents that are not configured with fault domain
>
> Key: MESOS-8115
> URL: https://issues.apache.org/jira/browse/MESOS-8115
> Project: Mesos
> Issue Type: Improvement
> Reporter: Vinod Kone
> Assignee: Benno Evers
>
> Once Mesos masters and agents in a cluster are *all* upgraded to a version where the fault domains feature is available, it is beneficial to enforce that agents without a fault domain configured are not allowed to join the cluster.
>
> This is a safety net for operators who could forget to configure the fault domain of a remote agent and let it join the cluster. If this happens, an agent in a remote region will be considered a local agent by the master and frameworks (because the agent's fault domain is not configured), causing tasks to potentially land on a remote agent, which is undesirable.
>
> Note that this has to be a configurable flag and not enforced by default, because otherwise upgrades from a cluster without fault domains configured to a configured cluster will not be possible.
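The admission rule proposed in MESOS-8115 reduces to a single predicate. The sketch below is illustrative only: the parameter name `require_agent_domain` and the dictionary shape of a fault domain are assumptions made for this example, not a statement of how the master implements the flag.

```python
from typing import Optional


def should_admit_agent(agent_domain: Optional[dict],
                       require_agent_domain: bool = False) -> bool:
    """Decide whether an agent may join the cluster.

    With the flag off (the default, so that upgrades keep working),
    every agent is admitted. With the flag on, an agent must have a
    fault domain configured.
    """
    return agent_domain is not None or not require_agent_domain


# Hypothetical fault domain value for illustration.
remote = {"region": "us-west", "zone": "us-west-2a"}
print(should_admit_agent(remote, require_agent_domain=True))  # True
print(should_admit_agent(None, require_agent_domain=True))    # False
print(should_admit_agent(None))                               # True
```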
[jira] [Updated] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.
[ https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8297:
Sprint: Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 69, Mesosphere Sprint 70)

> Built-in driver-based executors ignore kill task if the task has not been launched.
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
> Issue Type: Bug
> Components: executor
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
> Priority: Blocker
> Labels: mesosphere
>
> If the docker executor receives a kill task request and the task has never been launched, the request is ignored. We now know that the executor has never received the registration confirmation, hence has ignored the launch task request, hence the task has never started. This is how the executor enters an idle state, waiting for registration and ignoring kill task requests.
[jira] [Updated] (MESOS-8303) Add user doc for agent reconfiguration
[ https://issues.apache.org/jira/browse/MESOS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8303:
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 70)

> Add user doc for agent reconfiguration
>
> Key: MESOS-8303
> URL: https://issues.apache.org/jira/browse/MESOS-8303
> Project: Mesos
> Issue Type: Documentation
> Reporter: Vinod Kone
> Assignee: Benno Evers
[jira] [Updated] (MESOS-8184) Implement master's AcknowledgeOfferOperationMessage handler.
[ https://issues.apache.org/jira/browse/MESOS-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8184:
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Implement master's AcknowledgeOfferOperationMessage handler.
>
> Key: MESOS-8184
> URL: https://issues.apache.org/jira/browse/MESOS-8184
> Project: Mesos
> Issue Type: Task
> Reporter: Gastón Kleiman
> Assignee: Gastón Kleiman
> Labels: mesosphere
>
> This handler should validate the message and forward it to the corresponding agent/ERP.
[jira] [Updated] (MESOS-8144) Add a mock resource provider manager.
[ https://issues.apache.org/jira/browse/MESOS-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8144:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Add a mock resource provider manager.
>
> Key: MESOS-8144
> URL: https://issues.apache.org/jira/browse/MESOS-8144
> Project: Mesos
> Issue Type: Task
> Reporter: Chun-Hung Hsiao
> Assignee: Chun-Hung Hsiao
> Labels: storage
>
> To test a storage local resource provider, we need to inject a mock resource provider manager such that:
> 1. A full agent will start during the test so the resource provider can launch standalone containers for CSI plugins.
> 2. We can inject offer operations through the mock manager to test the resource provider.
[jira] [Updated] (MESOS-8221) Use protobuf reflection to simplify downgrading of resources.
[ https://issues.apache.org/jira/browse/MESOS-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-8221:
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Use protobuf reflection to simplify downgrading of resources.
>
> Key: MESOS-8221
> URL: https://issues.apache.org/jira/browse/MESOS-8221
> Project: Mesos
> Issue Type: Improvement
> Components: agent
> Reporter: Michael Park
> Assignee: Michael Park
>
> We currently have a {{downgradeResources}} function which is called on every {{repeated Resource}} field in every message that we checkpoint. We should leverage protobuf reflection to automatically downgrade any instances of {{Resource}} within any protobuf message.
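The idea in MESOS-8221, visiting every {{Resource}} inside an arbitrary message without listing the fields by hand, can be illustrated with a reflective walk. This sketch is an analogy and not protobuf code: Python dataclasses and `dataclasses.fields` stand in for protobuf messages and the protobuf reflection API, and the `Resource`, `TaskInfo` and `Checkpoint` classes and the `downgraded` marker are invented for the example.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Callable, List


@dataclass
class Resource:
    name: str
    downgraded: bool = False  # hypothetical marker for the example


@dataclass
class TaskInfo:
    resources: List[Resource] = field(default_factory=list)


@dataclass
class Checkpoint:
    task: TaskInfo
    extra: List[Resource] = field(default_factory=list)


def for_each_resource(message, visit: Callable[[Resource], None]) -> None:
    """Apply `visit` to every Resource nested anywhere in `message`."""
    if isinstance(message, Resource):
        visit(message)
    elif isinstance(message, list):
        for item in message:
            for_each_resource(item, visit)
    elif is_dataclass(message):
        # Reflection step: enumerate the fields instead of naming them.
        for f in fields(message):
            for_each_resource(getattr(message, f.name), visit)


def mark_downgraded(resource: Resource) -> None:
    resource.downgraded = True
```

A new message type that embeds resources is then covered automatically, which is the maintenance benefit the ticket is after.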
[jira] [Updated] (MESOS-7506) Multiple tests leave orphan containers.
[ https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-7506:
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70)

> Multiple tests leave orphan containers.
>
> Key: MESOS-7506
> URL: https://issues.apache.org/jira/browse/MESOS-7506
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Environment: Ubuntu 16.04
> Fedora 23
> other Linux distros
> Reporter: Alexander Rukletsov
> Assignee: Andrei Budnik
> Labels: containerizer, flaky-test, mesosphere
> Attachments: KillMultipleTasks-badrun.txt, ROOT_IsolatorFlags-badrun.txt, ResourceLimitation-badrun.txt, ResourceLimitation-badrun2.txt, RestartSlaveRequireExecutorAuthentication-badrun.txt, TaskWithFileURI-badrun.txt
>
> I've observed a number of flaky tests that leave orphan containers upon cleanup. A typical log looks like this:
> {noformat}
> ../../src/tests/cluster.cpp:580: Failure
> Value of: containers->empty()
> Actual: false
> Expected: true
> Failed to destroy containers: { da3e8aa8-98e7-4e72-a8fd-5d0bae960014 }
> {noformat}
> All currently affected tests:
> {noformat}
> SlaveTest.RestartSlaveRequireExecutorAuthentication
> LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags
> {noformat}
[jira] [Updated] (MESOS-8096) Enqueueing events in MockHTTPScheduler can lead to segfaults.
[ https://issues.apache.org/jira/browse/MESOS-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-8096: -- Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70) > Enqueueing events in MockHTTPScheduler can lead to segfaults. > - > > Key: MESOS-8096 > URL: https://issues.apache.org/jira/browse/MESOS-8096 > Project: Mesos > Issue Type: Bug > Components: scheduler driver, test > Environment: Fedora 23, Ubuntu 14.04, Ubuntu 16 >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov > Labels: flaky-test, mesosphere > Attachments: AsyncExecutorProcess-badrun-1.txt, > AsyncExecutorProcess-badrun-2.txt, AsyncExecutorProcess-badrun-3.txt, > scheduler-shutdown-invalid-driver.txt > > > Various tests segfault due to a yet unknown reason. Comparing logs (attached) > hints that the problem might be in the scheduler's event queue. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8352) Resources may get over allocated to some roles while fail to meet the quota of other roles.
[ https://issues.apache.org/jira/browse/MESOS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-8352: -- Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71 (was: Mesosphere Sprint 70) > Resources may get over allocated to some roles while fail to meet the quota > of other roles. > --- > > Key: MESOS-8352 > URL: https://issues.apache.org/jira/browse/MESOS-8352 > Project: Mesos > Issue Type: Bug > Components: allocation >Reporter: Meng Zhu >Assignee: Meng Zhu > Labels: multitenancy, quotas > > In the quota role allocation stage, if a role gets some resources on an agent > to meet its quota, it will also get all other resources on the same agent > that it does not have quota for. This may starve roles behind it that have > quotas set for those resources. > To fix that, we need to track quota headroom in the quota role allocation > stage. In that stage, if a role has no quota set for a scalar resource, it > will get that resource only when two conditions are both met: > - It got some other resources on the same agent to meet its quota; And > - After allocating those resources, quota headroom is still above the > required amount. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
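The two conditions described in MESOS-8352 can be sketched as a small predicate. This is only an illustration under stated assumptions, not the allocator's code: `AllocationContext`, `mayAllocateUnquotaedResource`, and the modelling of headroom as a single `double` are hypothetical, and treating "still above the required amount" as "at or above" is likewise an assumption.

```cpp
#include <cassert>

// Hypothetical, simplified view of one decision in the quota role
// allocation stage. None of these names come from the Mesos source.
struct AllocationContext {
  // True if the role already received resources on this agent that
  // count toward its quota.
  bool allocatedQuotaResourcesOnAgent;

  // Quota headroom for the scalar resource before this allocation.
  double currentHeadroom;

  // Headroom that must remain for roles that do have quota set for
  // this resource.
  double requiredHeadroom;
};

// Returns true if a role WITHOUT quota for a scalar resource may still
// receive `amount` of that resource on this agent.
inline bool mayAllocateUnquotaedResource(
    const AllocationContext& context,
    double amount)
{
  // Condition 1: the role got some other resources on the same agent
  // to meet its quota.
  if (!context.allocatedQuotaResourcesOnAgent) {
    return false;
  }

  // Condition 2: after allocating, the headroom does not drop below
  // the required amount (boundary handling is an assumption).
  return context.currentHeadroom - amount >= context.requiredHeadroom;
}
```

With a headroom of 10 and a required headroom of 5, a request for 4 would pass and a request for 6 would be refused, which is the starvation guard the ticket asks for.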
[jira] [Commented] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.
[ https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301268#comment-16301268 ] Alexander Rukletsov commented on MESOS-8297: {noformat} Commit: 44a702a1b26963040e6cb6c362b7f01e5b4ef097 [44a702a] Author: Alexander Rukletsov ruklet...@gmail.com Date: 22 December 2017 at 12:09:58 GMT+1 Committer: Alexander Rukletsov al...@apache.org Promoted log level to warning for disconnected events in exec.cpp. When the executor library receives messages while being disconnected, it might indicate an out-of-order message delivery or lost messages. This should be logged at the warning level to simplify triaging. Review: https://reviews.apache.org/r/64032/ {noformat} {noformat} Commit: 47392cf9f9024718550c69bcef9319560b47d5c7 [47392cf] Author: Alexander Rukletsov Date: 22 December 2017 at 12:10:15 GMT+1 Committer: Alexander Rukletsov Ensured command executor always honors shutdown request. Review: https://reviews.apache.org/r/64069/ {noformat} {noformat} Commit: b2eddcfe0ede4725208ae33c8c7f56563ff10514 [b2eddcf] Author: Alexander Rukletsov Date: 22 December 2017 at 12:10:28 GMT+1 Committer: Alexander Rukletsov Ensured executor adapter propagates error and shutdown messages. Prior to this patch, if an error, kill, or shutdown occurred during subscription / registration with the agent, it was not propagated back to the executor if the v0_v1 executor adapter was used. This happened because the adapter did not call the `connected` callback until after successful registration and hence the executor did not even try to send the `SUBSCRIBE` call, without which the adapter did not send any events to the executor. A fix is to call the `connected` callback if an error occurred or shutdown / kill event arrived before the executor had subscribed. 
Review: https://reviews.apache.org/r/64070/ {noformat} {noformat} Commit: 769108e94a7c7834c44e01091a9940354eb3f6e4 [769108e] Author: Alexander Rukletsov Date: 22 December 2017 at 12:10:35 GMT+1 Committer: Alexander Rukletsov Terminated driver-based executors if kill arrives before launch task. `ExecutorRegisteredMessage` or `RunTaskMessage` may not be delivered to a driver-based executor. Since these messages are not retried, without this patch an executor never starts a task and remains idle, ignoring kill task request. This patch ensures all built-in driver-based executors eventually shut down if kill task arrives before the task has been started. Review: https://reviews.apache.org/r/64033/ {noformat} > Built-in driver-based executors ignore kill task if the task has not been > launched. > --- > > Key: MESOS-8297 > URL: https://issues.apache.org/jira/browse/MESOS-8297 > Project: Mesos > Issue Type: Bug > Components: executor >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov >Priority: Blocker > Labels: mesosphere > > If the docker executor receives a kill task request and the task has never been > launched, the request is ignored. We now know that: the executor has never > received the registration confirmation, hence has ignored the launch task > request, hence the task has never started. And this is how the executor > enters an idle state, waiting for registration and ignoring kill task > requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
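The behaviour described in the last commit above (shut down instead of idling when a kill arrives before any task was started) can be illustrated with a toy state machine. This is a sketch only: `SketchExecutor` and its methods are hypothetical stand-ins and do not mirror the Mesos executor driver API or the actual patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a driver-based executor; not Mesos code.
class SketchExecutor {
public:
  // Called when a launch task request is actually delivered.
  void launchTask(const std::string& taskId) {
    launched_ = true;
    taskId_ = taskId;
  }

  // Called when a kill task request arrives.
  void killTask(const std::string& taskId) {
    if (!launched_) {
      // The registration or launch message may have been lost and is
      // not retried, so no task will ever start. Rather than ignoring
      // the kill and idling forever, terminate the executor.
      shutdown_ = true;
      return;
    }

    if (taskId == taskId_) {
      killed_ = true;
    }
  }

  bool isShutdown() const { return shutdown_; }
  bool isTaskKilled() const { return killed_; }

private:
  bool launched_ = false;
  bool killed_ = false;
  bool shutdown_ = false;
  std::string taskId_;
};
```

In this model a kill that precedes any launch ends the executor, while a kill after a launch only marks the matching task as killed.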
[jira] [Commented] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.
[ https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301267#comment-16301267 ] Alexander Rukletsov commented on MESOS-8297: [~gilbert] Landed. > Built-in driver-based executors ignore kill task if the task has not been > launched. > --- > > Key: MESOS-8297 > URL: https://issues.apache.org/jira/browse/MESOS-8297 > Project: Mesos > Issue Type: Bug > Components: executor >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov >Priority: Blocker > Labels: mesosphere > > If the docker executor receives a kill task request and the task has never been > launched, the request is ignored. We now know that: the executor has never > received the registration confirmation, hence has ignored the launch task > request, hence the task has never started. And this is how the executor > enters an idle state, waiting for registration and ignoring kill task > requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-6616) Error: dereferencing type-punned pointer will break strict-aliasing rules.
[ https://issues.apache.org/jira/browse/MESOS-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-6616: --- Shepherd: Benjamin Bannier > Error: dereferencing type-punned pointer will break strict-aliasing rules. > -- > > Key: MESOS-6616 > URL: https://issues.apache.org/jira/browse/MESOS-6616 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.1.0, 1.2.3, 1.3.1, 1.4.1 > Environment: Fedora Rawhide; > Debian 8.10 + gcc 5.5.0-6 with {{O2}} >Reporter: Orion Poplawski >Assignee: Alexander Rukletsov > Labels: compile-error, mesosphere > > Trying to update the mesos package to 1.1.0 in Fedora. Getting: > {noformat} > libtool: compile: g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" > -DPACKAGE_VERSION=\"1.1.0\" "-DPACKAGE_STRING=\"mesos 1.1.0\"" > -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" > -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 > -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 > -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 > -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 > -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 > -DHAVE_LIBAPR_1=1 -DHAVE_BOOST_VERSION_HPP=1 -DHAVE_LIBCURL=1 > -DHAVE_ELFIO_ELFIO_HPP=1 -DHAVE_GLOG_LOGGING_H=1 -DHAVE_HTTP_PARSER_H=1 > -DMESOS_HAS_JAVA=1 -DHAVE_LEVELDB_DB_H=1 -DHAVE_LIBNL_3=1 > -DHAVE_LIBNL_ROUTE_3=1 -DHAVE_LIBNL_IDIAG_3=1 -DWITH_NETWORK_ISOLATOR=1 > -DHAVE_GOOGLE_PROTOBUF_MESSAGE_H=1 -DHAVE_EV_H=1 -DHAVE_PICOJSON_H=1 > -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 > -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 > -DHAVE_ZOOKEEPER_H=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. 
-Wall > -Werror -Wsign-compare -DLIBDIR=\"/usr/lib64\" > -DPKGLIBEXECDIR=\"/usr/libexec/mesos\" -DPKGDATADIR=\"/usr/share/mesos\" > -DPKGMODULEDIR=\"/usr/lib64/mesos/modules\" -I../include -I../include > -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS > -I../3rdparty/libprocess/include -I../3rdparty/nvml-352.79 > -I../3rdparty/stout/include -DHAS_AUTHENTICATION=1 -Iyes/include > -I/usr/include/subversion-1 -Iyes/include -Iyes/include -Iyes/include/libnl3 > -Iyes/include -I/ -Iyes/include -I/usr/include/apr-1 -I/usr/include/apr-1.0 > -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15/include -isystem yes/include > -Iyes/include -I/usr/src/gmock -I/usr/src/gmock/include -I/usr/src/gmock/src > -I/usr/src/gmock/gtest -I/usr/src/gmock/gtest/include > -I/usr/src/gmock/gtest/src -Iyes/include -Iyes/include -I/usr/include > -I/builddir/build/BUILD/mesos-1.1.0/libev4.15/include -Iyes/include > -I/usr/include -I/usr/include/zookeeper -pthread -O2 -g -pipe -Wall > -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions > -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches > -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic > -DEV_CHILD_ENABLE=0 -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15 > -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11 -c > health-check/health_checker.cpp -fPIC -DPIC -o > health-check/.libs/libmesos_no_3rdparty_la-health_checker.o > In file included from health-check/health_checker.cpp:51:0: > ./linux/ns.hpp: In function 'Try ns::clone(pid_t, int, const > std::function&, int)': > ./linux/ns.hpp:480:69: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] > pid_t pid = ((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid; > ^~ > ./linux/ns.hpp: In lambda function: > ./linux/ns.hpp:581:59: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] >((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid = 
::getpid(); >^~ > ./linux/ns.hpp:582:59: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] >((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->uid = ::getuid(); >^~ > ./linux/ns.hpp:583:59: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] >((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->gid = ::getgid(); >^~ > cc1plus: all warnings being treated as errors > make[2]: *** [Makefile:6655: > health-check/libmesos_no_3rdparty_la-health_checker.lo] Error 1 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
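A common way to avoid this class of strict-aliasing error is to copy the structure in and out of the raw buffer with `std::memcpy` instead of dereferencing a cast pointer. The sketch below shows that general technique only; it is not necessarily the fix applied in Mesos. `Credentials` is a hypothetical stand-in for `struct ucred` so that the example does not depend on Linux headers, and the plain byte buffer stands in for the control-message data area.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for `struct ucred` (pid, uid, gid).
struct Credentials {
  int pid;
  unsigned int uid;
  unsigned int gid;
};

// Reads a Credentials value from a raw byte buffer. The buffer pointer
// is never cast to `Credentials*`, so no type-punned pointer is
// dereferenced.
inline Credentials readCredentials(const unsigned char* data)
{
  Credentials credentials;
  std::memcpy(&credentials, data, sizeof(credentials));
  return credentials;
}

// Writes a Credentials value into a raw byte buffer, again via memcpy.
// The caller must provide at least sizeof(Credentials) bytes.
inline void writeCredentials(
    unsigned char* data,
    const Credentials& credentials)
{
  std::memcpy(data, &credentials, sizeof(credentials));
}
```

Compilers typically turn such fixed-size copies into plain loads and stores, so the approach usually costs nothing at `-O2`.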
[jira] [Updated] (MESOS-6616) Error: dereferencing type-punned pointer will break strict-aliasing rules.
[ https://issues.apache.org/jira/browse/MESOS-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-6616: --- Sprint: Mesosphere Sprint 71 > Error: dereferencing type-punned pointer will break strict-aliasing rules. > -- > > Key: MESOS-6616 > URL: https://issues.apache.org/jira/browse/MESOS-6616 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.1.0, 1.2.3, 1.3.1, 1.4.1 > Environment: Fedora Rawhide; > Debian 8.10 + gcc 5.5.0-6 with {{O2}} >Reporter: Orion Poplawski >Assignee: Alexander Rukletsov > Labels: compile-error, mesosphere > > Trying to update the mesos package to 1.1.0 in Fedora. Getting: > {noformat} > libtool: compile: g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" > -DPACKAGE_VERSION=\"1.1.0\" "-DPACKAGE_STRING=\"mesos 1.1.0\"" > -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" > -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 > -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 > -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 > -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 > -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 > -DHAVE_LIBAPR_1=1 -DHAVE_BOOST_VERSION_HPP=1 -DHAVE_LIBCURL=1 > -DHAVE_ELFIO_ELFIO_HPP=1 -DHAVE_GLOG_LOGGING_H=1 -DHAVE_HTTP_PARSER_H=1 > -DMESOS_HAS_JAVA=1 -DHAVE_LEVELDB_DB_H=1 -DHAVE_LIBNL_3=1 > -DHAVE_LIBNL_ROUTE_3=1 -DHAVE_LIBNL_IDIAG_3=1 -DWITH_NETWORK_ISOLATOR=1 > -DHAVE_GOOGLE_PROTOBUF_MESSAGE_H=1 -DHAVE_EV_H=1 -DHAVE_PICOJSON_H=1 > -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 > -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 > -DHAVE_ZOOKEEPER_H=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. 
-Wall > -Werror -Wsign-compare -DLIBDIR=\"/usr/lib64\" > -DPKGLIBEXECDIR=\"/usr/libexec/mesos\" -DPKGDATADIR=\"/usr/share/mesos\" > -DPKGMODULEDIR=\"/usr/lib64/mesos/modules\" -I../include -I../include > -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS > -I../3rdparty/libprocess/include -I../3rdparty/nvml-352.79 > -I../3rdparty/stout/include -DHAS_AUTHENTICATION=1 -Iyes/include > -I/usr/include/subversion-1 -Iyes/include -Iyes/include -Iyes/include/libnl3 > -Iyes/include -I/ -Iyes/include -I/usr/include/apr-1 -I/usr/include/apr-1.0 > -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15/include -isystem yes/include > -Iyes/include -I/usr/src/gmock -I/usr/src/gmock/include -I/usr/src/gmock/src > -I/usr/src/gmock/gtest -I/usr/src/gmock/gtest/include > -I/usr/src/gmock/gtest/src -Iyes/include -Iyes/include -I/usr/include > -I/builddir/build/BUILD/mesos-1.1.0/libev4.15/include -Iyes/include > -I/usr/include -I/usr/include/zookeeper -pthread -O2 -g -pipe -Wall > -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions > -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches > -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic > -DEV_CHILD_ENABLE=0 -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15 > -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11 -c > health-check/health_checker.cpp -fPIC -DPIC -o > health-check/.libs/libmesos_no_3rdparty_la-health_checker.o > In file included from health-check/health_checker.cpp:51:0: > ./linux/ns.hpp: In function 'Try ns::clone(pid_t, int, const > std::function&, int)': > ./linux/ns.hpp:480:69: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] > pid_t pid = ((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid; > ^~ > ./linux/ns.hpp: In lambda function: > ./linux/ns.hpp:581:59: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] >((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid = 
::getpid(); >^~ > ./linux/ns.hpp:582:59: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] >((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->uid = ::getuid(); >^~ > ./linux/ns.hpp:583:59: error: dereferencing type-punned pointer will break > strict-aliasing rules [-Werror=strict-aliasing] >((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->gid = ::getgid(); >^~ > cc1plus: all warnings being treated as errors > make[2]: *** [Makefile:6655: > health-check/libmesos_no_3rdparty_la-health_checker.lo] Error 1 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used
[ https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Kalin updated MESOS-8356: Component/s: (was: docker) > Persistent volume ownership is set to root despite of sandbox owner > (frameworkInfo.user) when docker executor is used > - > > Key: MESOS-8356 > URL: https://issues.apache.org/jira/browse/MESOS-8356 > Project: Mesos > Issue Type: Bug > Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13 >Reporter: Konstantin Kalin > Labels: persistent-volumes > > PersistentVolume ownership is not set to match the sandbox user when the > docker executor is used. Looks like the issue was introduced by > https://reviews.apache.org/r/45963/ > I didn't check the universal containerizer yet. > As far as I understand the following code is supposed to check that a volume > is not being already used by other tasks/containers. > src/slave/containerizer/docker.cpp > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource)) { > isVolumeInUse = true; > break; > } > } > {code} > But it doesn't exclude a container to be launch (In my case I have only one > container - no group of tasks). Thus the ownership of PersistentVolume stays > "root" (I run mesos-agent under root) > Making a small patch to exclude the container to launch fixes the issue. > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource) && > containerId != container->id) { > isVolumeInUse = true; > break; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used
[ https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Kalin updated MESOS-8356: Labels: persistent-volumes (was: ) > Persistent volume ownership is set to root despite of sandbox owner > (frameworkInfo.user) when docker executor is used > - > > Key: MESOS-8356 > URL: https://issues.apache.org/jira/browse/MESOS-8356 > Project: Mesos > Issue Type: Bug > Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13 >Reporter: Konstantin Kalin > Labels: persistent-volumes > > PersistentVolume ownership is not set to match the sandbox user when the > docker executor is used. Looks like the issue was introduced by > https://reviews.apache.org/r/45963/ > I didn't check the universal containerizer yet. > As far as I understand the following code is supposed to check that a volume > is not being already used by other tasks/containers. > src/slave/containerizer/docker.cpp > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource)) { > isVolumeInUse = true; > break; > } > } > {code} > But it doesn't exclude a container to be launch (In my case I have only one > container - no group of tasks). Thus the ownership of PersistentVolume stays > "root" (I run mesos-agent under root) > Making a small patch to exclude the container to launch fixes the issue. > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource) && > containerId != container->id) { > isVolumeInUse = true; > break; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used
Konstantin Kalin created MESOS-8356: --- Summary: Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used Key: MESOS-8356 URL: https://issues.apache.org/jira/browse/MESOS-8356 Project: Mesos Issue Type: Bug Components: docker Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13 Reporter: Konstantin Kalin PersistentVolume ownership is not set to match the sandbox user when the docker executor is used. Looks like the issue was introduced by https://reviews.apache.org/r/45963/ I didn't check the universal containerizer yet. As far as I understand, the following code is supposed to check that a volume is not already being used by other tasks/containers. src/slave/containerizer/docker.cpp {code:c++} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource)) { isVolumeInUse = true; break; } } {code} But it doesn't exclude the container being launched (in my case I have only one container, no group of tasks). Thus the ownership of the PersistentVolume stays "root" (I run mesos-agent as root). A small patch that excludes the container being launched fixes the issue. {code:c++} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource) && containerId != container->id) { isVolumeInUse = true; break; } } {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
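The check proposed in the report can be tried out in isolation with standard containers. The following is a self-contained sketch, not the code in src/slave/containerizer/docker.cpp: `Container` here is a hypothetical stand-in that tracks volumes by name, and `isVolumeInUseByOthers` is an illustrative helper rather than a Mesos function.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the containerizer's per-container state.
struct Container {
  std::string id;
  std::set<std::string> resources;  // Names of persistent volumes held.
};

// Returns true if `volume` is held by any container OTHER than the one
// currently being launched. Skipping the launching container is the
// point of the proposed patch: otherwise a volume always looks "in use"
// as soon as the launching container itself is registered with it.
inline bool isVolumeInUseByOthers(
    const std::map<std::string, Container>& containers,
    const std::string& launchingContainerId,
    const std::string& volume)
{
  for (const auto& entry : containers) {
    const Container& container = entry.second;

    if (container.id != launchingContainerId &&
        container.resources.count(volume) > 0) {
      return true;
    }
  }

  return false;
}
```

With a single container that owns the volume, the helper reports the volume as free for that container, which is the case the reporter describes; a second container holding the same volume makes it report the volume as in use.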
[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used
[ https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Kalin updated MESOS-8356: Description: PersistentVolume ownership is not set to match the sandbox user when the docker executor is used. Looks like the issue was introduced by https://reviews.apache.org/r/45963/ I didn't check the universal containerizer yet. As far as I understand the following code is supposed to check that a volume is not being already used by other tasks/containers. src/slave/containerizer/docker.cpp {code:java} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource)) { isVolumeInUse = true; break; } } {code} But it doesn't exclude a container to be launch (In my case I have only one container - no group of tasks). Thus the ownership of PersistentVolume stays "root" (I run mesos-agent under root) and it's impossible to use the volume inside the container. We always run processes inside Docker containers under unprivileged user. Making a small patch to exclude the container to launch fixes the issue. {code:java} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource) && containerId != container->id) { isVolumeInUse = true; break; } } {code} was: PersistentVolume ownership is not set to match the sandbox user when the docker executor is used. Looks like the issue was introduced by https://reviews.apache.org/r/45963/ I didn't check the universal containerizer yet. As far as I understand the following code is supposed to check that a volume is not being already used by other tasks/containers. src/slave/containerizer/docker.cpp {code:java} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource)) { isVolumeInUse = true; break; } } {code} But it doesn't exclude a container to be launch (In my case I have only one container - no group of tasks). 
Thus the ownership of PersistentVolume stays "root" (I run mesos-agent under root) Making a small patch to exclude the container to launch fixes the issue. {code:java} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource) && containerId != container->id) { isVolumeInUse = true; break; } } {code} > Persistent volume ownership is set to root despite of sandbox owner > (frameworkInfo.user) when docker executor is used > - > > Key: MESOS-8356 > URL: https://issues.apache.org/jira/browse/MESOS-8356 > Project: Mesos > Issue Type: Bug > Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13 >Reporter: Konstantin Kalin > Labels: persistent-volumes > > PersistentVolume ownership is not set to match the sandbox user when the > docker executor is used. Looks like the issue was introduced by > https://reviews.apache.org/r/45963/ > I didn't check the universal containerizer yet. > As far as I understand the following code is supposed to check that a volume > is not being already used by other tasks/containers. > src/slave/containerizer/docker.cpp > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource)) { > isVolumeInUse = true; > break; > } > } > {code} > But it doesn't exclude a container to be launch (In my case I have only one > container - no group of tasks). Thus the ownership of PersistentVolume stays > "root" (I run mesos-agent under root) and it's impossible to use the volume > inside the container. We always run processes inside Docker containers under > unprivileged user. > Making a small patch to exclude the container to launch fixes the issue. > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource) && > containerId != container->id) { > isVolumeInUse = true; > break; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.
[ https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301484#comment-16301484 ] Alexander Rukletsov commented on MESOS-8297: Backported to 1.4.2. > Built-in driver-based executors ignore kill task if the task has not been > launched. > --- > > Key: MESOS-8297 > URL: https://issues.apache.org/jira/browse/MESOS-8297 > Project: Mesos > Issue Type: Bug > Components: executor >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov >Priority: Blocker > Labels: mesosphere > > If the docker executor receives a kill task request and the task has never been > launched, the request is ignored. We now know that: the executor has never > received the registration confirmation, hence has ignored the launch task > request, hence the task has never started. And this is how the executor > enters an idle state, waiting for registration and ignoring kill task > requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used
[ https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Kalin updated MESOS-8356: Description: PersistentVolume ownership is not set to match the sandbox user when the docker executor is used. Looks like the issue was introduced by https://reviews.apache.org/r/45963/ I didn't check the universal containerizer yet. As far as I understand the following code is supposed to check that a volume is not being already used by other tasks/containers. src/slave/containerizer/docker.cpp {code:java} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource)) { isVolumeInUse = true; break; } } {code} But it doesn't exclude a container to be launch (In my case I have only one container - no group of tasks). Thus the ownership of PersistentVolume stays "root" (I run mesos-agent under root) Making a small patch to exclude the container to launch fixes the issue. {code:java} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource) && containerId != container->id) { isVolumeInUse = true; break; } } {code} was: PersistentVolume ownership is not set to match the sandbox user when the docker executor is used. Looks like the issue was introduced by https://reviews.apache.org/r/45963/ I didn't check the universal containerizer yet. As far as I understand the following code is supposed to check that a volume is not being already used by other tasks/containers. src/slave/containerizer/docker.cpp {code:c++} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource)) { isVolumeInUse = true; break; } } {code} But it doesn't exclude a container to be launch (In my case I have only one container - no group of tasks). Thus the ownership of PersistentVolume stays "root" (I run mesos-agent under root) Making a small patch to exclude the container to launch fixes the issue. 
{code:c++} foreachvalue (const Container* container, containers_) { if (container->resources.contains(resource) && containerId != container->id) { isVolumeInUse = true; break; } } {code} > Persistent volume ownership is set to root despite of sandbox owner > (frameworkInfo.user) when docker executor is used > - > > Key: MESOS-8356 > URL: https://issues.apache.org/jira/browse/MESOS-8356 > Project: Mesos > Issue Type: Bug > Components: docker > Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13 >Reporter: Konstantin Kalin > > PersistentVolume ownership is not set to match the sandbox user when the > docker executor is used. Looks like the issue was introduced by > https://reviews.apache.org/r/45963/ > I didn't check the universal containerizer yet. > As far as I understand the following code is supposed to check that a volume > is not being already used by other tasks/containers. > src/slave/containerizer/docker.cpp > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource)) { > isVolumeInUse = true; > break; > } > } > {code} > But it doesn't exclude a container to be launch (In my case I have only one > container - no group of tasks). Thus the ownership of PersistentVolume stays > "root" (I run mesos-agent under root) > Making a small patch to exclude the container to launch fixes the issue. > {code:java} > foreachvalue (const Container* container, containers_) { > if (container->resources.contains(resource) && > containerId != container->id) { > isVolumeInUse = true; > break; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.
[ https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-8297: --- Shepherd: Vinod Kone (was: Anand Mazumdar) > Built-in driver-based executors ignore kill task if the task has not been > launched. > --- > > Key: MESOS-8297 > URL: https://issues.apache.org/jira/browse/MESOS-8297 > Project: Mesos > Issue Type: Bug > Components: executor >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov >Priority: Blocker > Labels: mesosphere > > If the docker executor receives a kill task request and the task has never been > launched, the request is ignored. We now know that: the executor has never > received the registration confirmation, hence has ignored the launch task > request, hence the task has never started. And this is how the executor > enters an idle state, waiting for registration and ignoring kill task > requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.
[ https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-8297: --- Fix Version/s: 1.5.0 1.4.2 > Built-in driver-based executors ignore kill task if the task has not been > launched. > --- > > Key: MESOS-8297 > URL: https://issues.apache.org/jira/browse/MESOS-8297 > Project: Mesos > Issue Type: Bug > Components: executor >Reporter: Alexander Rukletsov >Assignee: Alexander Rukletsov >Priority: Blocker > Labels: mesosphere > Fix For: 1.4.2, 1.5.0 > > > If the docker executor receives a kill task request and the task has never been > launched, the request is ignored. We now know that: the executor has never > received the registration confirmation, hence has ignored the launch task > request, hence the task has never started. And this is how the executor > enters an idle state, waiting for registration and ignoring kill task > requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-6843) Fetcher should not assume stdout/stderr in the sandbox.
[ https://issues.apache.org/jira/browse/MESOS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-6843: -- Target Version/s: 1.6.0 (was: 1.5.0) > Fetcher should not assume stdout/stderr in the sandbox. > --- > > Key: MESOS-6843 > URL: https://issues.apache.org/jira/browse/MESOS-6843 > Project: Mesos > Issue Type: Bug > Components: fetcher >Affects Versions: 1.0.2, 1.1.0 >Reporter: Jie Yu >Priority: Critical > Labels: mesosphere > > If container logger is used, this assumption might not be true. For instance, > a journald logger might redirect all task logs to journald. So in theory, the > fetcher log should go to journald as well, rather than writing to > sandbox/stdout and sandbox/stderr. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-6784) IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky
[ https://issues.apache.org/jira/browse/MESOS-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301693#comment-16301693 ] Jie Yu commented on MESOS-6784: --- Haven't seen this test being flaky for months on head. Close it for now. RE-open if you see this being flaky again. cc [~alexr] > IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky > > > Key: MESOS-6784 > URL: https://issues.apache.org/jira/browse/MESOS-6784 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Neil Conway >Priority: Critical > Labels: mesosphere > Fix For: 1.5.0 > > > {noformat} > [ RUN ] IOSwitchboardTest.KillSwitchboardContainerDestroyed > I1212 13:57:02.641043 2211 containerizer.cpp:220] Using isolation: > posix/cpu,filesystem/posix,network/cni > W1212 13:57:02.641438 2211 backend.cpp:76] Failed to create 'overlay' > backend: OverlayBackend requires root privileges, but is running as user nrc > W1212 13:57:02.641559 2211 backend.cpp:76] Failed to create 'bind' backend: > BindBackend requires root privileges > I1212 13:57:02.642822 2268 containerizer.cpp:594] Recovering containerizer > I1212 13:57:02.643975 2253 provisioner.cpp:253] Provisioner recovery complete > I1212 13:57:02.644953 2255 containerizer.cpp:986] Starting container > 09e87380-00ab-4987-83c9-fa1c5d86717f for executor 'executor' of framework > I1212 13:57:02.647004 2245 switchboard.cpp:430] Allocated pseudo terminal > '/dev/pts/54' for container 09e87380-00ab-4987-83c9-fa1c5d86717f > I1212 13:57:02.652305 2245 switchboard.cpp:596] Created I/O switchboard > server (pid: 2705) listening on socket file > '/tmp/mesos-io-switchboard-b4af1c92-6633-44f3-9d35-e0e36edaf70a' for > container 09e87380-00ab-4987-83c9-fa1c5d86717f > I1212 13:57:02.655513 2267 launcher.cpp:133] Forked child with pid '2706' > for container '09e87380-00ab-4987-83c9-fa1c5d86717f' > I1212 13:57:02.655732 2267 containerizer.cpp:1621] Checkpointing container's > forked pid 2706 to > 
'/tmp/IOSwitchboardTest_KillSwitchboardContainerDestroyed_Me5CRx/meta/slaves/frameworks/executors/executor/runs/09e87380-00ab-4987-83c9-fa1c5d86717f/pids/forked.pid' > I1212 13:57:02.726306 2265 containerizer.cpp:2463] Container > 09e87380-00ab-4987-83c9-fa1c5d86717f has exited > I1212 13:57:02.726352 2265 containerizer.cpp:2100] Destroying container > 09e87380-00ab-4987-83c9-fa1c5d86717f in RUNNING state > E1212 13:57:02.726495 2243 switchboard.cpp:861] Unexpected termination of > I/O switchboard server: 'IOSwitchboard' exited with signal: Killed for > container 09e87380-00ab-4987-83c9-fa1c5d86717f > I1212 13:57:02.726563 2265 launcher.cpp:149] Asked to destroy container > 09e87380-00ab-4987-83c9-fa1c5d86717f > E1212 13:57:02.783607 2228 switchboard.cpp:799] Failed to remove unix domain > socket file '/tmp/mesos-io-switchboard-b4af1c92-6633-44f3-9d35-e0e36edaf70a' > for container '09e87380-00ab-4987-83c9-fa1c5d86717f': No such file or > directory > ../../mesos/src/tests/containerizer/io_switchboard_tests.cpp:661: Failure > Value of: wait.get()->reasons().size() == 1 > Actual: false > Expected: true > *** Aborted at 1481579822 (unix time) try "date -d @1481579822" if you are > using GNU date *** > PC: @ 0x1bf16d0 testing::UnitTest::AddTestPartResult() > *** SIGSEGV (@0x0) received by PID 2211 (TID 0x7faed7d078c0) from PID 0; > stack trace: *** > @ 0x7faecf855100 (unknown) > @ 0x1bf16d0 testing::UnitTest::AddTestPartResult() > @ 0x1be6247 testing::internal::AssertHelper::operator=() > @ 0x19ed751 > mesos::internal::tests::IOSwitchboardTest_KillSwitchboardContainerDestroyed_Test::TestBody() > @ 0x1c0ed8c > testing::internal::HandleSehExceptionsInMethodIfSupported<>() > @ 0x1c09e74 > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0x1beb505 testing::Test::Run() > @ 0x1bebc88 testing::TestInfo::Run() > @ 0x1bec2ce testing::TestCase::Run() > @ 0x1bf2ba8 testing::internal::UnitTestImpl::RunAllTests() > @ 0x1c0f9b1 > 
testing::internal::HandleSehExceptionsInMethodIfSupported<>() > @ 0x1c0a9f2 > testing::internal::HandleExceptionsInMethodIfSupported<>() > @ 0x1bf18ee testing::UnitTest::Run() > @ 0x11bc9e3 RUN_ALL_TESTS() > @ 0x11bc599 main > @ 0x7faece663b15 __libc_start_main > @ 0xa9c219 (unknown) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-6240) Allow executor/agent communication over non-TCP/IP stream socket.
[ https://issues.apache.org/jira/browse/MESOS-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301677#comment-16301677 ] Jie Yu commented on MESOS-6240: --- Re-target for 1.6.0 as no progress has been made in a few months. > Allow executor/agent communication over non-TCP/IP stream socket. > - > > Key: MESOS-6240 > URL: https://issues.apache.org/jira/browse/MESOS-6240 > Project: Mesos > Issue Type: Improvement > Components: containerization > Environment: Linux and Windows >Reporter: Avinash Sridharan >Assignee: Benjamin Hindman >Priority: Critical > Labels: mesosphere > > Currently, executor/agent communication happens specifically over TCP > sockets. This works fine in most cases, but for the > `MesosContainerizer`, when containers are running on CNI networks, this mode > of communication starts imposing constraints on the CNI network, since > there now has to be connectivity between the CNI network (on which the executor is > running) and the agent. Introducing paths from a CNI network to the > underlying agent, at best, creates headaches for operators and at worst > introduces serious security holes in the network, since it breaks the > isolation between the container CNI network and the host network (on which > the agent is running). > In order to simplify/strengthen deployment of Mesos containers on CNI > networks we therefore need to move away from using TCP/IP sockets for > executor/agent communication. Since the executor and agent are guaranteed to run > on the same host, the above problems can be resolved if, for the > `MesosContainerizer`, we use UNIX domain sockets or named pipes instead of > TCP/IP sockets for the executor/agent communication.
[jira] [Updated] (MESOS-6240) Allow executor/agent communication over non-TCP/IP stream socket.
[ https://issues.apache.org/jira/browse/MESOS-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-6240: -- Target Version/s: 1.6.0 (was: 1.5.0) > Allow executor/agent communication over non-TCP/IP stream socket. > - > > Key: MESOS-6240 > URL: https://issues.apache.org/jira/browse/MESOS-6240 > Project: Mesos > Issue Type: Improvement > Components: containerization > Environment: Linux and Windows >Reporter: Avinash Sridharan >Assignee: Benjamin Hindman >Priority: Critical > Labels: mesosphere > > Currently, executor/agent communication happens specifically over TCP > sockets. This works fine in most cases, but for the > `MesosContainerizer`, when containers are running on CNI networks, this mode > of communication starts imposing constraints on the CNI network, since > there now has to be connectivity between the CNI network (on which the executor is > running) and the agent. Introducing paths from a CNI network to the > underlying agent, at best, creates headaches for operators and at worst > introduces serious security holes in the network, since it breaks the > isolation between the container CNI network and the host network (on which > the agent is running). > In order to simplify/strengthen deployment of Mesos containers on CNI > networks we therefore need to move away from using TCP/IP sockets for > executor/agent communication. Since the executor and agent are guaranteed to run > on the same host, the above problems can be resolved if, for the > `MesosContainerizer`, we use UNIX domain sockets or named pipes instead of > TCP/IP sockets for the executor/agent communication.
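For readers unfamiliar with the proposal in MESOS-6240, the following is a minimal sketch of two local endpoints talking over a UNIX domain socket instead of TCP/IP. It is not Mesos code: the socket file name `agent.sock`, the `ack:` reply, and the helper names `serve_once` and `round_trip` are all hypothetical, and it assumes a POSIX host where `AF_UNIX` is available.

```python
import os
import socket
import tempfile
import threading


def serve_once(server: socket.socket) -> None:
    """Play the 'agent': accept one connection and acknowledge the message."""
    conn, _ = server.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"ack:" + request)


def round_trip(message: bytes) -> bytes:
    """Play the 'executor': send `message` over a UNIX socket, return the reply."""
    with tempfile.TemporaryDirectory() as directory:
        # Hypothetical socket path; no IP address or port is involved.
        path = os.path.join(directory, "agent.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            agent = threading.Thread(target=serve_once, args=(server,))
            agent.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as executor:
                executor.connect(path)
                executor.sendall(message)
                reply = executor.recv(1024)
            agent.join()
            return reply
```

Because the endpoint is a file on the shared host rather than a network address, no route between the container's CNI network and the host network is needed for the two sides to reach each other, which is the property the issue is after.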
[jira] [Commented] (MESOS-7103) Container Attach/Exec Improvements
[ https://issues.apache.org/jira/browse/MESOS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301695#comment-16301695 ] Jie Yu commented on MESOS-7103: --- Re-target for 1.6.0 as no progress has been made recently. > Container Attach/Exec Improvements > -- > > Key: MESOS-7103 > URL: https://issues.apache.org/jira/browse/MESOS-7103 > Project: Mesos > Issue Type: Epic >Reporter: Kevin Klues > Labels: tech-debt > > Most of the core changes required to add "container exec" and "container > attach" support to Mesos landed in the 1.2 release. However, some features > (such as actually integrating this support into the CLI) haven't quite landed > yet. > This Epic aims to capture the tickets that still need to be resolved before > we can consider work on this feature complete. It is targeted for the 1.3 > release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7141) Support hook scripts to customize actions for container's lifecycle
[ https://issues.apache.org/jira/browse/MESOS-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301696#comment-16301696 ] Jie Yu commented on MESOS-7141: --- Retarget to 1.6.0 as no progress has been made. > Support hook scripts to customize actions for container's lifecycle > --- > > Key: MESOS-7141 > URL: https://issues.apache.org/jira/browse/MESOS-7141 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Jason Lai >Assignee: Jason Lai > Labels: containerizer, hooks > > Inspired by [hooks | > https://github.com/opencontainers/runtime-spec/blob/master/config.md#hooks] > in [OCI's runtime spec | https://github.com/opencontainers/runtime-spec], it > would be great to have scripts hooked into the lifecycle of containers. > The OCI doc has specified 3 stages for hooking: > * Prestart > * Poststart > * Poststop > We can consider having the 3 stages to start with. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
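To make the three stages named in MESOS-7141 concrete, here is a rough sketch of how hooks might be ordered around a container's start and stop. It is an illustration only: the `ContainerLifecycle` class, its methods, and the use of Python callables in place of hook scripts are assumptions, not a proposed Mesos interface.

```python
from typing import Callable, Dict, List

Hook = Callable[[str], None]

# Stage names taken from the issue text (Prestart, Poststart, Poststop).
STAGES = ("prestart", "poststart", "poststop")


class ContainerLifecycle:
    """Hypothetical holder for hooks, keyed by lifecycle stage."""

    def __init__(self) -> None:
        self.hooks: Dict[str, List[Hook]] = {stage: [] for stage in STAGES}

    def register(self, stage: str, hook: Hook) -> None:
        if stage not in self.hooks:
            raise ValueError(f"unknown stage: {stage}")
        self.hooks[stage].append(hook)

    def _fire(self, stage: str, container_id: str) -> None:
        for hook in self.hooks[stage]:
            hook(container_id)

    def run(self, container_id: str, start: Hook, stop: Hook) -> None:
        """Run start/stop callbacks with the hooks fired around them."""
        self._fire("prestart", container_id)
        start(container_id)
        self._fire("poststart", container_id)
        stop(container_id)
        self._fire("poststop", container_id)


def demo() -> List[str]:
    """Record the order in which hooks and lifecycle actions run."""
    events: List[str] = []
    lifecycle = ContainerLifecycle()
    for stage in STAGES:
        lifecycle.register(
            stage, lambda cid, stage=stage: events.append(f"{stage}:{cid}"))
    lifecycle.run(
        "c1",
        start=lambda cid: events.append(f"start:{cid}"),
        stop=lambda cid: events.append(f"stop:{cid}"))
    return events
```

A real implementation would launch external scripts and decide what a failing hook means for the container; both questions are left open here.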
[jira] [Assigned] (MESOS-7103) Container Attach/Exec Improvements
[ https://issues.apache.org/jira/browse/MESOS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu reassigned MESOS-7103: - Assignee: (was: Kevin Klues) > Container Attach/Exec Improvements > -- > > Key: MESOS-7103 > URL: https://issues.apache.org/jira/browse/MESOS-7103 > Project: Mesos > Issue Type: Epic >Reporter: Kevin Klues > Labels: tech-debt > > Most of the core changes required to add "container exec" and "container > attach" support to Mesos landed in the 1.2 release. However, some features > (such as actually integrating this support into the CLI) haven't quite landed > yet. > This Epic aims to capture the tickets that still need to be resolved before > we can consider work on this feature complete. It is targeted for the 1.3 > release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7103) Container Attach/Exec Improvements
[ https://issues.apache.org/jira/browse/MESOS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7103: -- Target Version/s: 1.6.0 (was: 1.5.0) > Container Attach/Exec Improvements > -- > > Key: MESOS-7103 > URL: https://issues.apache.org/jira/browse/MESOS-7103 > Project: Mesos > Issue Type: Epic >Reporter: Kevin Klues > Labels: tech-debt > > Most of the core changes required to add "container exec" and "container > attach" support to Mesos landed in the 1.2 release. However, some features > (such as actually integrating this support into the CLI) haven't quite landed > yet. > This Epic aims to capture the tickets that still need to be resolved before > we can consider work on this feature complete. It is targeted for the 1.3 > release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
[ https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302032#comment-16302032 ] Jie Yu commented on MESOS-8350: --- Re-target this for 1.5.1 given the likelihood for this to happen is pretty rare and we do have a workaround for this. > Resource provider-capable agents not correctly synchronizing checkpointed > agent resources on reregistration > --- > > Key: MESOS-8350 > URL: https://issues.apache.org/jira/browse/MESOS-8350 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Critical > > For resource provider-capable agents the master does not re-send checkpointed > resources on agent reregistration; instead the checkpointed resources sent as > part of the {{ReregisterSlaveMessage}} should be used. > This is not what happens in reality. If e.g., checkpointing of an offer > operation fails and the agent fails over the checkpointed resources would, as > expected, not be reflected in the agent, but would still be assumed in the > master. > A workaround is to fail over the master which would lead to the newly elected > master bootstrapping agent state from {{ReregisterSlaveMessage}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration
[ https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8350: -- Target Version/s: 1.5.1 (was: 1.5.0) > Resource provider-capable agents not correctly synchronizing checkpointed > agent resources on reregistration > --- > > Key: MESOS-8350 > URL: https://issues.apache.org/jira/browse/MESOS-8350 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier >Priority: Critical > > For resource provider-capable agents the master does not re-send checkpointed > resources on agent reregistration; instead the checkpointed resources sent as > part of the {{ReregisterSlaveMessage}} should be used. > This is not what happens in reality. If e.g., checkpointing of an offer > operation fails and the agent fails over the checkpointed resources would, as > expected, not be reflected in the agent, but would still be assumed in the > master. > A workaround is to fail over the master which would lead to the newly elected > master bootstrapping agent state from {{ReregisterSlaveMessage}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8337) Invalid state transition attempted when agent is lost.
[ https://issues.apache.org/jira/browse/MESOS-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302039#comment-16302039 ] Jie Yu commented on MESOS-8337: --- [~jpe...@apache.org] who is working on this issue? Is that a blocker for 1.5.0? > Invalid state transition attempted when agent is lost. > -- > > Key: MESOS-8337 > URL: https://issues.apache.org/jira/browse/MESOS-8337 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: James Peach > > The change in MESOS-7215 can attempt to transition a task from {{FAILED}} to > {{LOST}} when removing a lost agent. This ends up triggering a {{CHECK}} that > was added in the same patch. > {noformat} > I1214 23:42:16.507931 22396 master.cpp:10155] Removing task > mobius-mloop-1512774555_3661616380-xxx with resources disk(allocated: *):200; > cpus(allocated: *):0.01; mem(allocated: *):200; ports(allocated: > *):[31068-31068, 31069-31069, 31072-31072] of framework > afcbfa05-7973-4ad3-8399-4153556a8fa9-3607 on agent > daceae53-448b-4349-8503-9dd8132a6828-S4 at slave(1)@17.147.52.220:5 > (magent0006.xxx.com) > F1214 23:42:16.507961 22396 master.hpp:2342] Check failed: task->state() == > TASK_UNREACHABLE || task->state() == TASK_LOST TASK_FAILED > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
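The failed check above fires because a task that is already {{TASK_FAILED}} is pushed towards {{TASK_LOST}}. One possible guard, sketched below, is to leave tasks in a terminal state alone when an agent is removed. This is not necessarily the fix adopted for MESOS-8337: the function name, the `partition_aware` flag, and the abbreviated set of terminal states are assumptions made for illustration.

```python
# Illustrative subset of terminal task states; the real list is longer.
TERMINAL_STATES = frozenset(
    {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_ERROR"})


def state_after_agent_loss(current: str, partition_aware: bool) -> str:
    """Return the state a task should have once its agent is removed.

    Tasks already in a terminal state keep it, so a FAILED -> LOST
    transition is never attempted.
    """
    if current in TERMINAL_STATES:
        return current
    # Assumption: frameworks that opted into partition awareness see
    # TASK_UNREACHABLE, all others see TASK_LOST.
    return "TASK_UNREACHABLE" if partition_aware else "TASK_LOST"
```

With such a guard, the invariant expressed by the {{CHECK}} (the resulting state is {{TASK_UNREACHABLE}} or {{TASK_LOST}}) would only be asserted for tasks that were still non-terminal.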
[jira] [Updated] (MESOS-8321) Validate that offer operations contain only master-known resource provider resources
[ https://issues.apache.org/jira/browse/MESOS-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8321: -- Target Version/s: 1.5.1 (was: 1.5.0) > Validate that offer operations contain only master-known resource provider > resources > > > Key: MESOS-8321 > URL: https://issues.apache.org/jira/browse/MESOS-8321 > Project: Mesos > Issue Type: Bug >Reporter: Benjamin Bannier > > We should update the master's offer operation validation to also check that > any offer operation only works with resources from known resource providers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8303) Add user doc for agent reconfiguration
[ https://issues.apache.org/jira/browse/MESOS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8303: -- Target Version/s: (was: 1.5.0) > Add user doc for agent reconfiguration > -- > > Key: MESOS-8303 > URL: https://issues.apache.org/jira/browse/MESOS-8303 > Project: Mesos > Issue Type: Documentation >Reporter: Vinod Kone >Assignee: Benno Evers > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8321) Validate that offer operations contain only master-known resource provider resources
[ https://issues.apache.org/jira/browse/MESOS-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302041#comment-16302041 ] Jie Yu commented on MESOS-8321: --- Re-target for 1.5.1 given that the RP-related feature is experimental. > Validate that offer operations contain only master-known resource provider > resources > > > Key: MESOS-8321 > URL: https://issues.apache.org/jira/browse/MESOS-8321 > Project: Mesos > Issue Type: Bug >Reporter: Benjamin Bannier > > We should update the master's offer operation validation to also check that > any offer operation only works with resources from known resource providers.
[jira] [Commented] (MESOS-8303) Add user doc for agent reconfiguration
[ https://issues.apache.org/jira/browse/MESOS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302042#comment-16302042 ] Jie Yu commented on MESOS-8303: --- Remove the target version 1.5.0 to unblock release. Please add doc asap! > Add user doc for agent reconfiguration > -- > > Key: MESOS-8303 > URL: https://issues.apache.org/jira/browse/MESOS-8303 > Project: Mesos > Issue Type: Documentation >Reporter: Vinod Kone >Assignee: Benno Evers > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8219) Validate that any offer operation is only applied on resources from a single provider
[ https://issues.apache.org/jira/browse/MESOS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302050#comment-16302050 ] Jie Yu commented on MESOS-8219: --- Re-target this for 1.5.1 given RP related features are experimental > Validate that any offer operation is only applied on resources from a single > provider > - > > Key: MESOS-8219 > URL: https://issues.apache.org/jira/browse/MESOS-8219 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Benjamin Bannier > > Offer operations can only be applied to resources from one single resource > provider. A number of places in the implementation assume that the provider > ID obtained from any {{Resource}} in an offer operation is equivalent to the > one from any other resource. We should update the master to validate that > invariant and reject malformed operations.
[jira] [Updated] (MESOS-8219) Validate that any offer operation is only applied on resources from a single provider
[ https://issues.apache.org/jira/browse/MESOS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8219: -- Target Version/s: 1.5.1 (was: 1.5.0) > Validate that any offer operation is only applied on resources from a single > provider > - > > Key: MESOS-8219 > URL: https://issues.apache.org/jira/browse/MESOS-8219 > Project: Mesos > Issue Type: Task > Components: master >Reporter: Benjamin Bannier > > Offer operations can only be applied to resources from one single resource > provider. A number of places in the implementation assume that the provider > ID obtained from any {{Resource}} in an offer operation is equivalent to the > one from any other resource. We should update the master to validate that > invariant and reject malformed operations.
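The invariant described in MESOS-8219 can be sketched as a small validation step. The code below is a stand-in, not the master's validation code: resources are modelled as plain dictionaries, and the `provider_id` key and the function name `validate_single_provider` are hypothetical.

```python
from typing import Iterable, Mapping, Optional


def validate_single_provider(
        resources: Iterable[Mapping[str, object]]) -> Optional[str]:
    """Return an error message if the resources span several providers.

    A resource without a provider ID is treated as its own group (None),
    so mixing it with provider resources is rejected as well.
    """
    provider_ids = {resource.get("provider_id") for resource in resources}
    if len(provider_ids) > 1:
        found = ", ".join(sorted(str(p) for p in provider_ids))
        return f"operation spans multiple resource providers: {found}"
    return None
```

Returning an error string rather than raising mirrors the common "validate, then reject the operation" flow, where the caller decides how to report the malformed operation.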
[jira] [Commented] (MESOS-8221) Use protobuf reflection to simplify downgrading of resources.
[ https://issues.apache.org/jira/browse/MESOS-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302048#comment-16302048 ] Jie Yu commented on MESOS-8221: --- [~bmahler], [~mcypark], is this a blocker for 1.5.0? > Use protobuf reflection to simplify downgrading of resources. > - > > Key: MESOS-8221 > URL: https://issues.apache.org/jira/browse/MESOS-8221 > Project: Mesos > Issue Type: Improvement > Components: agent >Reporter: Michael Park >Assignee: Michael Park > > We currently have a {{downgradeResources}} function which is called on every > {{repeated Resource}} field in every message that we checkpoint. We should > leverage > protobuf reflection to automatically downgrade any instances of {{Resource}} > within any > protobuf message. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8247) Executor registered message is lost
[ https://issues.apache.org/jira/browse/MESOS-8247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302046#comment-16302046 ] Jie Yu commented on MESOS-8247: --- [~abudnik], [~alexr], is this a blocker for 1.5.0? > Executor registered message is lost > --- > > Key: MESOS-8247 > URL: https://issues.apache.org/jira/browse/MESOS-8247 > Project: Mesos > Issue Type: Bug >Reporter: Andrei Budnik >Assignee: Andrei Budnik > > h3. Brief description of successful agent-executor communication. > The executor sends a `RegisterExecutorMessage` message to the agent during > the initialization step. The agent sends an `ExecutorRegisteredMessage` message as a > response to the executor in the `registerExecutor()` method. Whenever the executor > receives `ExecutorRegisteredMessage`, it prints an `Executor registered on > agent...` message to the stderr logs. > h3. Problem description. > The agent launches the built-in docker executor, which is stuck in the `STAGING` > state. > stderr logs of the docker executor: > {code} > I1114 23:03:17.919090 14322 exec.cpp:162] Version: 1.2.3 > {code} > They do not contain a message like `Executor registered on agent...`. At the > same time, the agent received `RegisterExecutorMessage` and sent a `runTask` message > to the executor. > The stdout logs consist of the same repeating message: > {code} > Received killTask for task ... > {code} > Also, the docker executor process has no child processes. > Currently, the executor [doesn't > attempt|https://github.com/apache/mesos/blob/2a253093ecdc7d743c9c0874d6e01b68f6a813e4/src/exec/exec.cpp#L320] > to launch a task if it is not registered at the agent, while [task > killing|https://github.com/apache/mesos/blob/2a253093ecdc7d743c9c0874d6e01b68f6a813e4/src/exec/exec.cpp#L343] > doesn't have such a check. > It looks like `ExecutorRegisteredMessage` has been lost.
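The asymmetry described in MESOS-8247 (launch is gated on registration, kill is not) can be modelled in a few lines. This is a toy model, not the code in `exec.cpp`: the class `DriverExecutor`, its handler names, and the two bookkeeping lists are invented for illustration.

```python
from typing import List


class DriverExecutor:
    """Toy model of an executor that has not yet been told it is registered."""

    def __init__(self) -> None:
        self.registered = False
        self.launched: List[str] = []
        self.kills_forwarded: List[str] = []

    def on_registered(self) -> None:
        self.registered = True

    def on_run_task(self, task_id: str) -> None:
        # Launch requests are dropped while the executor is unregistered.
        if not self.registered:
            return
        self.launched.append(task_id)

    def on_kill_task(self, task_id: str) -> None:
        # No registration check here: the kill is forwarded even though
        # the task was never launched, so there is nothing to kill.
        self.kills_forwarded.append(task_id)


def lost_registration_scenario() -> DriverExecutor:
    """Replay the reported sequence: registration reply lost, then run, then kill."""
    executor = DriverExecutor()
    executor.on_run_task("task-1")   # dropped, executor never registered
    executor.on_kill_task("task-1")  # accepted, but no task exists
    return executor
```

In this model the executor ends up with no launched task and a forwarded kill, which matches the reported symptoms: repeated `Received killTask` lines and no child process.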
[jira] [Updated] (MESOS-8041) Add a document for `cgroups/blkio` isolation
[ https://issues.apache.org/jira/browse/MESOS-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8041: -- Target Version/s: (was: 1.5.0) > Add a document for `cgroups/blkio` isolation > > > Key: MESOS-8041 > URL: https://issues.apache.org/jira/browse/MESOS-8041 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Qian Zhang >Assignee: Jason Lai > > Now that the Mesos agent supports {{cgroups/blkio}} isolation for > collecting blkio statistics, we need to add a document for it.
[jira] [Updated] (MESOS-8025) Update the master field in the new CLI config to accept a URL instead of an
[ https://issues.apache.org/jira/browse/MESOS-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8025: -- Target Version/s: 1.6.0 (was: 1.5.0) > Update the master field in the new CLI config to accept a URL instead of an > > - > > Key: MESOS-8025 > URL: https://issues.apache.org/jira/browse/MESOS-8025 > Project: Mesos > Issue Type: Improvement > Components: cli > Environment: This will be useful in cases where the master is behind > a proxy or when the master is sitting directly on port 80. >Reporter: Kevin Klues >Assignee: Armand Grillet > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8025) Update the master field in the new CLI config to accept a URL instead of an
[ https://issues.apache.org/jira/browse/MESOS-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302056#comment-16302056 ] Jie Yu commented on MESOS-8025: --- Retarget this as no progress has been made recently > Update the master field in the new CLI config to accept a URL instead of an > > - > > Key: MESOS-8025 > URL: https://issues.apache.org/jira/browse/MESOS-8025 > Project: Mesos > Issue Type: Improvement > Components: cli > Environment: This will be useful in cases where the master is behind > a proxy or when the master is sitting directly on port 80. >Reporter: Kevin Klues >Assignee: Armand Grillet > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7974) Accept "application/recordio" type is rejected for master operator API SUBSCRIBE call
[ https://issues.apache.org/jira/browse/MESOS-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7974: -- Target Version/s: 1.6.0 (was: 1.5.0) > Accept "application/recordio" type is rejected for master operator API > SUBSCRIBE call > - > > Key: MESOS-7974 > URL: https://issues.apache.org/jira/browse/MESOS-7974 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: James DeFelice > Labels: mesosphere > > The agent operator API supports "application/recordio" for things like > attach-container-output, which streams objects back to the caller. I expected > the master operator API SUBSCRIBE call to work the same way, w/ > Accept/Content-Type headers for "recordio" and > Message-Accept/Message-Content-Type headers for json (or protobuf). This was > not the case. > Looking again at the master operator API documentation, SUBSCRIBE docs > illustrate usage of Accept and Content-Type headers for the "application/json" > type, not a "recordio" type. So my experience, as per the docs, seems > expected. However, this is counter-intuitive since the whole point of adding > the new Message-prefixed headers was to help callers consistently request > (and differentiate) streaming responses from non-streaming responses in the > v1 API. > Please fix the master operator API implementation to also support the > Message-prefixed headers w/ Accept/Content-Type set to "recordio". > Observed on ubuntu w/ mesos package version 1.2.1-2.0.1
[jira] [Updated] (MESOS-8022) Add tests proving the HTTP authenticatee modularize.
[ https://issues.apache.org/jira/browse/MESOS-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-8022: -- Target Version/s: (was: 1.5.0) > Add tests proving the HTTP authenticatee modularize. > > > Key: MESOS-8022 > URL: https://issues.apache.org/jira/browse/MESOS-8022 > Project: Mesos > Issue Type: Improvement >Reporter: Till Toenshoff > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8041) Add a document for `cgroups/blkio` isolation
[ https://issues.apache.org/jira/browse/MESOS-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302055#comment-16302055 ] Jie Yu commented on MESOS-8041: --- Remove the target version. Can this be closed? > Add a document for `cgroups/blkio` isolation > > > Key: MESOS-8041 > URL: https://issues.apache.org/jira/browse/MESOS-8041 > Project: Mesos > Issue Type: Documentation > Components: documentation >Reporter: Qian Zhang >Assignee: Jason Lai > > Now that the Mesos agent supports {{cgroups/blkio}} isolation for > collecting blkio statistics, we need to add a document for it.
[jira] [Commented] (MESOS-7974) Accept "application/recordio" type is rejected for master operator API SUBSCRIBE call
[ https://issues.apache.org/jira/browse/MESOS-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302058#comment-16302058 ] Jie Yu commented on MESOS-7974: --- Re-target this to 1.6.0 as no progress has been made. [~vinodkone], can you take a look at this? > Accept "application/recordio" type is rejected for master operator API > SUBSCRIBE call > - > > Key: MESOS-7974 > URL: https://issues.apache.org/jira/browse/MESOS-7974 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: James DeFelice > Labels: mesosphere > > The agent operator API supports "application/recordio" for things like > attach-container-output, which streams objects back to the caller. I expected > the master operator API SUBSCRIBE call to work the same way, w/ > Accept/Content-Type headers for "recordio" and > Message-Accept/Message-Content-Type headers for json (or protobuf). This was > not the case. > Looking again at the master operator API documentation, SUBSCRIBE docs > illustrate usage of Accept and Content-Type headers for the "application/json" > type, not a "recordio" type. So my experience, as per the docs, seems > expected. However, this is counter-intuitive since the whole point of adding > the new Message-prefixed headers was to help callers consistently request > (and differentiate) streaming responses from non-streaming responses in the > v1 API. > Please fix the master operator API implementation to also support the > Message-prefixed headers w/ Accept/Content-Type set to "recordio". > Observed on ubuntu w/ mesos package version 1.2.1-2.0.1
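As background for MESOS-7974, the sketch below shows the header combination the reporter describes and a small decoder for a RecordIO-framed body, assuming the framing used by the Mesos v1 streaming APIs (the record length in bytes as ASCII decimal, a newline, then the record itself). The names `EXPECTED_HEADERS`, `encode_record`, and `decode_recordio` are illustrative, and the event payloads in the usage note are made-up examples, not captured output.

```python
from typing import List

# Header combination described in the issue: the outer stream is RecordIO,
# each framed message inside it is JSON.
EXPECTED_HEADERS = {
    "Accept": "application/recordio",
    "Message-Accept": "application/json",
}


def encode_record(record: bytes) -> bytes:
    """Frame one record as '<length>\\n<record bytes>'."""
    return str(len(record)).encode("ascii") + b"\n" + record


def decode_recordio(stream: bytes) -> List[bytes]:
    """Split a complete RecordIO byte string into its records."""
    records = []
    position = 0
    while position < len(stream):
        newline = stream.index(b"\n", position)
        length = int(stream[position:newline])
        start = newline + 1
        records.append(stream[start:start + length])
        position = start + length
    return records
```

For example, `decode_recordio(encode_record(b'{"type":"SUBSCRIBED"}') + encode_record(b'{"type":"HEARTBEAT"}'))` yields the two JSON payloads as separate byte strings. A real client would decode incrementally from a chunked HTTP response rather than from a complete buffer.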
[jira] [Commented] (MESOS-8022) Add tests proving the HTTP authenticatee modularize.
[ https://issues.apache.org/jira/browse/MESOS-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302057#comment-16302057 ] Jie Yu commented on MESOS-8022: --- Remove target version as tests shouldn't block a release. > Add tests proving the HTTP authenticatee modularize. > > > Key: MESOS-8022 > URL: https://issues.apache.org/jira/browse/MESOS-8022 > Project: Mesos > Issue Type: Improvement >Reporter: Till Toenshoff > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7967) Make `mesos-execute` work with old-style resources
[ https://issues.apache.org/jira/browse/MESOS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7967: -- Target Version/s: (was: 1.4.1, 1.5.0) > Make `mesos-execute` work with old-style resources > -- > > Key: MESOS-7967 > URL: https://issues.apache.org/jira/browse/MESOS-7967 > Project: Mesos > Issue Type: Improvement > Components: cli >Reporter: Michael Park > > {{mesos-execute}} should be updated to be able to handle the > "pre-reservation-refinement" resource format. > For reservation refinement, a new resource format was introduced. > The master and agent have been carefully updated to be able to handle > pre/post reservation-refinement resource formats, whereas the example > frameworks and {{mesos-execute}} were updated such that they require > the new resource format. While the example frameworks are probably fine > being updated to use the new format, {{mesos-execute}} is used as a > developer tool, and as such we should update it to be more robust in its > handling of resource formats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7967) Make `mesos-execute` work with old-style resources
[ https://issues.apache.org/jira/browse/MESOS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302061#comment-16302061 ] Jie Yu commented on MESOS-7967: --- [~mcypark], any plan to work on this? I removed the target versions for now. > Make `mesos-execute` work with old-style resources > -- > > Key: MESOS-7967 > URL: https://issues.apache.org/jira/browse/MESOS-7967 > Project: Mesos > Issue Type: Improvement > Components: cli >Reporter: Michael Park > > {{mesos-execute}} should be updated to be able to handle the > "pre-reservation-refinement" resource format. > For reservation refinement, a new resource format was introduced. > The master and agent have been carefully updated to be able to handle > pre/post reservation-refinement resource formats, whereas the example > frameworks and {{mesos-execute}} were updated such that they require > the new resource format. While the example frameworks are probably fine > being updated to use the new format, {{mesos-execute}} is used as a > developer tool, and as such we should update it to be more robust in its > handling of resource formats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7776) Document `MESOS_CONTAINER_IP`
[ https://issues.apache.org/jira/browse/MESOS-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7776: -- Target Version/s: 1.6.0 (was: 1.5.0) > Document `MESOS_CONTAINER_IP` > -- > > Key: MESOS-7776 > URL: https://issues.apache.org/jira/browse/MESOS-7776 > Project: Mesos > Issue Type: Documentation > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > > We introduced `MESOS_CONTAINER_IP` to inform tasks launched by the > default-executor about their container IP. This was done > primarily to break the dependency of the containers on `LIBPROCESS_IP` to > learn their IP addresses, which was misleading. > This change needs to be documented. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7949) Upgrade Mesos to C++14.
[ https://issues.apache.org/jira/browse/MESOS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302066#comment-16302066 ] Jie Yu commented on MESOS-7949: --- Retarget this to 1.6.0 > Upgrade Mesos to C++14. > --- > > Key: MESOS-7949 > URL: https://issues.apache.org/jira/browse/MESOS-7949 > Project: Mesos > Issue Type: Epic >Reporter: Michael Park > > Upgrading Mesos to C++14 will give us features such as > - Generic lambdas > - New lambda captures (Proper move captures) > - SFINAE result_of (We can remove {{stout/result_of.hpp}}) > - Variable templates > - Relaxed {{constexpr}} functions > - Simple utilities such as {{std::make_unique}} > - Metaprogramming facilities such as {{decay_t}}, {{index_sequence}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
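As a rough illustration of several of the C++14 features listed in the epic above (generic lambdas, move captures, variable templates, relaxed {{constexpr}}, {{std::make_unique}}, {{decay_t}}), here is a minimal, self-contained sketch. It is not taken from the Mesos code base; {{Buffer}}, {{factorial}}, {{is_integral_v}} and {{lengthsDemo}} are hypothetical names chosen for this example only.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical type, used only for this illustration.
struct Buffer
{
  explicit Buffer(std::string d) : data(std::move(d)) {}
  std::string data;
};

// Relaxed constexpr: local variables and loops are allowed in C++14.
constexpr int factorial(int n)
{
  int result = 1;
  for (int i = 2; i <= n; ++i) {
    result *= i;
  }
  return result;
}

// Variable template (a local helper, not the C++17 std:: one).
template <typename T>
constexpr bool is_integral_v = std::is_integral<T>::value;

// Metaprogramming alias from <type_traits>: decay_t strips the
// reference and the const qualifier.
static_assert(std::is_same<std::decay_t<const int&>, int>::value, "");

std::size_t lengthsDemo()
{
  // std::make_unique is new in C++14.
  std::unique_ptr<Buffer> buffer = std::make_unique<Buffer>("abc");

  // Init-capture moves `buffer` into the closure; `const auto&` makes
  // the lambda generic over anything that has a size() member.
  auto total = [b = std::move(buffer)](const auto& extra) {
    return b->data.size() + extra.size();
  };

  return total(std::string("de")) + total(std::vector<int>{1, 2, 3});
}
```

The sketch needs a compiler in C++14 mode or later (for example {{-std=c++14}}), which is what the related build task MESOS-7950 is about.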
[jira] [Commented] (MESOS-7958) The example framework `test-framework` is broken.
[ https://issues.apache.org/jira/browse/MESOS-7958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302063#comment-16302063 ] Jie Yu commented on MESOS-7958: --- Removed the target version. Is this still an issue? Can we close? > The example framework `test-framework` is broken. > - > > Key: MESOS-7958 > URL: https://issues.apache.org/jira/browse/MESOS-7958 > Project: Mesos > Issue Type: Bug > Components: framework >Affects Versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.2.0, 1.2.1, 1.2.2, 1.2.3, > 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.4.1 >Reporter: Michael Park > Attachments: screenshot-1.png > > > The {{test-framework}} example framework does not work. > Launching a cluster like so: > {code} > MESOS_RESOURCES="cpus:32;mem:512;disk:1024" MESOS_REGISTRY="in_memory" > ./bin/mesos-local.sh --num_slaves=1 --ip=127.0.0.1 --port=4040 > --work_dir=$HOME/mesos-local > {code} > and trying to launch the {{test-framework}} like so: > {code} > ./src/test-framework --master=127.0.0.1:4040 > {code} > {code} > /home/mpark/projects/mesos/build/src/.libs/test-executor: error while loading > shared libraries: libmesos-1.5.0.so: cannot open shared object file: No such > file or directory > {code} > It seems that {{test-executor}} cannot load {{libmesos.so}} correctly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7949) Upgrade Mesos to C++14.
[ https://issues.apache.org/jira/browse/MESOS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7949: -- Target Version/s: 1.6.0 (was: 1.5.0) > Upgrade Mesos to C++14. > --- > > Key: MESOS-7949 > URL: https://issues.apache.org/jira/browse/MESOS-7949 > Project: Mesos > Issue Type: Epic >Reporter: Michael Park > > Upgrading Mesos to C++14 will give us features such as > - Generic lambdas > - New lambda captures (Proper move captures) > - SFINAE result_of (We can remove {{stout/result_of.hpp}}) > - Variable templates > - Relaxed {{constexpr}} functions > - Simple utilities such as {{std::make_unique}} > - Metaprogramming facilities such as {{decay_t}}, {{index_sequence}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7950) Update autotools and CMake to build in C++14 mode.
[ https://issues.apache.org/jira/browse/MESOS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302065#comment-16302065 ] Jie Yu commented on MESOS-7950: --- Re-target this to 1.6.0. > Update autotools and CMake to build in C++14 mode. > -- > > Key: MESOS-7950 > URL: https://issues.apache.org/jira/browse/MESOS-7950 > Project: Mesos > Issue Type: Task > Components: build >Reporter: Michael Park > > Update the {{configure.ac}} for autotools, and > {{cmake/CompilationConfigure.cmake}} for CMake to build in C++14 mode. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7958) The example framework `test-framework` is broken.
[ https://issues.apache.org/jira/browse/MESOS-7958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7958: -- Target Version/s: (was: 1.5.0) > The example framework `test-framework` is broken. > - > > Key: MESOS-7958 > URL: https://issues.apache.org/jira/browse/MESOS-7958 > Project: Mesos > Issue Type: Bug > Components: framework >Affects Versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.2.0, 1.2.1, 1.2.2, 1.2.3, > 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.4.1 >Reporter: Michael Park > Attachments: screenshot-1.png > > > The {{test-framework}} example framework does not work. > Launching a cluster like so: > {code} > MESOS_RESOURCES="cpus:32;mem:512;disk:1024" MESOS_REGISTRY="in_memory" > ./bin/mesos-local.sh --num_slaves=1 --ip=127.0.0.1 --port=4040 > --work_dir=$HOME/mesos-local > {code} > and trying to launch the {{test-framework}} like so: > {code} > ./src/test-framework --master=127.0.0.1:4040 > {code} > {code} > /home/mpark/projects/mesos/build/src/.libs/test-executor: error while loading > shared libraries: libmesos-1.5.0.so: cannot open shared object file: No such > file or directory > {code} > It seems that {{test-executor}} cannot load {{libmesos.so}} correctly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7950) Update autotools and CMake to build in C++14 mode.
[ https://issues.apache.org/jira/browse/MESOS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7950: -- Target Version/s: 1.6.0 (was: 1.5.0) > Update autotools and CMake to build in C++14 mode. > -- > > Key: MESOS-7950 > URL: https://issues.apache.org/jira/browse/MESOS-7950 > Project: Mesos > Issue Type: Task > Components: build >Reporter: Michael Park > > Update the {{configure.ac}} for autotools, and > {{cmake/CompilationConfigure.cmake}} for CMake to build in C++14 mode. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7776) Document `MESOS_CONTAINER_IP`
[ https://issues.apache.org/jira/browse/MESOS-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302068#comment-16302068 ] Jie Yu commented on MESOS-7776: --- Retarget this to 1.6.0. [~avinash.mesos], do you still plan to work on this? > Document `MESOS_CONTAINER_IP` > -- > > Key: MESOS-7776 > URL: https://issues.apache.org/jira/browse/MESOS-7776 > Project: Mesos > Issue Type: Documentation > Components: containerization >Reporter: Avinash Sridharan >Assignee: Avinash Sridharan > > We introduced `MESOS_CONTAINER_IP` to inform tasks launched by the > default-executor about their container IP. This was done > primarily to break the dependency of the containers on `LIBPROCESS_IP` to > learn their IP addresses, which was misleading. > This change needs to be documented. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7903) Include in the DefaultExecutor logs the output of timed out checks
[ https://issues.apache.org/jira/browse/MESOS-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302067#comment-16302067 ] Jie Yu commented on MESOS-7903: --- Retarget this to 1.6.0 due to inactivity. > Include in the DefaultExecutor logs the output of timed out checks > -- > > Key: MESOS-7903 > URL: https://issues.apache.org/jira/browse/MESOS-7903 > Project: Mesos > Issue Type: Improvement >Reporter: Gastón Kleiman >Priority: Minor > Labels: check, default-executor > > Once the patches for https://issues.apache.org/jira/browse/MESOS-7861 land, > the output of successful and failed checks will be included in the > DefaultExecutor logs, but the output of timed out checks won't be included. > Right now the checker process sends the {{LAUNCH_NESTED_CONTAINER_SESSION}} > requests using {{streamed=false}}. Libprocess will then convert the streaming > response into a body (non-streamed) response, completing the future returned > by {{Connection::send()}} only once the response has been fully received. The > checker will then read the whole process output from the response's body and > log it. > However, when a check times out, the checker will close the connection before > the full response is received. So the future returned by > {{Connection::send()}} will fail, and the checker won't have access to > the response. > In order to log the output of timed out checks, we will probably need to make > the checker send the launch request with {{streamed=true}}, and then make it > read the check output from the pipe of the streamed response. > If we do that, we should probably turn the {{Future> > getProcessIOData(...)}} method from {{api_tests.cpp}} into a helper method > and use it in {{checker_process.cpp}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7903) Include in the DefaultExecutor logs the output of timed out checks
[ https://issues.apache.org/jira/browse/MESOS-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7903: -- Target Version/s: 1.6.0 (was: 1.5.0) > Include in the DefaultExecutor logs the output of timed out checks > -- > > Key: MESOS-7903 > URL: https://issues.apache.org/jira/browse/MESOS-7903 > Project: Mesos > Issue Type: Improvement >Reporter: Gastón Kleiman >Priority: Minor > Labels: check, default-executor > > Once the patches for https://issues.apache.org/jira/browse/MESOS-7861 land, > the output of successful and failed checks will be included in the > DefaultExecutor logs, but the output of timed out checks won't be included. > Right now the checker process sends the {{LAUNCH_NESTED_CONTAINER_SESSION}} > requests using {{streamed=false}}. Libprocess will then convert the streaming > response into a body (non-streamed) response, completing the future returned > by {{Connection::send()}} only once the response has been fully received. The > checker will then read the whole process output from the response's body and > log it. > However, when a check times out, the checker will close the connection before > the full response is received. So the future returned by > {{Connection::send()}} will fail, and the checker won't have access to > the response. > In order to log the output of timed out checks, we will probably need to make > the checker send the launch request with {{streamed=true}}, and then make it > read the check output from the pipe of the streamed response. > If we do that, we should probably turn the {{Future> > getProcessIOData(...)}} method from {{api_tests.cpp}} into a helper method > and use it in {{checker_process.cpp}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7691) Support local enabled cgroups subsystems automatically.
[ https://issues.apache.org/jira/browse/MESOS-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7691: -- Target Version/s: 1.6.0 (was: 1.5.0) > Support local enabled cgroups subsystems automatically. > --- > > Key: MESOS-7691 > URL: https://issues.apache.org/jira/browse/MESOS-7691 > Project: Mesos > Issue Type: Improvement > Components: cgroups >Reporter: Gilbert Song >Assignee: Gilbert Song > Labels: cgroups > > Currently, each cgroup subsystem needs to be turned on as an isolator, e.g., > "cgroups/blkio". Ideally, Mesos should be able to detect all locally enabled > cgroup subsystems and turn them on automatically (which we could call auto cgroups). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7691) Support local enabled cgroups subsystems automatically.
[ https://issues.apache.org/jira/browse/MESOS-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302071#comment-16302071 ] Jie Yu commented on MESOS-7691: --- Re-target this to 1.6.0. > Support local enabled cgroups subsystems automatically. > --- > > Key: MESOS-7691 > URL: https://issues.apache.org/jira/browse/MESOS-7691 > Project: Mesos > Issue Type: Improvement > Components: cgroups >Reporter: Gilbert Song >Assignee: Gilbert Song > Labels: cgroups > > Currently, each cgroup subsystem needs to be turned on as an isolator, e.g., > "cgroups/blkio". Ideally, Mesos should be able to detect all locally enabled > cgroup subsystems and turn them on automatically (which we could call auto cgroups). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7705) Reconsider restricting the resource format for frameworks.
[ https://issues.apache.org/jira/browse/MESOS-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302070#comment-16302070 ] Jie Yu commented on MESOS-7705: --- [~bmahler], [~mcypark], is this a blocker for 1.5.0? If not, can you retarget? > Reconsider restricting the resource format for frameworks. > -- > > Key: MESOS-7705 > URL: https://issues.apache.org/jira/browse/MESOS-7705 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Michael Park >Assignee: Michael Park > > We output the "endpoint" format through the endpoints > for backward compatibility of external tooling. A framework should be > able to use the result of an endpoint and pass it back to Mesos, > since the result was produced by Mesos. This is especially applicable > to the V1 API. We also allow the "pre-reservation-refinement" format > because existing "resources files" are written in that format, and > they should still be usable without modification. > This is probably too flexible however, since a framework without > a RESERVATION_REFINEMENT capability could make refined reservations > using the "post-reservation-refinement" format, although they wouldn't be > offered such resources. It still seems undesirable if anyone were to > run into it, and we should consider adding sensible restrictions. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8357) Example frameworks have an inconsistent UX.
[ https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301905#comment-16301905 ] Till Toenshoff commented on MESOS-8357: --- Our example frameworks are considered part of the infrastructure used for testing Mesos, but they are not part of the Mesos core distribution. My suggestion would be to consistently use the prefix {{TEST_}}, but not {{MESOS_}} and not {{DEFAULT_}}. > Example frameworks have an inconsistent UX. > --- > > Key: MESOS-8357 > URL: https://issues.apache.org/jira/browse/MESOS-8357 > Project: Mesos > Issue Type: Improvement >Reporter: Till Toenshoff >Assignee: Till Toenshoff > > Our example frameworks are a bit inconsistent when it comes to specifying > things like the framework principal / secret, etc. > Many of these examples have great value in testing a Mesos cluster. Unifying > the parameterization would improve the user experience when testing Mesos. > {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling > / disabling authentication. {{load_generator_framework}}, for example, > uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials > themselves are most commonly expected in the environment variables > {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}}, while in some cases we chose to > use {{MESOS_PRINCIPAL}} and {{MESOS_SECRET}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
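One way the unification suggested in the comment above could look is sketched below. This is only an assumption about a possible migration path, not what the example frameworks do: the variable name {{TEST_PRINCIPAL}} is hypothetical (the comment proposes only the {{TEST_}} prefix), and falling back to the existing {{DEFAULT_PRINCIPAL}} and {{MESOS_PRINCIPAL}} spellings is an illustrative choice. {{firstSetEnv}} and {{principal}} are made-up helper names.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Returns the value of the first environment variable in `names` that
// is set, or `fallback` if none of them is set.
std::string firstSetEnv(
    const std::vector<std::string>& names,
    const std::string& fallback = "")
{
  for (const std::string& name : names) {
    const char* value = std::getenv(name.c_str());
    if (value != nullptr) {
      return value;
    }
  }
  return fallback;
}

// Hypothetical unified lookup: prefer a TEST_-prefixed name, then fall
// back to the two spellings the issue says are in use today.
std::string principal()
{
  return firstSetEnv(
      {"TEST_PRINCIPAL", "DEFAULT_PRINCIPAL", "MESOS_PRINCIPAL"});
}
```

A framework's {{main()}} could call {{principal()}} once at startup; dropping the fallbacks later would only require shortening the list.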
[jira] [Updated] (MESOS-8357) Example frameworks have an inconsistent UX.
[ https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-8357: -- Affects Version/s: 1.5.0 > Example frameworks have an inconsistent UX. > --- > > Key: MESOS-8357 > URL: https://issues.apache.org/jira/browse/MESOS-8357 > Project: Mesos > Issue Type: Improvement >Affects Versions: 1.5.0 >Reporter: Till Toenshoff >Assignee: Till Toenshoff >Priority: Minor > > Our example frameworks are a bit inconsistent when it comes to specifying > things like the framework principal / secret, etc. > Many of these examples have great value in testing a Mesos cluster. Unifying > the parameterization would improve the user experience when testing Mesos. > {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling > / disabling authentication. {{load_generator_framework}}, for example, > uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials > themselves are most commonly expected in the environment variables > {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}}, while in some cases we chose to > use {{MESOS_PRINCIPAL}} and {{MESOS_SECRET}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8357) Example frameworks have an inconsistent UX.
[ https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff updated MESOS-8357: -- Priority: Minor (was: Major) > Example frameworks have an inconsistent UX. > --- > > Key: MESOS-8357 > URL: https://issues.apache.org/jira/browse/MESOS-8357 > Project: Mesos > Issue Type: Improvement >Affects Versions: 1.5.0 >Reporter: Till Toenshoff >Assignee: Till Toenshoff >Priority: Minor > > Our example frameworks are a bit inconsistent when it comes to specifying > things like the framework principal / secret, etc. > Many of these examples have great value in testing a Mesos cluster. Unifying > the parameterization would improve the user experience when testing Mesos. > {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling > / disabling authentication. {{load_generator_framework}}, for example, > uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials > themselves are most commonly expected in the environment variables > {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}}, while in some cases we chose to > use {{MESOS_PRINCIPAL}} and {{MESOS_SECRET}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7007) filesystem/shared and --default_container_info broken since 1.1
[ https://issues.apache.org/jira/browse/MESOS-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301881#comment-16301881 ] Jie Yu commented on MESOS-7007: --- I made an attempt to clean this up. I used the patch from [~jpepy] (https://reviews.apache.org/r/63598/), and added a follow-up patch (https://reviews.apache.org/r/64811/). > filesystem/shared and --default_container_info broken since 1.1 > --- > > Key: MESOS-7007 > URL: https://issues.apache.org/jira/browse/MESOS-7007 > Project: Mesos > Issue Type: Bug > Components: agent >Affects Versions: 1.1.0, 1.2.0 >Reporter: Pierre Cheynier >Assignee: Chun-Hung Hsiao > Labels: storage > > I am facing this issue, which prevents me from upgrading to 1.1.0 (the change was > therefore introduced in that version): > I'm using default_container_info to mount a /tmp volume in the container's > mount namespace from its current sandbox, meaning that each container has a > dedicated /tmp, thanks to the {{filesystem/shared}} isolator. > I noticed through our automation pipeline that integration tests were failing > and found that this is because the contents of /tmp (the one from the host!) are > trashed each time a container is created. > Here is my setup: > * > {{--isolation='cgroups/cpu,cgroups/mem,namespaces/pid,*disk/du,filesystem/shared,filesystem/linux*,docker/runtime'}} > * > {{--default_container_info='\{"type":"MESOS","volumes":\[\{"host_path":"tmp","container_path":"/tmp","mode":"RW"\}\]\}'}} > I discovered this issue in the early days of 1.1 (end of Nov, spoke with > someone on Slack), but unfortunately had no time to dig into the symptoms a > bit more. > I found nothing interesting even using GLOG_v=3. > Maybe a bad usage of isolators triggers this issue? If that's the > case, then at least a documentation update should be done. > Let me know if more information is needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7007) filesystem/shared and --default_container_info broken since 1.1
[ https://issues.apache.org/jira/browse/MESOS-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7007: -- Shepherd: Jie Yu (was: Gilbert Song) > filesystem/shared and --default_container_info broken since 1.1 > --- > > Key: MESOS-7007 > URL: https://issues.apache.org/jira/browse/MESOS-7007 > Project: Mesos > Issue Type: Bug > Components: agent >Affects Versions: 1.1.0, 1.2.0 >Reporter: Pierre Cheynier >Assignee: Chun-Hung Hsiao > Labels: storage > > I am facing this issue, which prevents me from upgrading to 1.1.0 (the change was > therefore introduced in that version): > I'm using default_container_info to mount a /tmp volume in the container's > mount namespace from its current sandbox, meaning that each container has a > dedicated /tmp, thanks to the {{filesystem/shared}} isolator. > I noticed through our automation pipeline that integration tests were failing > and found that this is because the contents of /tmp (the one from the host!) are > trashed each time a container is created. > Here is my setup: > * > {{--isolation='cgroups/cpu,cgroups/mem,namespaces/pid,*disk/du,filesystem/shared,filesystem/linux*,docker/runtime'}} > * > {{--default_container_info='\{"type":"MESOS","volumes":\[\{"host_path":"tmp","container_path":"/tmp","mode":"RW"\}\]\}'}} > I discovered this issue in the early days of 1.1 (end of Nov, spoke with > someone on Slack), but unfortunately had no time to dig into the symptoms a > bit more. > I found nothing interesting even using GLOG_v=3. > Maybe a bad usage of isolators triggers this issue? If that's the > case, then at least a documentation update should be done. > Let me know if more information is needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-8357) Example frameworks have an inconsistent UX.
Till Toenshoff created MESOS-8357: - Summary: Example frameworks have an inconsistent UX. Key: MESOS-8357 URL: https://issues.apache.org/jira/browse/MESOS-8357 Project: Mesos Issue Type: Improvement Reporter: Till Toenshoff Our example frameworks are a bit inconsistent when it comes to specifying things like the framework principal / secret, etc. Many of these examples have great value in testing a Mesos cluster. Unifying the parameterization would improve the user experience when testing Mesos. {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling / disabling authentication. {{load_generator_framework}}, for example, uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials themselves are most commonly expected in the environment variables {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}}, while in some cases we chose to use {{MESOS_PRINCIPAL}} and {{MESOS_SECRET}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-5362) Add authentication to example frameworks
[ https://issues.apache.org/jira/browse/MESOS-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff reassigned MESOS-5362: - Assignee: Till Toenshoff (was: Greg Mann) > Add authentication to example frameworks > > > Key: MESOS-5362 > URL: https://issues.apache.org/jira/browse/MESOS-5362 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Greg Mann >Assignee: Till Toenshoff > Labels: authentication, mesosphere, security > > Some example frameworks do not have the ability to authenticate with the > master. Adding authentication to the example frameworks that don't already > have it implemented would allow us to use these frameworks for testing in > authenticated/authorized scenarios. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-8357) Example frameworks have an inconsistent UX.
[ https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Toenshoff reassigned MESOS-8357: - Assignee: Till Toenshoff > Example frameworks have an inconsistent UX. > --- > > Key: MESOS-8357 > URL: https://issues.apache.org/jira/browse/MESOS-8357 > Project: Mesos > Issue Type: Improvement >Reporter: Till Toenshoff >Assignee: Till Toenshoff > > Our example frameworks are a bit inconsistent when it comes to specifying > things like the framework principal / secret, etc. > Many of these examples have great value in testing a Mesos cluster. Unifying > the parameterization would improve the user experience when testing Mesos. > {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling > / disabling authentication. {{load_generator_framework}}, for example, > uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials > themselves are most commonly expected in the environment variables > {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}}, while in some cases we chose to > use {{MESOS_PRINCIPAL}} and {{MESOS_SECRET}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7643) The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically
[ https://issues.apache.org/jira/browse/MESOS-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302072#comment-16302072 ] Jie Yu commented on MESOS-7643: --- [~jpe...@apache.org], do you have a patch ready for this? Would be nice to fix in 1.5.0. > The order of isolators provided in '--isolation' flag is not preserved and > instead sorted alphabetically > > > Key: MESOS-7643 > URL: https://issues.apache.org/jira/browse/MESOS-7643 > Project: Mesos > Issue Type: Bug > Components: containerization >Affects Versions: 1.1.2, 1.2.0, 1.3.0 >Reporter: Michael Cherny >Assignee: James Peach >Priority: Critical > Labels: isolation > > According to the documentation and comments in the code, the order of the entries in > the --isolation flag should specify the ordering of the isolators. > Specifically, the `create` and `prepare` calls for each isolator should run > serially in the order in which they appear in the --isolation flag, while the > `cleanup` call should be serialized in reverse order (with the exception of the > filesystem isolator, which is always first). > But in fact, the isolators provided in the '--isolation' flag are sorted > alphabetically. > That happens in [this line of > code|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L377]. > This line uses a 'set' (apparently instead of a list or > vector), and a set is a sorted container. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
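The behaviour described in the report above is easy to reproduce in isolation. The sketch below is not the Mesos containerizer code; {{tokenize}}, {{viaSet}} and {{viaOrderedDedup}} are hypothetical helpers that only contrast deduplicating through a sorted {{std::set}} with an order-preserving alternative.

```cpp
#include <set>
#include <string>
#include <unordered_set>
#include <vector>

// Splits a comma-separated flag value such as "b,a,c" into tokens,
// skipping empty entries.
std::vector<std::string> tokenize(const std::string& flag)
{
  std::vector<std::string> tokens;
  std::string current;
  for (char c : flag) {
    if (c == ',') {
      if (!current.empty()) {
        tokens.push_back(current);
      }
      current.clear();
    } else {
      current += c;
    }
  }
  if (!current.empty()) {
    tokens.push_back(current);
  }
  return tokens;
}

// Deduplicates through std::set. A set iterates in sorted order, so
// the order given on the command line is lost.
std::vector<std::string> viaSet(const std::string& flag)
{
  const std::vector<std::string> tokens = tokenize(flag);
  const std::set<std::string> unique(tokens.begin(), tokens.end());
  return std::vector<std::string>(unique.begin(), unique.end());
}

// Deduplicates while keeping the first-seen order of each entry.
std::vector<std::string> viaOrderedDedup(const std::string& flag)
{
  std::vector<std::string> result;
  std::unordered_set<std::string> seen;
  for (const std::string& token : tokenize(flag)) {
    if (seen.insert(token).second) {
      result.push_back(token);
    }
  }
  return result;
}
```

With the flag value {{namespaces/pid,cgroups/mem,cgroups/cpu}}, {{viaSet}} yields the entries alphabetically, while {{viaOrderedDedup}} keeps them as written. Whether the actual fix takes this shape is up to the patch author.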
[jira] [Updated] (MESOS-7607) Support for first-class fault domains.
[ https://issues.apache.org/jira/browse/MESOS-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7607: -- Target Version/s: 1.6.0 (was: 1.5.0) > Support for first-class fault domains. > -- > > Key: MESOS-7607 > URL: https://issues.apache.org/jira/browse/MESOS-7607 > Project: Mesos > Issue Type: Epic >Reporter: Neil Conway >Assignee: Neil Conway > Labels: mesosphere > > Mesos should support a first-class notion of "fault domains", which > effectively provide a common vocabulary for describing the region and zone > where a node (either master or agent) is located. > Design doc: > https://drive.google.com/open?id=1gEugdkLRbBsqsiFv3urRPRNrHwUC-i1HwfFfHR_MvC8 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7563) Make the HTTP command executor the default implementation.
[ https://issues.apache.org/jira/browse/MESOS-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7563: -- Target Version/s: 1.6.0 (was: 1.5.0) > Make the HTTP command executor the default implementation. > -- > > Key: MESOS-7563 > URL: https://issues.apache.org/jira/browse/MESOS-7563 > Project: Mesos > Issue Type: Epic >Reporter: Anand Mazumdar > > This epic tracks the work needed to make HTTP command executors the default > i.e., enable the {{http_command_executor}} flag. Currently, all command > executors use the old executor driver implementation. With this flag being > always enabled, the command executors would use the v1 HTTP API. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-6394) Improvements to partition-aware Mesos frameworks.
[ https://issues.apache.org/jira/browse/MESOS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu reassigned MESOS-6394: - Assignee: Jie Yu (was: Neil Conway) > Improvements to partition-aware Mesos frameworks. > - > > Key: MESOS-6394 > URL: https://issues.apache.org/jira/browse/MESOS-6394 > Project: Mesos > Issue Type: Epic > Components: master >Reporter: Alexander Rukletsov >Assignee: Jie Yu > Labels: mesosphere > > This is a follow up epic to MESOS-5344 to capture further improvements and > changes that need to be made to the MVP. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7404) Ensure hierarchical roles work with old Mesos agents
[ https://issues.apache.org/jira/browse/MESOS-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302076#comment-16302076 ] Jie Yu commented on MESOS-7404: --- Retargeted due to inactivity. > Ensure hierarchical roles work with old Mesos agents > > > Key: MESOS-7404 > URL: https://issues.apache.org/jira/browse/MESOS-7404 > Project: Mesos > Issue Type: Bug >Reporter: Neil Conway >Assignee: Jie Yu > Labels: mesosphere > > If the Mesos master supports hierarchical roles but the agent does not, we > need to ensure that we avoid putting the agent into a bad state, e.g., if the > user creates a persistent volume. > One approach is to use an agent capability for hierarchical roles, and > disallow creating persistent volumes using a hierarchical role if the agent > doesn't have the capability. We could also use an agent version check, > although until MESOS-6975 is implemented, that will be a bit awkward.
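The capability-based approach described in MESOS-7404 could look roughly like the following. This is only an illustrative sketch in Python, not Mesos code: the `Agent` class, the `validate_create_volume` helper, and the `HIERARCHICAL_ROLE` capability string are hypothetical names chosen for the example, and the check assumes a role is hierarchical when it contains a "/" separator.

```python
from dataclasses import dataclass, field

# Hypothetical capability name, used only for this illustration.
HIERARCHICAL_ROLE = "HIERARCHICAL_ROLE"


@dataclass
class Agent:
    """Minimal stand-in for an agent and the capabilities it advertises."""
    agent_id: str
    capabilities: set = field(default_factory=set)


def is_hierarchical(role: str) -> bool:
    """Assumption: a role such as "eng/backend" is hierarchical; "eng" is not."""
    return "/" in role


def validate_create_volume(agent: Agent, role: str):
    """Return an error message if the request must be rejected, else None."""
    if is_hierarchical(role) and HIERARCHICAL_ROLE not in agent.capabilities:
        return (
            f"agent {agent.agent_id} lacks the {HIERARCHICAL_ROLE} capability; "
            f"refusing to create a persistent volume for role '{role}'"
        )
    return None
```

With this shape, an old agent (no capability) still accepts volumes for flat roles, and only requests that use a hierarchical role are rejected before they reach the agent.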
[jira] [Assigned] (MESOS-6623) Re-enable tests impacted by request streaming support
[ https://issues.apache.org/jira/browse/MESOS-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu reassigned MESOS-6623: - Assignee: (was: Anand Mazumdar) > Re-enable tests impacted by request streaming support > - > > Key: MESOS-6623 > URL: https://issues.apache.org/jira/browse/MESOS-6623 > Project: Mesos > Issue Type: Bug > Components: HTTP API, test >Reporter: Anand Mazumdar >Priority: Critical > Labels: mesosphere > > We added support for HTTP request streaming in libprocess as part of > MESOS-6466. However, this broke a few tests that relied on HTTP request > filtering since the handlers no longer have access to the body of the request > when {{visit()}} is invoked. We would need to revisit how we do HTTP request > filtering and then re-enable these tests.
[jira] [Updated] (MESOS-7317) Add master endpoint to deactivate / activate agent
[ https://issues.apache.org/jira/browse/MESOS-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7317: -- Target Version/s: 1.6.0 (was: 1.5.0) > Add master endpoint to deactivate / activate agent > -- > > Key: MESOS-7317 > URL: https://issues.apache.org/jira/browse/MESOS-7317 > Project: Mesos > Issue Type: Improvement > Components: agent, master >Reporter: Neil Conway > Labels: mesosphere > > This would allow the operator to deactivate and then subsequently activate an > agent. The allocator does not make offers for deactivated agents; this > functionality would be useful to help operators "manually (incrementally) > drain" the tasks running on an agent, e.g., before taking the agent down. > At present, if the operator causes a framework to kill a task running on the > agent, the framework will often receive an offer for the unused resources on > the agent, which will often result in respawning the killed task on the same > agent.
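The draining behaviour motivating MESOS-7317 can be illustrated with a toy model. This is a sketch under stated assumptions, not the Mesos allocator: the `ToyAllocator` class and its `deactivate`, `activate`, and `offers` methods are hypothetical, and resources are reduced to a single CPU count per agent.

```python
class ToyAllocator:
    """Toy model in which offers are made only for active agents."""

    def __init__(self, agents):
        # Map of agent ID -> unused CPUs on that agent.
        self.unused = dict(agents)
        self.deactivated = set()

    def deactivate(self, agent_id):
        """Stop offering this agent's resources (e.g., before draining it)."""
        self.deactivated.add(agent_id)

    def activate(self, agent_id):
        """Resume offering this agent's resources."""
        self.deactivated.discard(agent_id)

    def offers(self):
        """Return {agent_id: cpus} for active agents with unused resources."""
        return {
            agent_id: cpus
            for agent_id, cpus in self.unused.items()
            if cpus > 0 and agent_id not in self.deactivated
        }


allocator = ToyAllocator({"agent-1": 4, "agent-2": 2})
allocator.deactivate("agent-1")
print(allocator.offers())  # {'agent-2': 2}
```

Because the deactivated agent never appears in an offer, a task killed on it cannot be respawned there, which is the effect the operator wants while draining.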
[jira] [Assigned] (MESOS-7404) Ensure hierarchical roles work with old Mesos agents
[ https://issues.apache.org/jira/browse/MESOS-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu reassigned MESOS-7404: - Assignee: Jie Yu (was: Neil Conway) > Ensure hierarchical roles work with old Mesos agents > > > Key: MESOS-7404 > URL: https://issues.apache.org/jira/browse/MESOS-7404 > Project: Mesos > Issue Type: Bug >Reporter: Neil Conway >Assignee: Jie Yu > Labels: mesosphere > > If the Mesos master supports hierarchical roles but the agent does not, we > need to ensure that we avoid putting the agent into a bad state, e.g., if the > user creates a persistent volume. > One approach is to use an agent capability for hierarchical roles, and > disallow creating persistent volumes using a hierarchical role if the agent > doesn't have the capability. We could also use an agent version check, > although until MESOS-6975 is implemented, that will be a bit awkward.
[jira] [Updated] (MESOS-7473) Use "-dev" prerelease label for version during development
[ https://issues.apache.org/jira/browse/MESOS-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7473: -- Target Version/s: 1.6.0 (was: 1.5.0) > Use "-dev" prerelease label for version during development > -- > > Key: MESOS-7473 > URL: https://issues.apache.org/jira/browse/MESOS-7473 > Project: Mesos > Issue Type: Task >Reporter: Neil Conway >Assignee: Neil Conway > Labels: mesosphere > > Prior discussion: > https://lists.apache.org/thread.html/6e291c504fd44b79e452744b80073cb33adc1be85c17e22bbca35a6c@%3Cdev.mesos.apache.org%3E > https://lists.apache.org/thread.html/eb526c9295b3cf8e4efc7e0a7d2dacabb61ab5ed867a05e7d913d3fb@%3Cdev.mesos.apache.org%3E
[jira] [Updated] (MESOS-7404) Ensure hierarchical roles work with old Mesos agents
[ https://issues.apache.org/jira/browse/MESOS-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7404: -- Target Version/s: 1.6.0 (was: 1.5.0) > Ensure hierarchical roles work with old Mesos agents > > > Key: MESOS-7404 > URL: https://issues.apache.org/jira/browse/MESOS-7404 > Project: Mesos > Issue Type: Bug >Reporter: Neil Conway >Assignee: Neil Conway > Labels: mesosphere > > If the Mesos master supports hierarchical roles but the agent does not, we > need to ensure that we avoid putting the agent into a bad state, e.g., if the > user creates a persistent volume. > One approach is to use an agent capability for hierarchical roles, and > disallow creating persistent volumes using a hierarchical role if the agent > doesn't have the capability. We could also use an agent version check, > although until MESOS-6975 is implemented, that will be a bit awkward.
[jira] [Updated] (MESOS-7428) Report exit code of tasks from default and command executors
[ https://issues.apache.org/jira/browse/MESOS-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7428: -- Target Version/s: 1.6.0 (was: 1.5.0) > Report exit code of tasks from default and command executors > > > Key: MESOS-7428 > URL: https://issues.apache.org/jira/browse/MESOS-7428 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Zhitao Li >Assignee: Zhitao Li > > Use case: some tasks should only be retried if the exit code matches certain > user requirements. > According to [~gilbert], we already checkpoint the exit code in the containerizer, > so we need to clarify how to report the exit code for executor containers > vs. nested containers, and we should do this consistently for the command and > default executors.
[jira] [Updated] (MESOS-7426) Support for agent lifecycle management.
[ https://issues.apache.org/jira/browse/MESOS-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7426: -- Target Version/s: 1.6.0 (was: 1.5.0) > Support for agent lifecycle management. > --- > > Key: MESOS-7426 > URL: https://issues.apache.org/jira/browse/MESOS-7426 > Project: Mesos > Issue Type: Epic > Components: agent >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: agent-lifecycle, mesosphere > > This epic coordinates the work for introducing agent lifecycle management in > Mesos, allowing a framework to be notified in case of agent node failures. The > existing {{Event::Failure}} is not enough for frameworks to know that the > given agent node isn't ever coming back. > The primary motivations for introducing such a feature are: > - Currently, when an agent running a task fails, operator intervention > (a manual step) is needed to remove the node via a > configuration API exposed by the framework, e.g., dcos cassandra node replace > for the cassandra framework. This needs to be done once for every stateful > framework running on the cluster. > - When an agent is marked as unhealthy, the removal rate is bounded if the > `--agent_removal_rate_limit` option is set. This is specifically problematic > for operators relying on EC2 autoscaling groups or for workload bursting to > another cloud. > - When the fault domain associated with an agent changes (e.g., it is moved > from an unallocated rack to an allocated rack), there is no feedback > mechanism for the framework.
[jira] [Updated] (MESOS-7278) Implement configuration reader/writer for the new CLI
[ https://issues.apache.org/jira/browse/MESOS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-7278: -- Target Version/s: 1.6.0 (was: 1.5.0) > Implement configuration reader/writer for the new CLI > - > > Key: MESOS-7278 > URL: https://issues.apache.org/jira/browse/MESOS-7278 > Project: Mesos > Issue Type: Task > Components: cli >Affects Versions: 1.3.0 >Reporter: Eric Chung >Assignee: Eric Chung >
[jira] [Updated] (MESOS-6623) Re-enable tests impacted by request streaming support
[ https://issues.apache.org/jira/browse/MESOS-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-6623: -- Target Version/s: (was: 1.5.0) > Re-enable tests impacted by request streaming support > - > > Key: MESOS-6623 > URL: https://issues.apache.org/jira/browse/MESOS-6623 > Project: Mesos > Issue Type: Bug > Components: HTTP API, test >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar >Priority: Critical > Labels: mesosphere > > We added support for HTTP request streaming in libprocess as part of > MESOS-6466. However, this broke a few tests that relied on HTTP request > filtering since the handlers no longer have access to the body of the request > when {{visit()}} is invoked. We would need to revisit how we do HTTP request > filtering and then re-enable these tests.