[jira] [Created] (MESOS-8355) "expression with side effects has no effect in an unevaluated context" on Ubuntu 16.04

2017-12-22 Thread Armand Grillet (JIRA)
Armand Grillet created MESOS-8355:
-

 Summary: "expression with side effects has no effect in an 
unevaluated context" on Ubuntu 16.04
 Key: MESOS-8355
 URL: https://issues.apache.org/jira/browse/MESOS-8355
 Project: Mesos
  Issue Type: Bug
Reporter: Armand Grillet
 Attachments: ubuntu-16.04-clang.txt

Following https://reviews.apache.org/r/62287/, building Mesos on Ubuntu 16.04 
with Clang fails:

{code}
00:13:42 creating 
build/bdist.linux-x86_64/wheel/mesos.scheduler-1.5.0.dist-info/WHEEL
00:13:46 make  dynamic-reservation-framework test-http-framework test-framework 
test-executor test-http-executor long-lived-framework long-lived-executor 
no-executor-framework docker-no-executor-framework balloon-framework 
balloon-executor load-generator-framework persistent-volume-framework 
disk-full-framework  test-helper mesos-tests examples/java/test-executor 
examples/java/test-exception-framework examples/java/test-framework 
examples/java/test-log examples/java/test-multiple-executors-framework 
examples/java/v1-test-framework examples/python/test_executor.py 
examples/python/test-executor examples/python/test_framework.py 
examples/python/test-framework \
00:13:46   tests/balloon_framework_test.sh tests/disk_full_framework_test.sh 
tests/dynamic_reservation_framework_test.sh tests/java_exception_test.sh 
tests/java_framework_test.sh tests/java_log_test.sh 
tests/java_v0_framework_test.sh tests/java_v1_framework_test.sh 
tests/no_executor_framework_test.sh tests/persistent_volume_framework_test.sh 
tests/python_framework_test.sh tests/test_http_framework_test.sh 
tests/test_framework_test.sh
00:13:47 make[3]: Entering directory 
'/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:13:47   CXXLD  dynamic-reservation-framework
00:13:47   CXXLD  test-http-framework
00:13:49   CXXLD  test-framework
00:13:49   CXXLD  test-executor
00:13:51   CXXLD  test-http-executor
00:13:51   CXXLD  long-lived-framework
00:13:52   CXXLD  long-lived-executor
00:13:53   CXXLD  no-executor-framework
00:13:54   CXXLD  docker-no-executor-framework
00:13:54   CXXLD  balloon-framework
00:13:56   CXXLD  balloon-executor
00:13:56   CXXLD  load-generator-framework
00:13:58   CXXLD  persistent-volume-framework
00:13:58   CXXLD  disk-full-framework
00:14:00   CXX  tests/test_helper-active_user_test_helper.o
00:14:00   CXX  tests/test_helper-flags.o
00:14:00   CXX  tests/test_helper-http_server_test_helper.o
00:14:00   CXX  tests/test_helper-kill_policy_test_helper.o
00:14:00   CXX  tests/test_helper-resources_utils.o
00:14:00   CXX  tests/test_helper-test_helper_main.o
00:14:00   CXX  tests/test_helper-utils.o
00:14:00   CXX  tests/containerizer/test_helper-memory_test_helper.o
00:14:00   CXX  tests/containerizer/test_helper-capabilities_test_helper.o
00:14:00   CXX  tests/containerizer/test_helper-setns_test_helper.o
00:14:00   CXX  tests/mesos_tests-log_tests.o
00:14:01   CXX  tests/mesos_tests-master_authorization_tests.o
00:14:27 ../../src/tests/log_tests.cpp:2439:120: error: expression with side 
effects has no effect in an unevaluated context 
[-Werror,-Wunevaluated-expression]
00:14:27 switch (0) case 0: default: if (const ::testing::AssertionResult 
gtest_ar = (::testing::internal:: 
EqHelper<(sizeof(::testing::internal::IsNullLiteralHelper(stringify(position++)))
 == 1)>::Compare("stringify(position++)", "entry.data", stringify(position++), 
entry.data))) ; else 
::testing::internal::AssertHelper(::testing::TestPartResult::kNonFatalFailure, 
"../../src/tests/log_tests.cpp", 2439, gtest_ar.failure_message()) = 
::testing::Message();
00:14:27
^
00:14:27 1 error generated.
00:14:27 Makefile:10317: recipe for target 'tests/mesos_tests-log_tests.o' 
failed
00:14:27 make[3]: *** [tests/mesos_tests-log_tests.o] Error 1
00:14:27 make[3]: *** Waiting for unfinished jobs
00:14:54 make[3]: Leaving directory 
'/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:14:54 Makefile:13776: recipe for target 'check-am' failed
00:14:54 make[2]: *** [check-am] Error 2
00:14:54 make[2]: Leaving directory 
'/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:14:54 Makefile:13780: recipe for target 'check' failed
00:14:54 make[1]: *** [check] Error 2
00:14:54 make[1]: Leaving directory 
'/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/Clang/label/mesos-ec2-ubuntu-16.04/mesos/build/src'
00:14:54 Makefile:774: recipe for target 'check-recursive' failed
00:14:54 make: *** [check-recursive] Error 1
00:14:55 Build step 'Conditional step (single)' marked build as failure
00:14:55 
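
The error arises because the expanded `EXPECT_EQ` shown in the log places its first argument inside `sizeof(...)` as well as in the call to `Compare`. The operand of `sizeof` is an unevaluated context, so the `position++` written there never runs, which is what Clang's `-Wunevaluated-expression` reports. The sketch below reproduces the effect in isolation and shows one possible workaround, hoisting the increment out of the asserted expression. It is an illustration only: `counterAfterSizeof` and `nextLabel` are hypothetical names, `std::to_string` stands in for Mesos's `stringify`, and this is not necessarily how the test was fixed.

```cpp
#include <cstddef>
#include <string>

// Returns the value of a counter that was "incremented" only inside sizeof.
// The operand of sizeof is never evaluated, so the increment does not happen.
inline int counterAfterSizeof()
{
  int position = 0;
  std::size_t size = sizeof(position++); // Unevaluated: `position` stays 0.
  (void) size;                           // Silence the unused-variable warning.
  return position;
}

// Possible workaround (hypothetical helper): perform the side effect in an
// ordinary statement, then hand a side-effect-free value to the assertion.
inline std::string nextLabel(int& position)
{
  const int current = position;
  ++position;                      // Evaluated exactly once.
  return std::to_string(current);  // Stand-in for stringify(current).
}
```

Compiled with Clang, `counterAfterSizeof` itself triggers the same diagnostic as a warning, so this sketch should not be built with `-Werror`. In a test, the result of `nextLabel(position)` would be stored in a local variable first and that variable passed to `EXPECT_EQ`.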

[jira] [Updated] (MESOS-8355) "expression with side effects has no effect in an unevaluated context" when building Mesos on Ubuntu 16.04 (Clang)

2017-12-22 Thread Armand Grillet (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Armand Grillet updated MESOS-8355:
--
Summary: "expression with side effects has no effect in an unevaluated 
context" when building Mesos on Ubuntu 16.04 (Clang)  (was: "expression with 
side effects has no effect in an unevaluated context" on Ubuntu 16.04)


[jira] [Updated] (MESOS-7550) Publish Local Resource Provider resources in the agent before container launch or update.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-7550:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Publish Local Resource Provider resources in the agent before container 
> launch or update.
> -
>
> Key: MESOS-7550
> URL: https://issues.apache.org/jira/browse/MESOS-7550
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> The agent will ask the RP manager to publish the resources before a 
> container can start to use them. The SLRP (storage local resource provider) 
> will be responsible for making sure the CSI volume is made available on the 
> host. This will involve calling the `ControllerPublishVolume` and 
> `NodePublishVolume` RPCs of the CSI plugin.
> This will happen when a workload (i.e., a task or executor) that uses a CSI 
> volume as a persistent volume is being launched on the agent. During the 
> creation of a CSI volume, the SLRP will generate a fixed mount point under 
> the agent's work directory based on the ID of the CSI volume, and store the 
> mount point in the `Resource.disk.source.path.root` or 
> `Resource.disk.source.path.mount` field. Prior to a workload launch, the SLRP 
> will mount the CSI volume at that path, and the Docker containerizer or the 
> Mesos containerizer will then bind-mount the volume into the container of 
> the workload. Since the containerizers know nothing about resource 
> providers, they would extract the mount point of the CSI volume from the 
> `Resource.disk.source.path.root` or `Resource.disk.source.path.mount` field.
> For a storage local resource provider, the agent's work directory is known 
> during the creation of the CSI volume, since the volume will be created and 
> used on the same agent. However, in the case of a storage external resource 
> provider, where a CSI volume might be created on one agent X and published 
> on another agent Y, the work directory of agent Y might not be known when 
> the CSI volume is created on X. To support this in the future, we introduce 
> new semantics for `Resource.disk.source.path.root` and 
> `Resource.disk.source.path.mount`: if these fields are set to relative 
> paths, they are relative to the agent's work directory, so the containerizer 
> can extract the mount point by prefixing the relative paths with the agent's 
> work directory.
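
The relative-path semantics described above can be sketched as a small helper. This is a minimal illustration, not Mesos code: `resolveMountPoint`, its parameters, and the example paths are all hypothetical, and it assumes POSIX-style paths where a leading `/` marks an absolute path.

```cpp
#include <string>

// Resolves the path stored in `Resource.disk.source.path.root` or
// `Resource.disk.source.path.mount`. Absolute paths are returned unchanged;
// relative paths are taken to be relative to the agent's work directory.
// (Hypothetical helper; names are not taken from the Mesos code base.)
inline std::string resolveMountPoint(
    const std::string& workDir,
    const std::string& path)
{
  if (!path.empty() && path.front() == '/') {
    return path;
  }

  // Avoid a doubled separator when the work directory ends with '/'.
  if (!workDir.empty() && workDir.back() == '/') {
    return workDir + path;
  }

  return workDir + "/" + path;
}
```

For example, with a work directory of `/var/lib/mesos`, the relative path `csi/volumes/vol-1` would resolve to `/var/lib/mesos/csi/volumes/vol-1`, while an absolute path would be returned unchanged.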



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-8265) Add state recovery for storage local resource provider.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8265:
--
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, 
Mesosphere Sprint 71  (was: Mesosphere Sprint 68, Mesosphere Sprint 69, 
Mesosphere Sprint 70)

> Add state recovery for storage local resource provider.
> ---
>
> Key: MESOS-8265
> URL: https://issues.apache.org/jira/browse/MESOS-8265
> Project: Mesos
>  Issue Type: Task
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> The storage local resource provider needs to checkpoint its total resources 
> and pending operations atomically, and recover them after failing over.





[jira] [Updated] (MESOS-8032) Launch CSI plugins in storage local resource provider.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8032:
--
Sprint: Mesosphere Sprint 64, Mesosphere Sprint 65, Mesosphere Sprint 66, 
Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere 
Sprint 70, Mesosphere Sprint 71  (was: Mesosphere Sprint 64, Mesosphere Sprint 
65, Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70)

> Launch CSI plugins in storage local resource provider.
> --
>
> Key: MESOS-8032
> URL: https://issues.apache.org/jira/browse/MESOS-8032
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> Launching a CSI plugin requires the following steps:
> 1. Verify the configuration.
> 2. Prepare a directory in the work directory of the resource provider where 
> the socket file should be placed, and construct the path of the socket file.
> 3. If the socket file already exists and the plugin is already running, we 
> should not launch another plugin instance.
> 4. Otherwise, launch a standalone container to run the plugin and connect to 
> it through the socket file.





[jira] [Updated] (MESOS-8244) Add operator API to reload local resource providers.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8244:
--
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, 
Mesosphere Sprint 71  (was: Mesosphere Sprint 68, Mesosphere Sprint 69, 
Mesosphere Sprint 70)

> Add operator API to reload local resource providers.
> 
>
> Key: MESOS-8244
> URL: https://issues.apache.org/jira/browse/MESOS-8244
> Project: Mesos
>  Issue Type: Task
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> To add, remove, and update local resource providers on the fly more 
> conveniently and without restarting agents, we would like to introduce a new 
> operator API that adds new config files to the resource provider config 
> directory and triggers a reload of the resource provider.





[jira] [Updated] (MESOS-8291) Add documentation about fault domains

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8291:
--
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71  (was: Mesosphere Sprint 
70)

> Add documentation about fault domains
> -
>
> Key: MESOS-8291
> URL: https://issues.apache.org/jira/browse/MESOS-8291
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Benno Evers
>
> We need some user docs for fault domains.





[jira] [Updated] (MESOS-8143) Publish and unpublish storage local resources through CSI plugins.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8143:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Publish and unpublish storage local resources through CSI plugins.
> --
>
> Key: MESOS-8143
> URL: https://issues.apache.org/jira/browse/MESOS-8143
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> The storage local resource provider needs to call the following CSI APIs to 
> publish CSI volumes for tasks to use:
> 1. ControllerPublishVolume (optional)
> 2. NodePublishVolume
> Although we don't need to unpublish CSI volumes after tasks are completed, we 
> still need to unpublish them for DESTROY_VOLUME or DESTROY_BLOCK:
> 1. NodeUnpublishVolume
> 2. ControllerUnpublishVolume (optional)
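
The two call sequences above mirror each other: the optional controller step comes first when publishing and last when unpublishing. A minimal sketch of that ordering follows. The function names and the `controllerPublish` parameter are hypothetical, and the calls are represented as plain strings rather than real CSI RPCs.

```cpp
#include <string>
#include <vector>

// CSI calls needed to publish a volume, in order. `controllerPublish`
// says whether the optional controller-side step applies.
inline std::vector<std::string> publishCalls(bool controllerPublish)
{
  std::vector<std::string> calls;
  if (controllerPublish) {
    calls.push_back("ControllerPublishVolume");
  }
  calls.push_back("NodePublishVolume");
  return calls;
}

// CSI calls needed to unpublish a volume: the reverse of publishing, so the
// node-side call comes before the optional controller-side call.
inline std::vector<std::string> unpublishCalls(bool controllerPublish)
{
  std::vector<std::string> calls;
  calls.push_back("NodeUnpublishVolume");
  if (controllerPublish) {
    calls.push_back("ControllerUnpublishVolume");
  }
  return calls;
}
```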





[jira] [Updated] (MESOS-8101) Import resources from CSI plugins in storage local resource provider.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8101:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Import resources from CSI plugins in storage local resource provider.
> -
>
> Key: MESOS-8101
> URL: https://issues.apache.org/jira/browse/MESOS-8101
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> The following lists the steps to import resources from a CSI plugin:
> 1. Launch the node plugin
> 1.1 GetSupportedVersions
> 1.2 GetPluginInfo
> 1.3 ProbeNode
> 1.4 GetNodeCapabilities
> 2. Launch the controller plugin
> 2.1 GetSupportedVersions
> 2.2 GetPluginInfo
> 2.3 GetControllerCapabilities
> 3. GetCapacity
> 4. ListVolumes
> 5. Report to the resource provider through UPDATE_TOTAL_RESOURCES





[jira] [Updated] (MESOS-7790) Design hierarchical quota allocation.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-7790:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Design hierarchical quota allocation.
> -
>
> Key: MESOS-7790
> URL: https://issues.apache.org/jira/browse/MESOS-7790
> Project: Mesos
>  Issue Type: Task
>  Components: allocation
>Reporter: Benjamin Mahler
>Assignee: Michael Park
>  Labels: multitenancy
>
> When quota is assigned in the role hierarchy (see MESOS-6375), it's possible 
> for there to be "undelegated" quota for a role. For example:
> {noformat}
>                    ^
>                  /   \
>                /       \
>     eng (90 cpus)   sales (10 cpus)
>           ^
>         /   \
>       /       \
> ads (50 cpus)   build (10 cpus)
> {noformat}
> Here, the "eng" role has 60 of its 90 cpus of quota delegated to its 
> children, and 30 cpus remain undelegated. We need to design how to allocate 
> these 30 undelegated cpus. Are they allocated entirely to the "eng" 
> role? Are they allocated to the "eng" role tree? If so, how do we determine 
> how much is allocated to each role in the "eng" tree (i.e. "eng", "eng/ads", 
> "eng/build")?
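
The arithmetic behind the "undelegated" figure in this example is simply the parent's quota minus the sum of its children's quotas. A minimal sketch, assuming quota is a single cpu count per role; `undelegatedQuota` is a hypothetical helper, and it says nothing about how the remainder should be allocated, which is the open design question.

```cpp
#include <numeric>
#include <vector>

// Returns the part of a role's quota that has not been delegated to its
// child roles: the parent's quota minus the sum of the children's quotas.
inline double undelegatedQuota(
    double parentQuota,
    const std::vector<double>& childQuotas)
{
  const double delegated =
    std::accumulate(childQuotas.begin(), childQuotas.end(), 0.0);

  return parentQuota - delegated;
}
```

With the figures from the diagram, "eng" has 90 cpus and its children "ads" and "build" have 50 and 10, so `undelegatedQuota(90.0, {50.0, 10.0})` gives the 30 cpus mentioned above.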





[jira] [Updated] (MESOS-8240) Add an option to build the new CLI and run unit tests.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8240:
--
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71  (was: Mesosphere Sprint 
70)

> Add an option to build the new CLI and run unit tests.
> --
>
> Key: MESOS-8240
> URL: https://issues.apache.org/jira/browse/MESOS-8240
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Armand Grillet
>Assignee: Armand Grillet
>
> An update of the discarded https://reviews.apache.org/r/52543/





[jira] [Updated] (MESOS-8190) Update the master to accept OfferOperationIDs from frameworks.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8190:
--
Sprint: Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  
(was: Mesosphere Sprint 69, Mesosphere Sprint 70)

> Update the master to accept OfferOperationIDs from frameworks.
> --
>
> Key: MESOS-8190
> URL: https://issues.apache.org/jira/browse/MESOS-8190
> Project: Mesos
>  Issue Type: Task
>Reporter: Gastón Kleiman
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Master’s {{ACCEPT}} handler should send failed operation updates when a 
> framework sets the {{OfferOperationID}} on an operation destined for an agent 
> without the {{RESOURCE_PROVIDER}} capability.





[jira] [Updated] (MESOS-5333) GET /master/maintenance/schedule/ produces 404.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-5333:
--
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71  (was: Mesosphere Sprint 
70)

> GET /master/maintenance/schedule/ produces 404.
> ---
>
> Key: MESOS-5333
> URL: https://issues.apache.org/jira/browse/MESOS-5333
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, libprocess
>Reporter: Nathan Handler
>Assignee: Alexander Rukletsov
>Priority: Minor
>  Labels: mesosphere
>
> Attempts to make a GET request to /master/maintenance/schedule/ result in a 
> 404. However, if I make a GET request to /master/maintenance/schedule 
> (without the trailing /), it works. My current (untested) theory is that this 
> might be related to the fact that there is also a 
> /master/maintenance/schedule/status endpoint (an endpoint built on top of a 
> functioning endpoint), as requests to /help and /help/ (with and without the 
> trailing slash) produce the same functioning result.





[jira] [Updated] (MESOS-8108) Process offer operations in storage local resource provider

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8108:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Process offer operations in storage local resource provider
> ---
>
> Key: MESOS-8108
> URL: https://issues.apache.org/jira/browse/MESOS-8108
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: storage
>
> The storage local resource provider receives offer operations for 
> reservations and resource conversions, and invokes the proper CSI calls to 
> implement these operations.





[jira] [Updated] (MESOS-8102) Add a test CSI plugin for storage local resource provider.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8102:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Add a test CSI plugin for storage local resource provider.
> --
>
> Key: MESOS-8102
> URL: https://issues.apache.org/jira/browse/MESOS-8102
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
>
> We need a dummy CSI plugin for testing storage local resource providers. The 
> test CSI plugin would just create subdirectories under its working 
> directory to mimic the behavior of creating volumes, then bind-mount those 
> volumes to mimic publishing.





[jira] [Updated] (MESOS-8115) Add a master flag to disallow agents that are not configured with fault domain

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8115:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Add a master flag to disallow agents that are not configured with fault domain
> --
>
> Key: MESOS-8115
> URL: https://issues.apache.org/jira/browse/MESOS-8115
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Benno Evers
>
> Once Mesos masters and agents in a cluster are *all* upgraded to a version 
> where the fault domains feature is available, it is beneficial to enforce 
> that agents without a fault domain configured are not allowed to join the 
> cluster. 
> This is a safety net for operators who could forget to configure the fault 
> domain of a remote agent and let it join the cluster. If this happens, an 
> agent in a remote region will be considered a local agent by the master and 
> frameworks (because the agent's fault domain is not configured), potentially 
> causing tasks to land on a remote agent, which is undesirable.
> Note that this has to be a configurable flag and not enforced by default, 
> because otherwise upgrades from a cluster without fault domains configured 
> to a cluster with them configured would not be possible.
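
The admission rule described above reduces to a single condition. The sketch below is a hypothetical illustration, not the master's actual registration code: `admitAgent` and both parameter names are assumptions, with `requireFaultDomain` standing for the proposed master flag (off by default).

```cpp
// Decides whether a registering agent may join the cluster.
// `requireFaultDomain` models the proposed master flag; `agentHasFaultDomain`
// says whether the agent was configured with a fault domain.
inline bool admitAgent(bool requireFaultDomain, bool agentHasFaultDomain)
{
  // With the flag off, every agent is admitted, which keeps upgrades from
  // clusters without fault domains possible. With the flag on, only agents
  // that have a fault domain configured are admitted.
  return !requireFaultDomain || agentHasFaultDomain;
}
```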





[jira] [Updated] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8297:
--
Sprint: Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  
(was: Mesosphere Sprint 69, Mesosphere Sprint 70)

> Built-in driver-based executors ignore kill task if the task has not been 
> launched.
> ---
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> If the Docker executor receives a kill task request and the task has never 
> been launched, the request is ignored. We now know the cause: the executor 
> never received the registration confirmation, hence it ignored the launch 
> task request, hence the task never started. This is how the executor 
> enters an idle state, waiting for registration and ignoring kill task 
> requests.





[jira] [Updated] (MESOS-8303) Add user doc for agent reconfiguration

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8303:
--
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71  (was: Mesosphere Sprint 
70)

> Add user doc for agent reconfiguration
> --
>
> Key: MESOS-8303
> URL: https://issues.apache.org/jira/browse/MESOS-8303
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Benno Evers
>






[jira] [Updated] (MESOS-8184) Implement master's AcknowledgeOfferOperationMessage handler.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8184:
--
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, 
Mesosphere Sprint 71  (was: Mesosphere Sprint 68, Mesosphere Sprint 69, 
Mesosphere Sprint 70)

> Implement master's AcknowledgeOfferOperationMessage handler.
> 
>
> Key: MESOS-8184
> URL: https://issues.apache.org/jira/browse/MESOS-8184
> Project: Mesos
>  Issue Type: Task
>Reporter: Gastón Kleiman
>Assignee: Gastón Kleiman
>  Labels: mesosphere
>
> This handler should validate the message and forward it to the corresponding 
> agent/ERP.





[jira] [Updated] (MESOS-8144) Add a mock resource provider manager.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8144:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Add a mock resource provider manager.
> -
>
> Key: MESOS-8144
> URL: https://issues.apache.org/jira/browse/MESOS-8144
> Project: Mesos
>  Issue Type: Task
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: storage
>
> To test a storage local resource provider, we need to inject a mock resource 
> provider manager such that:
> 1. A full agent will start during the test so the resource provider can 
> launch standalone containers for CSI plugins.
> 2. We can inject offer operations through the mock manager to test the 
> resource provider.





[jira] [Updated] (MESOS-8221) Use protobuf reflection to simplify downgrading of resources.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8221:
--
Sprint: Mesosphere Sprint 68, Mesosphere Sprint 69, Mesosphere Sprint 70, 
Mesosphere Sprint 71  (was: Mesosphere Sprint 68, Mesosphere Sprint 69, 
Mesosphere Sprint 70)

> Use protobuf reflection to simplify downgrading of resources.
> -
>
> Key: MESOS-8221
> URL: https://issues.apache.org/jira/browse/MESOS-8221
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Michael Park
>Assignee: Michael Park
>
> We currently have a {{downgradeResources}} function which is called on every 
> {{repeated Resource}} field in every message that we checkpoint. We should 
> leverage protobuf reflection to automatically downgrade any instances of 
> {{Resource}} within any protobuf message.





[jira] [Updated] (MESOS-7506) Multiple tests leave orphan containers.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-7506:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Multiple tests leave orphan containers.
> ---
>
> Key: MESOS-7506
> URL: https://issues.apache.org/jira/browse/MESOS-7506
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Ubuntu 16.04
> Fedora 23
> other Linux distros
>Reporter: Alexander Rukletsov
>Assignee: Andrei Budnik
>  Labels: containerizer, flaky-test, mesosphere
> Attachments: KillMultipleTasks-badrun.txt, 
> ROOT_IsolatorFlags-badrun.txt, ResourceLimitation-badrun.txt, 
> ResourceLimitation-badrun2.txt, 
> RestartSlaveRequireExecutorAuthentication-badrun.txt, 
> TaskWithFileURI-badrun.txt
>
>
> I've observed a number of flaky tests that leave orphan containers upon 
> cleanup. A typical log looks like this:
> {noformat}
> ../../src/tests/cluster.cpp:580: Failure
> Value of: containers->empty()
>   Actual: false
> Expected: true
> Failed to destroy containers: { da3e8aa8-98e7-4e72-a8fd-5d0bae960014 }
> {noformat}
> All currently affected tests:
> {noformat}
> SlaveTest.RestartSlaveRequireExecutorAuthentication
> LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags
> {noformat}





[jira] [Updated] (MESOS-8096) Enqueueing events in MockHTTPScheduler can lead to segfaults.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8096:
--
Sprint: Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, 
Mesosphere Sprint 69, Mesosphere Sprint 70, Mesosphere Sprint 71  (was: 
Mesosphere Sprint 66, Mesosphere Sprint 67, Mesosphere Sprint 68, Mesosphere 
Sprint 69, Mesosphere Sprint 70)

> Enqueueing events in MockHTTPScheduler can lead to segfaults.
> -
>
> Key: MESOS-8096
> URL: https://issues.apache.org/jira/browse/MESOS-8096
> Project: Mesos
>  Issue Type: Bug
>  Components: scheduler driver, test
> Environment: Fedora 23, Ubuntu 14.04, Ubuntu 16
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: flaky-test, mesosphere
> Attachments: AsyncExecutorProcess-badrun-1.txt, 
> AsyncExecutorProcess-badrun-2.txt, AsyncExecutorProcess-badrun-3.txt, 
> scheduler-shutdown-invalid-driver.txt
>
>
> Various tests segfault for an as yet unknown reason. Comparing the attached 
> logs suggests that the problem might be in the scheduler's event queue.





[jira] [Updated] (MESOS-8352) Resources may get over allocated to some roles while fail to meet the quota of other roles.

2017-12-22 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-8352:
--
Sprint: Mesosphere Sprint 70, Mesosphere Sprint 71  (was: Mesosphere Sprint 
70)

> Resources may get over allocated to some roles while fail to meet the quota 
> of other roles.
> ---
>
> Key: MESOS-8352
> URL: https://issues.apache.org/jira/browse/MESOS-8352
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Meng Zhu
>Assignee: Meng Zhu
>  Labels: multitenancy, quotas
>
> In the quota role allocation stage, if a role gets some resources on an agent 
> to meet its quota, it will also get all other resources on the same agent 
> that it does not have quota for. This may starve roles behind it that have 
> quotas set for those resources.
> To fix that, we need to track quota headroom in the quota role allocation 
> stage. In that stage, if a role has no quota set for a scalar resource, it 
> will get that resource only when two conditions are both met:
> - It got some other resources on the same agent to meet its quota; and
> - After allocating those resources, quota headroom is still above the 
> required amount.
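
The two conditions in the description can be sketched as a single predicate. This is only an illustration under stated assumptions: the `Headroom` struct, the function name, and the reading of "above the required amount" as "greater than or equal to" are hypothetical and are not taken from the Mesos allocator.

```cpp
#include <cassert>

// Hypothetical illustration types; these names do not come from Mesos.
struct Headroom {
  double available;  // unallocated amount of one scalar resource
  double required;   // amount that must stay free for other roles' unmet quota
};

// Decides whether a role that has no quota set for a scalar resource may
// receive `amount` of it on an agent during the quota allocation stage.
bool mayAllocateWithoutQuota(
    bool gotQuotaResourcesOnAgent,
    double amount,
    const Headroom& headroom) {
  assert(amount >= 0.0);

  // Condition 1: the role already received resources on this agent
  // towards its quota.
  if (!gotQuotaResourcesOnAgent) {
    return false;
  }

  // Condition 2: after handing out `amount`, enough headroom remains
  // (assumed here to mean "at least the required amount").
  return headroom.available - amount >= headroom.required;
}
```

With 10 units available and 5 required, a request for 2 units would pass and a request for 6 would be refused, regardless of how many other resources the agent still has.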





[jira] [Commented] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.

2017-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301268#comment-16301268
 ] 

Alexander Rukletsov commented on MESOS-8297:


{noformat}
Commit: 44a702a1b26963040e6cb6c362b7f01e5b4ef097 [44a702a]
Author: Alexander Rukletsov ruklet...@gmail.com
Date: 22 December 2017 at 12:09:58 GMT+1
Committer: Alexander Rukletsov al...@apache.org

Promoted log level to warning for disconnected events in exec.cpp.

When the executor library receives messages while being disconnected,
it might indicate an out-of-order message delivery or lost messages.
This should be logged at the warning level to simplify triaging.

Review: https://reviews.apache.org/r/64032/
{noformat}
{noformat}
Commit: 47392cf9f9024718550c69bcef9319560b47d5c7 [47392cf]
Author: Alexander Rukletsov 
Date: 22 December 2017 at 12:10:15 GMT+1
Committer: Alexander Rukletsov 

Ensured command executor always honors shutdown request.

Review: https://reviews.apache.org/r/64069/
{noformat}
{noformat}
Commit: b2eddcfe0ede4725208ae33c8c7f56563ff10514 [b2eddcf]
Author: Alexander Rukletsov 
Date: 22 December 2017 at 12:10:28 GMT+1
Committer: Alexander Rukletsov 

Ensured executor adapter propagates error and shutdown messages.

Prior to this patch, if an error, kill, or shutdown occurred during
subscription / registration with the agent, it was not propagated back
to the executor if the v0_v1 executor adapter was used. This happened
because the adapter did not call the `connected` callback until after
successful registration and hence the executor did not even try to
send the `SUBSCRIBE` call, without which the adapter did not send any
events to the executor.

A fix is to call the `connected` callback if an error occurred or
shutdown / kill event arrived before the executor had subscribed.

Review: https://reviews.apache.org/r/64070/
{noformat}
{noformat}
Commit: 769108e94a7c7834c44e01091a9940354eb3f6e4 [769108e]
Author: Alexander Rukletsov 
Date: 22 December 2017 at 12:10:35 GMT+1
Committer: Alexander Rukletsov 

Terminated driver-based executors if kill arrives before launch task.

`ExecutorRegisteredMessage` or `RunTaskMessage` may not be delivered
to a driver-based executor. Since these messages are not retried,
without this patch an executor never starts a task and remains idle,
ignoring kill task request. This patch ensures all built-in driver-
based executors eventually shut down if kill task arrives before
the task has been started.

Review: https://reviews.apache.org/r/64033/
{noformat}
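
The behaviour described in the last commit message (a kill request that arrives before any task was launched should terminate the executor rather than be ignored) can be illustrated with a minimal sketch. The `SketchExecutor` class and its method names are hypothetical and greatly simplified; they do not mirror the actual Mesos executor classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified executor used only for illustration.
class SketchExecutor {
public:
  void launchTask(const std::string& taskId) {
    assert(!launched_);  // this sketch handles a single task only
    launched_ = true;
    taskId_ = taskId;
  }

  // Handles a kill request. If no task was ever launched, the executor
  // shuts down instead of silently ignoring the request.
  void killTask(const std::string& taskId) {
    if (!launched_) {
      terminated_ = true;
      return;
    }

    // A launched task is only affected by a kill for its own id.
    if (taskId == taskId_) {
      terminated_ = true;
    }
  }

  bool terminated() const { return terminated_; }

private:
  bool launched_ = false;
  bool terminated_ = false;
  std::string taskId_;
};
```

The key point is the early branch in `killTask`: without it, an executor that never received its launch message would stay idle forever.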

> Built-in driver-based executors ignore kill task if the task has not been 
> launched.
> ---
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> If the docker executor receives a kill task request and the task has never 
> been launched, the request is ignored. We now know that: the executor has never 
> received the registration confirmation, hence has ignored the launch task 
> request, hence the task has never started. And this is how the executor 
> enters an idle state, waiting for registration and ignoring kill task 
> requests.





[jira] [Commented] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.

2017-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301267#comment-16301267
 ] 

Alexander Rukletsov commented on MESOS-8297:


[~gilbert] Landed.

> Built-in driver-based executors ignore kill task if the task has not been 
> launched.
> ---
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> If the docker executor receives a kill task request and the task has never 
> been launched, the request is ignored. We now know that: the executor has never 
> received the registration confirmation, hence has ignored the launch task 
> request, hence the task has never started. And this is how the executor 
> enters an idle state, waiting for registration and ignoring kill task 
> requests.





[jira] [Updated] (MESOS-6616) Error: dereferencing type-punned pointer will break strict-aliasing rules.

2017-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6616:
---
Shepherd: Benjamin Bannier

> Error: dereferencing type-punned pointer will break strict-aliasing rules.
> --
>
> Key: MESOS-6616
> URL: https://issues.apache.org/jira/browse/MESOS-6616
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.2.3, 1.3.1, 1.4.1
> Environment: Fedora Rawhide;
> Debian 8.10 + gcc 5.5.0-6 with {{O2}}
>Reporter: Orion Poplawski
>Assignee: Alexander Rukletsov
>  Labels: compile-error, mesosphere
>
> Trying to update the mesos package to 1.1.0 in Fedora.  Getting:
> {noformat}
> libtool: compile:  g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
> -DPACKAGE_VERSION=\"1.1.0\" "-DPACKAGE_STRING=\"mesos 1.1.0\"" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
> -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
> -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
> -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 
> -DHAVE_LIBAPR_1=1 -DHAVE_BOOST_VERSION_HPP=1 -DHAVE_LIBCURL=1 
> -DHAVE_ELFIO_ELFIO_HPP=1 -DHAVE_GLOG_LOGGING_H=1 -DHAVE_HTTP_PARSER_H=1 
> -DMESOS_HAS_JAVA=1 -DHAVE_LEVELDB_DB_H=1 -DHAVE_LIBNL_3=1 
> -DHAVE_LIBNL_ROUTE_3=1 -DHAVE_LIBNL_IDIAG_3=1 -DWITH_NETWORK_ISOLATOR=1 
> -DHAVE_GOOGLE_PROTOBUF_MESSAGE_H=1 -DHAVE_EV_H=1 -DHAVE_PICOJSON_H=1 
> -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 
> -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 
> -DHAVE_ZOOKEEPER_H=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. -Wall 
> -Werror -Wsign-compare -DLIBDIR=\"/usr/lib64\" 
> -DPKGLIBEXECDIR=\"/usr/libexec/mesos\" -DPKGDATADIR=\"/usr/share/mesos\" 
> -DPKGMODULEDIR=\"/usr/lib64/mesos/modules\" -I../include -I../include 
> -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS 
> -I../3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
> -I../3rdparty/stout/include -DHAS_AUTHENTICATION=1 -Iyes/include 
> -I/usr/include/subversion-1 -Iyes/include -Iyes/include -Iyes/include/libnl3 
> -Iyes/include -I/ -Iyes/include -I/usr/include/apr-1 -I/usr/include/apr-1.0 
> -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15/include -isystem yes/include 
> -Iyes/include -I/usr/src/gmock -I/usr/src/gmock/include -I/usr/src/gmock/src 
> -I/usr/src/gmock/gtest -I/usr/src/gmock/gtest/include 
> -I/usr/src/gmock/gtest/src -Iyes/include -Iyes/include -I/usr/include 
> -I/builddir/build/BUILD/mesos-1.1.0/libev4.15/include -Iyes/include 
> -I/usr/include -I/usr/include/zookeeper -pthread -O2 -g -pipe -Wall 
> -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
> -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches 
> -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic 
> -DEV_CHILD_ENABLE=0 -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15 
> -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11 -c 
> health-check/health_checker.cpp  -fPIC -DPIC -o 
> health-check/.libs/libmesos_no_3rdparty_la-health_checker.o
> In file included from health-check/health_checker.cpp:51:0:
> ./linux/ns.hpp: In function 'Try ns::clone(pid_t, int, const 
> std::function&, int)':
> ./linux/ns.hpp:480:69: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>  pid_t pid = ((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid;
>  ^~
> ./linux/ns.hpp: In lambda function:
> ./linux/ns.hpp:581:59: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid = ::getpid();
>^~
> ./linux/ns.hpp:582:59: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->uid = ::getuid();
>^~
> ./linux/ns.hpp:583:59: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->gid = ::getgid();
>^~
> cc1plus: all warnings being treated as errors
> make[2]: *** [Makefile:6655: 
> health-check/libmesos_no_3rdparty_la-health_checker.lo] Error 1
> {noformat}
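
A common way to avoid this class of diagnostic is to copy the bytes with `std::memcpy` instead of dereferencing a cast pointer. The sketch below is only an illustration: `Credentials` is a hypothetical stand-in for `struct ucred`, the byte buffer stands in for the control-message data, and nothing here is claimed to be the change that was made in Mesos.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for `struct ucred` (pid, uid, gid).
struct Credentials {
  int pid;
  unsigned int uid;
  unsigned int gid;
};

// Reads a Credentials value out of a raw byte buffer without casting the
// buffer pointer to `Credentials*`, so no type-punned pointer is dereferenced.
Credentials readCredentials(const unsigned char* data) {
  assert(data != nullptr);
  Credentials credentials;
  std::memcpy(&credentials, data, sizeof(credentials));
  return credentials;
}

// Writes a Credentials value into a raw byte buffer the same way.
// The buffer must hold at least sizeof(Credentials) bytes.
void writeCredentials(unsigned char* data, const Credentials& credentials) {
  assert(data != nullptr);
  std::memcpy(data, &credentials, sizeof(credentials));
}
```

Compilers typically optimize such fixed-size `memcpy` calls into plain loads and stores, so the copy usually costs nothing at `-O2`.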





[jira] [Updated] (MESOS-6616) Error: dereferencing type-punned pointer will break strict-aliasing rules.

2017-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6616:
---
Sprint: Mesosphere Sprint 71

> Error: dereferencing type-punned pointer will break strict-aliasing rules.
> --
>
> Key: MESOS-6616
> URL: https://issues.apache.org/jira/browse/MESOS-6616
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.2.3, 1.3.1, 1.4.1
> Environment: Fedora Rawhide;
> Debian 8.10 + gcc 5.5.0-6 with {{O2}}
>Reporter: Orion Poplawski
>Assignee: Alexander Rukletsov
>  Labels: compile-error, mesosphere
>
> Trying to update the mesos package to 1.1.0 in Fedora.  Getting:
> {noformat}
> libtool: compile:  g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
> -DPACKAGE_VERSION=\"1.1.0\" "-DPACKAGE_STRING=\"mesos 1.1.0\"" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
> -DVERSION=\"1.1.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
> -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
> -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 
> -DHAVE_LIBAPR_1=1 -DHAVE_BOOST_VERSION_HPP=1 -DHAVE_LIBCURL=1 
> -DHAVE_ELFIO_ELFIO_HPP=1 -DHAVE_GLOG_LOGGING_H=1 -DHAVE_HTTP_PARSER_H=1 
> -DMESOS_HAS_JAVA=1 -DHAVE_LEVELDB_DB_H=1 -DHAVE_LIBNL_3=1 
> -DHAVE_LIBNL_ROUTE_3=1 -DHAVE_LIBNL_IDIAG_3=1 -DWITH_NETWORK_ISOLATOR=1 
> -DHAVE_GOOGLE_PROTOBUF_MESSAGE_H=1 -DHAVE_EV_H=1 -DHAVE_PICOJSON_H=1 
> -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 
> -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBZ=1 
> -DHAVE_ZOOKEEPER_H=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. -Wall 
> -Werror -Wsign-compare -DLIBDIR=\"/usr/lib64\" 
> -DPKGLIBEXECDIR=\"/usr/libexec/mesos\" -DPKGDATADIR=\"/usr/share/mesos\" 
> -DPKGMODULEDIR=\"/usr/lib64/mesos/modules\" -I../include -I../include 
> -I../include/mesos -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS 
> -I../3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
> -I../3rdparty/stout/include -DHAS_AUTHENTICATION=1 -Iyes/include 
> -I/usr/include/subversion-1 -Iyes/include -Iyes/include -Iyes/include/libnl3 
> -Iyes/include -I/ -Iyes/include -I/usr/include/apr-1 -I/usr/include/apr-1.0 
> -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15/include -isystem yes/include 
> -Iyes/include -I/usr/src/gmock -I/usr/src/gmock/include -I/usr/src/gmock/src 
> -I/usr/src/gmock/gtest -I/usr/src/gmock/gtest/include 
> -I/usr/src/gmock/gtest/src -Iyes/include -Iyes/include -I/usr/include 
> -I/builddir/build/BUILD/mesos-1.1.0/libev4.15/include -Iyes/include 
> -I/usr/include -I/usr/include/zookeeper -pthread -O2 -g -pipe -Wall 
> -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions 
> -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches 
> -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m64 -mtune=generic 
> -DEV_CHILD_ENABLE=0 -I/builddir/build/BUILD/mesos-1.1.0/libev-4.15 
> -Wno-unused-local-typedefs -Wno-maybe-uninitialized -std=c++11 -c 
> health-check/health_checker.cpp  -fPIC -DPIC -o 
> health-check/.libs/libmesos_no_3rdparty_la-health_checker.o
> In file included from health-check/health_checker.cpp:51:0:
> ./linux/ns.hpp: In function 'Try ns::clone(pid_t, int, const 
> std::function&, int)':
> ./linux/ns.hpp:480:69: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>  pid_t pid = ((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid;
>  ^~
> ./linux/ns.hpp: In lambda function:
> ./linux/ns.hpp:581:59: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->pid = ::getpid();
>^~
> ./linux/ns.hpp:582:59: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->uid = ::getuid();
>^~
> ./linux/ns.hpp:583:59: error: dereferencing type-punned pointer will break 
> strict-aliasing rules [-Werror=strict-aliasing]
>((struct ucred*) CMSG_DATA(CMSG_FIRSTHDR()))->gid = ::getgid();
>^~
> cc1plus: all warnings being treated as errors
> make[2]: *** [Makefile:6655: 
> health-check/libmesos_no_3rdparty_la-health_checker.lo] Error 1
> {noformat}





[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used

2017-12-22 Thread Konstantin Kalin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Kalin updated MESOS-8356:

Component/s: (was: docker)

> Persistent volume ownership is set to root despite of sandbox owner 
> (frameworkInfo.user) when docker executor is used
> -
>
> Key: MESOS-8356
> URL: https://issues.apache.org/jira/browse/MESOS-8356
> Project: Mesos
>  Issue Type: Bug
> Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13
>Reporter: Konstantin Kalin
>  Labels: persistent-volumes
>
> PersistentVolume ownership is not set to match the sandbox user when the 
> docker executor is used. Looks like the issue was introduced by 
> https://reviews.apache.org/r/45963/
> I didn't check the universal containerizer yet. 
> As far as I understand, the following code is supposed to check that a volume 
> is not already being used by other tasks/containers.
> src/slave/containerizer/docker.cpp
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource)) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}
> But it doesn't exclude the container that is about to be launched, so the 
> volume appears to be in use (in my case I have only one container, no task 
> group). Thus the ownership of the PersistentVolume stays "root" (I run 
> mesos-agent as root).
> A small patch that excludes the container being launched fixes the issue.
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource) &&
>       containerId != container->id) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}





[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used

2017-12-22 Thread Konstantin Kalin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Kalin updated MESOS-8356:

Labels: persistent-volumes  (was: )

> Persistent volume ownership is set to root despite of sandbox owner 
> (frameworkInfo.user) when docker executor is used
> -
>
> Key: MESOS-8356
> URL: https://issues.apache.org/jira/browse/MESOS-8356
> Project: Mesos
>  Issue Type: Bug
> Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13
>Reporter: Konstantin Kalin
>  Labels: persistent-volumes
>
> PersistentVolume ownership is not set to match the sandbox user when the 
> docker executor is used. Looks like the issue was introduced by 
> https://reviews.apache.org/r/45963/
> I didn't check the universal containerizer yet. 
> As far as I understand, the following code is supposed to check that a volume 
> is not already being used by other tasks/containers.
> src/slave/containerizer/docker.cpp
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource)) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}
> But it doesn't exclude the container that is about to be launched, so the 
> volume appears to be in use (in my case I have only one container, no task 
> group). Thus the ownership of the PersistentVolume stays "root" (I run 
> mesos-agent as root).
> A small patch that excludes the container being launched fixes the issue.
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource) &&
>       containerId != container->id) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}





[jira] [Created] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used

2017-12-22 Thread Konstantin Kalin (JIRA)
Konstantin Kalin created MESOS-8356:
---

 Summary: Persistent volume ownership is set to root despite of 
sandbox owner (frameworkInfo.user) when docker executor is used
 Key: MESOS-8356
 URL: https://issues.apache.org/jira/browse/MESOS-8356
 Project: Mesos
  Issue Type: Bug
  Components: docker
 Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13
Reporter: Konstantin Kalin


PersistentVolume ownership is not set to match the sandbox user when the docker 
executor is used. Looks like the issue was introduced by 
https://reviews.apache.org/r/45963/
I didn't check the universal containerizer yet. 

As far as I understand, the following code is supposed to check that a volume is 
not already being used by other tasks/containers.

src/slave/containerizer/docker.cpp
{code:c++}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource)) {
    isVolumeInUse = true;
    break;
  }
}
{code}
But it doesn't exclude the container that is about to be launched, so the 
volume appears to be in use (in my case I have only one container, no task 
group). Thus the ownership of the PersistentVolume stays "root" (I run 
mesos-agent as root).

A small patch that excludes the container being launched fixes the issue.
{code:c++}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource) &&
      containerId != container->id) {
    isVolumeInUse = true;
    break;
  }
}
{code}
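
For readers without the Mesos sources at hand, the patched check can be approximated in a self-contained form. This is a sketch under stated assumptions: the `Container` struct, the `std::map` of containers, and the string-based volume names are hypothetical simplifications of the types used in docker.cpp.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified container record; not the Mesos type.
struct Container {
  std::string id;
  std::set<std::string> resources;  // names of volumes held by the container
};

// Returns true if any container other than the one being launched
// already holds the given volume.
bool isVolumeInUse(
    const std::map<std::string, Container>& containers,
    const std::string& launchingId,
    const std::string& volume) {
  assert(!volume.empty());

  for (const auto& entry : containers) {
    const Container& container = entry.second;

    // Skip the container that is being launched: it always lists the
    // volume among its own resources.
    if (container.id != launchingId &&
        container.resources.count(volume) > 0) {
      return true;
    }
  }

  return false;
}
```

With a single container that owns the volume, the function reports the volume as unused, which is the case the report says leaves ownership at "root" in the unpatched code.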






[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used

2017-12-22 Thread Konstantin Kalin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Kalin updated MESOS-8356:

Description: 
PersistentVolume ownership is not set to match the sandbox user when the docker 
executor is used. Looks like the issue was introduced by 
https://reviews.apache.org/r/45963/
I didn't check the universal containerizer yet. 

As far as I understand, the following code is supposed to check that a volume is 
not already being used by other tasks/containers.

src/slave/containerizer/docker.cpp
{code:java}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource)) {
    isVolumeInUse = true;
    break;
  }
}
{code}
But it doesn't exclude the container that is about to be launched, so the 
volume appears to be in use (in my case I have only one container, no task 
group). Thus the ownership of the PersistentVolume stays "root" (I run 
mesos-agent as root) and it's impossible to use the volume inside the 
container. We always run processes inside Docker containers under an 
unprivileged user.

A small patch that excludes the container being launched fixes the issue.
{code:java}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource) &&
      containerId != container->id) {
    isVolumeInUse = true;
    break;
  }
}
{code}


  was:
PersistentVolume ownership is not set to match the sandbox user when the docker 
executor is used. Looks like the issue was introduced by 
https://reviews.apache.org/r/45963/
I didn't check the universal containerizer yet. 

As far as I understand, the following code is supposed to check that a volume is 
not already being used by other tasks/containers.

src/slave/containerizer/docker.cpp
{code:java}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource)) {
    isVolumeInUse = true;
    break;
  }
}
{code}
But it doesn't exclude the container that is about to be launched, so the 
volume appears to be in use (in my case I have only one container, no task 
group). Thus the ownership of the PersistentVolume stays "root" (I run 
mesos-agent as root).

A small patch that excludes the container being launched fixes the issue.
{code:java}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource) &&
      containerId != container->id) {
    isVolumeInUse = true;
    break;
  }
}
{code}



> Persistent volume ownership is set to root despite of sandbox owner 
> (frameworkInfo.user) when docker executor is used
> -
>
> Key: MESOS-8356
> URL: https://issues.apache.org/jira/browse/MESOS-8356
> Project: Mesos
>  Issue Type: Bug
> Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13
>Reporter: Konstantin Kalin
>  Labels: persistent-volumes
>
> PersistentVolume ownership is not set to match the sandbox user when the 
> docker executor is used. Looks like the issue was introduced by 
> https://reviews.apache.org/r/45963/
> I didn't check the universal containerizer yet. 
> As far as I understand, the following code is supposed to check that a volume 
> is not already being used by other tasks/containers.
> src/slave/containerizer/docker.cpp
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource)) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}
> But it doesn't exclude the container that is about to be launched, so the 
> volume appears to be in use (in my case I have only one container, no task 
> group). Thus the ownership of the PersistentVolume stays "root" (I run 
> mesos-agent as root) and it's impossible to use the volume inside the 
> container. We always run processes inside Docker containers under an 
> unprivileged user.
> A small patch that excludes the container being launched fixes the issue.
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource) &&
>       containerId != container->id) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}





[jira] [Commented] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.

2017-12-22 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301484#comment-16301484
 ] 

Alexander Rukletsov commented on MESOS-8297:


Backported to 1.4.2.

> Built-in driver-based executors ignore kill task if the task has not been 
> launched.
> ---
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> If the docker executor receives a kill task request and the task has never 
> been launched, the request is ignored. We now know that: the executor has never 
> received the registration confirmation, hence has ignored the launch task 
> request, hence the task has never started. And this is how the executor 
> enters an idle state, waiting for registration and ignoring kill task 
> requests.





[jira] [Updated] (MESOS-8356) Persistent volume ownership is set to root despite of sandbox owner (frameworkInfo.user) when docker executor is used

2017-12-22 Thread Konstantin Kalin (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Kalin updated MESOS-8356:

Description: 
PersistentVolume ownership is not set to match the sandbox user when the docker 
executor is used. Looks like the issue was introduced by 
https://reviews.apache.org/r/45963/
I didn't check the universal containerizer yet. 

As far as I understand, the following code is supposed to check that a volume is 
not already being used by other tasks/containers.

src/slave/containerizer/docker.cpp
{code:java}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource)) {
    isVolumeInUse = true;
    break;
  }
}
{code}
But it doesn't exclude the container that is about to be launched, so the 
volume appears to be in use (in my case I have only one container, no task 
group). Thus the ownership of the PersistentVolume stays "root" (I run 
mesos-agent as root).

A small patch that excludes the container being launched fixes the issue.
{code:java}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource) &&
      containerId != container->id) {
    isVolumeInUse = true;
    break;
  }
}
{code}


  was:
PersistentVolume ownership is not set to match the sandbox user when the docker 
executor is used. Looks like the issue was introduced by 
https://reviews.apache.org/r/45963/
I didn't check the universal containerizer yet. 

As far as I understand, the following code is supposed to check that a volume is 
not already being used by other tasks/containers.

src/slave/containerizer/docker.cpp
{code:c++}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource)) {
    isVolumeInUse = true;
    break;
  }
}
{code}
But it doesn't exclude the container that is about to be launched, so the 
volume appears to be in use (in my case I have only one container, no task 
group). Thus the ownership of the PersistentVolume stays "root" (I run 
mesos-agent as root).

A small patch that excludes the container being launched fixes the issue.
{code:c++}
foreachvalue (const Container* container, containers_) {
  if (container->resources.contains(resource) &&
      containerId != container->id) {
    isVolumeInUse = true;
    break;
  }
}
{code}



> Persistent volume ownership is set to root despite of sandbox owner 
> (frameworkInfo.user) when docker executor is used
> -
>
> Key: MESOS-8356
> URL: https://issues.apache.org/jira/browse/MESOS-8356
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
> Environment: Centos 7, Mesos 1.4.1, Docker Engine 1.13
>Reporter: Konstantin Kalin
>
> PersistentVolume ownership is not set to match the sandbox user when the 
> docker executor is used. Looks like the issue was introduced by 
> https://reviews.apache.org/r/45963/
> I didn't check the universal containerizer yet. 
> As far as I understand, the following code is supposed to check that a volume 
> is not already in use by other tasks/containers.
> src/slave/containerizer/docker.cpp
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource)) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}
> But it does not exclude the container that is being launched (in my case I have 
> only one container, no task group). Thus the ownership of the persistent volume 
> stays "root" (I run mesos-agent as root).
> A small patch that excludes the container being launched fixes the issue.
> {code:java}
> foreachvalue (const Container* container, containers_) {
>   if (container->resources.contains(resource) &&
>       containerId != container->id) {
>     isVolumeInUse = true;
>     break;
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.

2017-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-8297:
---
Shepherd: Vinod Kone  (was: Anand Mazumdar)

> Built-in driver-based executors ignore kill task if the task has not been 
> launched.
> ---
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> If the docker executor receives a kill task request for a task that has never 
> been launched, the request is ignored. We now know that the executor never 
> received the registration confirmation, hence ignored the launch task 
> request, hence the task never started. This is how the executor 
> enters an idle state, waiting for registration and ignoring kill task 
> requests.
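
One way to picture the desired behavior is a toy executor that remembers a kill request for a task it has not launched yet, instead of dropping it. This is only a sketch under stated assumptions: `ToyExecutor` and all of its members are hypothetical names, status updates are modeled as strings, and the ticket does not say that the actual fix takes this shape. It also does not address the lost registration itself.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of a driver-based executor; none of these names
// come from the Mesos code base.
class ToyExecutor {
public:
  // Called when the agent confirms registration.
  void registered() { registered_ = true; }

  // A launch request is dropped while unregistered, as described above.
  // A task that was killed before launch is reported as killed right away.
  void launchTask(const std::string& taskId) {
    if (!registered_) {
      return;
    }
    if (killedBeforeLaunch_.count(taskId) > 0) {
      updates_.push_back(taskId + ":TASK_KILLED");
      return;
    }
    running_.insert(taskId);
    updates_.push_back(taskId + ":TASK_RUNNING");
  }

  // Instead of ignoring a kill for a task that never launched, remember it.
  void killTask(const std::string& taskId) {
    if (running_.erase(taskId) > 0) {
      updates_.push_back(taskId + ":TASK_KILLED");
    } else {
      killedBeforeLaunch_.insert(taskId);
    }
  }

  const std::vector<std::string>& updates() const { return updates_; }

private:
  bool registered_ = false;
  std::set<std::string> running_;
  std::set<std::string> killedBeforeLaunch_;
  std::vector<std::string> updates_;
};
```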





[jira] [Updated] (MESOS-8297) Built-in driver-based executors ignore kill task if the task has not been launched.

2017-12-22 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-8297:
---
Fix Version/s: 1.5.0
   1.4.2

> Built-in driver-based executors ignore kill task if the task has not been 
> launched.
> ---
>
> Key: MESOS-8297
> URL: https://issues.apache.org/jira/browse/MESOS-8297
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
> Fix For: 1.4.2, 1.5.0
>
>
> If the docker executor receives a kill task request for a task that has never 
> been launched, the request is ignored. We now know that the executor never 
> received the registration confirmation, hence ignored the launch task 
> request, hence the task never started. This is how the executor 
> enters an idle state, waiting for registration and ignoring kill task 
> requests.





[jira] [Updated] (MESOS-6843) Fetcher should not assume stdout/stderr in the sandbox.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-6843:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Fetcher should not assume stdout/stderr in the sandbox.
> ---
>
> Key: MESOS-6843
> URL: https://issues.apache.org/jira/browse/MESOS-6843
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 1.0.2, 1.1.0
>Reporter: Jie Yu
>Priority: Critical
>  Labels: mesosphere
>
> If container logger is used, this assumption might not be true. For instance, 
> a journald logger might redirect all task logs to journald. So in theory, the 
> fetcher log should go to journald as well, rather than writing to 
> sandbox/stdout and sandbox/stderr.





[jira] [Commented] (MESOS-6784) IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301693#comment-16301693
 ] 

Jie Yu commented on MESOS-6784:
---

I haven't seen this test be flaky on HEAD for months. Closing it for now; 
reopen if you see it being flaky again. cc [~alexr]

> IOSwitchboardTest.KillSwitchboardContainerDestroyed is flaky
> 
>
> Key: MESOS-6784
> URL: https://issues.apache.org/jira/browse/MESOS-6784
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Neil Conway
>Priority: Critical
>  Labels: mesosphere
> Fix For: 1.5.0
>
>
> {noformat}
> [ RUN  ] IOSwitchboardTest.KillSwitchboardContainerDestroyed
> I1212 13:57:02.641043  2211 containerizer.cpp:220] Using isolation: 
> posix/cpu,filesystem/posix,network/cni
> W1212 13:57:02.641438  2211 backend.cpp:76] Failed to create 'overlay' 
> backend: OverlayBackend requires root privileges, but is running as user nrc
> W1212 13:57:02.641559  2211 backend.cpp:76] Failed to create 'bind' backend: 
> BindBackend requires root privileges
> I1212 13:57:02.642822  2268 containerizer.cpp:594] Recovering containerizer
> I1212 13:57:02.643975  2253 provisioner.cpp:253] Provisioner recovery complete
> I1212 13:57:02.644953  2255 containerizer.cpp:986] Starting container 
> 09e87380-00ab-4987-83c9-fa1c5d86717f for executor 'executor' of framework
> I1212 13:57:02.647004  2245 switchboard.cpp:430] Allocated pseudo terminal 
> '/dev/pts/54' for container 09e87380-00ab-4987-83c9-fa1c5d86717f
> I1212 13:57:02.652305  2245 switchboard.cpp:596] Created I/O switchboard 
> server (pid: 2705) listening on socket file 
> '/tmp/mesos-io-switchboard-b4af1c92-6633-44f3-9d35-e0e36edaf70a' for 
> container 09e87380-00ab-4987-83c9-fa1c5d86717f
> I1212 13:57:02.655513  2267 launcher.cpp:133] Forked child with pid '2706' 
> for container '09e87380-00ab-4987-83c9-fa1c5d86717f'
> I1212 13:57:02.655732  2267 containerizer.cpp:1621] Checkpointing container's 
> forked pid 2706 to 
> '/tmp/IOSwitchboardTest_KillSwitchboardContainerDestroyed_Me5CRx/meta/slaves/frameworks/executors/executor/runs/09e87380-00ab-4987-83c9-fa1c5d86717f/pids/forked.pid'
> I1212 13:57:02.726306  2265 containerizer.cpp:2463] Container 
> 09e87380-00ab-4987-83c9-fa1c5d86717f has exited
> I1212 13:57:02.726352  2265 containerizer.cpp:2100] Destroying container 
> 09e87380-00ab-4987-83c9-fa1c5d86717f in RUNNING state
> E1212 13:57:02.726495  2243 switchboard.cpp:861] Unexpected termination of 
> I/O switchboard server: 'IOSwitchboard' exited with signal: Killed for 
> container 09e87380-00ab-4987-83c9-fa1c5d86717f
> I1212 13:57:02.726563  2265 launcher.cpp:149] Asked to destroy container 
> 09e87380-00ab-4987-83c9-fa1c5d86717f
> E1212 13:57:02.783607  2228 switchboard.cpp:799] Failed to remove unix domain 
> socket file '/tmp/mesos-io-switchboard-b4af1c92-6633-44f3-9d35-e0e36edaf70a' 
> for container '09e87380-00ab-4987-83c9-fa1c5d86717f': No such file or 
> directory
> ../../mesos/src/tests/containerizer/io_switchboard_tests.cpp:661: Failure
> Value of: wait.get()->reasons().size() == 1
>   Actual: false
> Expected: true
> *** Aborted at 1481579822 (unix time) try "date -d @1481579822" if you are 
> using GNU date ***
> PC: @  0x1bf16d0 testing::UnitTest::AddTestPartResult()
> *** SIGSEGV (@0x0) received by PID 2211 (TID 0x7faed7d078c0) from PID 0; 
> stack trace: ***
> @ 0x7faecf855100 (unknown)
> @  0x1bf16d0 testing::UnitTest::AddTestPartResult()
> @  0x1be6247 testing::internal::AssertHelper::operator=()
> @  0x19ed751 
> mesos::internal::tests::IOSwitchboardTest_KillSwitchboardContainerDestroyed_Test::TestBody()
> @  0x1c0ed8c 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x1c09e74 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x1beb505 testing::Test::Run()
> @  0x1bebc88 testing::TestInfo::Run()
> @  0x1bec2ce testing::TestCase::Run()
> @  0x1bf2ba8 testing::internal::UnitTestImpl::RunAllTests()
> @  0x1c0f9b1 
> testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @  0x1c0a9f2 
> testing::internal::HandleExceptionsInMethodIfSupported<>()
> @  0x1bf18ee testing::UnitTest::Run()
> @  0x11bc9e3 RUN_ALL_TESTS()
> @  0x11bc599 main
> @ 0x7faece663b15 __libc_start_main
> @   0xa9c219 (unknown)
> {noformat}





[jira] [Commented] (MESOS-6240) Allow executor/agent communication over non-TCP/IP stream socket.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301677#comment-16301677
 ] 

Jie Yu commented on MESOS-6240:
---

Re-target for 1.6.0 as no progress has been made in a few months.

> Allow executor/agent communication over non-TCP/IP stream socket.
> -
>
> Key: MESOS-6240
> URL: https://issues.apache.org/jira/browse/MESOS-6240
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
> Environment: Linux and Windows
>Reporter: Avinash Sridharan
>Assignee: Benjamin Hindman
>Priority: Critical
>  Labels: mesosphere
>
> Currently, executor/agent communication happens specifically over TCP 
> sockets. This works fine in most cases, but for the `MesosContainerizer`, 
> when containers are running on CNI networks, this mode of communication 
> imposes constraints on the CNI network, since there now has to be 
> connectivity between the CNI network (on which the executor is running) 
> and the agent. Introducing paths from a CNI network to the underlying 
> agent at best creates headaches for operators and at worst introduces 
> serious security holes in the network, since it breaks the isolation 
> between the container CNI network and the host network (on which the 
> agent is running).
> In order to simplify/strengthen deployment of Mesos containers on CNI 
> networks we therefore need to move away from using TCP/IP sockets for 
> executor/agent communication. Since the executor and agent are guaranteed 
> to run on the same host, the above problems can be resolved if, for the 
> `MesosContainerizer`, we use UNIX domain sockets or named pipes instead of 
> TCP/IP sockets for the executor/agent communication.
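
To make the proposal concrete, the following sketch sends a message over an `AF_UNIX` stream socket, which needs no IP connectivity between the two endpoints. It is a POSIX-only illustration under stated assumptions: `roundTripOverUnixSocket` is a hypothetical name, and `socketpair()` within a single process stands in for the named socket file that a real agent would bind and a real executor would connect to.

```cpp
#include <cassert>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Sends `message` over one end of an AF_UNIX stream socket pair and reads
// it back from the other end. Returns an empty string on any failure.
// Intended for short messages that fit into the socket buffer.
std::string roundTripOverUnixSocket(const std::string& message) {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
    return "";
  }

  std::string received;
  ssize_t written = write(fds[0], message.data(), message.size());
  if (written == static_cast<ssize_t>(message.size())) {
    // Signal end-of-stream so that the read loop below terminates.
    shutdown(fds[0], SHUT_WR);
    char buffer[256];
    ssize_t n;
    while ((n = read(fds[1], buffer, sizeof(buffer))) > 0) {
      received.append(buffer, static_cast<size_t>(n));
    }
  }

  close(fds[0]);
  close(fds[1]);
  return received;
}
```

Because the socket lives in the host's kernel rather than on a network interface, no route between the container's CNI network and the host network is needed for the data path shown here.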





[jira] [Updated] (MESOS-6240) Allow executor/agent communication over non-TCP/IP stream socket.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-6240:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Allow executor/agent communication over non-TCP/IP stream socket.
> -
>
> Key: MESOS-6240
> URL: https://issues.apache.org/jira/browse/MESOS-6240
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
> Environment: Linux and Windows
>Reporter: Avinash Sridharan
>Assignee: Benjamin Hindman
>Priority: Critical
>  Labels: mesosphere
>
> Currently, executor/agent communication happens specifically over TCP 
> sockets. This works fine in most cases, but for the `MesosContainerizer`, 
> when containers are running on CNI networks, this mode of communication 
> imposes constraints on the CNI network, since there now has to be 
> connectivity between the CNI network (on which the executor is running) 
> and the agent. Introducing paths from a CNI network to the underlying 
> agent at best creates headaches for operators and at worst introduces 
> serious security holes in the network, since it breaks the isolation 
> between the container CNI network and the host network (on which the 
> agent is running).
> In order to simplify/strengthen deployment of Mesos containers on CNI 
> networks we therefore need to move away from using TCP/IP sockets for 
> executor/agent communication. Since the executor and agent are guaranteed 
> to run on the same host, the above problems can be resolved if, for the 
> `MesosContainerizer`, we use UNIX domain sockets or named pipes instead of 
> TCP/IP sockets for the executor/agent communication.





[jira] [Commented] (MESOS-7103) Container Attach/Exec Improvements

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301695#comment-16301695
 ] 

Jie Yu commented on MESOS-7103:
---

Re-target for 1.6.0 as no progress has been made recently.

> Container Attach/Exec Improvements
> --
>
> Key: MESOS-7103
> URL: https://issues.apache.org/jira/browse/MESOS-7103
> Project: Mesos
>  Issue Type: Epic
>Reporter: Kevin Klues
>  Labels: tech-debt
>
> Most of the core changes required to add "container exec" and "container 
> attach" support to Mesos landed in the 1.2 release. However, some features 
> (such as actually integrating this support into the CLI) haven't quite landed 
> yet.
> This Epic aims to capture the tickets that still need to be resolved before 
> we can consider work on this feature complete. It is targeted for the 1.3 
> release.





[jira] [Commented] (MESOS-7141) Support hook scripts to customize actions for container's lifecycle

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301696#comment-16301696
 ] 

Jie Yu commented on MESOS-7141:
---

Retarget to 1.6.0 as no progress has been made.

> Support hook scripts to customize actions for container's lifecycle
> ---
>
> Key: MESOS-7141
> URL: https://issues.apache.org/jira/browse/MESOS-7141
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jason Lai
>Assignee: Jason Lai
>  Labels: containerizer, hooks
>
> Inspired by [hooks | 
> https://github.com/opencontainers/runtime-spec/blob/master/config.md#hooks] 
> in [OCI's runtime spec | https://github.com/opencontainers/runtime-spec], it 
> would be great to have scripts hooked into the lifecycle of containers.
> The OCI doc has specified 3 stages for hooking:
> * Prestart
> * Poststart
> * Poststop
> We can consider having the 3 stages to start with.
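
The three stages listed above could be modeled roughly as follows. This is a sketch only: `Stage`, `HookRegistry`, and its methods are hypothetical names, and in-process callbacks stand in for the hook scripts that the ticket actually asks for.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// The three stages named in the ticket.
enum class Stage { Prestart, Poststart, Poststop };

// Hypothetical registry: in-process callbacks stand in for hook scripts.
class HookRegistry {
public:
  using Hook = std::function<void(const std::string& containerId)>;

  // Registers `hook` to run whenever `stage` is reached.
  void add(Stage stage, Hook hook) {
    hooks_[stage].push_back(std::move(hook));
  }

  // Runs every hook registered for `stage`, in registration order.
  void run(Stage stage, const std::string& containerId) const {
    auto it = hooks_.find(stage);
    if (it == hooks_.end()) {
      return;
    }
    for (const Hook& hook : it->second) {
      hook(containerId);
    }
  }

private:
  std::map<Stage, std::vector<Hook>> hooks_;
};
```

A containerizer would call `run(Stage::Prestart, id)` before starting the container process, `run(Stage::Poststart, id)` once it is running, and `run(Stage::Poststop, id)` after it has exited.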





[jira] [Assigned] (MESOS-7103) Container Attach/Exec Improvements

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu reassigned MESOS-7103:
-

Assignee: (was: Kevin Klues)

> Container Attach/Exec Improvements
> --
>
> Key: MESOS-7103
> URL: https://issues.apache.org/jira/browse/MESOS-7103
> Project: Mesos
>  Issue Type: Epic
>Reporter: Kevin Klues
>  Labels: tech-debt
>
> Most of the core changes required to add "container exec" and "container 
> attach" support to Mesos landed in the 1.2 release. However, some features 
> (such as actually integrating this support into the CLI) haven't quite landed 
> yet.
> This Epic aims to capture the tickets that still need to be resolved before 
> we can consider work on this feature complete. It is targeted for the 1.3 
> release.





[jira] [Updated] (MESOS-7103) Container Attach/Exec Improvements

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7103:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Container Attach/Exec Improvements
> --
>
> Key: MESOS-7103
> URL: https://issues.apache.org/jira/browse/MESOS-7103
> Project: Mesos
>  Issue Type: Epic
>Reporter: Kevin Klues
>  Labels: tech-debt
>
> Most of the core changes required to add "container exec" and "container 
> attach" support to Mesos landed in the 1.2 release. However, some features 
> (such as actually integrating this support into the CLI) haven't quite landed 
> yet.
> This Epic aims to capture the tickets that still need to be resolved before 
> we can consider work on this feature complete. It is targeted for the 1.3 
> release.





[jira] [Commented] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302032#comment-16302032
 ] 

Jie Yu commented on MESOS-8350:
---

Re-targeting this for 1.5.1, since this is unlikely to happen and we have a 
workaround for it.

> Resource provider-capable agents not correctly synchronizing checkpointed 
> agent resources on reregistration
> ---
>
> Key: MESOS-8350
> URL: https://issues.apache.org/jira/browse/MESOS-8350
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Critical
>
> For resource provider-capable agents the master does not re-send checkpointed 
> resources on agent reregistration; instead the checkpointed resources sent as 
> part of the {{ReregisterSlaveMessage}} should be used.
> This is not what happens in reality. If, e.g., checkpointing of an offer 
> operation fails and the agent fails over, the checkpointed resources would, as 
> expected, not be reflected in the agent, but would still be assumed in the 
> master.
> A workaround is to fail over the master, which would lead to the newly elected 
> master bootstrapping agent state from {{ReregisterSlaveMessage}}.





[jira] [Updated] (MESOS-8350) Resource provider-capable agents not correctly synchronizing checkpointed agent resources on reregistration

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8350:
--
Target Version/s: 1.5.1  (was: 1.5.0)

> Resource provider-capable agents not correctly synchronizing checkpointed 
> agent resources on reregistration
> ---
>
> Key: MESOS-8350
> URL: https://issues.apache.org/jira/browse/MESOS-8350
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Critical
>
> For resource provider-capable agents the master does not re-send checkpointed 
> resources on agent reregistration; instead the checkpointed resources sent as 
> part of the {{ReregisterSlaveMessage}} should be used.
> This is not what happens in reality. If, e.g., checkpointing of an offer 
> operation fails and the agent fails over, the checkpointed resources would, as 
> expected, not be reflected in the agent, but would still be assumed in the 
> master.
> A workaround is to fail over the master, which would lead to the newly elected 
> master bootstrapping agent state from {{ReregisterSlaveMessage}}.





[jira] [Commented] (MESOS-8337) Invalid state transition attempted when agent is lost.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302039#comment-16302039
 ] 

Jie Yu commented on MESOS-8337:
---

[~jpe...@apache.org] who is working on this issue? Is that a blocker for 1.5.0?

> Invalid state transition attempted when agent is lost.
> --
>
> Key: MESOS-8337
> URL: https://issues.apache.org/jira/browse/MESOS-8337
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: James Peach
>
> The change in MESOS-7215 can attempt to transition a task from {{FAILED}} to 
> {{LOST}} when removing a lost agent. This ends up triggering a {{CHECK}} that 
> was added in the same patch.
> {noformat}
> I1214 23:42:16.507931 22396 master.cpp:10155] Removing task 
> mobius-mloop-1512774555_3661616380-xxx with resources disk(allocated: *):200; 
> cpus(allocated: *):0.01; mem(allocated: *):200; ports(allocated: 
> *):[31068-31068, 31069-31069, 31072-31072] of framework 
> afcbfa05-7973-4ad3-8399-4153556a8fa9-3607 on agent 
> daceae53-448b-4349-8503-9dd8132a6828-S4 at slave(1)@17.147.52.220:5 
> (magent0006.xxx.com)
> F1214 23:42:16.507961 22396 master.hpp:2342] Check failed: task->state() == 
> TASK_UNREACHABLE || task->state() == TASK_LOST TASK_FAILED
> {noformat}
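
The failing `CHECK` suggests that tasks which are already terminal should be left alone when their agent is removed. The sketch below illustrates that idea only; it is not the fix that landed in Mesos. `TaskState` lists just a subset of states, and `isTerminal`, `stateAfterAgentLost`, and the `partitionAware` flag are hypothetical names introduced for the example.

```cpp
#include <cassert>

// Subset of task states relevant to the ticket; not the full Mesos enum.
enum class TaskState { RUNNING, FAILED, UNREACHABLE, LOST };

// Hypothetical helper: in this subset, FAILED and LOST are terminal.
bool isTerminal(TaskState state) {
  return state == TaskState::FAILED || state == TaskState::LOST;
}

// Returns the state a task should have after its agent is removed.
// Terminal tasks keep their state, so a FAILED task is never moved to LOST.
TaskState stateAfterAgentLost(TaskState current, bool partitionAware) {
  if (isTerminal(current)) {
    return current;
  }
  return partitionAware ? TaskState::UNREACHABLE : TaskState::LOST;
}
```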





[jira] [Updated] (MESOS-8321) Validate that offer operations contain only master-known resource provider resources

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8321:
--
Target Version/s: 1.5.1  (was: 1.5.0)

> Validate that offer operations contain only master-known resource provider 
> resources
> 
>
> Key: MESOS-8321
> URL: https://issues.apache.org/jira/browse/MESOS-8321
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benjamin Bannier
>
> We should update the master's offer operation validation to also check that 
> any offer operation only works with resources from known resource providers.





[jira] [Updated] (MESOS-8303) Add user doc for agent reconfiguration

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8303:
--
Target Version/s:   (was: 1.5.0)

> Add user doc for agent reconfiguration
> --
>
> Key: MESOS-8303
> URL: https://issues.apache.org/jira/browse/MESOS-8303
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Benno Evers
>






[jira] [Commented] (MESOS-8321) Validate that offer operations contain only master-known resource provider resources

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302041#comment-16302041
 ] 

Jie Yu commented on MESOS-8321:
---

Re-target for 1.5.1 given RP related feature is experimental

> Validate that offer operations contain only master-known resource provider 
> resources
> 
>
> Key: MESOS-8321
> URL: https://issues.apache.org/jira/browse/MESOS-8321
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benjamin Bannier
>
> We should update the master's offer operation validation to also check that 
> any offer operation only works with resources from known resource providers.





[jira] [Commented] (MESOS-8303) Add user doc for agent reconfiguration

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302042#comment-16302042
 ] 

Jie Yu commented on MESOS-8303:
---

Remove the target version 1.5.0 to unblock release. Please add doc asap!

> Add user doc for agent reconfiguration
> --
>
> Key: MESOS-8303
> URL: https://issues.apache.org/jira/browse/MESOS-8303
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Benno Evers
>






[jira] [Commented] (MESOS-8219) Validate that any offer operation is only applied on resources from a single provider

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302050#comment-16302050
 ] 

Jie Yu commented on MESOS-8219:
---

Re-target this for 1.5.1 given RP related features are experimental

> Validate that any offer operation is only applied on resources from a single 
> provider
> -
>
> Key: MESOS-8219
> URL: https://issues.apache.org/jira/browse/MESOS-8219
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Bannier
>
> Offer operations can only be applied to resources from one single resource 
> provider. A number of places in the implementation assume that the provider 
> ID obtained from any {{Resource}} in an offer operation is equivalent to the 
> one from any other resource. We should update the master to validate that 
> invariant and reject malformed operations.
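
The invariant can be checked with a single pass over the resources of an operation. The sketch below assumes a simplified model: the `Resource` struct with a string `providerId` and the `validateSingleProvider` function are hypothetical stand-ins, not the master's actual validation code or the protobuf type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a resource; an empty `providerId` means the
// resource belongs to the agent itself rather than to a resource provider.
struct Resource {
  std::string name;
  std::string providerId;
};

// Returns an empty string if all resources share one provider ID,
// otherwise a human-readable validation error.
std::string validateSingleProvider(const std::vector<Resource>& resources) {
  for (const Resource& resource : resources) {
    if (resource.providerId != resources.front().providerId) {
      return "Operation mixes resources from provider '" +
             resources.front().providerId + "' and '" +
             resource.providerId + "'";
    }
  }
  return "";
}
```

Comparing every resource against the first one is sufficient, because equality with a common element implies that all provider IDs are equal to each other.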





[jira] [Updated] (MESOS-8219) Validate that any offer operation is only applied on resources from a single provider

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8219:
--
Target Version/s: 1.5.1  (was: 1.5.0)

> Validate that any offer operation is only applied on resources from a single 
> provider
> -
>
> Key: MESOS-8219
> URL: https://issues.apache.org/jira/browse/MESOS-8219
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Bannier
>
> Offer operations can only be applied to resources from one single resource 
> provider. A number of places in the implementation assume that the provider 
> ID obtained from any {{Resource}} in an offer operation is equivalent to the 
> one from any other resource. We should update the master to validate that 
> invariant and reject malformed operations.





[jira] [Commented] (MESOS-8221) Use protobuf reflection to simplify downgrading of resources.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302048#comment-16302048
 ] 

Jie Yu commented on MESOS-8221:
---

[~bmahler], [~mcypark], is this a blocker for 1.5.0?

> Use protobuf reflection to simplify downgrading of resources.
> -
>
> Key: MESOS-8221
> URL: https://issues.apache.org/jira/browse/MESOS-8221
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Michael Park
>Assignee: Michael Park
>
> We currently have a {{downgradeResources}} function which is called on every 
> {{repeated Resource}} field in every message that we checkpoint. We should 
> leverage protobuf reflection to automatically downgrade any instances of 
> {{Resource}} within any protobuf message.





[jira] [Commented] (MESOS-8247) Executor registered message is lost

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302046#comment-16302046
 ] 

Jie Yu commented on MESOS-8247:
---

[~abudnik], [~alexr], is this a blocker for 1.5.0?

> Executor registered message is lost
> ---
>
> Key: MESOS-8247
> URL: https://issues.apache.org/jira/browse/MESOS-8247
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrei Budnik
>Assignee: Andrei Budnik
>
> h3. Brief description of successful agent-executor communication.
> The executor sends a `RegisterExecutorMessage` to the agent during its 
> initialization step. The agent sends an `ExecutorRegisteredMessage` in 
> response from its `registerExecutor()` method. Whenever the executor 
> receives `ExecutorRegisteredMessage`, it prints `Executor registered on 
> agent...` to its stderr log.
> h3. Problem description.
> The agent launches the built-in docker executor, which is stuck in `STAGING` 
> state.
> stderr logs of the docker executor:
> {code}
> I1114 23:03:17.919090 14322 exec.cpp:162] Version: 1.2.3
> {code}
> It doesn't contain a message like `Executor registered on agent...`. At the 
> same time, the agent received `RegisterExecutorMessage` and sent a `runTask` 
> message to the executor.
> The stdout log consists of the same repeating message:
> {code}
> Received killTask for task ...
> {code}
> Also, the docker executor process has no child processes.
> Currently, executor [doesn't 
> attempt|https://github.com/apache/mesos/blob/2a253093ecdc7d743c9c0874d6e01b68f6a813e4/src/exec/exec.cpp#L320]
>  to launch a task if it is not registered at the agent, while [task 
> killing|https://github.com/apache/mesos/blob/2a253093ecdc7d743c9c0874d6e01b68f6a813e4/src/exec/exec.cpp#L343]
>  doesn't have such a check.
> It looks like `ExecutorRegisteredMessage` has been lost.





[jira] [Updated] (MESOS-8041) Add a document for `cgroups/blkio` isolation

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8041:
--
Target Version/s:   (was: 1.5.0)

> Add a document for `cgroups/blkio` isolation
> 
>
> Key: MESOS-8041
> URL: https://issues.apache.org/jira/browse/MESOS-8041
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Qian Zhang
>Assignee: Jason Lai
>
> Now that the Mesos agent supports {{cgroups/blkio}} isolation for 
> collecting blkio statistics, we need to add a document for it.





[jira] [Updated] (MESOS-8025) Update the master field in the new CLI config to accept a URL instead of an

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8025:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Update the master field in the new CLI config to accept a URL instead of an 
> 
> -
>
> Key: MESOS-8025
> URL: https://issues.apache.org/jira/browse/MESOS-8025
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli
> Environment: This will be useful in cases where the master is behind 
> a proxy or when the master is sitting directly on port 80.
>Reporter: Kevin Klues
>Assignee: Armand Grillet
>






[jira] [Commented] (MESOS-8025) Update the master field in the new CLI config to accept a URL instead of an

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302056#comment-16302056
 ] 

Jie Yu commented on MESOS-8025:
---

Retarget this as no progress has been made recently

> Update the master field in the new CLI config to accept a URL instead of an 
> 
> -
>
> Key: MESOS-8025
> URL: https://issues.apache.org/jira/browse/MESOS-8025
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli
> Environment: This will be useful in cases where the master is behind 
> a proxy or when the master is sitting directly on port 80.
>Reporter: Kevin Klues
>Assignee: Armand Grillet
>






[jira] [Updated] (MESOS-7974) Accept "application/recordio" type is rejected for master operator API SUBSCRIBE call

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7974:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Accept "application/recordio" type is rejected for master operator API 
> SUBSCRIBE call
> -
>
> Key: MESOS-7974
> URL: https://issues.apache.org/jira/browse/MESOS-7974
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: James DeFelice
>  Labels: mesosphere
>
> The agent operator API supports "application/recordio" for things like 
> attach-container-output, which streams objects back to the caller. I expected 
> the master operator API SUBSCRIBE call to work the same way, w/ 
> Accept/Content-Type headers for "recordio" and 
> Message-Accept/Message-Content-Type headers for json (or protobuf). This was 
> not the case.
> Looking again at the master operator API documentation, the SUBSCRIBE docs 
> illustrate the usage of Accept and Content-Type headers for the 
> "application/json" type, not a "recordio" type. So my experience, as per the 
> docs, seems expected. However, this is counter-intuitive since the whole point 
> of adding the new Message-prefixed headers was to help callers consistently 
> request (and differentiate) streaming responses from non-streaming responses 
> in the v1 API.
> Please fix the master operator API implementation to also support the 
> Message-prefixed headers w/ Accept/Content-Type set to "recordio".
> Observed on Ubuntu w/ Mesos package version 1.2.1-2.0.1
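
The header combination requested above, and the framing a client would then have to decode, can be sketched in Python as follows. This illustrates the requested behavior, not what the master did at the time of the report. The helper name `decode_recordio` is hypothetical, and the sketch assumes the RecordIO framing in which each record is preceded by its length in bytes, written as a decimal number followed by a newline.

```python
# Request headers for the streaming SUBSCRIBE call as the reporter expected
# them to work: RecordIO for the stream, JSON for each message inside it.
SUBSCRIBE_HEADERS = {
    "Accept": "application/recordio",
    "Message-Accept": "application/json",
}


def decode_recordio(data):
    """Split a bytes buffer of '<length>\\n<record>' frames into records."""
    records = []
    pos = 0
    while pos < len(data):
        newline = data.index(b"\n", pos)  # end of the length prefix
        length = int(data[pos:newline])   # record size in bytes
        start = newline + 1
        end = start + length
        if end > len(data):
            raise ValueError("truncated record")
        records.append(data[start:end])
        pos = end
    return records
```

Each decoded record would then be parsed according to the Message-Accept type, for example as one JSON event.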





[jira] [Updated] (MESOS-8022) Add tests proving the HTTP authenticatee modularize.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-8022:
--
Target Version/s:   (was: 1.5.0)

> Add tests proving the HTTP authenticatee modularize.
> 
>
> Key: MESOS-8022
> URL: https://issues.apache.org/jira/browse/MESOS-8022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Till Toenshoff
>  Labels: mesosphere
>






[jira] [Commented] (MESOS-8041) Add a document for `cgroups/blkio` isolation

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302055#comment-16302055
 ] 

Jie Yu commented on MESOS-8041:
---

Remove the target version. Can this be closed?

> Add a document for `cgroups/blkio` isolation
> 
>
> Key: MESOS-8041
> URL: https://issues.apache.org/jira/browse/MESOS-8041
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Qian Zhang
>Assignee: Jason Lai
>
> Now that the Mesos agent supports {{cgroups/blkio}} isolation for 
> collecting blkio statistics, we need to add a document for it.





[jira] [Commented] (MESOS-7974) Accept "application/recordio" type is rejected for master operator API SUBSCRIBE call

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302058#comment-16302058
 ] 

Jie Yu commented on MESOS-7974:
---

Re-target this to 1.6.0 as no progress has been made

[~vinodkone], can you take a look at this?

> Accept "application/recordio" type is rejected for master operator API 
> SUBSCRIBE call
> -
>
> Key: MESOS-7974
> URL: https://issues.apache.org/jira/browse/MESOS-7974
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: James DeFelice
>  Labels: mesosphere
>
> The agent operator API supports "application/recordio" for things like 
> attach-container-output, which streams objects back to the caller. I expected 
> the master operator API SUBSCRIBE call to work the same way, w/ 
> Accept/Content-Type headers for "recordio" and 
> Message-Accept/Message-Content-Type headers for json (or protobuf). This was 
> not the case.
> Looking again at the master operator API documentation, the SUBSCRIBE docs 
> illustrate the usage of Accept and Content-Type headers for the 
> "application/json" type, not a "recordio" type. So my experience, as per the 
> docs, seems expected. However, this is counter-intuitive since the whole point 
> of adding the new Message-prefixed headers was to help callers consistently 
> request (and differentiate) streaming responses from non-streaming responses 
> in the v1 API.
> Please fix the master operator API implementation to also support the 
> Message-prefixed headers w/ Accept/Content-Type set to "recordio".
> Observed on Ubuntu w/ Mesos package version 1.2.1-2.0.1





[jira] [Commented] (MESOS-8022) Add tests proving the HTTP authenticatee modularize.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302057#comment-16302057
 ] 

Jie Yu commented on MESOS-8022:
---

Remove target version as tests shouldn't block a release.

> Add tests proving the HTTP authenticatee modularize.
> 
>
> Key: MESOS-8022
> URL: https://issues.apache.org/jira/browse/MESOS-8022
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Till Toenshoff
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-7967) Make `mesos-execute` work with old-style resources

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7967:
--
Target Version/s:   (was: 1.4.1, 1.5.0)

> Make `mesos-execute` work with old-style resources
> --
>
> Key: MESOS-7967
> URL: https://issues.apache.org/jira/browse/MESOS-7967
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli
>Reporter: Michael Park
>
> {{mesos-execute}} should be updated to be able to handle
> "pre-reservation-refinement" resource format.
> For reservation refinement, a new resource format was introduced.
> The master and agent have been carefully updated to be able to handle
> pre/post reservation-refinement resource formats, whereas the example
> frameworks and {{mesos-execute}} were updated such that they require
> the new resource format. While the example frameworks are probably fine
> being updated to use the new format, {{mesos-execute}} is used as a
> developer tool, and as such we should update it to be more robust in its
> handling of resource formats.
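
As a rough illustration of what handling both formats could involve, the Python sketch below normalizes a resource given as a plain dictionary. Everything here is an assumption made for illustration: the helper name `to_post_refinement` is hypothetical, and the field names (`role` and `reservation` in the old format, a `reservations` list with a `type` of STATIC or DYNAMIC in the new one) reflect my understanding of the two formats, not code taken from {{mesos-execute}}.

```python
def to_post_refinement(resource):
    """Return a copy of `resource` that uses a `reservations` list.

    Assumed field names: the old format carries `role` plus an optional
    `reservation`; the new format carries a `reservations` list.
    """
    if "reservations" in resource:
        return dict(resource)  # already in the new format

    upgraded = {k: v for k, v in resource.items()
                if k not in ("role", "reservation")}
    role = resource.get("role", "*")
    if role == "*":
        return upgraded  # unreserved: no reservations at all

    entry = {"role": role}
    if "reservation" in resource:
        entry["type"] = "DYNAMIC"
        entry.update(resource["reservation"])
    else:
        entry["type"] = "STATIC"
    upgraded["reservations"] = [entry]
    return upgraded
```

Normalizing at the input boundary would let the rest of a tool work with a single format, whichever one the user supplied.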





[jira] [Commented] (MESOS-7967) Make `mesos-execute` work with old-style resources

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302061#comment-16302061
 ] 

Jie Yu commented on MESOS-7967:
---

[~mcypark] any plan to work on this? I removed the target versions for now.

> Make `mesos-execute` work with old-style resources
> --
>
> Key: MESOS-7967
> URL: https://issues.apache.org/jira/browse/MESOS-7967
> Project: Mesos
>  Issue Type: Improvement
>  Components: cli
>Reporter: Michael Park
>
> {{mesos-execute}} should be updated to be able to handle
> "pre-reservation-refinement" resource format.
> For reservation refinement, a new resource format was introduced.
> The master and agent have been carefully updated to be able to handle
> pre/post reservation-refinement resource formats, whereas the example
> frameworks and {{mesos-execute}} were updated such that they require
> the new resource format. While the example frameworks are probably fine
> being updated to use the new format, {{mesos-execute}} is used as a
> developer tool, and as such we should update it to be more robust in its
> handling of resource formats.





[jira] [Updated] (MESOS-7776) Document `MESOS_CONTAINER_IP`

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7776:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Document `MESOS_CONTAINER_IP` 
> --
>
> Key: MESOS-7776
> URL: https://issues.apache.org/jira/browse/MESOS-7776
> Project: Mesos
>  Issue Type: Documentation
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>
> We introduced `MESOS_CONTAINER_IP` to inform tasks launched by the 
> default-executor about their container IP. This was done primarily to break 
> the dependency of the containers on `LIBPROCESS_IP` to learn their IP 
> addresses, which was misleading. 
> This change needs to be documented.
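
From the point of view of a task, the change described above amounts to reading one environment variable instead of another. A minimal Python sketch follows; the helper name `container_ip` is hypothetical, and the `environ` parameter exists only so the lookup can be exercised without touching the real process environment.

```python
import os


def container_ip(environ=None):
    """Return the container IP advertised to the task, or None if unset."""
    env = os.environ if environ is None else environ
    # Deliberately ignore LIBPROCESS_IP: per the description above, relying
    # on it to learn the container's address was misleading.
    return env.get("MESOS_CONTAINER_IP")
```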





[jira] [Commented] (MESOS-7949) Upgrade Mesos to C++14.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302066#comment-16302066
 ] 

Jie Yu commented on MESOS-7949:
---

Retarget this to 1.6.0

> Upgrade Mesos to C++14.
> ---
>
> Key: MESOS-7949
> URL: https://issues.apache.org/jira/browse/MESOS-7949
> Project: Mesos
>  Issue Type: Epic
>Reporter: Michael Park
>
> Upgrading Mesos to C++14 will give us features such as
> - Generic lambdas
> - New lambda captures (Proper move captures)
> - SFINAE result_of (We can remove {{stout/result_of.hpp}})
> - Variable templates
> - Relaxed {{constexpr}} functions
> - Simple utilities such as {{std::make_unique}}
> - Metaprogramming facilities such as {{decay_t}}, {{index_sequence}}





[jira] [Commented] (MESOS-7958) The example framework `test-framework` is broken.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302063#comment-16302063
 ] 

Jie Yu commented on MESOS-7958:
---

Removed the target version. Is this still an issue? Can we close?

> The example framework `test-framework` is broken.
> -
>
> Key: MESOS-7958
> URL: https://issues.apache.org/jira/browse/MESOS-7958
> Project: Mesos
>  Issue Type: Bug
>  Components: framework
>Affects Versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 
> 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.4.1
>Reporter: Michael Park
> Attachments: screenshot-1.png
>
>
> The {{test-framework}} example framework does not work.
> Launching a cluster like so:
> {code}
> MESOS_RESOURCES="cpus:32;mem:512;disk:1024" MESOS_REGISTRY="in_memory" \
>   ./bin/mesos-local.sh --num_slaves=1 --ip=127.0.0.1 --port=4040 \
>   --work_dir=$HOME/mesos-local
> {code}
> and trying to launch the {{test-framework}} like so:
> {code}
> ./src/test-framework --master=127.0.0.1:4040
> {code}
> fails with the following error:
> {code}
> /home/mpark/projects/mesos/build/src/.libs/test-executor: error while loading 
> shared libraries: libmesos-1.5.0.so: cannot open shared object file: No such 
> file or directory
> {code}
> It seems that {{test-executor}} cannot load {{libmesos.so}} correctly.





[jira] [Updated] (MESOS-7949) Upgrade Mesos to C++14.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7949:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Upgrade Mesos to C++14.
> ---
>
> Key: MESOS-7949
> URL: https://issues.apache.org/jira/browse/MESOS-7949
> Project: Mesos
>  Issue Type: Epic
>Reporter: Michael Park
>
> Upgrading Mesos to C++14 will give us features such as
> - Generic lambdas
> - New lambda captures (Proper move captures)
> - SFINAE result_of (We can remove {{stout/result_of.hpp}})
> - Variable templates
> - Relaxed {{constexpr}} functions
> - Simple utilities such as {{std::make_unique}}
> - Metaprogramming facilities such as {{decay_t}}, {{index_sequence}}





[jira] [Commented] (MESOS-7950) Update autotools and CMake to build in C++14 mode.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302065#comment-16302065
 ] 

Jie Yu commented on MESOS-7950:
---

Re-target this to 1.6.0.

> Update autotools and CMake to build in C++14 mode.
> --
>
> Key: MESOS-7950
> URL: https://issues.apache.org/jira/browse/MESOS-7950
> Project: Mesos
>  Issue Type: Task
>  Components: build
>Reporter: Michael Park
>
> Update the {{configure.ac}} for autotools, and 
> {{cmake/CompilationConfigure.cmake}} for CMake to build in C++14 mode.





[jira] [Updated] (MESOS-7958) The example framework `test-framework` is broken.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7958:
--
Target Version/s:   (was: 1.5.0)

> The example framework `test-framework` is broken.
> -
>
> Key: MESOS-7958
> URL: https://issues.apache.org/jira/browse/MESOS-7958
> Project: Mesos
>  Issue Type: Bug
>  Components: framework
>Affects Versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 
> 1.3.0, 1.3.1, 1.3.2, 1.4.0, 1.4.1
>Reporter: Michael Park
> Attachments: screenshot-1.png
>
>
> The {{test-framework}} example framework does not work.
> Launching a cluster like so:
> {code}
> MESOS_RESOURCES="cpus:32;mem:512;disk:1024" MESOS_REGISTRY="in_memory" \
>   ./bin/mesos-local.sh --num_slaves=1 --ip=127.0.0.1 --port=4040 \
>   --work_dir=$HOME/mesos-local
> {code}
> and trying to launch the {{test-framework}} like so:
> {code}
> ./src/test-framework --master=127.0.0.1:4040
> {code}
> fails with the following error:
> {code}
> /home/mpark/projects/mesos/build/src/.libs/test-executor: error while loading 
> shared libraries: libmesos-1.5.0.so: cannot open shared object file: No such 
> file or directory
> {code}
> It seems that {{test-executor}} cannot load {{libmesos.so}} correctly.





[jira] [Updated] (MESOS-7950) Update autotools and CMake to build in C++14 mode.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7950:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Update autotools and CMake to build in C++14 mode.
> --
>
> Key: MESOS-7950
> URL: https://issues.apache.org/jira/browse/MESOS-7950
> Project: Mesos
>  Issue Type: Task
>  Components: build
>Reporter: Michael Park
>
> Update the {{configure.ac}} for autotools, and 
> {{cmake/CompilationConfigure.cmake}} for CMake to build in C++14 mode.





[jira] [Commented] (MESOS-7776) Document `MESOS_CONTAINER_IP`

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302068#comment-16302068
 ] 

Jie Yu commented on MESOS-7776:
---

Retarget this to 1.6.0. [~avinash.mesos], do you still plan to work on this?

> Document `MESOS_CONTAINER_IP` 
> --
>
> Key: MESOS-7776
> URL: https://issues.apache.org/jira/browse/MESOS-7776
> Project: Mesos
>  Issue Type: Documentation
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>
> We introduced `MESOS_CONTAINER_IP` to inform tasks launched by the 
> default-executor about their container IP. This was done primarily to break 
> the dependency of the containers on `LIBPROCESS_IP` to learn their IP 
> addresses, which was misleading. 
> This change needs to be documented.





[jira] [Commented] (MESOS-7903) Include in the DefaultExecutor logs the output of timed out checks

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302067#comment-16302067
 ] 

Jie Yu commented on MESOS-7903:
---

Retarget this to 1.6.0 due to inactivity

> Include in the DefaultExecutor logs the output of timed out checks
> --
>
> Key: MESOS-7903
> URL: https://issues.apache.org/jira/browse/MESOS-7903
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Gastón Kleiman
>Priority: Minor
>  Labels: check, default-executor
>
> Once the patches for https://issues.apache.org/jira/browse/MESOS-7861 land, 
> the output of successful and failed checks will be included in the 
> DefaultExecutor logs, but the output of timed out checks won't be included.
> Right now the checker process sends the {{LAUNCH_NESTED_CONTAINER_SESSION}} 
> requests using {{streamed=false}}. Libprocess will then convert the streaming 
> response into a body (non-streamed) response, completing the future returned 
> by {{Connection::send()}} only once the response has been fully received. The 
> checker will then read the whole process output from the response's body and 
> log it.
> However, when a check times out, the checker will close the connection before 
> the full response is received, so the future returned by 
> {{Connection::send()}} will fail and the checker won't have access to 
> the response.
> In order to log the output of timed out checks, we will probably need to make 
> the checker send the launch request with {{streamed=true}}, and then make it 
> read the check output from the pipe of the streamed response.
> If we do that, we should probably turn the {{Future> 
> getProcessIOData(...)}} method from {{api_tests.cpp}} into a helper method 
> and use it in {{checker_process.cpp}}.
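
The difference between the two modes can be shown with a toy model in Python. This is not libprocess code. Both function names are hypothetical, and a response is modeled simply as a list of `(arrival_time, text)` chunks compared against a timeout.

```python
def buffered_output(chunks, timeout):
    """Non-streamed model: the output is available only if every chunk
    arrives before the timeout; otherwise nothing is returned."""
    if any(arrival > timeout for arrival, _ in chunks):
        return None  # connection closed early, so the future fails
    return "".join(data for _, data in chunks)


def streamed_output(chunks, timeout):
    """Streamed model: every chunk read before the timeout is kept."""
    return "".join(data for arrival, data in chunks if arrival <= timeout)
```

With chunks arriving at 1, 2 and 10 seconds and a timeout of 5 seconds, the buffered model yields nothing to log, while the streamed model still holds the first two chunks.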





[jira] [Updated] (MESOS-7903) Include in the DefaultExecutor logs the output of timed out checks

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7903:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Include in the DefaultExecutor logs the output of timed out checks
> --
>
> Key: MESOS-7903
> URL: https://issues.apache.org/jira/browse/MESOS-7903
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Gastón Kleiman
>Priority: Minor
>  Labels: check, default-executor
>
> Once the patches for https://issues.apache.org/jira/browse/MESOS-7861 land, 
> the output of successful and failed checks will be included in the 
> DefaultExecutor logs, but the output of timed out checks won't be included.
> Right now the checker process sends the {{LAUNCH_NESTED_CONTAINER_SESSION}} 
> requests using {{streamed=false}}. Libprocess will then convert the streaming 
> response into a body (non-streamed) response, completing the future returned 
> by {{Connection::send()}} only once the response has been fully received. The 
> checker will then read the whole process output from the response's body and 
> log it.
> However, when a check times out, the checker will close the connection before 
> the full response is received, so the future returned by 
> {{Connection::send()}} will fail and the checker won't have access to 
> the response.
> In order to log the output of timed out checks, we will probably need to make 
> the checker send the launch request with {{streamed=true}}, and then make it 
> read the check output from the pipe of the streamed response.
> If we do that, we should probably turn the {{Future> 
> getProcessIOData(...)}} method from {{api_tests.cpp}} into a helper method 
> and use it in {{checker_process.cpp}}.





[jira] [Updated] (MESOS-7691) Support local enabled cgroups subsystems automatically.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7691:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Support local enabled cgroups subsystems automatically.
> ---
>
> Key: MESOS-7691
> URL: https://issues.apache.org/jira/browse/MESOS-7691
> Project: Mesos
>  Issue Type: Improvement
>  Components: cgroups
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: cgroups
>
> Currently, each cgroup subsystem needs to be turned on as an isolator, e.g., 
> "cgroups/blkio". Ideally, Mesos should be able to detect all locally enabled 
> cgroup subsystems and turn them on automatically (we call this auto cgroups).
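
One way the detection step could look is sketched below in Python, based on the four-column layout of `/proc/cgroups` on Linux (subsystem name, hierarchy, number of cgroups, enabled flag). The function name `enabled_isolators` is hypothetical. The mapping from subsystem names to isolator names is an assumption: apart from the alias for the memory subsystem, names are taken to map one-to-one, which may not match the actual set of Mesos isolators.

```python
# The memory subsystem is exposed as the "cgroups/mem" isolator; other
# names are assumed to map one-to-one for the purpose of this sketch.
ISOLATOR_ALIASES = {"memory": "mem"}


def enabled_isolators(proc_cgroups_text):
    """Return isolator names for subsystems marked enabled in the text
    of /proc/cgroups (columns: name, hierarchy, num_cgroups, enabled)."""
    isolators = []
    for line in proc_cgroups_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip the header and blank lines
        name, _hierarchy, _num_cgroups, enabled = line.split()
        if enabled == "1":
            isolators.append("cgroups/" + ISOLATOR_ALIASES.get(name, name))
    return isolators
```

The function takes the file contents as a string, so it can be tried on a captured sample instead of the live system.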





[jira] [Commented] (MESOS-7691) Support local enabled cgroups subsystems automatically.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302071#comment-16302071
 ] 

Jie Yu commented on MESOS-7691:
---

Re-target this to 1.6.0.

> Support local enabled cgroups subsystems automatically.
> ---
>
> Key: MESOS-7691
> URL: https://issues.apache.org/jira/browse/MESOS-7691
> Project: Mesos
>  Issue Type: Improvement
>  Components: cgroups
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: cgroups
>
> Currently, each cgroup subsystem needs to be turned on as an isolator, e.g., 
> "cgroups/blkio". Ideally, Mesos should be able to detect all locally enabled 
> cgroup subsystems and turn them on automatically (we call this auto cgroups).





[jira] [Commented] (MESOS-7705) Reconsider restricting the resource format for frameworks.

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302070#comment-16302070
 ] 

Jie Yu commented on MESOS-7705:
---

[~bmahler], [~mcypark], is this a blocker for 1.5.0? If not, can you retarget?

> Reconsider restricting the resource format for frameworks.
> --
>
> Key: MESOS-7705
> URL: https://issues.apache.org/jira/browse/MESOS-7705
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Michael Park
>Assignee: Michael Park
>
> We output the "endpoint" format through the endpoints
> for backward compatibility of external tooling. A framework should be
> able to use the result of an endpoint and pass it back to Mesos,
> since the result was produced by Mesos. This is especially applicable
> to the V1 API. We also allow the "pre-reservation-refinement" format
> because existing "resources files" are written in that format, and
> they should still be usable without modification.
> This is probably too flexible, however, since a framework without
> a RESERVATION_REFINEMENT capability could make refined reservations
> using the "post-reservation-refinement" format, although they wouldn't be
> offered such resources. It still seems undesirable if anyone were to
> run into it, and we should consider adding sensible restrictions.





[jira] [Commented] (MESOS-8357) Example frameworks have an inconsistent UX.

2017-12-22 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301905#comment-16301905
 ] 

Till Toenshoff commented on MESOS-8357:
---

Our example frameworks are considered part of the infrastructure used for 
testing Mesos - but they are not part of the Mesos core distribution. My 
suggestion would be to consistently use the prefix {{TEST_}}, but not 
{{MESOS_}} and not {{DEFAULT_}}.

> Example frameworks have an inconsistent UX.
> ---
>
> Key: MESOS-8357
> URL: https://issues.apache.org/jira/browse/MESOS-8357
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>
> Our example frameworks are a bit inconsistent when it comes to specifying 
> things like the framework principal / secret, etc. 
> Many of these examples have great value in testing a Mesos cluster. Unifying 
> the parameterization would improve the user experience when testing Mesos.
> {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling 
> / disabling authentication. {{load_generator_framework}} as one example 
> however uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials 
> themselves are most commonly expected in environment variables 
> {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}} while in some cases we chose to 
> use {{MESOS_PRINCIPAL}}, {{MESOS_SECRET}} instead.
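
A unified lookup could be sketched in Python as follows. The variable names `TEST_PRINCIPAL` and `TEST_SECRET` are hypothetical: they only apply the {{TEST_}} prefix suggested in the comment above to the existing names. The fallback order and the helper names `first_set` and `credentials` are likewise assumptions for illustration, not an agreed design.

```python
import os

# Candidate variable names, most preferred first. The TEST_* names are
# hypothetical; the DEFAULT_* and MESOS_* names are the ones in use today.
PRINCIPAL_VARS = ("TEST_PRINCIPAL", "DEFAULT_PRINCIPAL", "MESOS_PRINCIPAL")
SECRET_VARS = ("TEST_SECRET", "DEFAULT_SECRET", "MESOS_SECRET")


def first_set(names, environ=None):
    """Return the value of the first variable in `names` that is set."""
    env = os.environ if environ is None else environ
    for name in names:
        if name in env:
            return env[name]
    return None


def credentials(environ=None):
    """Resolve (principal, secret) using the fallback order above."""
    return first_set(PRINCIPAL_VARS, environ), first_set(SECRET_VARS, environ)
```

Accepting the old names as fallbacks would keep existing test setups working while the examples converge on one prefix.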





[jira] [Updated] (MESOS-8357) Example frameworks have an inconsistent UX.

2017-12-22 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-8357:
--
Affects Version/s: 1.5.0

> Example frameworks have an inconsistent UX.
> ---
>
> Key: MESOS-8357
> URL: https://issues.apache.org/jira/browse/MESOS-8357
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>Priority: Minor
>
> Our example frameworks are a bit inconsistent when it comes to specifying 
> things like the framework principal / secret, etc. 
> Many of these examples have great value in testing a Mesos cluster. Unifying 
> the parameterization would improve the user experience when testing Mesos.
> {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling 
> / disabling authentication. {{load_generator_framework}} as one example 
> however uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials 
> themselves are most commonly expected in environment variables 
> {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}} while in some cases we chose to 
> use {{MESOS_PRINCIPAL}}, {{MESOS_SECRET}} instead.





[jira] [Updated] (MESOS-8357) Example frameworks have an inconsistent UX.

2017-12-22 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-8357:
--
Priority: Minor  (was: Major)

> Example frameworks have an inconsistent UX.
> ---
>
> Key: MESOS-8357
> URL: https://issues.apache.org/jira/browse/MESOS-8357
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.5.0
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>Priority: Minor
>
> Our example frameworks are a bit inconsistent when it comes to specifying 
> things like the framework principal / secret, etc. 
> Many of these examples have great value in testing a Mesos cluster. Unifying 
> the parameterization would improve the user experience when testing Mesos.
> {{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling 
> / disabling authentication. {{load_generator_framework}} as one example 
> however uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials 
> themselves are most commonly expected in environment variables 
> {{DEFAULT_PRINCIPAL}} and {{DEFAULT_SECRET}} while in some cases we chose to 
> use {{MESOS_PRINCIPAL}}, {{MESOS_SECRET}} instead.





[jira] [Commented] (MESOS-7007) filesystem/shared and --default_container_info broken since 1.1

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301881#comment-16301881
 ] 

Jie Yu commented on MESOS-7007:
---

I made an attempt to clean this up. I used the patch from [~jpepy] 
(https://reviews.apache.org/r/63598/), and added a followup patch 
(https://reviews.apache.org/r/64811/)

> filesystem/shared and --default_container_info broken since 1.1
> ---
>
> Key: MESOS-7007
> URL: https://issues.apache.org/jira/browse/MESOS-7007
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Pierre Cheynier
>Assignee: Chun-Hung Hsiao
>  Labels: storage
>
> I am facing this issue, which prevents me from upgrading to 1.1.0 (the change 
> was therefore introduced in this version):
> I'm using default_container_info to mount a /tmp volume in the container's 
> mount namespace from its current sandbox, meaning that each container has a 
> dedicated /tmp, thanks to the {{filesystem/shared}} isolator.
> I noticed through our automation pipeline that integration tests were failing 
> and found that this is because the contents of /tmp (the one from the host!) 
> are trashed each time a container is created.
> Here is my setup: 
> * {{--isolation='cgroups/cpu,cgroups/mem,namespaces/pid,*disk/du,filesystem/shared,filesystem/linux*,docker/runtime'}}
> * {{--default_container_info='{"type":"MESOS","volumes":[{"host_path":"tmp","container_path":"/tmp","mode":"RW"}]}'}}
> I discovered this issue in the early days of 1.1 (end of Nov, spoke with 
> someone on Slack), but unfortunately had no time to dig into the symptoms a 
> bit more.
> I found nothing interesting even using GLOG_v=3.
> Maybe a misuse of isolators triggers this issue? If that is the 
> case, then at least a documentation update should be done.
> Let me know if more information is needed.
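
For reference, the value passed to `--default_container_info` in the setup above can be built programmatically rather than typed by hand, which avoids quoting mistakes. The Python sketch below reproduces the JSON from the report with the JIRA backslash escaping removed. The variable names are hypothetical and the sketch says nothing about the cause of the bug.

```python
import json
import shlex

container_info = {
    "type": "MESOS",
    "volumes": [
        {"host_path": "tmp", "container_path": "/tmp", "mode": "RW"},
    ],
}

# Compact separators reproduce the single-line form used in the report.
flag_value = json.dumps(container_info, separators=(",", ":"))

# shlex.quote wraps the JSON in single quotes so a shell passes it intact.
flag = "--default_container_info=" + shlex.quote(flag_value)
```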





[jira] [Updated] (MESOS-7007) filesystem/shared and --default_container_info broken since 1.1

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7007:
--
Shepherd: Jie Yu  (was: Gilbert Song)

> filesystem/shared and --default_container_info broken since 1.1
> ---
>
> Key: MESOS-7007
> URL: https://issues.apache.org/jira/browse/MESOS-7007
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Affects Versions: 1.1.0, 1.2.0
>Reporter: Pierre Cheynier
>Assignee: Chun-Hung Hsiao
>  Labels: storage
>
> I am facing this issue, which prevents me from upgrading to 1.1.0 (the change 
> was therefore introduced in this version):
> I'm using default_container_info to mount a /tmp volume in the container's 
> mount namespace from its current sandbox, meaning that each container has a 
> dedicated /tmp, thanks to the {{filesystem/shared}} isolator.
> I noticed through our automation pipeline that integration tests were failing 
> and found that this is because the contents of /tmp (the one from the host!) 
> are trashed each time a container is created.
> Here is my setup: 
> * {{--isolation='cgroups/cpu,cgroups/mem,namespaces/pid,*disk/du,filesystem/shared,filesystem/linux*,docker/runtime'}}
> * {{--default_container_info='{"type":"MESOS","volumes":[{"host_path":"tmp","container_path":"/tmp","mode":"RW"}]}'}}
> I discovered this issue in the early days of 1.1 (end of Nov, spoke with 
> someone on Slack), but unfortunately had no time to dig into the symptoms a 
> bit more.
> I found nothing interesting even using GLOG_v=3.
> Maybe a misuse of isolators triggers this issue? If that is the 
> case, then at least a documentation update should be done.
> Let me know if more information is needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (MESOS-8357) Example frameworks have an inconsistent UX.

2017-12-22 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-8357:
-

 Summary: Example frameworks have an inconsistent UX.
 Key: MESOS-8357
 URL: https://issues.apache.org/jira/browse/MESOS-8357
 Project: Mesos
  Issue Type: Improvement
Reporter: Till Toenshoff


Our example frameworks are a bit inconsistent when it comes to specifying 
things like the framework principal / secret, etc. 
Many of these examples have great value in testing a Mesos cluster. Unifying 
the parameterization would improve the user experience when testing Mesos.

{{MESOS_AUTHENTICATE_FRAMEWORKS}} is being used by many examples for enabling / 
disabling authentication. {{load_generator_framework}} as one example however 
uses {{MESOS_AUTHENTICATE}} for that purpose. The credentials themselves are 
most commonly expected in environment variables {{DEFAULT_PRINCIPAL}} and 
{{DEFAULT_SECRET}} while in some cases we chose to use {{MESOS_PRINCIPAL}}, 
{{MESOS_SECRET}} instead.
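
One possible way to unify the lookup, sketched here only as an illustration:
a shared helper that accepts both naming schemes and prefers one of them. The
helper names {{firstSetEnv}} and {{frameworkPrincipal}} and the preference
order are assumptions made for this sketch; the issue does not say which
names the examples should standardize on.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not part of Mesos): returns the value of the first
// environment variable in `names` that is set, or `fallback` if none is set.
std::string firstSetEnv(
    const std::vector<std::string>& names,
    const std::string& fallback)
{
  assert(!names.empty());

  for (const std::string& name : names) {
    const char* value = std::getenv(name.c_str());
    if (value != nullptr) {
      return value;
    }
  }

  return fallback;
}

// Assumed preference order for this sketch: the more common DEFAULT_* name
// first, then the MESOS_* name used by some examples.
std::string frameworkPrincipal()
{
  return firstSetEnv({"DEFAULT_PRINCIPAL", "MESOS_PRINCIPAL"}, "");
}
```

With such a helper, an example framework would call {{frameworkPrincipal()}}
instead of reading one specific variable, so either naming scheme keeps
working while the examples are migrated.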





[jira] [Assigned] (MESOS-5362) Add authentication to example frameworks

2017-12-22 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff reassigned MESOS-5362:
-

Assignee: Till Toenshoff  (was: Greg Mann)

> Add authentication to example frameworks
> 
>
> Key: MESOS-5362
> URL: https://issues.apache.org/jira/browse/MESOS-5362
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: Greg Mann
>Assignee: Till Toenshoff
>  Labels: authentication, mesosphere, security
>
> Some example frameworks do not have the ability to authenticate with the 
> master. Adding authentication to the example frameworks that don't already 
> have it implemented would allow us to use these frameworks for testing in 
> authenticated/authorized scenarios.





[jira] [Assigned] (MESOS-8357) Example frameworks have an inconsistent UX.

2017-12-22 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff reassigned MESOS-8357:
-

Assignee: Till Toenshoff

> Example frameworks have an inconsistent UX.
> ---
>
> Key: MESOS-8357
> URL: https://issues.apache.org/jira/browse/MESOS-8357
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Till Toenshoff
>Assignee: Till Toenshoff
>
> Our example frameworks are a bit inconsistent when it comes to specifying
> things like the framework principal / secret etc.
> Many of these examples have great value in testing a Mesos cluster. Unifying
> the parameterization would improve the user experience when testing Mesos.
> {{MESOS_AUTHENTICATE_FRAMEWORKS}} is used by many examples to enable or
> disable authentication. {{load_generator_framework}}, however, uses
> {{MESOS_AUTHENTICATE}} for that purpose. The credentials themselves are
> most commonly expected in the environment variables {{DEFAULT_PRINCIPAL}}
> and {{DEFAULT_SECRET}}, while in some cases we chose to use
> {{MESOS_PRINCIPAL}} and {{MESOS_SECRET}} instead.





[jira] [Commented] (MESOS-7643) The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302072#comment-16302072
 ] 

Jie Yu commented on MESOS-7643:
---

[~jpe...@apache.org], do you have a patch ready for this? Would be nice to fix 
in 1.5.0.

> The order of isolators provided in '--isolation' flag is not preserved and 
> instead sorted alphabetically
> 
>
> Key: MESOS-7643
> URL: https://issues.apache.org/jira/browse/MESOS-7643
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.2, 1.2.0, 1.3.0
>Reporter: Michael Cherny
>Assignee: James Peach
>Priority: Critical
>  Labels: isolation
>
> According to the documentation and comments in the code, the order of the
> entries in the --isolation flag should specify the ordering of the isolators.
> Specifically, the `create` and `prepare` calls for each isolator should run
> serially in the order in which they appear in the --isolation flag, while the
> `cleanup` call should be serialized in reverse order (with the exception of
> the filesystem isolator, which is always first).
> In fact, however, the isolators provided in the '--isolation' flag are sorted
> alphabetically.
> That happens in [this line of
> code|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L377],
> which uses a 'set' (apparently instead of a list or vector), and a set is a
> sorted container.
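
The effect described above can be reproduced in isolation. The following is a
minimal sketch, not the Mesos containerizer code: {{tokenize}},
{{orderViaSet}}, {{orderViaVector}} and {{example}} are hypothetical names
used only to contrast a std::set, which iterates in sorted order, with a
std::vector, which keeps the order written in the flag.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-separated flag value into tokens,
// preserving the order in which they were written.
std::vector<std::string> tokenize(const std::string& flag)
{
  std::vector<std::string> tokens;
  std::string current;

  for (char c : flag) {
    if (c == ',') {
      if (!current.empty()) {
        tokens.push_back(current);
      }
      current.clear();
    } else {
      current += c;
    }
  }

  if (!current.empty()) {
    tokens.push_back(current);
  }

  return tokens;
}

// Collecting the tokens into a std::set discards the flag order: iterating
// the set yields the names in lexicographic order.
std::vector<std::string> orderViaSet(const std::string& flag)
{
  const std::vector<std::string> tokens = tokenize(flag);
  const std::set<std::string> sorted(tokens.begin(), tokens.end());
  return std::vector<std::string>(sorted.begin(), sorted.end());
}

// Keeping a std::vector preserves the flag order; the set is used only to
// skip names that were already seen.
std::vector<std::string> orderViaVector(const std::string& flag)
{
  std::vector<std::string> ordered;
  std::set<std::string> seen;

  for (const std::string& name : tokenize(flag)) {
    if (seen.insert(name).second) {
      ordered.push_back(name);
    }
  }

  return ordered;
}

// Usage sketch: the two approaches disagree as soon as the flag is not
// already in alphabetical order.
void example()
{
  assert(orderViaSet("namespaces/pid,cgroups/cpu") !=
         orderViaVector("namespaces/pid,cgroups/cpu"));
}
```

Under this sketch, a flag value of "namespaces/pid,cgroups/cpu" comes back as
cgroups/cpu, namespaces/pid from the set-based version and in its written
order from the vector-based one.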





[jira] [Updated] (MESOS-7607) Support for first-class fault domains.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7607:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Support for first-class fault domains.
> --
>
> Key: MESOS-7607
> URL: https://issues.apache.org/jira/browse/MESOS-7607
> Project: Mesos
>  Issue Type: Epic
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Mesos should support a first-class notion of "fault domains", which 
> effectively provide a common vocabulary for describing the region and zone 
> where a node (either master or agent) is located.
> Design doc: 
> https://drive.google.com/open?id=1gEugdkLRbBsqsiFv3urRPRNrHwUC-i1HwfFfHR_MvC8





[jira] [Updated] (MESOS-7563) Make the HTTP command executor the default implementation.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7563:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Make the HTTP command executor the default implementation.
> --
>
> Key: MESOS-7563
> URL: https://issues.apache.org/jira/browse/MESOS-7563
> Project: Mesos
>  Issue Type: Epic
>Reporter: Anand Mazumdar
>
> This epic tracks the work needed to make HTTP command executors the default 
> i.e., enable the {{http_command_executor}} flag. Currently, all command 
> executors use the old executor driver implementation. With this flag being 
> always enabled, the command executors would use the v1 HTTP API.





[jira] [Assigned] (MESOS-6394) Improvements to partition-aware Mesos frameworks.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu reassigned MESOS-6394:
-

Assignee: Jie Yu  (was: Neil Conway)

> Improvements to partition-aware Mesos frameworks.
> -
>
> Key: MESOS-6394
> URL: https://issues.apache.org/jira/browse/MESOS-6394
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Jie Yu
>  Labels: mesosphere
>
> This is a follow up epic to MESOS-5344 to capture further improvements and 
> changes that need to be made to the MVP.





[jira] [Commented] (MESOS-7404) Ensure hierarchical roles work with old Mesos agents

2017-12-22 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302076#comment-16302076
 ] 

Jie Yu commented on MESOS-7404:
---

Retarget due to inactivity

> Ensure hierarchical roles work with old Mesos agents
> 
>
> Key: MESOS-7404
> URL: https://issues.apache.org/jira/browse/MESOS-7404
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Jie Yu
>  Labels: mesosphere
>
> If the Mesos master supports hierarchical roles but the agent does not, we 
> need to ensure that we avoid putting the agent into a bad state, e.g., if the 
> user creates a persistent volume.
> One approach is to use an agent capability for hierarchical roles, and 
> disallow creating persistent-volumes using a hierarchical role if the agent 
> doesn't have the capability. We could also use an agent version check, 
> although until MESOS-6975 is implemented, that will be a bit awkward.
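
The capability-based approach mentioned above could look roughly like the
following. This is a minimal sketch under stated assumptions, not the master's
actual validation code: the function names are hypothetical, the capability is
modelled as a plain string named "HIERARCHICAL_ROLE", and a role is treated as
hierarchical when it contains the '/' delimiter.

```cpp
#include <cassert>
#include <set>
#include <string>

// A role such as "eng/backend" is hierarchical; a role such as "eng" is not.
bool isHierarchicalRole(const std::string& role)
{
  return role.find('/') != std::string::npos;
}

// Hypothetical check: decides whether a persistent volume may be created for
// `role` on an agent that advertises `agentCapabilities`. The capability name
// is an assumption made for this sketch.
bool mayCreatePersistentVolume(
    const std::string& role,
    const std::set<std::string>& agentCapabilities)
{
  assert(!role.empty());

  // Non-hierarchical roles are understood by old and new agents alike.
  if (!isHierarchicalRole(role)) {
    return true;
  }

  // Hierarchical roles are allowed only if the agent advertises support.
  return agentCapabilities.count("HIERARCHICAL_ROLE") > 0;
}
```

Rejecting the operation up front keeps an old agent from ever receiving a
volume whose role it cannot interpret, which is the bad state the description
wants to avoid.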





[jira] [Assigned] (MESOS-6623) Re-enable tests impacted by request streaming support

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu reassigned MESOS-6623:
-

Assignee: (was: Anand Mazumdar)

> Re-enable tests impacted by request streaming support
> -
>
> Key: MESOS-6623
> URL: https://issues.apache.org/jira/browse/MESOS-6623
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, test
>Reporter: Anand Mazumdar
>Priority: Critical
>  Labels: mesosphere
>
> We added support for HTTP request streaming in libprocess as part of 
> MESOS-6466. However, this broke a few tests that relied on HTTP request 
> filtering since the handlers no longer have access to the body of the request 
> when {{visit()}} is invoked. We would need to revisit how we do HTTP request 
> filtering and then re-enable these tests.





[jira] [Updated] (MESOS-7317) Add master endpoint to deactivate / activate agent

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7317:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Add master endpoint to deactivate / activate agent
> --
>
> Key: MESOS-7317
> URL: https://issues.apache.org/jira/browse/MESOS-7317
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, master
>Reporter: Neil Conway
>  Labels: mesosphere
>
> This would allow the operator to deactivate and then subsequently activate an 
> agent. The allocator does not make offers for deactivated agents; this 
> functionality would be useful to help operators "manually (incrementally) 
> drain" the tasks running on an agent, e.g., before taking the agent down.
> At present, if the operator causes a framework to kill a task running on the 
> agent, the framework will often receive an offer for the unused resources on 
> the agent, which will often result in respawning the killed task on the same 
> agent.





[jira] [Assigned] (MESOS-7404) Ensure hierarchical roles work with old Mesos agents

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu reassigned MESOS-7404:
-

Assignee: Jie Yu  (was: Neil Conway)

> Ensure hierarchical roles work with old Mesos agents
> 
>
> Key: MESOS-7404
> URL: https://issues.apache.org/jira/browse/MESOS-7404
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Jie Yu
>  Labels: mesosphere
>
> If the Mesos master supports hierarchical roles but the agent does not, we 
> need to ensure that we avoid putting the agent into a bad state, e.g., if the 
> user creates a persistent volume.
> One approach is to use an agent capability for hierarchical roles, and 
> disallow creating persistent-volumes using a hierarchical role if the agent 
> doesn't have the capability. We could also use an agent version check, 
> although until MESOS-6975 is implemented, that will be a bit awkward.





[jira] [Updated] (MESOS-7473) Use "-dev" prerelease label for version during development

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7473:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Use "-dev" prerelease label for version during development
> --
>
> Key: MESOS-7473
> URL: https://issues.apache.org/jira/browse/MESOS-7473
> Project: Mesos
>  Issue Type: Task
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Prior discussion:
> https://lists.apache.org/thread.html/6e291c504fd44b79e452744b80073cb33adc1be85c17e22bbca35a6c@%3Cdev.mesos.apache.org%3E
> https://lists.apache.org/thread.html/eb526c9295b3cf8e4efc7e0a7d2dacabb61ab5ed867a05e7d913d3fb@%3Cdev.mesos.apache.org%3E





[jira] [Updated] (MESOS-7404) Ensure hierarchical roles work with old Mesos agents

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7404:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Ensure hierarchical roles work with old Mesos agents
> 
>
> Key: MESOS-7404
> URL: https://issues.apache.org/jira/browse/MESOS-7404
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> If the Mesos master supports hierarchical roles but the agent does not, we 
> need to ensure that we avoid putting the agent into a bad state, e.g., if the 
> user creates a persistent volume.
> One approach is to use an agent capability for hierarchical roles, and 
> disallow creating persistent-volumes using a hierarchical role if the agent 
> doesn't have the capability. We could also use an agent version check, 
> although until MESOS-6975 is implemented, that will be a bit awkward.





[jira] [Updated] (MESOS-7428) Report exit code of tasks from default and command executors

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7428:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Report exit code of tasks from default and command executors
> 
>
> Key: MESOS-7428
> URL: https://issues.apache.org/jira/browse/MESOS-7428
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>
> Use case: some tasks should only be retried if the exit code matches certain
> user requirements.
> According to [~gilbert], we already checkpoint the exit code in the
> containerizer. We need to clarify how to report the exit code for executor
> containers vs. nested containers, and we should do this consistently for the
> command and default executors.





[jira] [Updated] (MESOS-7426) Support for agent lifecycle management.

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7426:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Support for agent lifecycle management.
> ---
>
> Key: MESOS-7426
> URL: https://issues.apache.org/jira/browse/MESOS-7426
> Project: Mesos
>  Issue Type: Epic
>  Components: agent
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: agent-lifecycle, mesosphere
>
> This epic co-ordinates the work for introducing agent lifecycle management in 
> Mesos allowing a framework to be notified in case of agent node failures. The 
> existing {{Event::Failure}} is not enough for frameworks to know that the 
> given agent node isn't ever coming back.
> The primary motivations for introducing such a feature would be:
> - Currently, when an agent running a task fails, operator intervention (a
> manual step) is needed to remove the node via a configuration API exposed by
> the framework, e.g., dcos cassandra node replace for the cassandra framework.
> This needs to be done once for every stateful framework running on the
> cluster.
> - When an agent is marked as unhealthy, the removal rate is bounded if the 
> `--agent_rate_removal_limit` option is set. This is specifically problematic 
> for operators relying on EC2 autoscaling groups or for workload bursting to 
> another cloud.
> - When the fault domain associated with an agent changes (e.g., it is moved 
> from an unallocated rack to an allocated rack), there is no feedback 
> mechanism for the framework.





[jira] [Updated] (MESOS-7278) Implement configuration reader/writer for the new CLI

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7278:
--
Target Version/s: 1.6.0  (was: 1.5.0)

> Implement configuration reader/writer for the new CLI
> -
>
> Key: MESOS-7278
> URL: https://issues.apache.org/jira/browse/MESOS-7278
> Project: Mesos
>  Issue Type: Task
>  Components: cli
>Affects Versions: 1.3.0
>Reporter: Eric Chung
>Assignee: Eric Chung
>






[jira] [Updated] (MESOS-6623) Re-enable tests impacted by request streaming support

2017-12-22 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-6623:
--
Target Version/s:   (was: 1.5.0)

> Re-enable tests impacted by request streaming support
> -
>
> Key: MESOS-6623
> URL: https://issues.apache.org/jira/browse/MESOS-6623
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, test
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>Priority: Critical
>  Labels: mesosphere
>
> We added support for HTTP request streaming in libprocess as part of 
> MESOS-6466. However, this broke a few tests that relied on HTTP request 
> filtering since the handlers no longer have access to the body of the request 
> when {{visit()}} is invoked. We would need to revisit how we do HTTP request 
> filtering and then re-enable these tests.




