[jira] [Commented] (MESOS-7899) Expose sandboxes using virtual paths and hide the agent work directory.
[ https://issues.apache.org/jira/browse/MESOS-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872880#comment-16872880 ]

Benjamin Mahler commented on MESOS-7899:
----------------------------------------

Hi [~tomq42]! I'd like to direct you instead to the user@ mailing list or Slack (e.g. #containerizer) to get help with this.

> Expose sandboxes using virtual paths and hide the agent work directory.
> -----------------------------------------------------------------------
>
> Key: MESOS-7899
> URL: https://issues.apache.org/jira/browse/MESOS-7899
> Project: Mesos
> Issue Type: Task
> Reporter: Zhitao Li
> Assignee: Zhitao Li
> Priority: Major
> Fix For: 1.5.0
>
>
> {{Files}} interface already supports a virtual file system. We should figure
> out a way to enable this in {{/files/download}} endpoint to hide agent
> sandbox.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-9865) Jenkins setup for validating the packaging pipeline.
Till Toenshoff created MESOS-9865:
----------------------------------

Summary: Jenkins setup for validating the packaging pipeline.
Key: MESOS-9865
URL: https://issues.apache.org/jira/browse/MESOS-9865
Project: Mesos
Issue Type: Documentation
Reporter: Till Toenshoff

We should provide documentation or even a turn-key solution for testing our Jenkins (packaging) pipeline in a replicated setup outside the ASF, for development and pre-commit testing purposes.

Maybe a dockerized Jenkins does it already?
[jira] [Assigned] (MESOS-9853) Update Docker executor to allow kill policy overrides
[ https://issues.apache.org/jira/browse/MESOS-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Greg Mann reassigned MESOS-9853:
--------------------------------

Assignee: Greg Mann

> Update Docker executor to allow kill policy overrides
> -----------------------------------------------------
>
> Key: MESOS-9853
> URL: https://issues.apache.org/jira/browse/MESOS-9853
> Project: Mesos
> Issue Type: Task
> Reporter: Greg Mann
> Assignee: Greg Mann
> Priority: Major
> Labels: foundations, mesosphere
>
> In order for the agent to successfully override the task kill policy of
> Docker tasks when the agent is being drained, the Docker executor must be
> able to receive kill policy overrides and must be updated to honor them.
> Since the Docker executor runs using the executor driver, this is currently
> not possible. We could, for example, update the executor driver interface, or
> move the Docker executor off of the executor driver.
[jira] [Commented] (MESOS-9807) Introduce a `struct Quota` wrapper.
[ https://issues.apache.org/jira/browse/MESOS-9807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872747#comment-16872747 ]

Meng Zhu commented on MESOS-9807:
---------------------------------

{noformat}
commit 8eba78cbddc8b70f78c07a501ee0dc1d6204f280
Author: Meng Zhu
Date: Thu Jun 20 17:29:28 2019 -0700

    Replaced `Quota` with `Quota2` in the master state.

    This paves way to remove `struct Quota`.

    Review: https://reviews.apache.org/r/70916

commit 5907a357180ccd8fe398f2b6638c85912fafe8b2
Author: Meng Zhu
Date: Thu Jun 20 18:50:38 2019 -0700

    Replaced the old `struct Quota`.

    The new `struct Quota` is consistent with the proto `QuotaConfig`
    where guarantees and limits are decoupled and uses more proper
    abstractions: `ResourceQuantities` and `ResourceLimits`.

    Review: https://reviews.apache.org/r/70919
{noformat}

> Introduce a `struct Quota` wrapper.
> -----------------------------------
>
> Key: MESOS-9807
> URL: https://issues.apache.org/jira/browse/MESOS-9807
> Project: Mesos
> Issue Type: Improvement
> Components: allocation
> Reporter: Meng Zhu
> Assignee: Meng Zhu
> Priority: Major
> Labels: resource-management
>
> We should introduce:
> struct Quota {
>   ResourceQuantities guarantees;
>   ResourceLimits limits;
> };
> There are a couple of small hurdles. First, there is already a struct Quota
> wrapper in "include/mesos/quota/quota.hpp"; we need to deprecate that first.
> Second, `ResourceQuantities` and `ResourceLimits` are right now only used in
> internal headers. We probably want to move them into a public header, since
> this struct will also be used in the allocator interface, which is also in a
> public header. (Looking at this line, the boundary is already breached:
> https://github.com/apache/mesos/blob/master/include/mesos/allocator/allocator.hpp#L41)
[jira] [Commented] (MESOS-9820) Add `updateQuota()` method to the allocator.
[ https://issues.apache.org/jira/browse/MESOS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872745#comment-16872745 ]

Meng Zhu commented on MESOS-9820:
---------------------------------

{noformat}
commit 373393bbaaeadf992c2e8d5399462ffe128eaec4
Author: Meng Zhu
Date: Thu Jun 20 18:48:28 2019 -0700

    Removed `setQuota` and `removeQuota` methods in the allocator.

    These are replaced by the `updateQuota` method.

    Review: https://reviews.apache.org/r/70918
{noformat}

> Add `updateQuota()` method to the allocator.
> --------------------------------------------
>
> Key: MESOS-9820
> URL: https://issues.apache.org/jira/browse/MESOS-9820
> Project: Mesos
> Issue Type: Improvement
> Components: allocation
> Reporter: Meng Zhu
> Assignee: Meng Zhu
> Priority: Major
> Labels: resource-management
>
> This is the method that underlies the `UPDATE_QUOTA` operator call. This will
> allow the allocator to set different values for guarantees and limits.
> The existing `setQuota` and `removeQuota` methods in the allocator will be
> deprecated. This will likely break many existing allocator tests. We should
> fix and refactor tests to verify the bursting up to limits feature.
[jira] [Commented] (MESOS-9820) Add `updateQuota()` method to the allocator.
[ https://issues.apache.org/jira/browse/MESOS-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872744#comment-16872744 ]

Meng Zhu commented on MESOS-9820:
---------------------------------

{noformat}
commit 86affdd0b5c2208627eb194e5d02794fa264c383
Author: Meng Zhu
Date: Thu Jun 20 18:09:36 2019 -0700

    Refactored the allocator test to use the `updateQuota` method.

    This paves the way to remove `setQuota` and `removeQuota` methods.

    Review: https://reviews.apache.org/r/70917
{noformat}

> Add `updateQuota()` method to the allocator.
> --------------------------------------------
>
> Key: MESOS-9820
> URL: https://issues.apache.org/jira/browse/MESOS-9820
> Project: Mesos
> Issue Type: Improvement
> Components: allocation
> Reporter: Meng Zhu
> Assignee: Meng Zhu
> Priority: Major
> Labels: resource-management
>
> This is the method that underlies the `UPDATE_QUOTA` operator call. This will
> allow the allocator to set different values for guarantees and limits.
> The existing `setQuota` and `removeQuota` methods in the allocator will be
> deprecated. This will likely break many existing allocator tests. We should
> fix and refactor tests to verify the bursting up to limits feature.
[jira] [Commented] (MESOS-9854) /roles endpoint should return both guarantees and limits.
[ https://issues.apache.org/jira/browse/MESOS-9854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872742#comment-16872742 ]

Meng Zhu commented on MESOS-9854:
---------------------------------

{noformat}
commit b23b4e52a24637231a85faf2416b75180cfd9063
Author: Meng Zhu m...@mesosphere.io
Date: Thu Jun 20 17:17:41 2019 -0700

    Made `/roles` endpoint also return quota limits.

    Now that guarantees are decoupled from limits, we should
    return limits and guarantees separately in the `/roles` endpoint.

    Three incompatible changes are introduced:
      - The `principal` field is removed. This legacy field was used to
        record the principal of the operator who configured the quota.
        So that later, if a different operator with a different principal
        wants to modify the quota, the action can be properly authorized.
        This use case has since been deprecated and the principal field
        will no longer be filled going forward.
      - Resources with zero quantity will no longer be included in
        the `guarantee` field.
      - The `guarantee` field will continue to be filled. However,
        since we are decoupling the quota guarantee from the limit.
        One can no longer assume that the limit will be the same as
        guarantee. A separate `limit` field is introduced.

    Before, the response might contain:
    ```
    {
      "quota": {
        "guarantee": {
          "cpus": 1,
          "disk": 0,
          "gpus": 0,
          "mem": 512
        },
        "principal": "test-principal",
        "role": "foo"
      }
    }
    ```

    After:
    ```
    {
      "quota": {
        "guarantee": {
          "cpus": 1,
          "mem": 512
        },
        "limit": {
          "cpus": 1,
          "mem": 512
        },
        "role": "foo"
      }
    }
    ```

    Also fixed an affected test.

    Review: https://reviews.apache.org/r/70915
{noformat}

> /roles endpoint should return both guarantees and limits.
> ---------------------------------------------------------
>
> Key: MESOS-9854
> URL: https://issues.apache.org/jira/browse/MESOS-9854
> Project: Mesos
> Issue Type: Bug
> Reporter: Meng Zhu
> Assignee: Meng Zhu
> Priority: Major
> Labels: resource-management
>
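A client that reads the `quota` object from `/roles` may need to cope with both shapes shown in the commit message above. The following is only a minimal sketch in Python, not code from Mesos: `normalize_quota` is a hypothetical helper, and it models nothing beyond the `quota` fields quoted in the commit message. It drops zero-quantity guarantees (as the new response does), ignores the removed `principal` field, and reports `limit` as `None` when the response predates the `limit` field rather than assuming it equals the guarantee.

```python
def normalize_quota(quota):
    """Normalize one `quota` object from a `/roles` response.

    Hypothetical helper: works on the "before" and "after" shapes
    quoted in the commit message and nothing else.
    """
    # The new response omits zero-quantity resources; do the same for
    # old responses so callers see one consistent shape.
    guarantee = {
        name: amount
        for name, amount in quota.get("guarantee", {}).items()
        if amount != 0
    }

    return {
        "role": quota.get("role"),
        "guarantee": guarantee,
        # Absent in old responses; left as None instead of guessing.
        "limit": quota.get("limit"),
        # `principal` is deliberately not carried over, since the field
        # is no longer filled.
    }
```

Returning `None` for a missing `limit` keeps the caller from silently treating the guarantee as the limit, which the commit message says can no longer be assumed.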
[jira] [Created] (MESOS-9864) Add enhanced multi-role capability support to python bindings.
Andrei Sekretenko created MESOS-9864:
-------------------------------------

Summary: Add enhanced multi-role capability support to python bindings.
Key: MESOS-9864
URL: https://issues.apache.org/jira/browse/MESOS-9864
Project: Mesos
Issue Type: Task
Reporter: Andrei Sekretenko
Fix For: 1.9.0

Methods of the V0 SchedulerDriver currently not supported by python bindings:
* updateFramework()
* reviveOffers(roles)
* subscribing with non-empty suppressed roles on construction
* suppressOffers(roles), if added to the driver
[jira] [Assigned] (MESOS-9849) Add support for per-role REVIVE / SUPPRESS to V0 scheduler driver.
[ https://issues.apache.org/jira/browse/MESOS-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Sekretenko reassigned MESOS-9849:
----------------------------------------

Assignee: Andrei Sekretenko

> Add support for per-role REVIVE / SUPPRESS to V0 scheduler driver.
> ------------------------------------------------------------------
>
> Key: MESOS-9849
> URL: https://issues.apache.org/jira/browse/MESOS-9849
> Project: Mesos
> Issue Type: Task
> Components: scheduler driver
> Reporter: Benjamin Mahler
> Assignee: Andrei Sekretenko
> Priority: Major
> Labels: resource-management
>
> Unfortunately, there are still schedulers that are using the v0 bindings and
> are unable to move to v1 before wanting to use the per-role REVIVE / SUPPRESS
> calls.
> We'll need to add per-role REVIVE / SUPPRESS into the v0 scheduler driver.
[jira] [Assigned] (MESOS-9862) Agent should fail task launches while draining
[ https://issues.apache.org/jira/browse/MESOS-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Bannier reassigned MESOS-9862:
---------------------------------------

Assignee: Benjamin Bannier

> Agent should fail task launches while draining
> ----------------------------------------------
>
> Key: MESOS-9862
> URL: https://issues.apache.org/jira/browse/MESOS-9862
> Project: Mesos
> Issue Type: Task
> Environment: When receiving a drain request, the agent will attempt to
> kill all tasks. It should also prevent launching new tasks to deal with
> possible (future) race scenarios where a task launch arrives after a drain
> request.
> It seems one place where one could insert such a check is {{Slave::__run}}.
> Reporter: Benjamin Bannier
> Assignee: Benjamin Bannier
> Priority: Major
>
[jira] [Created] (MESOS-9863) Libprocess SSL tests may fail client certificate validation
Benno Evers created MESOS-9863:
-------------------------------

Summary: Libprocess SSL tests may fail client certificate validation
Key: MESOS-9863
URL: https://issues.apache.org/jira/browse/MESOS-9863
Project: Mesos
Issue Type: Bug
Reporter: Benno Evers

In the current libprocess `ssl_tests.cpp`, we create a "valid" server certificate containing the hostname returned by ::getnameinfo() for the IP of `libprocess::address()`. The libprocess IP is by default determined by a DNS lookup for the current hostname. As an example, let's assume my hostname is `poincare` and the libprocess IP is `127.0.1.1`.

The tests then spawn the `ssl-client` binary as a subprocess, passing the server IP as a command-line argument. The `ssl-client` binary will connect to the passed IP. Since we do not bind() before calling connect, the source IP for that connection will be automatically determined by the kernel. Continuing the example, the `ssl-client` connects to 127.0.1.1. Since it is a loopback address, the kernel will automatically select 127.0.0.1 as the source IP.

On the server side, libprocess will now do a reverse DNS lookup on the source IP to determine the hostname of the connecting client. If it doesn't match the provided client certificate, the connection is rejected. In the example, libprocess will determine (127.0.0.1, 'localhost') as source IP/hostname, but the certificate contains (127.0.1.1, 'poincare'). Therefore, the connection attempt is rejected.

Possible solutions to this include binding before calling connect to fix the source IP, or only running these tests with the 'openssl' hostname validation scheme after the corresponding review chain has landed.
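The first suggested fix, binding before calling connect so that the kernel does not pick the source IP, can be illustrated outside of libprocess. The snippet below is only a sketch using Python's standard `socket` module; it is not the `ssl-client` code, and `connect_from` is a hypothetical name chosen for this example.

```python
import socket


def connect_from(source_ip, server_addr):
    """Open a TCP connection to `server_addr` with a fixed source IP.

    Hypothetical helper for illustration. Binding to (source_ip, 0)
    pins the source address while still letting the kernel choose an
    ephemeral source port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((source_ip, 0))
        sock.connect(server_addr)
    except OSError:
        # Do not leak the descriptor if bind or connect fails.
        sock.close()
        raise
    return sock
```

With the addresses from the example in the ticket, the client would call something like `connect_from("127.0.1.1", ("127.0.1.1", port))`, so that the server's reverse lookup sees the same IP that the certificate was issued for. Whether that address can be bound depends on the host's loopback configuration.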
[jira] [Created] (MESOS-9862) Agent should fail task launches while draining
Benjamin Bannier created MESOS-9862:
------------------------------------

Summary: Agent should fail task launches while draining
Key: MESOS-9862
URL: https://issues.apache.org/jira/browse/MESOS-9862
Project: Mesos
Issue Type: Task
Environment: When receiving a drain request, the agent will attempt to kill all tasks. It should also prevent launching new tasks to deal with possible (future) race scenarios where a task launch arrives after a drain request.

It seems one place where one could insert such a check is {{Slave::__run}}.
Reporter: Benjamin Bannier