[jira] [Assigned] (MESOS-9608) Refactor and Improve `class ResourceQuantity`.
[ https://issues.apache.org/jira/browse/MESOS-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Meng Zhu reassigned MESOS-9608:
---
Assignee: Meng Zhu

> Refactor and Improve `class ResourceQuantity`.
> ---
>
> Key: MESOS-9608
> URL: https://issues.apache.org/jira/browse/MESOS-9608
> Project: Mesos
> Issue Type: Improvement
> Reporter: Meng Zhu
> Assignee: Meng Zhu
> Priority: Major
> Labels: resource-management
>
> Currently, `ResourceQuantity` provides only a minimal map interface with no
> built-in arithmetic or `contains` operations, which makes it unwieldy.
> The intention was to avoid the ambiguity between "absent-means-zero"
> (guarantee-like semantics) and "absent-means-infinite" (limit-like semantics).
>
> Instead of providing only a minimal interface and leaving the rest to the
> caller, we should provide one class for each semantic:
> - "ResourceQuantities" will have "absent-means-zero" semantics
> - "ResourceLimits" will have "absent-means-infinite" semantics

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9608) Refactor and Improve `class ResourceQuantity`.
[ https://issues.apache.org/jira/browse/MESOS-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778772#comment-16778772 ]

Meng Zhu commented on MESOS-9608:
---
`ResourceQuantity` part:
https://reviews.apache.org/r/70061
https://reviews.apache.org/r/70062
https://reviews.apache.org/r/70063

> Refactor and Improve `class ResourceQuantity`.
> ---
>
> Key: MESOS-9608
> URL: https://issues.apache.org/jira/browse/MESOS-9608
> Project: Mesos
> Issue Type: Improvement
> Reporter: Meng Zhu
> Priority: Major
> Labels: resource-management
>
> Currently, `ResourceQuantity` provides only a minimal map interface with no
> built-in arithmetic or `contains` operations, which makes it unwieldy.
> The intention was to avoid the ambiguity between "absent-means-zero"
> (guarantee-like semantics) and "absent-means-infinite" (limit-like semantics).
>
> Instead of providing only a minimal interface and leaving the rest to the
> caller, we should provide one class for each semantic:
> - "ResourceQuantities" will have "absent-means-zero" semantics
> - "ResourceLimits" will have "absent-means-infinite" semantics
>
> Both classes can derive from the current class (renamed to "ResourceMap")
> to share common logic.
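The two semantics proposed in MESOS-9608 can be illustrated with a minimal Python sketch. This is not the Mesos C++ implementation: the method names (`get`, `names`, `contains`) and the `ABSENT` class attribute are assumptions made for illustration, and only the class names come from the issue. The point is that the same `contains` logic gives different answers depending on what an absent entry means.

```python
import math
from typing import Dict, Set


class ResourceMap:
    """Hypothetical shared base: maps a resource name to a scalar amount."""

    # Amount assumed for a resource name that is absent from the map.
    ABSENT: float = 0.0

    def __init__(self, amounts: Dict[str, float]) -> None:
        self._amounts = dict(amounts)

    def get(self, name: str) -> float:
        return self._amounts.get(name, self.ABSENT)

    def names(self) -> Set[str]:
        return set(self._amounts)

    def contains(self, other: "ResourceMap") -> bool:
        # True if every amount explicitly listed in `other` fits within this map.
        return all(self.get(n) >= other.get(n) for n in other.names())


class ResourceQuantities(ResourceMap):
    """Absent means zero (guarantee-like semantics)."""

    ABSENT = 0.0


class ResourceLimits(ResourceMap):
    """Absent means infinite (limit-like semantics)."""

    ABSENT = math.inf


request = ResourceQuantities({"cpus": 1, "mem": 512})
guarantee = ResourceQuantities({"cpus": 2})
limit = ResourceLimits({"cpus": 2})

print(guarantee.contains(request))  # "mem" is absent, so it counts as 0
print(limit.contains(request))      # "mem" is absent, so it is unlimited
```

With a single shared map class, the caller would have to remember which default applies at every call site; encoding the default in the type removes that ambiguity.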
[jira] [Assigned] (MESOS-4599) ReviewBot should re-verify a review chain if any of the reviews is updated
[ https://issues.apache.org/jira/browse/MESOS-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone reassigned MESOS-4599:
---
Assignee: Vinod Kone

> ReviewBot should re-verify a review chain if any of the reviews is updated
> ---
>
> Key: MESOS-4599
> URL: https://issues.apache.org/jira/browse/MESOS-4599
> Project: Mesos
> Issue Type: Improvement
> Components: reviewbot
> Reporter: Vinod Kone
> Assignee: Vinod Kone
> Priority: Major
> Labels: integration, newbie++
>
> Currently, ReviewBot only re-verifies a review chain if the last review in
> the chain is updated (new diff or new "depends on" field). It should also
> re-verify if any of the reviews that the last review depends on is updated!
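The difference between the current and the requested behavior can be sketched in Python. The `Review` type, its fields, and both function names are hypothetical and are not taken from the ReviewBot source. The sketch assumes that each review carries a last-updated timestamp and that the bot remembers when it last verified the chain.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    review_id: int
    last_updated: float  # seconds since the epoch (assumed field)


def needs_verification_tip_only(chain: List[Review], last_verified: float) -> bool:
    # Behavior described as current: only the last review in the chain counts.
    return chain[-1].last_updated > last_verified


def needs_verification_any(chain: List[Review], last_verified: float) -> bool:
    # Requested behavior: an update to any review in the chain counts.
    return any(review.last_updated > last_verified for review in chain)


# The chain is ordered from the first dependency to the tip.
chain = [Review(101, last_updated=200.0), Review(102, last_updated=50.0)]
print(needs_verification_tip_only(chain, last_verified=100.0))
print(needs_verification_any(chain, last_verified=100.0))
```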
[jira] [Commented] (MESOS-9610) Fetcher vulnerability - escaping from sandbox
[ https://issues.apache.org/jira/browse/MESOS-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778613#comment-16778613 ]

Joseph Wu commented on MESOS-9610:
---
This is related to the introduction of libarchive in 1.7.0. The code which creates files/directories does not sanitize paths for extraneous ".."s:
https://github.com/apache/mesos/blob/4a2dbe25c7377636fe3a9d9c8576297a6db561cd/3rdparty/stout/include/stout/archiver.hpp#L128-L130

> Fetcher vulnerability - escaping from sandbox
> ---
>
> Key: MESOS-9610
> URL: https://issues.apache.org/jira/browse/MESOS-9610
> Project: Mesos
> Issue Type: Bug
> Components: fetcher
> Affects Versions: 1.7.2
> Reporter: Mariusz Derela
> Priority: Blocker
> Labels: bug, security-issue, vulnerabilities
>
> I have noticed that it is possible to exploit the fetcher and overwrite
> any file on the agent host.
>
> Scenario to reproduce:
> 1) Prepare a file with any content, give it a name like "../../../etc/test",
> and archive it. Python's zipfile module can be used to achieve that:
> {code:python}
> >>> import zipfile
> >>> zip = zipfile.ZipFile("exploit.zip", "w")
> >>> zip.writestr("../../../../../../../../../../../../etc/mariusz_was_here.txt", "some content")
> >>> zip.close()
> {code}
> 2) Prepare a service that will use our artifact (exploit.zip).
> 3) Run the service.
>
> Afterwards, our file appears in /etc. As you can imagine, there are many
> ways this could be abused.
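The missing check described in the comment above amounts to verifying that each archive entry resolves to a location inside the extraction directory. The following is a minimal Python sketch of that idea, not the fix applied to the stout archiver (which is C++ on top of libarchive). The function name `is_within_directory` and the sandbox path are made up for illustration, and POSIX-style paths are assumed.

```python
import os


def is_within_directory(destination: str, entry_name: str) -> bool:
    """Return True if `entry_name`, extracted under `destination`, stays inside it."""
    dest = os.path.realpath(destination)
    # Joining with an absolute entry name discards `dest`, so absolute
    # entries are rejected by the comparison below as well.
    target = os.path.realpath(os.path.join(dest, entry_name))
    return os.path.commonpath([dest, target]) == dest


sandbox = "/tmp/sandbox"  # hypothetical sandbox directory
print(is_within_directory(sandbox, "data/file.txt"))
print(is_within_directory(sandbox, "../../../etc/test"))
print(is_within_directory(sandbox, "/etc/passwd"))
```

An extractor would run such a check on every entry before creating any file or directory, and skip or fail on entries that escape the destination.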
[jira] [Created] (MESOS-9613) Support seccomp `unconfined` option for whitelisting.
Gilbert Song created MESOS-9613:
---
Summary: Support seccomp `unconfined` option for whitelisting.
Key: MESOS-9613
URL: https://issues.apache.org/jira/browse/MESOS-9613
Project: Mesos
Issue Type: Improvement
Components: containerization
Reporter: Gilbert Song
Assignee: Andrei Budnik

Support seccomp `unconfined` option for whitelisting.
[jira] [Commented] (MESOS-7258) Provide scheduler calls to subscribe to additional roles and unsubscribe from roles.
[ https://issues.apache.org/jira/browse/MESOS-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778575#comment-16778575 ]

Benjamin Mahler commented on MESOS-7258:
---
Linking this to MESOS-9523, since adoption of per-framework minimum resource quantity offer filters requires that the framework doesn't re-subscribe.

> Provide scheduler calls to subscribe to additional roles and unsubscribe
> from roles.
> ---
>
> Key: MESOS-7258
> URL: https://issues.apache.org/jira/browse/MESOS-7258
> Project: Mesos
> Issue Type: Improvement
> Components: master, scheduler api
> Reporter: Benjamin Mahler
> Assignee: Kapil Arya
> Priority: Major
> Labels: multitenancy
>
> The current support for schedulers to subscribe to additional roles or
> unsubscribe from some of their roles requires that the scheduler obtain a
> new subscription with the master, which invalidates the event stream.
>
> A more lightweight mechanism would be to provide calls for the scheduler to
> subscribe to additional roles or unsubscribe from some roles, such that the
> existing event stream remains open and offers to the new roles arrive on
> the existing event stream. E.g.
> SUBSCRIBE_TO_ROLE
> UNSUBSCRIBE_FROM_ROLE
>
> One open question pertains to the terminology here: whether we would want
> to avoid using "subscribe" in this context. An alternative would be:
> UPDATE_FRAMEWORK_INFO
> This provides a generic mechanism for a framework to perform framework info
> updates without obtaining a new event stream.
>
> In addition, it would be easier to use if it returned 200 on success and an
> error response if invalid, etc., rather than returning 202.
>
> *NOTE*: Not specific to this issue, but we need to figure out how to allow
> the framework to not leak reservations, e.g. MESOS-7651.
[jira] [Commented] (MESOS-9542) Hierarchical allocator check failure when an operation on a shutdown framework finishes
[ https://issues.apache.org/jira/browse/MESOS-9542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778553#comment-16778553 ]

Benjamin Bannier commented on MESOS-9542:
---
Note that while this crash is disruptive, after the master is restarted it will reconcile with the agent in question just fine and correctly reflect its state.

> Hierarchical allocator check failure when an operation on a shutdown
> framework finishes
> ---
>
> Key: MESOS-9542
> URL: https://issues.apache.org/jira/browse/MESOS-9542
> Project: Mesos
> Issue Type: Bug
> Components: master
> Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.7.0, 1.7.1, 1.8.0
> Reporter: Benjamin Bannier
> Assignee: Joseph Wu
> Priority: Blocker
> Labels: foundations, mesosphere, mesosphere-dss-ga, operation-feedback
> Fix For: 1.8.0
>
> When a non-speculated operation such as {{CREATE_DISK}} becomes terminal
> after the originating framework was torn down, we run into an assertion
> failure in the allocator.
> {noformat}
> I0129 11:55:35.764394 57857 master.cpp:11373] Updating the state of operation 'operation' (uuid: 10a782bd-9e60-42da-90d6-c00997a25645) for framework a4d0499b-c0d3-4abf-8458-73e595d061ce- (latest state: OPERATION_PENDING, status update state: OPERATION_FINISHED)
> F0129 11:55:35.764744 57925 hierarchical.cpp:834] Check failed: frameworks.contains(frameworkId)
> {noformat}
> With non-speculated operations such as {{CREATE_DISK}}, it became possible
> for operations to outlive their originating framework. This was not possible
> with speculated operations like {{RESERVE}}, which were always applied
> immediately by the master.
>
> The master does not take this into account, but instead unconditionally
> calls {{Allocator::updateAllocation}}, which asserts that the framework is
> still known to the allocator.
>
> Reproducer:
> * register a framework with the master.
> * add an agent with a resource provider.
> * let the framework trigger a non-speculated operation like {{CREATE_DISK}}.
> * tear down the framework before a terminal operation status update reaches
> the master; this causes the master to e.g., remove the framework from the
> allocator.
> * let a terminal, successful operation status update reach the master.
> * 💥
>
> To solve this we should clean up the lifetimes of operations. Since
> operations can outlive their framework (unlike e.g., tasks), we probably
> need a different approach here.
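The failure mode in MESOS-9542 can be modeled with a toy Python sketch. This is not the Mesos master or allocator code: `ToyAllocator` and `on_terminal_operation_update` are hypothetical names. The guard shown is only one conceivable mitigation, and the issue itself suggests that a broader cleanup of operation lifetimes is probably needed.

```python
from typing import List, Set, Tuple


class ToyAllocator:
    """Toy stand-in for the allocator: tracks known frameworks."""

    def __init__(self) -> None:
        self.frameworks: Set[str] = set()
        self.applied: List[Tuple[str, str]] = []

    def update_allocation(self, framework_id: str, operation: str) -> None:
        # Mirrors the failing check: the framework must still be known.
        assert framework_id in self.frameworks, "frameworks.contains(frameworkId)"
        self.applied.append((framework_id, operation))


def on_terminal_operation_update(
    allocator: ToyAllocator, framework_id: str, operation: str
) -> bool:
    """Apply a finished operation unless its framework has been removed."""
    if framework_id not in allocator.frameworks:
        # The operation outlived its framework: skip the allocator update
        # instead of tripping the assertion.
        return False
    allocator.update_allocation(framework_id, operation)
    return True


allocator = ToyAllocator()
allocator.frameworks.add("framework-1")
print(on_terminal_operation_update(allocator, "framework-1", "CREATE_DISK"))

allocator.frameworks.discard("framework-1")  # framework torn down
print(on_terminal_operation_update(allocator, "framework-1", "CREATE_DISK"))
```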
[jira] [Commented] (MESOS-9564) Logrotate container logger lets tasks execute arbitrary commands in the Mesos agent's namespace
[ https://issues.apache.org/jira/browse/MESOS-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778488#comment-16778488 ]

Joseph Wu commented on MESOS-9564:
---
I'll be backporting to 1.5.x and beyond, but the backports will not block any of the ongoing releases since the module is optional.

> Logrotate container logger lets tasks execute arbitrary commands in the
> Mesos agent's namespace
> ---
>
> Key: MESOS-9564
> URL: https://issues.apache.org/jira/browse/MESOS-9564
> Project: Mesos
> Issue Type: Bug
> Components: agent, modules
> Reporter: Joseph Wu
> Assignee: Andrei Budnik
> Priority: Critical
> Labels: foundations, mesosphere
> Fix For: 1.8.0
>
> The non-default {{LogrotateContainerLogger}} module allows tasks to
> configure sandbox log rotation (See
> http://mesos.apache.org/documentation/latest/logging/#Containers ). The
> {{logrotate_stdout_options}} and {{logrotate_stderr_options}} in particular
> let the task specify free-form text, which is written to a configuration
> file located in the task's sandbox. The module does not sanitize or check
> this configuration at all.
>
> The logger itself will eventually run {{logrotate}} against the written
> configuration file, but the logger is not isolated in the same way as the
> task. For both the Mesos and Docker containerizers, the logger binary will
> run in the same namespace as the Mesos agent. This makes it possible to
> affect files outside of the task's mount namespace.
>
> Two modes of attack are known to be problematic:
> * Changing or adding entries to the configuration file. Normally, the
> configuration file contains a single file to rotate:
> {code}
> /path/to/sandbox/stdout {
>
> }
> {code}
> It is trivial to add text to the {{logrotate_stdout_options}} to add a new
> entry:
> {code}
> /path/to/sandbox/stdout {
>
> }
> /path/to/other/file/on/disk {
>
> }
> {code}
> * Logrotate's {{postrotate}} option allows for execution of arbitrary
> commands. This can again be supplied with the {{logrotate_stdout_options}}
> variable.
> {code}
> /path/to/sandbox/stdout {
>   postrotate
>     rm -rf /
>   endscript
> }
> {code}
>
> Some potential fixes to consider:
> * Overwrite the .logrotate.conf files each time. This would leave a third
> party only milliseconds between the write and the logrotate call to
> maliciously modify the config files. This would not help if the task itself
> had postrotate options in its environment variables.
> * Sanitize the free-form options field in the environment variables to
> remove postrotate or injection attempts like
> }\n/path/to/some/file\noptions{.
> * Refactor parts of the Mesos isolation code path so that the logger and IO
> switchboard binary live in the same namespaces as the container (instead of
> the agent). This would also be nice in that the logger's CPU usage would
> then be accounted for within the container's resources.
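The second potential fix, sanitizing the free-form options, can be sketched in Python as follows. This is not the check that was added to the `LogrotateContainerLogger` module: the function name and the deny-list are assumptions for illustration. The sketch rejects (rather than rewrites) any options text that contains braces, which would open or close a configuration entry, or a logrotate directive that introduces a shell script.

```python
# logrotate directives that start or end a script block. The list is an
# illustrative assumption and is not guaranteed to be exhaustive.
SCRIPT_DIRECTIVES = {
    "postrotate",
    "prerotate",
    "firstaction",
    "lastaction",
    "preremove",
    "endscript",
}


def is_safe_logrotate_options(options: str) -> bool:
    """Return True if the free-form options look safe to embed in one entry."""
    # Braces would let the text close the current entry and start a new one,
    # as in the "}\n/path/to/some/file\noptions{" example.
    if "{" in options or "}" in options:
        return False
    for line in options.splitlines():
        words = line.split()
        if words and words[0].lower() in SCRIPT_DIRECTIVES:
            return False
    return True


print(is_safe_logrotate_options("rotate 5\nsize 10M"))
print(is_safe_logrotate_options("postrotate\n  rm -rf /\nendscript"))
print(is_safe_logrotate_options("}\n/path/to/some/file\noptions{"))
```

A deny-list of this kind is easy to get wrong; an allow-list of known-harmless directives would be the stricter design, at the cost of rejecting legitimate but unlisted options.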
[jira] [Created] (MESOS-9612) Resource provider manager assumes all operations are triggered by frameworks
Benjamin Bannier created MESOS-9612:
---
Summary: Resource provider manager assumes all operations are triggered by frameworks
Key: MESOS-9612
URL: https://issues.apache.org/jira/browse/MESOS-9612
Project: Mesos
Issue Type: Bug
Components: agent
Reporter: Benjamin Bannier

When the agent tries to apply an operation to resource provider resources, it invokes {{ResourceProviderManager::applyOperation}}, which in turn invokes {{ResourceProviderManagerProcess::applyOperation}}. That function currently assumes that the received message contains a valid {{FrameworkID}}:
{noformat}
void ResourceProviderManagerProcess::applyOperation(
    const ApplyOperationMessage& message)
{
  const Offer::Operation& operation = message.operation_info();
  const FrameworkID& frameworkId = message.framework_id(); // `framework_id` is `optional`.
{noformat}
Since {{FrameworkID}} is not a trivial proto type, but instead one with a {{required}} field {{value}}, the message composed with this {{frameworkId}} further down cannot be serialized. This leads to a failure, which in turn triggers a {{CHECK}} failure in the agent's function interfacing with the manager.

A typical scenario where we would want to support operator API calls here is to destroy leftover persistent volumes or reservations.
[jira] [Created] (MESOS-9611) Add `/machines` endpoint to show mapping between machines and agents
Benno Evers created MESOS-9611:
---
Summary: Add `/machines` endpoint to show mapping between machines and agents
Key: MESOS-9611
URL: https://issues.apache.org/jira/browse/MESOS-9611
Project: Mesos
Issue Type: Improvement
Reporter: Benno Evers

It is currently quite hard to get information about the machines known to the master. This can result in situations that are hard to debug for silly reasons, e.g. mistyping a machine ID when posting a maintenance schedule.

It would be nice to have an endpoint that displays the current mapping between machine IDs and agents to the user. This could become a new endpoint like `/machines` or `/machine/info`, or be added as part of an existing one like `/maintenance/status`.
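The mapping such an endpoint might display can be sketched in Python. Everything here is hypothetical: the issue does not specify a response format, and the function name, the input tuples, and the output shape are assumptions. The sketch further assumes that a machine is identified by a hostname and an IP address, and that several agents may run on one machine.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def machines_view(agents: Iterable[Tuple[str, str, str]]) -> List[Dict]:
    """Group (agent_id, hostname, ip) tuples by machine."""
    grouped: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for agent_id, hostname, ip in agents:
        grouped[(hostname, ip)].append(agent_id)
    # Sort machines and agent IDs so the output is deterministic.
    return [
        {"id": {"hostname": hostname, "ip": ip}, "agents": sorted(agent_ids)}
        for (hostname, ip), agent_ids in sorted(grouped.items())
    ]


view = machines_view(
    [
        ("agent-S1", "host-a", "10.0.0.1"),
        ("agent-S0", "host-a", "10.0.0.1"),
        ("agent-S2", "host-b", "10.0.0.2"),
    ]
)
for machine in view:
    print(machine)
```

With a listing like this, a mistyped machine ID in a maintenance schedule would be easy to spot, because it would not match any entry with agents attached.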