[jira] [Commented] (MESOS-6835) Fix SIGBUS on ARM64/AArch64
[ https://issues.apache.org/jira/browse/MESOS-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15779461#comment-15779461 ]

AndyPang commented on MESOS-6835:
-

I ran into this problem a long time ago and made a simple fix for it; this may help:
https://issues.apache.org/jira/browse/MESOS-4577

> Fix SIGBUS on ARM64/AArch64
> ---
>
> Key: MESOS-6835
> URL: https://issues.apache.org/jira/browse/MESOS-6835
> Project: Mesos
> Issue Type: Bug
> Components: security, stout
> Reporter: Aaron Wood
> Assignee: Aaron Wood
>
> Currently in the Linux launcher when the stack is allocated and prepared for
> a call to clone() it is not properly aligned. This is not an issue for x86 or
> x64 but for ARM64/AArch64 it is because of the requirement of having the
> stack aligned to a 16 byte boundary. While x86 and x64 also expect a 16 byte
> aligned stack, it is not enforced. An explanation of the
> stack and requirements for ARM64 can be found here
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
> (specifically section 5.2.2.1 that says SP mod 16 = 0. The stack must be
> quad-word aligned.)
> Additionally, the way that the stack is currently allocated and passed to
> clone() accidentally chops off one entry, making a stack overflow using those
> missing 8 bytes a possibility. Fixing this while aligning the memory will fix
> both the stack overflow and the SIGBUS crash.
> https://reviews.apache.org/r/54996/

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
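The alignment requirement quoted in the issue above can be illustrated with a small sketch. This is not the patch posted at the review link; `alignedStackTop` and `STACK_ALIGNMENT` are hypothetical names, and the sketch assumes a downward-growing stack, which holds on x86-64 and AArch64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Required stack alignment on AArch64 (AAPCS64: SP mod 16 = 0).
constexpr std::uintptr_t STACK_ALIGNMENT = 16;

// Returns the highest 16-byte aligned address within [base, base + size].
// The stack grows downward, so clone() expects a pointer to the top of the
// allocation rather than to its start.
// Hypothetical helper for illustration; not a Mesos or stout function.
inline void* alignedStackTop(void* base, std::size_t size)
{
  assert(size >= STACK_ALIGNMENT);
  const std::uintptr_t top = reinterpret_cast<std::uintptr_t>(base) + size;
  // Round down so the result never points past the end of the allocation.
  return reinterpret_cast<void*>(top & ~(STACK_ALIGNMENT - 1));
}
```

The returned pointer is what would be handed to clone() as the child stack. Rounding down gives up at most 15 bytes at the top of the allocation but never yields an address outside it.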
[jira] [Commented] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15779447#comment-15779447 ]

Adam B commented on MESOS-4641:
---

[~jieyu], [~qianzhang], [~avin...@mesosphere.io], Looks like this Epic is complete except for the bug MESOS-5533. Can we close out this Epic and consider it "Done" for Mesos 1.2?

> Support Container Network Interface (CNI).
> --
>
> Key: MESOS-4641
> URL: https://issues.apache.org/jira/browse/MESOS-4641
> Project: Mesos
> Issue Type: Epic
> Reporter: Jie Yu
> Assignee: Qian Zhang
> Labels: mesosphere
>
> CoreOS developed the Container Network Interface (CNI), a proposed standard
> for configuring network interfaces for Linux containers. Many CNI plugins
> (e.g., calico) have already been developed.
> https://coreos.com/blog/rkt-cni-networking.html
> https://github.com/appc/cni/blob/master/SPEC.md
> Kubernetes supports CNI as well.
> http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html
> In the context of Unified Containerizer, it would be nice if we can have a
> 'network/cni' isolator which will speak the CNI protocol and prepare the
> network for the container.
[jira] [Updated] (MESOS-4641) Support Container Network Interface (CNI).
[ https://issues.apache.org/jira/browse/MESOS-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-4641:
--
Priority: Critical (was: Major)

> Support Container Network Interface (CNI).
> --
>
> Key: MESOS-4641
> URL: https://issues.apache.org/jira/browse/MESOS-4641
> Project: Mesos
> Issue Type: Epic
> Reporter: Jie Yu
> Assignee: Qian Zhang
> Priority: Critical
> Labels: mesosphere
>
> CoreOS developed the Container Network Interface (CNI), a proposed standard
> for configuring network interfaces for Linux containers. Many CNI plugins
> (e.g., calico) have already been developed.
> https://coreos.com/blog/rkt-cni-networking.html
> https://github.com/appc/cni/blob/master/SPEC.md
> Kubernetes supports CNI as well.
> http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html
> In the context of Unified Containerizer, it would be nice if we can have a
> 'network/cni' isolator which will speak the CNI protocol and prepare the
> network for the container.
[jira] [Updated] (MESOS-5821) Clean up the thousands of compiler warnings on MSVC
[ https://issues.apache.org/jira/browse/MESOS-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-5821:
--
Summary: Clean up the thousands of compiler warnings on MSVC (was: Clean up the billions of compiler warnings on MSVC)

> Clean up the thousands of compiler warnings on MSVC
> ---
>
> Key: MESOS-5821
> URL: https://issues.apache.org/jira/browse/MESOS-5821
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Reporter: Alex Clemmer
> Assignee: Daniel Pravat
> Labels: mesosphere, microsoft, slave
> Fix For: 1.2.0
>
> Clean builds of Mesos on Windows will result in approximately
> {{5800 Warning(s)}} or more.
[jira] [Updated] (MESOS-6082) Add scheduler Call and Event based metrics to the master.
[ https://issues.apache.org/jira/browse/MESOS-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6082:
--
Priority: Critical (was: Major)

> Add scheduler Call and Event based metrics to the master.
> -
>
> Key: MESOS-6082
> URL: https://issues.apache.org/jira/browse/MESOS-6082
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Benjamin Mahler
> Assignee: Abhishek Dasgupta
> Priority: Critical
>
> Currently, the master only has metrics for the old-style messages and these
> are re-used for calls unfortunately:
> {code}
> // Messages from schedulers.
> process::metrics::Counter messages_register_framework;
> process::metrics::Counter messages_reregister_framework;
> process::metrics::Counter messages_unregister_framework;
> process::metrics::Counter messages_deactivate_framework;
> process::metrics::Counter messages_kill_task;
> process::metrics::Counter messages_status_update_acknowledgement;
> process::metrics::Counter messages_resource_request;
> process::metrics::Counter messages_launch_tasks;
> process::metrics::Counter messages_decline_offers;
> process::metrics::Counter messages_revive_offers;
> process::metrics::Counter messages_suppress_offers;
> process::metrics::Counter messages_reconcile_tasks;
> process::metrics::Counter messages_framework_to_executor;
> {code}
> Now that we've introduced the Call/Event based API, we should have metrics
> that reflect this. For example:
> {code}
> {
>   scheduler/calls: 100,
>   scheduler/calls/decline: 90,
>   scheduler/calls/accept: 10,
>   scheduler/calls/accept/operations/create: 1,
>   scheduler/calls/accept/operations/destroy: 0,
>   scheduler/calls/accept/operations/launch: 4,
>   scheduler/calls/accept/operations/launch_group: 2,
>   scheduler/calls/accept/operations/reserve: 1,
>   scheduler/calls/accept/operations/unreserve: 0,
>   scheduler/calls/kill: 0,
>   // etc
> }
> {code}
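One way to picture the hierarchical naming proposed in the issue above is the sketch below. It uses a plain `std::map` as a stand-in for a metrics registry and does not use `process::metrics::Counter`; `Counters` and `recordCall` are hypothetical names, and how the master would actually register these metrics is not stated in the ticket.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a metrics registry: counter name -> value.
using Counters = std::map<std::string, std::uint64_t>;

// Records one scheduler call of the given type (e.g. "decline", "accept").
// For accept calls, `operations` names the offer operations the call carried.
inline void recordCall(
    Counters& counters,
    const std::string& type,
    const std::vector<std::string>& operations = {})
{
  // Every call bumps the total and the per-type counter.
  ++counters["scheduler/calls"];
  ++counters["scheduler/calls/" + type];

  // Each operation gets its own counter nested under the call type.
  for (const std::string& operation : operations) {
    ++counters["scheduler/calls/" + type + "/operations/" + operation];
  }
}
```

With this layout the per-type counters always sum to `scheduler/calls`, which matches the example in the ticket (90 declines plus 10 accepts out of 100 calls).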
[jira] [Updated] (MESOS-6035) Add non-recursive version of cgroups::get
[ https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6035:
--
Component/s: cgroups

> Add non-recursive version of cgroups::get
> -
>
> Key: MESOS-6035
> URL: https://issues.apache.org/jira/browse/MESOS-6035
> Project: Mesos
> Issue Type: Improvement
> Components: cgroups
> Reporter: haosdent
> Assignee: haosdent
> Priority: Minor
> Fix For: 1.2.0
>
> In some cases, we only need to get the top level cgroups instead of getting
> all cgroups recursively. Adding a non-recursive version could help to avoid
> traversing unnecessary paths.
[jira] [Updated] (MESOS-6551) Add attach/exec commands to the Mesos CLI
[ https://issues.apache.org/jira/browse/MESOS-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6551:
--
Component/s: cli

> Add attach/exec commands to the Mesos CLI
> -
>
> Key: MESOS-6551
> URL: https://issues.apache.org/jira/browse/MESOS-6551
> Project: Mesos
> Issue Type: Task
> Components: cli
> Reporter: Kevin Klues
> Assignee: Kevin Klues
> Labels: debugging, mesosphere
>
> After all of this support has landed, we need to update the Mesos CLI to
> implement {{attach}} and {{exec}} functionality as outlined in the Design Doc.
[jira] [Updated] (MESOS-6639) Update 'io::redirect()' to take an optional vector of callback hooks.
[ https://issues.apache.org/jira/browse/MESOS-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6639:
--
Summary: Update 'io::redirect()' to take an optional vector of callback hooks. (was: Updat 'io::redirect()' to take an optional vector of callback hooks.)

> Update 'io::redirect()' to take an optional vector of callback hooks.
> -
>
> Key: MESOS-6639
> URL: https://issues.apache.org/jira/browse/MESOS-6639
> Project: Mesos
> Issue Type: Improvement
> Reporter: Kevin Klues
> Assignee: Kevin Klues
>
> These callback hooks should be invoked before passing any data read from
> the 'from' file descriptor on to the 'to' file descriptor.
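The ordering described in the issue above (hooks first, then forward the data) can be sketched as follows. This is a synchronous illustration over standard streams, not the libprocess `io::redirect()`, which works on file descriptors and is asynchronous; `Hook`, `redirectWithHooks` and the `chunkSize` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// A hook receives each chunk read from the source before it is forwarded.
using Hook = std::function<void(const std::string&)>;

// Copies `from` to `to` in chunks of at most `chunkSize` bytes, invoking
// every hook on a chunk before that chunk is written to `to`.
inline void redirectWithHooks(
    std::istream& from,
    std::ostream& to,
    const std::vector<Hook>& hooks = {},
    std::size_t chunkSize = 4096)
{
  assert(chunkSize > 0);
  std::string buffer(chunkSize, '\0');

  // read() fails on a short final chunk, so also check gcount() to make
  // sure the trailing bytes are still processed.
  while (from.read(&buffer[0], static_cast<std::streamsize>(chunkSize)) ||
         from.gcount() > 0) {
    const std::string chunk(
        buffer.data(), static_cast<std::size_t>(from.gcount()));

    for (const Hook& hook : hooks) {
      hook(chunk);
    }

    to.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
  }
}
```

Making the hook vector optional (defaulted to empty here) keeps existing callers unchanged, which is presumably why the ticket asks for an optional argument.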
[jira] [Updated] (MESOS-6551) Add attach/exec commands to the Mesos CLI
[ https://issues.apache.org/jira/browse/MESOS-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6551:
--
Priority: Critical (was: Major)

> Add attach/exec commands to the Mesos CLI
> -
>
> Key: MESOS-6551
> URL: https://issues.apache.org/jira/browse/MESOS-6551
> Project: Mesos
> Issue Type: Task
> Components: cli
> Reporter: Kevin Klues
> Assignee: Kevin Klues
> Priority: Critical
> Labels: debugging, mesosphere
>
> After all of this support has landed, we need to update the Mesos CLI to
> implement {{attach}} and {{exec}} functionality as outlined in the Design Doc.
[jira] [Updated] (MESOS-6506) Show framework info in /state and /frameworks for frameworks that have orphan tasks
[ https://issues.apache.org/jira/browse/MESOS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6506:
--
Component/s: master

> Show framework info in /state and /frameworks for frameworks that have orphan
> tasks
> ---
>
> Key: MESOS-6506
> URL: https://issues.apache.org/jira/browse/MESOS-6506
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Vinod Kone
> Assignee: Vinod Kone
>
> Since Mesos 1.0, the master has access to FrameworkInfo of frameworks that
> have orphan tasks. So we could expose this information in /state and
> /frameworks endpoints. Note that this information is already present in the
> v1 operator API.
[jira] [Updated] (MESOS-5967) Add support for 'docker image inspect' in our docker abstraction.
[ https://issues.apache.org/jira/browse/MESOS-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-5967:
--
Component/s: docker containerization

> Add support for 'docker image inspect' in our docker abstraction.
> -
>
> Key: MESOS-5967
> URL: https://issues.apache.org/jira/browse/MESOS-5967
> Project: Mesos
> Issue Type: Improvement
> Components: containerization, docker
> Reporter: Kevin Klues
> Assignee: Guangya Liu
> Labels: gpu
>
> Docker's command line tool for {{docker inspect}} can take either a
> {{container}}, an {{image}}, or a {{task}} as its argument, and return a JSON
> array containing low-level information about that container, image or task.
> However, the current {{docker inspect}} support in our docker abstraction
> only supports inspecting containers (not images or tasks). We should expand
> this to (at least) support images.
> In particular, this additional functionality is motivated by the upcoming GPU
> support, which needs to inspect the labels in a docker image to decide if it
> should inject the required Nvidia volumes into a container.
[jira] [Updated] (MESOS-6377) Complete all unit-tests required to strengthen test on CNI port-mapper plugin.
[ https://issues.apache.org/jira/browse/MESOS-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6377:
--
Component/s: tests network

> Complete all unit-tests required to strengthen test on CNI port-mapper plugin.
> --
>
> Key: MESOS-6377
> URL: https://issues.apache.org/jira/browse/MESOS-6377
> Project: Mesos
> Issue Type: Epic
> Components: network, tests
> Environment: Linux
> Reporter: Avinash Sridharan
> Assignee: Avinash Sridharan
> Labels: mesosphere
>
> This epic captures all the unit-test tickets that we need to complete to get
> better test-coverage for the CNI port-mapper plugin.
[jira] [Updated] (MESOS-6348) Allow `network/cni` isolator unit-tests to run with CNI plugins
[ https://issues.apache.org/jira/browse/MESOS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6348:
--
Component/s: tests network

> Allow `network/cni` isolator unit-tests to run with CNI plugins
> 
>
> Key: MESOS-6348
> URL: https://issues.apache.org/jira/browse/MESOS-6348
> Project: Mesos
> Issue Type: Task
> Components: network, tests
> Reporter: Avinash Sridharan
> Assignee: Avinash Sridharan
> Labels: mesosphere
>
> Currently, we don't have any infrastructure to allow for CNI plugins to be
> used in `network/cni` isolator unit-tests. This forces us to mock CNI plugins
> that don't use new network namespaces, leading to a very restrictive form of
> unit-tests.
> Especially for the port-mapper plugin, in order to test its DNAT functionality
> it will be very useful if we run the containers in a separate network
> namespace, which requires an actual CNI plugin.
> The proposal is to introduce a test filter called CNIPLUGIN, that gets
> set when the CNI_PATH env var is set. Tests using the CNIPLUGIN filter can then
> use actual CNI plugins in their tests.
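The condition behind the proposed CNIPLUGIN filter can be sketched as below. `cniPluginTestsEnabled` is a hypothetical helper, and treating an empty CNI_PATH as unset is an assumption of this sketch; how the filter would be wired into the Mesos test environment is not described in the ticket.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns true when tests carrying the proposed CNIPLUGIN filter should run,
// i.e. when the CNI_PATH environment variable is set.
// Assumption: an empty value is treated the same as an unset variable.
// Hypothetical helper; not part of the Mesos test framework.
inline bool cniPluginTestsEnabled()
{
  const char* path = std::getenv("CNI_PATH");
  return path != nullptr && !std::string(path).empty();
}
```

A test runner could consult this once at startup and skip every test whose name carries the CNIPLUGIN marker when it returns false.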
[jira] [Updated] (MESOS-6335) Add user doc for task group tasks
[ https://issues.apache.org/jira/browse/MESOS-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6335:
--
Component/s: documentation

> Add user doc for task group tasks
> -
>
> Key: MESOS-6335
> URL: https://issues.apache.org/jira/browse/MESOS-6335
> Project: Mesos
> Issue Type: Documentation
> Components: documentation
> Reporter: Vinod Kone
> Assignee: Gilbert Song
> Fix For: 1.2.0
>
> Committed some basic documentation. So moving this to pods-improvements epic
> and targeting this for 1.2.0. I would like this to track the more
> comprehensive documentation.
[jira] [Updated] (MESOS-6494) Clean up the flags parsing in the executors.
[ https://issues.apache.org/jira/browse/MESOS-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6494:
--
Component/s: executor

> Clean up the flags parsing in the executors.
>
>
> Key: MESOS-6494
> URL: https://issues.apache.org/jira/browse/MESOS-6494
> Project: Mesos
> Issue Type: Improvement
> Components: executor
> Reporter: Gastón Kleiman
> Assignee: Gastón Kleiman
> Labels: mesosphere
>
> The current executors and the executor libraries use a mix of `stout::flags`
> and `os::getenv` to parse flags, leading to a lot of unnecessary and
> sometimes duplicated code.
> This should be cleaned up, using only {{stout::flags}} to parse flags.
> Environment variables should be used for the flags that are common to ALL the
> executors (listed in the Executor HTTP API doc).
> Command line parameters should be used for flags that apply only to
> individual executors.
[jira] [Updated] (MESOS-6032) Add infrastructure for unit tests in the new python-based CLI.
[ https://issues.apache.org/jira/browse/MESOS-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6032:
--
Component/s: tests cli

> Add infrastructure for unit tests in the new python-based CLI.
> --
>
> Key: MESOS-6032
> URL: https://issues.apache.org/jira/browse/MESOS-6032
> Project: Mesos
> Issue Type: Task
> Components: cli, tests
> Reporter: Kevin Klues
> Assignee: Kevin Klues
>
[jira] [Updated] (MESOS-6430) The python linter doesn't rebuild the virtual environment before linting when "pip-requirements.txt" has changed
[ https://issues.apache.org/jira/browse/MESOS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6430:
--
Component/s: build

> The python linter doesn't rebuild the virtual environment before linting when
> "pip-requirements.txt" has changed
> 
>
> Key: MESOS-6430
> URL: https://issues.apache.org/jira/browse/MESOS-6430
> Project: Mesos
> Issue Type: Bug
> Components: build
> Reporter: Kevin Klues
> Assignee: Kevin Klues
>
> We need to detect if "pip-requirements.txt" changes and rebuild the virtual
> environment if it has.
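One possible way to detect the change described in the issue above is to keep a copy of the requirements file (a "stamp") from the last time the virtual environment was built and compare against it. The sketch below only illustrates that check; the linter itself is not shown in this message, and `readFile`, `needsRebuild`, `writeStamp` and the stamp file are all hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Reads a whole file into `contents`; returns false if it cannot be opened.
inline bool readFile(const std::string& path, std::string* contents)
{
  std::ifstream file(path, std::ios::binary);
  if (!file) {
    return false;
  }
  contents->assign(
      std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
  return true;
}

// Returns true when the virtual environment should be rebuilt: the stamp
// (a copy of the requirements saved at the last build) is missing or differs.
inline bool needsRebuild(
    const std::string& requirements, const std::string& stamp)
{
  std::string current, previous;
  if (!readFile(requirements, &current)) {
    return true; // Cannot tell; rebuilding is the safe choice.
  }
  return !readFile(stamp, &previous) || current != previous;
}

// Records the requirements the environment was just built from.
inline bool writeStamp(
    const std::string& requirements, const std::string& stamp)
{
  std::string current;
  if (!readFile(requirements, &current)) {
    return false;
  }
  std::ofstream out(stamp, std::ios::binary | std::ios::trunc);
  out << current;
  return static_cast<bool>(out);
}
```

Comparing contents rather than modification times avoids a needless rebuild when the file is touched but unchanged, at the cost of reading a small file on every lint run.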
[jira] [Updated] (MESOS-6264) Investigate the high memory usage of the default executor.
[ https://issues.apache.org/jira/browse/MESOS-6264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6264:
--
Component/s: executor

> Investigate the high memory usage of the default executor.
> --
>
> Key: MESOS-6264
> URL: https://issues.apache.org/jira/browse/MESOS-6264
> Project: Mesos
> Issue Type: Bug
> Components: executor
> Reporter: Anand Mazumdar
> Labels: mesosphere
> Attachments: pmap_output_for_the_default_executor.txt
>
> It seems that a default executor with two sleep tasks is using ~32 MB on
> average and can sometimes lead to it being killed for some tests like
> {{SlaveRecoveryTest/0.ROOT_CGROUPS_ReconnectDefaultExecutor}} on our internal
> CI. Attached the {{pmap}} output for the default executor. Please note that
> the command executor memory usage is also pretty high (~26 MB).
[jira] [Updated] (MESOS-6280) Task group executor should support command health checks.
[ https://issues.apache.org/jira/browse/MESOS-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6280:
--
Component/s: executor

> Task group executor should support command health checks.
> -
>
> Key: MESOS-6280
> URL: https://issues.apache.org/jira/browse/MESOS-6280
> Project: Mesos
> Issue Type: Improvement
> Components: executor
> Affects Versions: 1.1.0
> Reporter: Alexander Rukletsov
> Assignee: Gastón Kleiman
> Priority: Critical
> Labels: health-check, mesosphere
>
> Currently, the default (aka pod) executor supports only HTTP and TCP health
> checks. We should support command health checks as well.
[jira] [Updated] (MESOS-6405) Benchmark call ingestion path on the Mesos master.
[ https://issues.apache.org/jira/browse/MESOS-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6405:
--
Component/s: scheduler api master

> Benchmark call ingestion path on the Mesos master.
> --
>
> Key: MESOS-6405
> URL: https://issues.apache.org/jira/browse/MESOS-6405
> Project: Mesos
> Issue Type: Improvement
> Components: master, scheduler api
> Reporter: Anand Mazumdar
> Assignee: Anand Mazumdar
> Priority: Critical
> Labels: mesosphere
>
> [~drexin] reported on the user mailing
> [list|http://mail-archives.apache.org/mod_mbox/mesos-user/201610.mbox/%3C6B42E374-9AB7--A315-A6558753E08B%40apple.com%3E]
> that there seems to be a significant regression in performance on the call
> ingestion path on the Mesos master with respect to the scheduler driver (v0 API).
> We should create a benchmark to first get a sense of the numbers and then go
> about fixing the performance issues.
[jira] [Updated] (MESOS-6623) Re-enable tests impacted by request streaming support
[ https://issues.apache.org/jira/browse/MESOS-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam B updated MESOS-6623:
--
Component/s: tests HTTP API

> Re-enable tests impacted by request streaming support
> -
>
> Key: MESOS-6623
> URL: https://issues.apache.org/jira/browse/MESOS-6623
> Project: Mesos
> Issue Type: Bug
> Components: HTTP API, tests
> Reporter: Anand Mazumdar
> Assignee: Anand Mazumdar
> Priority: Critical
> Labels: mesosphere
>
> We added support for HTTP request streaming in libprocess as part of
> MESOS-6466. However, this broke a few tests that relied on HTTP request
> filtering since the handlers no longer have access to the body of the request
> when {{visit()}} is invoked. We would need to revisit how we do HTTP request
> filtering and then re-enable these tests.
[jira] [Updated] (MESOS-5455) Transition away from temporary build variables
[ https://issues.apache.org/jira/browse/MESOS-5455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Clemmer updated MESOS-5455:
Labels: mesosphere microsoft (was: mesosphere)

> Transition away from temporary build variables
> --
>
> Key: MESOS-5455
> URL: https://issues.apache.org/jira/browse/MESOS-5455
> Project: Mesos
> Issue Type: Bug
> Components: cmake
> Reporter: Alex Clemmer
> Assignee: Alex Clemmer
> Labels: mesosphere, microsoft
>
> Right now the CMake build system has a bunch of stub values for variables
> like `BUILD_DATE`.
> We should replace these with "real" values.