[jira] [Comment Edited] (MESOS-7176) Add versioning support to network/cni isolator
[ https://issues.apache.org/jira/browse/MESOS-7176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374228#comment-16374228 ]

Qian Zhang edited comment on MESOS-7176 at 7/20/18 1:19 AM:

According to the [CNI spec|https://github.com/containernetworking/cni/blob/master/SPEC.md#released-versions], one of the major changes introduced in CNI spec 0.3.0 is a rich result type. The result type of CNI spec 0.3.0 ([https://github.com/containernetworking/cni/blob/spec-v0.3.0/SPEC.md#result|https://github.com/containernetworking/cni/blob/spec-v0.3.0/SPEC.md#result]) differs from that of CNI spec 0.2.0. The CNI isolator in Mesos uses CNI spec 0.2.0; see [here|https://github.com/apache/mesos/blob/1.5.0/src/slave/containerizer/mesos/isolators/network/cni/spec.proto#L63:L67] for details.

As a result, the CNI isolator currently can NOT support a CNI network configuration whose version is 0.3.0+. If the CNI isolator invokes a CNI plugin (assuming the plugin also supports CNI spec 0.3.0+) with a CNI network configuration of version 0.3.0+ as its input (see the example below), the plugin returns a result that conforms to the same CNI spec version as the input configuration (i.e., 0.3.0 in the example below). However, the CNI isolator always parses the result as CNI spec 0.2.0 (see [here|https://github.com/apache/mesos/blob/1.5.0/src/slave/containerizer/mesos/isolators/network/cni/spec.cpp#L46:L59] for details), so parsing fails.

{code:java}
{
  "cniVersion": "0.3.0",
  "name": "dbnet",
  "type": "bridge",
  "bridge": "cni0",
  "ipam": {
    "type": "dhcp"
  }
}
{code}

So I think we should improve the CNI isolator to support CNI spec 0.3.0 as well, and to parse the result returned by a CNI plugin based on the CNI spec version of that result.
> Add versioning support to network/cni isolator
>
> Key: MESOS-7176
> URL: https://issues.apache.org/jira/browse/MESOS-7176
> Project: Mesos
> Issue Type: Task
> Components: containerization
> Reporter: Avinash Sridharan
> Assignee: Deepak Goel
> Priority: Major
>
> Currently the network/cni isolator supports CNI spec version 0.2. CNI spec version 0.3 has already been ratified and introduces new features such as CNI service chaining and CNI plugin capabilities. However, CNI spec version 0.3 is incompatible with CNI spec 0.2. Hence we need to introduce versioning support in the `network/cni` isolator in order to keep it backward compatible.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
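The version-based parsing proposed in MESOS-7176 might be sketched roughly as follows. This is only an illustration, not the isolator's code: `ResultFormat`, `parseVersion`, and `selectResultFormat` are hypothetical names, and the sketch assumes that every `cniVersion` from 0.3.0 onward uses the richer result type while anything lower uses the 0.2.0 layout.

```cpp
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical names throughout; this is not the Mesos isolator code.
enum class ResultFormat { UNSUPPORTED, LEGACY_0_2, RICH_0_3 };

using Version = std::tuple<int, int, int>;

// Parses a strict "MAJOR.MINOR.PATCH" string into a comparable tuple.
// Returns false on malformed input and leaves `version` untouched.
bool parseVersion(const std::string& text, Version* version)
{
  std::istringstream in(text);
  int majorPart = 0, minorPart = 0, patchPart = 0;
  char dot1 = 0, dot2 = 0;

  // `noskipws` rejects embedded or leading whitespace.
  if (!(in >> std::noskipws >> majorPart >> dot1 >> minorPart >> dot2 >>
        patchPart)) {
    return false;
  }
  if (dot1 != '.' || dot2 != '.') return false;
  if (majorPart < 0 || minorPart < 0 || patchPart < 0) return false;

  // Reject trailing characters such as "-beta".
  if (in.peek() != std::istringstream::traits_type::eof()) return false;

  *version = std::make_tuple(majorPart, minorPart, patchPart);
  return true;
}

// Picks the parser to use from the version reported with the result.
// Assumption: 0.3.0 is the first version with the rich result type.
ResultFormat selectResultFormat(const std::string& cniVersion)
{
  Version version;
  if (!parseVersion(cniVersion, &version)) {
    return ResultFormat::UNSUPPORTED;
  }
  return version >= std::make_tuple(0, 3, 0) ? ResultFormat::RICH_0_3
                                             : ResultFormat::LEGACY_0_2;
}
```

A caller would branch on the returned `ResultFormat` before choosing which result message to parse into, and treat `UNSUPPORTED` as an error rather than falling back to 0.2.0.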
[jira] [Commented] (MESOS-5818) Port libprocess reap_tests.cpp
[ https://issues.apache.org/jira/browse/MESOS-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549979#comment-16549979 ]

Andrew Schwartzmeyer commented on MESOS-5818:

These are annoying because they use the {{Fork}} and {{Exec}} constructs, which we chose not to port.

> Port libprocess reap_tests.cpp
>
> Key: MESOS-5818
> URL: https://issues.apache.org/jira/browse/MESOS-5818
> Project: Mesos
> Issue Type: Task
> Reporter: Andrew Schwartzmeyer
> Assignee: Eric Mumau
> Priority: Major
> Labels: libprocess, mesosphere, windows
[jira] [Assigned] (MESOS-5819) Port libprocess sequence_tests.cpp
[ https://issues.apache.org/jira/browse/MESOS-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Schwartzmeyer reassigned MESOS-5819:

Assignee: Andrew Schwartzmeyer (was: Eric Mumau)

> Port libprocess sequence_tests.cpp
>
> Key: MESOS-5819
> URL: https://issues.apache.org/jira/browse/MESOS-5819
> Project: Mesos
> Issue Type: Task
> Reporter: Andrew Schwartzmeyer
> Assignee: Andrew Schwartzmeyer
> Priority: Major
> Labels: libprocess, mesosphere, windows
[jira] [Created] (MESOS-9098) `os::clone` returns `Failed to clone: Success` on error.
Chun-Hung Hsiao created MESOS-9098:

Summary: `os::clone` returns `Failed to clone: Success` on error.
Key: MESOS-9098
URL: https://issues.apache.org/jira/browse/MESOS-9098
Project: Mesos
Issue Type: Improvement
Components: stout
Affects Versions: 1.7.0
Reporter: Chun-Hung Hsiao

{{os::clone}} in stout is implemented such that when {{::clone}} fails, it calls {{::munmap}} to free the allocated stack memory. This overwrites {{errno}}, causing {{os::clone}} to return a {{Failed to clone: Success}} error:

[https://github.com/apache/mesos/blob/master/3rdparty/stout/include/stout/os/linux.hpp#L165]

We should preserve {{errno}} before calling {{::munmap}}, and return {{::munmap}}'s {{errno}} only if {{::clone}}'s {{errno}} is not zero.
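The errno-preserving pattern described in MESOS-9098 could look roughly like the sketch below. It is not the stout implementation: `failingCall` is a hypothetical stand-in for a failing `::clone` (it closes an invalid file descriptor, which fails with `EBADF`), and `callWithCleanup` is a hypothetical wrapper showing only the ordering of "save errno, clean up, report the saved value".

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <string>

#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for a failing `::clone`: closing an invalid file
// descriptor returns -1 and sets `errno` to EBADF.
static int failingCall() { return ::close(-1); }

// Returns an empty string on success, or an error message built from the
// errno of the failing call rather than from whatever cleanup left behind.
std::string callWithCleanup()
{
  const std::size_t size = 4096;

  void* stack = ::mmap(
      nullptr, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  if (stack == MAP_FAILED) {
    return std::string("Failed to allocate stack: ") + std::strerror(errno);
  }

  if (failingCall() == -1) {
    // Capture errno before any cleanup call gets a chance to change it.
    const int saved = errno;

    ::munmap(stack, size);

    // Restore it for callers that inspect errno directly.
    errno = saved;
    return std::string("Failed to clone: ") + std::strerror(saved);
  }

  ::munmap(stack, size);
  return "";
}
```

With the saved value, the message reports the original failure (here the text for `EBADF`) instead of `Success`.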
[jira] [Commented] (MESOS-9097) `libwinio_loop` must be initialized before `Socket` constructor is called
[ https://issues.apache.org/jira/browse/MESOS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549836#comment-16549836 ]

Andrew Schwartzmeyer commented on MESOS-9097:

https://reviews.apache.org/r/67976/
https://reviews.apache.org/r/67977/

> `libwinio_loop` must be initialized before `Socket` constructor is called
>
> Key: MESOS-9097
> URL: https://issues.apache.org/jira/browse/MESOS-9097
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 1.7.0
> Environment: Windows with {{-DENABLE_LIBWINIO=ON}}
> Reporter: Andrew Schwartzmeyer
> Assignee: Akash Gupta
> Priority: Major
> Labels: libprocess, windows
>
> When building with {{-DENABLE_LIBWINIO}}, initializing the Windows event loop (specifically the pointer {{process::libwinio_loop}}) becomes a prerequisite to creating a {{Socket}}. If it has not been initialized, then when the {{Socket}} constructor calls {{prepare_async()}}, a null pointer is dereferenced, leading to a hang on Windows.
>
> This was discovered in the simple program {{test-linkee}}, where a {{Socket}} is created and used, but the entire libprocess event loop is unused. This is temporarily fixed by calling {{process::initialize()}} early in {{test-linkee}}, but this should probably not be required. Instead, {{prepare_async()}} (or any use of {{libwinio_loop}}) should probably auto-initialize the event loop if required.
>
> For now, I am adding fatal checks before a null pointer dereference.
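The auto-initialization suggested in MESOS-9097 could be approached with initialize-on-first-use, sketched below. None of this is libprocess code: `EventLoop`, `eventLoop()`, and `prepareAsync()` are hypothetical stand-ins for the Windows event loop, an accessor replacing direct use of the `libwinio_loop` pointer, and `prepare_async()` respectively.

```cpp
// Hypothetical stand-in for the Windows event loop; not the libprocess type.
struct EventLoop
{
  int registered = 0;

  void registerHandle() { ++registered; }
};

// Accessor that replaces direct use of a global pointer. The function-local
// static is constructed on the first call, and C++11 guarantees that this
// construction is thread-safe, so callers never observe a null loop no
// matter which component touches it first.
EventLoop& eventLoop()
{
  static EventLoop loop;
  return loop;
}

// Hypothetical analogue of `prepare_async()`: it goes through the accessor
// instead of dereferencing a pointer that may not have been set yet.
void prepareAsync()
{
  eventLoop().registerHandle();
}
```

The trade-off of this design is that initialization cost moves to the first caller, and any explicit initialization entry point would need to go through the same accessor to avoid creating two loops.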
[jira] [Commented] (MESOS-9095) Consider including public protobuf definitions in generated jar
[ https://issues.apache.org/jira/browse/MESOS-9095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549695#comment-16549695 ]

Tim Harper commented on MESOS-9095:

Thank you for filing this, Benjamin. This will be really helpful. Currently, Marathon does what you say (we copy the Proto sources into our own code base, and check in the generated code).

> Consider including public protobuf definitions in generated jar
>
> Key: MESOS-9095
> URL: https://issues.apache.org/jira/browse/MESOS-9095
> Project: Mesos
> Issue Type: Improvement
> Components: java api
> Reporter: Benjamin Bannier
> Priority: Major
>
> We currently do not package public proto sources alongside other resources in the jar. This is inconsistent with what we do, e.g., for packages or {{install rules}} on the C++ side.
>
> Frameworks seem to work around this by forking required proto sources into their own source code, or (slightly less bad) fetching them from potentially poorly versioned internet resources. Both approaches can lead to complicated dependencies between the jar in use and the proto sources.
>
> We should include them in the jar we publish, e.g., by declaring them as {{resources}}.
[jira] [Assigned] (MESOS-9097) `libwinio_loop` must be initialized before `Socket` constructor is called
[ https://issues.apache.org/jira/browse/MESOS-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Schwartzmeyer reassigned MESOS-9097:

Assignee: Akash Gupta

> `libwinio_loop` must be initialized before `Socket` constructor is called
>
> Key: MESOS-9097
> URL: https://issues.apache.org/jira/browse/MESOS-9097
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 1.7.0
> Environment: Windows with {{-DENABLE_LIBWINIO=ON}}
> Reporter: Andrew Schwartzmeyer
> Assignee: Akash Gupta
> Priority: Major
> Labels: libprocess, windows
>
> When building with {{-DENABLE_LIBWINIO}}, initializing the Windows event loop (specifically the pointer {{process::libwinio_loop}}) becomes a prerequisite to creating a {{Socket}}. If it has not been initialized, then when the {{Socket}} constructor calls {{prepare_async()}}, a null pointer is dereferenced, leading to a hang on Windows.
>
> This was discovered in the simple program {{test-linkee}}, where a {{Socket}} is created and used, but the entire libprocess event loop is unused. This is temporarily fixed by calling {{process::initialize()}} early in {{test-linkee}}, but this should probably not be required. Instead, {{prepare_async()}} (or any use of {{libwinio_loop}}) should probably auto-initialize the event loop if required.
>
> For now, I am adding fatal checks before a null pointer dereference.
[jira] [Created] (MESOS-9097) `libwinio_loop` must be initialized before `Socket` constructor is called
Andrew Schwartzmeyer created MESOS-9097:

Summary: `libwinio_loop` must be initialized before `Socket` constructor is called
Key: MESOS-9097
URL: https://issues.apache.org/jira/browse/MESOS-9097
Project: Mesos
Issue Type: Bug
Affects Versions: 1.7.0
Environment: Windows with {{-DENABLE_LIBWINIO=ON}}
Reporter: Andrew Schwartzmeyer

When building with {{-DENABLE_LIBWINIO}}, initializing the Windows event loop (specifically the pointer {{process::libwinio_loop}}) becomes a prerequisite to creating a {{Socket}}. If it has not been initialized, then when the {{Socket}} constructor calls {{prepare_async()}}, a null pointer is dereferenced, leading to a hang on Windows.

This was discovered in the simple program {{test-linkee}}, where a {{Socket}} is created and used, but the entire libprocess event loop is unused. This is temporarily fixed by calling {{process::initialize()}} early in {{test-linkee}}, but this should probably not be required. Instead, {{prepare_async()}} (or any use of {{libwinio_loop}}) should probably auto-initialize the event loop if required.

For now, I am adding fatal checks before a null pointer dereference.
[jira] [Created] (MESOS-9096) Consider introducing a linter to check changes to tag numbers in public protos
Benjamin Bannier created MESOS-9096:

Summary: Consider introducing a linter to check changes to tag numbers in public protos
Key: MESOS-9096
URL: https://issues.apache.org/jira/browse/MESOS-9096
Project: Mesos
Issue Type: Improvement
Components: build
Reporter: Benjamin Bannier

Right now, detecting breaking changes to proto messages where a tag number changes requires manual inspection. It seems it should be possible to write a proto linter which would detect such changes. It could implement the following flow:

* check if the proto is public, e.g., in some public include path
* check that the diff contains no changes to tag numbers (same field name, similar location).

We should also check whether such tools already exist and whether we could adopt them.
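The second step of the flow in MESOS-9096 (detecting a changed tag number for the same field name) might be sketched as below. This is a rough illustration, not an existing tool: `extractTags` and `changedTags` are hypothetical helpers, the regex only recognizes simple `name = N;` or `name = N [` declarations, and the sketch assumes a single message per input, so enum values and identically named fields in different messages are not distinguished. The "is this proto public" check is left out.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Extracts `field name -> tag number` from proto source text.
// Assumption: fields look like `<type> <name> = <N>;` or `... = <N> [`.
std::map<std::string, int> extractTags(const std::string& proto)
{
  static const std::regex field(R"((\w+)\s*=\s*(\d+)\s*[;\[])");

  std::map<std::string, int> tags;
  for (std::sregex_iterator it(proto.begin(), proto.end(), field), end;
       it != end;
       ++it) {
    tags[(*it)[1].str()] = std::stoi((*it)[2].str());
  }
  return tags;
}

// Returns the names of fields present in both versions whose tag number
// differs. Added or removed fields are not reported by this sketch.
std::vector<std::string> changedTags(
    const std::string& before, const std::string& after)
{
  const std::map<std::string, int> oldTags = extractTags(before);
  const std::map<std::string, int> newTags = extractTags(after);

  std::vector<std::string> changed;
  for (const auto& entry : oldTags) {
    const auto found = newTags.find(entry.first);
    if (found != newTags.end() && found->second != entry.second) {
      changed.push_back(entry.first);
    }
  }
  return changed;
}
```

A real linter would more likely work on parsed descriptors than on regex matches, which is one reason to check for existing tools first, as the issue suggests.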
[jira] [Assigned] (MESOS-2633) Move implementations of Framework struct functions out of master.hpp
[ https://issues.apache.org/jira/browse/MESOS-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov reassigned MESOS-2633:

Assignee: (was: Isabel Jimenez)

> Move implementations of Framework struct functions out of master.hpp
>
> Key: MESOS-2633
> URL: https://issues.apache.org/jira/browse/MESOS-2633
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Joris Van Remoortere
> Priority: Trivial
> Labels: master, newbie, tech-debt, trivial
>
> To help reduce compile time and keep the header easy to read, let's move the implementations of the Framework struct functions out of master.hpp
[jira] [Created] (MESOS-9095) Consider including public protobuf definitions in generated jar
Benjamin Bannier created MESOS-9095:

Summary: Consider including public protobuf definitions in generated jar
Key: MESOS-9095
URL: https://issues.apache.org/jira/browse/MESOS-9095
Project: Mesos
Issue Type: Improvement
Components: java api
Reporter: Benjamin Bannier

We currently do not package public proto sources alongside other resources in the jar. This is inconsistent with what we do, e.g., for packages or {{install rules}} on the C++ side.

Frameworks seem to work around this by forking required proto sources into their own source code, or (slightly less bad) fetching them from potentially poorly versioned internet resources. Both approaches can lead to complicated dependencies between the jar in use and the proto sources.

We should include them in the jar we publish, e.g., by declaring them as {{resources}}.
[jira] [Commented] (MESOS-9094) On macOS libprocess_tests fail to link when compiling with gRPC
[ https://issues.apache.org/jira/browse/MESOS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548922#comment-16548922 ]

Jan Schlicht commented on MESOS-9094:

cc [~chhsia0]. Found https://grpc.io/grpc/cpp/classgrpc_1_1_time_point.html which seems to be related.

> On macOS libprocess_tests fail to link when compiling with gRPC
>
> Key: MESOS-9094
> URL: https://issues.apache.org/jira/browse/MESOS-9094
> Project: Mesos
> Issue Type: Bug
> Environment: macOS 10.13.6 with clang 6.0.1.
> Reporter: Jan Schlicht
> Priority: Major
> Fix For: 1.7.0
>
> Seems like this was introduced with commit {{a211b4cadf289168464fc50987255d883c226e89}}. Linking {{libprocess-tests}} on macOS with gRPC enabled fails with
> {noformat}
> Undefined symbols for architecture x86_64:
> > "grpc::TimePoint std::__1::chrono::duration > > > >::you_need_a_specialization_of_TimePoint()", referenced from:
> process::Future > > process::grpc::client::Runtime::call,
> std::__1::default_delete > > > (tests::PingPong::Stub::*)(grpc::ClientContext*, tests::Ping const&,
> grpc::CompletionQueue*), tests::Ping, tests::Pong,
> 0>(process::grpc::client::Connection const&,
> std::__1::unique_ptr,
> std::__1::default_delete > > > (tests::PingPong::Stub::*&&)(grpc::ClientContext*, tests::Ping const&,
> grpc::CompletionQueue*), tests::Ping&&, process::grpc::client::CallOptions
> const&)::'lambda'(tests::Ping const&, bool,
> grpc::CompletionQueue*)::operator()(tests::Ping const&, bool,
> grpc::CompletionQueue*) const in libprocess_tests-grpc_tests.o
> ld: symbol(s) not found for architecture x86_64
> clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
> {noformat}
[jira] [Created] (MESOS-9094) On macOS libprocess_tests fail to link when compiling with gRPC
Jan Schlicht created MESOS-9094:

Summary: On macOS libprocess_tests fail to link when compiling with gRPC
Key: MESOS-9094
URL: https://issues.apache.org/jira/browse/MESOS-9094
Project: Mesos
Issue Type: Bug
Environment: macOS 10.13.6 with clang 6.0.1.
Reporter: Jan Schlicht
Fix For: 1.7.0

Seems like this was introduced with commit {{a211b4cadf289168464fc50987255d883c226e89}}. Linking {{libprocess-tests}} on macOS with gRPC enabled fails with
{noformat}
Undefined symbols for architecture x86_64:
"grpc::TimePoint > > >::you_need_a_specialization_of_TimePoint()", referenced from:
process::Future > process::grpc::client::Runtime::call, std::__1::default_delete > > (tests::PingPong::Stub::*)(grpc::ClientContext*, tests::Ping const&, grpc::CompletionQueue*), tests::Ping, tests::Pong, 0>(process::grpc::client::Connection const&, std::__1::unique_ptr, std::__1::default_delete > > (tests::PingPong::Stub::*&&)(grpc::ClientContext*, tests::Ping const&, grpc::CompletionQueue*), tests::Ping&&, process::grpc::client::CallOptions const&)::'lambda'(tests::Ping const&, bool, grpc::CompletionQueue*)::operator()(tests::Ping const&, bool, grpc::CompletionQueue*) const in libprocess_tests-grpc_tests.o
ld: symbol(s) not found for architecture x86_64
clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation)
{noformat}