[jira] [Commented] (MESOS-9760) Decouple Docker runtime isolator manifest configuration from image provider
[ https://issues.apache.org/jira/browse/MESOS-9760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831311#comment-16831311 ] Jacob Janco commented on MESOS-9760: https://reviews.apache.org/r/70581/ > Decouple Docker runtime isolator manifest configuration from image provider > --- > > Key: MESOS-9760 > URL: https://issues.apache.org/jira/browse/MESOS-9760 > Project: Mesos > Issue Type: Improvement > Components: agent, docker >Reporter: Jacob Janco >Assignee: Jacob Janco >Priority: Minor > > The Docker runtime isolator propagates manifest configuration metadata. This > is not always desirable, e.g. a customer may want to ignore the > WORKDIR/ENV defined in the manifest. > We propose adding a flag `--docker_ignore_manifest_config` to decouple the > choice of Docker as an image provider from the need for the Docker runtime > isolator. In the background, this flag will conditionally ignore the work of > the Docker runtime isolator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9698) DroppedOperationStatusUpdate test is flaky
[ https://issues.apache.org/jira/browse/MESOS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831308#comment-16831308 ] Greg Mann commented on MESOS-9698: -- It looks like this failure occurs because two {{ReconcileOperationsMessages}} are sent from the master to the agent for the same operation. This leads to two OPERATION_DROPPED updates for the same operation; after the first one is processed by the master, the operation is removed. Then, when the second one is received, we hit [this block|https://github.com/apache/mesos/blob/3a0c2fa2ae338eeab292e7c0d3dae55b66f5886d/src/master/master.cpp#L8817-L8836] and incorrectly assume that the framework ID is SOME. > DroppedOperationStatusUpdate test is flaky > -- > > Key: MESOS-9698 > URL: https://issues.apache.org/jira/browse/MESOS-9698 > Project: Mesos > Issue Type: Bug > Environment: Debian 8 >Reporter: Andrei Budnik >Assignee: Greg Mann >Priority: Major > Labels: flaky-test, foundations, mesosphere, operation-feedback > Attachments: DroppedOperationStatusUpdate-badrun1.txt > > > DroppedOperationStatusUpdate test failed with the following backtrace: > {code:java} > 06:50:21 mesos-tests: ../../3rdparty/stout/include/stout/option.hpp:120: T& > Option::get() & [with T = mesos::FrameworkID]: Assertion `isSome()' failed. 
> 06:50:21 *** Aborted at 1554360620 (unix time) try "date -d @1554360620" if > you are using GNU date *** > 06:50:21 I0404 06:50:20.663539 16308 scheduler.cpp:847] Enqueuing event > OFFERS received from http://172.16.10.126:42550/master/api/v1/scheduler > 06:50:21 I0404 06:50:20.663702 16308 scheduler.cpp:847] Enqueuing event > UPDATE_OPERATION_STATUS received from > http://172.16.10.126:42550/master/api/v1/scheduler > 06:50:21 PC: @ 0x7fa726c66067 (unknown) > 06:50:21 *** SIGABRT (@0x6fad) received by PID 28589 (TID 0x7fa71dfc9700) > from PID 28589; stack trace: *** > 06:50:21 @ 0x7fa726feb890 (unknown) > 06:50:21 @ 0x7fa726c66067 (unknown) > 06:50:21 @ 0x7fa726c67448 (unknown) > 06:50:21 @ 0x7fa726c5f266 (unknown) > 06:50:21 @ 0x7fa726c5f312 (unknown) > 06:50:21 @ 0x7fa72a1be89a > _ZNR6OptionIN5mesos11FrameworkIDEE3getEv.part.500 > 06:50:21 @ 0x7fa72a54002a > mesos::internal::master::Master::updateOperationStatus() > 06:50:21 @ 0x7fa72a5c583b ProtobufProcess<>::_handlerMutM<>() > 06:50:21 @ 0x7fa72a58e680 ProtobufProcess<>::consume() > 06:50:21 @ 0x7fa72a50cf04 mesos::internal::master::Master::_consume() > 06:50:21 @ 0x7fa72a52975d mesos::internal::master::Master::consume() > 06:50:21 @ 0x7fa72b60b1d3 process::ProcessManager::resume() > 06:50:21 @ 0x7fa72b610ea6 > _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv > 06:50:21 @ 0x7fa7277c6970 (unknown) > 06:50:21 @ 0x7fa726fe4064 start_thread > 06:50:21 @ 0x7fa726d1962d (unknown) > {code}
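The failure mode described in the comment above (two OPERATION_DROPPED updates for one operation, with the second arriving after the operation was removed) can be modelled with a small sketch. This is not the Mesos code and not the actual fix; `ToyMaster`, `Operation`, and their methods are hypothetical names used only to show why the second update finds no framework ID, and how an explicit check avoids asserting on it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Operation:
    operation_id: str
    framework_id: Optional[str]  # None models an unset Option<FrameworkID>


class ToyMaster:
    """Hypothetical stand-in for the master's operation bookkeeping."""

    def __init__(self) -> None:
        self.operations: Dict[str, Operation] = {}
        self.forwarded: List[str] = []

    def add_operation(self, operation: Operation) -> None:
        self.operations[operation.operation_id] = operation

    def update_operation_status(self, operation_id: str, status: str) -> bool:
        """Apply a status update; return False if it was ignored as stale."""
        operation = self.operations.get(operation_id)
        if operation is None or operation.framework_id is None:
            # Duplicate or late update: the operation is already gone, so
            # there is no framework ID to read. Ignore instead of asserting.
            return False
        self.forwarded.append(
            f"{operation.framework_id}:{operation_id}:{status}"
        )
        if status == "OPERATION_DROPPED":
            # Terminal status: forget the operation after the first update.
            del self.operations[operation_id]
        return True


master = ToyMaster()
master.add_operation(Operation("op-1", "framework-1"))
first = master.update_operation_status("op-1", "OPERATION_DROPPED")
second = master.update_operation_status("op-1", "OPERATION_DROPPED")
print(first, second)
```

In this model the first update is forwarded and removes the operation; the second is dropped quietly. Whether the real master should ignore the duplicate, or whether the duplicate reconcile message should be prevented at the source, is a design question the sketch does not settle.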
[jira] [Created] (MESOS-9760) Decouple Docker runtime isolator manifest configuration from image provider
Jacob Janco created MESOS-9760: -- Summary: Decouple Docker runtime isolator manifest configuration from image provider Key: MESOS-9760 URL: https://issues.apache.org/jira/browse/MESOS-9760 Project: Mesos Issue Type: Improvement Components: agent, docker Reporter: Jacob Janco Assignee: Jacob Janco The Docker runtime isolator propagates manifest configuration metadata. This is not always desirable, e.g. a customer may want to ignore the WORKDIR/ENV defined in the manifest. We propose adding a flag `--docker_ignore_manifest_config` to decouple the choice of Docker as an image provider from the need for the Docker runtime isolator. In the background, this flag will conditionally ignore the work of the Docker runtime isolator.
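As a rough illustration of the proposal, the sketch below gates the propagation of manifest settings behind a boolean that stands in for the proposed `docker_ignore_manifest_config` flag. Everything here is an assumption made for illustration: the function name, the dictionary layout (the manifest's environment is simplified to a mapping), and the precedence rule that task-supplied values override manifest defaults are not taken from the Mesos source or from the linked review.

```python
from typing import Any, Dict


def apply_manifest_config(
    launch_info: Dict[str, Any],
    manifest_config: Dict[str, Any],
    ignore_manifest_config: bool,
) -> Dict[str, Any]:
    """Return launch info, optionally merged with image manifest defaults."""
    result = dict(launch_info)
    if ignore_manifest_config:
        # Flag set: keep only what the task itself asked for.
        return result
    # Flag unset: manifest values act as defaults the task can override
    # (assumed precedence, for illustration only).
    env = dict(manifest_config.get("Env", {}))
    env.update(launch_info.get("env", {}))
    result["env"] = env
    if "working_directory" not in result and manifest_config.get("WorkingDir"):
        result["working_directory"] = manifest_config["WorkingDir"]
    return result


manifest = {"Env": {"PATH": "/usr/bin"}, "WorkingDir": "/app"}
task = {"env": {"FOO": "bar"}}
print(apply_manifest_config(task, manifest, ignore_manifest_config=False))
print(apply_manifest_config(task, manifest, ignore_manifest_config=True))
```

With the flag unset, the task inherits the manifest's environment and working directory; with it set, the image is still used but its WORKDIR/ENV are left out.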
[jira] [Commented] (MESOS-9619) Mesos Master Crashes with Launch Group when using Port Resources
[ https://issues.apache.org/jira/browse/MESOS-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831259#comment-16831259 ] Benjamin Mahler commented on MESOS-9619: Updated test: https://reviews.apache.org/r/70580/ > Mesos Master Crashes with Launch Group when using Port Resources > > > Key: MESOS-9619 > URL: https://issues.apache.org/jira/browse/MESOS-9619 > Project: Mesos > Issue Type: Bug > Components: allocation >Affects Versions: 1.4.3, 1.7.1 > Environment: > Testing in both Mesos 1.4.3 and Mesos 1.7.1 >Reporter: Nimi Wariboko Jr. >Assignee: Greg Mann >Priority: Critical > Labels: foundations, master, mesosphere > Fix For: 1.5.4, 1.6.3, 1.7.3, 1.8.0 > > Attachments: mesos-master.log, mesos-master.snippet.log > > > Original Issue: > [https://lists.apache.org/thread.html/979c8799d128ad0c436b53f2788568212f97ccf324933524f1b4d189@%3Cuser.mesos.apache.org%3E] > When the ports resource is removed, Mesos functions normally (I'm able to > launch the task as many times as I like, although the task itself always fails). > Attached is a snippet of the mesos master log from OFFER to crash.
[jira] [Commented] (MESOS-9738) Add per-framework metrics for offer round trip time.
[ https://issues.apache.org/jira/browse/MESOS-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831215#comment-16831215 ] Benjamin Bannier commented on MESOS-9738: - Thanks for filing this [~mzhu]. As we discussed offline, we recently had a situation where we suspected problematic hoarding and offer handling, and exposing this directly would help to drill down into what could cause certain allocation issues. It is still not clear to me, though, how an operator would use that information to deal with problematic, non-cooperative behavior, other than by changing framework weights to reduce the number of offers to such frameworks. I recently filed MESOS-9748, which proposes one automatic but only partial mitigation strategy, but I feel there might be more we could do here, even with existing APIs. > Add per-framework metrics for offer round trip time. > > > Key: MESOS-9738 > URL: https://issues.apache.org/jira/browse/MESOS-9738 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Meng Zhu >Priority: Major > Labels: mesosphere, resource-management > > This would provide more insight into framework responsiveness and help detect > worrisome behaviors such as offer timeouts, offer hoarding, etc. > One tricky thing is that we need to take Mesos's own queuing delay into > consideration.
[jira] [Created] (MESOS-9759) Log required quota headroom and available quota headroom in the allocator.
Meng Zhu created MESOS-9759: --- Summary: Log required quota headroom and available quota headroom in the allocator. Key: MESOS-9759 URL: https://issues.apache.org/jira/browse/MESOS-9759 Project: Mesos Issue Type: Improvement Components: allocation Reporter: Meng Zhu This would ease the debugging of allocation issues.
[jira] [Assigned] (MESOS-9759) Log required quota headroom and available quota headroom in the allocator.
[ https://issues.apache.org/jira/browse/MESOS-9759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Meng Zhu reassigned MESOS-9759: --- Assignee: Meng Zhu > Log required quota headroom and available quota headroom in the allocator. > -- > > Key: MESOS-9759 > URL: https://issues.apache.org/jira/browse/MESOS-9759 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Meng Zhu >Assignee: Meng Zhu >Priority: Major > Labels: resource-management > > This would ease the debugging of allocation issues.
[jira] [Created] (MESOS-9758) Take ports out of the roles endpoints.
Meng Zhu created MESOS-9758: --- Summary: Take ports out of the roles endpoints. Key: MESOS-9758 URL: https://issues.apache.org/jira/browse/MESOS-9758 Project: Mesos Issue Type: Bug Reporter: Meng Zhu It does not make sense to combine ports across agents.
[jira] [Commented] (MESOS-9751) Build mesos example not found
[ https://issues.apache.org/jira/browse/MESOS-9751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831178#comment-16831178 ] Joseph Wu commented on MESOS-9751: -- The important Mesos binaries will be generated under {{/src/mesos-*}}. The {{src/examples/}} folder only contains some shared libraries used by the example frameworks. > Build mesos example not found > - > > Key: MESOS-9751 > URL: https://issues.apache.org/jira/browse/MESOS-9751 > Project: Mesos > Issue Type: Bug > Components: build >Affects Versions: 1.6.1 >Reporter: darion yaphet >Priority: Major > > I tried to build Mesos from source using make. I expected it to build > a binary under src/examples, but I can't find one.
[jira] [Created] (MESOS-9757) Design doc for container debug endpoint.
Gilbert Song created MESOS-9757: --- Summary: Design doc for container debug endpoint. Key: MESOS-9757 URL: https://issues.apache.org/jira/browse/MESOS-9757 Project: Mesos Issue Type: Task Components: containerization Reporter: Gilbert Song
[jira] [Created] (MESOS-9756) Introduce a container debug endpoint.
Gilbert Song created MESOS-9756: --- Summary: Introduce a container debug endpoint. Key: MESOS-9756 URL: https://issues.apache.org/jira/browse/MESOS-9756 Project: Mesos Issue Type: Epic Components: containerization Reporter: Gilbert Song
[jira] [Created] (MESOS-9755) Update Protobuf Library to support JDK 9+
Kaiwalya Joshi created MESOS-9755: - Summary: Update Protobuf Library to support JDK 9+ Key: MESOS-9755 URL: https://issues.apache.org/jira/browse/MESOS-9755 Project: Mesos Issue Type: Wish Reporter: Kaiwalya Joshi We're noticing the following warning emitted by the JVM on JDK 9+ for Google Protobuf _v3.5.0_: {code} WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil (file:/home/kjoshi/.gradle/caches/modules-2/files-2.1/com.google.protobuf/protobuf-java/3.5.0/200fb936907fbab5e521d148026f6033d4aa539e/protobuf-java-3.5.0.jar) to field java.nio.Buffer.address WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil {code} This warning is fixed in Protobuf versions [_v3.7.0_ and above|https://github.com/protocolbuffers/protobuf/releases/tag/v3.7.0]. As the current access warning can turn into an access violation in later versions of the JDK, we request that Mesos update to a version of Protobuf that incorporates the needed fixes.
[jira] [Created] (MESOS-9754) Design doc for agent draining
Greg Mann created MESOS-9754: Summary: Design doc for agent draining Key: MESOS-9754 URL: https://issues.apache.org/jira/browse/MESOS-9754 Project: Mesos Issue Type: Task Reporter: Greg Mann
[jira] [Created] (MESOS-9753) Agent Draining
Greg Mann created MESOS-9753: Summary: Agent Draining Key: MESOS-9753 URL: https://issues.apache.org/jira/browse/MESOS-9753 Project: Mesos Issue Type: Epic Reporter: Greg Mann This epic holds tickets related to maintenance primitive improvements which facilitate draining of agent nodes.
[jira] [Created] (MESOS-9752) ./configure fails in certain strange Cyrus SASL setups
David Gilman created MESOS-9752: --- Summary: ./configure fails in certain strange Cyrus SASL setups Key: MESOS-9752 URL: https://issues.apache.org/jira/browse/MESOS-9752 Project: Mesos Issue Type: Task Environment: MacOS X 10.13 Cyrus SASL 2.1.27 installed through MacPorts Reporter: David Gilman I have an installation of Cyrus SASL that, for some unknown reason, has duplicate SASL mechanisms installed. The crammd5_installed.c program prints "found" once for each CRAM-MD5 mechanism that is set up, so the output is "foundfound", which fails the Mesos ./configure test because it expects exactly "found".
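The mismatch reported above can be reproduced in miniature. The sketch below only models the comparison; the real check lives in the Mesos configure script, and both function names are hypothetical. `strict_check` mirrors the behavior described in the report, while `tolerant_check` shows one possible relaxation (accepting repeated markers), which is an assumption and not a proposed patch.

```python
def strict_check(output: str) -> bool:
    """Exact comparison, as described in the report: only "found" passes."""
    return output == "found"


def tolerant_check(output: str) -> bool:
    """Accept one or more back-to-back "found" markers and nothing else."""
    marker = "found"
    if not output or len(output) % len(marker) != 0:
        return False
    return output == marker * (len(output) // len(marker))


for probe_output in ("found", "foundfound", "", "foundx"):
    print(
        repr(probe_output),
        strict_check(probe_output),
        tolerant_check(probe_output),
    )
```

An alternative would be to have the probe program print the marker at most once, which would leave the exact comparison untouched.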
[jira] [Created] (MESOS-9751) Build mesos example not found
darion yaphet created MESOS-9751: Summary: Build mesos example not found Key: MESOS-9751 URL: https://issues.apache.org/jira/browse/MESOS-9751 Project: Mesos Issue Type: Bug Components: build Affects Versions: 1.6.1 Reporter: darion yaphet I tried to build Mesos from source using make. I expected it to build a binary under src/examples, but I can't find one.