[jira] [Commented] (MESOS-9760) Decouple Docker runtime isolator manifest configuration from image provider

2019-05-01 Thread Jacob Janco (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831311#comment-16831311
 ] 

Jacob Janco commented on MESOS-9760:


https://reviews.apache.org/r/70581/

> Decouple Docker runtime isolator manifest configuration from image provider
> ---
>
> Key: MESOS-9760
> URL: https://issues.apache.org/jira/browse/MESOS-9760
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent, docker
>Reporter: Jacob Janco
>Assignee: Jacob Janco
>Priority: Minor
>
> The Docker runtime isolator propagates manifest configuration metadata. This 
> is not always desirable, e.g. a customer may want to ignore the WORKDIR/ENV 
> defined in the manifest. 
> We propose adding a flag `--docker_ignore_manifest_config` to decouple the 
> choice of Docker as an image provider from the need for the Docker runtime 
> isolator. In the background, this flag will conditionally skip the work of 
> the Docker runtime isolator. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-9698) DroppedOperationStatusUpdate test is flaky

2019-05-01 Thread Greg Mann (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831308#comment-16831308
 ] 

Greg Mann commented on MESOS-9698:
--

It looks like this failure occurs because two {{ReconcileOperationsMessages}} 
are sent from the master to the agent for the same operation. This leads to two 
OPERATION_DROPPED updates for the same operation; after the first one is 
processed by the master, the operation is removed. Then, when the second one is 
received, we hit [this 
block|https://github.com/apache/mesos/blob/3a0c2fa2ae338eeab292e7c0d3dae55b66f5886d/src/master/master.cpp#L8817-L8836]
 and incorrectly assume that the framework ID is SOME.
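
The failure mode described above can be modelled outside of Mesos. The 
following is a minimal sketch, not the actual master code or its fix: 
{{Operations}}, {{UpdateResult}} and {{handleDroppedUpdate}} are hypothetical 
names, and the registry is a plain map from operation UUID to framework ID. 
It only illustrates that a second terminal update for an already-removed 
operation has to be detected before any framework ID is dereferenced.

```cpp
#include <map>
#include <string>

// Hypothetical registry: operation UUID -> framework ID.
// This is a stand-in for illustration, not a Mesos data structure.
typedef std::map<std::string, std::string> Operations;

enum class UpdateResult {
  APPLIED,                    // The operation was known and has been removed.
  IGNORED_UNKNOWN_OPERATION   // The operation was already gone (duplicate).
};

// Processes a terminal OPERATION_DROPPED update for `uuid`.
// The lookup result is checked first, so a duplicate update never reaches
// the code path that needs the framework ID.
UpdateResult handleDroppedUpdate(
    Operations& operations,
    const std::string& uuid)
{
  Operations::iterator it = operations.find(uuid);
  if (it == operations.end()) {
    // Nothing is known about this operation any more, so there is no
    // framework ID to read. Report the duplicate instead of asserting.
    return UpdateResult::IGNORED_UNKNOWN_OPERATION;
  }

  // At this point `it->second` holds the framework ID and is safe to use.
  operations.erase(it);
  return UpdateResult::APPLIED;
}
```

With a registry holding a single operation, the first call returns 
{{APPLIED}} and removes the entry; a second call for the same UUID returns 
{{IGNORED_UNKNOWN_OPERATION}} rather than aborting.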

> DroppedOperationStatusUpdate test is flaky
> --
>
> Key: MESOS-9698
> URL: https://issues.apache.org/jira/browse/MESOS-9698
> Project: Mesos
>  Issue Type: Bug
> Environment: Debian 8
>Reporter: Andrei Budnik
>Assignee: Greg Mann
>Priority: Major
>  Labels: flaky-test, foundations, mesosphere, operation-feedback
> Attachments: DroppedOperationStatusUpdate-badrun1.txt
>
>
> DroppedOperationStatusUpdate test failed with the following backtrace:
> {code:java}
> 06:50:21 mesos-tests: ../../3rdparty/stout/include/stout/option.hpp:120: T& 
> Option::get() & [with T = mesos::FrameworkID]: Assertion `isSome()' failed.
> 06:50:21 *** Aborted at 1554360620 (unix time) try "date -d @1554360620" if 
> you are using GNU date ***
> 06:50:21 I0404 06:50:20.663539 16308 scheduler.cpp:847] Enqueuing event 
> OFFERS received from http://172.16.10.126:42550/master/api/v1/scheduler
> 06:50:21 I0404 06:50:20.663702 16308 scheduler.cpp:847] Enqueuing event 
> UPDATE_OPERATION_STATUS received from 
> http://172.16.10.126:42550/master/api/v1/scheduler
> 06:50:21 PC: @ 0x7fa726c66067 (unknown)
> 06:50:21 *** SIGABRT (@0x6fad) received by PID 28589 (TID 0x7fa71dfc9700) 
> from PID 28589; stack trace: ***
> 06:50:21 @ 0x7fa726feb890 (unknown)
> 06:50:21 @ 0x7fa726c66067 (unknown)
> 06:50:21 @ 0x7fa726c67448 (unknown)
> 06:50:21 @ 0x7fa726c5f266 (unknown)
> 06:50:21 @ 0x7fa726c5f312 (unknown)
> 06:50:21 @ 0x7fa72a1be89a 
> _ZNR6OptionIN5mesos11FrameworkIDEE3getEv.part.500
> 06:50:21 @ 0x7fa72a54002a 
> mesos::internal::master::Master::updateOperationStatus()
> 06:50:21 @ 0x7fa72a5c583b ProtobufProcess<>::_handlerMutM<>()
> 06:50:21 @ 0x7fa72a58e680 ProtobufProcess<>::consume()
> 06:50:21 @ 0x7fa72a50cf04 mesos::internal::master::Master::_consume()
> 06:50:21 @ 0x7fa72a52975d mesos::internal::master::Master::consume()
> 06:50:21 @ 0x7fa72b60b1d3 process::ProcessManager::resume()
> 06:50:21 @ 0x7fa72b610ea6 
> _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
> 06:50:21 @ 0x7fa7277c6970 (unknown)
> 06:50:21 @ 0x7fa726fe4064 start_thread
> 06:50:21 @ 0x7fa726d1962d (unknown)
> {code}





[jira] [Created] (MESOS-9760) Decouple Docker runtime isolator manifest configuration from image provider

2019-05-01 Thread Jacob Janco (JIRA)
Jacob Janco created MESOS-9760:
--

 Summary: Decouple Docker runtime isolator manifest configuration 
from image provider
 Key: MESOS-9760
 URL: https://issues.apache.org/jira/browse/MESOS-9760
 Project: Mesos
  Issue Type: Improvement
  Components: agent, docker
Reporter: Jacob Janco
Assignee: Jacob Janco


The Docker runtime isolator propagates manifest configuration metadata. This is 
not always desirable, e.g. a customer may want to ignore the WORKDIR/ENV 
defined in the manifest. 

We propose adding a flag `--docker_ignore_manifest_config` to decouple the 
choice of Docker as an image provider from the need for the Docker runtime 
isolator. In the background, this flag will conditionally skip the work of 
the Docker runtime isolator. 
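
As a rough illustration of the proposed behavior, here is a minimal sketch 
that does not use any Mesos types. {{ContainerConfig}} and {{resolveConfig}} 
are hypothetical names, and the precedence rule (task-provided values win over 
manifest values) is an assumption made for this example only. The boolean 
parameter stands in for the proposed flag.

```cpp
#include <map>
#include <string>

// Hypothetical subset of container settings; not a Mesos type.
struct ContainerConfig {
  std::map<std::string, std::string> env;
  std::string workdir;  // Empty means "not set".
};

// Combines task-provided settings with the image manifest's settings.
// When `ignoreManifestConfig` is true (modelling the proposed flag), the
// manifest's ENV and WORKDIR are skipped entirely.
ContainerConfig resolveConfig(
    const ContainerConfig& task,
    const ContainerConfig& manifest,
    bool ignoreManifestConfig)
{
  if (ignoreManifestConfig) {
    return task;
  }

  ContainerConfig result = task;

  // std::map::insert does not overwrite existing keys, so variables set by
  // the task take precedence (an assumption of this sketch).
  for (const auto& entry : manifest.env) {
    result.env.insert(entry);
  }

  if (result.workdir.empty()) {
    result.workdir = manifest.workdir;
  }

  return result;
}
```

With the parameter set to false, manifest values fill in whatever the task 
left unset; with it set to true, the task's settings are returned unchanged.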





[jira] [Commented] (MESOS-9619) Mesos Master Crashes with Launch Group when using Port Resources

2019-05-01 Thread Benjamin Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831259#comment-16831259
 ] 

Benjamin Mahler commented on MESOS-9619:


Updated test: https://reviews.apache.org/r/70580/

> Mesos Master Crashes with Launch Group when using Port Resources
> 
>
> Key: MESOS-9619
> URL: https://issues.apache.org/jira/browse/MESOS-9619
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Affects Versions: 1.4.3, 1.7.1
> Environment:  
> Testing in both Mesos 1.4.3 and Mesos 1.7.1
>Reporter: Nimi Wariboko Jr.
>Assignee: Greg Mann
>Priority: Critical
>  Labels: foundations, master, mesosphere
> Fix For: 1.5.4, 1.6.3, 1.7.3, 1.8.0
>
> Attachments: mesos-master.log, mesos-master.snippet.log
>
>
> Original Issue: 
> [https://lists.apache.org/thread.html/979c8799d128ad0c436b53f2788568212f97ccf324933524f1b4d189@%3Cuser.mesos.apache.org%3E]
>  When the ports resource is removed, Mesos functions normally (I'm able to 
> launch the task as many times as I like, although it always fails).
> Attached is a snippet of the mesos master log from OFFER to crash.





[jira] [Commented] (MESOS-9738) Add per-framework metrics for offer round trip time.

2019-05-01 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831215#comment-16831215
 ] 

Benjamin Bannier commented on MESOS-9738:
-

Thanks for filing this [~mzhu]. As we discussed offline, we recently had a 
situation where we suspected problematic hoarding and offer handling, and 
exposing this directly would help to drill down into what could cause certain 
allocation issues. It is still not clear to me, though, how an operator would 
use that information to deal with problematic, non-cooperative behavior, other 
than by changing framework weights to reduce the number of offers to such 
frameworks. I recently filed MESOS-9748, which proposes one automatic but only 
partial mitigation strategy, but I feel there might be more we could do here, 
even with existing APIs.

> Add per-framework metrics for offer round trip time.
> 
>
> Key: MESOS-9738
> URL: https://issues.apache.org/jira/browse/MESOS-9738
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Meng Zhu
>Priority: Major
>  Labels: mesosphere, resource-management
>
> This would provide more insight into framework responsiveness and help detect 
> worrisome behaviors such as offer timeouts and offer hoarding.
> One tricky thing is that we need to take Mesos's own queuing delay into 
> consideration.
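
One possible way to separate the framework's share of the round trip from 
Mesos's own queuing delay is sketched below. This is only an assumption about 
how such a metric could be computed, not an existing Mesos API: 
{{OfferTiming}} and its three timestamps are hypothetical, and the sketch 
assumes the master can record when a response was enqueued as well as when it 
was processed.

```cpp
#include <chrono>

typedef std::chrono::milliseconds Millis;

// Hypothetical timestamps for a single offer, measured from a common epoch.
struct OfferTiming {
  Millis sent;               // Offer sent to the framework.
  Millis responseEnqueued;   // Framework's response arrived in the queue.
  Millis responseProcessed;  // Response was actually handled.
};

// Round trip as observed at processing time; includes the queuing delay.
Millis rawRoundTrip(const OfferTiming& t)
{
  return t.responseProcessed - t.sent;
}

// Time the response spent waiting in the queue before being handled.
Millis queuingDelay(const OfferTiming& t)
{
  return t.responseProcessed - t.responseEnqueued;
}

// Round trip attributable to the framework: the raw value minus the time
// the response waited in the queue.
Millis frameworkRoundTrip(const OfferTiming& t)
{
  return rawRoundTrip(t) - queuingDelay(t);
}
```

For an offer sent at 1000 ms whose response is enqueued at 1400 ms and 
processed at 1900 ms, the raw round trip is 900 ms, of which 500 ms is queuing 
delay and 400 ms is attributable to the framework.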





[jira] [Created] (MESOS-9759) Log required quota headroom and available quota headroom in the allocator.

2019-05-01 Thread Meng Zhu (JIRA)
Meng Zhu created MESOS-9759:
---

 Summary: Log required quota headroom and available quota headroom 
in the allocator.
 Key: MESOS-9759
 URL: https://issues.apache.org/jira/browse/MESOS-9759
 Project: Mesos
  Issue Type: Improvement
  Components: allocation
Reporter: Meng Zhu


This would ease the debugging of allocation issues.





[jira] [Assigned] (MESOS-9759) Log required quota headroom and available quota headroom in the allocator.

2019-05-01 Thread Meng Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Meng Zhu reassigned MESOS-9759:
---

Assignee: Meng Zhu

> Log required quota headroom and available quota headroom in the allocator.
> --
>
> Key: MESOS-9759
> URL: https://issues.apache.org/jira/browse/MESOS-9759
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Meng Zhu
>Assignee: Meng Zhu
>Priority: Major
>  Labels: resource-management
>
> This would ease the debugging of allocation issues.





[jira] [Created] (MESOS-9758) Take ports out of the roles endpoints.

2019-05-01 Thread Meng Zhu (JIRA)
Meng Zhu created MESOS-9758:
---

 Summary: Take ports out of the roles endpoints.
 Key: MESOS-9758
 URL: https://issues.apache.org/jira/browse/MESOS-9758
 Project: Mesos
  Issue Type: Bug
Reporter: Meng Zhu


It does not make sense to combine ports across agents.





[jira] [Commented] (MESOS-9751) Build mesos example not found

2019-05-01 Thread Joseph Wu (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831178#comment-16831178
 ] 

Joseph Wu commented on MESOS-9751:
--

The important mesos binaries will be generated under 
{{/src/mesos-*}}.  The {{src/examples/}} folder only contains 
some shared libraries used by the example frameworks.

> Build mesos example not found
> -
>
> Key: MESOS-9751
> URL: https://issues.apache.org/jira/browse/MESOS-9751
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.6.1
>Reporter: darion yaphet
>Priority: Major
>
> I tried to build Mesos from source using make. I expected it to build a 
> binary under src/examples, but I can't find one.





[jira] [Created] (MESOS-9757) Design doc for container debug endpoint.

2019-05-01 Thread Gilbert Song (JIRA)
Gilbert Song created MESOS-9757:
---

 Summary: Design doc for container debug endpoint.
 Key: MESOS-9757
 URL: https://issues.apache.org/jira/browse/MESOS-9757
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Gilbert Song








[jira] [Created] (MESOS-9756) Introduce a container debug endpoint.

2019-05-01 Thread Gilbert Song (JIRA)
Gilbert Song created MESOS-9756:
---

 Summary: Introduce a container debug endpoint.
 Key: MESOS-9756
 URL: https://issues.apache.org/jira/browse/MESOS-9756
 Project: Mesos
  Issue Type: Epic
  Components: containerization
Reporter: Gilbert Song








[jira] [Created] (MESOS-9755) Update Protobuf Library to support JDK 9+

2019-05-01 Thread Kaiwalya Joshi (JIRA)
Kaiwalya Joshi created MESOS-9755:
-

 Summary: Update Protobuf Library to support JDK 9+
 Key: MESOS-9755
 URL: https://issues.apache.org/jira/browse/MESOS-9755
 Project: Mesos
  Issue Type: Wish
Reporter: Kaiwalya Joshi


We're noticing the following warning emitted by the JVM on JDK9+ for Google 
Protobuf _v3.5.0_

{code}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil 
(file:/home/kjoshi/.gradle/caches/modules-2/files-2.1/com.google.protobuf/protobuf-java/3.5.0/200fb936907fbab5e521d148026f6033d4aa539e/protobuf-java-3.5.0.jar)
 to field java.nio.Buffer.address
WARNING: Please consider reporting this to the maintainers of 
com.google.protobuf.UnsafeUtil
{code}

This warning is fixed in ProtoBuf versions [_v3.7.0_ and 
above|https://github.com/protocolbuffers/protobuf/releases/tag/v3.7.0].

As the current access warning can turn into an access violation in later 
versions of the JDK, we're requesting Mesos to update to a version of ProtoBuf 
that incorporates the needed fixes.





[jira] [Created] (MESOS-9754) Design doc for agent draining

2019-05-01 Thread Greg Mann (JIRA)
Greg Mann created MESOS-9754:


 Summary: Design doc for agent draining
 Key: MESOS-9754
 URL: https://issues.apache.org/jira/browse/MESOS-9754
 Project: Mesos
  Issue Type: Task
Reporter: Greg Mann








[jira] [Created] (MESOS-9753) Agent Draining

2019-05-01 Thread Greg Mann (JIRA)
Greg Mann created MESOS-9753:


 Summary: Agent Draining
 Key: MESOS-9753
 URL: https://issues.apache.org/jira/browse/MESOS-9753
 Project: Mesos
  Issue Type: Epic
Reporter: Greg Mann


This epic holds tickets related to maintenance primitive improvements which 
facilitate draining of agent nodes.





[jira] [Created] (MESOS-9752) ./configure fails in certain strange Cyrus SASL setups

2019-05-01 Thread David Gilman (JIRA)
David Gilman created MESOS-9752:
---

 Summary: ./configure fails in certain strange Cyrus SASL setups
 Key: MESOS-9752
 URL: https://issues.apache.org/jira/browse/MESOS-9752
 Project: Mesos
  Issue Type: Task
 Environment: MacOS X 10.13

Cyrus SASL 2.1.27 installed through MacPorts
Reporter: David Gilman


I have an installation of Cyrus SASL that, for some unknown reason, has 
duplicated SASL mechanisms installed. The crammd5_installed.c will print out 
"found" for each CRAM-MD5 mechanism set up, resulting in output of "foundfound" 
(once for each CRAM-MD5) which fails the Mesos ./configure test which expects 
just "found".
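
To make the expectation concrete, here is a minimal sketch of a check that 
reports "found" at most once regardless of how many times a mechanism is 
listed. It does not call the Cyrus SASL API and is not the code in 
crammd5_installed.c: {{checkCramMd5}} is a hypothetical function, and the 
vector of names stands in for whatever mechanism list the real check iterates 
over.

```cpp
#include <string>
#include <vector>

// Returns "found" exactly once if CRAM-MD5 appears anywhere in `mechanisms`,
// and an empty string otherwise. Duplicate entries do not change the result
// because the search stops at the first match.
std::string checkCramMd5(const std::vector<std::string>& mechanisms)
{
  for (const std::string& name : mechanisms) {
    if (name == "CRAM-MD5") {
      return "found";
    }
  }
  return "";
}
```

A list containing CRAM-MD5 twice yields "found" rather than "foundfound", 
which is the output that the configure test compares against.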





[jira] [Created] (MESOS-9751) Build mesos example not found

2019-05-01 Thread darion yaphet (JIRA)
darion yaphet created MESOS-9751:


 Summary: Build mesos example not found
 Key: MESOS-9751
 URL: https://issues.apache.org/jira/browse/MESOS-9751
 Project: Mesos
  Issue Type: Bug
  Components: build
Affects Versions: 1.6.1
Reporter: darion yaphet


I tried to build Mesos from source using make. I expected it to build a binary 
under src/examples, but I can't find one.


