[jira] [Created] (MESOS-7924) Add a javascript linter to the webui.

2017-08-28 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-7924:
--

 Summary: Add a javascript linter to the webui.
 Key: MESOS-7924
 URL: https://issues.apache.org/jira/browse/MESOS-7924
 Project: Mesos
  Issue Type: Improvement
  Components: webui
Reporter: Benjamin Mahler


As far as I can tell, JavaScript linters (e.g. ESLint) help catch some 
functional errors as well. For example, we have made "strict" mode mistakes a 
few times that ESLint can catch: MESOS-6624, MESOS-7912.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7912) Master WebUI not working in Chrome.

2017-08-28 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-7912:
---
Shepherd: Benjamin Mahler

> Master WebUI not working in Chrome.
> ---
>
> Key: MESOS-7912
> URL: https://issues.apache.org/jira/browse/MESOS-7912
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.3.0
> Environment: Mesos Master Version 1.3.0
> Chrome Windows Version 37.0.2062.102 m
>Reporter: Alastair Montgomery
>Assignee: Alastair Montgomery
>Priority: Critical
> Fix For: 1.3.2, 1.4.1
>
>
> I just get "No master is currently leading ..." when browsing to the Mesos 
> Master UI using Chrome, although it displays correctly in IE.
> The following is displayed in the Chrome console:
> {noformat}
> Uncaught SyntaxError: In strict mode code, functions can only be declared at 
> top level or immediately within another function. controllers.js:848
> Error: [ng:areq] 
> http://errors.angularjs.org/1.2.3/ng/areq?p0=MainCtrl&p1=not%20a%20function%2C%20got%20undefined
> at Error (native)
> at 
> http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:6:449
> at tb 
> (http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:18:250)
> at Oa 
> (http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:18:337)
> at 
> http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:62:96
> at 
> http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:49:117
> at q 
> (http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:7:361)
> at Q 
> (http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:48:492)
> at f 
> (http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:43:24)
> at f 
> (http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-1.2.3.min.js:43:41)
>  angular-1.2.3.min.js:84
> GET 
> http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular.min.js.map 
> 404 (Not Found) :5050/static/js/angular.min.js.map:1
> GET 
> http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/js/angular-route.min.js.map
>  404 (Not Found) :5050/static/js/angular-route.min.js.map:1
> GET 
> http://pp3xmes01mst001.pp3.williamhill.plc:5050/static/css/bootstrap.min.css.map
>  404 (Not Found) 
> {noformat}





[jira] [Updated] (MESOS-7912) Master WebUI not working in Chrome.

2017-08-28 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-7912:
---
Fix Version/s: 1.3.2

> Master WebUI not working in Chrome.
> ---
>
> Key: MESOS-7912
> URL: https://issues.apache.org/jira/browse/MESOS-7912
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.3.0
> Environment: Mesos Master Version 1.3.0
> Chrome Windows Version 37.0.2062.102 m
>Reporter: Alastair Montgomery
>Assignee: Alastair Montgomery
>Priority: Critical
> Fix For: 1.3.2, 1.4.1


[jira] [Updated] (MESOS-7912) Master WebUI not working in Chrome.

2017-08-28 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-7912:
---
Summary: Master WebUI not working in Chrome.  (was: Master WebUI not 
working in Chrome)

> Master WebUI not working in Chrome.
> ---
>
> Key: MESOS-7912
> URL: https://issues.apache.org/jira/browse/MESOS-7912
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.3.0
> Environment: Mesos Master Version 1.3.0
> Chrome Windows Version 37.0.2062.102 m
>Reporter: Alastair Montgomery
>Assignee: Alastair Montgomery
>Priority: Critical
> Fix For: 1.4.1


[jira] [Assigned] (MESOS-7912) Master WebUI not working in Chrome

2017-08-28 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler reassigned MESOS-7912:
--

Assignee: Alastair Montgomery

> Master WebUI not working in Chrome
> --
>
> Key: MESOS-7912
> URL: https://issues.apache.org/jira/browse/MESOS-7912
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.3.0
> Environment: Mesos Master Version 1.3.0
> Chrome Windows Version 37.0.2062.102 m
>Reporter: Alastair Montgomery
>Assignee: Alastair Montgomery
>Priority: Critical


[jira] [Commented] (MESOS-7922) Fix communication between old masters and new agents.

2017-08-28 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144546#comment-16144546
 ] 

Michael Park commented on MESOS-7922:
-

{noformat}
commit 30e2b2ad818e4e90c8df03b9802a4b1a431605c7
Author: Michael Park 
Date:   Mon Aug 28 15:19:31 2017 -0700

Fixed the communication between old masters and new agents.

For re-registration, 1.4 agents used to send the resources in tasks
and executors to the master in the "post-reservation-refinement" format,
which is incompatible for pre-1.4 masters. This patch changes the agent
such that it always downgrades the resources to
the "pre-reservation-refinement" format, and the master unconditionally
upgrades the resources to "post-reservation-refinement" format.

Review: https://reviews.apache.org/r/61952/
{noformat}

> Fix communication between old masters and new agents.
> -
>
> Key: MESOS-7922
> URL: https://issues.apache.org/jira/browse/MESOS-7922
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, master
>Reporter: Michael Park
>Assignee: Michael Park
>Priority: Blocker
> Fix For: 1.4.0
>
>
> For re-registration, agents currently send the resources in tasks
> and executors to the master in the "post-reservation-refinement" format,
> which is incompatible with pre-1.4 masters. We should change the agent
> such that it always downgrades the resources to
> the "pre-reservation-refinement" format, and the master unconditionally
> upgrades the resources to the "post-reservation-refinement" format.
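
The downgrade/upgrade round trip described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the Mesos implementation: resources are modeled as plain dictionaries, and the field names (role, reservation, reservations, type) and the helper names downgrade and upgrade are hypothetical choices made for this example.

```python
from copy import deepcopy


def downgrade(resource):
    """Convert a new-format resource (hypothetical "reservations" list)
    into the old shape (hypothetical "role" plus optional "reservation")."""
    reservations = resource.get("reservations", [])
    if len(reservations) > 1:
        # A stack of several reservations has no old-format equivalent.
        raise ValueError("refined reservations cannot be downgraded")
    result = {k: deepcopy(v) for k, v in resource.items() if k != "reservations"}
    if not reservations:
        result["role"] = "*"  # unreserved
        return result
    reservation = reservations[0]
    result["role"] = reservation["role"]
    if reservation["type"] == "DYNAMIC":
        # Keep the remaining fields (e.g. principal) in the legacy field.
        result["reservation"] = {
            k: deepcopy(v) for k, v in reservation.items()
            if k not in ("type", "role")
        }
    return result


def upgrade(resource):
    """Convert an old-format resource into the new shape.
    Safe to call unconditionally: new-format input is returned unchanged."""
    if "reservations" in resource:
        return deepcopy(resource)
    result = {
        k: deepcopy(v) for k, v in resource.items()
        if k not in ("role", "reservation")
    }
    role = resource.get("role", "*")
    legacy = resource.get("reservation")
    if legacy is not None:
        result["reservations"] = [{"type": "DYNAMIC", "role": role, **legacy}]
    elif role != "*":
        result["reservations"] = [{"type": "STATIC", "role": role}]
    else:
        result["reservations"] = []
    return result
```

In this model the sender always calls downgrade before sending and the receiver always calls upgrade on receipt, so an old receiver sees only the old shape while a new receiver recovers the new shape without needing to know the sender's version.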





[jira] [Updated] (MESOS-7921) process::EventQueue sometimes crashes

2017-08-28 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-7921:
--
Description: 
The following segfault is found on 
[ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
 in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
and shows up in other tests and environments (with or without 
--enable-lock-free-event-queue) as well.

{noformat: title=Configuration}
./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
{noformat}

{noformat:title=}
*** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
using GNU date ***
PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
trace: ***
@ 0x2b9e29d26330 (unknown)
@ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
@ 0x2b9e25800a40 process::ProcessManager::resume()
@ 0x2b9e2580f891 
process::ProcessManager::init_threads()::$_9::operator()()
@ 0x2b9e2580f7d5 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
@ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
@ 0x2b9e29fe5a60 (unknown)
@ 0x2b9e29d1e184 start_thread
@ 0x2b9e2a851ffd (unknown)
make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
{noformat}

A bui...@mesos.apache.org query shows many such instances: 
https://lists.apache.org/list.html?bui...@mesos.apache.org:lte=1M:process%3A%3AEventQueue%3A%3AConsumer%3A%3Aempty

  was:
The following segfault is found on 
[ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
 in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
and shows up in other tests and environments (with or without 
--enable-lock-free-event-queue) as well.

{noformat: title=Configuration}
./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
{noformat}

{noformat:title=}
*** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
using GNU date ***
PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
trace: ***
@ 0x2b9e29d26330 (unknown)
@ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
@ 0x2b9e25800a40 process::ProcessManager::resume()
@ 0x2b9e2580f891 
process::ProcessManager::init_threads()::$_9::operator()()
@ 0x2b9e2580f7d5 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
@ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
@ 0x2b9e29fe5a60 (unknown)
@ 0x2b9e29d1e184 start_thread
@ 0x2b9e2a851ffd (unknown)
make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
{noformat}


> process::EventQueue sometimes crashes
> -
>
> Key: MESOS-7921
> URL: https://issues.apache.org/jira/browse/MESOS-7921
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 1.4.0
> Environment: autotools,gcc,--verbose,GLOG_v=1 
> MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: 
> https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>Reporter: Yan Xu
>Priority: Blocker
> Attachments: 
> MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt

[jira] [Updated] (MESOS-7921) process::EventQueue sometimes crashes

2017-08-28 Thread Benjamin Mahler (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-7921:
---
Description: 
The following segfault is found on 
[ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
 in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
and shows up in other tests and environments (with or without 
--enable-lock-free-event-queue) as well.

{noformat: title=Configuration}
./bootstrap '&&' ./configure --verbose '&&' make -j6 distcheck
{noformat}

{noformat:title=}
*** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
using GNU date ***
PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
trace: ***
@ 0x2b9e29d26330 (unknown)
@ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
@ 0x2b9e25800a40 process::ProcessManager::resume()
@ 0x2b9e2580f891 
process::ProcessManager::init_threads()::$_9::operator()()
@ 0x2b9e2580f7d5 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
@ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
@ 0x2b9e29fe5a60 (unknown)
@ 0x2b9e29d1e184 start_thread
@ 0x2b9e2a851ffd (unknown)
make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
{noformat}

  was:
The following segfault is found on 
[ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
 in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
and shows up in other tests and environments (with or without 
--enable-lock-free-event-queue) as well.

{noformat:title=}
*** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
using GNU date ***
PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
trace: ***
@ 0x2b9e29d26330 (unknown)
@ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
@ 0x2b9e25800a40 process::ProcessManager::resume()
@ 0x2b9e2580f891 
process::ProcessManager::init_threads()::$_9::operator()()
@ 0x2b9e2580f7d5 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
@ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
@ 0x2b9e29fe5a60 (unknown)
@ 0x2b9e29d1e184 start_thread
@ 0x2b9e2a851ffd (unknown)
make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
{noformat}


> process::EventQueue sometimes crashes
> -
>
> Key: MESOS-7921
> URL: https://issues.apache.org/jira/browse/MESOS-7921
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 1.4.0
> Environment: autotools,gcc,--verbose,GLOG_v=1 
> MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: 
> https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>Reporter: Yan Xu
>Priority: Blocker
> Attachments: 
> MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt

[jira] [Commented] (MESOS-7801) Retry logic for unsuccessful `docker rm` during agent recovery

2017-08-28 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144543#comment-16144543
 ] 

Gilbert Song commented on MESOS-7801:
-

[~xds2000], sorry for the delay on this optimization change. Jie and I are 
under heavy workloads and have not had a chance to get to this issue. We will 
try to prioritize it from our side. Any comments from you on those two patches 
are absolutely welcome. :)

> Retry logic for unsuccessful `docker rm` during agent recovery
> --
>
> Key: MESOS-7801
> URL: https://issues.apache.org/jira/browse/MESOS-7801
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>
> In MESOS- we skip the failure when `docker rm` fails due to mount leakage 
> during agent recovery. In order not to leave residual docker containers in 
> the docker daemon, we could do a best-effort `docker rm` retry with an 
> exponential backoff since we cannot control when the leakage would be 
> terminated.
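
A best-effort retry with exponential backoff, as proposed above, might look like the following Python sketch. It is only an illustration of the technique, not the patch under review: the function names, the default delays, and the attempt limit are all assumptions, and the only external command used is `docker rm <container>`.

```python
import subprocess
import time


def retry_with_backoff(attempt, max_attempts=5, initial_delay=1.0,
                       factor=2.0, max_delay=60.0, sleep=time.sleep):
    """Call attempt() until it returns True or max_attempts is reached.

    The wait between attempts starts at initial_delay and is multiplied
    by factor after each failure, capped at max_delay. Returns True on
    success and False if every attempt failed (best effort, no exception).
    """
    delay = initial_delay
    for n in range(1, max_attempts + 1):
        if attempt():
            return True
        if n == max_attempts:
            break  # do not sleep after the final failure
        sleep(delay)
        delay = min(delay * factor, max_delay)
    return False


def docker_rm(container_id):
    """Hypothetical helper: True if `docker rm` exits with status 0."""
    completed = subprocess.run(["docker", "rm", container_id],
                               capture_output=True)
    return completed.returncode == 0
```

A caller would combine the two as `retry_with_backoff(lambda: docker_rm(container_id))`. Because the leaked mount may disappear at any time, the growing delay avoids hammering the docker daemon while still removing the container eventually in the common case.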





[jira] [Commented] (MESOS-7801) Retry logic for unsuccessful `docker rm` during agent recovery

2017-08-28 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144522#comment-16144522
 ] 

Deshi Xiao commented on MESOS-7801:
---

MESOS- already resolved my case, and I have no other issue to report right now. 
I am just curious why the optimization patch is still pending; it leaves me 
confused about the general workflow.

> Retry logic for unsuccessful `docker rm` during agent recovery
> --
>
> Key: MESOS-7801
> URL: https://issues.apache.org/jira/browse/MESOS-7801
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao


[jira] [Commented] (MESOS-1871) Sending SIGTERM to a task command may render it orphaned

2017-08-28 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144518#comment-16144518
 ] 

Deshi Xiao commented on MESOS-1871:
---

[~idownes] I have some cycles to work on this issue. Could you please shepherd 
me? Where is the best place to start on a fix?

> Sending SIGTERM to a task command may render it orphaned
> 
>
> Key: MESOS-1871
> URL: https://issues.apache.org/jira/browse/MESOS-1871
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Alexander Rukletsov
>Priority: Minor
>
> {{CommandExecutor}} launches tasks wrapping them into {{sh -c}}. That means 
> signals are sent to the top process—that is {{sh -c}}—and not to the task 
> directly. Though {{SIGTERM}} is propagated by {{sh -c}} down the process 
> tree, if the task is unresponsive to {{SIGTERM}}, {{sh -c}} terminates 
> reporting success to the {{CommandExecutor}}, rendering the task detached 
> from the parent process and still running. Because the {{CommandExecutor}} 
> thinks the command terminated normally, its OS process exits normally and may 
> not trigger containerizer's escalation which destroys cgroups.
> Here is the test related to the first part: 
> [https://gist.github.com/rukletsov/68259dfb02421813f9e6].
> Here is the test related to the second part: 
> [https://gist.github.com/rukletsov/3f19ecc7389fa51e65c0].





[jira] [Updated] (MESOS-7921) process::EventQueue sometimes crashes

2017-08-28 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-7921:
--
Target Version/s: 1.4.0
Priority: Blocker  (was: Major)

> process::EventQueue sometimes crashes
> -
>
> Key: MESOS-7921
> URL: https://issues.apache.org/jira/browse/MESOS-7921
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 1.4.0
> Environment: autotools,gcc,--verbose,GLOG_v=1 
> MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: 
> https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>Reporter: Yan Xu
>Priority: Blocker
> Attachments: 
> MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt


[jira] [Created] (MESOS-7923) Make args optional in mesos port mapper plugin

2017-08-28 Thread Deepak Goel (JIRA)
Deepak Goel created MESOS-7923:
--

 Summary: Make args optional in mesos port mapper plugin
 Key: MESOS-7923
 URL: https://issues.apache.org/jira/browse/MESOS-7923
 Project: Mesos
  Issue Type: Bug
  Components: network
Reporter: Deepak Goel
Assignee: Deepak Goel


The current implementation of the mesos-port-mapper plugin fails if the args 
field is absent from the CNI config, which makes it very specific to Mesos. If 
args were optional, the plugin could be used in more generic environments. 
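
Treating args as optional could be sketched as follows in Python. This is an illustration only, not the plugin's code: the function name is hypothetical, and the key path used to locate port mappings inside args ("org.apache.mesos", "network_info", "port_mappings") is an assumption made for the example.

```python
import json


def port_mappings_from_config(config_text):
    """Return the port mappings requested in a CNI network config.

    A missing (or null) "args" field is treated as "no port mappings"
    instead of an error, so the same parser works for configs that do
    not come from Mesos.
    """
    config = json.loads(config_text)
    args = config.get("args") or {}
    # Assumed layout of the Mesos-specific section; see the note above.
    network_info = args.get("org.apache.mesos", {}).get("network_info", {})
    return network_info.get("port_mappings", [])
```

With this shape, a config without args simply yields an empty list, and the plugin can go on to delegate to the underlying CNI plugin without installing any port-forwarding rules.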





[jira] [Assigned] (MESOS-7223) Linux filesystem isolator cannot mount host volume /dev/log.

2017-08-28 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu reassigned MESOS-7223:
-

Assignee: Jie Yu  (was: Gilbert Song)

> Linux filesystem isolator cannot mount host volume /dev/log.
> 
>
> Key: MESOS-7223
> URL: https://issues.apache.org/jira/browse/MESOS-7223
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.2, 1.1.0, 1.2.0
>Reporter: Haralds Ulmanis
>Assignee: Jie Yu
>  Labels: mesosphere, volumes
>
> I'm trying to mount /dev/log.
> ls -l /dev/log
> lrwxrwxrwx 1 root root 28 Mar  9 01:49 /dev/log -> 
> /run/systemd/journal/dev-log
> # ls -l /run/systemd/journal/dev-log
> srw-rw-rw- 1 root root 0 Mar  9 01:49 /run/systemd/journal/dev-log
> I have tried mounting /dev/log and /run/systemd/journal/dev-log; both produce 
> the same errors:
> from stdout:
> Executing pre-exec command 
> '{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/usr\/lib\/mesos\/mesos-containerizer"}'
> Executing pre-exec command 
> '{"arguments":["mount","-n","--rbind","\/data\/mesos-agent\/slaves\/9b7ad711-9381-4338-b3c0-dac86253701e-S93\/frameworks\/a872f621-d10f-4021-a886-c5d564df104e-\/executors\/services_dev-2_lb-6.b8202973-04b0-11e7-be02-0a2b9a5c33cf\/runs\/cfb170f0-6c69-4475-9dbe-bb9967e19b42","\/data\/mesos-agent\/provisioner\/containers\/cfb170f0-6c69-4475-9dbe-bb9967e19b42\/backends\/overlay\/rootfses\/890a25e6-cb15-42e3-be9c-0aa3baf889f8\/data\/mesos-agent\/sandbox"],"shell":false,"value":"mount"}'
> Executing pre-exec command 
> '{"arguments":["mount","-n","--rbind","\/run\/systemd\/journal\/dev-log","\/data\/mesos-agent\/provisioner\/containers\/cfb170f0-6c69-4475-9dbe-bb9967e19b42\/backends\/overlay\/rootfses\/890a25e6-cb15-42e3-be9c-0aa3baf889f8\/dev\/log"],"shell":false,"value":"mount"}'
> from stderr:
> mount: mount(2) failed: 
> /data/mesos-agent/provisioner/containers/cfb170f0-6c69-4475-9dbe-bb9967e19b42/backends/overlay/rootfses/890a25e6-cb15-42e3-be9c-0aa3baf889f8/dev/log:
>  Not a directory
> Failed to execute pre-exec command 
> '{"arguments":["mount","-n","--rbind","\/run\/systemd\/journal\/dev-log","\/data\/mesos-agent\/provisioner\/containers\/cfb170f0-6c69-4475-9dbe-bb9967e19b42\/backends\/overlay\/rootfses\/890a25e6-cb15-42e3-be9c-0aa3baf889f8\/dev\/log"],"shell":false,"value":"mount"}'
> I start this particular job from Marathon with the following definition 
> (if I change MESOS to DOCKER, it works): 
> "container": {
>   "type": "MESOS",
>   "volumes": [
>     {
>       "hostPath": "/run/systemd/journal/dev-log",
>       "containerPath": "/dev/log",
>       "mode": "RW"
>     }
>   ],
>   "docker": {
>     "image": "",
>     "credential": null,
>     "forcePullImage": true
>   }
> },





[jira] [Updated] (MESOS-7917) Docker statistics not reported on Windows.

2017-08-28 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-7917:
---
Shepherd: Alexander Rukletsov
Story Points: 3

> Docker statistics not reported on Windows.
> --
>
> Key: MESOS-7917
> URL: https://issues.apache.org/jira/browse/MESOS-7917
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
> Environment: Windows 10
>Reporter: Andrew Schwartzmeyer
>Assignee: Andrew Schwartzmeyer
>  Labels: docker, microsoft, windows
>
> On Windows, the JSON information provided by the agent at the /container API
> does not contain the expected {{statistics}} object for Docker containers.
> This breaks the dcos-metrics tool, which is required for DC/OS integration on
> Windows.





[jira] [Updated] (MESOS-7922) Fix communication between old masters and new agents.

2017-08-28 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-7922:

Target Version/s: 1.4.0  (was: 1.4.1)
Priority: Blocker  (was: Major)

> Fix communication between old masters and new agents.
> -
>
> Key: MESOS-7922
> URL: https://issues.apache.org/jira/browse/MESOS-7922
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, master
>Reporter: Michael Park
>Assignee: Michael Park
>Priority: Blocker
>
> For re-registration, agents currently send the resources in tasks
> and executors to the master in the "post-reservation-refinement" format,
> which is incompatible with pre-1.4 masters. We should change the agent
> such that it always downgrades the resources to
> the "pre-reservation-refinement" format, and the master unconditionally
> upgrades the resources to the "post-reservation-refinement" format.





[jira] [Created] (MESOS-7922) Fix communication between old masters and new agents.

2017-08-28 Thread Michael Park (JIRA)
Michael Park created MESOS-7922:
---

 Summary: Fix communication between old masters and new agents.
 Key: MESOS-7922
 URL: https://issues.apache.org/jira/browse/MESOS-7922
 Project: Mesos
  Issue Type: Bug
  Components: agent, master
Reporter: Michael Park
Assignee: Michael Park


For re-registration, agents currently send the resources in tasks
and executors to the master in the "post-reservation-refinement" format,
which is incompatible with pre-1.4 masters. We should change the agent
such that it always downgrades the resources to
the "pre-reservation-refinement" format, and the master unconditionally
upgrades the resources to the "post-reservation-refinement" format.





[jira] [Commented] (MESOS-7922) Fix communication between old masters and new agents.

2017-08-28 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144168#comment-16144168
 ] 

Michael Park commented on MESOS-7922:
-

https://reviews.apache.org/r/61952

> Fix communication between old masters and new agents.
> -
>
> Key: MESOS-7922
> URL: https://issues.apache.org/jira/browse/MESOS-7922
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, master
>Reporter: Michael Park
>Assignee: Michael Park
>
> For re-registration, agents currently send the resources in tasks
> and executors to the master in the "post-reservation-refinement" format,
> which is incompatible with pre-1.4 masters. We should change the agent
> such that it always downgrades the resources to
> the "pre-reservation-refinement" format, and the master unconditionally
> upgrades the resources to the "post-reservation-refinement" format.





[jira] [Updated] (MESOS-7643) The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7643:
--
Target Version/s: 1.5.0, 1.4.1  (was: 1.4.0)

> The order of isolators provided in '--isolation' flag is not preserved and 
> instead sorted alphabetically
> 
>
> Key: MESOS-7643
> URL: https://issues.apache.org/jira/browse/MESOS-7643
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.2, 1.2.0, 1.3.0
>Reporter: Michael Cherny
>Assignee: James Peach
>Priority: Critical
>  Labels: isolation
>
> According to documentation and comments in code the order of the entries in 
> the --isolation flag should specify the ordering of the isolators. 
> Specifically, the `create` and `prepare` calls for each isolator should run 
> serially in the order in which they appear in the --isolation flag, while the 
> `cleanup` call should be serialized in reverse order (with the exception of
> the filesystem isolator, which is always first).
> But in fact, the isolators provided in the '--isolation' flag are sorted
> alphabetically.
> That happens in [this line of
> code|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L377].
> That line uses a 'set' (apparently instead of a list or vector), and a set is
> a sorted container.
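
The sorting effect described in the quoted report can be reproduced outside
Mesos. The following is a minimal sketch, not the containerizer code: the
helper names `splitInOrder` and `throughSet` are hypothetical, and the flag
value mentioned below is only an example. It shows that a std::vector keeps
the order in which entries were given, while a round trip through a std::set
returns them sorted.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated flag value, keeping entries in the order given.
// (Hypothetical helper; empty entries are skipped.)
std::vector<std::string> splitInOrder(const std::string& flag)
{
  std::vector<std::string> entries;
  std::stringstream stream(flag);
  std::string entry;
  while (std::getline(stream, entry, ',')) {
    if (!entry.empty()) {
      entries.push_back(entry);
    }
  }
  return entries;
}

// Copy the entries into a std::set and read them back. A std::set iterates
// in sorted order, so the original ordering is lost (and duplicates drop).
std::vector<std::string> throughSet(const std::vector<std::string>& entries)
{
  std::set<std::string> sorted(entries.begin(), entries.end());
  assert(sorted.size() <= entries.size());
  return std::vector<std::string>(sorted.begin(), sorted.end());
}
```

For an example value such as "network/cni,filesystem/linux,docker/runtime",
`splitInOrder` keeps that order, whereas `throughSet` yields docker/runtime,
filesystem/linux, network/cni.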





[jira] [Updated] (MESOS-7921) process::EventQueue sometimes crashes

2017-08-28 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-7921:
--
Attachment: 
MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt

Attached the full log from ASF CI for this instance.

> process::EventQueue sometimes crashes
> -
>
> Key: MESOS-7921
> URL: https://issues.apache.org/jira/browse/MESOS-7921
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 1.4.0
> Environment: autotools,gcc,--verbose,GLOG_v=1 
> MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: 
> https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>Reporter: Yan Xu
> Attachments: 
> MesosContainerizerSlaveRecoveryTest.ResourceStatisticsFullLog.txt
>
>
> The following segfault is found on 
> [ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
>  in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
> and shows up in other tests and environments (with or without 
> --enable-lock-free-event-queue) as well.
> {noformat:title=}
> *** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
> using GNU date ***
> PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> *** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
> trace: ***
> @ 0x2b9e29d26330 (unknown)
> @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> @ 0x2b9e25800a40 process::ProcessManager::resume()
> @ 0x2b9e2580f891 
> process::ProcessManager::init_threads()::$_9::operator()()
> @ 0x2b9e2580f7d5 
> _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
> @ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
> @ 0x2b9e29fe5a60 (unknown)
> @ 0x2b9e29d1e184 start_thread
> @ 0x2b9e2a851ffd (unknown)
> make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
> {noformat}





[jira] [Commented] (MESOS-7921) process::EventQueue sometimes crashes

2017-08-28 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144134#comment-16144134
 ] 

Yan Xu commented on MESOS-7921:
---

[~benjaminhindman] [~bmahler]

> process::EventQueue sometimes crashes
> -
>
> Key: MESOS-7921
> URL: https://issues.apache.org/jira/browse/MESOS-7921
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Affects Versions: 1.4.0
> Environment: autotools,gcc,--verbose,GLOG_v=1 
> MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)
> Note that --enable-lock-free-event-queue is not enabled.
> Details: 
> https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
>Reporter: Yan Xu
>
> The following segfault is found on 
> [ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
>  in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
> and shows up in other tests and environments (with or without 
> --enable-lock-free-event-queue) as well.
> {noformat:title=}
> *** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
> using GNU date ***
> PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> *** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
> trace: ***
> @ 0x2b9e29d26330 (unknown)
> @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
> @ 0x2b9e25800a40 process::ProcessManager::resume()
> @ 0x2b9e2580f891 
> process::ProcessManager::init_threads()::$_9::operator()()
> @ 0x2b9e2580f7d5 
> _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
> @ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
> @ 0x2b9e29fe5a60 (unknown)
> @ 0x2b9e29d1e184 start_thread
> @ 0x2b9e2a851ffd (unknown)
> make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
> {noformat}





[jira] [Created] (MESOS-7921) process::EventQueue sometimes crashes

2017-08-28 Thread Yan Xu (JIRA)
Yan Xu created MESOS-7921:
-

 Summary: process::EventQueue sometimes crashes
 Key: MESOS-7921
 URL: https://issues.apache.org/jira/browse/MESOS-7921
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Affects Versions: 1.4.0
 Environment: autotools,gcc,--verbose,GLOG_v=1 
MESOS_VERBOSE=1,ubuntu:14.04,(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)

Note that --enable-lock-free-event-queue is not enabled.

Details: 
https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/injectedEnvVars/
Reporter: Yan Xu


The following segfault is found on 
[ASF|https://builds.apache.org/job/Mesos-Buildbot/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)/4159/]
 in {{MesosContainerizerSlaveRecoveryTest.ResourceStatistics}} but it's flaky 
and shows up in other tests and environments (with or without 
--enable-lock-free-event-queue) as well.

{noformat:title=}
*** Aborted at 1503937885 (unix time) try "date -d @1503937885" if you are 
using GNU date ***
PC: @ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
*** SIGSEGV (@0x8) received by PID 751 (TID 0x2b9e31978700) from PID 8; stack 
trace: ***
@ 0x2b9e29d26330 (unknown)
@ 0x2b9e2581caa0 process::EventQueue::Consumer::empty()
@ 0x2b9e25800a40 process::ProcessManager::resume()
@ 0x2b9e2580f891 
process::ProcessManager::init_threads()::$_9::operator()()
@ 0x2b9e2580f7d5 
_ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_9vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2b9e2580f7a5 std::_Bind_simple<>::operator()()
@ 0x2b9e2580f77c std::thread::_Impl<>::_M_run()
@ 0x2b9e29fe5a60 (unknown)
@ 0x2b9e29d1e184 start_thread
@ 0x2b9e2a851ffd (unknown)
make[3]: *** [CMakeFiles/check] Segmentation fault (core dumped)
{noformat}





[jira] [Commented] (MESOS-7801) Retry logic for unsuccessful `docker rm` during agent recovery

2017-08-28 Thread Chun-Hung Hsiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144116#comment-16144116
 ] 

Chun-Hung Hsiao commented on MESOS-7801:


Hi [~xds2000]. This patch is an optimization for MESOS-, which landed a while
ago. I'd like to see this patch landed, but since it is just an optimization,
it might not receive as high a priority as other pending issues. I was
wondering whether you have encountered any problem that cannot be resolved by
the patch for MESOS-. Could you provide more information to help us understand
the severity? Thanks!

> Retry logic for unsuccessful `docker rm` during agent recovery
> --
>
> Key: MESOS-7801
> URL: https://issues.apache.org/jira/browse/MESOS-7801
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>
> In MESOS- we skip the failure when `docker rm` fails due to mount leakage 
> during agent recovery. In order not to leave residual docker containers in 
> the docker daemon, we could do a best-effort `docker rm` retry with an 
> exponential backoff since we cannot control when the leakage would be 
> terminated.
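
The best-effort retry proposed in the quoted description could look roughly
like the following. This is a sketch under assumptions: `backoffDelays` and
`retryWithBackoff` are hypothetical names, the `attempt` callback stands in
for whatever runs `docker rm`, and the doubling factor and the cap are
illustrative choices rather than values taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <vector>

using Millis = std::chrono::milliseconds;

// Compute the delays of a capped exponential backoff:
// initial, 2 * initial, 4 * initial, ... never exceeding `cap`.
std::vector<Millis> backoffDelays(Millis initial, Millis cap, int retries)
{
  assert(initial.count() > 0 && cap >= initial && retries >= 0);
  std::vector<Millis> delays;
  Millis delay = initial;
  for (int i = 0; i < retries; ++i) {
    delays.push_back(delay);
    delay = std::min(delay * 2, cap);
  }
  return delays;
}

// Call `attempt` until it succeeds or the delays are exhausted. `wait` is
// injected so that callers can sleep, schedule a timer, or (in tests) just
// record the requested delays.
bool retryWithBackoff(
    const std::function<bool()>& attempt,
    const std::function<void(Millis)>& wait,
    const std::vector<Millis>& delays)
{
  if (attempt()) {
    return true;
  }
  for (const Millis& delay : delays) {
    wait(delay);
    if (attempt()) {
      return true;
    }
  }
  return false; // Best effort only: the caller decides what to do next.
}
```

Injecting `wait` keeps the sketch independent of any particular event loop;
in an asynchronous code base the same delays would more likely be fed to a
timer than to a blocking sleep.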





[jira] [Commented] (MESOS-1871) Sending SIGTERM to a task command may render it orphaned

2017-08-28 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144062#comment-16144062
 ] 

Ian Downes commented on MESOS-1871:
---

[~xds2000] Might be the same underlying problem but this looks to be a 
different manifestation. This ticket is strictly about correctness of killing 
all processes in the container. The linked ticket is related to graceful 
shutdown.

> Sending SIGTERM to a task command may render it orphaned
> 
>
> Key: MESOS-1871
> URL: https://issues.apache.org/jira/browse/MESOS-1871
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Alexander Rukletsov
>Priority: Minor
>
> {{CommandExecutor}} launches tasks wrapping them into {{sh -c}}. That means 
> signals are sent to the top process—that is {{sh -c}}—and not to the task 
> directly. Though {{SIGTERM}} is propagated by {{sh -c}} down the process 
> tree, if the task is unresponsive to {{SIGTERM}}, {{sh -c}} terminates 
> reporting success to the {{CommandExecutor}}, rendering the task detached 
> from the parent process and still running. Because the {{CommandExecutor}} 
> thinks the command terminated normally, its OS process exits normally and may 
> not trigger containerizer's escalation which destroys cgroups.
> Here is the test related to the first part: 
> [https://gist.github.com/rukletsov/68259dfb02421813f9e6].
> Here is the test related to the second part: 
> [https://gist.github.com/rukletsov/3f19ecc7389fa51e65c0].





[jira] [Commented] (MESOS-7643) The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically

2017-08-28 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143936#comment-16143936
 ] 

James Peach commented on MESOS-7643:


No, it is still not preserving the order.

> The order of isolators provided in '--isolation' flag is not preserved and 
> instead sorted alphabetically
> 
>
> Key: MESOS-7643
> URL: https://issues.apache.org/jira/browse/MESOS-7643
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.2, 1.2.0, 1.3.0
>Reporter: Michael Cherny
>Assignee: James Peach
>Priority: Critical
>  Labels: isolation
>
> According to documentation and comments in code the order of the entries in 
> the --isolation flag should specify the ordering of the isolators. 
> Specifically, the `create` and `prepare` calls for each isolator should run 
> serially in the order in which they appear in the --isolation flag, while the 
> `cleanup` call should be serialized in reverse order (with the exception of
> the filesystem isolator, which is always first).
> But in fact, the isolators provided in the '--isolation' flag are sorted
> alphabetically.
> That happens in [this line of
> code|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L377].
> That line uses a 'set' (apparently instead of a list or vector), and a set is
> a sorted container.





[jira] [Commented] (MESOS-6428) Mesos containerizer helper function signalSafeWriteStatus is not AS-Safe

2017-08-28 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143878#comment-16143878
 ] 

Andrei Budnik commented on MESOS-6428:
--

https://reviews.apache.org/r/61800/

> Mesos containerizer helper function signalSafeWriteStatus is not AS-Safe
> 
>
> Key: MESOS-6428
> URL: https://issues.apache.org/jira/browse/MESOS-6428
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.0
>Reporter: Benjamin Bannier
>Assignee: Jing Chen
>  Labels: newbie, tech-debt
>
> In {{src/slave/containerizer/mesos/launch.cpp}} a helper function 
> {{signalSafeWriteStatus}} is defined. Its name seems to suggest that this 
> function is safe to call in e.g., signal handlers, and it is used in this 
> file's {{signalHandler}} for exactly that purpose.
> Currently this function is not AS-Safe since it, e.g., allocates memory via
> construction of {{string}} instances, and might destructively modify
> {{errno}}.
> We should clean up this function so that it is in fact AS-Safe.
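
For illustration, an AS-Safe status writer can avoid both problems named in
the quoted description by formatting into a stack buffer and by saving and
restoring errno around write(2), which POSIX lists as async-signal-safe. This
is only a sketch, not the code in launch.cpp or in the linked review: the
names `formatDecimal`, `writeAll` and `signalSafeWrite` are hypothetical.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Write the decimal form of `value` into `buffer` (no NUL terminator) and
// return its length, or 0 if `size` is too small. Uses stack memory only.
std::size_t formatDecimal(int value, char* buffer, std::size_t size)
{
  char reversed[12]; // Up to 10 digits plus sign for a 32-bit int.
  std::size_t n = 0;
  unsigned int magnitude = value < 0
    ? 0u - static_cast<unsigned int>(value)
    : static_cast<unsigned int>(value);
  do {
    reversed[n++] = static_cast<char>('0' + magnitude % 10);
    magnitude /= 10;
  } while (magnitude != 0);
  if (value < 0) {
    reversed[n++] = '-';
  }
  if (n > size) {
    return 0;
  }
  for (std::size_t i = 0; i < n; ++i) {
    buffer[i] = reversed[n - 1 - i];
  }
  return n;
}

// Write all bytes with write(2), retrying on EINTR and on short writes.
bool writeAll(int fd, const char* data, std::size_t length)
{
  while (length > 0) {
    const ssize_t written = ::write(fd, data, length);
    if (written < 0) {
      if (errno == EINTR) {
        continue;
      }
      return false;
    }
    data += written;
    length -= static_cast<std::size_t>(written);
  }
  return true;
}

// Emit "<message><status>\n" without allocating; errno is left as found.
bool signalSafeWrite(
    int fd, const char* message, std::size_t length, int status)
{
  const int savedErrno = errno;
  char digits[12];
  const std::size_t count = formatDecimal(status, digits, sizeof(digits));
  const bool ok = writeAll(fd, message, length) &&
                  writeAll(fd, digits, count) &&
                  writeAll(fd, "\n", 1);
  errno = savedErrno;
  return ok;
}
```

The message length is passed explicitly so that the sketch does not depend on
any library routine beyond write(2).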





[jira] [Updated] (MESOS-7088) Support private registry credential per container.

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7088:
--
Fix Version/s: 1.4.0

> Support private registry credential per container.
> --
>
> Key: MESOS-7088
> URL: https://issues.apache.org/jira/browse/MESOS-7088
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: containerizer
> Fix For: 1.4.0
>
>






[jira] [Updated] (MESOS-7909) Ordering dependency between 'linux/capabilities' and 'docker/runtime' isolator.

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7909:
--
Fix Version/s: (was: 1.4.1)
   (was: 1.5.0)
   1.4.0

> Ordering dependency between 'linux/capabilities' and 'docker/runtime' 
> isolator.
> ---
>
> Key: MESOS-7909
> URL: https://issues.apache.org/jira/browse/MESOS-7909
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.2.2, 1.3.1, 1.4.0
>Reporter: Jie Yu
>Assignee: Jie Yu
> Fix For: 1.2.3, 1.3.2, 1.4.0
>
>
> There appears to be an unintentional ordering dependency between the
> linux/capabilities isolator and the docker/runtime isolator.
> In the command task case, both isolators set
> ContainerLaunchInfo.command. When merging ContainerLaunchInfo.command, the
> docker/runtime isolator assumes its command comes before the
> linux/capabilities isolator's command because 'mesos-execute' should be used
> as argv[0].
> We should try to eliminate this dependency.





[jira] [Updated] (MESOS-7863) Agent may drop pending kill task status updates.

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7863:
--
Fix Version/s: (was: 1.4.1)
   1.4.0

> Agent may drop pending kill task status updates.
> 
>
> Key: MESOS-7863
> URL: https://issues.apache.org/jira/browse/MESOS-7863
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Benjamin Mahler
>Assignee: Benjamin Mahler
>Priority: Critical
> Fix For: 1.1.3, 1.2.3, 1.3.2, 1.4.0
>
>
> Currently there is an assumption that when a pending task is killed, the 
> framework will still be stored in the agent. However, this assumption can be 
> violated in two cases:
> # Another pending task was killed and we removed the framework in 
> 'Slave::run' thinking it was idle, because pending tasks were empty (we 
> remove from pending tasks when processing the kill). (MESOS-7783 is an 
> example instance of this).
> # The last executor terminated without tasks to send terminal updates for, or 
> the last terminated executor received its last acknowledgement. At this 
> point, we remove the framework thinking there were no pending tasks if the 
> task was killed (removed from pending).





[jira] [Updated] (MESOS-7744) Mesos Agent Sends TASK_KILL status update to Master, and still launches task

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7744:
--
Fix Version/s: (was: 1.4.1)
   1.4.0

> Mesos Agent Sends TASK_KILL status update to Master, and still launches task
> 
>
> Key: MESOS-7744
> URL: https://issues.apache.org/jira/browse/MESOS-7744
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.0.1
>Reporter: Sargun Dhillon
>Assignee: Benjamin Mahler
>Priority: Critical
>  Labels: reliability
> Fix For: 1.1.3, 1.2.3, 1.3.2, 1.4.0
>
>
> We sometimes launch jobs, and cancel them in ~7 seconds, if we don't get a 
> TASK_STARTING back from the agent. Under certain conditions it can result in 
> Mesos losing track of the task. The chunk of the logs which is interesting is 
> here:
> {code}
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:26.951799  5171 slave.cpp:1495] Got assigned 
> task Titus-7590548-worker-0-4476 for framework TitusFramework
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:26.952251  5171 slave.cpp:1614] Launching task 
> Titus-7590548-worker-0-4476 for framework TitusFramework
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.484611  5171 slave.cpp:1853] Queuing task 
> ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework 
> TitusFramework at executor(1)@100.66.11.10:17707
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.487876  5171 slave.cpp:2035] Asked to kill 
> task Titus-7590548-worker-0-4476 of framework TitusFramework
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.488994  5171 slave.cpp:3211] Handling 
> status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for 
> task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c 
> mesos-slave[4290]: I0629 23:22:37.490603  5171 slave.cpp:2005] Sending queued 
> task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework 
> TitusFramework at executor(1)@100.66.11.10:17707
> {code}
> In our executor, we see that the launch message arrives after the master has 
> already gotten the kill update. We then send non-terminal state updates to 
> the agent, and yet it doesn't forward these to our framework. We're using a 
> custom executor which is based on the older mesos-go bindings. 





[jira] [Updated] (MESOS-7783) Framework might not receive status update when a just launched task is killed immediately

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7783:
--
Fix Version/s: (was: 1.4.1)
   1.4.0

> Framework might not receive status update when a just launched task is killed 
> immediately
> -
>
> Key: MESOS-7783
> URL: https://issues.apache.org/jira/browse/MESOS-7783
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Affects Versions: 1.2.0
>Reporter: Benjamin Bannier
>Assignee: Benjamin Mahler
>Priority: Critical
>  Labels: reliability
> Fix For: 1.1.3, 1.2.3, 1.3.2, 1.4.0
>
> Attachments: GroupDeployIntegrationTest.log.zip, logs
>
>
> Our Marathon team is seeing issues in their integration test suite where
> Marathon gets stuck in an infinite loop trying to kill a just-launched task.
> In their test, a task launch is immediately followed by a kill of the
> task; the framework does not, e.g., wait for any task status update.
> In this case the launch and kill messages arrive at the agent in the correct 
> order, but both the launch and kill paths in the agent do not reach the point 
> where a status update is sent to the framework. Since the framework has seen 
> no status update on the task it re-triggers a kill, causing an infinite loop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7865) Agent may process a kill task and still launch the task.

2017-08-28 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya updated MESOS-7865:
--
Fix Version/s: (was: 1.4.1)
   1.4.0

> Agent may process a kill task and still launch the task.
> 
>
> Key: MESOS-7865
> URL: https://issues.apache.org/jira/browse/MESOS-7865
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Benjamin Mahler
>Assignee: Benjamin Mahler
>Priority: Critical
> Fix For: 1.1.3, 1.2.3, 1.3.2, 1.4.0
>
>
> Based on the investigation of MESOS-7744, the agent has a race in which 
> "queued" tasks can still be launched after the agent has processed a kill 
> task for them. This race was introduced when {{Slave::statusUpdate}} was made 
> asynchronous:
> (1) {{Slave::__run}} completes, task is now within {{Executor::queuedTasks}}
> (2) {{Slave::killTask}} locates the executor based on the task ID residing in 
> queuedTasks, calls {{Slave::statusUpdate()}} with {{TASK_KILLED}}
> (3) {{Slave::___run}} assumes that killed tasks have been removed from 
> {{Executor::queuedTasks}}, but this now occurs asynchronously in 
> {{Slave::_statusUpdate}}. So, the executor still sees the queued task and 
> delivers it and adds the task to {{Executor::launchedTasks}}.
> (4) {{Slave::_statusUpdate}} runs, removes the task from
> {{Executor::launchedTasks}} and adds it to {{Executor::terminatedTasks}}.





[jira] [Updated] (MESOS-7586) Make use of cout/cerr and glog consistent.

2017-08-28 Thread Andrei Budnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Budnik updated MESOS-7586:
-
Description: 
Some parts of mesos use glog before initialization of glog, hence messages via 
glog might not end up in a logdir:
bq. WARNING: Logging before InitGoogleLogging() is written to STDERR

The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, the main function has an initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added where necessary.

  was:
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added, where it's necessary.


> Make use of cout/cerr and glog consistent.
> --
>
> Key: MESOS-7586
> URL: https://issues.apache.org/jira/browse/MESOS-7586
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrei Budnik
>Assignee: Armand Grillet
>Priority: Minor
>  Labels: debugging, log, newbie
>
> Some parts of mesos use glog before initialization of glog, hence messages 
> via glog might not end up in a logdir:
> bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
> The solution might be:
> {{cout/cerr}} should be used before logging initialization.
> {{glog}} should be used after logging initialization.
>  
> Usually, the main function has an initialization pattern like:
> # load = flags.load(argc, argv) // Load flags from command line.
> # Check if flags are correct, otherwise print error message to cerr and then 
> exit.
> # Check if user passed --help flag to print help message to cout and then 
> exit.
> # Parsing and setup of environment variables. If this fails, EXIT macro is 
> used to print error message via glog.
> # process::initialize()
> # logging::initialize()
>  
> Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
> generated by glog like current time, date and log level.
> It would be preferable to move step 6 between steps 3 and 4 safely, because 
> {{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
> In addition, initialization of glog should be added, where it's necessary.





[jira] [Updated] (MESOS-7586) Make use of cout/cerr and glog consistent.

2017-08-28 Thread Andrei Budnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Budnik updated MESOS-7586:
-
Description: 
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added, where it's necessary.

  was:
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added, where it necessary.


> Make use of cout/cerr and glog consistent.
> --
>
> Key: MESOS-7586
> URL: https://issues.apache.org/jira/browse/MESOS-7586
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrei Budnik
>Assignee: Armand Grillet
>Priority: Minor
>  Labels: debugging, log, newbie
>
> Some parts of mesos use glog before initialization of glog. This leads to 
> message like:
> bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
> Also, messages via glog before logging is initialized might not end up in a 
> logdir.
>  
> The solution might be:
> {{cout/cerr}} should be used before logging initialization.
> {{glog}} should be used after logging initialization.
>  
> Usually, main function has initialization pattern like:
> # load = flags.load(argc, argv) // Load flags from command line.
> # Check if flags are correct, otherwise print error message to cerr and then 
> exit.
> # Check if user passed --help flag to print help message to cout and then 
> exit.
> # Parsing and setup of environment variables. If this fails, EXIT macro is 
> used to print error message via glog.
> # process::initialize()
> # logging::initialize()
>  
> Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
> generated by glog like current time, date and log level.
> It would be preferable to move step 6 between steps 3 and 4 safely, because 
> {{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
> In addition, initialization of glog should be added, where it's necessary.





[jira] [Updated] (MESOS-7586) Make use of cout/cerr and glog consistent.

2017-08-28 Thread Andrei Budnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Budnik updated MESOS-7586:
-
Description: 
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added, where it necessary.

  was:
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
“WARNING: Logging before InitGoogleLogging() is written to STDERR”
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added, where it necessary.


> Make use of cout/cerr and glog consistent.
> --
>
> Key: MESOS-7586
> URL: https://issues.apache.org/jira/browse/MESOS-7586
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrei Budnik
>Assignee: Armand Grillet
>Priority: Minor
>  Labels: debugging, log, newbie
>
> Some parts of mesos use glog before initialization of glog. This leads to 
> message like:
> bq. WARNING: Logging before InitGoogleLogging() is written to STDERR
> Also, messages via glog before logging is initialized might not end up in a 
> logdir.
>  
> The solution might be:
> {{cout/cerr}} should be used before logging initialization.
> {{glog}} should be used after logging initialization.
>  
> Usually, main function has initialization pattern like:
> # load = flags.load(argc, argv) // Load flags from command line.
> # Check if flags are correct, otherwise print error message to cerr and then 
> exit.
> # Check if user passed --help flag to print help message to cout and then 
> exit.
> # Parsing and setup of environment variables. If this fails, EXIT macro is 
> used to print error message via glog.
> # process::initialize()
> # logging::initialize()
>  
> Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
> generated by glog like current time, date and log level.
> It would be preferable to move step 6 between steps 3 and 4 safely, because 
> {{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
> In addition, initialization of glog should be added, where it necessary.





[jira] [Updated] (MESOS-7586) Make use of cout/cerr and glog consistent.

2017-08-28 Thread Andrei Budnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Budnik updated MESOS-7586:
-
Description: 
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
“WARNING: Logging before InitGoogleLogging() is written to STDERR”
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
{{cout/cerr}} should be used before logging initialization.
{{glog}} should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
generated by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
In addition, initialization of glog should be added, where it necessary.

  was:
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
“WARNING: Logging before InitGoogleLogging() is written to STDERR”
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
cout/cerr should be used before logging initialization.
glog should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use cout/cerr to eliminate any extra information generated 
by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on process::initialize().
Some parts of mesos don’t call logging::initialize(). This should also be fixed.


> Make use of cout/cerr and glog consistent.
> --
>
> Key: MESOS-7586
> URL: https://issues.apache.org/jira/browse/MESOS-7586
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrei Budnik
>Assignee: Armand Grillet
>Priority: Minor
>  Labels: debugging, log, newbie
>
> Some parts of mesos use glog before initialization of glog. This leads to 
> message like:
> “WARNING: Logging before InitGoogleLogging() is written to STDERR”
> Also, messages via glog before logging is initialized might not end up in a 
> logdir.
>  
> The solution might be:
> {{cout/cerr}} should be used before logging initialization.
> {{glog}} should be used after logging initialization.
>  
> Usually, main function has initialization pattern like:
> # load = flags.load(argc, argv) // Load flags from command line.
> # Check if flags are correct, otherwise print error message to cerr and then 
> exit.
> # Check if user passed --help flag to print help message to cout and then 
> exit.
> # Parsing and setup of environment variables. If this fails, EXIT macro is 
> used to print error message via glog.
> # process::initialize()
> # logging::initialize()
>  
> Steps 2 and 3 should use {{cout/cerr}} to eliminate any extra information 
> generated by glog like current time, date and log level.
> It would be preferable to move step 6 between steps 3 and 4 safely, because 
> {{logging::initialize()}} doesn’t depend on {{process::initialize()}}.
> In addition, initialization of glog should be added, where it necessary.





[jira] [Updated] (MESOS-7586) Make use of cout/cerr and glog consistent.

2017-08-28 Thread Andrei Budnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrei Budnik updated MESOS-7586:
-
Description: 
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
“WARNING: Logging before InitGoogleLogging() is written to STDERR”
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
cout/cerr should be used before logging initialization.
glog should be used after logging initialization.
 
Usually, main function has initialization pattern like:
# load = flags.load(argc, argv) // Load flags from command line.
# Check if flags are correct, otherwise print error message to cerr and then 
exit.
# Check if user passed --help flag to print help message to cout and then exit.
# Parsing and setup of environment variables. If this fails, EXIT macro is used 
to print error message via glog.
# process::initialize()
# logging::initialize()
 
Steps 2 and 3 should use cout/cerr to eliminate any extra information generated 
by glog like current time, date and log level.

It would be preferable to move step 6 between steps 3 and 4 safely, because 
{{logging::initialize()}} doesn’t depend on process::initialize().
Some parts of mesos don’t call logging::initialize(). This should also be fixed.

  was:
Some parts of mesos use glog before initialization of glog. This leads to 
message like:
“WARNING: Logging before InitGoogleLogging() is written to STDERR”
Also, messages via glog before logging is initialized might not end up in a 
logdir.
 
The solution might be:
cout/cerr should be used before logging initialization.
glog should be used after logging initialization.
 
Usually, main function has pattern like:
1. load = flags.load(argc, argv) // Load flags from command line.
2. Check if flags are correct, otherwise print error message to cerr and then 
exit.
3. Check if user passed --help flag to print help message to cout and then exit.
4. Parsing and setup of environment variables. If this fails, EXIT macro is 
used to print error message via glog.
5. process::initialize()
6. logging::initialize()
7. ...
 
Steps 2 and 3 should use cout/cerr to eliminate any extra information generated 
by glog like current time, date and log level.
It is possible to move step 6 between steps 3 and 4 safely, because 
logging::initialize() doesn’t depend on process::initialize().
Some parts of mesos don’t call logging::initialize(). This should also be fixed.


> Make use of cout/cerr and glog consistent.
> --
>
> Key: MESOS-7586
> URL: https://issues.apache.org/jira/browse/MESOS-7586
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrei Budnik
>Assignee: Armand Grillet
>Priority: Minor
>  Labels: debugging, log, newbie
>
> Some parts of mesos use glog before initialization of glog. This leads to 
> message like:
> “WARNING: Logging before InitGoogleLogging() is written to STDERR”
> Also, messages via glog before logging is initialized might not end up in a 
> logdir.
>  
> The solution might be:
> cout/cerr should be used before logging initialization.
> glog should be used after logging initialization.
>  
> Usually, main function has initialization pattern like:
> # load = flags.load(argc, argv) // Load flags from command line.
> # Check if flags are correct, otherwise print error message to cerr and then 
> exit.
> # Check if user passed --help flag to print help message to cout and then 
> exit.
> # Parsing and setup of environment variables. If this fails, EXIT macro is 
> used to print error message via glog.
> # process::initialize()
> # logging::initialize()
>  
> Steps 2 and 3 should use cout/cerr to eliminate any extra information 
> generated by glog like current time, date and log level.
> It would be preferable to move step 6 between steps 3 and 4 safely, because 
> {{logging::initialize()}} doesn’t depend on process::initialize().
> Some parts of mesos don’t call logging::initialize(). This should also be 
> fixed.





[jira] [Commented] (MESOS-7643) The order of isolators provided in '--isolation' flag is not preserved and instead sorted alphabetically

2017-08-28 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143589#comment-16143589
 ] 

Deshi Xiao commented on MESOS-7643:
---

Has this issue already been fixed? 
https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L156

> The order of isolators provided in '--isolation' flag is not preserved and 
> instead sorted alphabetically
> 
>
> Key: MESOS-7643
> URL: https://issues.apache.org/jira/browse/MESOS-7643
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.2, 1.2.0, 1.3.0
>Reporter: Michael Cherny
>Assignee: James Peach
>Priority: Critical
>  Labels: isolation
>
> According to documentation and comments in code the order of the entries in 
> the --isolation flag should specify the ordering of the isolators. 
> Specifically, the `create` and `prepare` calls for each isolator should run 
> serially in the order in which they appear in the --isolation flag, while the 
> `cleanup` call should be serialized in reverse order (with exception of 
> filesystem isolator which is always first).
> But in fact, the isolators provided in '--isolation' flag are sorted 
> alphabetically.
> That happens in [this line of 
> code|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L377].
>  That line uses a 'set' (apparently instead of a list or 
> vector), and a set is a sorted container.
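
The effect described above can be reproduced in isolation. The sketch below is not the containerizer code: {{splitIsolation}} and {{viaSet}} are hypothetical helpers, and the isolator names in the usage note are just sample values. It shows that passing the tokens through a {{std::set}} reorders them lexicographically, while a {{std::vector}} keeps the order given on the command line.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a comma-separated flag value into tokens,
// preserving the order in which they were written.
std::vector<std::string> splitIsolation(const std::string& flag)
{
  std::vector<std::string> tokens;
  std::stringstream stream(flag);
  std::string token;

  while (std::getline(stream, token, ',')) {
    if (!token.empty()) {
      tokens.push_back(token);
    }
  }

  return tokens;
}

// Hypothetical helper: copies the tokens through a std::set. The set keeps
// its elements sorted (and unique), so the original order is lost.
std::vector<std::string> viaSet(const std::vector<std::string>& tokens)
{
  std::set<std::string> sorted(tokens.begin(), tokens.end());
  return std::vector<std::string>(sorted.begin(), sorted.end());
}
```

For the input "network/cni,cgroups/cpu,docker/runtime", {{splitIsolation}} keeps "network/cni" first, whereas {{viaSet}} returns "cgroups/cpu", "docker/runtime", "network/cni".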





[jira] [Commented] (MESOS-7801) Retry logic for unsuccessful `docker rm` during agent recovery

2017-08-28 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143577#comment-16143577
 ] 

Deshi Xiao commented on MESOS-7801:
---

[~jieyu] can this patch be submitted?

> Retry logic for unsuccessful `docker rm` during agent recovery
> --
>
> Key: MESOS-7801
> URL: https://issues.apache.org/jira/browse/MESOS-7801
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>
> In MESOS- we skip the failure when `docker rm` fails due to mount leakage 
> during agent recovery. In order not to leave residual docker containers in 
> the docker daemon, we could do a best-effort `docker rm` retry with an 
> exponential backoff since we cannot control when the leakage would be 
> terminated.
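
A best-effort retry with exponential backoff, as suggested in the description, might look roughly like this. It is a generic sketch, not the agent's implementation: {{retryWithBackoff}} and its parameters are hypothetical, and it blocks the calling thread with {{sleep_for}}, whereas the agent would presumably schedule retries asynchronously.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <thread>

// Calls `attempt` until it returns true or `maxAttempts` calls have been
// made. After each failure the delay doubles, capped at `maxDelay`.
// Returns true if some attempt succeeded.
bool retryWithBackoff(
    const std::function<bool()>& attempt,
    int maxAttempts,
    std::chrono::milliseconds initialDelay,
    std::chrono::milliseconds maxDelay)
{
  std::chrono::milliseconds delay = initialDelay;

  for (int i = 0; i < maxAttempts; ++i) {
    if (attempt()) {
      return true;
    }

    // Do not sleep after the final failed attempt.
    if (i + 1 < maxAttempts) {
      std::this_thread::sleep_for(delay);
      delay = std::min<std::chrono::milliseconds>(delay * 2, maxDelay);
    }
  }

  return false;
}
```

Here {{attempt}} would wrap whatever performs the `docker rm`. Capping the delay keeps the retry interval bounded, since it is unknown when the mount leakage will go away.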





[jira] [Commented] (MESOS-1871) Sending SIGTERM to a task command may render it orphaned

2017-08-28 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143572#comment-16143572
 ] 

Deshi Xiao commented on MESOS-1871:
---

[~idownes] this ticket is duplicated by MESOS-6933; should it be closed?

> Sending SIGTERM to a task command may render it orphaned
> 
>
> Key: MESOS-1871
> URL: https://issues.apache.org/jira/browse/MESOS-1871
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
>Reporter: Alexander Rukletsov
>Priority: Minor
>
> {{CommandExecutor}} launches tasks wrapping them into {{sh -c}}. That means 
> signals are sent to the top process—that is {{sh -c}}—and not to the task 
> directly. Though {{SIGTERM}} is propagated by {{sh -c}} down the process 
> tree, if the task is unresponsive to {{SIGTERM}}, {{sh -c}} terminates 
> reporting success to the {{CommandExecutor}}, rendering the task detached 
> from the parent process and still running. Because the {{CommandExecutor}} 
> thinks the command terminated normally, its OS process exits normally and may 
> not trigger containerizer's escalation which destroys cgroups.
> Here is the test related to the first part: 
> [https://gist.github.com/rukletsov/68259dfb02421813f9e6].
> Here is the test related to the second part: 
> [https://gist.github.com/rukletsov/3f19ecc7389fa51e65c0].





[jira] [Commented] (MESOS-6615) Running mesos-slave in the docker that leave many zombie process

2017-08-28 Thread Stefan Eder (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143516#comment-16143516
 ] 

Stefan Eder commented on MESOS-6615:


Maybe this helps: https://github.com/mesosphere/docker-containers/issues/9 
(it's not a bug per se, you just need to run the agent docker container with 
--pid=host)

> Running mesos-slave in the docker that leave many zombie process
> 
>
> Key: MESOS-6615
> URL: https://issues.apache.org/jira/browse/MESOS-6615
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, containerization
>Affects Versions: 0.28.2
> Environment: Mesos 0.28.2 
> Docker 1.12.1
>Reporter: Lei Xu
>Priority: Critical
>
> There are some zombie processes if I run mesos-slave in Docker.
> {code}
> root 10547 19464  0 Oct25 ?00:00:00 [docker] 
> root 14505 19464  0 Oct25 ?00:00:00 [docker] 
> root 16069 19464  0 Oct25 ?00:00:00 [docker] 
> root 19962 19464  0 Oct25 ?00:00:00 [docker] 
> root 23346 19464  0 Oct25 ?00:00:00 [docker] 
> root 24544 19464  0 Oct25 ?00:00:00 [docker] 
> {code}
> And I find that the zombies come from the {{mesos-slave}} process:
> {code}
> pstree -p -s 10547
> systemd(1)───docker-containe(19448)───mesos-slave(19464)───docker(10547)
> {code}
> The logs were deleted by a cron job a few weeks ago, but I remember seeing 
> many {{Failed to shutdown socket with fd xx: Transport endpoint is not 
> connected}} messages in the log.


