[jira] [Commented] (MESOS-5116) Investigate supporting accounting only mode in XFS isolator

2017-07-27 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104458#comment-16104458
 ] 

Yan Xu commented on MESOS-5116:
---

[~jpe...@apache.org] I think this is worth calling out in the CHANGELOG?

> Investigate supporting accounting only mode in XFS isolator
> ---
>
> Key: MESOS-5116
> URL: https://issues.apache.org/jira/browse/MESOS-5116
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Yan Xu
>Assignee: James Peach
> Fix For: 1.4.0
>
>
> The initial implementation of the XFS isolator always enforces the disk quota 
> limit. In contrast, the POSIX disk isolator supports optionally monitoring 
> disk usage without enforcement, which eases the transition into disk quota 
> enforcement mode.
> The Mesos agent provides a {{flags.enforce_container_disk_quota}} flag to turn on 
> enforcement when the POSIX isolator is added. With XFS, either we support it 
> as well or we need to change the flag so it's specific to the POSIX disk isolator.
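For illustration only, a minimal sketch (not the actual isolator code) of what honoring the flag in an accounting-only mode could look like: always track per-sandbox usage, but only install a hard limit when {{flags.enforce_container_disk_quota}} is set. The helper functions are hypothetical placeholders.

{code}
#include <cstdint>
#include <iostream>
#include <string>

// Hypothetical placeholders for the isolator's quota operations; the real
// XFS isolator manages project quotas via quotactl, which is elided here.
static void setProjectQuota(const std::string& sandbox, uint64_t limitBytes)
{
  std::cout << "hard quota of " << limitBytes << " bytes on " << sandbox << std::endl;
}

static void trackProjectUsage(const std::string& sandbox)
{
  std::cout << "accounting-only project on " << sandbox << std::endl;
}

// Sketch of honoring --enforce_container_disk_quota: always account,
// only enforce when the flag is set.
void applyDiskPolicy(
    bool enforceContainerDiskQuota,
    uint64_t limitBytes,
    const std::string& sandbox)
{
  if (enforceContainerDiskQuota) {
    setProjectQuota(sandbox, limitBytes);  // writes beyond the limit fail
  } else {
    trackProjectUsage(sandbox);            // usage is reported, never enforced
  }
}
{code}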



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-3488) /sys/fs/cgroup/memory/mesos missing after running a while

2017-07-27 Thread stevenlee (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104306#comment-16104306
 ] 

stevenlee commented on MESOS-3488:
--

Hi, Chengwei Yang
How did you resolve this issue? I encountered a similar problem, but the 
missing directory is /sys/fs/cgroup/devices/mesos.

> /sys/fs/cgroup/memory/mesos missing after running a while
> -
>
> Key: MESOS-3488
> URL: https://issues.apache.org/jira/browse/MESOS-3488
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.21.0
> Environment: mesos 0.21.0 on CentOS 7.1
>Reporter: Chengwei Yang
>
> I setup mesos 0.21.0 on CentOS 7.1 with mesos-0.21.0 rpm downloaded from 
> mesosphere.
> At first it works fine and jobs finish correctly; however, after running for 
> a while, all tasks go to LOST.
> There is nothing in the **sandbox**, and I see the following in mesos-slave.ERROR:
> ```
> E0922 14:02:31.329264  8336 slave.cpp:2787] Container 
> '865c9a1c-3abe-4263-921e-8d0be2f0a56d' for executor 
> '21bf1400-6456-45e5-8f28-fe5ade7e7bfd' of framework 
> '20150401-105258-3755085578-5050-14676-' failed to start: Failed to 
> prepare isolator: Failed to create directory 
> '/sys/fs/cgroup/memory/mesos/865c9a1c-3abe-4263-921e-8d0be2f0a56d': No such 
> file or directory
> ```
> I checked that /sys/fs/cgroup/cpu/mesos does exist while 
> /sys/fs/cgroup/memory/mesos is missing.
> Since mesos works fine at first (and I checked the source code: if creating 
> /sys/fs/cgroup/memory/mesos fails at mesos-slave startup, it logs that 
> error), I'm curious what removed /sys/fs/cgroup/memory/mesos, and when.
> Has anyone seen this issue before?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7821) Resource refinement does not downgrade task.executor.resources in LAUNCH_GROUP handler.

2017-07-27 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-7821:

Sprint: Mesosphere Sprint 60

> Resource refinement does not downgrade task.executor.resources in LAUNCH_GROUP 
> handler.
> ---
>
> Key: MESOS-7821
> URL: https://issues.apache.org/jira/browse/MESOS-7821
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Jie Yu
>Assignee: Michael Park
>
> Looks like we need to downgrade task.executor.resources as well:
> https://github.com/apache/mesos/blob/master/src/master/master.cpp#L4970-L4982



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7831) Resource refinement is not applied to tasks in completed_frameworks.

2017-07-27 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-7831:

Sprint: Mesosphere Sprint 60

> Resource refinement is not applied to tasks in completed_frameworks.
> 
>
> Key: MESOS-7831
> URL: https://issues.apache.org/jira/browse/MESOS-7831
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Yan Xu
>Assignee: Yan Xu
>Priority: Blocker
>
> When an agent reregisters, the master [doesn't apply refinement to 
> completed_frameworks|https://github.pie.apple.com/pie/mesos/blob/cd3380c4e9521b4b26f9030658816eee7a4b89a1/src/master/master.cpp#L8600]
>  that the agent sends. This would then violate CHECKs that can be triggered by 
> querying the /state endpoint.
> A sample CHECK failure backtrace:
> {noformat:title=}
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #1  0x7fae8a752105 in abort () from /lib64/libc.so.6
> #2  0x7fae8d6bccf4 in DumpStackTraceAndExit () at src/utilities.cc:147
> #3  0x7fae8d6b5aaa in Fail () at src/logging.cc:1458
> #4  0x7fae8d6b5a06 in SendToLog () at src/logging.cc:1412
> #5  0x7fae8d6b53fc in Flush () at src/logging.cc:1281
> #6  0x7fae8d6b81b6 in ~LogMessageFatal () at src/logging.cc:1984
> #7  0x7fae8c6d3c1a in mesos::Resources::isEmpty (resource=...) at 
> ../../src/common/resources.cpp:1052
> #8  0x7fae8c6d3d16 in mesos::Resources::Resource_::isEmpty 
> (this=this@entry=0x7fae787e6910) at ../../src/common/resources.cpp:1174
> #9  0x7fae8c6d3d43 in mesos::Resources::add (this=0x7fae787e6a40, 
> that=...) at ../../src/common/resources.cpp:1994
> #10 0x7fae8c6d53f0 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2017
> #11 0x7fae8c6d54a8 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2026
> #12 0x7fae8c6d55fb in mesos::Resources::Resources (this=0x7fae787e6a40, 
> _resources=...) at ../../src/common/resources.cpp:1278
> #13 0x7fae8c6ade1e in mesos::json (writer=writer@entry=0x7fae787e6b30, 
> task=...) at ../../src/common/http.cpp:675
> #14 0x7fae8c85d62f in operator() (stream=, 
> __closure=) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:771
> #15 std::_Function_handler (std::ostream*)> JSON::internal::jsonify(mesos::Task const&, 
> JSON::internal::LessPrefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data
>  const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #16 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6bd0) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #17 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #18 0x7fae8c85f3f9 in JSON::ArrayWriter::element 
> (this=0x7fae787e6cf0, value=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:406
> #19 0x7fae8c86a378 in 
> mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*) 
> const::{lambda(JSON::ArrayWriter*)#3}::operator()(JSON::ArrayWriter*) const (
> __closure=0x7fae787e6db0, writer=writer@entry=0x7fae787e6cf0) at 
> ../../src/master/http.cpp:348
> #20 0x7fae8c86a5ad in operator() (stream=, 
> __closure=0x7fad9a48af68) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:759
> #21 std::_Function_handler (std::ostream*)> 
> JSON::internal::jsonify  const::{lambda(JSON::ArrayWriter*)#3}, 
> void>(mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*)
>  const::{lambda(JSON::ArrayWriter*)#3} const&, 
> JSON::internal::Prefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data 
> const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #22 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6e00) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #23 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #24 0x7fae8c8b9331 in 
> field  const:: > (value=..., key="completed_tasks", 
> this=0x7fae787e6eb0)
> at ../../3rdparty/stout/include/stout/jsonify.hpp:440
> #25 mesos::internal::master::FullFrameworkWriter::operator() 
> (this=0x7fae787e6f70, 

[jira] [Updated] (MESOS-7246) Add documentation for AGENT_ADDED/AGENT_REMOVED events.

2017-07-27 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-7246:
--
Shepherd: Anand Mazumdar

> Add documentation for AGENT_ADDED/AGENT_REMOVED events.
> ---
>
> Key: MESOS-7246
> URL: https://issues.apache.org/jira/browse/MESOS-7246
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Anand Mazumdar
>Assignee: Quinn
>Priority: Minor
>  Labels: newbie, newbie++
>
> We need to add documentation to the existing Mesos Operator API docs for the 
> newly added {{AGENT_ADDED}}/{{AGENT_REMOVED}} events. The protobuf definition 
> for the events can be found here:
> https://github.com/apache/mesos/blob/master/include/mesos/v1/master/master.proto



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (MESOS-7246) Add documentation for AGENT_ADDED/AGENT_REMOVED events.

2017-07-27 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar reassigned MESOS-7246:
-

Assignee: Quinn

> Add documentation for AGENT_ADDED/AGENT_REMOVED events.
> ---
>
> Key: MESOS-7246
> URL: https://issues.apache.org/jira/browse/MESOS-7246
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Anand Mazumdar
>Assignee: Quinn
>Priority: Minor
>  Labels: newbie, newbie++
>
> We need to add documentation to the existing Mesos Operator API docs for the 
> newly added {{AGENT_ADDED}}/{{AGENT_REMOVED}} events. The protobuf definition 
> for the events can be found here:
> https://github.com/apache/mesos/blob/master/include/mesos/v1/master/master.proto



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7246) Add documentation for AGENT_ADDED/AGENT_REMOVED events.

2017-07-27 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-7246:
--
  Sprint: Mesosphere Sprint 60
Story Points: 1

> Add documentation for AGENT_ADDED/AGENT_REMOVED events.
> ---
>
> Key: MESOS-7246
> URL: https://issues.apache.org/jira/browse/MESOS-7246
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Anand Mazumdar
>Assignee: Quinn
>Priority: Minor
>  Labels: newbie, newbie++
>
> We need to add documentation to the existing Mesos Operator API docs for the 
> newly added {{AGENT_ADDED}}/{{AGENT_REMOVED}} events. The protobuf definition 
> for the events can be found here:
> https://github.com/apache/mesos/blob/master/include/mesos/v1/master/master.proto



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7831) Resource refinement is not applied to tasks in completed_frameworks.

2017-07-27 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-7831:

Shepherd: Michael Park

> Resource refinement is not applied to tasks in completed_frameworks.
> 
>
> Key: MESOS-7831
> URL: https://issues.apache.org/jira/browse/MESOS-7831
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Yan Xu
>Assignee: Yan Xu
>Priority: Blocker
>
> When an agent reregisters, the master [doesn't apply refinement to 
> completed_frameworks|https://github.pie.apple.com/pie/mesos/blob/cd3380c4e9521b4b26f9030658816eee7a4b89a1/src/master/master.cpp#L8600]
>  that the agent sends. This would then violate CHECKs that can be triggered by 
> querying the /state endpoint.
> A sample CHECK failure backtrace:
> {noformat:title=}
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #1  0x7fae8a752105 in abort () from /lib64/libc.so.6
> #2  0x7fae8d6bccf4 in DumpStackTraceAndExit () at src/utilities.cc:147
> #3  0x7fae8d6b5aaa in Fail () at src/logging.cc:1458
> #4  0x7fae8d6b5a06 in SendToLog () at src/logging.cc:1412
> #5  0x7fae8d6b53fc in Flush () at src/logging.cc:1281
> #6  0x7fae8d6b81b6 in ~LogMessageFatal () at src/logging.cc:1984
> #7  0x7fae8c6d3c1a in mesos::Resources::isEmpty (resource=...) at 
> ../../src/common/resources.cpp:1052
> #8  0x7fae8c6d3d16 in mesos::Resources::Resource_::isEmpty 
> (this=this@entry=0x7fae787e6910) at ../../src/common/resources.cpp:1174
> #9  0x7fae8c6d3d43 in mesos::Resources::add (this=0x7fae787e6a40, 
> that=...) at ../../src/common/resources.cpp:1994
> #10 0x7fae8c6d53f0 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2017
> #11 0x7fae8c6d54a8 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2026
> #12 0x7fae8c6d55fb in mesos::Resources::Resources (this=0x7fae787e6a40, 
> _resources=...) at ../../src/common/resources.cpp:1278
> #13 0x7fae8c6ade1e in mesos::json (writer=writer@entry=0x7fae787e6b30, 
> task=...) at ../../src/common/http.cpp:675
> #14 0x7fae8c85d62f in operator() (stream=, 
> __closure=) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:771
> #15 std::_Function_handler (std::ostream*)> JSON::internal::jsonify(mesos::Task const&, 
> JSON::internal::LessPrefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data
>  const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #16 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6bd0) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #17 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #18 0x7fae8c85f3f9 in JSON::ArrayWriter::element 
> (this=0x7fae787e6cf0, value=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:406
> #19 0x7fae8c86a378 in 
> mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*) 
> const::{lambda(JSON::ArrayWriter*)#3}::operator()(JSON::ArrayWriter*) const (
> __closure=0x7fae787e6db0, writer=writer@entry=0x7fae787e6cf0) at 
> ../../src/master/http.cpp:348
> #20 0x7fae8c86a5ad in operator() (stream=, 
> __closure=0x7fad9a48af68) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:759
> #21 std::_Function_handler (std::ostream*)> 
> JSON::internal::jsonify  const::{lambda(JSON::ArrayWriter*)#3}, 
> void>(mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*)
>  const::{lambda(JSON::ArrayWriter*)#3} const&, 
> JSON::internal::Prefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data 
> const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #22 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6e00) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #23 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #24 0x7fae8c8b9331 in 
> field  const:: > (value=..., key="completed_tasks", 
> this=0x7fae787e6eb0)
> at ../../3rdparty/stout/include/stout/jsonify.hpp:440
> #25 mesos::internal::master::FullFrameworkWriter::operator() 
> (this=0x7fae787e6f70, 

[jira] [Updated] (MESOS-7713) Optimize number of copies made in dispatch/defer mechanism

2017-07-27 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-7713:
--
  Sprint: Mesosphere Sprint 60
Story Points: 3

> Optimize number of copies made in dispatch/defer mechanism
> --
>
> Key: MESOS-7713
> URL: https://issues.apache.org/jira/browse/MESOS-7713
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess
>Affects Versions: 1.2.0, 1.2.1, 1.3.0
>Reporter: Dmitry Zhuk
>Assignee: Dmitry Zhuk
>
> Profiling agent reregistration for a large cluster shows that many CPU 
> cycles are spent on copying protobuf objects. This is partially due to copies 
> made by code like this:
> {code}
> future.then(defer(self(), &Self::method, param));
> {code}
> {{param}} could be copied 8-10 times before it reaches {{method}}. 
> Specifically, {{reregisterSlave}} accepts vectors of rather complex objects, 
> which are passed to {{defer}}.
> Currently there are some places in the {{defer}}, {{dispatch}} and {{Future}} 
> code which could use {{std::move}} and {{std::forward}} to avoid some of the 
> copies.
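As a rough illustration of the proposed direction (this is my own sketch, not code from libprocess — the real {{defer}}/{{dispatch}} signatures differ), forwarding the bound argument avoids the extra copies that taking it by value at every layer introduces:

{code}
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for a defer/dispatch-style wrapper. The "copying"
// version takes and captures the parameter by value at every layer; the
// "forwarding" version moves it through, so a large vector is copied at
// most once on its way to the handler.
template <typename F, typename T>
std::function<void()> bindCopying(F f, T param)
{
  return [f, param]() { f(param); };  // yet another copy into the closure
}

template <typename F, typename T>
std::function<void()> bindForwarding(F&& f, T&& param)
{
  return [f = std::forward<F>(f),
          param = std::forward<T>(param)]() mutable {
    f(std::move(param));  // hand the value off without copying
  };
}

int main()
{
  std::vector<std::string> tasks(1000, std::string(64, 'x'));

  auto handler = [](const std::vector<std::string>& v) {
    std::cout << "handling " << v.size() << " entries" << std::endl;
  };

  // 'tasks' is copied into bindCopying and again into the closure.
  auto copying = bindCopying(handler, tasks);

  // 'tasks' is moved end to end; no additional copies are made.
  auto forwarding = bindForwarding(handler, std::move(tasks));

  copying();
  forwarding();

  return 0;
}
{code}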



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7246) Add documentation for AGENT_ADDED/AGENT_REMOVED events.

2017-07-27 Thread Quinn (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103987#comment-16103987
 ] 

Quinn commented on MESOS-7246:
--

https://reviews.apache.org/r/61194

> Add documentation for AGENT_ADDED/AGENT_REMOVED events.
> ---
>
> Key: MESOS-7246
> URL: https://issues.apache.org/jira/browse/MESOS-7246
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Anand Mazumdar
>Priority: Minor
>  Labels: newbie, newbie++
>
> We need to add documentation to the existing Mesos Operator API docs for the 
> newly added {{AGENT_ADDED}}/{{AGENT_REMOVED}} events. The protobuf definition 
> for the events can be found here:
> https://github.com/apache/mesos/blob/master/include/mesos/v1/master/master.proto



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (MESOS-7831) Resource refinement is not applied to tasks in completed_frameworks.

2017-07-27 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu reassigned MESOS-7831:
-

Assignee: Yan Xu

> Resource refinement is not applied to tasks in completed_frameworks.
> 
>
> Key: MESOS-7831
> URL: https://issues.apache.org/jira/browse/MESOS-7831
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Yan Xu
>Assignee: Yan Xu
>Priority: Blocker
>
> When an agent reregisters, the master [doesn't apply refinement to 
> completed_frameworks|https://github.pie.apple.com/pie/mesos/blob/cd3380c4e9521b4b26f9030658816eee7a4b89a1/src/master/master.cpp#L8600]
>  that the agent sends. This would then violate CHECKs that can be triggered by 
> querying the /state endpoint.
> A sample CHECK failure backtrace:
> {noformat:title=}
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #1  0x7fae8a752105 in abort () from /lib64/libc.so.6
> #2  0x7fae8d6bccf4 in DumpStackTraceAndExit () at src/utilities.cc:147
> #3  0x7fae8d6b5aaa in Fail () at src/logging.cc:1458
> #4  0x7fae8d6b5a06 in SendToLog () at src/logging.cc:1412
> #5  0x7fae8d6b53fc in Flush () at src/logging.cc:1281
> #6  0x7fae8d6b81b6 in ~LogMessageFatal () at src/logging.cc:1984
> #7  0x7fae8c6d3c1a in mesos::Resources::isEmpty (resource=...) at 
> ../../src/common/resources.cpp:1052
> #8  0x7fae8c6d3d16 in mesos::Resources::Resource_::isEmpty 
> (this=this@entry=0x7fae787e6910) at ../../src/common/resources.cpp:1174
> #9  0x7fae8c6d3d43 in mesos::Resources::add (this=0x7fae787e6a40, 
> that=...) at ../../src/common/resources.cpp:1994
> #10 0x7fae8c6d53f0 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2017
> #11 0x7fae8c6d54a8 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2026
> #12 0x7fae8c6d55fb in mesos::Resources::Resources (this=0x7fae787e6a40, 
> _resources=...) at ../../src/common/resources.cpp:1278
> #13 0x7fae8c6ade1e in mesos::json (writer=writer@entry=0x7fae787e6b30, 
> task=...) at ../../src/common/http.cpp:675
> #14 0x7fae8c85d62f in operator() (stream=, 
> __closure=) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:771
> #15 std::_Function_handler (std::ostream*)> JSON::internal::jsonify(mesos::Task const&, 
> JSON::internal::LessPrefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data
>  const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #16 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6bd0) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #17 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #18 0x7fae8c85f3f9 in JSON::ArrayWriter::element 
> (this=0x7fae787e6cf0, value=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:406
> #19 0x7fae8c86a378 in 
> mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*) 
> const::{lambda(JSON::ArrayWriter*)#3}::operator()(JSON::ArrayWriter*) const (
> __closure=0x7fae787e6db0, writer=writer@entry=0x7fae787e6cf0) at 
> ../../src/master/http.cpp:348
> #20 0x7fae8c86a5ad in operator() (stream=, 
> __closure=0x7fad9a48af68) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:759
> #21 std::_Function_handler (std::ostream*)> 
> JSON::internal::jsonify  const::{lambda(JSON::ArrayWriter*)#3}, 
> void>(mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*)
>  const::{lambda(JSON::ArrayWriter*)#3} const&, 
> JSON::internal::Prefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data 
> const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #22 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6e00) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #23 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #24 0x7fae8c8b9331 in 
> field  const:: > (value=..., key="completed_tasks", 
> this=0x7fae787e6eb0)
> at ../../3rdparty/stout/include/stout/jsonify.hpp:440
> #25 mesos::internal::master::FullFrameworkWriter::operator() 
> (this=0x7fae787e6f70, 

[jira] [Commented] (MESOS-7831) Resource refinement is not applied to tasks in completed_frameworks.

2017-07-27 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103975#comment-16103975
 ] 

Yan Xu commented on MESOS-7831:
---

Similar to MESOS-7716.

> Resource refinement is not applied to tasks in completed_frameworks.
> 
>
> Key: MESOS-7831
> URL: https://issues.apache.org/jira/browse/MESOS-7831
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Yan Xu
>Priority: Blocker
>
> When an agent reregisters, the master [doesn't apply refinement to 
> completed_frameworks|https://github.pie.apple.com/pie/mesos/blob/cd3380c4e9521b4b26f9030658816eee7a4b89a1/src/master/master.cpp#L8600]
>  that the agent sends. This would then violate CHECKs that can be triggered by 
> querying the /state endpoint.
> A sample CHECK failure backtrace:
> {noformat:title=}
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #0  0x7fae8a750925 in raise () from /lib64/libc.so.6
> #1  0x7fae8a752105 in abort () from /lib64/libc.so.6
> #2  0x7fae8d6bccf4 in DumpStackTraceAndExit () at src/utilities.cc:147
> #3  0x7fae8d6b5aaa in Fail () at src/logging.cc:1458
> #4  0x7fae8d6b5a06 in SendToLog () at src/logging.cc:1412
> #5  0x7fae8d6b53fc in Flush () at src/logging.cc:1281
> #6  0x7fae8d6b81b6 in ~LogMessageFatal () at src/logging.cc:1984
> #7  0x7fae8c6d3c1a in mesos::Resources::isEmpty (resource=...) at 
> ../../src/common/resources.cpp:1052
> #8  0x7fae8c6d3d16 in mesos::Resources::Resource_::isEmpty 
> (this=this@entry=0x7fae787e6910) at ../../src/common/resources.cpp:1174
> #9  0x7fae8c6d3d43 in mesos::Resources::add (this=0x7fae787e6a40, 
> that=...) at ../../src/common/resources.cpp:1994
> #10 0x7fae8c6d53f0 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2017
> #11 0x7fae8c6d54a8 in mesos::Resources::operator+= 
> (this=this@entry=0x7fae787e6a40, that=...) at 
> ../../src/common/resources.cpp:2026
> #12 0x7fae8c6d55fb in mesos::Resources::Resources (this=0x7fae787e6a40, 
> _resources=...) at ../../src/common/resources.cpp:1278
> #13 0x7fae8c6ade1e in mesos::json (writer=writer@entry=0x7fae787e6b30, 
> task=...) at ../../src/common/http.cpp:675
> #14 0x7fae8c85d62f in operator() (stream=, 
> __closure=) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:771
> #15 std::_Function_handler (std::ostream*)> JSON::internal::jsonify(mesos::Task const&, 
> JSON::internal::LessPrefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data
>  const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #16 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6bd0) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #17 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #18 0x7fae8c85f3f9 in JSON::ArrayWriter::element 
> (this=0x7fae787e6cf0, value=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:406
> #19 0x7fae8c86a378 in 
> mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*) 
> const::{lambda(JSON::ArrayWriter*)#3}::operator()(JSON::ArrayWriter*) const (
> __closure=0x7fae787e6db0, writer=writer@entry=0x7fae787e6cf0) at 
> ../../src/master/http.cpp:348
> #20 0x7fae8c86a5ad in operator() (stream=, 
> __closure=0x7fad9a48af68) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:759
> #21 std::_Function_handler (std::ostream*)> 
> JSON::internal::jsonify  const::{lambda(JSON::ArrayWriter*)#3}, 
> void>(mesos::internal::master::FullFrameworkWriter::operator()(JSON::ObjectWriter*)
>  const::{lambda(JSON::ArrayWriter*)#3} const&, 
> JSON::internal::Prefer)::{lambda(std::ostream*)#1}>::_M_invoke(std::_Any_data 
> const&, std::ostream*) (__functor=..., __args#0=) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2039
> #22 0x7fae8c69af73 in operator() (__args#0=0x7fae787e7820, 
> this=0x7fae787e6e00) at 
> /opt/rh/devtoolset-3/root/usr/include/c++/4.9.2/functional:2439
> #23 JSON::operator<<(std::ostream&, JSON::Proxy&&) (stream=..., that=...) at 
> ../../3rdparty/stout/include/stout/jsonify.hpp:172
> #24 0x7fae8c8b9331 in 
> field  const:: > (value=..., key="completed_tasks", 
> this=0x7fae787e6eb0)
> at ../../3rdparty/stout/include/stout/jsonify.hpp:440
> #25 mesos::internal::master::FullFrameworkWriter::operator() 
> (this=0x7fae787e6f70, 

[jira] [Created] (MESOS-7839) io switchboard: clarify expected behavior when using TTYInfo with the default executor of a TaskGroup

2017-07-27 Thread James DeFelice (JIRA)
James DeFelice created MESOS-7839:
-

 Summary: io switchboard: clarify expected behavior when using 
TTYInfo with the default executor of a TaskGroup
 Key: MESOS-7839
 URL: https://issues.apache.org/jira/browse/MESOS-7839
 Project: Mesos
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: James DeFelice


I executed a LaunchGroup operation with an Executor of a DEFAULT type and with 
TTYInfo set to a non-empty protobuf. The tasks of the group did not specify a 
ContainerInfo.

Mesos "successfully" launched the task group, and the executor sandbox stderr 
reported:
{code}
The io switchboard server failed: Failed redirecting stdout: Input/output error
{code}

... which seems relatively uninformative. Mesos also returned TASK_RUNNING 
followed by TASK_FINISHED for the tasks in the launched group. This wasn't what 
I expected: my goal was to launch a pod and have a TTY attached to the first 
task in the group.

After discussing with [~klueska], the solution to my problem was to specify 
TTYInfo on the container of the task within the group, not on the group's 
executor (a rough sketch of this follows the list below). But we agreed that 
Mesos could probably exhibit better behavior in the initial scenario that I 
tested.

Some (mutually exclusive) possibilities for alternate Mesos behavior:

(a) fail-fast: using the Default Executor with a task group doesn't support 
TTYInfo so Mesos should just refuse to launch the task group (and return an 
appropriate error code and message w/ a reasonable explanation).

(b) support TTYInfo when using the Default Executor with a task group. The use 
case for this is unclear.

(c) when using TTYInfo with the DefaultExecutor and task group, attach the TTY 
to the first task in the group.
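For reference, a minimal sketch of the workaround described above: request the TTY on the task's own {{ContainerInfo}} rather than on the executor. This is an illustrative snippet against the v1 API as I understand it, not code taken from Mesos; treat the exact field names as assumptions.

{code}
#include <mesos/v1/mesos.hpp>

// Sketch: attach a TTY to a task in a task group by setting TTYInfo on
// that task's ContainerInfo (MESOS type) instead of on the group's
// default executor. `task` is the v1::TaskInfo being added to the
// v1::TaskGroupInfo.
void attachTTYToTask(mesos::v1::TaskInfo* task)
{
  mesos::v1::ContainerInfo* container = task->mutable_container();
  container->set_type(mesos::v1::ContainerInfo::MESOS);

  // The presence of the (possibly empty) TTYInfo message is what
  // requests a TTY for this container.
  container->mutable_tty_info();
}
{code}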




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7838) GC space utilization metrics.

2017-07-27 Thread James Peach (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach updated MESOS-7838:
---
Component/s: storage
 agent

> GC space utilization metrics.
> -
>
> Key: MESOS-7838
> URL: https://issues.apache.org/jira/browse/MESOS-7838
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, storage
>Reporter: James Peach
>
> To help operators understand and tune Mesos GC, it would be helpful to have 
> metrics that track how much space is eligible for garbage collection and how 
> much space is being released as garbage collection takes place. We can 
> implement this fairly efficiently (though with some loss of accuracy) if we 
> persist the space usage calculated by the disk isolators. Knowing the space 
> consumed by each sandbox path will not only let us publish these metrics, but 
> can also be used to make better garbage collection choices (i.e. we can know 
> how much space to release to get back within tolerance).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (MESOS-7838) GC space utilization metrics.

2017-07-27 Thread James Peach (JIRA)
James Peach created MESOS-7838:
--

 Summary: GC space utilization metrics.
 Key: MESOS-7838
 URL: https://issues.apache.org/jira/browse/MESOS-7838
 Project: Mesos
  Issue Type: Bug
Reporter: James Peach


To help operators understand and tune Mesos GC, it would be helpful to have 
metrics that track how much space is eligible for garbage collection and how 
much space is being released as garbage collection takes place. We can 
implement this fairly efficiently (though with some loss of accuracy) if we 
persist the space usage calculated by the disk isolators. Knowing the space 
consumed by each sandbox path will not only let us publish these metrics, but 
can also be used to make better garbage collection choices (i.e. we can know 
how much space to release to get back within tolerance).
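Not part of this ticket, but a rough sketch of the bookkeeping described above: persist per-sandbox usage as measured by the disk isolators and derive both the metric and the amount to release from it. The class and method names are hypothetical, not the agent's actual code.

{code}
#include <cstdint>
#include <map>
#include <numeric>
#include <string>
#include <utility>

// Hypothetical bookkeeping for GC space metrics: sandbox path -> bytes,
// as last measured by a disk isolator and persisted across agent restarts.
class GCSpaceTracker
{
public:
  void record(const std::string& sandbox, uint64_t bytes)
  {
    usage[sandbox] = bytes;
  }

  // Metric: total bytes currently eligible for garbage collection.
  uint64_t eligibleBytes() const
  {
    return std::accumulate(
        usage.begin(), usage.end(), uint64_t(0),
        [](uint64_t sum, const std::pair<const std::string, uint64_t>& entry) {
          return sum + entry.second;
        });
  }

  // Pick sandboxes until at least `target` bytes would be released, i.e.
  // "know how much space to release to get back within tolerance".
  std::map<std::string, uint64_t> selectForRemoval(uint64_t target) const
  {
    std::map<std::string, uint64_t> selected;
    uint64_t released = 0;

    for (const auto& entry : usage) {
      if (released >= target) {
        break;
      }
      selected.insert(entry);
      released += entry.second;
    }

    return selected;
  }

private:
  std::map<std::string, uint64_t> usage;
};
{code}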



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7652) Docker image with universal containerizer does not work if WORKDIR is missing in the rootfs.

2017-07-27 Thread Gilbert Song (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilbert Song updated MESOS-7652:

Target Version/s: 1.2.2, 1.3.1, 1.4.0
  Labels: mesosphere  (was: )

> Docker image with universal containerizer does not work if WORKDIR is missing 
> in the rootfs.
> 
>
> Key: MESOS-7652
> URL: https://issues.apache.org/jira/browse/MESOS-7652
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.2.1
>Reporter: michael beisiegel
>Assignee: Gilbert Song
>Priority: Minor
>  Labels: mesosphere
>
> Hello,
> I recently used the following docker image:
> quay.io/spinnaker/front50:master
> https://quay.io/repository/spinnaker/front50
> Here is the link to the Dockerfile:
> https://github.com/spinnaker/front50/blob/master/Dockerfile
> and here is the source:
> {color:blue}FROM java:8
> MAINTAINER delivery-engineer...@netflix.com
> COPY . workdir/
> WORKDIR workdir
> RUN GRADLE_USER_HOME=cache ./gradlew buildDeb -x test && \
>   dpkg -i ./front50-web/build/distributions/*.deb && \
>   cd .. && \
>   rm -rf workdir
> CMD ["/opt/front50/bin/front50"]{color}
> The image works fine with the docker containerizer, but the universal 
> containerizer shows the following in stderr.
> "Failed to chdir into current working directory '/workdir': No such file or 
> directory"
> The problem comes from the fact that the Dockerfile creates a workdir but 
> then later removes the created dir in a RUN step. The docker containerizer 
> has no problem with this: if you run
> docker run -ti --rm quay.io/spinnaker/front50:master bash
> you land in the working dir, but the universal containerizer fails with the 
> error above.
> thanks for your help,
> Michael
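One possible direction, sketched here only for illustration (this is not the actual fix): have the containerizer create the image-declared working directory inside the container rootfs if it is missing, before chdir'ing into it, which is effectively what Docker tolerates. The snippet assumes stout's {{os::}} and {{path::}} helpers.

{code}
#include <string>

#include <stout/error.hpp>
#include <stout/nothing.hpp>
#include <stout/os.hpp>
#include <stout/path.hpp>
#include <stout/try.hpp>

// Sketch: before chdir'ing into the image-declared working directory,
// create it inside the container rootfs if the image removed it.
Try<Nothing> ensureWorkingDirectory(
    const std::string& rootfs,
    const std::string& workdir)  // e.g. "/workdir" from the image config
{
  const std::string path = path::join(rootfs, workdir);

  if (!os::exists(path)) {
    Try<Nothing> mkdir = os::mkdir(path);  // recursive by default
    if (mkdir.isError()) {
      return Error("Failed to create '" + path + "': " + mkdir.error());
    }
  }

  return Nothing();
}
{code}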



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7714) Fix agent downgrade for reservation refinement

2017-07-27 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103626#comment-16103626
 ] 

Yan Xu commented on MESOS-7714:
---

Great. I guess I am just not clear on the mechanism to achieve that.

Downgrading currently trips [the CHECK 
here|https://github.com/apache/mesos/blob/1.3.0/src/slave/paths.cpp#L478] for 
me (from a 1.4 agent with persistent volumes). It looks like the proposal is to 
commit some downgrading logic to the 1.3.x branch? Sorry, it's not clear to me 
whether the case I am mentioning is covered.

> Fix agent downgrade for reservation refinement
> --
>
> Key: MESOS-7714
> URL: https://issues.apache.org/jira/browse/MESOS-7714
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>Priority: Blocker
>
> The agent code only partially supports downgrading of an agent correctly.
> The checkpointed resources are done correctly, but the resources within
> the {{SlaveInfo}} message as well as tasks and executors also need to be 
> downgraded
> correctly and converted back on recovery.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7714) Fix agent downgrade for reservation refinement

2017-07-27 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103520#comment-16103520
 ] 

Michael Park commented on MESOS-7714:
-

[~xujyan]: This *is* for the downgrade of 1.4 to <= 1.3.x.

> Fix agent downgrade for reservation refinement
> --
>
> Key: MESOS-7714
> URL: https://issues.apache.org/jira/browse/MESOS-7714
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>Priority: Blocker
>
> The agent code only partially supports downgrading of an agent correctly.
> The checkpointed resources are done correctly, but the resources within
> the {{SlaveInfo}} message as well as tasks and executors also need to be 
> downgraded
> correctly and converted back on recovery.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-7714) Fix agent downgrade for reservation refinement

2017-07-27 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-7714:

Target Version/s: 1.4.0

> Fix agent downgrade for reservation refinement
> --
>
> Key: MESOS-7714
> URL: https://issues.apache.org/jira/browse/MESOS-7714
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>Priority: Blocker
>
> The agent code only partially supports downgrading of an agent correctly.
> The checkpointed resources are done correctly, but the resources within
> the {{SlaveInfo}} message as well as tasks and executors also need to be 
> downgraded
> correctly and converted back on recovery.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (MESOS-7714) Fix agent downgrade for reservation refinement

2017-07-27 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103505#comment-16103505
 ] 

Yan Xu edited comment on MESOS-7714 at 7/27/17 5:06 PM:


[~mcypark] [~bmahler] Just to be sure. This ticket is not for supporting 
downgrade of an agent from 1.4 to <= 1.3.x right? Could you clarify?


was (Author: xujyan):
[~mcypark] Just to be sure. This ticket is not for supporting downgrade of an 
agent from 1.4 to <= 1.3.x right? Could you clarify?

> Fix agent downgrade for reservation refinement
> --
>
> Key: MESOS-7714
> URL: https://issues.apache.org/jira/browse/MESOS-7714
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>Priority: Blocker
>
> The agent code only partially supports downgrading of an agent correctly.
> The checkpointed resources are done correctly, but the resources within
> the {{SlaveInfo}} message as well as tasks and executors also need to be 
> downgraded
> correctly and converted back on recovery.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7714) Fix agent downgrade for reservation refinement

2017-07-27 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103505#comment-16103505
 ] 

Yan Xu commented on MESOS-7714:
---

[~mcypark] Just to be sure. This ticket is not for supporting downgrade of an 
agent from 1.4 to <= 1.3.x right? Could you clarify?

> Fix agent downgrade for reservation refinement
> --
>
> Key: MESOS-7714
> URL: https://issues.apache.org/jira/browse/MESOS-7714
> Project: Mesos
>  Issue Type: Bug
>Reporter: Michael Park
>Priority: Blocker
>
> The agent code only partially supports downgrading of an agent correctly.
> The checkpointed resources are done correctly, but the resources within
> the {{SlaveInfo}} message as well as tasks and executors also need to be 
> downgraded
> correctly and converted back on recovery.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7337) DefaultExecutorCheckTest.CommandCheckTimeout becomes flaky under load

2017-07-27 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103255#comment-16103255
 ] 

Deshi Xiao commented on MESOS-7337:
---

How do I reproduce this test case?

> DefaultExecutorCheckTest.CommandCheckTimeout becomes flaky under load
> -
>
> Key: MESOS-7337
> URL: https://issues.apache.org/jira/browse/MESOS-7337
> Project: Mesos
>  Issue Type: Bug
>  Components: flaky, test
> Environment: Mac OS 10.12.4 (16E195), SSL debug build w/o 
> optimizations, clang version 5.0.0 (http://llvm.org/git/clang 
> c511a96ffe744933459ef64bf963629538057a90) (http://llvm.org/git/llvm 
> 0cd81d8a1055f167e0f588dd1b476863b00da3d5)
>Reporter: Benjamin Bannier
>  Labels: flaky-test, mesosphere
> Attachments: DefaultExecutorCheckTest.CommandCheckTimeout.log
>
>
> The test {{DefaultExecutorCheckTest.CommandCheckTimeout}} randomly fails for 
> me when executing tests in parallel, e.g.,
> {code}
> [ RUN  ] DefaultExecutorCheckTest.CommandCheckTimeout
> ../../src/tests/check_tests.cpp:1374: Failure
> Failed to wait 15secs for updateCheckResultTimeout
> ../../src/tests/check_tests.cpp:1334: Failure
> Actual function call count doesn't match EXPECT_CALL(*scheduler, update(_, 
> _))...
>  Expected: to be called at least 3 times
>Actual: called twice - unsatisfied and active
> [  FAILED  ] DefaultExecutorCheckTest.CommandCheckTimeout (25351 ms)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (MESOS-7837) Propagate resource updates from local resource providers to master

2017-07-27 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-7837:
---

 Summary: Propagate resource updates from local resource providers 
to master
 Key: MESOS-7837
 URL: https://issues.apache.org/jira/browse/MESOS-7837
 Project: Mesos
  Issue Type: Improvement
  Components: agent
Reporter: Benjamin Bannier
Assignee: Benjamin Bannier


When a resource provider registers with a resource provider manager, the 
manager should send a message to its subscribers informing them of the changed 
resources.

For the first iteration where we add agent-specific, local resource providers, 
the agent would be subscribed to the manager. It should be changed to handle 
such a resource update by informing the master about its changed resources. In 
order to support master failovers, we should make sure to similarly inform the 
master on agent reregistration.
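To make the flow concrete, here is a toy sketch with stand-in types (none of these are the real Mesos classes or messages): a provider registers with the manager, the manager notifies its subscriber (the agent), and the agent forwards the updated totals to the master. The same forwarding would run again when the agent reregisters after a master failover.

{code}
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins; the real Mesos types and messages differ.
struct ResourceProvider
{
  std::string name;
  double diskGB;
};

class ResourceProviderManager
{
public:
  // The agent subscribes to learn about resource changes.
  void subscribe(std::function<void(const std::vector<ResourceProvider>&)> cb)
  {
    subscriber = std::move(cb);
  }

  // When a provider registers, notify the subscriber of the new set.
  void addProvider(ResourceProvider provider)
  {
    providers.push_back(std::move(provider));
    if (subscriber) {
      subscriber(providers);
    }
  }

  const std::vector<ResourceProvider>& all() const { return providers; }

private:
  std::vector<ResourceProvider> providers;
  std::function<void(const std::vector<ResourceProvider>&)> subscriber;
};

// The "agent" side: forward the updated totals to the "master".
void sendUpdateToMaster(const std::vector<ResourceProvider>& providers)
{
  double total = 0;
  for (const ResourceProvider& provider : providers) {
    total += provider.diskGB;
  }
  std::cout << "UpdateSlave: total provider disk = " << total << " GB" << std::endl;
}

int main()
{
  ResourceProviderManager manager;
  manager.subscribe(sendUpdateToMaster);       // agent subscribes to the manager

  manager.addProvider({"local-raid", 512.0});  // registration triggers an update

  // After a master failover the agent reregisters and resends the totals.
  sendUpdateToMaster(manager.all());

  return 0;
}
{code}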



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-6804) Running 'tty' inside a debug container that has a tty reports "Not a tty"

2017-07-27 Thread Deshi Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103192#comment-16103192
 ] 

Deshi Xiao commented on MESOS-6804:
---

[~klueska] could you please explain the reason for that?

> Running 'tty' inside a debug container that has a tty reports "Not a tty"
> -
>
> Key: MESOS-6804
> URL: https://issues.apache.org/jira/browse/MESOS-6804
> Project: Mesos
>  Issue Type: Bug
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Critical
>  Labels: debugging, mesosphere
>
> We need to inject `/dev/console` into the container and map it to the slave 
> end of the TTY we are attached to.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (MESOS-7813) when lxc run after a period of time, the file(/proc/pid/cgroup) is modified, devices,blkio,memory,cpuacct is changed. why?

2017-07-27 Thread y123456yz (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101065#comment-16101065
 ] 

y123456yz edited comment on MESOS-7813 at 7/27/17 10:31 AM:


[~jvanremoortere]

Thanks very much.

1.
My lxc-start is started by my Executor. lxc-start runs as 
"/usr/bin/lxc-start -f /xxx/config -n xxx" and the Executor runs as 
"/usr/bin/Executor"; they are started from the command line, not through a 
systemd service.

Should I start my Executor and lxc via a systemd service (adding 
delegate=true)? Right now I start them from the command line, not via systemd, 
as follows:
/usr/bin/my-Executor -f xxx -d &
/usr/bin/lxc-start -f config -n xxx &

If "delegate=true" is required, how do I add it when starting my Executor and 
lxc-start from the command line?


2.
There is another problem: sometimes when I restart the mesos service, the lxc 
container stops automatically, and I don't know what happened.

When I run "systemctl restart mesos-slave", mesos-slave sends a KILL signal to 
my-Executor, and then all the lxc-start processes also receive the KILL signal. 
my-Executor runs lxc-start via CmdUtil.exec("lxc-start -f config -n xxx -d ").

Thanks again, Joris Van Remoortere



was (Author: y123456yz):
[~jvanremoortere]

Thanks very much.

1.
My lxc-start is started by my Executor. lxc-start runs as 
"/usr/bin/lxc-start -f /xxx/config -n xxx" and the Executor runs as 
"/usr/bin/Executor"; they are started from the command line, not through a 
systemd service.

Should I start my Executor and lxc via a systemd service (adding 
delegate=true)? Right now I start them from the command line, not via systemd, 
as follows:
/usr/bin/my-Executor -f xxx -d &
/usr/bin/lxc-start -f config -n xxx &

If "delegate=true" is required, how do I add it when starting my Executor and 
lxc-start from the command line?


2.
There is another problem: sometimes when I restart the mesos service, the lxc 
container stops automatically, and I don't know what happened.

Thanks again, Joris Van Remoortere


> when lxc run after a period of time, the file(/proc/pid/cgroup) is modified, 
> devices,blkio,memory,cpuacct is changed. why?
> --
>
> Key: MESOS-7813
> URL: https://issues.apache.org/jira/browse/MESOS-7813
> Project: Mesos
>  Issue Type: Bug
>  Components: agent, cgroups, executor, framework
> Environment: 1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 
> GNU/Linux
>Reporter: y123456yz
>
> when lxc run after a period of time, the file(/proc/pid/cgroup) is modified, 
> devices,blkio,memory,cpuacct is changed. why?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)