[jira] [Assigned] (MESOS-9197) Optimize `Resources::contains()`.

2018-09-06 Thread Meng Zhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Meng Zhu reassigned MESOS-9197:
---

Assignee: (was: Meng Zhu)

> Optimize `Resources::contains()`.
> -
>
> Key: MESOS-9197
> URL: https://issues.apache.org/jira/browse/MESOS-9197
> Project: Mesos
> Issue Type: Improvement
> Reporter: Meng Zhu
> Priority: Major
> Labels: mesosphere, performance
>
> Surprisingly, `contains` always makes a copy of the superset resources. This
> could be avoided.
> https://github.com/apache/mesos/blob/f7e3872b0359c6095f8eeaefe408cb7dcef5bb83/src/common/resources.cpp#L1415
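
The idea in the description can be illustrated with a minimal sketch. It is not the Mesos implementation: `Pool`, `Request`, `containsWithCopy` and `containsWithoutCopy` are hypothetical names, and resources are reduced to named scalar amounts. The first function copies the whole superset before subtracting; the second only tracks what the request has consumed so far.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model: scalar resources keyed by name.
using Pool = std::map<std::string, double>;
using Request = std::vector<std::pair<std::string, double>>;

// Copy-based check: duplicates the whole pool on every call, then subtracts.
bool containsWithCopy(const Pool& pool, const Request& request)
{
  Pool remaining = pool; // O(|pool|) copy.
  for (const auto& item : request) {
    assert(item.second >= 0.0);
    auto it = remaining.find(item.first);
    if (it == remaining.end() || it->second < item.second) {
      return false;
    }
    it->second -= item.second;
  }
  return true;
}

// Copy-free check: the pool is only read; the bookkeeping map grows with
// the number of distinct names in the request, not with the pool.
bool containsWithoutCopy(const Pool& pool, const Request& request)
{
  std::map<std::string, double> consumed;
  for (const auto& item : request) {
    assert(item.second >= 0.0);
    auto it = pool.find(item.first);
    if (it == pool.end()) {
      return false;
    }
    double& used = consumed[item.first];
    if (it->second - used < item.second) {
      return false;
    }
    used += item.second;
  }
  return true;
}
```

Both functions should agree on the same inputs; the second one simply avoids touching a copy of the superset, which matters when the superset is much larger than the request.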



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (MESOS-9213) Avoid multiple message conversions when incrementing metrics

2018-09-06 Thread Benjamin Mahler (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler reassigned MESOS-9213:
--

Assignee: Benjamin Mahler

> Avoid multiple message conversions when incrementing metrics
> 
>
> Key: MESOS-9213
> URL: https://issues.apache.org/jira/browse/MESOS-9213
> Project: Mesos
> Issue Type: Improvement
> Components: master, metrics
> Affects Versions: 1.7.0
> Reporter: Greg Mann
> Assignee: Benjamin Mahler
> Priority: Major
> Labels: mesosphere, metrics
>
> When incrementing metrics, we currently make calls such as
> {code}
> metrics.incrementEvent(devolve(evolve(message)));
> {code}
> which is inefficient. We should update such call sites to avoid gratuitous
> conversions, which could degrade performance when many events are being sent.





[jira] [Created] (MESOS-9213) Avoid multiple message conversions when incrementing metrics

2018-09-06 Thread Greg Mann (JIRA)
Greg Mann created MESOS-9213:


 Summary: Avoid multiple message conversions when incrementing 
metrics
 Key: MESOS-9213
 URL: https://issues.apache.org/jira/browse/MESOS-9213
 Project: Mesos
  Issue Type: Improvement
  Components: metrics, master
Affects Versions: 1.7.0
Reporter: Greg Mann


When incrementing metrics, we currently make calls such as
{code}
metrics.incrementEvent(devolve(evolve(message)));
{code}
which is inefficient. We should update such call sites to avoid gratuitous
conversions, which could degrade performance when many events are being sent.
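
As a rough illustration of the point, the sketch below uses stand-in types rather than the real Mesos protobufs. The `internal::Event` and `v1::Event` structs, the bodies of `evolve`/`devolve`, and the overloaded `Metrics::incrementEvent` are all assumptions made for the example. It shows one possible way to let a call site pass whichever representation it already holds, so no round trip is needed just to bump a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in message types (assumptions; the real ones are protobuf messages).
namespace internal {
struct Event { std::string type; std::string payload; };
} // namespace internal

namespace v1 {
struct Event { std::string type; std::string payload; };
} // namespace v1

// Stand-in conversions: each call copies the entire message.
v1::Event evolve(const internal::Event& event)
{
  return {event.type, event.payload};
}

internal::Event devolve(const v1::Event& event)
{
  return {event.type, event.payload};
}

// Hypothetical metrics object with one overload per representation.
struct Metrics
{
  std::map<std::string, std::size_t> events;

  void incrementEvent(const internal::Event& event) { ++events[event.type]; }
  void incrementEvent(const v1::Event& event) { ++events[event.type]; }
};
```

With overloads like these, `metrics.incrementEvent(message)` counts the event directly, whereas `metrics.incrementEvent(devolve(evolve(message)))` reaches the same counter after two full copies of the message.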





[jira] [Commented] (MESOS-6175) Add (un)reserved_resources to /metrics/snapshot

2018-09-06 Thread Sunil Shah (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606311#comment-16606311
 ] 

Sunil Shah commented on MESOS-6175:
---

That would be fine! Thanks [~greggomann].

> Add (un)reserved_resources to /metrics/snapshot
> ---
>
> Key: MESOS-6175
> URL: https://issues.apache.org/jira/browse/MESOS-6175
> Project: Mesos
> Issue Type: Bug
> Reporter: Nathan Handler
> Priority: Major
>
> It would be useful to have the total number of (un)reserved resources 
> available from the /metrics/snapshot endpoint. Currently, to get these 
> values, you need to query something like /state-summary, iterate over each 
> slave, and sum up the (un)reserved_resources.
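
The client-side workaround described above amounts to a per-agent loop. A minimal sketch follows, assuming the response has already been parsed into a simplified structure; `AgentSummary`, `ClusterTotals` and `sumAcrossAgents` are hypothetical names, and the real payload is richer than a flat name-to-amount map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of one agent entry: amounts keyed by
// resource name (e.g. "cpus", "mem").
struct AgentSummary
{
  std::map<std::string, double> reserved;
  std::map<std::string, double> unreserved;
};

struct ClusterTotals
{
  std::map<std::string, double> reserved;
  std::map<std::string, double> unreserved;
};

// Sums reserved and unreserved amounts over every agent.
ClusterTotals sumAcrossAgents(const std::vector<AgentSummary>& agents)
{
  ClusterTotals totals;
  for (const AgentSummary& agent : agents) {
    for (const auto& entry : agent.reserved) {
      totals.reserved[entry.first] += entry.second;
    }
    for (const auto& entry : agent.unreserved) {
      totals.unreserved[entry.first] += entry.second;
    }
  }
  return totals;
}
```

Exposing these totals from the metrics endpoint would move this loop to the server and spare every client the full per-agent query.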





[jira] [Issue Comment Deleted] (MESOS-6175) Add (un)reserved_resources to /metrics/snapshot

2018-09-06 Thread Sunil Shah (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Shah updated MESOS-6175:
--
Comment: was deleted

(was: That would be fine! Thanks [~greggomann].)

> Add (un)reserved_resources to /metrics/snapshot
> ---
>
> Key: MESOS-6175
> URL: https://issues.apache.org/jira/browse/MESOS-6175
> Project: Mesos
> Issue Type: Bug
> Reporter: Nathan Handler
> Priority: Major
>
> It would be useful to have the total number of (un)reserved resources 
> available from the /metrics/snapshot endpoint. Currently, to get these 
> values, you need to query something like /state-summary, iterate over each 
> slave, and sum up the (un)reserved_resources.





[jira] [Commented] (MESOS-6175) Add (un)reserved_resources to /metrics/snapshot

2018-09-06 Thread Sunil Shah (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606310#comment-16606310
 ] 

Sunil Shah commented on MESOS-6175:
---

That would be fine! Thanks [~greggomann].

> Add (un)reserved_resources to /metrics/snapshot
> ---
>
> Key: MESOS-6175
> URL: https://issues.apache.org/jira/browse/MESOS-6175
> Project: Mesos
> Issue Type: Bug
> Reporter: Nathan Handler
> Priority: Major
>
> It would be useful to have the total number of (un)reserved resources 
> available from the /metrics/snapshot endpoint. Currently, to get these 
> values, you need to query something like /state-summary, iterate over each 
> slave, and sum up the (un)reserved_resources.





[jira] [Commented] (MESOS-6551) Add attach/exec commands to the Mesos CLI

2018-09-06 Thread Greg Mann (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606299#comment-16606299
 ] 

Greg Mann commented on MESOS-6551:
--

Hey [~ArmandGrillet] [~klueska] I see you guys have been making good progress 
on the parent epic of this ticket - do you have any idea what the timeline 
might be for adding attach/exec support to the Mesos CLI? Sounds like having 
this functionality is blocking Yelp from moving some workloads to the Mesos 
containerizer.

> Add attach/exec commands to the Mesos CLI
> -
>
> Key: MESOS-6551
> URL: https://issues.apache.org/jira/browse/MESOS-6551
> Project: Mesos
> Issue Type: Task
> Components: cli
> Reporter: Kevin Klues
> Assignee: Armand Grillet
> Priority: Major
> Labels: debugging, mesosphere
>
> After all of this support has landed, we need to update the Mesos CLI to
> implement {{attach}} and {{exec}} functionality as outlined in the Design Doc.





[jira] [Assigned] (MESOS-9212) Disable SIGCHILD handling in libev.

2018-09-06 Thread James Peach (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach reassigned MESOS-9212:
--

Assignee: James Peach

| [r/68660|https://reviews.apache.org/r/68660] | Disabled SIGCHLD handling in 
the libev event loop. |

> Disable SIGCHILD handling in libev.
> ---
>
> Key: MESOS-9212
> URL: https://issues.apache.org/jira/browse/MESOS-9212
> Project: Mesos
> Issue Type: Bug
> Reporter: James Peach
> Assignee: James Peach
> Priority: Major
>
> On Fedora 28, building against the system version of libev (version 4.24)
> causes the following tests to fail:
> {noformat}
> [  FAILED  ] ReapTest.NonChildProcess
> [  FAILED  ] ReapTest.ChildProcess
> [  FAILED  ] ReapTest.TerminatedChildProcess
> [  FAILED  ] SubprocessTest.PipeOutputToFileDescriptor
> [  FAILED  ] SubprocessTest.PipeOutputToPath
> [  FAILED  ] SubprocessTest.EnvironmentEcho
> [  FAILED  ] SubprocessTest.Status
> [  FAILED  ] SubprocessTest.PipeOutput
> [  FAILED  ] SubprocessTest.PipeLargeOutput
> [  FAILED  ] SubprocessTest.PipeInput
> [  FAILED  ] SubprocessTest.PipeRedirect
> [  FAILED  ] SubprocessTest.PathOutput
> [  FAILED  ] SubprocessTest.PathInput
> [  FAILED  ] SubprocessTest.FdOutput
> [  FAILED  ] SubprocessTest.FdInput
> [  FAILED  ] SubprocessTest.Default
> [  FAILED  ] SubprocessTest.Flags
> [  FAILED  ] SubprocessTest.Environment
> [  FAILED  ] SubprocessTest.EnvironmentWithSpaces
> [  FAILED  ] SubprocessTest.EnvironmentWithSpacesAndQuotes
> [  FAILED  ] SubprocessTest.EnvironmentOverride
> {noformat}
> This build configuration succeeds:
> {noformat}
> $ ../configure --disable-java --disable-python --enable-silent-rules 
> --disable-hardening --disable-werror --disable-libtool-wrappers 
> --enable-xfs-disk-isolator --enable-install-module-dependencies 
> --enable-port-mapping-isolator --enable-network-ports-isolator 
> --with-protobuf=/usr --with-curl=/usr --with-libarchive=/usr 
> --with-zookeeper=/usr --prefix=/opt/mesos "CXXFLAGS=-O0 -ggdb3 
> -fno-omit-frame-pointer -fvisibility-inlines-hidden 
> -Wno-unused-local-typedefs -Wno-deprecated" "CFLAGS=-O0 -ggdb3 
> -fno-omit-frame-pointer -Wno-unused-local-typedefs -Wno-deprecated" LDFLAGS= 
> CXX=/home/jpeach/src/asf-mesos/build/c++ 
> CC=/home/jpeach/src/asf-mesos/build/cc LD=/home/jpeach/src/asf-mesos/build/ld
> {noformat}
> This build configuration fails:
> {noformat}
>   $ ../configure --disable-java --disable-python --enable-silent-rules 
> --disable-hardening --disable-werror --disable-libtool-wrappers 
> --enable-xfs-disk-isolator --enable-install-module-dependencies 
> --enable-port-mapping-isolator --enable-network-ports-isolator 
> --with-protobuf=/usr --with-curl=/usr --with-libarchive=/usr 
> --with-zookeeper=/usr --prefix=/opt/mesos "CXXFLAGS=-O0 -ggdb3 
> -fno-omit-frame-pointer -fvisibility-inlines-hidden 
> -Wno-unused-local-typedefs -Wno-deprecated" "CFLAGS=-O0 -ggdb3 
> -fno-omit-frame-pointer -Wno-unused-local-typedefs -Wno-deprecated" LDFLAGS= 
> CXX=/home/jpeach/src/asf-mesos/build/c++ 
> CC=/home/jpeach/src/asf-mesos/build/cc LD=/home/jpeach/src/asf-mesos/build/ld 
> --with-libev=/usr
> {noformat}
> I think what happens here is that the child process gets reaped wrongly 
> somehow:
> {noformat}
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from SubprocessTest
> [ RUN  ] SubprocessTest.EnvironmentWithSpaces
> [pid 25909] clone(child_stack=NULL, 
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
> child_tidptr=0x7fa11881fcd0) = 25923
> strace: Process 25923 attached
> [pid 25923] execve("/usr/bin/sh", ["sh", "-c", "echo $MESSAGE"], 0x1ff3950 /* 
> 1 var */) = 0
> [pid 25923] arch_prctl(ARCH_SET_FS, 0x7f24561c5740) = 0
> [pid 25923] exit_group(0)   = ?
> [pid 25923] +++ exited with 0 +++
> [pid 25909] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=25923, 
> si_uid=9306, si_status=0, si_utime=0, si_stime=0} ---
> [pid 25922] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 
> WNOHANG|WSTOPPED|WCONTINUED, NULL) = 25923
> [pid 25922] wait4(-1, 0x7fa10a74da44, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 
> ECHILD (No child processes)
> [pid 25919] wait4(25923, 0x7fa10bf50548, WNOHANG, NULL) = -1 ECHILD (No child 
> processes)
> ../../../3rdparty/libprocess/src/tests/subprocess_tests.cpp:977: Failure
> (s->status()).get() is NONE
> [  FAILED  ] SubprocessTest.EnvironmentWithSpaces (12 ms)
> [--] 1 test from SubprocessTest (12 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (12 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] SubprocessTest.EnvironmentWithSpaces
> {noformat}




[jira] [Commented] (MESOS-9212) Subprocess tests fail with libev 4.24

2018-09-06 Thread James Peach (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606233#comment-16606233
 ] 

James Peach commented on MESOS-9212:


This might be due to the libev patch we are carrying?

{noformat}
[jpeach@jpeach 3rdparty]$ cat libev-4.22.patch
diff --git a/ev.h b/ev.h
index 38f62d8..0055cfd 100644
--- a/ev.h
+++ b/ev.h
@@ -125,7 +125,7 @@ EV_CPP(extern "C" {)
 # ifdef _WIN32
 #  define EV_CHILD_ENABLE 0
 # else
-#  define EV_CHILD_ENABLE EV_FEATURE_WATCHERS
+#  define EV_CHILD_ENABLE 0
 #endif
 #endif
[jpeach@jpeach 3rdparty]$ grep -r EV_CHILD_ENABLE /usr/include/
/usr/include/ev.h:#ifndef EV_CHILD_ENABLE
/usr/include/ev.h:#  define EV_CHILD_ENABLE 0
/usr/include/ev.h:#  define EV_CHILD_ENABLE EV_FEATURE_WATCHERS
/usr/include/ev.h:#if EV_CHILD_ENABLE && !EV_SIGNAL_ENABLE
/usr/include/ev.h:# if EV_CHILD_ENABLE
/usr/include/ev++.h:  #if EV_CHILD_ENABLE
{noformat}

> Subprocess tests fail with libev 4.24
> -
>
> Key: MESOS-9212
> URL: https://issues.apache.org/jira/browse/MESOS-9212
> Project: Mesos
> Issue Type: Bug
> Reporter: James Peach
> Priority: Major

[jira] [Created] (MESOS-9212) Subprocess tests fail with libev 4.24

2018-09-06 Thread James Peach (JIRA)
James Peach created MESOS-9212:
--

 Summary: Subprocess tests fail with libev 4.24
 Key: MESOS-9212
 URL: https://issues.apache.org/jira/browse/MESOS-9212
 Project: Mesos
  Issue Type: Bug
Reporter: James Peach


On Fedora 28, building against the system version of libev (version 4.24)
causes the following tests to fail:
{noformat}
[  FAILED  ] ReapTest.NonChildProcess
[  FAILED  ] ReapTest.ChildProcess
[  FAILED  ] ReapTest.TerminatedChildProcess
[  FAILED  ] SubprocessTest.PipeOutputToFileDescriptor
[  FAILED  ] SubprocessTest.PipeOutputToPath
[  FAILED  ] SubprocessTest.EnvironmentEcho
[  FAILED  ] SubprocessTest.Status
[  FAILED  ] SubprocessTest.PipeOutput
[  FAILED  ] SubprocessTest.PipeLargeOutput
[  FAILED  ] SubprocessTest.PipeInput
[  FAILED  ] SubprocessTest.PipeRedirect
[  FAILED  ] SubprocessTest.PathOutput
[  FAILED  ] SubprocessTest.PathInput
[  FAILED  ] SubprocessTest.FdOutput
[  FAILED  ] SubprocessTest.FdInput
[  FAILED  ] SubprocessTest.Default
[  FAILED  ] SubprocessTest.Flags
[  FAILED  ] SubprocessTest.Environment
[  FAILED  ] SubprocessTest.EnvironmentWithSpaces
[  FAILED  ] SubprocessTest.EnvironmentWithSpacesAndQuotes
[  FAILED  ] SubprocessTest.EnvironmentOverride
{noformat}

This build configuration succeeds:
{noformat}
$ ../configure --disable-java --disable-python --enable-silent-rules 
--disable-hardening --disable-werror --disable-libtool-wrappers 
--enable-xfs-disk-isolator --enable-install-module-dependencies 
--enable-port-mapping-isolator --enable-network-ports-isolator 
--with-protobuf=/usr --with-curl=/usr --with-libarchive=/usr 
--with-zookeeper=/usr --prefix=/opt/mesos "CXXFLAGS=-O0 -ggdb3 
-fno-omit-frame-pointer -fvisibility-inlines-hidden -Wno-unused-local-typedefs 
-Wno-deprecated" "CFLAGS=-O0 -ggdb3 -fno-omit-frame-pointer 
-Wno-unused-local-typedefs -Wno-deprecated" LDFLAGS= 
CXX=/home/jpeach/src/asf-mesos/build/c++ CC=/home/jpeach/src/asf-mesos/build/cc 
LD=/home/jpeach/src/asf-mesos/build/ld
{noformat}

This build configuration fails:

{noformat}
  $ ../configure --disable-java --disable-python --enable-silent-rules 
--disable-hardening --disable-werror --disable-libtool-wrappers 
--enable-xfs-disk-isolator --enable-install-module-dependencies 
--enable-port-mapping-isolator --enable-network-ports-isolator 
--with-protobuf=/usr --with-curl=/usr --with-libarchive=/usr 
--with-zookeeper=/usr --prefix=/opt/mesos "CXXFLAGS=-O0 -ggdb3 
-fno-omit-frame-pointer -fvisibility-inlines-hidden -Wno-unused-local-typedefs 
-Wno-deprecated" "CFLAGS=-O0 -ggdb3 -fno-omit-frame-pointer 
-Wno-unused-local-typedefs -Wno-deprecated" LDFLAGS= 
CXX=/home/jpeach/src/asf-mesos/build/c++ CC=/home/jpeach/src/asf-mesos/build/cc 
LD=/home/jpeach/src/asf-mesos/build/ld --with-libev=/usr
{noformat}

I think what happens here is that the child process gets reaped wrongly somehow:
{noformat}
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from SubprocessTest
[ RUN  ] SubprocessTest.EnvironmentWithSpaces
[pid 25909] clone(child_stack=NULL, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7fa11881fcd0) = 25923
strace: Process 25923 attached
[pid 25923] execve("/usr/bin/sh", ["sh", "-c", "echo $MESSAGE"], 0x1ff3950 /* 1 
var */) = 0
[pid 25923] arch_prctl(ARCH_SET_FS, 0x7f24561c5740) = 0
[pid 25923] exit_group(0)   = ?
[pid 25923] +++ exited with 0 +++
[pid 25909] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=25923, 
si_uid=9306, si_status=0, si_utime=0, si_stime=0} ---
[pid 25922] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 
WNOHANG|WSTOPPED|WCONTINUED, NULL) = 25923
[pid 25922] wait4(-1, 0x7fa10a74da44, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 
ECHILD (No child processes)
[pid 25919] wait4(25923, 0x7fa10bf50548, WNOHANG, NULL) = -1 ECHILD (No child 
processes)
../../../3rdparty/libprocess/src/tests/subprocess_tests.cpp:977: Failure
(s->status()).get() is NONE
[  FAILED  ] SubprocessTest.EnvironmentWithSpaces (12 ms)
[--] 1 test from SubprocessTest (12 ms total)

[--] Global test environment tear-down
[==] 1 test from 1 test case ran. (12 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] SubprocessTest.EnvironmentWithSpaces
{noformat}
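
The pattern visible in the trace, where a wildcard `wait4(-1, ...)` collects the child and a later wait on the specific PID fails with `ECHILD`, can be reproduced in isolation. The sketch below is only an illustration under simplifying assumptions: `reapTwice` is a hypothetical helper, and it uses a blocking wildcard wait in a single thread to stay deterministic, whereas the trace shows non-blocking waits issued from different threads.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that exits immediately, reaps it with a wildcard wait,
// then waits again on the specific PID. Returns the errno seen by the
// second wait, 0 if the second wait unexpectedly succeeded, or -1 if
// the setup itself failed.
int reapTwice()
{
  pid_t child = fork();
  if (child < 0) {
    return -1;
  }
  if (child == 0) {
    _exit(0); // Child: exit without running any parent cleanup.
  }

  int status = 0;

  // First waiter: a wildcard wait collects the exit status of any child.
  pid_t reaped = waitpid(-1, &status, 0);
  if (reaped != child) {
    return -1;
  }

  // Second waiter: the child is already gone, so there is nothing to wait for.
  errno = 0;
  pid_t again = waitpid(child, &status, WNOHANG);
  return again == -1 ? errno : 0;
}
```

Under these assumptions `reapTwice()` returns `ECHILD`, the same error that the targeted `wait4(25923, ...)` call reports in the trace.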





[jira] [Created] (MESOS-9211) We need to add new commands to make the CLI useful for devops

2018-09-06 Thread Armand Grillet (JIRA)
Armand Grillet created MESOS-9211:
-

 Summary: We need to add new commands to make the CLI useful for 
devops
 Key: MESOS-9211
 URL: https://issues.apache.org/jira/browse/MESOS-9211
 Project: Mesos
  Issue Type: Epic
Reporter: Armand Grillet








[jira] [Assigned] (MESOS-9194) Extend request batching to '/roles' endpoint

2018-09-06 Thread Alexander Rukletsov (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov reassigned MESOS-9194:
--

 Assignee: Benno Evers
   Sprint: Mesosphere Sprint 2018-28
 Story Points: 3
   Labels: mesosphere  (was: )
Fix Version/s: 1.8.0

> Extend request batching to '/roles' endpoint
> 
>
> Key: MESOS-9194
> URL: https://issues.apache.org/jira/browse/MESOS-9194
> Project: Mesos
> Issue Type: Bug
> Reporter: Benno Evers
> Assignee: Benno Evers
> Priority: Major
> Labels: mesosphere
> Fix For: 1.8.0
>
>
> For consistency and improved performance under load, the `/roles` endpoint
> should use the same request batching mechanism as `/state`, `/tasks`, ...





[jira] [Comment Edited] (MESOS-9116) Launch nested container session fails due to incorrect detection of `mnt` namespace of command executor's task.

2018-09-06 Thread Alexander Rukletsov (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586014#comment-16586014
 ] 

Alexander Rukletsov edited comment on MESOS-9116 at 9/6/18 11:44 AM:
-

{noformat}
commit d95a16e03d27a2b6575148183e53a3b4507a16c1
Author: Andrei Budnik 
AuthorDate: Mon Aug 20 16:22:33 2018 +0200
Commit: Alexander Rukletsov 
CommitDate: Mon Aug 20 16:22:33 2018 +0200

Added `LaunchNestedContainerSessionInParallel` test.

This patch adds a test which verifies that launching multiple
short-lived nested container sessions succeeds. This test
implicitly verifies that the agent correctly detects `mnt` namespace
of a command executor's task. If the detection fails, the
containerizer launcher (aka `nanny`) process fails to enter `mnt`
namespace, so it prints an error message into stderr for this
nested container.

This test is disabled until we fix MESOS-8545.

Review: https://reviews.apache.org/r/68256/
{noformat}
{noformat}
commit e78f636d84f2709da17275f7d70265520c0f4f94
Author: Andrei Budnik 
AuthorDate: Mon Aug 20 16:28:31 2018 +0200
Commit: Alexander Rukletsov 
CommitDate: Mon Aug 20 16:28:31 2018 +0200

Fixed incorrect `mnt` namespace detection of command executor's task.

Previously, we were walking the process tree from the container's
`init` process to find the first process along the way whose `mnt`
namespace differs from the `init` process. We expected this algorithm
to always return the PID of the command executor's task.

However, if someone launches multiple nested containers within the
process tree, the aforementioned algorithm might detect the PID of
one of those nested containers instead of the command executor's task.
Even though the `mnt` namespace will be the same across all these
candidates, the detected PID might belong to a short-lived container,
which might terminate before the containerizer launcher (aka `nanny`
process) tries to enter its `mnt` namespace.

This patch fixes the detection algorithm so that it always returns
the PID of the command executor's task.

Review: https://reviews.apache.org/r/68257/
{noformat}
{noformat}
commit 31499a5dc1de29fa2178e6ea9e5398d8c668a933
Author: Andrei Budnik 
AuthorDate: Mon Aug 20 16:28:38 2018 +0200
Commit: Alexander Rukletsov 
CommitDate: Mon Aug 20 16:28:38 2018 +0200

Added `ROOT_CGROUPS_LaunchNestedDebugAfterUnshareMntNamespace` test.

This test verifies detection of task's `mnt` namespace for a debug
nested container. Debug nested container must enter `mnt` namespace
of the task, so the agent tries to detect task's `mnt` namespace.
This test launches a long-running task which runs a subtask that
unshares `mnt` namespace. The structure of the resulting process tree
is similar to the process tree of the command executor (the task of
the command executor unshares `mnt` ns):

  0. root (aka "nanny"/"launcher" process) [root `mnt` namespace]
    1. task: sleep 1000 [root `mnt` namespace]
      2. subtask: sleep 1000 [subtask's `mnt` namespace]

We expect that the agent detects task's `mnt` namespace.

Review: https://reviews.apache.org/r/68408/
{noformat}
{noformat}
commit b3c9c6939964831170e819f88134af7b275ffe1b
Author: Andrei Budnik 
AuthorDate: Mon Aug 20 16:28:44 2018 +0200
Commit: Alexander Rukletsov 
CommitDate: Mon Aug 20 16:28:44 2018 +0200

Fixed wrong `mnt` namespace detection for non-command executor tasks.

Previously, we were calling `getMountNamespaceTarget()` not only in
case of the command executor but in all other cases too, including
the default executor. That might lead to various subtle bugs, caused by
wrong detection of `mnt` namespace target. This patch fixes the issue
by setting a parent PID as `mnt` namespace target in case of
non-command executor task.

Review: https://reviews.apache.org/r/68348/
{noformat}
{noformat}
commit 52be35f47caea2712a0b13d7f963f7236533a2f1
Author: Andrei Budnik 
AuthorDate: Thu Sep 6 13:41:06 2018 +0200
Commit: Alexander Rukletsov 
CommitDate: Thu Sep 6 13:41:06 2018 +0200

Fixed `LaunchNestedContainerSessionsInParallel` test.

Previously, we sent `ATTACH_CONTAINER_OUTPUT` to attach to a
short-lived nested container. An attempt to attach to a terminated
nested container leads to HTTP 500 error. This patch gets rid of
`ATTACH_CONTAINER_OUTPUT` in favor of `LAUNCH_NESTED_CONTAINER_SESSION`
so that we can read the container's output without using an extra call.

Review: https://reviews.apache.org/r/68236/
{noformat}
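
The ambiguity described in the second commit message above can be shown on a toy process tree. This is a sketch of the previous approach as the message describes it, not the agent's code and not the fix: `Proc` and `firstWithDifferentMntNs` are hypothetical, namespaces are plain integers, and the breadth-first visiting order is an assumption.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <vector>

// Hypothetical process record: an integer stands in for the `mnt`
// namespace identity, and children are listed by PID.
struct Proc
{
  int mntNs;
  std::vector<int> children;
};

// Walks the tree from `initPid` and returns the first PID whose `mnt`
// namespace differs from that of `initPid`, or -1 if there is none.
int firstWithDifferentMntNs(const std::map<int, Proc>& procs, int initPid)
{
  auto init = procs.find(initPid);
  if (init == procs.end()) {
    return -1;
  }

  std::deque<int> pending{initPid};
  while (!pending.empty()) {
    int pid = pending.front();
    pending.pop_front();

    auto it = procs.find(pid);
    if (it == procs.end()) {
      continue; // Process vanished between listing and lookup.
    }
    if (it->second.mntNs != init->second.mntNs) {
      return pid;
    }
    for (int child : it->second.children) {
      pending.push_back(child);
    }
  }
  return -1;
}
```

If the init process has two children in the same non-root `mnt` namespace, say a short-lived nested container and the long-running task, the walk returns whichever it visits first. Nothing in the walk prefers the task, which is the failure mode the commit message describes.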


was (Author: alexr):
{noformat}
commit d95a16e03d27a2b6575148183e53a3b4507a16c1
Author: Andrei Budnik 
AuthorDate: Mon Aug 20 16:22:33 2018 +0200
Commit: