[jira] [Updated] (MESOS-8090) Mesos 1.4.0 crashes with 1.3.x agent with oversubscription

2017-10-13 Thread Zhitao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li updated MESOS-8090:
-
Description: 
We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from an 
oversubscription-enabled agent running 1.3.1 code.

The crash line is:

{code:none}
resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19
{code}

Stack trace in gdb:

{panel:title=My title}
#0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f22f3554448 in __GI_abort () at abort.c:89
#2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
src/utilities.cc:147
#3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
#4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) 
at src/logging.cc:1412
#5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
src/logging.cc:1281
#6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
(this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
#7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
/mesos/src/common/resources.cpp:1051
#8  0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty 
(this=this@entry=0x7f22e713d2e0) at /mesos/src/common/resources.cpp:1173
#9  0x7f22f527e20c in mesos::Resources::add (this=0x7f22e713d400, that=...) 
at /mesos/src/common/resources.cpp:1993
#10 0x7f22f527f860 in mesos::Resources::operator+= 
(this=this@entry=0x7f22e713d400, that=...) at 
/mesos/src/common/resources.cpp:2016
#11 0x7f22f527f91d in mesos::Resources::operator+= 
(this=this@entry=0x7f22e713d400, that=...) at 
/mesos/src/common/resources.cpp:2025
#12 0x7f22f527fa4b in mesos::Resources::Resources (this=0x7f22e713d400, 
_resources=...) at /mesos/src/common/resources.cpp:1277
#13 0x7f22f548b812 in mesos::internal::master::Master::updateSlave 
(this=0x558137bbae70, message=...) at /mesos/src/master/master.cpp:6681
#14 0x7f22f550adc1 in 
ProtobufProcess::_handlerM
 (t=0x558137bbae70, method=
(void (mesos::internal::master::Master::*)(mesos::internal::master::Master 
* const, const mesos::internal::UpdateSlaveMessage &)) 0x7f22f548b6d0 
, 
data="\n)\n'07ba28cc-d9fa-44fb-8d6b-f8c5c90f8a90-S1\022\030\n\004cpus\020\000\032\t\t\000\000\000\000\000\000\063@2\001*J")
at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:799
#15 0x7f22f54c8791 in 
ProtobufProcess::visit (this=0x558137bbae70, 
event=...) at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:104
#16 0x7f22f54572d4 in mesos::internal::master::Master::_visit 
(this=this@entry=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1643
#17 0x7f22f547014d in mesos::internal::master::Master::visit 
(this=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1575
#18 0x7f22f60b7169 in serve (event=..., this=0x558137bbbf28) at 
/mesos/3rdparty/libprocess/include/process/process.hpp:87
#19 process::ProcessManager::resume (this=<optimized out>, 
process=0x558137bbbf28) at /mesos/3rdparty/libprocess/src/process.cpp:3346
#20 0x7f22f60bd056 in operator() (__closure=0x558137aa3218) at 
/mesos/3rdparty/libprocess/src/process.cpp:2881
#21 _M_invoke<> (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1700
#22 operator() (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1688
#23 
std::thread::_Impl()>
 >::_M_run(void) (this=0x558137aa3200) at /usr/include/c++/4.9/thread:115
#24 0x7f22f40b3970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#25 0x7f22f38d1064 in start_thread (arg=0x7f22e713e700) at 
pthread_create.c:309
#26 0x7f22f360662d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
{panel}



  was:
We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from an 
oversubscription-enabled agent running 1.3.1 code.

The crash line is:


{panel:title=My title}
resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19
{panel}

Stack trace in gdb:

{panel:title=My title}
#0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f22f3554448 in __GI_abort () at abort.c:89
#2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
src/utilities.cc:147
#3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
#4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) 
at src/logging.cc:1412
#5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
src/logging.cc:1281
#6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
(this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
#7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
/mesos/src/common/resources.cpp:1051
#8  0x7f22f527e1e5 in 

[jira] [Updated] (MESOS-8090) Mesos 1.4.0 crashes with 1.3.x agent with oversubscription

2017-10-13 Thread Zhitao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li updated MESOS-8090:
-
Description: 
We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from an 
oversubscription-enabled agent running 1.3.1 code.

The crash line is:


{panel:title=My title}
resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19
{panel}

Stack trace in gdb:

{panel:title=My title}
#0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f22f3554448 in __GI_abort () at abort.c:89
#2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
src/utilities.cc:147
#3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
#4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) 
at src/logging.cc:1412
#5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
src/logging.cc:1281
#6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
(this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
#7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
/mesos/src/common/resources.cpp:1051
#8  0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty 
(this=this@entry=0x7f22e713d2e0) at /mesos/src/common/resources.cpp:1173
#9  0x7f22f527e20c in mesos::Resources::add (this=0x7f22e713d400, that=...) 
at /mesos/src/common/resources.cpp:1993
#10 0x7f22f527f860 in mesos::Resources::operator+= 
(this=this@entry=0x7f22e713d400, that=...) at 
/mesos/src/common/resources.cpp:2016
#11 0x7f22f527f91d in mesos::Resources::operator+= 
(this=this@entry=0x7f22e713d400, that=...) at 
/mesos/src/common/resources.cpp:2025
#12 0x7f22f527fa4b in mesos::Resources::Resources (this=0x7f22e713d400, 
_resources=...) at /mesos/src/common/resources.cpp:1277
#13 0x7f22f548b812 in mesos::internal::master::Master::updateSlave 
(this=0x558137bbae70, message=...) at /mesos/src/master/master.cpp:6681
#14 0x7f22f550adc1 in 
ProtobufProcess::_handlerM
 (t=0x558137bbae70, method=
(void (mesos::internal::master::Master::*)(mesos::internal::master::Master 
* const, const mesos::internal::UpdateSlaveMessage &)) 0x7f22f548b6d0 
, 
data="\n)\n'07ba28cc-d9fa-44fb-8d6b-f8c5c90f8a90-S1\022\030\n\004cpus\020\000\032\t\t\000\000\000\000\000\000\063@2\001*J")
at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:799
#15 0x7f22f54c8791 in 
ProtobufProcess::visit (this=0x558137bbae70, 
event=...) at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:104
#16 0x7f22f54572d4 in mesos::internal::master::Master::_visit 
(this=this@entry=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1643
#17 0x7f22f547014d in mesos::internal::master::Master::visit 
(this=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1575
#18 0x7f22f60b7169 in serve (event=..., this=0x558137bbbf28) at 
/mesos/3rdparty/libprocess/include/process/process.hpp:87
#19 process::ProcessManager::resume (this=<optimized out>, 
process=0x558137bbbf28) at /mesos/3rdparty/libprocess/src/process.cpp:3346
#20 0x7f22f60bd056 in operator() (__closure=0x558137aa3218) at 
/mesos/3rdparty/libprocess/src/process.cpp:2881
#21 _M_invoke<> (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1700
#22 operator() (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1688
#23 
std::thread::_Impl()>
 >::_M_run(void) (this=0x558137aa3200) at /usr/include/c++/4.9/thread:115
#24 0x7f22f40b3970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#25 0x7f22f38d1064 in start_thread (arg=0x7f22e713e700) at 
pthread_create.c:309
#26 0x7f22f360662d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
{panel}



  was:
We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from an 
oversubscription-enabled agent running 1.3.1 code.

The crash line is:

resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19

Stack trace in gdb:

{panel:title=My title}
#0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f22f3554448 in __GI_abort () at abort.c:89
#2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
src/utilities.cc:147
#3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
#4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) 
at src/logging.cc:1412
#5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
src/logging.cc:1281
#6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
(this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
#7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
/mesos/src/common/resources.cpp:1051
#8  0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty 

[jira] [Commented] (MESOS-7935) CMake build should fail immediately for in-source builds

2017-10-13 Thread Nathan Jackson (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204424#comment-16204424
 ] 

Nathan Jackson commented on MESOS-7935:
---

[~kaysoky] I submitted a review. I tried to add [~milipili], but I couldn't find 
him on Review Board.

Using this:

{code}
set(CMAKE_DISABLE_SOURCE_CHANGES ON)
set(CMAKE_DISABLE_IN_SOURCE_BUILD ON)
{code}

I get the following error:

{code}
CMake Error at /usr/share/cmake/Modules/CMakeDetermineSystem.cmake:174 (file):
  file attempted to write a file: /data/mesos/CMakeFiles/CMakeOutput.log into
  a source directory.
Call Stack (most recent call first):
  CMakeLists.txt:33 (project)
{code}

I think this error message makes more sense:

{code}
CMake Error at CMakeLists.txt:28 (message):
  In-source builds are not supported.
{code}

Something to consider as well.

> CMake build should fail immediately for in-source builds
> 
>
> Key: MESOS-7935
> URL: https://issues.apache.org/jira/browse/MESOS-7935
> Project: Mesos
>  Issue Type: Improvement
>  Components: cmake
> Environment: macOS 10.12
> GNU/Linux Debian Stretch
>Reporter: Damien Gerard
>Assignee: Nathan Jackson
>  Labels: build
>
> In-source builds are neither recommended nor supported.  It is simple enough 
> to add a check to fail the build immediately.
> ---
> In-source build of master branch was broken with:
> {noformat}
> cd /Users/damien.gerard/projects/acp/mesos/src && 
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
>   -DBUILD_FLAGS=\"\" -DBUILD_JAVA_JVM_LIBRARY=\"\" -DHAS_AUTHENTICATION=1 
> -DLIBDIR=\"/usr/local/libmesos\" -DPICOJSON_USE_INT64 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" -DUSE_CMAKE_BUILD_CONFIG 
> -DUSE_STATIC_LIB -DVERSION=\"1.4.0\" -D__STDC_FORMAT_MACROS 
> -Dmesos_1_4_0_EXPORTS -I/Users/damien.gerard/projects/acp/mesos/include 
> -I/Users/damien.gerard/projects/acp/mesos/include/mesos 
> -I/Users/damien.gerard/projects/acp/mesos/src -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/protobuf-3.3.0/src/protobuf-3.3.0-lib/lib/include
>  -isystem /Users/damien.gerard/projects/acp/mesos/3rdparty/libprocess/include 
> -isystem /usr/local/opt/apr/libexec/include/apr-1 -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/boost-1.53.0/src/boost-1.53.0
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/elfio-3.2/src/elfio-3.2 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/glog-0.3.3/src/glog-0.3.3-lib/lib/include
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/nvml-352.79/src/nvml-352.79 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/picojson-1.3.0/src/picojson-1.3.0
>  -isystem /usr/local/include/subversion-1 -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/stout/include -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/http_parser-2.6.2/src/http_parser-2.6.2
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/concurrentqueue-1.0.0-beta/src/concurrentqueue-1.0.0-beta
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/libev-4.22/src/libev-4.22 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/zookeeper-3.4.8/src/zookeeper-3.4.8/src/c/include
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/zookeeper-3.4.8/src/zookeeper-3.4.8/src/c/generated
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/leveldb-1.19/src/leveldb-1.19/include
>   -std=c++11 -fPIC   -o 
> CMakeFiles/mesos-1.4.0.dir/slave/containerizer/mesos/provisioner/backends/copy.cpp.o
>  -c 
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/appc/store.cpp:132:46:
>  error: no member named 'fetcher' in namespace 'mesos::uri'; did you mean 
> 'Fetcher'?
>   Try uriFetcher = uri::fetcher::create();
> ~^~~
>  Fetcher
> /Users/damien.gerard/projects/acp/mesos/include/mesos/uri/fetcher.hpp:46:7: 
> note: 'Fetcher' declared here
> class Fetcher
>   ^
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/appc/store.cpp:132:55:
>  error: no member named 'create' in 'mesos::uri::Fetcher'
>   Try uriFetcher = uri::fetcher::create();
> {noformat}
> Both Linux & macOS, not tested elsewhere, on {{master}} and tag 1.4.0-rc3



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (MESOS-7504) Parent's mount namespace cannot be determined when launching a nested container.

2017-10-13 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204236#comment-16204236
 ] 

Jie Yu commented on MESOS-7504:
---

Sounds like we should make the `getMountNamespaceTarget` function more robust. I 
think it doesn't account for the pre-exec command, which is also a 2nd-level 
process (and most likely short-running).

I think the algorithm can be: search two levels as we do right now, but ignore 
errors from failing to get a mount namespace. If, in the end, we cannot find 
one, return an error. Otherwise, return the new mount namespace.
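
The algorithm above can be sketched roughly as follows. This is a minimal, 
self-contained illustration and not the Mesos implementation: `ProcessTable`, 
`findMountNamespaceTarget`, and `NOT_FOUND` are hypothetical names, the process 
tree is simulated with in-memory maps instead of being read from `/proc`, and 
treating a "new" mount namespace as "one that differs from the parent's" is an 
assumption.

```cpp
#include <map>
#include <vector>

// Hypothetical stand-in for state that would normally come from /proc.
struct ProcessTable {
  std::map<int, std::vector<int>> children;     // pid -> child pids
  std::map<int, unsigned long> mountNamespace;  // pid -> mount namespace id

  // Returns false when the namespace cannot be read, for example because
  // a short-running pre-exec command has already exited.
  bool getMountNamespace(int pid, unsigned long* ns) const {
    auto it = mountNamespace.find(pid);
    if (it == mountNamespace.end()) {
      return false;
    }
    *ns = it->second;
    return true;
  }

  std::vector<int> getChildren(int pid) const {
    auto it = children.find(pid);
    return it == children.end() ? std::vector<int>() : it->second;
  }
};

// Sentinel used by this sketch to model the "return error" case.
const int NOT_FOUND = -1;

// Searches the children and then the grandchildren of `parent` for a
// process whose mount namespace differs from the parent's. A failed
// lookup on an individual process is ignored; only finding no candidate
// at all is reported as an error.
int findMountNamespaceTarget(const ProcessTable& table, int parent) {
  unsigned long parentNs = 0;
  if (!table.getMountNamespace(parent, &parentNs)) {
    return NOT_FOUND;
  }

  const std::vector<int> children = table.getChildren(parent);

  // Level 1: direct children.
  for (int child : children) {
    unsigned long ns = 0;
    if (!table.getMountNamespace(child, &ns)) {
      continue;  // Ignore the failure and keep searching.
    }
    if (ns != parentNs) {
      return child;
    }
  }

  // Level 2: grandchildren.
  for (int child : children) {
    for (int grandchild : table.getChildren(child)) {
      unsigned long ns = 0;
      if (!table.getMountNamespace(grandchild, &ns)) {
        continue;  // Ignore the failure and keep searching.
      }
      if (ns != parentNs) {
        return grandchild;
      }
    }
  }

  return NOT_FOUND;
}
```

In this sketch a child whose namespace cannot be read no longer aborts the 
search; the function fails only if neither level yields a candidate.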

> Parent's mount namespace cannot be determined when launching a nested 
> container.
> 
>
> Key: MESOS-7504
> URL: https://issues.apache.org/jira/browse/MESOS-7504
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.3.0
> Environment: Ubuntu 16.04
>Reporter: Alexander Rukletsov
>Assignee: Andrei Budnik
>  Labels: containerizer, flaky-test, mesosphere
>
> I've observed this failure twice in different Linux environments. Here is an 
> example of such failure:
> {noformat}
> [ RUN  ] 
> NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover
> I0509 21:53:25.471657 17167 containerizer.cpp:221] Using isolation: 
> cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image
> I0509 21:53:25.475124 17167 linux_launcher.cpp:150] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I0509 21:53:25.475407 17167 provisioner.cpp:249] Using default backend 
> 'overlay'
> I0509 21:53:25.481232 17186 containerizer.cpp:608] Recovering containerizer
> I0509 21:53:25.482295 17186 provisioner.cpp:410] Provisioner recovery complete
> I0509 21:53:25.482587 17187 containerizer.cpp:1001] Starting container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d for executor 'executor' of framework 
> I0509 21:53:25.482918 17189 cgroups.cpp:410] Creating cgroup at 
> '/sys/fs/cgroup/cpu,cpuacct/mesos_test_d989f526-efe0-4553-bf79-936ad66c3753/21bc372c-0f2c-49f5-b8ab-8d32c232b95d'
>  for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d
> I0509 21:53:25.484103 17190 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus 
> 1) for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d
> I0509 21:53:25.484808 17186 containerizer.cpp:1524] Launching 
> 'mesos-containerizer' with flags '--help="false" 
> --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep
>  
> 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/ubuntu\/workspace\/mesos\/Mesos_CI-build\/FLAG\/SSL\/label\/mesos-ec2-ubuntu-16.04\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
>  -n -t proc proc \/proc -o 
> nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}"
>  --pipe_read="29" --pipe_write="32" 
> --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_sKhtj7/containers/21bc372c-0f2c-49f5-b8ab-8d32c232b95d"
>  --unshare_namespace_mnt="false"'
> I0509 21:53:25.484978 17189 linux_launcher.cpp:429] Launching container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d and cloning with namespaces CLONE_NEWNS 
> | CLONE_NEWPID
> I0509 21:53:25.513890 17186 containerizer.cpp:1623] Checkpointing container's 
> forked pid 1873 to 
> '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_Rdjw6M/meta/slaves/frameworks/executors/executor/runs/21bc372c-0f2c-49f5-b8ab-8d32c232b95d/pids/forked.pid'
> I0509 21:53:25.515878 17190 fetcher.cpp:353] Starting to fetch URIs for 
> container: 21bc372c-0f2c-49f5-b8ab-8d32c232b95d, directory: 
> /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr
> I0509 21:53:25.517715 17193 containerizer.cpp:1791] Starting nested container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.518569 17193 switchboard.cpp:545] Launching 
> 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" 
> --help="false" 
> --socket_address="/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b"
>  --stderr_from_fd="36" --stderr_to_fd="2" --stdin_to_fd="32" 
> --stdout_from_fd="33" --stdout_to_fd="1" --tty="false" 
> --wait_for_connection="true"' for container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.521229 17193 switchboard.cpp:575] Created I/O switchboard 
> server (pid: 1881) listening on socket 

[jira] [Updated] (MESOS-8091) Allow the KillPolicy to specify a signal

2017-10-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-8091:
-
Story Points: 3  (was: 1)
 Description: As specified in the design doc of MESOS-7951, the default 
executor should be updated to allow the framework to specify a particular 
signal to be used when initiating task termination.  (was: The {{KillPolicy}} 
protobuf message should be updated to match the design doc of MESOS-7951.)
 Summary: Allow the KillPolicy to specify a signal  (was: Update the 
KillPolicy protobuf message)

> Allow the KillPolicy to specify a signal
> 
>
> Key: MESOS-8091
> URL: https://issues.apache.org/jira/browse/MESOS-8091
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Greg Mann
>  Labels: mesosphere
>
> As specified in the design doc of MESOS-7951, the default executor should be 
> updated to allow the framework to specify a particular signal to be used when 
> initiating task termination.





[jira] [Created] (MESOS-8092) Allow the KillPolicy to specify a command

2017-10-13 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8092:


 Summary: Allow the KillPolicy to specify a command
 Key: MESOS-8092
 URL: https://issues.apache.org/jira/browse/MESOS-8092
 Project: Mesos
  Issue Type: Improvement
Reporter: Greg Mann


As specified in the design doc of MESOS-7951, the default executor should be 
extended to allow the specification of a command in the {{KillPolicy}}.





[jira] [Created] (MESOS-8091) Update the KillPolicy protobuf message

2017-10-13 Thread Greg Mann (JIRA)
Greg Mann created MESOS-8091:


 Summary: Update the KillPolicy protobuf message
 Key: MESOS-8091
 URL: https://issues.apache.org/jira/browse/MESOS-8091
 Project: Mesos
  Issue Type: Improvement
Reporter: Greg Mann


The {{KillPolicy}} protobuf message should be updated to match the design doc 
of MESOS-7951.





[jira] [Created] (MESOS-8090) Mesos 1.4.0 crashes with 1.3.x agent with oversubscription

2017-10-13 Thread Zhitao Li (JIRA)
Zhitao Li created MESOS-8090:


 Summary: Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
 Key: MESOS-8090
 URL: https://issues.apache.org/jira/browse/MESOS-8090
 Project: Mesos
  Issue Type: Bug
  Components: master, oversubscription
Reporter: Zhitao Li
Assignee: Michael Park


We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from an 
oversubscription-enabled agent running 1.3.1 code.

The crash line is:

resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19

Stack trace in gdb:

{panel:title=My title}
#0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f22f3554448 in __GI_abort () at abort.c:89
#2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
src/utilities.cc:147
#3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
#4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) 
at src/logging.cc:1412
#5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
src/logging.cc:1281
#6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
(this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
#7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
/mesos/src/common/resources.cpp:1051
#8  0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty 
(this=this@entry=0x7f22e713d2e0) at /mesos/src/common/resources.cpp:1173
#9  0x7f22f527e20c in mesos::Resources::add (this=0x7f22e713d400, that=...) 
at /mesos/src/common/resources.cpp:1993
#10 0x7f22f527f860 in mesos::Resources::operator+= 
(this=this@entry=0x7f22e713d400, that=...) at 
/mesos/src/common/resources.cpp:2016
#11 0x7f22f527f91d in mesos::Resources::operator+= 
(this=this@entry=0x7f22e713d400, that=...) at 
/mesos/src/common/resources.cpp:2025
#12 0x7f22f527fa4b in mesos::Resources::Resources (this=0x7f22e713d400, 
_resources=...) at /mesos/src/common/resources.cpp:1277
#13 0x7f22f548b812 in mesos::internal::master::Master::updateSlave 
(this=0x558137bbae70, message=...) at /mesos/src/master/master.cpp:6681
#14 0x7f22f550adc1 in 
ProtobufProcess::_handlerM
 (t=0x558137bbae70, method=
(void (mesos::internal::master::Master::*)(mesos::internal::master::Master 
* const, const mesos::internal::UpdateSlaveMessage &)) 0x7f22f548b6d0 
, 
data="\n)\n'07ba28cc-d9fa-44fb-8d6b-f8c5c90f8a90-S1\022\030\n\004cpus\020\000\032\t\t\000\000\000\000\000\000\063@2\001*J")
at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:799
#15 0x7f22f54c8791 in 
ProtobufProcess::visit (this=0x558137bbae70, 
event=...) at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:104
#16 0x7f22f54572d4 in mesos::internal::master::Master::_visit 
(this=this@entry=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1643
#17 0x7f22f547014d in mesos::internal::master::Master::visit 
(this=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1575
#18 0x7f22f60b7169 in serve (event=..., this=0x558137bbbf28) at 
/mesos/3rdparty/libprocess/include/process/process.hpp:87
#19 process::ProcessManager::resume (this=<optimized out>, 
process=0x558137bbbf28) at /mesos/3rdparty/libprocess/src/process.cpp:3346
#20 0x7f22f60bd056 in operator() (__closure=0x558137aa3218) at 
/mesos/3rdparty/libprocess/src/process.cpp:2881
#21 _M_invoke<> (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1700
#22 operator() (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1688
#23 
std::thread::_Impl()>
 >::_M_run(void) (this=0x558137aa3200) at /usr/include/c++/4.9/thread:115
#24 0x7f22f40b3970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#25 0x7f22f38d1064 in start_thread (arg=0x7f22e713e700) at 
pthread_create.c:309
#26 0x7f22f360662d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
{panel}
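
One way to cross-check the trace is to read the `data` string in frame #14 as 
protobuf wire format. This is an interpretation of the gdb output, not 
something stated in the report. gdb prints bytes as octal escapes, so `\063` is 
0x33 and `@` is 0x40; the eight bytes that follow the second `\t` would then be 
a little-endian double. The sketch below decodes them; `decodeLittleEndianDouble` 
is a hypothetical helper name, and it assumes the host stores doubles in IEEE 754 
format with the same byte order as 64-bit integers.

```cpp
#include <cstdint>
#include <cstring>

// Decodes 8 bytes, least significant byte first, as an IEEE 754 double.
// This is the layout protobuf uses for a `double` field on the wire.
double decodeLittleEndianDouble(const unsigned char bytes[8]) {
  std::uint64_t bits = 0;
  for (int i = 7; i >= 0; --i) {
    bits = (bits << 8) | bytes[i];
  }

  // Copy the bit pattern into a double without violating aliasing rules.
  double value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}
```

Applied to the bytes `00 00 00 00 00 00 33 40`, this yields 19.0, which agrees 
with the `cpus{REV}:19` in the crash line. The trailing `2\001*` reads as a 
one-byte string `*` in field 6; if field 6 is the `role` field of `Resource` 
(an assumption here), that would be consistent with `has_role()` being true.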







[jira] [Updated] (MESOS-8090) Mesos 1.4.0 crashes with 1.3.x agent with oversubscription

2017-10-13 Thread Zhitao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li updated MESOS-8090:
-
Affects Version/s: 1.4.0

> Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
> --
>
> Key: MESOS-8090
> URL: https://issues.apache.org/jira/browse/MESOS-8090
> Project: Mesos
>  Issue Type: Bug
>  Components: master, oversubscription
>Affects Versions: 1.4.0
>Reporter: Zhitao Li
>Assignee: Michael Park
>
> We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from an 
> oversubscription-enabled agent running 1.3.1 code.
> The crash line is:
> resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19
> Stack trace in gdb:
> {panel:title=My title}
> #0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x7f22f3554448 in __GI_abort () at abort.c:89
> #2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
> src/utilities.cc:147
> #3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
> #4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) at src/logging.cc:1412
> #5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
> src/logging.cc:1281
> #6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
> (this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
> #7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
> /mesos/src/common/resources.cpp:1051
> #8  0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty 
> (this=this@entry=0x7f22e713d2e0) at /mesos/src/common/resources.cpp:1173
> #9  0x7f22f527e20c in mesos::Resources::add (this=0x7f22e713d400, 
> that=...) at /mesos/src/common/resources.cpp:1993
> #10 0x7f22f527f860 in mesos::Resources::operator+= 
> (this=this@entry=0x7f22e713d400, that=...) at 
> /mesos/src/common/resources.cpp:2016
> #11 0x7f22f527f91d in mesos::Resources::operator+= 
> (this=this@entry=0x7f22e713d400, that=...) at 
> /mesos/src/common/resources.cpp:2025
> #12 0x7f22f527fa4b in mesos::Resources::Resources (this=0x7f22e713d400, 
> _resources=...) at /mesos/src/common/resources.cpp:1277
> #13 0x7f22f548b812 in mesos::internal::master::Master::updateSlave 
> (this=0x558137bbae70, message=...) at /mesos/src/master/master.cpp:6681
> #14 0x7f22f550adc1 in 
> ProtobufProcess::_handlerM
>  (t=0x558137bbae70, method=
> (void 
> (mesos::internal::master::Master::*)(mesos::internal::master::Master * const, 
> const mesos::internal::UpdateSlaveMessage &)) 0x7f22f548b6d0 
>   const&)>, 
> data="\n)\n'07ba28cc-d9fa-44fb-8d6b-f8c5c90f8a90-S1\022\030\n\004cpus\020\000\032\t\t\000\000\000\000\000\000\063@2\001*J")
> at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:799
> #15 0x7f22f54c8791 in 
> ProtobufProcess::visit (this=0x558137bbae70, 
> event=...) at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:104
> #16 0x7f22f54572d4 in mesos::internal::master::Master::_visit 
> (this=this@entry=0x558137bbae70, event=...) at 
> /mesos/src/master/master.cpp:1643
> #17 0x7f22f547014d in mesos::internal::master::Master::visit 
> (this=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1575
> #18 0x7f22f60b7169 in serve (event=..., this=0x558137bbbf28) at 
> /mesos/3rdparty/libprocess/include/process/process.hpp:87
> #19 process::ProcessManager::resume (this=<optimized out>, 
> process=0x558137bbbf28) at /mesos/3rdparty/libprocess/src/process.cpp:3346
> #20 0x7f22f60bd056 in operator() (__closure=0x558137aa3218) at 
> /mesos/3rdparty/libprocess/src/process.cpp:2881
> #21 _M_invoke<> (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1700
> #22 operator() (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1688
> #23 
> std::thread::_Impl()>
>  >::_M_run(void) (this=0x558137aa3200) at /usr/include/c++/4.9/thread:115
> #24 0x7f22f40b3970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #25 0x7f22f38d1064 in start_thread (arg=0x7f22e713e700) at 
> pthread_create.c:309
> #26 0x7f22f360662d in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> {panel}





[jira] [Commented] (MESOS-8088) Introduce Lamport timestamp for offer operations.

2017-10-13 Thread Benjamin Bannier (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203627#comment-16203627
 ] 

Benjamin Bannier commented on MESOS-8088:
-

It might be possible to introduce this by adding tooling that works with 
{{TimeInfo}}.

> Introduce Lamport timestamp for offer operations.
> -
>
> Key: MESOS-8088
> URL: https://issues.apache.org/jira/browse/MESOS-8088
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>
> We need to use a Lamport clock 
> (https://en.wikipedia.org/wiki/Lamport_timestamps) to establish a partial 
> ordering between offer operations and the resources each operation is 
> operating on.
> It is used to establish happens-before relations so that RPs can reject those 
> operations that apply to a stale snapshot of the resources due to 
> speculation failures.
> See more details in this doc:
> https://docs.google.com/document/d/1RrrLVATZUyaURpEOeGjgxA6ccshuLo94G678IbL-Yco/edit#
> Given that the Lamport clock needs to be transferred between agents and 
> masters, it needs to be serialized to protobuf. We probably need to define 
> the following methods for it:
> ```
> merge(...); // Take a max between the two.
> increment();
> operator<(...);
> copy constructor and assignment operator
> ```
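
The methods listed in the description could look roughly like this. It is a 
minimal sketch under stated assumptions, not a proposed Mesos API: the class 
name `LamportClock`, the 64-bit counter, and the `value()` accessor (standing 
in for whatever would be written to a protobuf field) are all hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative Lamport clock. The name and the 64-bit counter are
// assumptions made for this sketch.
class LamportClock {
public:
  LamportClock() : value_(0) {}
  explicit LamportClock(std::uint64_t value) : value_(value) {}

  // Copy construction and copy assignment use the compiler-generated
  // defaults, which copy the counter.

  // Takes the maximum of the two clocks, as described in the issue.
  void merge(const LamportClock& that) {
    value_ = std::max(value_, that.value_);
  }

  // Advances the clock by one tick.
  void increment() { ++value_; }

  // Orders two clocks by their counter values.
  bool operator<(const LamportClock& that) const {
    return value_ < that.value_;
  }

  // Raw counter, i.e. the part that would need to be serialized.
  std::uint64_t value() const { return value_; }

private:
  std::uint64_t value_;
};
```

Because `merge` only takes the maximum, a receiver following the usual Lamport 
rule would call `merge` with the sender's clock and then `increment`.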





[jira] [Comment Edited] (MESOS-7935) CMake build should fail immediately for in-source builds

2017-10-13 Thread Damien Gerard (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203458#comment-16203458
 ] 

Damien Gerard edited comment on MESOS-7935 at 10/13/17 12:05 PM:
-

Maybe undocumented, but many projects have relied on them for years now, and 
they have been promoted many times on the CMake mailing list. Those options are 
unlikely to disappear tomorrow. There are additional checks that you will have 
difficulty mimicking, such as checking that CMake does not overwrite the 
sources one way or another (itself or via the Makefile).

Anyway, I don't see why searching for custom and more complex alternatives 
would be a good thing. If those options disappear one day, you can look for 
alternatives at that time.


was (Author: milipili):
Maybe undocumented by many projects rely on that for years now, and promoted 
many times on the CMake mailing list. Those options are unlikely to disappear 
tomorrow. There are additionnal checks that you will have difficulties to 
mimic, like checking that CMake does not override one way or the other the 
sources (itself or via Makefile).

Anyway, I don't see why searching for custom and more complex alternatives 
would be a good things. Assuming those options disappear one day, you may 
consider at that time to eventually find alternatives.

> CMake build should fail immediately for in-source builds
> 
>
> Key: MESOS-7935
> URL: https://issues.apache.org/jira/browse/MESOS-7935
> Project: Mesos
>  Issue Type: Improvement
>  Components: cmake
> Environment: macOS 10.12
> GNU/Linux Debian Stretch
>Reporter: Damien Gerard
>Assignee: Nathan Jackson
>  Labels: build
>
> In-source builds are neither recommended nor supported.  It is simple enough 
> to add a check to fail the build immediately.
> ---
> In-source build of master branch was broken with:
> {noformat}
> cd /Users/damien.gerard/projects/acp/mesos/src && 
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
>   -DBUILD_FLAGS=\"\" -DBUILD_JAVA_JVM_LIBRARY=\"\" -DHAS_AUTHENTICATION=1 
> -DLIBDIR=\"/usr/local/libmesos\" -DPICOJSON_USE_INT64 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" -DUSE_CMAKE_BUILD_CONFIG 
> -DUSE_STATIC_LIB -DVERSION=\"1.4.0\" -D__STDC_FORMAT_MACROS 
> -Dmesos_1_4_0_EXPORTS -I/Users/damien.gerard/projects/acp/mesos/include 
> -I/Users/damien.gerard/projects/acp/mesos/include/mesos 
> -I/Users/damien.gerard/projects/acp/mesos/src -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/protobuf-3.3.0/src/protobuf-3.3.0-lib/lib/include
>  -isystem /Users/damien.gerard/projects/acp/mesos/3rdparty/libprocess/include 
> -isystem /usr/local/opt/apr/libexec/include/apr-1 -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/boost-1.53.0/src/boost-1.53.0
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/elfio-3.2/src/elfio-3.2 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/glog-0.3.3/src/glog-0.3.3-lib/lib/include
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/nvml-352.79/src/nvml-352.79 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/picojson-1.3.0/src/picojson-1.3.0
>  -isystem /usr/local/include/subversion-1 -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/stout/include -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/http_parser-2.6.2/src/http_parser-2.6.2
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/concurrentqueue-1.0.0-beta/src/concurrentqueue-1.0.0-beta
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/libev-4.22/src/libev-4.22 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/zookeeper-3.4.8/src/zookeeper-3.4.8/src/c/include
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/zookeeper-3.4.8/src/zookeeper-3.4.8/src/c/generated
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/leveldb-1.19/src/leveldb-1.19/include
>   -std=c++11 -fPIC   -o 
> CMakeFiles/mesos-1.4.0.dir/slave/containerizer/mesos/provisioner/backends/copy.cpp.o
>  -c 
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/appc/store.cpp:132:46:
>  error: no member named 'fetcher' in namespace 'mesos::uri'; did you mean 
> 'Fetcher'?
>   Try uriFetcher = uri::fetcher::create();
> ~^~~
>  Fetcher
> /Users/damien.gerard/projects/acp/mesos/include/mesos/uri/fetcher.hpp:46:7: 
> note: 'Fetcher' declared here
> class Fetcher
>   ^
> 

[jira] [Commented] (MESOS-7935) CMake build should fail immediately for in-source builds

2017-10-13 Thread Damien Gerard (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203458#comment-16203458
 ] 

Damien Gerard commented on MESOS-7935:
--

Maybe undocumented, but many projects have relied on that for years now, and it 
has been promoted many times on the CMake mailing list. Those options are 
unlikely to disappear tomorrow. There are additional checks that you will have 
difficulty mimicking, like checking that CMake does not overwrite the sources 
one way or another (itself or via Makefile).

Anyway, I don't see why searching for custom and more complex alternatives 
would be a good thing. If those options disappear one day, you can look for 
alternatives at that time.

> CMake build should fail immediately for in-source builds
> 
>
> Key: MESOS-7935
> URL: https://issues.apache.org/jira/browse/MESOS-7935
> Project: Mesos
>  Issue Type: Improvement
>  Components: cmake
> Environment: macOS 10.12
> GNU/Linux Debian Stretch
>Reporter: Damien Gerard
>Assignee: Nathan Jackson
>  Labels: build
>
> In-source builds are neither recommended nor supported.  It is simple enough 
> to add a check to fail the build immediately.
> ---
> In-source build of master branch was broken with:
> {noformat}
> cd /Users/damien.gerard/projects/acp/mesos/src && 
> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
>   -DBUILD_FLAGS=\"\" -DBUILD_JAVA_JVM_LIBRARY=\"\" -DHAS_AUTHENTICATION=1 
> -DLIBDIR=\"/usr/local/libmesos\" -DPICOJSON_USE_INT64 
> -DPKGDATADIR=\"/usr/local/share/mesos\" 
> -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" -DUSE_CMAKE_BUILD_CONFIG 
> -DUSE_STATIC_LIB -DVERSION=\"1.4.0\" -D__STDC_FORMAT_MACROS 
> -Dmesos_1_4_0_EXPORTS -I/Users/damien.gerard/projects/acp/mesos/include 
> -I/Users/damien.gerard/projects/acp/mesos/include/mesos 
> -I/Users/damien.gerard/projects/acp/mesos/src -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/protobuf-3.3.0/src/protobuf-3.3.0-lib/lib/include
>  -isystem /Users/damien.gerard/projects/acp/mesos/3rdparty/libprocess/include 
> -isystem /usr/local/opt/apr/libexec/include/apr-1 -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/boost-1.53.0/src/boost-1.53.0
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/elfio-3.2/src/elfio-3.2 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/glog-0.3.3/src/glog-0.3.3-lib/lib/include
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/nvml-352.79/src/nvml-352.79 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/picojson-1.3.0/src/picojson-1.3.0
>  -isystem /usr/local/include/subversion-1 -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/stout/include -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/http_parser-2.6.2/src/http_parser-2.6.2
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/concurrentqueue-1.0.0-beta/src/concurrentqueue-1.0.0-beta
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/libev-4.22/src/libev-4.22 
> -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/zookeeper-3.4.8/src/zookeeper-3.4.8/src/c/include
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/zookeeper-3.4.8/src/zookeeper-3.4.8/src/c/generated
>  -isystem 
> /Users/damien.gerard/projects/acp/mesos/3rdparty/leveldb-1.19/src/leveldb-1.19/include
>   -std=c++11 -fPIC   -o 
> CMakeFiles/mesos-1.4.0.dir/slave/containerizer/mesos/provisioner/backends/copy.cpp.o
>  -c 
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/backends/copy.cpp
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/appc/store.cpp:132:46:
>  error: no member named 'fetcher' in namespace 'mesos::uri'; did you mean 
> 'Fetcher'?
>   Try uriFetcher = uri::fetcher::create();
> ~^~~
>  Fetcher
> /Users/damien.gerard/projects/acp/mesos/include/mesos/uri/fetcher.hpp:46:7: 
> note: 'Fetcher' declared here
> class Fetcher
>   ^
> /Users/damien.gerard/projects/acp/mesos/src/slave/containerizer/mesos/provisioner/appc/store.cpp:132:55:
>  error: no member named 'create' in 'mesos::uri::Fetcher'
>   Try uriFetcher = uri::fetcher::create();
> {noformat}
> Both Linux & macOS, not tested elsewhere, on {{master}} and tag 1.4.0-rc3





[jira] [Updated] (MESOS-8089) Add messages to publish resources on a resource provider

2017-10-13 Thread Jan Schlicht (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Schlicht updated MESOS-8089:

  Sprint: Mesosphere Sprint 66
Story Points: 7

> Add messages to publish resources on a resource provider
> 
>
> Key: MESOS-8089
> URL: https://issues.apache.org/jira/browse/MESOS-8089
> Project: Mesos
>  Issue Type: Task
>Reporter: Jan Schlicht
>Assignee: Jan Schlicht
>  Labels: mesosphere
>
> Before launching a task that uses resource provider resources, the resource 
> provider needs to be told to "publish" these resources, as it may have to take 
> some necessary actions. For external resource providers, resources might also 
> have to be "unpublished" when a task is finished. The resource provider needs 
> to ack these calls once it is ready.





[jira] [Created] (MESOS-8089) Add messages to publish resources on a resource provider

2017-10-13 Thread Jan Schlicht (JIRA)
Jan Schlicht created MESOS-8089:
---

 Summary: Add messages to publish resources on a resource provider
 Key: MESOS-8089
 URL: https://issues.apache.org/jira/browse/MESOS-8089
 Project: Mesos
  Issue Type: Task
Reporter: Jan Schlicht
Assignee: Jan Schlicht


Before launching a task that uses resource provider resources, the resource 
provider needs to be told to "publish" these resources, as it may have to take 
some necessary actions. For external resource providers, resources might also 
have to be "unpublished" when a task is finished. The resource provider needs 
to ack these calls once it is ready.





[jira] [Updated] (MESOS-7537) Add functionality to disconnect resource providers in the master

2017-10-13 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-7537:

Labels: external-resources mesosphere storage  (was: mesosphere storage)

> Add functionality to disconnect resource providers in the master
> 
>
> Key: MESOS-7537
> URL: https://issues.apache.org/jira/browse/MESOS-7537
> Project: Mesos
>  Issue Type: Task
>Reporter: Jan Schlicht
>  Labels: external-resources, mesosphere, storage
>
> Similar to the existing {{disconnect}} methods for frameworks and agents, a 
> corresponding function for resource providers has to be added to the master.
> It needs to be called in {{Master::exited}}, i.e. when the master detects that 
> a resource provider is no longer reachable.
> For local resource providers this also has to be called when the agent they 
> are running on disconnects.





[jira] [Updated] (MESOS-7330) Add resource provider to offer

2017-10-13 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-7330:

Labels: external-resources mesosphere storage  (was: mesosphere storage)

> Add resource provider to offer
> --
>
> Key: MESOS-7330
> URL: https://issues.apache.org/jira/browse/MESOS-7330
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Benjamin Bannier
>Priority: Minor
>  Labels: external-resources, mesosphere, storage
>
> In order to introduce external resource providers, we need to add an 
> {{optional}} resource provider field to the {{Offer}} message which can be 
> used to unambiguously identify the provider. In addition, the existing 
> {{slave_id}} will become {{optional}} with the requirement that either 
> {{slave_id}} or {{resource_provider_id}} is set:
> {code}
> message Offer {
>   // ..
>   optional SlaveID slave_id = 3;
>   optional ResourceProviderID resource_provider_id = 11;
>   // ..
> }
> {code}
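
Once both fields are optional, the "either one is set" requirement has to be enforced in code rather than by the schema. A minimal sketch of such a check follows, assuming the requirement means "at least one of the two is set". `OfferSketch` and `isValidOffer` are hypothetical stand-ins for illustration, not the generated protobuf class or an existing Mesos validation function.

```cpp
#include <string>

// Hypothetical stand-in for the generated `Offer` class; only field
// presence matters for this sketch.
struct OfferSketch
{
  bool hasSlaveId = false;
  bool hasResourceProviderId = false;
  std::string slaveId;
  std::string resourceProviderId;
};

// Assumption: an offer is acceptable if at least one of the two IDs is set.
inline bool isValidOffer(const OfferSketch& offer)
{
  return offer.hasSlaveId || offer.hasResourceProviderId;
}
```

If the intent is instead that exactly one of the two may be set, the condition would become `offer.hasSlaveId != offer.hasResourceProviderId`.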



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (MESOS-8080) The default executor does not propagate missing task exit status correctly.

2017-10-13 Thread Qian Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang updated MESOS-8080:
--
Component/s: executor

> The default executor does not propagate missing task exit status correctly.
> ---
>
> Key: MESOS-8080
> URL: https://issues.apache.org/jira/browse/MESOS-8080
> Project: Mesos
>  Issue Type: Bug
>  Components: executor
>Reporter: James Peach
>Assignee: James Peach
> Fix For: 1.5.0
>
>
> The default executor is not handling a missing nested container
> exit status correctly. It is assuming the protobuf accessor was
> returning an Option rather than explicitly checking whether the
> `exit_status` field was present in the message.
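
The pitfall described above can be illustrated as follows. For an optional proto2 field, the generated C++ getter returns the field's default value when the field is unset, so presence has to be checked through the `has_...()` accessor. This is only a sketch: `WaitResponseSketch` and `describeExit` are hypothetical names that mimic the shape of generated accessors, not the actual default executor code.

```cpp
#include <string>

// Hypothetical stand-in mimicking proto2-generated accessors for an
// optional integer field named `exit_status`.
class WaitResponseSketch
{
public:
  WaitResponseSketch() : has_exit_status_(false), exit_status_(0) {}

  bool has_exit_status() const { return has_exit_status_; }

  // Like a generated getter, this returns the default (0) when unset.
  int exit_status() const { return exit_status_; }

  void set_exit_status(int status)
  {
    has_exit_status_ = true;
    exit_status_ = status;
  }

private:
  bool has_exit_status_;
  int exit_status_;
};

// Checks presence explicitly instead of treating the getter's return value
// as if it were an Option.
inline std::string describeExit(const WaitResponseSketch& response)
{
  if (!response.has_exit_status()) {
    return "exit status unknown";
  }

  return "exited with status " + std::to_string(response.exit_status());
}
```

Without the `has_exit_status()` check, a missing status would be indistinguishable from a successful exit with status 0.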





[jira] [Updated] (MESOS-8080) The default executor does not propagate missing task exit status correctly.

2017-10-13 Thread Qian Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang updated MESOS-8080:
--
Sprint: Mesosphere Sprint 65

> The default executor does not propagate missing task exit status correctly.
> ---
>
> Key: MESOS-8080
> URL: https://issues.apache.org/jira/browse/MESOS-8080
> Project: Mesos
>  Issue Type: Bug
>Reporter: James Peach
>Assignee: James Peach
> Fix For: 1.5.0
>
>
> The default executor is not handling a missing nested container
> exit status correctly. It is assuming the protobuf accessor was
> returning an Option rather than explicitly checking whether the
> `exit_status` field was present in the message.


