[jira] [Commented] (MESOS-3780) Replace Master/Slave Terminology Phase I - Update all strings output
[ https://issues.apache.org/jira/browse/MESOS-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215489#comment-15215489 ] zhou xing commented on MESOS-3780: -- Based on comment #3, we have broken this ticket into three sub-tickets: - MESOS-5055 - MESOS-5056 - MESOS-5057 Please go to these tickets for the details. > Replace Master/Slave Terminology Phase I - Update all strings output > > > Key: MESOS-3780 > URL: https://issues.apache.org/jira/browse/MESOS-3780 > Project: Mesos > Issue Type: Task >Reporter: Diana Arroyo >Assignee: zhou xing > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5057) Replace Master/Slave Terminology Phase I - Update strings in error messages and other strings
zhou xing created MESOS-5057: Summary: Replace Master/Slave Terminology Phase I - Update strings in error messages and other strings Key: MESOS-5057 URL: https://issues.apache.org/jira/browse/MESOS-5057 Project: Mesos Issue Type: Task Reporter: zhou xing Assignee: zhou xing This is a sub-ticket of MESOS-3780. In this ticket, we will rename all occurrences of slave to agent in the error messages and other strings in the code.
[jira] [Created] (MESOS-5056) Replace Master/Slave Terminology Phase I - Update strings in the shell scripts outputs
zhou xing created MESOS-5056: Summary: Replace Master/Slave Terminology Phase I - Update strings in the shell scripts outputs Key: MESOS-5056 URL: https://issues.apache.org/jira/browse/MESOS-5056 Project: Mesos Issue Type: Task Reporter: zhou xing Assignee: zhou xing This is a sub ticket of MESOS-3780. In this ticket, we will rename slave to agent in the shell script outputs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5055) Replace Master/Slave Terminology Phase I - Update strings in the log message and standard output
[ https://issues.apache.org/jira/browse/MESOS-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215485#comment-15215485 ] zhou xing commented on MESOS-5055: -- A review request has been submitted at: https://reviews.apache.org/r/45213/ > Replace Master/Slave Terminology Phase I - Update strings in the log message > and standard output > > > Key: MESOS-5055 > URL: https://issues.apache.org/jira/browse/MESOS-5055 > Project: Mesos > Issue Type: Task >Reporter: zhou xing >Assignee: zhou xing > > This is a sub-ticket of MESOS-3780. In this ticket, we will rename all > occurrences of slave to agent in the log messages and standard output.
[jira] [Created] (MESOS-5055) Replace Master/Slave Terminology Phase I - Update strings in the log message and standard output
zhou xing created MESOS-5055: Summary: Replace Master/Slave Terminology Phase I - Update strings in the log message and standard output Key: MESOS-5055 URL: https://issues.apache.org/jira/browse/MESOS-5055 Project: Mesos Issue Type: Task Reporter: zhou xing Assignee: zhou xing This is a sub-ticket of MESOS-3780. In this ticket, we will rename all occurrences of slave to agent in the log messages and standard output.
[jira] [Commented] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215438#comment-15215438 ] Jian Qiu commented on MESOS-5048: - [~anandmazumdar] Unfortunately, the failure only happens when verbose logging is not enabled. > MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky > --- > > Key: MESOS-5048 > URL: https://issues.apache.org/jira/browse/MESOS-5048 > Project: Mesos > Issue Type: Bug > Components: tests >Affects Versions: 0.28.0 > Environment: Ubuntu 15.04 >Reporter: Jian Qiu > Labels: flaky-test > > ./mesos-tests.sh > --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics > --gtest_repeat=100 --gtest_break_on_failure > This is found in rb, and reproduced in my local machine. There are two types > of failures. However, the failure does not appear when enabling verbose... > {code} > ../../src/tests/environment.cpp:790: Failure > Failed > Tests completed with child processes remaining: > -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests > \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor >\--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor > {code} > And > {code} > I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0 > I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave > 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0 > Registered executor on mesos > ../../src/tests/slave_recovery_tests.cpp:3506: Failure > Value of: containers.get().size() > Actual: 0 > Expected: 1u > Which is: 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5049) Refactore subproces setup functions.
[ https://issues.apache.org/jira/browse/MESOS-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215433#comment-15215433 ] Jie Yu commented on MESOS-5049: --- commit b2cab0deb285a91d1442df6ea2c8b54da1e59158 Author: Joerg Schad Date: Mon Mar 28 22:03:49 2016 -0700 Fixed typo in subprocess doxygen comments. Review: https://reviews.apache.org/r/45401/ commit e67ad670cd4db7ee7571576db047463c7dc60733 Author: Joerg Schad Date: Mon Mar 28 22:03:24 2016 -0700 Adapted port_mapping isolator with missing subprocess parameter. Review: https://reviews.apache.org/r/45400/ > Refactore subproces setup functions. > > > Key: MESOS-5049 > URL: https://issues.apache.org/jira/browse/MESOS-5049 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Joerg Schad > Fix For: 0.29.0 > > > Executing arbitrary setup functions while creating new processes is > dangerous, as all functions called have to be async-signal-safe. As setup > functions are used for only very few purposes (setsid, chdir, monitoring > and killing a process (see upcoming review)), it makes sense to support > them safely via parameters to subprocess. > Another common use of child setup is to block the child while doing some > work in the parent. This pattern can be more cleanly expressed with > parentHooks.
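Editor's sketch for MESOS-5049: the idea of replacing arbitrary child setup functions with a few fixed parameters can be illustrated roughly as follows. This is not the libprocess implementation; `ChildOptions` and `runChild` are hypothetical names invented for this example, and the only calls made in the child (`setsid`, `chdir`, `execv`, `_exit`) are ones POSIX lists as async-signal-safe.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical parameter block: the caller describes the child setup
// declaratively instead of passing an arbitrary function to run after fork.
struct ChildOptions {
  bool newSession;         // Call setsid() in the child.
  const char* workingDir;  // chdir() target in the child, or nullptr.
};

// Forks, applies the requested setup in the child, executes argv[0]
// (an absolute path) and returns the child's exit code, or -1 on error.
int runChild(const char* const argv[], const ChildOptions& options) {
  assert(argv != nullptr && argv[0] != nullptr);

  pid_t pid = fork();
  if (pid == -1) {
    return -1;
  }

  if (pid == 0) {
    // Child: no allocation, no locks, only async-signal-safe calls.
    if (options.newSession && setsid() == -1) {
      _exit(126);
    }
    if (options.workingDir != nullptr && chdir(options.workingDir) == -1) {
      _exit(126);
    }
    execv(argv[0], const_cast<char* const*>(argv));
    _exit(127);  // Only reached if execv failed.
  }

  // Parent: wait for the child and report its exit code.
  int status = 0;
  if (waitpid(pid, &status, 0) == -1) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the set of child-side actions is closed, each one can be audited once for safety, which is the benefit the ticket describes.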
[jira] [Commented] (MESOS-5023) MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky.
[ https://issues.apache.org/jira/browse/MESOS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215107#comment-15215107 ] Jie Yu commented on MESOS-5023: --- Sorry, I don't get the bug. onAny callbacks registered on the 'future' will be invoked in serial order when the promise is set. I don't understand why this can go out of order. Can someone show more context here? > MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky. > > > Key: MESOS-5023 > URL: https://issues.apache.org/jira/browse/MESOS-5023 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Alexander Rukletsov >Assignee: Gilbert Song >Priority: Critical > Labels: mesosphere > Fix For: 0.28.1 > > > Observed on the Apache Jenkins. > {noformat} > [ RUN ] MesosContainerizerProvisionerTest.ProvisionFailed > I0324 13:38:56.284261 2948 containerizer.cpp:666] Starting container > 'test_container' for executor 'executor' of framework '' > I0324 13:38:56.285825 2939 containerizer.cpp:1421] Destroying container > 'test_container' > I0324 13:38:56.285854 2939 containerizer.cpp:1424] Waiting for the > provisioner to complete for container 'test_container' > [ OK ] MesosContainerizerProvisionerTest.ProvisionFailed (7 ms) > [ RUN ] MesosContainerizerProvisionerTest.DestroyWhileProvisioning > I0324 13:38:56.291187 2944 containerizer.cpp:666] Starting container > 'c2316963-c6cb-4c7f-a3b9-17ca5931e5b2' for executor 'executor' of framework '' > I0324 13:38:56.292157 2944 containerizer.cpp:1421] Destroying container > 'c2316963-c6cb-4c7f-a3b9-17ca5931e5b2' > I0324 13:38:56.292179 2944 containerizer.cpp:1424] Waiting for the > provisioner to complete for container 'c2316963-c6cb-4c7f-a3b9-17ca5931e5b2' > F0324 13:38:56.292899 2944 containerizer.cpp:752] Check failed: > containers_.contains(containerId) > *** Check failure stack trace: *** > @ 0x2ac9973d0ae4 google::LogMessage::Fail() > @ 0x2ac9973d0a30 google::LogMessage::SendToLog() > @ 0x2ac9973d0432 
google::LogMessage::Flush() > @ 0x2ac9973d3346 google::LogMessageFatal::~LogMessageFatal() > @ 0x2ac996af897c > mesos::internal::slave::MesosContainerizerProcess::_launch() > @ 0x2ac996b1f18a > _ZZN7process8dispatchIbN5mesos8internal5slave25MesosContainerizerProcessERKNS1_11ContainerIDERK6OptionINS1_8TaskInfoEERKNS1_12ExecutorInfoERKSsRKS8_ISsERKNS1_7SlaveIDERKNS_3PIDINS3_5SlaveEEEbRKS8_INS3_13ProvisionInfoEES5_SA_SD_SsSI_SL_SQ_bSU_EENS_6FutureIT_EERKNSO_IT0_EEMS10_FSZ_T1_T2_T3_T4_T5_T6_T7_T8_T9_ET10_T11_T12_T13_T14_T15_T16_T17_T18_ENKUlPNS_11ProcessBaseEE_clES1P_ > @ 0x2ac996b479d9 > _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIbN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERK6OptionINS5_8TaskInfoEERKNS5_12ExecutorInfoERKSsRKSC_ISsERKNS5_7SlaveIDERKNS0_3PIDINS7_5SlaveEEEbRKSC_INS7_13ProvisionInfoEES9_SE_SH_SsSM_SP_SU_bSY_EENS0_6FutureIT_EERKNSS_IT0_EEMS14_FS13_T1_T2_T3_T4_T5_T6_T7_T8_T9_ET10_T11_T12_T13_T14_T15_T16_T17_T18_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_ > @ 0x2ac997334fef std::function<>::operator()() > @ 0x2ac99731b1c7 process::ProcessBase::visit() > @ 0x2ac997321154 process::DispatchEvent::visit() > @ 0x9a699c process::ProcessBase::serve() > @ 0x2ac9973173c0 process::ProcessManager::resume() > @ 0x2ac99731445a > _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_ > @ 0x2ac997320916 > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE > @ 0x2ac9973208c6 > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_ > @ 0x2ac997320858 > _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE > @ 0x2ac9973207af > 
_ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv > @ 0x2ac997320748 > _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv > @ 0x2ac9989aea60 (unknown) > @ 0x2ac999125182 start_thread > @ 0x2ac99943547d (unknown) > make[4]: Leaving directory `/mesos/mesos-0.29.0/_build/src' > make[4]: *** [check-local] Aborted > make[3]: *** [check-am] Error 2 > make[3]: Leaving directory `/mesos/mesos-0.29.0/_build/src' > make[2]: *** [check] Error 2 >
[jira] [Created] (MESOS-5054) Namespace the stout flags
Greg Mann created MESOS-5054: Summary: Namespace the stout flags Key: MESOS-5054 URL: https://issues.apache.org/jira/browse/MESOS-5054 Project: Mesos Issue Type: Improvement Components: stout Affects Versions: 0.28.0 Reporter: Greg Mann A recent name collision occurred when updating the 3rdparty http-parser library: https://github.com/apache/mesos/commit/94df63f72146501872a06c6487e94bdfd0f23025 We should put stout's {{flags}} namespace within another suitable namespace (perhaps {{stout::flags}}) to avoid such collisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
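Editor's sketch for MESOS-5054: in C++ a namespace and an enum with the same name cannot both be declared in the same scope, which is the kind of collision described above. Nesting the namespace avoids it. The enum below is only a stand-in for a third-party declaration (its enumerators are invented), and `stout::flags::toOption` is a hypothetical helper; `stout::flags` itself is just the name the ticket suggests as a possibility.

```cpp
#include <cassert>
#include <string>

// Stand-in for a third-party C header that declares an enum named
// `flags` at global scope (enumerator names are invented here).
enum flags { F_FIRST = 1 << 0, F_SECOND = 1 << 1 };

// A `namespace flags { ... }` at global scope would not compile next to
// the enum above. Nested inside `stout`, the name no longer clashes.
namespace stout {
namespace flags {

// Hypothetical helper, present only to give the namespace some content.
inline std::string toOption(const std::string& name) {
  return "--" + name;
}

} // namespace flags
} // namespace stout

// The global enum stays reachable as `::flags`.
inline bool hasFlag(unsigned int bits, ::flags flag) {
  return (bits & flag) != 0;
}
```

Both names can then be used side by side, for example `stout::flags::toOption("help")` and `hasFlag(bits, F_SECOND)`.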
[jira] [Commented] (MESOS-4805) Update ry-http-parser-1c3624a to nodejs/http-parser 2.6.1
[ https://issues.apache.org/jira/browse/MESOS-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214967#comment-15214967 ] Vinod Kone commented on MESOS-4805: --- commit a984656e70ed354257a407f392630e1f9f38f22a Author: Greg Mann Date: Mon Mar 28 14:56:56 2016 -0700 Changed name of http-parser enum to 'flags_enum'. Needed this because it conflicts with stout's flags namespace. Review: https://reviews.apache.org/r/45397/ > Update ry-http-parser-1c3624a to nodejs/http-parser 2.6.1 > - > > Key: MESOS-4805 > URL: https://issues.apache.org/jira/browse/MESOS-4805 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Chen Zhiwei > Fix For: 0.29.0 > > > See https://github.com/nodejs/http-parser/releases/tag/v2.6.1. > The motivation is that nodejs/http-parser 2.6.1 officially supports IBM > Power (ppc64le), so this is needed by > [MESOS-4312|https://issues.apache.org/jira/browse/MESOS-4312].
[jira] [Commented] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214937#comment-15214937 ] Tomasz Janiszewski commented on MESOS-3243: --- I published the changes on ReviewBoard (https://reviews.apache.org/r/44843/), but it squashed the commits, so you cannot see which changes were made by {{clang-tidy}}; I therefore also pushed them to GitHub. I can probably achieve the same thing on ReviewBoard by creating a new review and publishing the commits one by one, so you can click through and see what wasn't changed by {{clang-tidy}}. > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Improvement >Reporter: Michael Park >Assignee: Tomasz Janiszewski > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}).
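Editor's sketch for MESOS-3243: one commonly cited reason to prefer {{nullptr}} over {{NULL}} is overload resolution. The `describe` overloads below are hypothetical and exist only for this illustration; they are not Mesos code.

```cpp
#include <cassert>
#include <string>

// Two overloads that a null pointer constant has to choose between.
std::string describe(int) { return "int"; }
std::string describe(const char*) { return "pointer"; }

// nullptr has its own type (std::nullptr_t), which converts to any
// pointer type but not to int, so the pointer overload is selected.
inline std::string describeNullptr() { return describe(nullptr); }

// A literal 0 is an int first and foremost, so the int overload wins.
// describe(NULL) is deliberately not shown: depending on how the
// platform defines NULL, it may pick the int overload or be ambiguous.
inline std::string describeZero() { return describe(0); }
```

This difference is also why a purely textual replacement is not guaranteed to be behavior-preserving in every call site, and why reviewing the tool-made and hand-made changes separately is useful.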
[jira] [Issue Comment Deleted] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomasz Janiszewski updated MESOS-3243: -- Comment: was deleted (was: I published changes on ReviewBoard https://reviews.apache.org/r/44843/ but it squashed commits so you cannot see changes made by {{clang-tidy}} so I pushed it also to github. Probably same thing I can achieve with ReviewBoard by creating new review and publishing commits one by one, so you can click and see what wasn't changed by {{clang-tidy}}.) > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Improvement >Reporter: Michael Park >Assignee: Tomasz Janiszewski > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214911#comment-15214911 ] Adam B commented on MESOS-1739: --- Probably also want to rescind all outstanding offers from that agent, so that new offers can be generated with the updated attributes. > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4090) Create a light-weight, executor only mesos egg
[ https://issues.apache.org/jira/browse/MESOS-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214907#comment-15214907 ] Steve Niemitz commented on MESOS-4090: -- Changelog update @ https://reviews.apache.org/r/45398/ > Create a light-weight, executor only mesos egg > -- > > Key: MESOS-4090 > URL: https://issues.apache.org/jira/browse/MESOS-4090 > Project: Mesos > Issue Type: Improvement > Components: build >Reporter: Steve Niemitz >Assignee: Steve Niemitz > > Currently, when running tasks in docker containers, if the executor uses the > mesos.native python library, the execution environment inside the container > (OS, native libs, etc) must match the execution environment outside the > container fairly closely in order to load the mesos.so library. > The solution here can be to introduce a much lighter weight python egg, > mesos.executor, which only includes code (and dependencies) needed to create > and run a MesosExecutorDriver. Executors can then use this native library > instead of mesos.native.
[jira] [Commented] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214901#comment-15214901 ] Michael Park commented on MESOS-3243: - Did you try to commit to https://github.com/apache/mesos or something? You should create review requests on ReviewBoard for them to be committed. The Github repository is simply a mirror of the Apache repository. > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Improvement >Reporter: Michael Park >Assignee: Tomasz Janiszewski > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4090) Create a light-weight, executor only mesos egg
[ https://issues.apache.org/jira/browse/MESOS-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214895#comment-15214895 ] Steve Niemitz commented on MESOS-4090: -- Ah thanks for the reminder, it almost did :) > Create a light-weight, executor only mesos egg > -- > > Key: MESOS-4090 > URL: https://issues.apache.org/jira/browse/MESOS-4090 > Project: Mesos > Issue Type: Improvement > Components: build >Reporter: Steve Niemitz >Assignee: Steve Niemitz > > Currently, when running tasks in docker containers, if the executor uses the > mesos.native python library, the execution environment inside the container > (OS, native libs, etc) must match the execution environment outside the > container fairly closely in order to load the mesos.so library. > The solution here can be to introduce a much lighter weight python egg, > mesos.executor, which only includes code (and dependencies) needed to create > and run a MesosExecutorDriver. Executors can then use this native library > instead of mesos.native.
[jira] [Comment Edited] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214894#comment-15214894 ] Yan Xu edited comment on MESOS-1739 at 3/28/16 9:05 PM: Agreed. Therefore for a more graceful mechanism to notify the framework, I think we can send an event to the framework for each task that the agent attribute change impacts about the changed condition that it runs under, this can be done via a special status update. The framework can then choose to kill the task. What do you think? was (Author: xujyan): Agreed. Therefore for a more graceful mechanism to notify the framework, I think we can send an event to the framework for each task that the agent attribute change impacts about the has changed condition that it runs under, this can be done via a special status update. The framework can then choose to kill the task. What do you think? > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214894#comment-15214894 ] Yan Xu commented on MESOS-1739: --- Agreed. Therefore, as a more graceful mechanism to notify the framework, I think we can send an event to the framework for each task that the agent attribute change impacts, describing the changed condition that it runs under. This can be done via a special status update. The framework can then choose to kill the task. What do you think? > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that either via a slave restart or a out of process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be.
[jira] [Commented] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214886#comment-15214886 ] Tomasz Janiszewski commented on MESOS-3243: --- I fixed the PR and made two separate commits ([diff|https://github.com/apache/mesos/compare/077ca9505c533f1c8fcf94e3cc1779d873ab8bea...c4ef738dbdf92fcaaa8724a8246d6f9c7bb85f7f]) 1. {{clang-tidy}} [077ca9505c533f1c8fcf94e3cc1779d873ab8bea|https://github.com/apache/mesos/commit/077ca9505c533f1c8fcf94e3cc1779d873ab8bea] 2. {{sed}} [c4ef738dbdf92fcaaa8724a8246d6f9c7bb85f7f|https://github.com/apache/mesos/commit/c4ef738dbdf92fcaaa8724a8246d6f9c7bb85f7f] Should I open a new review and post them one by one? > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Improvement >Reporter: Michael Park >Assignee: Tomasz Janiszewski > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}).
[jira] [Issue Comment Deleted] (MESOS-3243) Replace NULL with nullptr
[ https://issues.apache.org/jira/browse/MESOS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomasz Janiszewski updated MESOS-3243: -- Comment: was deleted (was: I fixed PR. And made two separate commits ([diff|https://github.com/apache/mesos/compare/077ca9505c533f1c8fcf94e3cc1779d873ab8bea...c4ef738dbdf92fcaaa8724a8246d6f9c7bb85f7f]) 1. {{clang-tidy}} [077ca9505c533f1c8fcf94e3cc1779d873ab8bea|https://github.com/apache/mesos/commit/077ca9505c533f1c8fcf94e3cc1779d873ab8bea] 2. {{sed}} [c4ef738dbdf92fcaaa8724a8246d6f9c7bb85f7f|https://github.com/apache/mesos/commit/c4ef738dbdf92fcaaa8724a8246d6f9c7bb85f7f] Should I open new review and post them one by one?) > Replace NULL with nullptr > - > > Key: MESOS-3243 > URL: https://issues.apache.org/jira/browse/MESOS-3243 > Project: Mesos > Issue Type: Improvement >Reporter: Michael Park >Assignee: Tomasz Janiszewski > > As part of the C++ upgrade, it would be nice to move our use of {{NULL}} over > to use {{nullptr}}. I think it would be an interesting exercise to do this > with {{clang-modernize}} using the [nullptr > transform|http://clang.llvm.org/extra/UseNullptrTransform.html] (although > it's probably just as easy to use {{sed}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4090) Create a light-weight, executor only mesos egg
[ https://issues.apache.org/jira/browse/MESOS-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214849#comment-15214849 ] Vinod Kone commented on MESOS-4090: --- [~SteveNiemitz] Still waiting for your CHANGELOG update. Mind sending it (and any other doc updates) before this falls through the cracks? Would love to call it out in the 0.29.0 release. > Create a light-weight, executor only mesos egg > -- > > Key: MESOS-4090 > URL: https://issues.apache.org/jira/browse/MESOS-4090 > Project: Mesos > Issue Type: Improvement > Components: build >Reporter: Steve Niemitz >Assignee: Steve Niemitz > > Currently, when running tasks in docker containers, if the executor uses the > mesos.native python library, the execution environment inside the container > (OS, native libs, etc) must match the execution environment outside the > container fairly closely in order to load the mesos.so library. > The solution here can be to introduce a much lighter weight python egg, > mesos.executor, which only includes code (and dependencies) needed to create > and run a MesosExecutorDriver. Executors can then use this native library > instead of mesos.native.
[jira] [Created] (MESOS-5053) Add libz to dependencies
Philipp Blum created MESOS-5053: --- Summary: Add libz to dependencies Key: MESOS-5053 URL: https://issues.apache.org/jira/browse/MESOS-5053 Project: Mesos Issue Type: Documentation Components: build Reporter: Philipp Blum Every time I compile Mesos, I need zlib1g-dev. I think it would be a good idea to add it to the documentation.
[jira] [Updated] (MESOS-4979) os::rmdir does not handle special files (e.g., device, socket).
[ https://issues.apache.org/jira/browse/MESOS-4979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-4979: Fix Version/s: 0.27.3 0.24.2 0.25.1 0.26.1 > os::rmdir does not handle special files (e.g., device, socket). > --- > > Key: MESOS-4979 > URL: https://issues.apache.org/jira/browse/MESOS-4979 > Project: Mesos > Issue Type: Bug > Components: stout >Affects Versions: 0.19.0, 0.20.0, 0.21.0, 0.22.0, 0.23.0, 0.24.0, 0.25.0, > 0.26.0, 0.27.0, 0.27.1, 0.27.2 >Reporter: Jie Yu >Assignee: Jojy Varghese >Priority: Blocker > Labels: mesosphere, twitter > Fix For: 0.28.0, 0.26.1, 0.25.1, 0.24.2, 0.27.3 > > > Stout os::rmdir does not handle special files like device files or socket > files. This could cause failures when GC sandboxes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5021) Memory leak in subprocess when 'environment' argument is provided.
[ https://issues.apache.org/jira/browse/MESOS-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-5021: Fix Version/s: (was: 0.23.2) > Memory leak in subprocess when 'environment' argument is provided. > -- > > Key: MESOS-5021 > URL: https://issues.apache.org/jira/browse/MESOS-5021 > Project: Mesos > Issue Type: Bug > Components: libprocess, slave >Affects Versions: 0.23.0, 0.23.1, 0.24.0, 0.24.1, 0.25.0, 0.26.0, 0.27.0, > 0.27.1, 0.28.0, 0.27.2 >Reporter: Benjamin Mahler >Assignee: Benjamin Mahler >Priority: Blocker > Fix For: 0.26.1, 0.25.1, 0.24.2, 0.28.1, 0.27.3 > > > A memory leak in process::subprocess was introduced here: > https://github.com/apache/mesos/commit/14b49f31840ff1523b31007c21b12c604700323f > This was found when [~jieyu] and I examined a memory leak in the health check > program (see MESOS-4869). > The leak is here: > https://github.com/apache/mesos/blob/0.28.0/3rdparty/libprocess/src/subprocess.cpp#L451-L456 > {code} > // Like above, we need to construct the environment that we'll pass > // to 'os::execvpe' as it might not be async-safe to perform the > // memory allocations. > char** envp = os::raw::environment(); > if (environment.isSome()) { > // NOTE: We add 1 to the size for a NULL terminator. > envp = new char*[environment.get().size() + 1]; > size_t index = 0; > foreachpair (const string& key, const string& value, environment.get()) { > string entry = key + "=" + value; > envp[index] = new char[entry.size() + 1]; > strncpy(envp[index], entry.c_str(), entry.size() + 1); > ++index; > } > envp[index] = NULL; > } > ... > // Need to delete 'envp' if we had environment variables passed to > // us and we needed to allocate the space. > if (environment.isSome()) { > CHECK_NE(os::raw::environment(), envp); > delete[] envp; // XXX Does not delete the sub arrays. 
> } > {code} > Auditing the code, it appears to affect a number of locations: > * > [docker::run|https://github.com/apache/mesos/blob/0.28.0/src/docker/docker.cpp#L661-L668] > * [health check > binary|https://github.com/apache/mesos/blob/0.28.0/src/health-check/main.cpp#L177-L205] > * > [liblogrotate|https://github.com/apache/mesos/blob/0.28.0/src/slave/container_loggers/lib_logrotate.cpp#L137-L194] > * Docker containerizer: > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1207-L1220] > and > [here|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/docker.cpp#L1119-L1131] > * [External > containerizer|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/external_containerizer.cpp#L479-L483] > * [Posix > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/launcher.cpp#L131-L141] > and [Linux > launcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/mesos/linux_launcher.cpp#L314-L324] > * > [Fetcher|https://github.com/apache/mesos/blob/0.28.0/src/slave/containerizer/fetcher.cpp#L768-L773] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
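Editor's sketch for MESOS-5021: the leak quoted above comes from freeing only the outer `envp` array while every `KEY=VALUE` entry was allocated separately. The following minimal illustration shows a matching allocate/free pair. It is not the fix that was committed to Mesos; `buildEnvp` and `freeEnvp` are hypothetical helper names, and `std::map` stands in for the environment type used in the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Builds a null-terminated array of "KEY=VALUE" C strings. The caller
// owns the outer array and every entry in it.
char** buildEnvp(const std::map<std::string, std::string>& environment) {
  // One extra slot for the terminating null pointer.
  char** envp = new char*[environment.size() + 1];
  std::size_t index = 0;
  for (const auto& pair : environment) {
    const std::string entry = pair.first + "=" + pair.second;
    envp[index] = new char[entry.size() + 1];
    std::memcpy(envp[index], entry.c_str(), entry.size() + 1);
    ++index;
  }
  envp[index] = nullptr;
  return envp;
}

// Frees each entry first and the outer array last. Returns the number
// of entries released, which makes the behavior easy to check.
std::size_t freeEnvp(char** envp) {
  assert(envp != nullptr);
  std::size_t freed = 0;
  for (char** entry = envp; *entry != nullptr; ++entry) {
    delete[] *entry;
    ++freed;
  }
  delete[] envp;
  return freed;
}
```

Keeping allocation and release in one pair of helpers also means each of the call sites listed in the ticket would not need its own cleanup loop.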
[jira] [Updated] (MESOS-3738) Mesos health check is invoked incorrectly when Mesos slave is within the docker container
[ https://issues.apache.org/jira/browse/MESOS-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3738: Fix Version/s: 0.24.2 0.25.1 > Mesos health check is invoked incorrectly when Mesos slave is within the > docker container > - > > Key: MESOS-3738 > URL: https://issues.apache.org/jira/browse/MESOS-3738 > Project: Mesos > Issue Type: Bug > Components: containerization, docker >Affects Versions: 0.25.0 > Environment: Docker 1.8.0: > Client: > Version: 1.8.0 > API version: 1.20 > Go version: go1.4.2 > Git commit: 0d03096 > Built:Tue Aug 11 16:48:39 UTC 2015 > OS/Arch: linux/amd64 > Server: > Version: 1.8.0 > API version: 1.20 > Go version: go1.4.2 > Git commit: 0d03096 > Built:Tue Aug 11 16:48:39 UTC 2015 > OS/Arch: linux/amd64 > Host: Ubuntu 14.04 > Container: Debian 8.1 + Java-7 >Reporter: Yong Tang >Assignee: haosdent > Fix For: 0.26.0, 0.25.1, 0.24.2 > > Attachments: MESOS-3738-0_23_1.patch, MESOS-3738-0_24_1.patch, > MESOS-3738-0_25_0.patch > > > When Mesos slave is within the container, the COMMAND health check from > Marathon is invoked incorrectly. > In such a scenario, the sandbox directory (instead of the > launcher/health-check directory) is used. This results in an error with the > container. 
> Command to invoke the Mesos slave container: > {noformat} > sudo docker run -d -v /sys:/sys -v /usr/bin/docker:/usr/bin/docker:ro -v > /usr/lib/x86_64-linux-gnu/libapparmor.so.1:/usr/lib/x86_64-linux-gnu/libapparmor.so.1:ro > -v /var/run/docker.sock:/var/run/docker.sock -v /tmp/mesos:/tmp/mesos mesos > mesos slave --master=zk://10.2.1.2:2181/mesos --containerizers=docker,mesos > --executor_registration_timeout=5mins --docker_stop_timeout=10secs > --launcher=posix > {noformat} > Marathon JSON file: > {code} > { > "id": "ubuntu", > "container": > { > "type": "DOCKER", > "docker": > { > "image": "ubuntu", > "network": "BRIDGE", > "parameters": [] > } > }, > "args": [ "bash", "-c", "while true; do echo 1; sleep 5; done" ], > "uris": [], > "healthChecks": > [ > { > "protocol": "COMMAND", > "command": { "value": "echo Success" }, > "gracePeriodSeconds": 3000, > "intervalSeconds": 5, > "timeoutSeconds": 5, > "maxConsecutiveFailures": 300 > } > ], > "instances": 1 > } > {code} > {noformat} > STDOUT: > root@cea2be47d64f:/mnt/mesos/sandbox# cat stdout > --container="mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > --sandbox_directory="/tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f" > --stop_timeout="10secs" > --container="mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f" > --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" > --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" > --mapped_directory="/mnt/mesos/sandbox" --quiet="false" > 
--sandbox_directory="/tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f" > --stop_timeout="10secs" > Registered docker executor on b01e2e75afcb > Starting task ubuntu.86bca10f-72c9-11e5-b36d-02420a020106 > 1 > Launching health check process: > /tmp/mesos/slaves/e20f8959-cd9f-40ae-987d-809401309361-S0/frameworks/e20f8959-cd9f-40ae-987d-809401309361-/executors/ubuntu.86bca10f-72c9-11e5-b36d-02420a020106/runs/815cc886-1cd1-4f13-8f9b-54af1f127c3f/mesos-health-check > --executor=(1)@10.2.1.7:40695 > --health_check_json={"command":{"shell":true,"value":"docker exec > mesos-e20f8959-cd9f-40ae-987d-809401309361-S0.815cc886-1cd1-4f13-8f9b-54af1f127c3f > sh -c \" echo Success > \""},"consecutive_failures":300,"delay_seconds":0.0,"grace_period_seconds":3000.0,"interval_seconds":5.0,"timeout_seconds":5.0} > --task_id=ubuntu.86bca10f-72c9-11e5-b36d-02420a020106 > Health check process launched at pid: 94 > 1 > 1 > 1 > 1 > 1 > STDERR: > root@cea2be47d64f:/mnt/mesos/sandbox# cat stderr > I1014 23:15:58.12795056 exec.cpp:134] Version: 0.25.0 > I1014 23:15:58.13062762 exec.cpp:208] Executor registered on slave > e20f8959-cd9f-40ae-987d-809401309361-S0 > WARNING: Your
[jira] [Updated] (MESOS-3560) JSON-based credential files do not work correctly
[ https://issues.apache.org/jira/browse/MESOS-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Park updated MESOS-3560: Fix Version/s: 0.24.2 0.25.1 > JSON-based credential files do not work correctly > - > > Key: MESOS-3560 > URL: https://issues.apache.org/jira/browse/MESOS-3560 > Project: Mesos > Issue Type: Bug > Components: master >Reporter: Michael Park >Assignee: Isabel Jimenez > Labels: mesosphere > Fix For: 0.26.0, 0.25.1, 0.24.2 > > > Specifying the following credentials file: > {code} > { > "credentials": [ > { > "principal": "user", > "secret": "password" > } > ] > } > {code} > Then hitting a master endpoint with: > {code} > curl -i -u "user:password" ... > {code} > Does not work. This is contrary to the text-based credentials file which > works: > {code} > user password > {code} > Currently, the password in a JSON-based credentials file needs to be > base64-encoded in order for it to work: > {code} > { > "credentials": [ > { > "principal": "user", > "secret": "cGFzc3dvcmQ=" > } > ] > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
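The workaround in MESOS-3560 relies on the secret being base64-encoded. As an illustration only, the sketch below is a small standalone encoder showing how "password" becomes the value used in the ticket; `base64Encode` is a hypothetical helper written for this example and is not the encoder Mesos uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical standalone helper: standard base64 encoding with '=' padding.
std::string base64Encode(const std::string& input)
{
  static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

  std::string output;
  output.reserve(((input.size() + 2) / 3) * 4);

  auto byteAt = [&input](std::size_t pos) {
    return static_cast<unsigned int>(static_cast<unsigned char>(input[pos]));
  };

  // Encode every complete 3-byte group as 4 output characters.
  std::size_t i = 0;
  while (i + 2 < input.size()) {
    const unsigned int chunk =
      (byteAt(i) << 16) | (byteAt(i + 1) << 8) | byteAt(i + 2);
    output += kAlphabet[(chunk >> 18) & 0x3F];
    output += kAlphabet[(chunk >> 12) & 0x3F];
    output += kAlphabet[(chunk >> 6) & 0x3F];
    output += kAlphabet[chunk & 0x3F];
    i += 3;
  }

  // Encode the 1 or 2 leftover bytes, padding with '='.
  const std::size_t remaining = input.size() - i;
  if (remaining == 1) {
    const unsigned int chunk = byteAt(i) << 16;
    output += kAlphabet[(chunk >> 18) & 0x3F];
    output += kAlphabet[(chunk >> 12) & 0x3F];
    output += "==";
  } else if (remaining == 2) {
    const unsigned int chunk = (byteAt(i) << 16) | (byteAt(i + 1) << 8);
    output += kAlphabet[(chunk >> 18) & 0x3F];
    output += kAlphabet[(chunk >> 12) & 0x3F];
    output += kAlphabet[(chunk >> 6) & 0x3F];
    output += '=';
  }

  // Padded base64 output is always a multiple of 4 characters long.
  assert(output.size() % 4 == 0);
  return output;
}
```

Calling `base64Encode("password")` yields `cGFzc3dvcmQ=`, the secret shown in the second credentials file above.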
[jira] [Commented] (MESOS-5052) Docker containerizer doesn't export CFS metrics when CFS is enabled.
[ https://issues.apache.org/jira/browse/MESOS-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214747#comment-15214747 ] Steve Niemitz commented on MESOS-5052: -- Review up at https://reviews.apache.org/r/41410/ > Docker containerizer doesn't export CFS metrics when CFS is enabled. > > > Key: MESOS-5052 > URL: https://issues.apache.org/jira/browse/MESOS-5052 > Project: Mesos > Issue Type: Bug >Reporter: Steve Niemitz >Assignee: Steve Niemitz > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5052) Docker containerizer doesn't export CFS metrics when CFS is enabled.
Steve Niemitz created MESOS-5052: Summary: Docker containerizer doesn't export CFS metrics when CFS is enabled. Key: MESOS-5052 URL: https://issues.apache.org/jira/browse/MESOS-5052 Project: Mesos Issue Type: Bug Reporter: Steve Niemitz Assignee: Steve Niemitz -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5029) Add labels to ExecutorInfo
[ https://issues.apache.org/jira/browse/MESOS-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214742#comment-15214742 ] Zhitao Li commented on MESOS-5029: -- Sounds good. The deprecation of {{ExecutorInfo.source}} will just be a documentation change for now. > Add labels to ExecutorInfo > -- > > Key: MESOS-5029 > URL: https://issues.apache.org/jira/browse/MESOS-5029 > Project: Mesos > Issue Type: Improvement >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Minor > Labels: uber > > We want to allow frameworks to populate metadata on the ExecutorInfo object. > A use case would be custom labels inspected by QosController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5030) Expose TaskInfo's metadata to ResourceUsage struct
[ https://issues.apache.org/jira/browse/MESOS-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214739#comment-15214739 ] Zhitao Li commented on MESOS-5030: -- Sounds good. Will start a draft patch this week. > Expose TaskInfo's metadata to ResourceUsage struct > -- > > Key: MESOS-5030 > URL: https://issues.apache.org/jira/browse/MESOS-5030 > Project: Mesos > Issue Type: Improvement > Components: oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li > Labels: qos, uber > > So that QosController can use metadata information from TaskInfo. > Based on conversations in the Mesos working group, we would at least include: > - task id; > - name; > - labels; > (I think resources and kill_policy should probably also be included). > An alternative would be to just purge fields like `data`. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5030) Expose TaskInfo's metadata to ResourceUsage struct
[ https://issues.apache.org/jira/browse/MESOS-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214725#comment-15214725 ] Benjamin Mahler commented on MESOS-5030: I'd suggest we define a stripped {{Task}} message within {{ResourceUsage}} and only expose minimal metadata (e.g. id, name, labels). > Expose TaskInfo's metadata to ResourceUsage struct > -- > > Key: MESOS-5030 > URL: https://issues.apache.org/jira/browse/MESOS-5030 > Project: Mesos > Issue Type: Improvement > Components: oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li > Labels: qos, uber > > So that QosController can use metadata information from TaskInfo. > Based on conversations in the Mesos working group, we would at least include: > - task id; > - name; > - labels; > (I think resources and kill_policy should probably also be included). > An alternative would be to just purge fields like `data`. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
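The "stripped Task" idea suggested for MESOS-5030 can be pictured roughly as follows. This is only a sketch with plain C++ structs standing in for the protobuf messages; `FullTask`, `StrippedTask`, `Label` and `strip` are hypothetical names for this example and do not reflect the eventual `mesos.proto` definition.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a key/value label.
struct Label
{
  std::string key;
  std::string value;
};

// Hypothetical stand-in for the full task description, including the
// potentially large opaque `data` field mentioned in the ticket.
struct FullTask
{
  std::string id;
  std::string name;
  std::vector<Label> labels;
  std::string data;
};

// Hypothetical minimal view exposed to the QoS controller: only the
// metadata listed in the comment (id, name, labels).
struct StrippedTask
{
  std::string id;
  std::string name;
  std::vector<Label> labels;
};

// Copies only the minimal metadata; `data` is intentionally left behind.
StrippedTask strip(const FullTask& task)
{
  assert(!task.id.empty()); // Assumption: every task carries an id.
  StrippedTask stripped;
  stripped.id = task.id;
  stripped.name = task.name;
  stripped.labels = task.labels;
  return stripped;
}
```

Defining a separate, smaller type (rather than purging fields from the full one) makes it explicit which metadata a QoS controller may rely on.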
[jira] [Commented] (MESOS-5029) Add labels to ExecutorInfo
[ https://issues.apache.org/jira/browse/MESOS-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214701#comment-15214701 ] Benjamin Mahler commented on MESOS-5029: Sounds good, rather than introducing a new test, it would be ideal to extend an existing test that ensures the QoSController is getting a complete ResourceUsage (if one exists). I would also suggest that we deprecate {{ExecutorInfo.source}} in favor of using labels. > Add labels to ExecutorInfo > -- > > Key: MESOS-5029 > URL: https://issues.apache.org/jira/browse/MESOS-5029 > Project: Mesos > Issue Type: Improvement >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Minor > Labels: uber > > We want to allow frameworks to populate metadata on the ExecutorInfo object. > A use case would be custom labels inspected by QosController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4061) Flaky tests: docker containerizer tests on debian 8 VM
[ https://issues.apache.org/jira/browse/MESOS-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214689#comment-15214689 ] Greg Mann commented on MESOS-4061: -- I also observed the failure of {{DockerContainerizerTest.ROOT_DOCKER_SkipRecoverNonDocker}} on the Mesosphere internal CI, on both Ubuntu 14 and Ubuntu 12: {code} [16:56:58] : [Step 10/10] [ RUN ] DockerContainerizerTest.ROOT_DOCKER_SkipRecoverNonDocker [16:56:58]W: [Step 10/10] I0328 16:56:58.619204 25413 docker.cpp:740] Recovering Docker containers [16:56:58]W: [Step 10/10] I0328 16:56:58.619329 25413 docker.cpp:919] Running docker -H unix:///var/run/docker.sock ps -a [16:56:58]W: [Step 10/10] I0328 16:56:58.717419 25410 docker.cpp:800] Running docker -H unix:///var/run/docker.sock inspect mesos-1256102e-f3e5-43a0-ab64-c15db3bf243c-S0.fd72cfa3-2234-4a7b-ba09-9a761253c476 [16:56:58]W: [Step 10/10] I0328 16:56:58.822619 25409 docker.cpp:912] Checking if Docker container named '/mesos-1256102e-f3e5-43a0-ab64-c15db3bf243c-S0.fd72cfa3-2234-4a7b-ba09-9a761253c476' was started by Mesos [16:56:58]W: [Step 10/10] I0328 16:56:58.822669 25409 docker.cpp:922] Checking if Mesos container with ID 'fd72cfa3-2234-4a7b-ba09-9a761253c476' has been orphaned [16:56:58]W: [Step 10/10] I0328 16:56:58.822715 25409 docker.cpp:712] Running docker -H unix:///var/run/docker.sock stop -t 0 b02836190a3be54bbebbe5fe7ecd5d2984c35b9112678020a75221b808459742 [16:57:13] : [Step 10/10] ../../src/tests/containerizer/docker_containerizer_tests.cpp:1298: Failure [16:57:13] : [Step 10/10] Failed to wait 15secs for recover [16:57:13]W: [Step 10/10] I0328 16:57:13.621881 25393 docker.cpp:919] Running docker -H unix:///var/run/docker.sock ps -a [16:57:13]W: [Step 10/10] I0328 16:57:13.706827 25412 docker.cpp:800] Running docker -H unix:///var/run/docker.sock inspect mesos-1256102e-f3e5-43a0-ab64-c15db3bf243c-S0.fd72cfa3-2234-4a7b-ba09-9a761253c476 [16:57:13]W: [Step 10/10] I0328 16:57:13.812366 25393 docker.cpp:761] 
Running docker -H unix:///var/run/docker.sock rm -f -v b02836190a3be54bbebbe5fe7ecd5d2984c35b9112678020a75221b808459742 [16:57:13] : [Step 10/10] [ FAILED ] DockerContainerizerTest.ROOT_DOCKER_SkipRecoverNonDocker (15297 ms) {code} > Flaky tests: docker containerizer tests on debian 8 VM > -- > > Key: MESOS-4061 > URL: https://issues.apache.org/jira/browse/MESOS-4061 > Project: Mesos > Issue Type: Bug > Environment: debian 8, vagrant, virtual box >Reporter: Jojy Varghese > > Following tests were failing for 0.26 rc3: > * DockerContainerizerTest.ROOT_DOCKER_NC_PortMapping > * DockerContainerizerTest.ROOT_DOCKER_Recover > * DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer > * DockerContainerizerTest.ROOT_DOCKER_Launch_Executor > * DockerContainerizerTest.ROOT_DOCKER_Launch > * DockerContainerizerTest.ROOT_DOCKER_Usage > * DockerContainerizerTest.ROOT_DOCKER_Update > * DockerContainerizerTest.ROOT_DOCKER_SkipRecoverNonDocker > Note that this is not a comprehensive list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4899) Mesos slave crash after killing docker container
[ https://issues.apache.org/jira/browse/MESOS-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214665#comment-15214665 ] Greg Mann commented on MESOS-4899: -- [~baraldi86], thanks for your help in figuring out this issue. Has your new {{work_dir}} configuration solved the problem? I just want to confirm that this was the root cause so that I can update the docs accordingly. > Mesos slave crash after killing docker container > > > Key: MESOS-4899 > URL: https://issues.apache.org/jira/browse/MESOS-4899 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.27.1 >Reporter: Giulio D'Ippolito >Priority: Blocker > > I have experienced an issue where a Mesos slave crashed because a docker task > could not be killed properly. > I'm using marathon to launch the task. > The setup is the following: > OS version: Centos 7.2 > Docker version: 1.8.2 > Mesos-slave: 0.27.1 > Mesos-master: 0.27.1 > Marathon 0.15.3 > The mesos slave crashed (which is not great). This is the log from the mesos > slave (both mesos slaves had the same issue): > {code} > Mar 07 15:25:16 marathon_mesos-2 mesos-slave[30866]: I0307 15:25:16.441756 > 30870 slave.cpp:1890] Asked to kill task > giulio.httpd.test.tag.a47c55be-db01-11e5-a92a-0242eb705eb2 of framework > ef1354df-7ecc-41ac-82d8 > -d7536e319ea2- > Mar 07 15:25:16 marathon_mesos-2 docker[13019]: > time="2016-03-07T15:25:16.553925302Z" level=info msg="POST > /v1.20/containers/mesos-d0f20f55-bc6e-43e8-babb-250b0176f5f6-S145.104860a3-4630-4cbb-8e68-87ecca13fcad/s > top?t=0" > Mar 07 15:25:16 marathon_mesos-2 docker[13019]: > time="2016-03-07T15:25:16.559932661Z" level=info msg="Container > fe412634ec92bb641a18b4c48d399895f703af29492804b927943646bd81ab8a failed to > exit within 0 seconds of > SIGTERM - using the force" > Mar 07 15:25:16 marathon_mesos-2 systemd[1]: Stopped docker container > fe412634ec92bb641a18b4c48d399895f703af29492804b927943646bd81ab8a. 
> Mar 07 15:25:16 marathon_mesos-2 systemd[1]: Stopping docker container > fe412634ec92bb641a18b4c48d399895f703af29492804b927943646bd81ab8a. > Mar 07 15:25:16 marathon_mesos-2 kernel: docker0: port 4(vetha772033) entered > disabled state > Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (vethd55899a): > failed to find device 106 'vethd55899a' with udev > Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (vethd55899a): > new Veth device (carrier: OFF, driver: 'veth', ifindex: 106) > Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (vetha772033): > link disconnected > Mar 07 15:25:16 marathon_mesos-2 kernel: docker0: port 4(vetha772033) entered > disabled state > Mar 07 15:25:16 marathon_mesos-2 avahi-daemon[11714]: Withdrawing address > record for fe80::b06a:2fff:fecc:87f1 on vetha772033. > Mar 07 15:25:16 marathon_mesos-2 kernel: device vetha772033 left promiscuous > mode > Mar 07 15:25:16 marathon_mesos-2 kernel: docker0: port 4(vetha772033) entered > disabled state > Mar 07 15:25:16 marathon_mesos-2 avahi-daemon[11714]: Withdrawing workstation > service for vethd55899a. > Mar 07 15:25:16 marathon_mesos-2 avahi-daemon[11714]: Withdrawing workstation > service for vetha772033. 
> Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (vethd55899a): > failed to disable userspace IPv6LL address handling > Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (docker0): > bridge port vetha772033 was detached > Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (vetha772033): > released from master docker0 > Mar 07 15:25:16 marathon_mesos-2 NetworkManager[788]: (vetha772033): > failed to disable userspace IPv6LL address handling > Mar 07 15:25:16 marathon_mesos-2 kernel: XFS (dm-8): Unmounting Filesystem > Mar 07 15:25:16 marathon_mesos-2 mesos-slave[30866]: I0307 15:25:16.879240 > 30868 slave.cpp:3001] Handling status update TASK_KILLED (UUID: > 1b9446db-dcdc-47a7-aa05-0372e84e1b4d) for task giulio.httpd.test.tag.a47 > c55be-db01-11e5-a92a-0242eb705eb2 of framework > ef1354df-7ecc-41ac-82d8-d7536e319ea2- from > executor(1)@XXX.XXX.XXX.XXX:57543 > Mar 07 15:25:16 marathon_mesos-2 mesos-slave[30866]: E0307 15:25:16.880144 > 30868 slave.cpp:3205] Failed to update resources for container > 104860a3-4630-4cbb-8e68-87ecca13fcad of executor 'giulio.httpd.test.tag.a > 47c55be-db01-11e5-a92a-0242eb705eb2' running task > giulio.httpd.test.tag.a47c55be-db01-11e5-a92a-0242eb705eb2 on status update > for terminal task, destroying container: Failed to determine cgroup for the > 'cpu' sub > system: Failed to read /proc/4390/cgroup: Failed to open file > '/proc/4390/cgroup': No such file or directory > Mar 07 15:25:16 marathon_mesos-2 mesos-slave[30866]: I0307 15:25:16.880388 > 30868 status_update_manager.cpp:320] Received status update TASK_KILLED > (UUID:
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214659#comment-15214659 ] Adam B commented on MESOS-1739: --- But even changing attributes to a superset (or changing existing values) could require killing tasks, depending on the framework. Imagine your framework is running sensitive tasks on a node, and then the operator tags the node with the "public_internet_access=true" attribute, because the node is now in the open. You would want to be alerted so you could kill/move your sensitive tasks, even though it's a new attribute. For resources, adding new resources would not require frameworks to be notified (beyond the existing offer mechanism) nor require tasks to be killed, because existing tasks are not consuming those resources. Removing resources could require killing tasks, if there are not enough resources left after the change to keep running all tasks. Or the agent might just prevent the operator from reducing resources below current consumption. Adding/changing/removing attributes, however, requires frameworks to be notified. > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that, either via a slave restart or an out-of-process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
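The reasoning in the MESOS-1739 comment can be sketched as two checks: whether a resource change drops below current consumption, and whether the attributes changed at all. The code below is a rough illustration under simplifying assumptions (scalar resources only, attributes as plain strings); `Resources`, `Attributes`, `overcommitted` and `frameworksMustBeNotified` are hypothetical names, not Mesos APIs.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins: scalar resources by name, attributes as strings.
using Resources = std::map<std::string, double>;
using Attributes = std::map<std::string, std::string>;

// Returns the names of resources whose updated total is below what
// running tasks currently consume. An empty result means no task has to
// be killed; adding resources never produces an entry here.
std::vector<std::string> overcommitted(
    const Resources& updatedTotal,
    const Resources& inUse)
{
  std::vector<std::string> result;
  for (const auto& entry : inUse) {
    assert(entry.second >= 0.0); // Consumption cannot be negative.
    const auto it = updatedTotal.find(entry.first);
    const double available = (it == updatedTotal.end()) ? 0.0 : it->second;
    if (entry.second > available) {
      result.push_back(entry.first);
    }
  }
  return result;
}

// Per the comment, any added, changed or removed attribute means
// frameworks should be notified, even if the new set is a superset.
bool frameworksMustBeNotified(
    const Attributes& before,
    const Attributes& after)
{
  return before != after;
}
```

An agent could use the first check either to decide which tasks to kill or, as the comment suggests, to reject a reconfiguration that reduces resources below current consumption.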
[jira] [Updated] (MESOS-4805) Update ry-http-parser-1c3624a to nodejs/http-parser 2.6.1
[ https://issues.apache.org/jira/browse/MESOS-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-4805: -- Shepherd: Vinod Kone (was: Benjamin Mahler) Sprint: Mesosphere Sprint 31 Story Points: 3 > Update ry-http-parser-1c3624a to nodejs/http-parser 2.6.1 > - > > Key: MESOS-4805 > URL: https://issues.apache.org/jira/browse/MESOS-4805 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Chen Zhiwei > > See https://github.com/nodejs/http-parser/releases/tag/v2.6.1. > The motivation is that nodejs/http-parser 2.6.1 has officially supported IBM > Power (ppc64le), so this is needed by > [MESOS-4312|https://issues.apache.org/jira/browse/MESOS-4312]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5005) Make `ReservationInfo.principal` and `Persistence.principal` equivalent
[ https://issues.apache.org/jira/browse/MESOS-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214597#comment-15214597 ] Adam B commented on MESOS-5005: --- [~greggomann], I can shepherd if you have time to work on it this sprint. > Make `ReservationInfo.principal` and `Persistence.principal` equivalent > --- > > Key: MESOS-5005 > URL: https://issues.apache.org/jira/browse/MESOS-5005 > Project: Mesos > Issue Type: Bug >Reporter: Greg Mann > Labels: mesosphere, persistent-volumes, reservations > > Currently, we require that `ReservationInfo.principal` be equal to the > principal provided for authentication, which means that when HTTP > authentication is disabled this field cannot be set. Based on comments in > 'mesos.proto', the original intention was to enforce this same constraint for > `Persistence.principal`, but it seems that we don't enforce it. This should > be changed to make the two fields equivalent. > This means that when HTTP authentication is disabled, requests to '/reserve' > cannot set {{ReservationInfo.principal}}, while requests to `/create-volumes` > can set any principal in {{Persistence.principal}}. One solution would be to > add the constraint to {{Persistence.principal}} when HTTP authentication is > enabled, and remove the constraint from {{ReservationInfo.principal}} when > HTTP authentication is disabled: this would allow us to track a > reserver/creator principal when HTTP authentication is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
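The solution proposed in MESOS-5005 amounts to applying one rule to both `ReservationInfo.principal` and `Persistence.principal`. A minimal sketch of that rule follows; `principalAccepted` is a hypothetical function for illustration, and treating an empty string as "principal not set" is an assumption of this example rather than how Mesos represents the field.

```cpp
#include <cassert>
#include <string>

// Hypothetical check applied identically to both principal fields.
// Assumption: an empty string means "no principal".
bool principalAccepted(
    bool authenticationEnabled,
    const std::string& authenticatedPrincipal,
    const std::string& fieldPrincipal)
{
  // Without authentication there is no authenticated principal.
  assert(authenticationEnabled || authenticatedPrincipal.empty());

  if (!authenticationEnabled) {
    // Proposed relaxation: any principal (or none) may be recorded, so a
    // reserver/creator can still be tracked.
    return true;
  }

  // Proposed constraint: the field must match the authenticated principal.
  return fieldPrincipal == authenticatedPrincipal;
}
```

Routing both the '/reserve' and '/create-volumes' handlers through a single check like this is one way to keep the two fields equivalent.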
[jira] [Updated] (MESOS-3976) C++ HTTP Scheduler Library does not work with SSL enabled
[ https://issues.apache.org/jira/browse/MESOS-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-3976: -- Shepherd: Vinod Kone > C++ HTTP Scheduler Library does not work with SSL enabled > - > > Key: MESOS-3976 > URL: https://issues.apache.org/jira/browse/MESOS-3976 > Project: Mesos > Issue Type: Bug > Components: framework, HTTP API >Reporter: Joseph Wu >Assignee: Anand Mazumdar > Labels: mesosphere, security > > The C++ HTTP scheduler library does not work against Mesos when SSL is > enabled (without downgrade). > The fix should be simple: > * The library should detect if SSL is enabled. > * If SSL is enabled, connections should be made with HTTPS instead of HTTP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
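The fix outlined in MESOS-3976 comes down to choosing the URL scheme from the SSL setting. The sketch below shows only that selection step; `masterUrl` is a hypothetical helper, and how the library actually detects that SSL is enabled is outside this example (the flag is simply passed in).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the master URL, using HTTPS when SSL is
// enabled and HTTP otherwise. The caller supplies `sslEnabled`.
std::string masterUrl(
    bool sslEnabled,
    const std::string& host,
    int port,
    const std::string& path)
{
  assert(port > 0 && port <= 65535);
  const std::string scheme = sslEnabled ? "https" : "http";
  return scheme + "://" + host + ":" + std::to_string(port) + path;
}
```

For example, `masterUrl(true, "master.example", 5050, "/api/v1/scheduler")` produces an `https://` URL, while the same call with `false` produces the `http://` form; the host name here is a placeholder.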
[jira] [Updated] (MESOS-5051) Create helpers for manipulating Linux capabilities.
[ https://issues.apache.org/jira/browse/MESOS-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5051: -- Labels: mesosphere (was: ) > Create helpers for manipulating Linux capabilities. > --- > > Key: MESOS-5051 > URL: https://issues.apache.org/jira/browse/MESOS-5051 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Jojy Varghese > Labels: mesosphere > > These helpers can either be based on some existing library (e.g. libcap), or use > system calls directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5051) Create helpers for manipulating Linux capabilities.
[ https://issues.apache.org/jira/browse/MESOS-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5051: -- Shepherd: Jie Yu > Create helpers for manipulating Linux capabilities. > --- > > Key: MESOS-5051 > URL: https://issues.apache.org/jira/browse/MESOS-5051 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Jojy Varghese > Labels: mesosphere > > These helpers can either be based on some existing library (e.g. libcap), or use > system calls directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5050) Design Linux capability support for Mesos containerizer
[ https://issues.apache.org/jira/browse/MESOS-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5050: -- Labels: mesosphere (was: ) > Design Linux capability support for Mesos containerizer > --- > > Key: MESOS-5050 > URL: https://issues.apache.org/jira/browse/MESOS-5050 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Jojy Varghese > Labels: mesosphere > > We should at least support the following cases: > 1) A root user has reduced capability > 2) A non-root user has the capability of CAP_NET_ADMIN (to do e.g., tcpdump) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-4950) Implement reconnect functionality in the scheduler library.
[ https://issues.apache.org/jira/browse/MESOS-4950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar reassigned MESOS-4950: - Assignee: Anand Mazumdar > Implement reconnect functionality in the scheduler library. > -- > > Key: MESOS-4950 > URL: https://issues.apache.org/jira/browse/MESOS-4950 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar >Assignee: Anand Mazumdar > Labels: mesosphere > > Currently, there is no way for schedulers to force a reconnection attempt > with the master using the scheduler library {{src/scheduler/scheduler.cpp}}. > This is specifically useful in scenarios where there is a one-way network > partition with the master, so that the scheduler has not received any > {{HEARTBEAT}} events from the master. In this case, the scheduler might want > to force a reconnection attempt with the master instead of relying on the > {{disconnected}} callback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
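The scenario in MESOS-4950 (no {{HEARTBEAT}} events arriving during a one-way partition) suggests a scheduler-side timer that decides when to force a reconnect. The following is a sketch of that decision only; `HeartbeatMonitor` is a hypothetical class, the "number of missed heartbeats" threshold is an assumption of this example, and the reconnect call itself is left to whatever API the library ends up exposing.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical scheduler-side helper: tracks the last HEARTBEAT and
// reports when enough intervals have passed to warrant a reconnect.
class HeartbeatMonitor
{
public:
  using Clock = std::chrono::steady_clock;

  HeartbeatMonitor(
      std::chrono::milliseconds interval,
      int missedAllowed,
      Clock::time_point start)
    : timeout_(interval * missedAllowed), last_(start)
  {
    assert(interval.count() > 0 && missedAllowed > 0);
  }

  // Call whenever a HEARTBEAT event is received from the master.
  void heartbeat(Clock::time_point now) { last_ = now; }

  // True once more than `missedAllowed` intervals have elapsed without a
  // heartbeat; the caller would then force a reconnection attempt.
  bool shouldReconnect(Clock::time_point now) const
  {
    return now - last_ > timeout_;
  }

private:
  std::chrono::milliseconds timeout_;
  Clock::time_point last_;
};
```

Passing the current time in explicitly, instead of reading the clock inside the class, keeps the logic easy to test without sleeping.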
[jira] [Updated] (MESOS-4053) MemoryPressureMesosTest tests fail on CentOS 6.6
[ https://issues.apache.org/jira/browse/MESOS-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-4053: - Story Points: 3 > MemoryPressureMesosTest tests fail on CentOS 6.6 > > > Key: MESOS-4053 > URL: https://issues.apache.org/jira/browse/MESOS-4053 > Project: Mesos > Issue Type: Bug > Environment: CentOS 6.6 >Reporter: Greg Mann >Assignee: Greg Mann > Labels: mesosphere, test-failure > > {{MemoryPressureMesosTest.CGROUPS_ROOT_Statistics}} and > {{MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery}} fail on CentOS 6.6. It > seems that mounted cgroups are not properly cleaned up after previous tests, > so multiple hierarchies are detected and thus an error is produced: > {code} > [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics > ../../src/tests/mesos.cpp:849: Failure > Value of: _baseHierarchy.get() > Actual: "/cgroup" > Expected: baseHierarchy > Which is: "/tmp/mesos_test_cgroup" > - > Multiple cgroups base hierarchies detected: > '/tmp/mesos_test_cgroup' > '/cgroup' > Mesos does not support multiple cgroups base hierarchies. > Please unmount the corresponding (or all) subsystems. > - > ../../src/tests/mesos.cpp:932: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/tmp/mesos_test_cgroup/perf_event/mesos_test': Device or resource busy > [ FAILED ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics (12 ms) > [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery > ../../src/tests/mesos.cpp:849: Failure > Value of: _baseHierarchy.get() > Actual: "/cgroup" > Expected: baseHierarchy > Which is: "/tmp/mesos_test_cgroup" > - > Multiple cgroups base hierarchies detected: > '/tmp/mesos_test_cgroup' > '/cgroup' > Mesos does not support multiple cgroups base hierarchies. > Please unmount the corresponding (or all) subsystems. 
> - > ../../src/tests/mesos.cpp:932: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/tmp/mesos_test_cgroup/perf_event/mesos_test': Device or resource busy > [ FAILED ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (7 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4053) MemoryPressureMesosTest tests fail on CentOS 6.6
[ https://issues.apache.org/jira/browse/MESOS-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4053: - Assignee: Greg Mann (was: Benjamin Hindman) > MemoryPressureMesosTest tests fail on CentOS 6.6 > > > Key: MESOS-4053 > URL: https://issues.apache.org/jira/browse/MESOS-4053 > Project: Mesos > Issue Type: Bug > Environment: CentOS 6.6 >Reporter: Greg Mann >Assignee: Greg Mann > Labels: mesosphere, test-failure > > {{MemoryPressureMesosTest.CGROUPS_ROOT_Statistics}} and > {{MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery}} fail on CentOS 6.6. It > seems that mounted cgroups are not properly cleaned up after previous tests, > so multiple hierarchies are detected and thus an error is produced: > {code} > [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics > ../../src/tests/mesos.cpp:849: Failure > Value of: _baseHierarchy.get() > Actual: "/cgroup" > Expected: baseHierarchy > Which is: "/tmp/mesos_test_cgroup" > - > Multiple cgroups base hierarchies detected: > '/tmp/mesos_test_cgroup' > '/cgroup' > Mesos does not support multiple cgroups base hierarchies. > Please unmount the corresponding (or all) subsystems. > - > ../../src/tests/mesos.cpp:932: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/tmp/mesos_test_cgroup/perf_event/mesos_test': Device or resource busy > [ FAILED ] MemoryPressureMesosTest.CGROUPS_ROOT_Statistics (12 ms) > [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery > ../../src/tests/mesos.cpp:849: Failure > Value of: _baseHierarchy.get() > Actual: "/cgroup" > Expected: baseHierarchy > Which is: "/tmp/mesos_test_cgroup" > - > Multiple cgroups base hierarchies detected: > '/tmp/mesos_test_cgroup' > '/cgroup' > Mesos does not support multiple cgroups base hierarchies. > Please unmount the corresponding (or all) subsystems. 
> - > ../../src/tests/mesos.cpp:932: Failure > (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup > '/tmp/mesos_test_cgroup/perf_event/mesos_test': Device or resource busy > [ FAILED ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (7 ms) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5030) Expose TaskInfo's metadata to ResourceUsage struct
[ https://issues.apache.org/jira/browse/MESOS-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lambert updated MESOS-5030: - Labels: qos uber (was: qos) > Expose TaskInfo's metadata to ResourceUsage struct > -- > > Key: MESOS-5030 > URL: https://issues.apache.org/jira/browse/MESOS-5030 > Project: Mesos > Issue Type: Improvement > Components: oversubscription >Reporter: Zhitao Li >Assignee: Zhitao Li > Labels: qos, uber > > So that QosController can use metadata information from TaskInfo. > Based on conversations in the Mesos working group, we would at least include: > - task id; > - name; > - labels; > (I think resources and kill_policy should probably also be included). > An alternative would be to just purge fields like `data`. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5029) Add labels to ExecutorInfo
[ https://issues.apache.org/jira/browse/MESOS-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lambert updated MESOS-5029: - Labels: uber (was: ) > Add labels to ExecutorInfo > -- > > Key: MESOS-5029 > URL: https://issues.apache.org/jira/browse/MESOS-5029 > Project: Mesos > Issue Type: Improvement >Reporter: Zhitao Li >Assignee: Zhitao Li >Priority: Minor > Labels: uber > > We want to allow frameworks to populate metadata on the ExecutorInfo object. > A use case would be custom labels inspected by the QosController. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4760) Expose metrics and gauges for fetcher cache usage and hit rate
[ https://issues.apache.org/jira/browse/MESOS-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lambert updated MESOS-4760: - Labels: features fetcher statistics uber (was: features fetcher statistics) > Expose metrics and gauges for fetcher cache usage and hit rate > -- > > Key: MESOS-4760 > URL: https://issues.apache.org/jira/browse/MESOS-4760 > Project: Mesos > Issue Type: Improvement > Components: fetcher, statistics >Reporter: Michael Browning >Priority: Minor > Labels: features, fetcher, statistics, uber > > To evaluate the fetcher cache and calibrate the value of the > fetcher_cache_size flag, it would be useful to have metrics and gauges on > agents that expose operational statistics like cache hit rate, occupied cache > size, and time spent downloading resources that were not present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
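The statistics named in this ticket (cache hit rate, occupied cache size, time spent downloading on a miss) can be illustrated with a few counters. This is only a sketch in Python under stated assumptions: the class `FetcherCacheStats` and its method names are hypothetical, it does not model eviction, and it is not the Mesos fetcher or its metrics API.

```python
class FetcherCacheStats:
    """Hypothetical counters for the statistics listed in the ticket."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes  # stands in for fetcher_cache_size
        self.occupied_bytes = 0
        self.hits = 0
        self.misses = 0
        self.download_seconds = 0.0

    def record_hit(self):
        # The requested resource was already in the cache.
        self.hits += 1

    def record_miss(self, size_bytes, seconds):
        # The resource had to be downloaded and was then stored in the cache.
        self.misses += 1
        self.occupied_bytes += size_bytes
        self.download_seconds += seconds

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def snapshot(self):
        # Gauge-style view that an agent could expose on a metrics endpoint.
        return {
            "hit_rate": self.hit_rate(),
            "occupied_bytes": self.occupied_bytes,
            "occupied_fraction": self.occupied_bytes / self.capacity_bytes,
            "download_seconds": self.download_seconds,
        }
```

A persistently low `hit_rate` together with an `occupied_fraction` close to 1 would suggest that the configured cache size is too small for the workload.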
[jira] [Updated] (MESOS-4802) Update leveldb patch file to suport PowerPC LE
[ https://issues.apache.org/jira/browse/MESOS-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-4802: -- Story Points: 3 > Update leveldb patch file to suport PowerPC LE > -- > > Key: MESOS-4802 > URL: https://issues.apache.org/jira/browse/MESOS-4802 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Chen Zhiwei > > See: https://github.com/google/leveldb/releases/tag/v1.18 for improvements / > bug fixes. > The motivation is that leveldb 1.18 officially supports IBM Power > (ppc64le), so this is needed by > [MESOS-4312|https://issues.apache.org/jira/browse/MESOS-4312]. > Update: Since someone has already updated leveldb to 1.4, I will only update the patch file > to support PowerPC LE, because I don't think upgrading a 3rdparty library > frequently is a good thing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4803) Update vendored libev to 4.22
[ https://issues.apache.org/jira/browse/MESOS-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-4803: -- Story Points: 3 > Update vendored libev to 4.22 > - > > Key: MESOS-4803 > URL: https://issues.apache.org/jira/browse/MESOS-4803 > Project: Mesos > Issue Type: Improvement >Reporter: Qian Zhang >Assignee: Chen Zhiwei > > The motivation is that libev 4.22 officially supports IBM Power > (ppc64le), so this is needed by > [MESOS-4312|https://issues.apache.org/jira/browse/MESOS-4312]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4678) Upgrade vendored Protobuf to 2.6.1
[ https://issues.apache.org/jira/browse/MESOS-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-4678: -- Story Points: 3 > Upgrade vendored Protobuf to 2.6.1 > -- > > Key: MESOS-4678 > URL: https://issues.apache.org/jira/browse/MESOS-4678 > Project: Mesos > Issue Type: Improvement > Components: general >Reporter: Neil Conway >Assignee: Chen Zhiwei > Labels: 3rdParty, mesosphere, protobuf, tech-debt > > We currently vendor Protobuf 2.5.0. We should upgrade to Protobuf 2.6.1. This > introduces various bugfixes, performance improvements, and at least one new > feature we might want to eventually take advantage of ({{map}} data type). > AFAIK there should be no backward compatibility concerns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3780) Replace Master/Slave Terminology Phase I - Update all strings output
[ https://issues.apache.org/jira/browse/MESOS-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214552#comment-15214552 ] Vinod Kone commented on MESOS-3780: --- [~dongdong] good to see that you are thinking about breaking the ticket. Can you actually close this ticket and create 3 separate tickets? That way it will be easy to track the work. > Replace Master/Slave Terminology Phase I - Update all strings output > > > Key: MESOS-3780 > URL: https://issues.apache.org/jira/browse/MESOS-3780 > Project: Mesos > Issue Type: Task >Reporter: Diana Arroyo >Assignee: zhou xing > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5051) Create helpers for manipulating Linux capabilities.
Jie Yu created MESOS-5051: - Summary: Create helpers for manipulating Linux capabilities. Key: MESOS-5051 URL: https://issues.apache.org/jira/browse/MESOS-5051 Project: Mesos Issue Type: Task Reporter: Jie Yu Assignee: Jojy Varghese These helpers can either be based on some existing library (e.g. libcap) or use system calls directly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5050) Design Linux capability support for Mesos containerizer
Jie Yu created MESOS-5050: - Summary: Design Linux capability support for Mesos containerizer Key: MESOS-5050 URL: https://issues.apache.org/jira/browse/MESOS-5050 Project: Mesos Issue Type: Task Reporter: Jie Yu Assignee: Jojy Varghese We should at least support the following cases: 1) A root user has reduced capability 2) A non-root user has the capability of CAP_NET_ADMIN (to do e.g., tcpdump) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
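Both cases listed above come down to a process holding a specific subset of capability bits. The following is a small Python sketch that only inspects such a bitmask; it does not set or drop capabilities and is not a proposed Mesos interface. The helper names are hypothetical. The bit index 12 for `CAP_NET_ADMIN` comes from the Linux capability header, and the `CapEff` field is read from `/proc/self/status`, which exists on Linux only.

```python
CAP_NET_ADMIN = 12  # bit index as defined in <linux/capability.h>


def has_capability(cap_mask_hex, cap):
    """Return True if bit `cap` is set in a hexadecimal capability mask."""
    return bool((int(cap_mask_hex, 16) >> cap) & 1)


def effective_caps_of_self():
    """Return the CapEff mask of the current process as a hex string.

    Returns None when /proc/self/status is unavailable (non-Linux systems).
    """
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("CapEff:"):
                    return line.split()[1]
    except OSError:
        pass
    return None


mask = effective_caps_of_self()
if mask is not None:
    print("CAP_NET_ADMIN effective:", has_capability(mask, CAP_NET_ADMIN))
```

Case 1 (a root user with reduced capabilities) would show up as a mask with some bits cleared despite UID 0; case 2 (a non-root user with CAP_NET_ADMIN) as a mask with bit 12 set for a non-zero UID.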
[jira] [Comment Edited] (MESOS-5049) Refactore subproces setup functions.
[ https://issues.apache.org/jira/browse/MESOS-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214464#comment-15214464 ]

Joerg Schad edited comment on MESOS-5049 at 3/28/16 5:00 PM:

{noformat}
commit f8364bb45b651a1721985cf9ca9099eeaca0461f
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 16:52:00 2016 +0200

    Subprocess: [1/7] Refactored setup functions.

    Executing arbitrary setup functions while creating new processes is dangerous, as all functions called have to be async-signal-safe. As setup functions are used for only very few purposes (setsid, chdir, monitoring and killing a process (see upcoming review)), it makes sense to support them safely via parameters to subprocess. Note that this review by itself (without the following ones removing the uses of the old interface) will break the build.

    Review: https://reviews.apache.org/r/45230/

commit 5043f0d63425c29d4013c71d04a57ade25fe6996
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:00:22 2016 +0200

    Subprocess: [2/7] Removed the use of setup functions.

    This review follows the previous one and removes most (see following reviews) usages of setup functions throughout the code.

    Review: https://reviews.apache.org/r/45231/

commit a3d1bd49a3456c4be19292293f5bd2c76ac46632
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:02:06 2016 +0200

    Subprocess: [3/7] Introduced watchdog option.

    Some newly created processes, such as perf, should be killed in case the parent dies. Currently this is achieved by forking a new process from the child process, which serves as a 'watchdog' and kills the child if the parent dies. This review introduces this as a general behavior into subprocess (and hence removes the need for the custom setup function).

    Review: https://reviews.apache.org/r/45232/

commit dbddb6bd4dafebb46efe64cf3c2424390bbcff6b
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:02:12 2016 +0200

    Subprocess: [4/7] Refactored perf test without setup function.

    With the newly introduced watchdog option there is no need for the child setup function in the perf code anymore.

    Review: https://reviews.apache.org/r/45233/

commit 85f6145e5ea34c2574e07ec97413a7010e76befe
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:02:18 2016 +0200

    Subprocess: [5/7] Introduced parentHooks to fork calls.

    So far subprocess supports parentHooks. This review adds this option also to fork().

    Review: https://reviews.apache.org/r/45235/

commit 0e2895c744ea2ae19ae72c2597862efefe0c2479
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:02:23 2016 +0200

    Subprocess: [6/7] Refactored isolator tests to use parentHook.

    In the isolator tests, the parent process isolates the child while the child is blocked. This is exactly the pattern of a parentHook.

    Review: https://reviews.apache.org/r/45236/

commit 5914f571ea569777c8bda497a3eb1e6878ea8eb5
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:35:19 2016 +0200

    Subprocess: [7/7] Added watchdog to 'du' disk isolator process.

    The disk isolator process should also be killed when the parent process dies.

    Review: https://reviews.apache.org/r/45245/
{noformat}

was (Author: js84):
'''commit 5914f571ea569777c8bda497a3eb1e6878ea8eb5
Author: Joerg Schad jo...@mesosphere.io
Date: Mon Mar 28 17:35:19 2016 +0200

    Subprocess: [7/7] Added watchdog to 'du' disk isolator process.

    The disk isolator process should also be killed when the parent process dies.

    Review: https://reviews.apache.org/r/45245/
'''

> Refactore subproces setup functions.
>
> Key: MESOS-5049
> URL: https://issues.apache.org/jira/browse/MESOS-5049
> Project: Mesos
> Issue Type: Improvement
> Reporter: Joerg Schad
> Assignee: Joerg Schad
>
> Executing arbitrary setup functions while creating new processes is
> dangerous, as all functions called have to be async-signal-safe. As setup
> functions are used for only very few purposes (setsid, chdir, monitoring
> and killing a process (see upcoming review)), it makes sense to support
> them safely via parameters to subprocess.
> Another common use of child setup is to block the child while doing some
> work in the parent. This pattern can be more cleanly expressed with
> parentHooks.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5049) Refactore subproces setup functions.
[ https://issues.apache.org/jira/browse/MESOS-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214464#comment-15214464 ] Joerg Schad commented on MESOS-5049: '''commit 5914f571ea569777c8bda497a3eb1e6878ea8eb5 Author: Joerg Schad jo...@mesosphere.io Date: Mon Mar 28 17:35:19 2016 +0200 Subprocess: [7/7] Added watchdog to 'du' disk isolator process. The disk isolator process should also be killed when the parent process dies. Review: https://reviews.apache.org/r/45245/ ''' > Refactore subproces setup functions. > > > Key: MESOS-5049 > URL: https://issues.apache.org/jira/browse/MESOS-5049 > Project: Mesos > Issue Type: Improvement >Reporter: Joerg Schad >Assignee: Joerg Schad > > Executing arbitrary setup functions while creating new processes is > dangerous, as all functions called have to be async-signal-safe. As setup > functions are used for only very few purposes (setsid, chdir, monitoring > and killing a process (see upcoming review)), it makes sense to support > them safely via parameters to subprocess. > Another common use of child setup is to block the child while doing some > work in the parent. This pattern can be more cleanly expressed with > parentHooks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-5049) Refactore subproces setup functions.
Joerg Schad created MESOS-5049: -- Summary: Refactore subproces setup functions. Key: MESOS-5049 URL: https://issues.apache.org/jira/browse/MESOS-5049 Project: Mesos Issue Type: Improvement Reporter: Joerg Schad Assignee: Joerg Schad Executing arbitrary setup functions while creating new processes is dangerous, as all functions called have to be async-signal-safe. As setup functions are used for only very few purposes (setsid, chdir, monitoring and killing a process (see upcoming review)), it makes sense to support them safely via parameters to subprocess. Another common use of child setup is to block the child while doing some work in the parent. This pattern can be more cleanly expressed with parentHooks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
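The "block the child while the parent does some work" pattern described above can be sketched with a pipe. The following Python example is an illustration only: the function `fork_with_parent_hooks` and its signature are hypothetical and do not reproduce the libprocess `subprocess` interface. It assumes a POSIX system, and that each hook is a callable taking the child's pid.

```python
import os


def fork_with_parent_hooks(child_main, parent_hooks):
    """Fork a child that stays blocked until the parent has run every hook.

    child_main: callable run in the child; its return value is the exit code.
    parent_hooks: callables taking the child's pid, run in the parent.
    Returns the child's pid; the caller is responsible for waiting on it.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block on the pipe until the parent signals completion.
        os.close(write_fd)
        code = 1
        try:
            if os.read(read_fd, 1) == b"1":
                code = int(child_main() or 0)
        finally:
            # Leave without running the parent's cleanup handlers.
            os._exit(code)
    # Parent: run the hooks (e.g. isolate the child), then unblock it.
    os.close(read_fd)
    try:
        for hook in parent_hooks:
            hook(pid)
        os.write(write_fd, b"1")
    finally:
        # If a hook raised, closing without writing makes the child's
        # read() return EOF, so it exits with code 1 instead of running.
        os.close(write_fd)
    return pid
```

The child executes nothing of its own before the hooks finish, which is the property that the isolator tests rely on; a hook could, for example, place the child's pid into a cgroup before the child starts its real work.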
[jira] [Updated] (MESOS-5023) MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky.
[ https://issues.apache.org/jira/browse/MESOS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5023: -- Priority: Critical (was: Major) > MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky. > > > Key: MESOS-5023 > URL: https://issues.apache.org/jira/browse/MESOS-5023 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Alexander Rukletsov >Assignee: Gilbert Song >Priority: Critical > Labels: mesosphere > Fix For: 0.28.1 > > > Observed on the Apache Jenkins. > {noformat} > [ RUN ] MesosContainerizerProvisionerTest.ProvisionFailed > I0324 13:38:56.284261 2948 containerizer.cpp:666] Starting container > 'test_container' for executor 'executor' of framework '' > I0324 13:38:56.285825 2939 containerizer.cpp:1421] Destroying container > 'test_container' > I0324 13:38:56.285854 2939 containerizer.cpp:1424] Waiting for the > provisioner to complete for container 'test_container' > [ OK ] MesosContainerizerProvisionerTest.ProvisionFailed (7 ms) > [ RUN ] MesosContainerizerProvisionerTest.DestroyWhileProvisioning > I0324 13:38:56.291187 2944 containerizer.cpp:666] Starting container > 'c2316963-c6cb-4c7f-a3b9-17ca5931e5b2' for executor 'executor' of framework '' > I0324 13:38:56.292157 2944 containerizer.cpp:1421] Destroying container > 'c2316963-c6cb-4c7f-a3b9-17ca5931e5b2' > I0324 13:38:56.292179 2944 containerizer.cpp:1424] Waiting for the > provisioner to complete for container 'c2316963-c6cb-4c7f-a3b9-17ca5931e5b2' > F0324 13:38:56.292899 2944 containerizer.cpp:752] Check failed: > containers_.contains(containerId) > *** Check failure stack trace: *** > @ 0x2ac9973d0ae4 google::LogMessage::Fail() > @ 0x2ac9973d0a30 google::LogMessage::SendToLog() > @ 0x2ac9973d0432 google::LogMessage::Flush() > @ 0x2ac9973d3346 google::LogMessageFatal::~LogMessageFatal() > @ 0x2ac996af897c > mesos::internal::slave::MesosContainerizerProcess::_launch() > @ 0x2ac996b1f18a > 
_ZZN7process8dispatchIbN5mesos8internal5slave25MesosContainerizerProcessERKNS1_11ContainerIDERK6OptionINS1_8TaskInfoEERKNS1_12ExecutorInfoERKSsRKS8_ISsERKNS1_7SlaveIDERKNS_3PIDINS3_5SlaveEEEbRKS8_INS3_13ProvisionInfoEES5_SA_SD_SsSI_SL_SQ_bSU_EENS_6FutureIT_EERKNSO_IT0_EEMS10_FSZ_T1_T2_T3_T4_T5_T6_T7_T8_T9_ET10_T11_T12_T13_T14_T15_T16_T17_T18_ENKUlPNS_11ProcessBaseEE_clES1P_ > @ 0x2ac996b479d9 > _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIbN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERK6OptionINS5_8TaskInfoEERKNS5_12ExecutorInfoERKSsRKSC_ISsERKNS5_7SlaveIDERKNS0_3PIDINS7_5SlaveEEEbRKSC_INS7_13ProvisionInfoEES9_SE_SH_SsSM_SP_SU_bSY_EENS0_6FutureIT_EERKNSS_IT0_EEMS14_FS13_T1_T2_T3_T4_T5_T6_T7_T8_T9_ET10_T11_T12_T13_T14_T15_T16_T17_T18_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_ > @ 0x2ac997334fef std::function<>::operator()() > @ 0x2ac99731b1c7 process::ProcessBase::visit() > @ 0x2ac997321154 process::DispatchEvent::visit() > @ 0x9a699c process::ProcessBase::serve() > @ 0x2ac9973173c0 process::ProcessManager::resume() > @ 0x2ac99731445a > _ZZN7process14ProcessManager12init_threadsEvENKUlRKSt11atomic_boolE_clES3_ > @ 0x2ac997320916 > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEE6__callIvIEILm0T_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE > @ 0x2ac9973208c6 > _ZNSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS3_EEEclIIEvEET0_DpOT_ > @ 0x2ac997320858 > _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEE9_M_invokeIIEEEvSt12_Index_tupleIIXspT_EEE > @ 0x2ac9973207af > _ZNSt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS4_EEEvEEclEv > @ 0x2ac997320748 > 
_ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv > @ 0x2ac9989aea60 (unknown) > @ 0x2ac999125182 start_thread > @ 0x2ac99943547d (unknown) > make[4]: Leaving directory `/mesos/mesos-0.29.0/_build/src' > make[4]: *** [check-local] Aborted > make[3]: *** [check-am] Error 2 > make[3]: Leaving directory `/mesos/mesos-0.29.0/_build/src' > make[2]: *** [check] Error 2 > make[2]: Leaving directory `/mesos/mesos-0.29.0/_build/src' > make[1]: *** [check-recursive] Error 1 > make[1]: Leaving directory `/mesos/mesos-0.29.0/_build' > make: *** [distcheck] Error 1 > Build step 'Execute shell'
[jira] [Updated] (MESOS-5028) Copy provisioner cannot replace directory with symlink
[ https://issues.apache.org/jira/browse/MESOS-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5028: -- Sprint: Mesosphere Sprint 32 > Copy provisioner cannot replace directory with symlink > -- > > Key: MESOS-5028 > URL: https://issues.apache.org/jira/browse/MESOS-5028 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Zhitao Li >Assignee: Gilbert Song > > I'm trying out the new image provisioner on our custom docker > images, but one of the layers failed to be copied, possibly due to a dangling > symlink. > Error log with GLOG_v=1: > {quote} > I0324 05:42:48.926678 15067 copy.cpp:127] Copying layer path > '/tmp/mesos/store/docker/layers/5df0888641196b88dcc1b97d04c74839f02a73b8a194a79e134426d6a8fcb0f1/rootfs' > to rootfs > '/var/lib/mesos/provisioner/containers/5f05be6c-c970-4539-aa64-fd0eef2ec7ae/backends/copy/rootfses/507173f3-e316-48a3-a96e-5fdea9ffe9f6' > E0324 05:42:49.028506 15062 slave.cpp:3773] Container > '5f05be6c-c970-4539-aa64-fd0eef2ec7ae' for executor 'test' of framework > 75932a89-1514-4011-bafe-beb6a208bb2d-0004 failed to start: Collect failed: > Collect failed: Failed to copy layer: cp: cannot overwrite directory > ‘/var/lib/mesos/provisioner/containers/5f05be6c-c970-4539-aa64-fd0eef2ec7ae/backends/copy/rootfses/507173f3-e316-48a3-a96e-5fdea9ffe9f6/etc/apt’ > with non-directory > {quote} > The content of > _/tmp/mesos/store/docker/layers/5df0888641196b88dcc1b97d04c74839f02a73b8a194a79e134426d6a8fcb0f1/rootfs/etc/apt_ > points to a nonexistent absolute path (I cannot provide the exact path, but it is the > result of us trying to mount apt keys into the docker container at build time). > I believe what happened is that we executed a script at build time which > contains the equivalent of: > {quote} > rm -rf /etc/apt/* && ln -sf /build-mount-point/ /etc/apt > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
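The `cp` error above occurs because the lower layers left a real directory at `etc/apt` while the new layer carries a symlink at the same path. One possible way to handle this, sketched in Python under stated assumptions, is to remove the existing directory before recreating the link. The function `apply_layer` is hypothetical, it is not the Mesos copy backend, and it ignores other conflicts (for example a regular file replacing a directory) as well as whiteout files.

```python
import os
import shutil


def apply_layer(layer_root, rootfs):
    """Copy layer_root onto rootfs, letting a symlink replace a directory."""
    # os.walk does not follow symlinks by default, so links to directories
    # show up in dirnames and dangling links show up in filenames.
    for dirpath, dirnames, filenames in os.walk(layer_root):
        rel = os.path.relpath(dirpath, layer_root)
        target_dir = os.path.normpath(os.path.join(rootfs, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if os.path.islink(src):
                # Remove whatever currently occupies the destination path.
                if os.path.isdir(dst) and not os.path.islink(dst):
                    shutil.rmtree(dst)
                elif os.path.lexists(dst):
                    os.remove(dst)
                # Recreate the link as-is, even if its target does not exist.
                os.symlink(os.readlink(src), dst)
            elif os.path.isfile(src):
                shutil.copy2(src, dst)
            # Real directories are created when os.walk descends into them.
```

With this approach a dangling link such as `etc/apt -> /build-mount-point/` is preserved verbatim in the resulting rootfs rather than aborting the copy.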
[jira] [Updated] (MESOS-5023) MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky.
[ https://issues.apache.org/jira/browse/MESOS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5023: -- Story Points: 2 > MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky. > > > Key: MESOS-5023 > URL: https://issues.apache.org/jira/browse/MESOS-5023 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Alexander Rukletsov >Assignee: Gilbert Song > Labels: mesosphere > Fix For: 0.28.1 > > > Observed on the Apache Jenkins.
[jira] [Updated] (MESOS-5023) MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky.
[ https://issues.apache.org/jira/browse/MESOS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5023: -- Assignee: Gilbert Song (was: Klaus Ma) > MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky. > > > Key: MESOS-5023 > URL: https://issues.apache.org/jira/browse/MESOS-5023 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Alexander Rukletsov >Assignee: Gilbert Song > Labels: mesosphere > Fix For: 0.28.1 > > > Observed on the Apache Jenkins.
[jira] [Updated] (MESOS-5023) MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky.
[ https://issues.apache.org/jira/browse/MESOS-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-5023: -- Shepherd: Jie Yu Component/s: containerization > MesosContainerizerProvisionerTest.DestroyWhileProvisioning is flaky. > > > Key: MESOS-5023 > URL: https://issues.apache.org/jira/browse/MESOS-5023 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Alexander Rukletsov >Assignee: Klaus Ma > Labels: mesosphere > Fix For: 0.28.1 > > > Observed on the Apache Jenkins.
[jira] [Commented] (MESOS-1739) Allow slave reconfiguration on restart
[ https://issues.apache.org/jira/browse/MESOS-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214389#comment-15214389 ] Greg Mann commented on MESOS-1739: -- For a first step, one approach would be to limit the scope of this work so that it's only possible to change the slave resources/attributes to a superset of their previous values. As you can see in review #25525, that was the original approach because it makes things much simpler. In fact, though it may not be *strictly* necessary to kill tasks if attributes are removed from a slave, consider the following scenario: a task was started on a slave with a particular attribute because that attribute indicates that the slave has access to a certain region of the network. If the attribute is removed because this region is no longer accessible, then it would make sense to kill the task which has that dependency. In this case, the master could notify the framework that the attributes were changed and the framework could take action as appropriate. So I think it might make sense to initially just implement a reconfiguration to a superset of previous attributes/resources. Do you have a specific use case for changing attributes/resources to a *subset* of their previous values? > Allow slave reconfiguration on restart > -- > > Key: MESOS-1739 > URL: https://issues.apache.org/jira/browse/MESOS-1739 > Project: Mesos > Issue Type: Epic >Reporter: Patrick Reilly > Labels: external-volumes, mesosphere, myriad > > Make it so that either via a slave restart or an out-of-process "reconfigure" > ping, the attributes and resources of a slave can be updated to be a superset > of what they used to be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
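The superset-only restriction discussed above could be checked roughly as follows. This is a minimal sketch, not the code from review #25525: the `Attributes` and `Resources` aliases, the function names, and the simplification of resources to named scalar amounts are all assumptions made for illustration (real Mesos resources also include ranges and sets).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified model (not the Mesos types):
// attribute name -> value, resource name -> scalar amount.
using Attributes = std::map<std::string, std::string>;
using Resources = std::map<std::string, double>;

// True if every previous attribute is still present with the same value.
bool attributesSuperset(const Attributes& updated, const Attributes& previous)
{
  for (const auto& entry : previous) {
    auto it = updated.find(entry.first);
    if (it == updated.end() || it->second != entry.second) {
      return false;
    }
  }
  return true;
}

// True if every previous resource is still present with at least the
// previous amount.
bool resourcesSuperset(const Resources& updated, const Resources& previous)
{
  for (const auto& entry : previous) {
    auto it = updated.find(entry.first);
    if (it == updated.end() || it->second < entry.second) {
      return false;
    }
  }
  return true;
}

// A restart with new flags is accepted only if nothing was removed or shrunk.
bool reconfigurationAllowed(
    const Attributes& previousAttributes,
    const Resources& previousResources,
    const Attributes& updatedAttributes,
    const Resources& updatedResources)
{
  return attributesSuperset(updatedAttributes, previousAttributes) &&
         resourcesSuperset(updatedResources, previousResources);
}
```

Under this rule, a running task never loses an attribute or resource it may depend on, which is why the subset case (and the question of killing tasks) can be deferred.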
[jira] [Comment Edited] (MESOS-4951) Enable actors to pass an authentication realm to libprocess
[ https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214353#comment-15214353 ] Greg Mann edited comment on MESOS-4951 at 3/28/16 3:46 PM: --- Yes, this is necessary for the endpoints which get installed in libprocess, like {{/profiler/*}}. was (Author: greggomann): Yes, this is necessary for the endpoints which get {{route}}d in libprocess, like {{/profiler/*}}. > Enable actors to pass an authentication realm to libprocess > --- > > Key: MESOS-4951 > URL: https://issues.apache.org/jira/browse/MESOS-4951 > Project: Mesos > Issue Type: Improvement > Components: libprocess, slave >Reporter: Greg Mann > Labels: authentication, http, mesosphere, security > > To prepare for MESOS-4902, the Mesos master and agent need a way to pass the > desired authentication realm to libprocess. Since some endpoints (like > {{/profiler/*}}) get installed in libprocess, the master/agent should be able > to specify during initialization what authentication realm the > libprocess-level endpoints will be authenticated under. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4951) Enable actors to pass an authentication realm to libprocess
[ https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214353#comment-15214353 ] Greg Mann commented on MESOS-4951: -- Yes, this is necessary for the endpoints which get {{route}}d in libprocess, like {{/profiler/*}}. > Enable actors to pass an authentication realm to libprocess > --- > > Key: MESOS-4951 > URL: https://issues.apache.org/jira/browse/MESOS-4951 > Project: Mesos > Issue Type: Improvement > Components: libprocess, slave >Reporter: Greg Mann > Labels: authentication, http, mesosphere, security > > To prepare for MESOS-4902, the Mesos master and agent need a way to pass the > desired authentication realm to libprocess. Since some endpoints (like > {{/profiler/*}}) get installed in libprocess, the master/agent should be able > to specify during initialization what authentication realm the > libprocess-level endpoints will be authenticated under. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-4610) MasterContender/MasterDetector should be loadable as modules
[ https://issues.apache.org/jira/browse/MESOS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197665#comment-15197665 ] ANURAG SINGH edited comment on MESOS-4610 at 3/28/16 3:22 PM: -- We've proceeded further with the review and here's the updated list of changes: https://reviews.apache.org/r/44287/: Added MasterContender and MasterDetector abstract classes. https://reviews.apache.org/r/44288/: Changed MasterDetector/Contender namespace. https://reviews.apache.org/r/44543/: Removed unnecessary MasterContender and MasterDetector definitions. https://reviews.apache.org/r/44544/: Moved contender and detector definitions into separate directories. https://reviews.apache.org/r/44545/: Instead of keeping standalone and ZooKeeper contender/detector class definitions and implementations in the same file, separated them. Also made the necessary changes in users of class headers to point to the new locations. https://reviews.apache.org/r/44546/: Moved functions in promises to a common header file. https://reviews.apache.org/r/44547/: Added functions in promises to the collect header. https://reviews.apache.org/r/44289/: Added support for contender and detector modules. https://reviews.apache.org/r/44669/: Added createFromModule methods to MasterContender and MasterDetector. https://reviews.apache.org/r/44670/: Added master_detector and master_contender flags. 
was (Author: anurag.prakash.singh): We've proceeded further with the review and here's the updated list of changes: https://reviews.apache.org/r/44287/ https://reviews.apache.org/r/44288/ https://reviews.apache.org/r/44543/ https://reviews.apache.org/r/44544/ https://reviews.apache.org/r/44545/ https://reviews.apache.org/r/44546/ https://reviews.apache.org/r/44547/ https://reviews.apache.org/r/44289/ https://reviews.apache.org/r/44669/ https://reviews.apache.org/r/44670/ > MasterContender/MasterDetector should be loadable as modules > > > Key: MESOS-4610 > URL: https://issues.apache.org/jira/browse/MESOS-4610 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Mark Cavage >Assignee: Mark Cavage > Labels: mesosphere > > Currently mesos depends on Zookeeper for leader election and notification to > slaves, although there is a C++ hierarchy in the code to support alternatives > (e.g., unit tests use an in-memory implementation). From an operational > perspective, many organizations/users do not want to take a dependency on > Zookeeper, and use an alternative solution to implementing leader election. > Our organization in particular, very much wants this, and as a reference > there have been several requests from the community (see referenced tickets) > to replace with etcd/consul/etc. > This ticket will serve as the work effort to modularize the > MasterContender/MasterDetector APIs such that integrators can build a > pluggable solution of their choice; this ticket will not fold in any > implementations such as etcd et al., but simply move this hierarchy to be > fully pluggable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4610) MasterContender/MasterDetector should be loadable as modules
[ https://issues.apache.org/jira/browse/MESOS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214316#comment-15214316 ] ANURAG SINGH commented on MESOS-4610: - Done. > MasterContender/MasterDetector should be loadable as modules > > > Key: MESOS-4610 > URL: https://issues.apache.org/jira/browse/MESOS-4610 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Mark Cavage >Assignee: Mark Cavage > Labels: mesosphere > > Currently mesos depends on Zookeeper for leader election and notification to > slaves, although there is a C++ hierarchy in the code to support alternatives > (e.g., unit tests use an in-memory implementation). From an operational > perspective, many organizations/users do not want to take a dependency on > Zookeeper, and use an alternative solution to implementing leader election. > Our organization in particular, very much wants this, and as a reference > there have been several requests from the community (see referenced tickets) > to replace with etcd/consul/etc. > This ticket will serve as the work effort to modularize the > MasterContender/MasterDetector APIs such that integrators can build a > pluggable solution of their choice; this ticket will not fold in any > implementations such as etcd et al., but simply move this hierarchy to be > fully pluggable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
[ https://issues.apache.org/jira/browse/MESOS-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214284#comment-15214284 ] Anand Mazumdar commented on MESOS-5048: --- [~qiujian] Can you run the tests with verbose logging enabled? That would help in triaging this issue. > MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky > --- > > Key: MESOS-5048 > URL: https://issues.apache.org/jira/browse/MESOS-5048 > Project: Mesos > Issue Type: Bug > Components: tests >Affects Versions: 0.28.0 > Environment: Ubuntu 15.04 >Reporter: Jian Qiu > Labels: flaky-test > > ./mesos-tests.sh > --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics > --gtest_repeat=100 --gtest_break_on_failure > This was found in rb and reproduced on my local machine. There are two types > of failures. However, the failure does not appear when verbose logging is enabled... > {code} > ../../src/tests/environment.cpp:790: Failure > Failed > Tests completed with child processes remaining: > -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests > \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor >\--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor > {code} > And > {code} > I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0 > I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave > 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0 > Registered executor on mesos > ../../src/tests/slave_recovery_tests.cpp:3506: Failure > Value of: containers.get().size() > Actual: 0 > Expected: 1u > Which is: 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5005) Make `ReservationInfo.principal` and `Persistence.principal` equivalent
[ https://issues.apache.org/jira/browse/MESOS-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5005: -- Shepherd: Adam B Sprint: Mesosphere Sprint 32 > Make `ReservationInfo.principal` and `Persistence.principal` equivalent > --- > > Key: MESOS-5005 > URL: https://issues.apache.org/jira/browse/MESOS-5005 > Project: Mesos > Issue Type: Bug >Reporter: Greg Mann > Labels: mesosphere, persistent-volumes, reservations > > Currently, we require that `ReservationInfo.principal` be equal to the > principal provided for authentication, which means that when HTTP > authentication is disabled this field cannot be set. Based on comments in > 'mesos.proto', the original intention was to enforce this same constraint for > `Persistence.principal`, but it seems that we don't enforce it. This should > be changed to make the two fields equivalent. > This means that when HTTP authentication is disabled, requests to '/reserve' > cannot set {{ReservationInfo.principal}}, while requests to `/create-volumes` > can set any principal in {{Persistence.principal}}. One solution would be to > add the constraint to {{Persistence.principal}} when HTTP authentication is > enabled, and remove the constraint from {{ReservationInfo.principal}} when > HTTP authentication is disabled: this would allow us to track a > reserver/creator principal when HTTP authentication is disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
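The solution proposed in the description above amounts to a single rule applied to both fields. A minimal sketch, assuming a hypothetical helper `principalAccepted` (this is not the Mesos validation code, and the parameter names are made up for illustration):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper illustrating the proposed rule for both
// ReservationInfo.principal and Persistence.principal.
bool principalAccepted(
    bool httpAuthenticationEnabled,
    const std::string& authenticatedPrincipal,
    const std::string& fieldPrincipal)
{
  // With HTTP authentication disabled there is nothing to compare against,
  // so any principal may be recorded for tracking purposes.
  if (!httpAuthenticationEnabled) {
    return true;
  }

  // With HTTP authentication enabled, the field must match the principal
  // that authenticated the request.
  return fieldPrincipal == authenticatedPrincipal;
}
```

Applying the same function to requests against '/reserve' and '/create-volumes' would make the two fields behave equivalently.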
[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include fine-grained ownership/namespacing
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4772: -- Sprint: Mesosphere Sprint 30 (was: Mesosphere Sprint 30, Mesosphere Sprint 31) > TaskInfo/ExecutorInfo should include fine-grained ownership/namespacing > --- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B >Assignee: Jan Schlicht > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5031) Authorization Action enum does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5031: -- Shepherd: Adam B Affects Version/s: 0.29.0 > Authorization Action enum does not support upgrades. > > > Key: MESOS-5031 > URL: https://issues.apache.org/jira/browse/MESOS-5031 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.29.0 >Reporter: Adam B >Assignee: Yong Tang > Labels: mesosphere, security > Fix For: 0.29.0 > > > We need to make the Action enum optional in authorization::Request, and add > an `UNKNOWN = 0;` enum value. See MESOS-4997 for details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
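The intent of adding `UNKNOWN = 0` can be illustrated without protobuf. In the sketch below, the enum values other than `UNKNOWN` and both function names are hypothetical; the point is only that a value introduced by a newer sender degrades to `UNKNOWN` on an older receiver instead of making the whole request unreadable, and that the receiver can then refuse it explicitly.

```cpp
#include <cassert>

// Hypothetical action values; only UNKNOWN = 0 comes from the ticket.
enum class Action
{
  UNKNOWN = 0,
  REGISTER_FRAMEWORK = 1,
  RUN_TASK = 2
};

// Maps a numeric value received from a peer to a known Action. Values
// this build does not know about fall back to UNKNOWN.
Action actionFromWire(int value)
{
  switch (value) {
    case 1: return Action::REGISTER_FRAMEWORK;
    case 2: return Action::RUN_TASK;
    default: return Action::UNKNOWN;
  }
}

// An authorizer built on top of this can refuse anything it does not
// understand rather than crash or guess.
bool canEvaluate(Action action)
{
  return action != Action::UNKNOWN;
}
```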
[jira] [Updated] (MESOS-5031) Authorization Action enum does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5031: -- Story Points: 2 > Authorization Action enum does not support upgrades. > > > Key: MESOS-5031 > URL: https://issues.apache.org/jira/browse/MESOS-5031 > Project: Mesos > Issue Type: Bug >Affects Versions: 0.29.0 >Reporter: Adam B >Assignee: Yong Tang > Labels: mesosphere, security > Fix For: 0.29.0 > > > We need to make the Action enum optional in authorization::Request, and add > an `UNKNOWN = 0;` enum value. See MESOS-4997 for details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4951) Enable actors to pass an authentication realm to libprocess
[ https://issues.apache.org/jira/browse/MESOS-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214031#comment-15214031 ] Adam B commented on MESOS-4951: --- [~greggomann], is this still necessary after the latest discussion? We've landed /files authn without it > Enable actors to pass an authentication realm to libprocess > --- > > Key: MESOS-4951 > URL: https://issues.apache.org/jira/browse/MESOS-4951 > Project: Mesos > Issue Type: Improvement > Components: libprocess, slave >Reporter: Greg Mann > Labels: authentication, http, mesosphere, security > > To prepare for MESOS-4902, the Mesos master and agent need a way to pass the > desired authentication realm to libprocess. Since some endpoints (like > {{/profiler/*}}) get installed in libprocess, the master/agent should be able > to specify during initialization what authentication realm the > libprocess-level endpoints will be authenticated under. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4369) Enhance DockerExecuter to support Docker's user-defined networks
[ https://issues.apache.org/jira/browse/MESOS-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213960#comment-15213960 ] Ezra Silvera commented on MESOS-4369: - I was waiting for the NetworkInfo to be merged. I just changed the code today to use the new NetworkInfo and pushed a new patch. > Enhance DockerExecuter to support Docker's user-defined networks > > > Key: MESOS-4369 > URL: https://issues.apache.org/jira/browse/MESOS-4369 > Project: Mesos > Issue Type: Improvement > Components: docker >Reporter: Qian Zhang >Assignee: Ezra Silvera > Labels: mesosphere > > Currently DockerContainerizer supports the following network options, which > are Docker built-in networks: > {code} > message DockerInfo { > ... > // Network options. > enum Network { > HOST = 1; > BRIDGE = 2; > NONE = 3; > } > ... > {code} > However, since Docker 1.9, Docker supports user-defined networks (both > local and overlay), e.g., {{docker network create --driver bridge > my-network}}. The user can then create containers that are attached > to these networks, e.g., {{docker run --net=my-network}}. > We need to enhance DockerExecuter to support this network option so that the > Docker container can connect to such a network. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
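To make the requested behaviour concrete, here is a minimal sketch of how a network choice could be turned into the `--net` argument of `docker run`. The `USER` enumerator, the `netArgument` function, and the way the network name is passed are assumptions for illustration only; they are not part of `DockerInfo` or of the patch mentioned above.

```cpp
#include <cassert>
#include <string>

// HOST, BRIDGE and NONE mirror the built-in options quoted in the ticket.
// USER is a hypothetical addition standing for a user-defined network.
enum class Network
{
  HOST,
  BRIDGE,
  NONE,
  USER
};

// Builds the `--net=...` argument for `docker run`. `userNetwork` is only
// consulted for Network::USER.
std::string netArgument(Network network, const std::string& userNetwork = "")
{
  switch (network) {
    case Network::HOST:   return "--net=host";
    case Network::BRIDGE: return "--net=bridge";
    case Network::NONE:   return "--net=none";
    case Network::USER:   return "--net=" + userNetwork;
  }
  return "";  // Not reached for valid enumerators.
}
```

With a network created via `docker network create --driver bridge my-network`, `netArgument(Network::USER, "my-network")` yields the `--net=my-network` form quoted in the description.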
[jira] [Created] (MESOS-5048) MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky
Jian Qiu created MESOS-5048: --- Summary: MesosContainerizerSlaveRecoveryTest.ResourceStatistics is flaky Key: MESOS-5048 URL: https://issues.apache.org/jira/browse/MESOS-5048 Project: Mesos Issue Type: Bug Components: tests Affects Versions: 0.28.0 Environment: Ubuntu 15.04 Reporter: Jian Qiu ./mesos-tests.sh --gtest_filter=MesosContainerizerSlaveRecoveryTest.ResourceStatistics --gtest_repeat=100 --gtest_break_on_failure This was found in rb and reproduced on my local machine. There are two types of failures. However, the failure does not appear when verbose logging is enabled... {code} ../../src/tests/environment.cpp:790: Failure Failed Tests completed with child processes remaining: -+- 1446 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-tests \-+- 9171 sh -c /mesos/mesos-0.29.0/_build/src/mesos-executor \--- 9185 /mesos/mesos-0.29.0/_build/src/.libs/lt-mesos-executor {code} And {code} I0328 15:42:36.982471 5687 exec.cpp:150] Version: 0.29.0 I0328 15:42:37.008765 5708 exec.cpp:225] Executor registered on slave 731fb93b-26fe-4c7c-a543-fc76f106a62e-S0 Registered executor on mesos ../../src/tests/slave_recovery_tests.cpp:3506: Failure Value of: containers.get().size() Actual: 0 Expected: 1u Which is: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-5031) Authorization Action enum does not support upgrades.
[ https://issues.apache.org/jira/browse/MESOS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-5031: -- Sprint: Mesosphere Sprint 32 > Authorization Action enum does not support upgrades. > > > Key: MESOS-5031 > URL: https://issues.apache.org/jira/browse/MESOS-5031 > Project: Mesos > Issue Type: Bug >Reporter: Adam B >Assignee: Yong Tang > Labels: mesosphere, security > Fix For: 0.29.0 > > > We need to make the Action enum optional in authorization::Request, and add > an `UNKNOWN = 0;` enum value. See MESOS-4997 for details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5038) Added a any mechanism for futures
[ https://issues.apache.org/jira/browse/MESOS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213869#comment-15213869 ] haosdent commented on MESOS-5038: - As you can see, we use {{any}} in https://reviews.apache.org/r/45085/diff/ {code} any(futures).onAny(defer(PID(this), ::__prepare, containerId, lambda::_1)); {code} > Added a any mechanism for futures > - > > Key: MESOS-5038 > URL: https://issues.apache.org/jira/browse/MESOS-5038 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: haosdent >Assignee: haosdent > > We already have {{collect}} and {{await}} mechanisms, which wait for > a list of {{Future}}s. However, we would like to return immediately when any > {{Future}} in the list completes, instead of waiting for the whole list to finish > as {{collect}} does. The interface of this any mechanism could be > {code} > template > Future any(const std::list& futures); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5038) Added a any mechanism for futures
[ https://issues.apache.org/jira/browse/MESOS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213864#comment-15213864 ] Jay Guo commented on MESOS-5038: [~haosd...@gmail.com] Do you mind giving an example of where we might use this in the current codebase? > Added a any mechanism for futures > - > > Key: MESOS-5038 > URL: https://issues.apache.org/jira/browse/MESOS-5038 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: haosdent >Assignee: haosdent > > We already have {{collect}} and {{await}} mechanisms, which wait for > a list of {{Future}}s. However, we would like to return immediately when any > {{Future}} in the list completes, instead of waiting for the whole list to finish > as {{collect}} does. The interface of this any mechanism could be > {code} > template > Future any(const std::list& futures); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5038) Added a any mechanism for futures
[ https://issues.apache.org/jira/browse/MESOS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213856#comment-15213856 ] haosdent commented on MESOS-5038: - [~guoger] For example, you wait for a list of futures and want to continue as soon as any one of them becomes ready. > Added a any mechanism for futures > - > > Key: MESOS-5038 > URL: https://issues.apache.org/jira/browse/MESOS-5038 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: haosdent >Assignee: haosdent > > We already have {{collect}} and {{await}} mechanisms, which wait for > a list of {{Future}}s. However, we would like to return immediately when any > {{Future}} in the list completes, instead of waiting for the whole list to finish > as {{collect}} does. The interface of this any mechanism could be > {code} > template > Future any(const std::list& futures); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
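The "continue as soon as any future is ready" behaviour described in this ticket can be sketched independently of libprocess. Everything below is a toy: `ToyFuture` is a single-threaded stand-in for a future and `firstReady` is a hypothetical name; neither is the libprocess `Future` nor the interface proposed in the review. It only shows the combinator idea, namely that the result completes with the value of whichever input completes first.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy single-threaded future (an assumption for illustration, not
// process::Future). Copies share one underlying state.
template <typename T>
class ToyFuture
{
public:
  ToyFuture() : state(std::make_shared<State>()) {}

  bool isReady() const { return state->ready; }
  const T& get() const { return state->value; }

  // Runs `callback` once the value is available (immediately if it is).
  void onReady(std::function<void(const T&)> callback) const
  {
    if (state->ready) {
      callback(state->value);
    } else {
      state->callbacks.push_back(std::move(callback));
    }
  }

  // Completes the future; later calls are ignored.
  void set(const T& t) const
  {
    if (state->ready) {
      return;
    }
    state->ready = true;
    state->value = t;
    std::vector<std::function<void(const T&)>> callbacks;
    callbacks.swap(state->callbacks);
    for (const auto& callback : callbacks) {
      callback(state->value);
    }
  }

private:
  struct State
  {
    bool ready = false;
    T value{};
    std::vector<std::function<void(const T&)>> callbacks;
  };

  std::shared_ptr<State> state;
};

// Returns a future that completes with the value of the first input to
// become ready. With an empty input it never completes.
template <typename T>
ToyFuture<T> firstReady(const std::vector<ToyFuture<T>>& futures)
{
  ToyFuture<T> result;
  for (const ToyFuture<T>& future : futures) {
    future.onReady([result](const T& t) { result.set(t); });
  }
  return result;
}
```

By contrast, a `collect`-style combinator would only complete once every input had been set; here the inputs that finish later are simply ignored.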