[jira] [Updated] (MESOS-1667) Extract from URI while downloading into work dir
[ https://issues.apache.org/jira/browse/MESOS-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-1667: -- Assignee: (was: Bernd Mathiske) Extract from URI while downloading into work dir Key: MESOS-1667 URL: https://issues.apache.org/jira/browse/MESOS-1667 Project: Mesos Issue Type: Improvement Components: fetcher, slave Affects Versions: 0.20.0 Environment: Every Reporter: Bernd Mathiske Labels: features, mesosphere, performance Original Estimate: 96h Remaining Estimate: 96h When the fetcher downloads an extractable archive, e.g. a tar file, it currently downloads it completely and only then starts extracting from it. But only the end result is needed for execution, so the space used for the downloaded copy of the archive is wasted. This can become critical for large archives. The general idea for solving this issue is to perform the extraction while downloading, without storing intermediate results on disk. Possibly, this can be achieved by arranging process pipes or by using some extraction library code to stream the data through. However, as a result, repeated downloading may always be called for, whereas with an existing (https://reviews.apache.org/r/21316/) but not yet committed patch for MESOS-336, the fetcher cache could just repeat the extraction without downloading more than once. Thus choosing in-stream extraction might result in an overall performance loss. We should therefore give users extra options in CommandInfo.URI to choose how to handle this. In some cases, it could be possible to reuse the extracted assets directly, also forgoing the repeat extraction. This could be handled with symlinks. Then extraction can happen during downloading and neither repeat downloading nor repeat extraction occurs. The user has to be conscious of the safety issue, though, that any post-extraction modifications to the downloaded assets are visible to subsequent tasks. 
So an explicit flag in CommandInfo.URI is called for here as well. Ideally, this issue would be solved as a follow-up of MESOS-336, because some of the described benefits depend on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
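The in-stream extraction idea described above can be sketched as follows. This is a minimal Python illustration, not fetcher code: the function name `extract_stream` and its path-filtering rule are assumptions made for the example. It relies on the standard-library `tarfile` stream mode (`r|*`), which reads the archive sequentially without seeking, so the archive itself never has to be stored on disk.

```python
import io
import os
import tarfile
import tempfile


def extract_stream(stream, work_dir):
    """Extract a tar archive from a sequential stream into work_dir.

    Mode "r|*" makes tarfile read the stream front to back (no seeking),
    so the bytes can come straight from a download.
    Returns the names of the extracted regular files.
    """
    names = []
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            # Assumption for this sketch: only regular files with
            # relative, non-escaping paths are extracted.
            if not member.isfile():
                continue
            parts = member.name.split("/")
            if os.path.isabs(member.name) or ".." in parts:
                continue
            target = os.path.join(work_dir, *parts)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            source = archive.extractfile(member)
            with open(target, "wb") as out:
                # Copy in chunks so large members are never held in memory.
                while True:
                    chunk = source.read(64 * 1024)
                    if not chunk:
                        break
                    out.write(chunk)
            names.append(member.name)
    return names
```

Any object with a `read()` method should be usable as `stream`, for example the response returned by `urllib.request.urlopen`. The trade-off named in the issue still applies: because nothing is cached, a second task needing the same archive has to download it again.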
[jira] [Commented] (MESOS-3070) Master CHECK failure if a framework uses duplicated task id.
[ https://issues.apache.org/jira/browse/MESOS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695224#comment-14695224 ] Klaus Ma commented on MESOS-3070: - *Current Status*: - Reproduced the duplicate task id CHECK failure with unit test cases - The previous solution (sending KillTaskMessage to the slave when it re-registers) triggers another CHECK failure (in removeTask) *Next actions*: - Find another solution: Option 1. send the list of rejected tasks to the slave within SlaveReregisteredMessage, so that the slave kills the executors/tasks accordingly; Option 2. persist task info in the registry and reject duplicated tasks when the master restarts - Add a post-condition check according to the chosen solution Master CHECK failure if a framework uses duplicated task id. Key: MESOS-3070 URL: https://issues.apache.org/jira/browse/MESOS-3070 Project: Mesos Issue Type: Bug Components: master Affects Versions: 0.22.1 Reporter: Jie Yu Assignee: Klaus Ma We observed this in one of our testing clusters. One framework (under development) keeps launching tasks using the same task_id. We don't expect the master to crash even if the framework is not doing what it's supposed to do. However, under the following series of events, this can happen and keeps crashing the master. 1) frameworkA launches task 'task_id_1' on slaveA 2) master fails over 3) slaveA has not re-registered yet 4) frameworkA re-registers and launches task 'task_id_1' on slaveB 5) slaveA re-registers and adds task 'task_id_1' to frameworkA 6) CHECK failure in addTask {noformat} I0716 21:52:50.759305 28805 master.hpp:159] Adding task 'task_id_1' with resources cpus(*):4; mem(*):32768 on slave 20150417-232509-1735470090-5050-48870-S25 (hostname) ... ... F0716 21:52:50.760136 28805 master.hpp:362] Check failed: !tasks.contains(task->task_id()) Duplicate task 'task_id_1' of framework framework_id {noformat}
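The sequence above comes down to addTask asserting that a task id is unique within a framework. A minimal Python sketch of the alternative raised in the comment (collecting rejected duplicates instead of aborting, roughly along the lines of Option 1) is shown below. The class `FrameworkTasks` and its methods are hypothetical names for illustration and do not mirror the actual master code.

```python
class FrameworkTasks:
    """Hypothetical per-framework task table keyed by task id."""

    def __init__(self):
        self.tasks = {}  # task_id -> slave_id

    def add_task(self, task_id, slave_id):
        """Record a task; return False instead of aborting on a duplicate."""
        if task_id in self.tasks:
            return False
        self.tasks[task_id] = slave_id
        return True

    def reregister_slave(self, slave_id, task_ids):
        """Add a re-registering slave's tasks and return the rejected ids.

        The returned list is what a master could report back to the
        slave so that the slave kills those tasks.
        """
        return [t for t in task_ids if not self.add_task(t, slave_id)]
```

Replaying steps 4 and 5 from the report against this sketch, 'task_id_1' stays mapped to slaveB and is returned as rejected when slaveA re-registers.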
[jira] [Created] (MESOS-3255) Create TasksKiller Tests
Joerg Schad created MESOS-3255: -- Summary: Create TasksKiller Tests Key: MESOS-3255 URL: https://issues.apache.org/jira/browse/MESOS-3255 Project: Mesos Issue Type: Task Components: test Reporter: Joerg Schad Assignee: Joerg Schad As a follow-up to MESOS-3086, we should test both the old (Freeze) TasksKiller and the new (nonFreeze) TasksKiller.
[jira] [Assigned] (MESOS-3224) Create a Mesos Contributor Newbie Guide
[ https://issues.apache.org/jira/browse/MESOS-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Diana Arroyo reassigned MESOS-3224: --- Assignee: Diana Arroyo (was: Timothy Chen) Create a Mesos Contributor Newbie Guide --- Key: MESOS-3224 URL: https://issues.apache.org/jira/browse/MESOS-3224 Project: Mesos Issue Type: Documentation Components: documentation Reporter: Timothy Chen Assignee: Diana Arroyo Currently the website doesn't have a helpful guide for community users to know how to start learning to contribute to Mesos, understand the concepts and lower the barrier to get involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3224) Create a Mesos Contributor Newbie Guide
[ https://issues.apache.org/jira/browse/MESOS-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Diana Arroyo updated MESOS-3224: Assignee: Timothy Chen (was: Diana Arroyo) Create a Mesos Contributor Newbie Guide --- Key: MESOS-3224 URL: https://issues.apache.org/jira/browse/MESOS-3224 Project: Mesos Issue Type: Documentation Components: documentation Reporter: Timothy Chen Assignee: Timothy Chen Currently the website doesn't have a helpful guide for community users to know how to start learning to contribute to Mesos, understand the concepts and lower the barrier to get involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-3256) Consistent naming of http request methods.
Joerg Schad created MESOS-3256: -- Summary: Consistent naming of http request methods. Key: MESOS-3256 URL: https://issues.apache.org/jira/browse/MESOS-3256 Project: Mesos Issue Type: Task Reporter: Joerg Schad Currently the http requests in libprocess/http.hpp are named post(), put(), and get(). This naming scheme did not work for the addition of delete with MESOS-3152, as delete is a C++ keyword; hence that call was named deleteRequest. We should come up with a consistent naming scheme that is easily understandable.
[jira] [Updated] (MESOS-2880) Add Frameworkinfo.capabilities on framework re-registration
[ https://issues.apache.org/jira/browse/MESOS-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kone updated MESOS-2880: -- Description: Add support for adding capabilities. This should be straightforward. (was: Part 1: Add support for adding capabilities. This should be straightforward. Part 2: Add support for removing capabilities. This is a bit tricky because we need to deal with exiting tasks and allocations for revocable resources.) Summary: Add Frameworkinfo.capabilities on framework re-registration (was: Update Frameworkinfo.capabilities on framework re-registration) Add Frameworkinfo.capabilities on framework re-registration --- Key: MESOS-2880 URL: https://issues.apache.org/jira/browse/MESOS-2880 Project: Mesos Issue Type: Improvement Reporter: Vinod Kone Assignee: Aditi Dixit Add support for adding capabilities. This should be straightforward. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2706) When the docker-tasks grow, the time spare between Queuing task and Starting container grows
[ https://issues.apache.org/jira/browse/MESOS-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695609#comment-14695609 ] Alexander Rukletsov commented on MESOS-2706: This may be a docker-related issue: the docker daemon processes requests more slowly in the presence of numerous docker containers. When the docker-tasks grow, the time spare between Queuing task and Starting container grows Key: MESOS-2706 URL: https://issues.apache.org/jira/browse/MESOS-2706 Project: Mesos Issue Type: Bug Components: docker Affects Versions: 0.22.0 Environment: My environment: Mesos 0.22.0 and Marathon 0.82-RC1, both running on one host server. Every docker task requires 0.02 CPU and 128MB, and the server has 8 CPUs and 24G of memory, so in theory Mesos can launch thousands of tasks. The docker task is very lightweight: it launches an sshd service. Reporter: chenqiuhao At the beginning, Marathon can launch docker tasks very fast, but when the number of tasks on the single mesos-slave host reached 50, Marathon seemed to launch docker tasks slowly. So I checked the mesos-slave log and found that the time between 'Queuing task' and 'Starting container' grew. 
For example, launching the 1st docker task takes about 0.008s: [root@CNSH231434 mesos-slave]# tail -f slave.out |egrep 'Queuing task|Starting container' I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework '20150202-112355-2684495626-5050-26153- I0508 15:54:00.196832 225781 docker.cpp:581] Starting container 'd0b0813a-6cb6-4dfd-bbce-f1b338744285' for task 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor 'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-' Launching the 50th docker task takes about 4.9s: I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework '20150202-112355-2684495626-5050-26153- I0508 16:12:15.801503 225778 docker.cpp:581] Starting container '482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor 'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework '20150202-112355-2684495626-5050-26153-' And when I launched the 100th docker task, it took about 13s! I ran the same test on a server host with 24 CPUs and 256G of memory and got the same result. Has anybody had the same experience, or can somebody help run the same stress test?
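The delays quoted in the report can be recomputed from the log timestamps. The following Python sketch does so under stated assumptions: the helper names `glog_time` and `launch_delay` are made up for this example, only the time-of-day field of each line is parsed, and both lines are assumed to fall on the same day.

```python
from datetime import datetime


def glog_time(line):
    """Parse the time-of-day field of a log line such as
    'I0508 15:54:00.188350 225779 slave.cpp:1378] ...'."""
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


def launch_delay(queuing_line, starting_line):
    """Seconds between a 'Queuing task' line and the matching
    'Starting container' line (same day assumed)."""
    delta = glog_time(starting_line) - glog_time(queuing_line)
    return delta.total_seconds()


first = launch_delay(
    "I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task",
    "I0508 15:54:00.196832 225781 docker.cpp:581] Starting container",
)
fiftieth = launch_delay(
    "I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task",
    "I0508 16:12:15.801503 225778 docker.cpp:581] Starting container",
)
```

For the two pairs of lines quoted above this gives 0.008482 s for the 1st task and 4.892907 s for the 50th, matching the rounded figures in the report.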
[jira] [Commented] (MESOS-3257) Zookeeper JVM test failure causes test harness to fail
[ https://issues.apache.org/jira/browse/MESOS-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695592#comment-14695592 ] haosdent commented on MESOS-3257: - Hello, could you add the JDK version and operating system information for this problem? Zookeeper JVM test failure causes test harness to fail -- Key: MESOS-3257 URL: https://issues.apache.org/jira/browse/MESOS-3257 Project: Mesos Issue Type: Bug Reporter: Paul Brett A failure in the ZooKeeper Java test setup causes the test harness to exit, preventing subsequent tests from running. {code} [----------] 2 tests from LogZooKeeperTest F0813 16:09:33.647265 13790 zookeeper.cpp:78] CHECK_SOME(jvm): Error looking up symbol 'JNI_CreateJavaVM' in '' : /home/pbrett/sandbox/perf.refactor2/build/src/.libs/mesos-tests: undefined symbol: JNI_CreateJavaVM *** Check failure stack trace: *** @ 0x7f2d8cca7aac google::LogMessage::Fail() @ 0x7f2d8cca79fb google::LogMessage::SendToLog() @ 0x7f2d8cca740c google::LogMessage::Flush() @ 0x7f2d8ccaa140 google::LogMessageFatal::~LogMessageFatal() @ 0x8a938c _CheckFatal::~_CheckFatal() @ 0x12f68c0 mesos::internal::tests::ZooKeeperTest::SetUpTestCase() @ 0x132a88a testing::TestCase::RunSetUpTestCase() @ 0x1334cf7 testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x132fb94 testing::internal::HandleExceptionsInMethodIfSupported() @ 0x1311635 testing::TestCase::Run() @ 0x1317fca testing::internal::UnitTestImpl::RunAllTests() @ 0x1335427 testing::internal::HandleSehExceptionsInMethodIfSupported() @ 0x1330128 testing::internal::HandleExceptionsInMethodIfSupported() @ 0x1316cf0 testing::UnitTest::Run() @ 0xc3a9d8 RUN_ALL_TESTS() @ 0xc3a6c8 main @ 0x7f2d8818d9f4 __libc_start_main @ 0x8a5fa9 (unknown) make[3]: *** [check-local] Aborted {code}
[jira] [Created] (MESOS-3258) Remove Frameworkinfo capabilities on re-registration
Vinod Kone created MESOS-3258: - Summary: Remove Frameworkinfo capabilities on re-registration Key: MESOS-3258 URL: https://issues.apache.org/jira/browse/MESOS-3258 Project: Mesos Issue Type: Bug Reporter: Vinod Kone Assignee: Aditi Dixit Add support for removing capabilities. The idea is that we leave the running revocable tasks as is, but the framework will not get any new revocable offers.
[jira] [Commented] (MESOS-3149) Use setuptools to install python cli package
[ https://issues.apache.org/jira/browse/MESOS-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695571#comment-14695571 ] haosdent commented on MESOS-3149: - Hi, [~vinodkone] [~tillt] Let me describe my idea about this patch here. This patch tries to fix the following problem: when executing mesos-ps, mesos-cat, mesos-scp, or mesos-tail, we get this error. {code} Traceback (most recent call last): File "/usr/local/bin/mesos-cat", line 12, in <module> from mesos import http ImportError: cannot import name http {code} So I think this patch is necessary for 0.24; could you help review and commit it? Or do we have a better way to fix the problem above? Thank you in advance. Thanks also to [~marco-mesos] for helping push this patch. ;-) Use setuptools to install python cli package Key: MESOS-3149 URL: https://issues.apache.org/jira/browse/MESOS-3149 Project: Mesos Issue Type: Task Reporter: haosdent Assignee: haosdent mesos-ps/mesos-cat, which depend on src/cli/python/mesos, do not work on OSX because src/cli/python is not installed to sys.path. It's time to finish this TODO. {code} # Add 'src/cli/python' to PYTHONPATH. # TODO(benh): Remove this if/when we install the 'mesos' module via # PIP and setuptools. PYTHONPATH=@abs_top_srcdir@/src/cli/python:${PYTHONPATH} {code}
[jira] [Commented] (MESOS-3260) SchedulerTest.* are broken on OSX and CentOS
[ https://issues.apache.org/jira/browse/MESOS-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696145#comment-14696145 ] Vinod Kone commented on MESOS-3260: --- I'm reopening this to track the proper fix. SchedulerTest.* are broken on OSX and CentOS Key: MESOS-3260 URL: https://issues.apache.org/jira/browse/MESOS-3260 Project: Mesos Issue Type: Bug Affects Versions: 0.24.0 Environment: OSX 10.10.5 (14F6a), Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn) Reporter: Till Toenshoff Assignee: Vinod Kone Priority: Blocker Fix For: 0.24.0 Running a plain configure and make check on OSX currently leads to the following: {noformat} [ RUN ] SchedulerTest.Subscribe ../../src/tests/scheduler_tests.cpp:168: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::SUBSCRIBED Which is: SUBSCRIBED ../../src/tests/scheduler_tests.cpp:169: Failure Value of: event.get().subscribed().framework_id() Actual: Expected: id Which is: 20150813-222454-347252928-56290-60707- [ FAILED ] SchedulerTest.Subscribe (183 ms) [ RUN ] SchedulerTest.TaskRunning ../../src/tests/scheduler_tests.cpp:227: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::OFFERS Which is: OFFERS ../../src/tests/scheduler_tests.cpp:228: Failure Expected: (0) != (event.get().offers().offers().size()), actual: 0 vs 0 [libprotobuf FATAL ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/google/protobuf/repeated_field.h:824] CHECK failed: (index) < (size()): ../../src/tests/scheduler_tests.cpp:237: Failure Actual function call count doesn't match EXPECT_CALL(containerizer, update(_, _))... Expected: to be called at least once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:233: Failure Actual function call count doesn't match EXPECT_CALL(exec, launchTask(_, _))... 
Expected: to be called once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:230: Failure Actual function call count doesn't match EXPECT_CALL(exec, registered(_, _, _, _))... Expected: to be called once Actual: never called - unsatisfied and active unknown file: Failure C++ exception with description "CHECK failed: (index) < (size()): " thrown in the test body. *** Aborted at 1439497494 (unix time) try "date -d @1439497494" if you are using GNU date *** PC: @ 0x7fb2c0f20490 (unknown) *** SIGBUS (@0x7fb2c0f20490) received by PID 60707 (TID 0x7fff7a876300) stack trace: *** @ 0x7fff8a77ef1a _sigtramp @ 0x7fff532c9990 (unknown) @0x10d3bcedb mesos::internal::tests::MesosTest::ShutdownSlaves() @0x10d3bce75 mesos::internal::tests::MesosTest::Shutdown() @0x10d3b7d47 mesos::internal::tests::MesosTest::TearDown() @0x10dbc8283 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbafab7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db6f8ba testing::Test::Run() @0x10db70deb testing::TestInfo::Run() @0x10db71ab7 testing::TestCase::Run() @0x10db804b3 testing::internal::UnitTestImpl::RunAllTests() @0x10dbc4fe3 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbb1ea7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db800b0 testing::UnitTest::Run() @0x10d10c8d1 RUN_ALL_TESTS() @0x10d108b87 main @ 0x7fff8da765c9 start Bus error: 10 {noformat} Results on CentOS look similar.
[jira] [Commented] (MESOS-3256) Consistent naming of http request methods.
[ https://issues.apache.org/jira/browse/MESOS-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695791#comment-14695791 ] Benjamin Mahler commented on MESOS-3256: Also consider just taking a Request object per the TODO: https://github.com/apache/mesos/blob/0.23.0/3rdparty/libprocess/include/process/http.hpp#L649 Consistent naming of http request methods. -- Key: MESOS-3256 URL: https://issues.apache.org/jira/browse/MESOS-3256 Project: Mesos Issue Type: Task Reporter: Joerg Schad Currently the http requests in libprocess/http.hpp are named post(), put(), and get(). This naming scheme did not work for the addition of delete with MESOS-3152, as delete is a C++ keyword; hence that call was named deleteRequest. We should come up with a consistent naming scheme that is easily understandable.
[jira] [Commented] (MESOS-3154) Enable Mesos Agent Node to use arbitrary script / module to figure out IP, HOSTNAME
[ https://issues.apache.org/jira/browse/MESOS-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696219#comment-14696219 ] Marco Massenzio commented on MESOS-3154: This is essentially a copy-paste of the code in {{src/master/main.cpp}}. It doesn't make me proud, but it should at least speed up review and commit :) Enable Mesos Agent Node to use arbitrary script / module to figure out IP, HOSTNAME --- Key: MESOS-3154 URL: https://issues.apache.org/jira/browse/MESOS-3154 Project: Mesos Issue Type: Story Components: slave Reporter: Benjamin Hindman Assignee: Marco Massenzio Labels: mesosphere Following from MESOS-2902, we want to enable the same functionality in the Mesos Agents too. This is probably best done once we implement the new {{os::shell}} semantics, as described in MESOS-3142.
[jira] [Updated] (MESOS-2688) Slave should kill revocable tasks if oversubscription is disabled
[ https://issues.apache.org/jira/browse/MESOS-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2688: -- Sprint: (was: Twitter Mesos Q3 Sprint 3) Slave should kill revocable tasks if oversubscription is disabled - Key: MESOS-2688 URL: https://issues.apache.org/jira/browse/MESOS-2688 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Jie Yu Labels: twitter If oversubscription is disabled on a restarted slave (that had it previously enabled), it should kill revocable tasks. Slave knows this information from the Resources of a container that it checkpoints and recovers. Add a new reason OVERSUBSCRIPTION_DISABLED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2695) Add master flag to enable/disable oversubscription
[ https://issues.apache.org/jira/browse/MESOS-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2695: -- Sprint: (was: Twitter Mesos Q3 Sprint 3) Add master flag to enable/disable oversubscription -- Key: MESOS-2695 URL: https://issues.apache.org/jira/browse/MESOS-2695 Project: Mesos Issue Type: Task Reporter: Vinod Kone Assignee: Jie Yu Labels: twitter This flag lets an operator control cluster level oversubscription. The master should send revocable offers to framework iff this flag is enabled and the framework opts in to receive them. Master should ignore revocable resources from slaves if the flag is disabled. Need tests for all these scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
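The two rules in this description (offers to frameworks, and resources accepted from slaves) can be written down as small predicates, which may also help when enumerating the test scenarios the issue asks for. This is a Python sketch with hypothetical names (`should_offer_revocable`, `accepted_resources`) and a made-up tuple layout for resources; it is not the master's actual logic.

```python
def should_offer_revocable(flag_enabled, framework_opted_in):
    """Revocable offers go out only if the master flag is enabled
    and the framework has opted in to receive them."""
    return flag_enabled and framework_opted_in


def accepted_resources(resources, flag_enabled):
    """Filter a slave's resources, given as (name, amount, revocable)
    tuples: revocable entries are ignored when the flag is disabled."""
    return [r for r in resources if flag_enabled or not r[2]]
```

With two boolean inputs, the first predicate has four cases, and only one of them (flag enabled, framework opted in) yields revocable offers.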
[jira] [Comment Edited] (MESOS-3260) SchedulerTest.* are broken on OSX and CentOS
[ https://issues.apache.org/jira/browse/MESOS-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696187#comment-14696187 ] Vinod Kone edited comment on MESOS-3260 at 8/14/15 12:03 AM: - commit 2a391e8036303f08aa42dc9c66c210940ec8d21f Author: Vinod Kone vinodk...@gmail.com Date: Thu Aug 13 16:17:52 2015 -0700 Fixed scheduler tests to work with heartbeats. Review: https://reviews.apache.org/r/37449 was (Author: vinodkone): commit f011b0a98e5a1c28f6c670102374f5317488fa03 Author: Vinod Kone vinodk...@gmail.com Date: Thu Aug 13 16:17:52 2015 -0700 Fixed scheduler tests to work with heartbeats. Review: https://reviews.apache.org/r/37449 SchedulerTest.* are broken on OSX and CentOS Key: MESOS-3260 URL: https://issues.apache.org/jira/browse/MESOS-3260 Project: Mesos Issue Type: Bug Affects Versions: 0.24.0 Environment: OSX 10.10.5 (14F6a), Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn) Reporter: Till Toenshoff Assignee: Vinod Kone Priority: Blocker Fix For: 0.24.0 Running a plain configure and make check on OSX currently leads to the following: {noformat} [ RUN ] SchedulerTest.Subscribe ../../src/tests/scheduler_tests.cpp:168: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::SUBSCRIBED Which is: SUBSCRIBED ../../src/tests/scheduler_tests.cpp:169: Failure Value of: event.get().subscribed().framework_id() Actual: Expected: id Which is: 20150813-222454-347252928-56290-60707- [ FAILED ] SchedulerTest.Subscribe (183 ms) [ RUN ] SchedulerTest.TaskRunning ../../src/tests/scheduler_tests.cpp:227: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::OFFERS Which is: OFFERS ../../src/tests/scheduler_tests.cpp:228: Failure Expected: (0) != (event.get().offers().offers().size()), actual: 0 vs 0 [libprotobuf FATAL ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/google/protobuf/repeated_field.h:824] CHECK failed: (index) < (size()): ../../src/tests/scheduler_tests.cpp:237: 
Failure Actual function call count doesn't match EXPECT_CALL(containerizer, update(_, _))... Expected: to be called at least once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:233: Failure Actual function call count doesn't match EXPECT_CALL(exec, launchTask(_, _))... Expected: to be called once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:230: Failure Actual function call count doesn't match EXPECT_CALL(exec, registered(_, _, _, _))... Expected: to be called once Actual: never called - unsatisfied and active unknown file: Failure C++ exception with description "CHECK failed: (index) < (size()): " thrown in the test body. *** Aborted at 1439497494 (unix time) try "date -d @1439497494" if you are using GNU date *** PC: @ 0x7fb2c0f20490 (unknown) *** SIGBUS (@0x7fb2c0f20490) received by PID 60707 (TID 0x7fff7a876300) stack trace: *** @ 0x7fff8a77ef1a _sigtramp @ 0x7fff532c9990 (unknown) @0x10d3bcedb mesos::internal::tests::MesosTest::ShutdownSlaves() @0x10d3bce75 mesos::internal::tests::MesosTest::Shutdown() @0x10d3b7d47 mesos::internal::tests::MesosTest::TearDown() @0x10dbc8283 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbafab7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db6f8ba testing::Test::Run() @0x10db70deb testing::TestInfo::Run() @0x10db71ab7 testing::TestCase::Run() @0x10db804b3 testing::internal::UnitTestImpl::RunAllTests() @0x10dbc4fe3 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbb1ea7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db800b0 testing::UnitTest::Run() @0x10d10c8d1 RUN_ALL_TESTS() @0x10d108b87 main @ 0x7fff8da765c9 start Bus error: 10 {noformat} Results on CentOS look similar.
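The failures quoted above show the tests receiving a HEARTBEAT event where they expected SUBSCRIBED or OFFERS. One way a test could tolerate interleaved heartbeats is sketched below in Python. The event dictionaries and the helper `next_non_heartbeat` are assumptions for illustration; the sketch does not reproduce the change made in the referenced review.

```python
def next_non_heartbeat(events):
    """Return the next event from an iterator that is not a HEARTBEAT,
    or None if the stream ends first."""
    for event in events:
        if event["type"] != "HEARTBEAT":
            return event
    return None


# Hypothetical event stream with heartbeats interleaved.
stream = iter([
    {"type": "HEARTBEAT"},
    {"type": "SUBSCRIBED"},
    {"type": "HEARTBEAT"},
    {"type": "OFFERS"},
])
first = next_non_heartbeat(stream)
second = next_non_heartbeat(stream)
third = next_non_heartbeat(stream)
```

Because the helper consumes the same iterator on every call, assertions about event order (SUBSCRIBED before OFFERS) still hold while heartbeats at any position are ignored.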
[jira] [Created] (MESOS-3260) SchedulerTest.* are broken on OSX and CentOS
Till Toenshoff created MESOS-3260: - Summary: SchedulerTest.* are broken on OSX and CentOS Key: MESOS-3260 URL: https://issues.apache.org/jira/browse/MESOS-3260 Project: Mesos Issue Type: Bug Environment: OSX 10.10.5 (14F6a), Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn) Reporter: Till Toenshoff Priority: Blocker Running a plain configure and make check on OSX currently leads to the following: {noformat} [ RUN ] SchedulerTest.Subscribe ../../src/tests/scheduler_tests.cpp:168: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::SUBSCRIBED Which is: SUBSCRIBED ../../src/tests/scheduler_tests.cpp:169: Failure Value of: event.get().subscribed().framework_id() Actual: Expected: id Which is: 20150813-222454-347252928-56290-60707- [ FAILED ] SchedulerTest.Subscribe (183 ms) [ RUN ] SchedulerTest.TaskRunning ../../src/tests/scheduler_tests.cpp:227: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::OFFERS Which is: OFFERS ../../src/tests/scheduler_tests.cpp:228: Failure Expected: (0) != (event.get().offers().offers().size()), actual: 0 vs 0 [libprotobuf FATAL ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/google/protobuf/repeated_field.h:824] CHECK failed: (index) < (size()): ../../src/tests/scheduler_tests.cpp:237: Failure Actual function call count doesn't match EXPECT_CALL(containerizer, update(_, _))... Expected: to be called at least once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:233: Failure Actual function call count doesn't match EXPECT_CALL(exec, launchTask(_, _))... Expected: to be called once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:230: Failure Actual function call count doesn't match EXPECT_CALL(exec, registered(_, _, _, _))... Expected: to be called once Actual: never called - unsatisfied and active unknown file: Failure C++ exception with description "CHECK failed: (index) < (size()): " thrown in the test body. 
*** Aborted at 1439497494 (unix time) try "date -d @1439497494" if you are using GNU date *** PC: @ 0x7fb2c0f20490 (unknown) *** SIGBUS (@0x7fb2c0f20490) received by PID 60707 (TID 0x7fff7a876300) stack trace: *** @ 0x7fff8a77ef1a _sigtramp @ 0x7fff532c9990 (unknown) @0x10d3bcedb mesos::internal::tests::MesosTest::ShutdownSlaves() @0x10d3bce75 mesos::internal::tests::MesosTest::Shutdown() @0x10d3b7d47 mesos::internal::tests::MesosTest::TearDown() @0x10dbc8283 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbafab7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db6f8ba testing::Test::Run() @0x10db70deb testing::TestInfo::Run() @0x10db71ab7 testing::TestCase::Run() @0x10db804b3 testing::internal::UnitTestImpl::RunAllTests() @0x10dbc4fe3 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbb1ea7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db800b0 testing::UnitTest::Run() @0x10d10c8d1 RUN_ALL_TESTS() @0x10d108b87 main @ 0x7fff8da765c9 start Bus error: 10 {noformat} Results on CentOS look similar.
[jira] [Commented] (MESOS-3260) SchedulerTest.* are broken on OSX and CentOS
[ https://issues.apache.org/jira/browse/MESOS-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695900#comment-14695900 ] Vinod Kone commented on MESOS-3260: --- I'm reverting some patches while we investigate the issue. SchedulerTest.* are broken on OSX and CentOS Key: MESOS-3260 URL: https://issues.apache.org/jira/browse/MESOS-3260 Project: Mesos Issue Type: Bug Environment: OSX 10.10.5 (14F6a), Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn) Reporter: Till Toenshoff Priority: Blocker Running a plain configure and make check on OSX currently leads to the following: {noformat} [ RUN ] SchedulerTest.Subscribe ../../src/tests/scheduler_tests.cpp:168: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::SUBSCRIBED Which is: SUBSCRIBED ../../src/tests/scheduler_tests.cpp:169: Failure Value of: event.get().subscribed().framework_id() Actual: Expected: id Which is: 20150813-222454-347252928-56290-60707- [ FAILED ] SchedulerTest.Subscribe (183 ms) [ RUN ] SchedulerTest.TaskRunning ../../src/tests/scheduler_tests.cpp:227: Failure Value of: event.get().type() Actual: HEARTBEAT Expected: Event::OFFERS Which is: OFFERS ../../src/tests/scheduler_tests.cpp:228: Failure Expected: (0) != (event.get().offers().offers().size()), actual: 0 vs 0 [libprotobuf FATAL ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/google/protobuf/repeated_field.h:824] CHECK failed: (index) < (size()): ../../src/tests/scheduler_tests.cpp:237: Failure Actual function call count doesn't match EXPECT_CALL(containerizer, update(_, _))... Expected: to be called at least once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:233: Failure Actual function call count doesn't match EXPECT_CALL(exec, launchTask(_, _))... 
Expected: to be called once Actual: never called - unsatisfied and active ../../src/tests/scheduler_tests.cpp:230: Failure Actual function call count doesn't match EXPECT_CALL(exec, registered(_, _, _, _))... Expected: to be called once Actual: never called - unsatisfied and active unknown file: Failure C++ exception with description "CHECK failed: (index) < (size()): " thrown in the test body. *** Aborted at 1439497494 (unix time) try "date -d @1439497494" if you are using GNU date *** PC: @ 0x7fb2c0f20490 (unknown) *** SIGBUS (@0x7fb2c0f20490) received by PID 60707 (TID 0x7fff7a876300) stack trace: *** @ 0x7fff8a77ef1a _sigtramp @ 0x7fff532c9990 (unknown) @0x10d3bcedb mesos::internal::tests::MesosTest::ShutdownSlaves() @0x10d3bce75 mesos::internal::tests::MesosTest::Shutdown() @0x10d3b7d47 mesos::internal::tests::MesosTest::TearDown() @0x10dbc8283 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbafab7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db6f8ba testing::Test::Run() @0x10db70deb testing::TestInfo::Run() @0x10db71ab7 testing::TestCase::Run() @0x10db804b3 testing::internal::UnitTestImpl::RunAllTests() @0x10dbc4fe3 testing::internal::HandleSehExceptionsInMethodIfSupported() @0x10dbb1ea7 testing::internal::HandleExceptionsInMethodIfSupported() @0x10db800b0 testing::UnitTest::Run() @0x10d10c8d1 RUN_ALL_TESTS() @0x10d108b87 main @ 0x7fff8da765c9 start Bus error: 10 {noformat} Results on CentOS look similar.
[jira] [Commented] (MESOS-3149) Use setuptools to install python cli package
[ https://issues.apache.org/jira/browse/MESOS-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695761#comment-14695761 ]

Marco Massenzio commented on MESOS-3149:
----------------------------------------

I've asked a couple of colleagues at Mesosphere who are familiar with Python to also review the patch, and we all seem to agree that the changes are fine. We will be committing this patch soon.

Use setuptools to install python cli package
--------------------------------------------

Key: MESOS-3149
URL: https://issues.apache.org/jira/browse/MESOS-3149
Project: Mesos
Issue Type: Task
Reporter: haosdent
Assignee: haosdent

mesos-ps and mesos-cat, which depend on src/cli/python/mesos, do not work on OSX because src/cli/python is not added to sys.path. It's time to finish this TODO.

{code}
# Add 'src/cli/python' to PYTHONPATH.
# TODO(benh): Remove this if/when we install the 'mesos' module via
# PIP and setuptools.
PYTHONPATH=@abs_top_srcdir@/src/cli/python:${PYTHONPATH}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata
[ https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695826#comment-14695826 ]

James DeFelice commented on MESOS-2841:
---------------------------------------

I've addressed the outstanding review comments and submitted an updated patch for review: https://reviews.apache.org/r/37443/

FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata
--------------------------------------------------------------------------------------

Key: MESOS-2841
URL: https://issues.apache.org/jira/browse/MESOS-2841
Project: Mesos
Issue Type: Improvement
Reporter: James DeFelice
Assignee: Neil Conway
Labels: mesosphere

A framework instance may offer specific capabilities to the cluster: storage, smartly-balanced request handling across deployed tasks, access to 3rd party services outside of the cluster, etc. These capabilities may or may not be utilized by all, or even most mesos clusters. However, it should be possible for processes running in the cluster to discover capabilities or features of frameworks in order to achieve a higher level of functionality and a more seamless integration experience across the cluster.

A rich discovery API attached to the FrameworkInfo could result in some form of early lock-in: there are probably many ways to realize cross-framework integration and external services integration that we haven't considered yet. Rather than over-specify a discovery info message type at the framework level I think FrameworkInfo should expose a **very generic** way to supply metadata for interested consumers (other processes, tasks, etc).

Adding a Labels field to FrameworkInfo reuses an existing message type and seems to fit well with the overall intent: attaching generic metadata to a framework instance. These labels should be visible when querying a mesos master's state.json endpoint.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata
[ https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695852#comment-14695852 ]

Neil Conway commented on MESOS-2841:
------------------------------------

[~jdef] Thanks for finishing this off! My apologies for being flaky (vacation/travel, etc.). I'm back now, in case any more work is needed here.

FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata
--------------------------------------------------------------------------------------

Key: MESOS-2841
URL: https://issues.apache.org/jira/browse/MESOS-2841
Project: Mesos
Issue Type: Improvement
Reporter: James DeFelice
Assignee: Neil Conway
Labels: mesosphere

A framework instance may offer specific capabilities to the cluster: storage, smartly-balanced request handling across deployed tasks, access to 3rd party services outside of the cluster, etc. These capabilities may or may not be utilized by all, or even most mesos clusters. However, it should be possible for processes running in the cluster to discover capabilities or features of frameworks in order to achieve a higher level of functionality and a more seamless integration experience across the cluster.

A rich discovery API attached to the FrameworkInfo could result in some form of early lock-in: there are probably many ways to realize cross-framework integration and external services integration that we haven't considered yet. Rather than over-specify a discovery info message type at the framework level I think FrameworkInfo should expose a **very generic** way to supply metadata for interested consumers (other processes, tasks, etc).

Adding a Labels field to FrameworkInfo reuses an existing message type and seems to fit well with the overall intent: attaching generic metadata to a framework instance. These labels should be visible when querying a mesos master's state.json endpoint.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2497) Create synchronous validations for Calls
[ https://issues.apache.org/jira/browse/MESOS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695769#comment-14695769 ]

Benjamin Mahler commented on MESOS-2497:
----------------------------------------

Proper validation of the 'Accept' header and a bug fix:

{noformat}
commit 61f71f05ad2a7e9205565437b3243aa84072bf84
Author: Isabel Jimenez cont...@isabeljimenez.com
Date:   Thu Aug 13 10:37:19 2015 -0700

    Updated /scheduler endopint to use Request::acceptsMediaType.

    Review: https://reviews.apache.org/r/37403
{noformat}

{noformat}
commit b3c18d6d6179ac34be89545dc3b8a9333c91ebb7
Author: Benjamin Mahler benjamin.mah...@gmail.com
Date:   Thu Aug 13 11:43:06 2015 -0700

    Ensure the Content-Type is set for the streaming scheduler endpoint.
{noformat}

Create synchronous validations for Calls
----------------------------------------

Key: MESOS-2497
URL: https://issues.apache.org/jira/browse/MESOS-2497
Project: Mesos
Issue Type: Bug
Reporter: Isabel Jimenez
Assignee: Isabel Jimenez
Labels: HTTP, mesosphere

The /call endpoint returns a 202 Accepted code, but it has to perform some basic validations first. If validation fails, it returns a 4xx code. We have to create a mechanism that validates the 'request' and sends back the appropriate code.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
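The validate-then-accept flow described in MESOS-2497 could look roughly like the sketch below. It is only an illustration in Python, not the Mesos implementation: the helper name `validate_call`, the list of supported media types, the plain-dict header lookup, and the exact 4xx codes chosen for each check are all assumptions. The Accept handling is deliberately simplistic (no comma-separated lists or q-values).

```python
from http import HTTPStatus

# Media types this sketch accepts; assumed for illustration only.
SUPPORTED_TYPES = ("application/json", "application/x-protobuf")


def validate_call(method, headers, body):
    """Return the HTTP status a /call-style endpoint might answer with.

    Hypothetical helper: every check is synchronous and runs before the
    call would be queued for asynchronous processing.
    """
    if method != "POST":
        return HTTPStatus.METHOD_NOT_ALLOWED          # 405
    if headers.get("Content-Type") not in SUPPORTED_TYPES:
        return HTTPStatus.UNSUPPORTED_MEDIA_TYPE      # 415
    accept = headers.get("Accept", "*/*")
    if accept != "*/*" and accept not in SUPPORTED_TYPES:
        return HTTPStatus.NOT_ACCEPTABLE              # 406
    if not body:
        return HTTPStatus.BAD_REQUEST                 # 400
    # All basic validations passed; the call is accepted for processing.
    return HTTPStatus.ACCEPTED                        # 202
```

Keeping the checks in one pure function makes each rejection path easy to unit test without starting an HTTP server.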
[jira] [Created] (MESOS-3259) Support health checks in Docker Containerizer
Timothy Chen created MESOS-3259:
--------------------------------

Summary: Support health checks in Docker Containerizer
Key: MESOS-3259
URL: https://issues.apache.org/jira/browse/MESOS-3259
Project: Mesos
Issue Type: Improvement
Components: docker
Reporter: Timothy Chen
Assignee: Jojy Varghese

We need to support docker exec health checks in a container within the docker executor. A health check is defined in a TaskInfo and it's not supported in the Docker Containerizer.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3260) SchedulerTest.* are broken on OSX and CentOS
[ https://issues.apache.org/jira/browse/MESOS-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone updated MESOS-3260:
-----------------------------

Assignee: Vinod Kone
Affects Version/s: 0.24.0
Target Version/s: 0.24.0

SchedulerTest.* are broken on OSX and CentOS
--------------------------------------------

Key: MESOS-3260
URL: https://issues.apache.org/jira/browse/MESOS-3260
Project: Mesos
Issue Type: Bug
Affects Versions: 0.24.0
Environment: OSX 10.10.5 (14F6a), Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Reporter: Till Toenshoff
Assignee: Vinod Kone
Priority: Blocker

Running a plain configure and make check on OSX does currently lead to the following:

{noformat}
[ RUN      ] SchedulerTest.Subscribe
../../src/tests/scheduler_tests.cpp:168: Failure
Value of: event.get().type()
  Actual: HEARTBEAT
Expected: Event::SUBSCRIBED
Which is: SUBSCRIBED
../../src/tests/scheduler_tests.cpp:169: Failure
Value of: event.get().subscribed().framework_id()
  Actual:
Expected: id
Which is: 20150813-222454-347252928-56290-60707-
[  FAILED  ] SchedulerTest.Subscribe (183 ms)
[ RUN      ] SchedulerTest.TaskRunning
../../src/tests/scheduler_tests.cpp:227: Failure
Value of: event.get().type()
  Actual: HEARTBEAT
Expected: Event::OFFERS
Which is: OFFERS
../../src/tests/scheduler_tests.cpp:228: Failure
Expected: (0) != (event.get().offers().offers().size()), actual: 0 vs 0
[libprotobuf FATAL ../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src/google/protobuf/repeated_field.h:824] CHECK failed: (index) < (size()):
../../src/tests/scheduler_tests.cpp:237: Failure
Actual function call count doesn't match EXPECT_CALL(containerizer, update(_, _))...
         Expected: to be called at least once
           Actual: never called - unsatisfied and active
../../src/tests/scheduler_tests.cpp:233: Failure
Actual function call count doesn't match EXPECT_CALL(exec, launchTask(_, _))...
         Expected: to be called once
           Actual: never called - unsatisfied and active
../../src/tests/scheduler_tests.cpp:230: Failure
Actual function call count doesn't match EXPECT_CALL(exec, registered(_, _, _, _))...
         Expected: to be called once
           Actual: never called - unsatisfied and active
unknown file: Failure
C++ exception with description CHECK failed: (index) < (size()): thrown in the test body.
*** Aborted at 1439497494 (unix time) try date -d @1439497494 if you are using GNU date ***
PC: @ 0x7fb2c0f20490 (unknown)
*** SIGBUS (@0x7fb2c0f20490) received by PID 60707 (TID 0x7fff7a876300) stack trace: ***
    @ 0x7fff8a77ef1a _sigtramp
    @ 0x7fff532c9990 (unknown)
    @ 0x10d3bcedb mesos::internal::tests::MesosTest::ShutdownSlaves()
    @ 0x10d3bce75 mesos::internal::tests::MesosTest::Shutdown()
    @ 0x10d3b7d47 mesos::internal::tests::MesosTest::TearDown()
    @ 0x10dbc8283 testing::internal::HandleSehExceptionsInMethodIfSupported()
    @ 0x10dbafab7 testing::internal::HandleExceptionsInMethodIfSupported()
    @ 0x10db6f8ba testing::Test::Run()
    @ 0x10db70deb testing::TestInfo::Run()
    @ 0x10db71ab7 testing::TestCase::Run()
    @ 0x10db804b3 testing::internal::UnitTestImpl::RunAllTests()
    @ 0x10dbc4fe3 testing::internal::HandleSehExceptionsInMethodIfSupported()
    @ 0x10dbb1ea7 testing::internal::HandleExceptionsInMethodIfSupported()
    @ 0x10db800b0 testing::UnitTest::Run()
    @ 0x10d10c8d1 RUN_ALL_TESTS()
    @ 0x10d108b87 main
    @ 0x7fff8da765c9 start
Bus error: 10
{noformat}

Results on CentOS look similar.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3187) Docker cli option support
[ https://issues.apache.org/jira/browse/MESOS-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vaibhav Khanduja updated MESOS-3187:
-----------------------------------

Issue Type: Bug (was: Improvement)

Docker cli option support
-------------------------

Key: MESOS-3187
URL: https://issues.apache.org/jira/browse/MESOS-3187
Project: Mesos
Issue Type: Bug
Components: docker, slave
Reporter: Vaibhav Khanduja
Assignee: Vaibhav Khanduja
Priority: Minor

The Mesos slave today supports Docker as a container environment. The docker CLI supports many more options than the Mesos slave does. The slave command-line options should be enhanced to support such parameters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2406) Add CLI tool for creating persistent volumes for pre-existing data
[ https://issues.apache.org/jira/browse/MESOS-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696387#comment-14696387 ]

Klaus Ma commented on MESOS-2406:
---------------------------------

If no developer is working on this, I'd like to give it a try.

Add CLI tool for creating persistent volumes for pre-existing data
------------------------------------------------------------------

Key: MESOS-2406
URL: https://issues.apache.org/jira/browse/MESOS-2406
Project: Mesos
Issue Type: Task
Reporter: Jie Yu

This is for the case where the user has some pre-existing data under a certain directory (e.g., /var/lib/cassandra) and wants to expose that directory as a persistent volume to the framework.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-2406) Add CLI tool for creating persistent volumes for pre-existing data
[ https://issues.apache.org/jira/browse/MESOS-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Klaus Ma reassigned MESOS-2406:
------------------------------

Assignee: Klaus Ma

Add CLI tool for creating persistent volumes for pre-existing data
------------------------------------------------------------------

Key: MESOS-2406
URL: https://issues.apache.org/jira/browse/MESOS-2406
Project: Mesos
Issue Type: Task
Reporter: Jie Yu
Assignee: Klaus Ma

This is for the case where the user has some pre-existing data under a certain directory (e.g., /var/lib/cassandra) and wants to expose that directory as a persistent volume to the framework.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3189) TimeTest.Now fails with --enable-libevent
[ https://issues.apache.org/jira/browse/MESOS-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696461#comment-14696461 ]

Vinod Kone commented on MESOS-3189:
-----------------------------------

Try running the test in a loop to see if you can repro.

make check GTEST_FILTER="*TimeTest.Now*" GTEST_REPEAT=-1 GTEST_BREAK_ON_FAILURE=1

TimeTest.Now fails with --enable-libevent
-----------------------------------------

Key: MESOS-3189
URL: https://issues.apache.org/jira/browse/MESOS-3189
Project: Mesos
Issue Type: Bug
Components: libprocess
Affects Versions: 0.23.0
Reporter: Joris Van Remoortere
Labels: beginner, libprocess, mesosphere, newbie

[ RUN      ] TimeTest.Now
../../../3rdparty/libprocess/src/tests/time_tests.cpp:50: Failure
Expected: (Microseconds(10)) < (Clock::now() - t1), actual: 8-byte object <10-27 00-00 00-00 00-00> vs 0ns
[  FAILED  ] TimeTest.Now (0 ms)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3187) Docker cli option support
[ https://issues.apache.org/jira/browse/MESOS-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vaibhav Khanduja updated MESOS-3187:
-----------------------------------

Issue Type: Improvement (was: Bug)

Docker cli option support
-------------------------

Key: MESOS-3187
URL: https://issues.apache.org/jira/browse/MESOS-3187
Project: Mesos
Issue Type: Improvement
Components: docker, slave
Reporter: Vaibhav Khanduja
Assignee: Vaibhav Khanduja
Priority: Minor

The Mesos slave today supports Docker as a container environment. The docker CLI supports many more options than the Mesos slave does. The slave command-line options should be enhanced to support such parameters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2516) Move allocation-related types to mesos::master namespace
[ https://issues.apache.org/jira/browse/MESOS-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696326#comment-14696326 ]

José Guilherme Vanz commented on MESOS-2516:
--------------------------------------------

https://reviews.apache.org/r/37468/

Move allocation-related types to mesos::master namespace
--------------------------------------------------------

Key: MESOS-2516
URL: https://issues.apache.org/jira/browse/MESOS-2516
Project: Mesos
Issue Type: Improvement
Components: allocation
Reporter: Alexander Rukletsov
Assignee: José Guilherme Vanz
Priority: Minor
Labels: easyfix, newbie

The {{Allocator}}, {{Sorter}} and {{Comparator}} types live in the {{master::allocator}} namespace. This is not consistent with the rest of the codebase: {{Isolator}}, {{Fetcher}}, {{Containerizer}} all live in the {{slave}} namespace. The {{allocator}} namespace should be removed for consistency. Since sorters are poorly named, they should be renamed (or namespaced) prior to this change in order not to pollute the {{master}} namespace.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3189) TimeTest.Now fails with --enable-libevent
[ https://issues.apache.org/jira/browse/MESOS-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696377#comment-14696377 ]

José Guilherme Vanz commented on MESOS-3189:
--------------------------------------------

I'm trying to reproduce this problem, but I could not. I executed `configure --enable-libevent` and `make check` on the current HEAD commit and the `TimeTest.Now(0ms)` test is passing. I'm using Fedora 22; maybe there is some issue with libevent on other OSes? What is your environment?

TimeTest.Now fails with --enable-libevent
-----------------------------------------

Key: MESOS-3189
URL: https://issues.apache.org/jira/browse/MESOS-3189
Project: Mesos
Issue Type: Bug
Components: libprocess
Affects Versions: 0.23.0
Reporter: Joris Van Remoortere
Labels: beginner, libprocess, mesosphere, newbie

[ RUN      ] TimeTest.Now
../../../3rdparty/libprocess/src/tests/time_tests.cpp:50: Failure
Expected: (Microseconds(10)) < (Clock::now() - t1), actual: 8-byte object <10-27 00-00 00-00 00-00> vs 0ns
[  FAILED  ] TimeTest.Now (0 ms)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (MESOS-3189) TimeTest.Now fails with --enable-libevent
[ https://issues.apache.org/jira/browse/MESOS-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696377#comment-14696377 ]

José Guilherme Vanz edited comment on MESOS-3189 at 8/14/15 3:32 AM:
---------------------------------------------------------------------

I was trying to reproduce this issue, but I could not. I executed `configure --enable-libevent` and `make check` on the current HEAD commit and the `TimeTest.Now(0ms)` test is passing. I'm using Fedora 22; maybe there is some issue with libevent on other OSes? What is your environment?

was (Author: jvanz):
I'm trying simulate these problem, but I could not. I executed `configure --enable-libevent` and `make check` in the current HEAD commit and the `TimeTest.Now(0ms)` test is passing. I'm using Fedora 22, maybe are there some issue in the libevent in other OS? What is your environment?

TimeTest.Now fails with --enable-libevent
-----------------------------------------

Key: MESOS-3189
URL: https://issues.apache.org/jira/browse/MESOS-3189
Project: Mesos
Issue Type: Bug
Components: libprocess
Affects Versions: 0.23.0
Reporter: Joris Van Remoortere
Labels: beginner, libprocess, mesosphere, newbie

[ RUN      ] TimeTest.Now
../../../3rdparty/libprocess/src/tests/time_tests.cpp:50: Failure
Expected: (Microseconds(10)) < (Clock::now() - t1), actual: 8-byte object <10-27 00-00 00-00 00-00> vs 0ns
[  FAILED  ] TimeTest.Now (0 ms)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-2912) Provide a Python library for master detection
[ https://issues.apache.org/jira/browse/MESOS-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695435#comment-14695435 ]

Marco Massenzio commented on MESOS-2912:
----------------------------------------

Review posted at https://github.com/mesos/commons/pull/2

[~vinodkone] mentioned he had comments about it (but no review yet). It would be good to have those in before I proceed any further with adding more features/functionality to it.

Provide a Python library for master detection
---------------------------------------------

Key: MESOS-2912
URL: https://issues.apache.org/jira/browse/MESOS-2912
Project: Mesos
Issue Type: Task
Reporter: Vinod Kone
Assignee: Marco Massenzio
Labels: mesosphere

When schedulers start interacting with the Mesos master via HTTP endpoints, they need a way to detect masters. Mesos should provide a master detection Python library to make this easy for frameworks.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
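As a rough illustration of what the core of such a detection library might do, the sketch below picks a leader from a list of ZooKeeper child node names by taking the lowest sequence number, which is the usual ZooKeeper leader-election convention. This is an assumption for illustration, not something stated in this thread or taken from the linked pull request: the function name `pick_leader`, the `info_` prefix, and the sample node names are all hypothetical, and fetching the children from ZooKeeper is left out.

```python
def pick_leader(children, prefix="info_"):
    """Return the child node with the smallest sequence suffix, or None.

    `children` is a plain list of znode names, as a ZooKeeper client
    would return them. Names that do not match `prefix` followed by
    digits are ignored. The prefix is a hypothetical default.
    """
    candidates = []
    for name in children:
        if not name.startswith(prefix):
            continue
        suffix = name[len(prefix):]
        if suffix.isdigit():
            candidates.append((int(suffix), name))
    # The node with the lowest sequence number is treated as the leader.
    return min(candidates)[1] if candidates else None
```

A scheduler would then read the contents of the chosen node to find the address of the master and re-run the selection whenever the list of children changes.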
[jira] [Commented] (MESOS-3070) Master CHECK failure if a framework uses duplicated task id.
[ https://issues.apache.org/jira/browse/MESOS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695439#comment-14695439 ]

Vinod Kone commented on MESOS-3070:
-----------------------------------

Thanks for the update on progress.

Not sure how Option 1 works. If the old slave kills the duplicate task, the master will receive terminal updates for the task, which might confuse the master into thinking that the *new* task has terminated.

Option 2 is a heavy hammer because it might become a scalability bottleneck if the master has to persist every task info.

Note that a duplicate task id is only a problem if the duplicate task is being launched on a different slave than the original slave. If it were the same slave, the master would've rejected it! So how about storing tasks in the master in a per-slave map instead of a global tasks map? That way the master can be smarter when receiving duplicate task launches or status updates.

Master CHECK failure if a framework uses duplicated task id.
------------------------------------------------------------

Key: MESOS-3070
URL: https://issues.apache.org/jira/browse/MESOS-3070
Project: Mesos
Issue Type: Bug
Components: master
Affects Versions: 0.22.1
Reporter: Jie Yu
Assignee: Klaus Ma

We observed this in one of our testing clusters. One framework (under development) keeps launching tasks using the same task_id. We don't expect the master to crash even if the framework is not doing what it's supposed to do. However, under the following series of events, this can happen, and it keeps crashing the master.

1) frameworkA launches task 'task_id_1' on slaveA
2) master fails over
3) slaveA has not re-registered yet
4) frameworkA re-registers and launches task 'task_id_1' on slaveB
5) slaveA re-registers and adds task 'task_id_1' to frameworkA
6) CHECK failure in addTask

{noformat}
I0716 21:52:50.759305 28805 master.hpp:159] Adding task 'task_id_1' with resources cpus(*):4; mem(*):32768 on slave 20150417-232509-1735470090-5050-48870-S25 (hostname)
...
...
F0716 21:52:50.760136 28805 master.hpp:362] Check failed: !tasks.contains(task->task_id()) Duplicate task 'task_id_1' of framework framework_id
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
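The per-slave map suggested in the MESOS-3070 comment above can be sketched as follows. This is an illustrative Python model only, not the master's C++ data structures: the class name `TaskIndex`, its methods, and the string ids are hypothetical. It shows how the six-step sequence from the issue stops being a fatal condition once tasks are keyed per slave.

```python
from collections import defaultdict


class TaskIndex:
    """Tracks tasks per slave, keyed by (framework_id, task_id)."""

    def __init__(self):
        # slave_id -> {(framework_id, task_id): task_info}
        self._by_slave = defaultdict(dict)

    def add(self, slave_id, framework_id, task_id, info):
        """Record a task; return False for a duplicate on the same slave.

        The same task id on a *different* slave is accepted, so a slave
        re-registering after a master failover does not trip a fatal check.
        """
        tasks = self._by_slave[slave_id]
        key = (framework_id, task_id)
        if key in tasks:
            return False
        tasks[key] = info
        return True

    def slaves_running(self, framework_id, task_id):
        """List the slaves that currently hold this task id."""
        key = (framework_id, task_id)
        return sorted(s for s, tasks in self._by_slave.items() if key in tasks)

    def remove(self, slave_id, framework_id, task_id):
        """Drop the task from one slave only; other slaves are unaffected."""
        return self._by_slave[slave_id].pop((framework_id, task_id), None)
```

Because a terminal update removes the task only from the slave that reported it, an update from slaveA no longer affects the copy running on slaveB.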
[jira] [Created] (MESOS-3257) Zookeeper JVM test failure causes test harness to fail
Paul Brett created MESOS-3257:
------------------------------

Summary: Zookeeper JVM test failure causes test harness to fail
Key: MESOS-3257
URL: https://issues.apache.org/jira/browse/MESOS-3257
Project: Mesos
Issue Type: Bug
Reporter: Paul Brett

A failure in the ZooKeeper JVM test setup causes the test harness to exit, preventing subsequent tests from running.

{code}
[----------] 2 tests from LogZooKeeperTest
F0813 16:09:33.647265 13790 zookeeper.cpp:78] CHECK_SOME(jvm): Error looking up symbol 'JNI_CreateJavaVM' in '' : /home/pbrett/sandbox/perf.refactor2/build/src/.libs/mesos-tests: undefined symbol: JNI_CreateJavaVM
*** Check failure stack trace: ***
    @ 0x7f2d8cca7aac google::LogMessage::Fail()
    @ 0x7f2d8cca79fb google::LogMessage::SendToLog()
    @ 0x7f2d8cca740c google::LogMessage::Flush()
    @ 0x7f2d8ccaa140 google::LogMessageFatal::~LogMessageFatal()
    @ 0x8a938c _CheckFatal::~_CheckFatal()
    @ 0x12f68c0 mesos::internal::tests::ZooKeeperTest::SetUpTestCase()
    @ 0x132a88a testing::TestCase::RunSetUpTestCase()
    @ 0x1334cf7 testing::internal::HandleSehExceptionsInMethodIfSupported()
    @ 0x132fb94 testing::internal::HandleExceptionsInMethodIfSupported()
    @ 0x1311635 testing::TestCase::Run()
    @ 0x1317fca testing::internal::UnitTestImpl::RunAllTests()
    @ 0x1335427 testing::internal::HandleSehExceptionsInMethodIfSupported()
    @ 0x1330128 testing::internal::HandleExceptionsInMethodIfSupported()
    @ 0x1316cf0 testing::UnitTest::Run()
    @ 0xc3a9d8 RUN_ALL_TESTS()
    @ 0xc3a6c8 main
    @ 0x7f2d8818d9f4 __libc_start_main
    @ 0x8a5fa9 (unknown)
make[3]: *** [check-local] Aborted
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-1013) ExamplesTest.JavaLog is flaky
[ https://issues.apache.org/jira/browse/MESOS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Greg Mann updated MESOS-1013:
----------------------------

Shepherd: Till Toenshoff (was: Joris Van Remoortere)

ExamplesTest.JavaLog is flaky
-----------------------------

Key: MESOS-1013
URL: https://issues.apache.org/jira/browse/MESOS-1013
Project: Mesos
Issue Type: Bug
Components: test
Affects Versions: 0.19.0
Reporter: Vinod Kone
Assignee: Greg Mann
Labels: flaky, mesosphere
Attachments: ExamplesTest.JavaLog.logs

The {{ExamplesTest.JavaLog}} test framework is flaky, possibly related to a race condition between mutexes.

{noformat}
[ RUN      ] ExamplesTest.JavaLog
Using temporary directory '/tmp/ExamplesTest_JavaLog_WBWEb9'
Feb 18, 2014 12:10:57 PM TestLog main
INFO: Starting a local ZooKeeper server
...
F0218 12:10:58.575036 17450 coordinator.cpp:394] Check failed: !missing Not expecting local replica to be missing position 3 after the writing is done
*** Check failure stack trace: ***
tests/script.cpp:81: Failure
Failed
java_log_test.sh terminated with signal 'Aborted'
[  FAILED  ] ExamplesTest.JavaLog (2166 ms)
{noformat}

Full logs attached.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)