Thomas Marshall created MESOS-514:
-------------------------------------

             Summary: FaultToleranceTest.ReconcileIncompleteTasks is flaky
                 Key: MESOS-514
                 URL: https://issues.apache.org/jira/browse/MESOS-514
             Project: Mesos
          Issue Type: Bug
            Reporter: Thomas Marshall


https://hadrian.millennium.berkeley.edu/jenkins/job/Mesos-minimal/366/consoleFull
 
https://hadrian.millennium.berkeley.edu/jenkins/job/Mesos/302/console

[ RUN      ] FaultToleranceTest.ReconcileIncompleteTasks
I0617 03:47:37.707007 10205 master.cpp:228] Master started on 127.0.1.1:32998
I0617 03:47:37.707059 10205 master.cpp:243] Master ID: 
201306170347-16842879-32998-10185
I0617 03:47:37.707181 10206 slave.cpp:219] Slave started on 78)@127.0.1.1:32998
I0617 03:47:37.707248 10206 slave.cpp:220] Slave resources: cpus=2; mem=1024; 
ports=[31000-32000]; disk=1024
W0617 03:47:37.707418 10204 master.cpp:83] No whitelist given. Advertising 
offers for all slaves
I0617 03:47:37.707494 10205 master.cpp:526] Elected as master!
I0617 03:47:37.707707 10205 master.cpp:569] Registering framework 
201306170347-16842879-32998-10185-0000 at scheduler(68)@127.0.1.1:32998
I0617 03:47:37.707897 10206 slave.cpp:540] New master detected at 
master@127.0.1.1:32998
I0617 03:47:37.707911 10205 hierarchical_allocator_process.hpp:327] Added 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.708050 10206 slave.cpp:555] Postponing registration until 
recovery is complete
I0617 03:47:37.708091 10207 status_update_manager.cpp:155] New master detected 
at master@127.0.1.1:32998
I0617 03:47:37.708112 10206 slave.cpp:401] Finished recovery
I0617 03:47:37.708359 10207 master.cpp:891] Attempting to register slave on 
ubuntu at slave(78)@127.0.1.1:32998
I0617 03:47:37.708413 10207 master.cpp:1851] Adding slave 
201306170347-16842879-32998-10185-0 at ubuntu with cpus=2; mem=1024; 
ports=[31000-32000]; disk=1024
I0617 03:47:37.708566 10205 slave.cpp:600] Registered with master 
master@127.0.1.1:32998; given slave ID 201306170347-16842879-32998-10185-0
I0617 03:47:37.708719 10205 hierarchical_allocator_process.hpp:449] Added slave 
201306170347-16842879-32998-10185-0 (ubuntu) with cpus=2; mem=1024; 
ports=[31000-32000]; disk=1024 (and cpus=2; mem=1024; ports=[31000-32000]; 
disk=1024 available)
I0617 03:47:37.709128 10207 master.cpp:1239] Sending 1 offers to framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.710453 10207 master.cpp:1472] Processing reply for offer 
201306170347-16842879-32998-10185-0 on slave 
201306170347-16842879-32998-10185-0 (ubuntu) for framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.710728 10207 master.hpp:291] Adding task 1 with resources 
cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 
201306170347-16842879-32998-10185-0
I0617 03:47:37.710814 10207 master.cpp:1591] Launching task 1 of framework 
201306170347-16842879-32998-10185-0000 with resources cpus=2; mem=1024; 
ports=[31000-32000]; disk=1024 on slave 201306170347-16842879-32998-10185-0 
(ubuntu)
I0617 03:47:37.711158 10207 slave.cpp:740] Got assigned task 1 for framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.711511 10207 slave.cpp:838] Launching task 1 for framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.713037 10207 paths.hpp:303] Created executor directory 
'/tmp/FaultToleranceTest_ReconcileIncompleteTasks_fTS0CG/slaves/201306170347-16842879-32998-10185-0/frameworks/201306170347-16842879-32998-10185-0000/executors/default/runs/9db6775d-9610-4eeb-ac97-50349467607a'
I0617 03:47:37.713376 10207 slave.cpp:949] Queuing task '1' for executor 
default of framework '201306170347-16842879-32998-10185-0000
I0617 03:47:37.713525 10207 slave.cpp:522] Successfully attached file 
'/tmp/FaultToleranceTest_ReconcileIncompleteTasks_fTS0CG/slaves/201306170347-16842879-32998-10185-0/frameworks/201306170347-16842879-32998-10185-0000/executors/default/runs/9db6775d-9610-4eeb-ac97-50349467607a'
I0617 03:47:37.714153 10206 slave.cpp:1396] Got registration for executor 
'default' of framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.714453 10206 slave.cpp:1511] Flushing queued task 1 for executor 
'default' of framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.716209 10206 slave.cpp:1693] Handling status update 
TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.716699 10206 status_update_manager.cpp:290] Received status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000 with checkpoint=false
I0617 03:47:37.716789 10206 status_update_manager.cpp:450] Creating 
StatusUpdate stream for task 1 of framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.716927 10206 status_update_manager.cpp:336] Forwarding status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000 to master@127.0.1.1:32998
I0617 03:47:37.717314 10206 slave.cpp:1810] Sending acknowledgement for status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000 to executor(29)@127.0.1.1:32998
I0617 03:47:37.717918 10205 slave.cpp:540] New master detected at 
master@127.0.1.1:32998
I0617 03:47:37.718029 10207 status_update_manager.cpp:155] New master detected 
at master@127.0.1.1:32998
W0617 03:47:37.718075 10207 status_update_manager.cpp:165] Resending status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.718139 10207 status_update_manager.cpp:336] Forwarding status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000 to master@127.0.1.1:32998
W0617 03:47:37.718214 10206 master.cpp:944] Slave at slave(78)@127.0.1.1:32998 
(ubuntu) is being allowed to re-register with an already in use id 
(201306170347-16842879-32998-10185-0)
I0617 03:47:37.718479 10205 slave.cpp:636] Re-registered with master 
master@127.0.1.1:32998
I0617 03:47:37.718602 10205 slave.cpp:1278] Updating framework 
201306170347-16842879-32998-10185-0000 pid to scheduler(68)@127.0.1.1:32998
I0617 03:47:37.718904 10206 master.cpp:1022] Status update from 
slave(78)@127.0.1.1:32998: task 1 of framework 
201306170347-16842879-32998-10185-0000 is now in state TASK_FINISHED
W0617 03:47:37.719336 10204 master.cpp:83] No whitelist given. Advertising 
offers for all slaves
I0617 03:47:37.719465 10206 master.hpp:303] Removing task 1 with resources 
cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 
201306170347-16842879-32998-10185-0
W0617 03:47:37.719513 10204 status_update_manager.cpp:433] Resending status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.719606 10204 status_update_manager.cpp:336] Forwarding status 
update TASK_FINISHED (UUID: ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18) for task 1 of 
framework 201306170347-16842879-32998-10185-0000 to master@127.0.1.1:32998
I0617 03:47:37.719857 10206 hierarchical_allocator_process.hpp:616] Recovered 
cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (total allocatable: cpus=2; 
mem=1024; ports=[31000-32000]; disk=1024) on slave 
201306170347-16842879-32998-10185-0 from framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.719954 10206 master.cpp:1022] Status update from 
slave(78)@127.0.1.1:32998: task 1 of framework 
201306170347-16842879-32998-10185-0000 is now in state TASK_FINISHED
W0617 03:47:37.720090 10206 master.cpp:1065] Status update from 
slave(78)@127.0.1.1:32998 (ubuntu): error, couldn't lookup task 1
../../src/tests/fault_tolerance_tests.cpp:1504: Failure
Mock function called more times than expected - returning directly.
    Function call: statusUpdate(0x7fff842068e0, @0x2b759c001a30 56-byte object 
<90-CF 70-89 75-2B 00-00 00-00 00-00 00-00 00-00 E0-12 01-9C 75-2B 00-00 08-B5 
F1-00 00-00 00-00 08-B5 F1-00 00-00 00-00 02-00 00-00 00-00 00-00 03-00 00-00 
00-00 00-00>)
         Expected: to be called once
           Actual: called twice - over-saturated and active
I0617 03:47:37.720504 10204 master.cpp:385] Master terminating
I0617 03:47:37.720610 10185 master.cpp:207] Shutting down master
I0617 03:47:37.720737 10206 slave.cpp:496] Slave asked to shut down by 
master@127.0.1.1:32998
I0617 03:47:37.721110 10206 slave.cpp:1113] Asked to shut down framework 
201306170347-16842879-32998-10185-0000 by master@127.0.1.1:32998
I0617 03:47:37.721174 10206 slave.cpp:1138] Shutting down framework 
201306170347-16842879-32998-10185-0000
I0617 03:47:37.721268 10206 slave.cpp:2320] Shutting down executor 'default' of 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.720873 10207 status_update_manager.cpp:360] Received status 
update acknowledgement ff73e62c-f0b4-47e3-b196-2ed3b6c6ec18 for task 1 of 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.721436 10207 status_update_manager.cpp:481] Cleaning up status 
update stream for task 1 of framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.722100 10204 hierarchical_allocator_process.hpp:412] Deactivated 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.722244 10204 hierarchical_allocator_process.hpp:367] Removed 
framework 201306170347-16842879-32998-10185-0000
I0617 03:47:37.722342 10204 hierarchical_allocator_process.hpp:477] Removed 
slave 201306170347-16842879-32998-10185-0
I0617 03:47:37.722759 10206 slave.cpp:451] Slave terminating
I0617 03:47:37.722797 10206 slave.cpp:1113] Asked to shut down framework 
201306170347-16842879-32998-10185-0000 by @0.0.0.0:0
W0617 03:47:37.722825 10206 slave.cpp:1134] Ignoring shutdown framework 
201306170347-16842879-32998-10185-0000 because it is terminating
[  FAILED  ] FaultToleranceTest.ReconcileIncompleteTasks (18 ms)
