[ 
https://issues.apache.org/jira/browse/MESOS-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler resolved MESOS-778.
-----------------------------------

    Resolution: Cannot Reproduce

Was not able to reproduce with 1100 iterations, let's re-open this if we're 
still seeing flakiness in CI or locally.

> FaultToleranceTest.ReconcileIncompleteTasks test is flaky
> ---------------------------------------------------------
>
>                 Key: MESOS-778
>                 URL: https://issues.apache.org/jira/browse/MESOS-778
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>             Fix For: 0.16.0
>
>
> [ RUN      ] FaultToleranceTest.ReconcileIncompleteTasks
> I1025 09:51:27.338650 21123 master.cpp:284] Master started on 127.0.1.1:55043
> I1025 09:51:27.339184 21123 master.cpp:299] Master ID: 
> 201310250951-16842879-55043-21104
> I1025 09:51:27.339200 21123 master.cpp:302] Master only allowing 
> authenticated frameworks to register!
> I1025 09:51:27.339436 21126 hierarchical_allocator_process.hpp:302] 
> Initializing hierarchical allocator process with master : 
> [email protected]:55043
> I1025 09:51:27.339468 21126 master.cpp:85] No whitelist given. Advertising 
> offers for all slaves
> I1025 09:51:27.340128 21123 master.cpp:697] Elected as master!
> I1025 09:51:27.341193 21126 slave.cpp:109] Slave started on 
> 45)@127.0.1.1:55043
> I1025 09:51:27.341266 21126 slave.cpp:209] Slave resources: cpus(*):2; 
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I1025 09:51:27.341608 21126 slave.cpp:481] New master detected at 
> [email protected]:55043
> I1025 09:51:27.341631 21126 slave.cpp:496] Postponing registration until 
> recovery is complete
> I1025 09:51:27.341701 21126 sched.cpp:195] New master at 
> [email protected]:55043
> I1025 09:51:27.341716 21126 sched.cpp:281] Authenticating with master 
> [email protected]:55043
> I1025 09:51:27.341804 21126 status_update_manager.cpp:158] New master 
> detected at [email protected]:55043
> I1025 09:51:27.341868 21126 state.cpp:33] Recovering state from 
> '/tmp/FaultToleranceTest_ReconcileIncompleteTasks_q5zsfZ/meta'
> I1025 09:51:27.341933 21126 authenticatee.hpp:124] Creating new client SASL 
> connection
> I1025 09:51:27.342088 21126 master.cpp:1702] Authenticating framework at 
> scheduler(62)@127.0.1.1:55043
> I1025 09:51:27.342190 21126 status_update_manager.cpp:180] Recovering status 
> update manager
> I1025 09:51:27.342242 21126 authenticator.hpp:140] Creating new server SASL 
> connection
> I1025 09:51:27.342407 21126 authenticatee.hpp:212] Received SASL 
> authentication mechanisms: CRAM-MD5
> I1025 09:51:27.342425 21126 authenticatee.hpp:238] Attempting to authenticate 
> with mechanism 'CRAM-MD5'
> I1025 09:51:27.342491 21126 slave.cpp:2733] Finished recovery
> I1025 09:51:27.342555 21126 authenticator.hpp:243] Received SASL 
> authentication start
> I1025 09:51:27.342588 21126 authenticator.hpp:325] Authentication requires 
> more steps
> I1025 09:51:27.342656 21126 master.cpp:1223] Attempting to register slave on 
> raring at slave(45)@127.0.1.1:55043
> I1025 09:51:27.342669 21126 master.cpp:2481] Adding slave 
> 201310250951-16842879-55043-21104-0 at raring with cpus(*):2; mem(*):1024; 
> disk(*):1024; ports(*):[31000-32000]
> I1025 09:51:27.344199 21126 slave.cpp:547] Registered with master 
> [email protected]:55043; given slave ID 201310250951-16842879-55043-21104-0
> I1025 09:51:27.343286 21123 authenticatee.hpp:258] Received SASL 
> authentication step
> I1025 09:51:27.345064 21123 hierarchical_allocator_process.hpp:445] Added 
> slave 201310250951-16842879-55043-21104-0 (raring) with cpus(*):2; 
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; 
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I1025 09:51:27.345101 21123 hierarchical_allocator_process.hpp:708] Performed 
> allocation for slave 201310250951-16842879-55043-21104-0 in 5250ns
> I1025 09:51:27.345119 21123 authenticator.hpp:271] Received SASL 
> authentication step
> I1025 09:51:27.345134 21123 auxprop.cpp:81] Request to lookup properties for 
> user: 'test-principal' realm: 'raring' server FQDN: 'raring' 
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
> SASL_AUXPROP_AUTHZID: false 
> I1025 09:51:27.345141 21123 auxprop.cpp:153] Looking up auxiliary property 
> '*userPassword'
> I1025 09:51:27.345149 21123 auxprop.cpp:153] Looking up auxiliary property 
> '*cmusaslsecretCRAM-MD5'
> I1025 09:51:27.345157 21123 auxprop.cpp:81] Request to lookup properties for 
> user: 'test-principal' realm: 'raring' server FQDN: 'raring' 
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
> SASL_AUXPROP_AUTHZID: true 
> I1025 09:51:27.345163 21123 auxprop.cpp:103] Skipping auxiliary property 
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I1025 09:51:27.345167 21123 auxprop.cpp:103] Skipping auxiliary property 
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I1025 09:51:27.345176 21123 authenticator.hpp:317] Authentication success
> I1025 09:51:27.345198 21123 authenticatee.hpp:298] Authentication success
> I1025 09:51:27.345226 21123 master.cpp:1742] Successfully authenticated 
> framework at scheduler(62)@127.0.1.1:55043
> I1025 09:51:27.345269 21123 sched.cpp:326] Successfully authenticated with 
> master [email protected]:55043
> I1025 09:51:27.345329 21123 master.cpp:755] Received registration request 
> from scheduler(62)@127.0.1.1:55043
> I1025 09:51:27.345366 21123 master.cpp:773] Registering framework 
> 201310250951-16842879-55043-21104-0000 at scheduler(62)@127.0.1.1:55043
> I1025 09:51:27.345415 21123 sched.cpp:365] Framework registered with 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.345435 21123 sched.cpp:379] Scheduler::registered took 8475ns
> I1025 09:51:27.345453 21123 hierarchical_allocator_process.hpp:332] Added 
> framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.345484 21123 hierarchical_allocator_process.hpp:752] Offering 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 201310250951-16842879-55043-21104-0 to framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.345566 21123 hierarchical_allocator_process.hpp:688] Performed 
> allocation for 1 slaves in 102340ns
> I1025 09:51:27.345605 21123 master.hpp:413] Adding offer 
> 201310250951-16842879-55043-21104-0 with resources cpus(*):2; mem(*):1024; 
> disk(*):1024; ports(*):[31000-32000] on slave 
> 201310250951-16842879-55043-21104-0 (raring)
> I1025 09:51:27.345641 21123 master.cpp:1668] Sending 1 offers to framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.345720 21123 sched.cpp:472] Scheduler::resourceOffers took 
> 16496ns
> I1025 09:51:27.346945 21123 master.cpp:1994] Processing reply for offer 
> 201310250951-16842879-55043-21104-0 on slave 
> 201310250951-16842879-55043-21104-0 (raring) for framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.347056 21123 master.hpp:385] Adding task 1 with resources 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 201310250951-16842879-55043-21104-0 (raring)
> I1025 09:51:27.347105 21123 master.cpp:2118] Launching task 1 of framework 
> 201310250951-16842879-55043-21104-0000 with resources cpus(*):2; mem(*):1024; 
> disk(*):1024; ports(*):[31000-32000] on slave 
> 201310250951-16842879-55043-21104-0 (raring)
> I1025 09:51:27.347198 21123 master.hpp:423] Removing offer 
> 201310250951-16842879-55043-21104-0 with resources cpus(*):2; mem(*):1024; 
> disk(*):1024; ports(*):[31000-32000] on slave 
> 201310250951-16842879-55043-21104-0 (raring)
> I1025 09:51:27.347801 21126 slave.cpp:720] Got assigned task 1 for framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.347975 21126 slave.cpp:831] Launching task 1 for framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.349030 21126 slave.cpp:941] Queuing task '1' for executor 
> default of framework '201310250951-16842879-55043-21104-0000
> I1025 09:51:27.350592 21123 exec.cpp:174] Executor started at: 
> executor(17)@127.0.1.1:55043 with pid 21104
> I1025 09:51:27.351519 21126 slave.cpp:463] Successfully attached file 
> '/tmp/FaultToleranceTest_ReconcileIncompleteTasks_q5zsfZ/slaves/201310250951-16842879-55043-21104-0/frameworks/201310250951-16842879-55043-21104-0000/executors/default/runs/0ac3b3b4-4e6e-498e-aa14-f84b4a9c6562'
> I1025 09:51:27.352376 21126 slave.cpp:1403] Got registration for executor 
> 'default' of framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.353603 21126 slave.cpp:1524] Flushing queued task 1 for 
> executor 'default' of framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.354063 21124 exec.cpp:198] Executor registered on slave 
> 201310250951-16842879-55043-21104-0
> I1025 09:51:27.355964 21124 exec.cpp:210] Executor::registered took 14445ns
> I1025 09:51:27.356777 21124 exec.cpp:285] Executor asked to run task '1'
> I1025 09:51:27.357578 21124 exec.cpp:294] Executor::launchTask took 43283ns
> I1025 09:51:27.359156 21124 exec.cpp:505] Executor sending status update 
> TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 of 
> framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.360010 21126 slave.cpp:1737] Handling status update 
> TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 of 
> framework 201310250951-16842879-55043-21104-0000 from 
> executor(17)@127.0.1.1:55043
> I1025 09:51:27.360036 21126 slave.cpp:3155] Terminating task 1
> I1025 09:51:27.360687 21126 status_update_manager.cpp:305] Received status 
> update TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 
> of framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.360707 21126 status_update_manager.cpp:476] Creating 
> StatusUpdate stream for task 1 of framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.360739 21126 status_update_manager.cpp:356] Forwarding status 
> update TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 
> of framework 201310250951-16842879-55043-21104-0000 to [email protected]:55043
> I1025 09:51:27.361398 21126 slave.cpp:1856] Status update manager 
> successfully handled status update TASK_FINISHED (UUID: 
> 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 of framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.361418 21126 slave.cpp:1862] Sending acknowledgement for 
> status update TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for 
> task 1 of framework 201310250951-16842879-55043-21104-0000 to 
> executor(17)@127.0.1.1:55043
> I1025 09:51:27.362143 21126 slave.cpp:481] New master detected at 
> [email protected]:55043
> I1025 09:51:27.362225 21126 status_update_manager.cpp:158] New master 
> detected at [email protected]:55043
> W1025 09:51:27.362239 21126 status_update_manager.cpp:168] Resending status 
> update TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 
> of framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.362247 21126 status_update_manager.cpp:356] Forwarding status 
> update TASK_FINISHED (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 
> of framework 201310250951-16842879-55043-21104-0000 to [email protected]:55043
> W1025 09:51:27.362318 21126 master.cpp:1281] Slave at 
> slave(45)@127.0.1.1:55043 (raring) is being allowed to re-register with an 
> already in use id (201310250951-16842879-55043-21104-0)
> I1025 09:51:27.362484 21126 master.cpp:1425] Status update TASK_FINISHED 
> (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 of framework 
> 201310250951-16842879-55043-21104-0000 from slave(45)@127.0.1.1:55043
> I1025 09:51:27.362524 21126 master.hpp:403] Removing task 1 with resources 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 
> 201310250951-16842879-55043-21104-0 (raring)
> I1025 09:51:27.362613 21126 slave.cpp:594] Re-registered with master 
> [email protected]:55043
> I1025 09:51:27.362639 21126 slave.cpp:1289] Updating framework 
> 201310250951-16842879-55043-21104-0000 pid to scheduler(62)@127.0.1.1:55043
> I1025 09:51:27.363248 21123 hierarchical_allocator_process.hpp:637] Recovered 
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total 
> allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on 
> slave 201310250951-16842879-55043-21104-0 from framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.365298 21126 sched.cpp:527] Scheduler::statusUpdate took 
> 1.446906ms
> I1025 09:51:27.366274 21126 status_update_manager.cpp:380] Received status 
> update acknowledgement (UUID: 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 
> 1 of framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.367187 21126 status_update_manager.cpp:507] Cleaning up status 
> update stream for task 1 of framework 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.368058 21126 slave.cpp:1343] Status update manager 
> successfully handled status update acknowledgement (UUID: 
> 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8) for task 1 of framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:27.368967 21126 slave.cpp:3178] Completing task 1
> I1025 09:51:27.368535 21124 exec.cpp:331] Executor received status update 
> acknowledgement 875d7c12-8efa-4ca4-ad0f-fafaa5db89c8 for task 1 of framework 
> 201310250951-16842879-55043-21104-0000
> I1025 09:51:29.510723 21646 exec.cpp:84] Committing suicide by killing the 
> process group
> I1025 09:51:29.514854 21634 exec.cpp:84] Committing suicide by killing the 
> process group
> I1025 09:51:29.518934 21639 exec.cpp:84] Committing suicide by killing the 
> process group
> I1025 09:51:31.652802 21764 exec.cpp:84] Committing suicide by killing the 
> process group
> I1025 09:51:31.654765 21758 exec.cpp:84] Committing suicide by killing the 
> process group
> I1025 09:51:31.659418 21760 exec.cpp:84] Committing suicide by killing the 
> process group



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Reply via email to