Vinod Kone created MESOS-502:
--------------------------------
Summary: Slave crashes when handling duplicate terminal updates
Key: MESOS-502
URL: https://issues.apache.org/jira/browse/MESOS-502
Project: Mesos
Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Vinod Kone
Assignee: Vinod Kone
Fix For: 0.13.0
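Observed in production at Twitter, where we allow duplicate terminal status
updates. In the log below, the slave handles a first TASK_FINISHED for the
task, then crashes while handling a second TASK_FINISHED (different UUID) for
the same task: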
I0611 04:45:00.304193 11094 slave.cpp:1740] Handling status update
TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task
1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of
framework 201103282247-0000000019-0000
I0611 04:45:00.304843 11094 status_update_manager.cpp:290] Received status
update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task
1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of
framework 201103282247-0000000019-0000 with checkpoint=false
I0611 04:45:00.304852 11099 cgroups_isolator.cpp:656] Changing cgroup controls
for executor
thermos-1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a
of framework 201103282247-0000000019-0000 with resources cpus=0.25; mem=128;
disk=0
I0611 04:45:00.305250 11094 status_update_manager.cpp:336] Forwarding status
update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task
1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of
framework 201103282247-0000000019-0000 to [email protected]:5050
I0611 04:45:00.306172 11099 cgroups_isolator.cpp:853] Updated 'cpu.shares' to
255 for executor
thermos-1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a
of framework 201103282247-0000000019-0000
I0611 04:45:00.307164 11099 cgroups_isolator.cpp:991] Updated
'memory.soft_limit_in_bytes' to 134217728 for executor
thermos-1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a
of framework 201103282247-0000000019-0000
I0611 04:45:00.307320 11087 slave.cpp:1796] Status update manager successfully
handled status update TASK_FINISHED (UUID:
f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task
1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of
framework 201103282247-0000000019-0000
I0611 04:45:00.307601 11087 slave.cpp:1802] Sending acknowledgement for status
update TASK_FINISHED (UUID: f5a9b568-2a4e-4c73-bd2d-24a3e209fc46) for task
1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of
framework 201103282247-0000000019-0000 to executor(1)@10.34.20.131:38573
I0611 04:45:00.366597 11088 slave.cpp:1740] Handling status update
TASK_FINISHED (UUID: 3bd6cbd7-b39c-4b83-81b5-b83c50fa4327) for task
1370925835073-mesos-meta_slave_32-30-447c967b-5202-4ad2-9e68-3dcc5c45f38a of
framework 201103282247-0000000019-0000
F0611 04:45:00.367133 11088 slave.cpp:2964] Check failed: 'task' Must be non
NULL
*** Check failure stack trace: ***
@ 0x7f2f3ae09ddd google::LogMessage::Fail()
@ 0x7f2f3ae0fa47 google::LogMessage::SendToLog()
@ 0x7f2f3ae0b68c google::LogMessage::Flush()
@ 0x7f2f3ae0b8f6 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f2f3aa5641d google::CheckNotNull<>()
@ 0x7f2f3aad38b3 mesos::internal::slave::Executor::terminateTask()
@ 0x7f2f3aaf425b mesos::internal::slave::Slave::statusUpdate()
@ 0x7f2f3ab206fd ProtobufProcess<>::handler1<>()
@ 0x7f2f3aafad8a std::tr1::_Function_handler<>::_M_invoke()
@ 0x7f2f3ab2246b ProtobufProcess<>::visit()
@ 0x7f2f3ad02ae5 process::ProcessManager::resume()
@ 0x7f2f3ad0349f process::schedule()
@ 0x7f2f3a4b773d start_thread
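
The trace above suggests that the second TASK_FINISHED reached
Executor::terminateTask() after the first update had already removed the task,
so the task lookup came back NULL and the non-NULL check aborted the slave. The
following is a minimal sketch of a duplicate-tolerant version of that step, not
the actual Mesos code or the committed fix. ExecutorSketch, TaskState,
launchedTasks and terminatedTasks are hypothetical, simplified stand-ins chosen
for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-ins; these are NOT the real Mesos types.
enum class TaskState { RUNNING, FINISHED, FAILED, KILLED, LOST };

struct ExecutorSketch {
  std::map<std::string, TaskState> launchedTasks;    // tasks still live
  std::map<std::string, TaskState> terminatedTasks;  // tasks already terminal

  // Moves a task from launchedTasks to terminatedTasks.
  // Returns false, instead of aborting, when the task is not (or no
  // longer) launched, which is the case for a duplicate terminal update.
  bool terminateTask(const std::string& taskId, TaskState state) {
    auto it = launchedTasks.find(taskId);
    if (it == launchedTasks.end()) {
      return false;  // unknown or already terminated: nothing to do
    }
    launchedTasks.erase(it);
    terminatedTasks[taskId] = state;
    return true;
  }
};
```

With this shape, the caller can ignore a false return (or log it) and still
forward the duplicate update, rather than relying on a fatal check.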