[
https://issues.apache.org/jira/browse/MESOS-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720818#comment-16720818
]
Chun-Hung Hsiao commented on MESOS-8096:
----------------------------------------
Observed with
{{MesosContainerizer/DefaultExecutorTest.ROOT_INTERNET_CURL_DockerTaskWithFileURI/0}}:
{noformat}
01:06:55 I1214 01:06:55.865653 29390 scheduler.cpp:845] Enqueuing event UPDATE
received from http://172.16.10.62:35809/master/api/v1/scheduler
01:06:55 I1214 01:06:55.865679 29387 slave.cpp:5689] Task status update manager
successfully handled status update TASK_FINISHED (Status UUID:
21f8cdb2-04a0-4780-8d89-54776d3e68ac) for task
555ef8a2-cb55-47de-a8f6-c2763c6de745 of framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000
01:06:55 *** Aborted at 1544749615 (unix time) try "date -d @1544749615" if you
are using GNU date ***
01:06:55 PC: @ 0x7fab0ad11d23 mesos::v1::scheduler::Mesos::send()
01:06:55 I1214 01:06:55.869730 29392 master.cpp:1390] Framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (default) disconnected
01:06:55 I1214 01:06:55.869750 29392 master.cpp:3241] Deactivating framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (default)
01:06:55 I1214 01:06:55.869771 29392 master.cpp:3218] Disconnecting framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (default)
01:06:55 I1214 01:06:55.869781 29392 master.cpp:1405] Giving framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (default) 0ns to failover
01:06:55 I1214 01:06:55.869849 29389 hierarchical.cpp:418] Deactivated
framework ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000
01:06:55 I1214 01:06:55.869964 29391 master.cpp:9296] Framework failover
timeout, removing framework ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (default)
01:06:55 I1214 01:06:55.869979 29391 master.cpp:10233] Removing framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (default)
01:06:55 I1214 01:06:55.870019 29391 master.cpp:10968] Updating the state of
task 555ef8a2-cb55-47de-a8f6-c2763c6de745 of framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (latest state: TASK_FINISHED, status
update state: TASK_KILLED)
01:06:55 I1214 01:06:55.870038 29391 master.cpp:11066] Removing task
555ef8a2-cb55-47de-a8f6-c2763c6de745 with resources cpus(allocated: *):0.1;
mem(allocated: *):32; disk(allocated: *):32 of framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 on agent
ed4b332a-fa42-4c3e-9dae-67e0f601385d-S0 at slave(1078)@172.16.10.62:35809
(ip-172-16-10-62.ec2.internal)
01:06:55 I1214 01:06:55.870173 29391 master.cpp:11103] Removing executor
'default' with resources cpus(allocated: *):0.1; mem(allocated: *):32;
disk(allocated: *):32 of framework ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 on
agent ed4b332a-fa42-4c3e-9dae-67e0f601385d-S0 at slave(1078)@172.16.10.62:35809
(ip-172-16-10-62.ec2.internal)
01:06:55 I1214 01:06:55.870316 29391 slave.cpp:3912] Asked to shut down
framework ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 by master@172.16.10.62:35809
01:06:55 I1214 01:06:55.870329 29391 slave.cpp:3937] Shutting down framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000
01:06:55 I1214 01:06:55.870339 29391 slave.cpp:6723] Shutting down executor
'default' of framework ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000 (via HTTP)
01:06:55 I1214 01:06:55.870524 29389 hierarchical.cpp:1238] Recovered
cpus(allocated: *):0.1; mem(allocated: *):32; disk(allocated: *):32 (total:
cpus:2; mem:1024; disk:1024; ports:[31000-32000], allocated: {}) on agent
ed4b332a-fa42-4c3e-9dae-67e0f601385d-S0 from framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000
01:06:55 I1214 01:06:55.870702 29389 hierarchical.cpp:357] Removed framework
ed4b332a-fa42-4c3e-9dae-67e0f601385d-0000
01:06:55 *** SIGSEGV (@0x0) received by PID 29367 (TID 0x7faaffa32700) from PID
0; stack trace: ***
01:06:55 @ 0x7faaedecdde7 (unknown)
01:06:55 @ 0x7faaeded5385 (unknown)
01:06:55 @ 0x7faaedeca583 (unknown)
01:06:55 @ 0x7fab088337e0 (unknown)
01:06:55 @ 0x7fab0ad11d23 mesos::v1::scheduler::Mesos::send()
01:06:55 @ 0x55df576c3e66
_ZNK5mesos8internal5tests2v19scheduler23SendAcknowledgeActionP2INS_2v111FrameworkIDENS5_7AgentIDEE10gmock_ImplIFvPNS5_9scheduler5MesosERKNSA_12Event_UpdateEEE17gmock_PerformImplISC_SF_N7testing8internal12ExcessiveArgESL_SL_SL_SL_SL_SL_SL_EEvRKSt5tupleIJSC_SF_EET_T0_T1_T2_T3_T4_T5_T6_T7_T8_
01:06:55 @ 0x55df576c3fda
_ZN5mesos8internal5tests2v19scheduler23SendAcknowledgeActionP2INS_2v111FrameworkIDENS5_7AgentIDEE10gmock_ImplIFvPNS5_9scheduler5MesosERKNSA_12Event_UpdateEEE7PerformERKSt5tupleIJSC_SF_EE
01:06:55 @ 0x55df5765488e
_ZNK7testing6ActionIFvPN5mesos2v19scheduler5MesosERKNS3_12Event_UpdateEEE7PerformERKSt5tupleIJS5_S8_EE
01:06:55 @ 0x55df5765488e
_ZNK7testing6ActionIFvPN5mesos2v19scheduler5MesosERKNS3_12Event_UpdateEEE7PerformERKSt5tupleIJS5_S8_EE
01:06:55 @ 0x55df57654951
testing::internal::FunctionMockerBase<>::UntypedPerformAction()
01:06:55 @ 0x55df58b371ec
testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith()
01:06:55 @ 0x55df576d2b8a
mesos::internal::tests::scheduler::MockHTTPScheduler<>::events()
01:06:55 @ 0x55df57644c23 std::_Function_handler<>::_M_invoke()
01:06:55 @ 0x7fab0ad16b48 process::AsyncExecutorProcess::execute<>()
01:06:55 @ 0x7fab0ad20bdd
_ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEE10CallableFnINS_8internal7PartialIZNS1_8dispatchI7NothingNS1_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeISH_SaISH_EEEEESL_SR_RSL_EENS1_6FutureIT_EERKNS1_3PIDIT0_EEMSX_FSU_T1_T2_EOT3_OT4_EUlSt10unique_ptrINS1_7PromiseISA_EESt14default_deleteIS1B_EEOSP_OSL_S3_E_JS1E_SP_SL_St12_PlaceholderILi1EEEEEEclEOS3_
01:06:55 @ 0x7fab0bb35f01 process::ProcessBase::consume()
01:06:55 @ 0x7fab0bb4abea process::ProcessManager::resume()
01:06:55 @ 0x7fab0bb4ebb6
_ZNSt6thread11_State_implISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
01:06:55 @ 0x7fab0bdfb9cf execute_native_thread_routine
01:06:55 @ 0x7fab0882baa1 start_thread
01:06:55 @ 0x7fab07bd0c4d clone
{noformat}
> Enqueueing events in MockHTTPScheduler can lead to segfaults.
> -------------------------------------------------------------
>
> Key: MESOS-8096
> URL: https://issues.apache.org/jira/browse/MESOS-8096
> Project: Mesos
> Issue Type: Bug
> Components: scheduler driver, test
> Environment: Fedora 23, Ubuntu 14.04, Ubuntu 16
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
> Priority: Major
> Labels: flaky-test, integration, mesosphere
> Attachments: AsyncExecutorProcess-badrun-1.txt,
> AsyncExecutorProcess-badrun-2.txt, AsyncExecutorProcess-badrun-3.txt,
> mesos-8096-1.txt, mesos-8096-2.txt, mesos-8096-3.txt,
> scheduler-shutdown-invalid-driver-2.txt, scheduler-shutdown-invalid-driver.txt
>
>
> Various tests segfault for an as-yet-unknown reason. Comparing the attached
> logs suggests that the problem might be in the scheduler's event queue.