[ https://issues.apache.org/jira/browse/MESOS-8540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363288#comment-16363288 ]
Greg Mann commented on MESOS-8540: ---------------------------------- I just saw a similar stacktrace occur during test teardown on a different test: {{MesosContainerizer/DefaultExecutorTest.TaskWithFileURI/0}}. This was on Debian 9 with SSL and GRPC enabled. Log attached as {{TaskWithFileURI.txt}}. > ROOT_TasksSharingViaSandboxVolumes is flaky. > -------------------------------------------- > > Key: MESOS-8540 > URL: https://issues.apache.org/jira/browse/MESOS-8540 > Project: Mesos > Issue Type: Bug > Affects Versions: 1.5.0 > Environment: Ubuntu 17.04 > autotools, ssl build > Reporter: Till Toenshoff > Priority: Major > Labels: flaky, flaky-test > Attachments: TaskWithFileURI.txt > > > {noformat} > 22:50:20 [ RUN ] > LauncherAndIsolationParam/PersistentVolumeDefaultExecutor.ROOT_TasksSharingViaSandboxVolumes/2 > [...] > 22:50:20 I0202 22:50:20.661401 4573 default_executor.cpp:191] Received > ACKNOWLEDGED event > 22:50:20 I0202 22:50:20.661367 18013 hierarchical.cpp:1192] Recovered > cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.1; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):32; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):32 (total: cpus:1.7; mem:928; > disk:928; ports:[31000-32000]; cpus(reservations: > [(DYNAMIC,default-role,test-principal)]):0.3; mem(reservations: > [(DYNAMIC,default-role,test-principal)]):96; disk(reservations: > [(DYNAMIC,default-role,test-principal)]):95; disk(reservations: > [(DYNAMIC,default-role,test-principal)])[executor:executor_volume_path]:1, > allocated: disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)])[executor:executor_volume_path]:1; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):63; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):64; > cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.2) on agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 from framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.661736 18006 scheduler.cpp:739] Enqueuing event > UPDATE received from http://172.16.10.208:39629/master/api/v1/scheduler > 22:50:20 I0202 22:50:20.661929 18011 scheduler.cpp:247] Sending ACKNOWLEDGE > call to http://172.16.10.208:39629/master/api/v1/scheduler > 22:50:20 I0202 22:50:20.662333 18010 process.cpp:3554] Handling HTTP event > for process 'master' with path: '/master/api/v1/scheduler' > 22:50:20 I0202 22:50:20.697870 18007 containerizer.cpp:2791] Container > 8acd5ca5-98f0-4373-9914-2118c9c74bc4.e463748e-59c9-46fe-8f0a-24c33a506068 has > exited > 22:50:20 I0202 22:50:20.697901 18007 containerizer.cpp:2338] Destroying > container > 8acd5ca5-98f0-4373-9914-2118c9c74bc4.e463748e-59c9-46fe-8f0a-24c33a506068 in > RUNNING state > 22:50:20 I0202 22:50:20.697909 18007 containerizer.cpp:2952] Transitioning > the state of container > 8acd5ca5-98f0-4373-9914-2118c9c74bc4.e463748e-59c9-46fe-8f0a-24c33a506068 > from RUNNING to DESTROYING > 22:50:20 I0202 22:50:20.697983 18007 linux_launcher.cpp:514] Asked to destroy > container > 8acd5ca5-98f0-4373-9914-2118c9c74bc4.e463748e-59c9-46fe-8f0a-24c33a506068 > 22:50:20 I0202 22:50:20.698537 18007 linux_launcher.cpp:560] Using freezer to > destroy cgroup > mesos/8acd5ca5-98f0-4373-9914-2118c9c74bc4/mesos/e463748e-59c9-46fe-8f0a-24c33a506068 > 22:50:20 I0202 22:50:20.699381 18008 cgroups.cpp:3060] Freezing cgroup > /sys/fs/cgroup/freezer/mesos/8acd5ca5-98f0-4373-9914-2118c9c74bc4/mesos/e463748e-59c9-46fe-8f0a-24c33a506068 > 22:50:20 I0202 22:50:20.700407 18008 cgroups.cpp:1415] Successfully froze > cgroup > /sys/fs/cgroup/freezer/mesos/8acd5ca5-98f0-4373-9914-2118c9c74bc4/mesos/e463748e-59c9-46fe-8f0a-24c33a506068 > after 992us > 22:50:20 I0202 22:50:20.701438 18008 cgroups.cpp:3078] Thawing cgroup > /sys/fs/cgroup/freezer/mesos/8acd5ca5-98f0-4373-9914-2118c9c74bc4/mesos/e463748e-59c9-46fe-8f0a-24c33a506068 > 22:50:20 I0202 22:50:20.702519 18008 cgroups.cpp:1444] Successfully thawed > cgroup > /sys/fs/cgroup/freezer/mesos/8acd5ca5-98f0-4373-9914-2118c9c74bc4/mesos/e463748e-59c9-46fe-8f0a-24c33a506068 > after 1.05088ms > 22:50:20 I0202 22:50:20.703819 18010 http.cpp:1185] HTTP POST for > /master/api/v1/scheduler from 172.16.10.208:45566 > 22:50:20 I0202 22:50:20.704056 18010 master.cpp:5775] Processing ACKNOWLEDGE > call for status 5c032406-9bb1-45dd-a053-1af6f064ccfa for task producer of > framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (default) on agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 > 22:50:20 I0202 22:50:20.704083 18011 provisioner.cpp:598] Ignoring destroy > request for unknown container > 8acd5ca5-98f0-4373-9914-2118c9c74bc4.e463748e-59c9-46fe-8f0a-24c33a506068 > 22:50:20 I0202 22:50:20.704332 18011 containerizer.cpp:2628] Checkpointing > termination state to nested container's runtime directory > '/tmp/LauncherAndIsolationParam_PersistentVolumeDefaultExecutor_ROOT_TasksSharingViaSandboxVolumes_2_BfTlNd/containers/8acd5ca5-98f0-4373-9914-2118c9c74bc4/containers/e463748e-59c9-46fe-8f0a-24c33a506068/termination' > 22:50:20 I0202 22:50:20.704195 18010 master.cpp:10303] Removing task producer > with resources cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.1; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):32; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):32 of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 on agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 at slave(1025)@172.16.10.208:39629 > (ip-172-16-10-208) > 22:50:20 I0202 22:50:20.704879 18011 task_status_update_manager.cpp:401] > Received task status update acknowledgement (UUID: > 5c032406-9bb1-45dd-a053-1af6f064ccfa) for task producer of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.705235 18011 task_status_update_manager.cpp:538] > Cleaning up status update stream for task producer of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.705646 18011 slave.cpp:4057] Task status update > manager successfully handled status update acknowledgement (UUID: > 5c032406-9bb1-45dd-a053-1af6f064ccfa) for task producer of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.705756 4573 default_executor.cpp:889] Child > container > 8acd5ca5-98f0-4373-9914-2118c9c74bc4.e463748e-59c9-46fe-8f0a-24c33a506068 of > task 'consumer' completed in state TASK_FINISHED: Command exited with status 0 > 22:50:20 I0202 22:50:20.705790 4573 default_executor.cpp:1018] Terminating > after 1secs > 22:50:20 I0202 22:50:20.705934 18011 slave.cpp:9069] Completing task producer > 22:50:20 I0202 22:50:20.706215 18011 process.cpp:3554] Handling HTTP event > for process 'slave(1025)' with path: '/slave(1025)/api/v1/executor' > 22:50:20 I0202 22:50:20.748010 18006 http.cpp:1185] HTTP POST for > /slave(1025)/api/v1/executor from 172.16.10.208:45572 > 22:50:20 I0202 22:50:20.748122 18006 slave.cpp:4809] Handling status update > TASK_FINISHED (Status UUID: 0a1559c6-5681-44c1-a043-6cf06beaa78c) for task > consumer of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.748740 18006 task_status_update_manager.cpp:328] > Received task status update TASK_FINISHED (Status UUID: > 0a1559c6-5681-44c1-a043-6cf06beaa78c) for task consumer of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.748795 18006 task_status_update_manager.cpp:383] > Forwarding task status update TASK_FINISHED (Status UUID: > 0a1559c6-5681-44c1-a043-6cf06beaa78c) for task consumer of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 to the agent > 22:50:20 I0202 22:50:20.748862 18006 slave.cpp:5291] Forwarding the update > TASK_FINISHED (Status UUID: 0a1559c6-5681-44c1-a043-6cf06beaa78c) for task > consumer of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 to > master@172.16.10.208:39629 > 22:50:20 I0202 22:50:20.748937 18006 slave.cpp:5184] Task status update > manager successfully handled status update TASK_FINISHED (Status UUID: > 0a1559c6-5681-44c1-a043-6cf06beaa78c) for task consumer of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.749053 18006 master.cpp:7850] Status update > TASK_FINISHED (Status UUID: 0a1559c6-5681-44c1-a043-6cf06beaa78c) for task > consumer of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 from agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 at slave(1025)@172.16.10.208:39629 > (ip-172-16-10-208) > 22:50:20 I0202 22:50:20.749079 18006 master.cpp:7906] Forwarding status > update TASK_FINISHED (Status UUID: 0a1559c6-5681-44c1-a043-6cf06beaa78c) for > task consumer of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.749179 18006 master.cpp:10204] Updating the state of > task consumer of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (latest > state: TASK_FINISHED, status update state: TASK_FINISHED) > 22:50:20 I0202 22:50:20.749598 18013 scheduler.cpp:739] Enqueuing event > UPDATE received from http://172.16.10.208:39629/master/api/v1/scheduler > 22:50:20 I0202 22:50:20.749830 18006 hierarchical.cpp:1192] Recovered > cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.1; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):32; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):32 (total: cpus:1.7; mem:928; > disk:928; ports:[31000-32000]; cpus(reservations: > [(DYNAMIC,default-role,test-principal)]):0.3; mem(reservations: > [(DYNAMIC,default-role,test-principal)]):96; disk(reservations: > [(DYNAMIC,default-role,test-principal)]):95; disk(reservations: > [(DYNAMIC,default-role,test-principal)])[executor:executor_volume_path]:1, > allocated: disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)])[executor:executor_volume_path]:1; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):31; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):32; > cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.1) on agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 from framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 *** Aborted at 1517611820 (unix time) try "date -d @1517611820" if > you are using GNU date *** > 22:50:20 PC: @ 0x7f5fdf402163 mesos::v1::scheduler::Mesos::send() > 22:50:20 I0202 22:50:20.753651 18007 master.cpp:1422] Framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (default) disconnected > 22:50:20 I0202 22:50:20.753679 18007 master.cpp:3239] Deactivating framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (default) > 22:50:20 I0202 22:50:20.753697 18007 master.cpp:3216] Disconnecting framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (default) > 22:50:20 I0202 22:50:20.753707 18007 master.cpp:1437] Giving framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (default) 0ns to failover > 22:50:20 I0202 22:50:20.753773 18007 master.cpp:8623] Framework failover > timeout, removing framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > (default) > 22:50:20 I0202 22:50:20.753793 18007 master.cpp:9500] Removing framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (default) > 22:50:20 I0202 22:50:20.753829 18007 master.cpp:10204] Updating the state of > task consumer of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (latest > state: TASK_FINISHED, status update state: TASK_KILLED) > 22:50:20 I0202 22:50:20.753865 18007 master.cpp:10303] Removing task consumer > with resources cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.1; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):32; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):32 of framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 on agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 at slave(1025)@172.16.10.208:39629 > (ip-172-16-10-208) > 22:50:20 I0202 22:50:20.754012 18007 master.cpp:10332] Removing executor > 'default' with resources > [{"allocation_info":{"role":"default-role"},"name":"cpus","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"mem","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":32.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"disk","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":31.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"disk":{"persistence":{"id":"executor","principal":"test-principal"},"volume":{"container_path":"executor_volume_path","mode":"RW"}},"name":"disk","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1.0},"type":"SCALAR"}] > of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 on agent > a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 at slave(1025)@172.16.10.208:39629 > (ip-172-16-10-208) > 22:50:20 I0202 22:50:20.754346 18007 hierarchical.cpp:405] Deactivated > framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.754549 18007 hierarchical.cpp:1192] Recovered > cpus(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):0.1; mem(allocated: > default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):32; > disk(allocated: default-role)(reservations: > [(DYNAMIC,default-role,test-principal)]):31; disk(allocated: > default-role)(reservations: > [(DYNAMIC,default-role,test-principal)])[executor:executor_volume_path]:1 > (total: cpus:1.7; mem:928; disk:928; ports:[31000-32000]; cpus(reservations: > [(DYNAMIC,default-role,test-principal)]):0.3; mem(reservations: > [(DYNAMIC,default-role,test-principal)]):96; disk(reservations: > [(DYNAMIC,default-role,test-principal)]):95; disk(reservations: > [(DYNAMIC,default-role,test-principal)])[executor:executor_volume_path]:1, > allocated: {}) on agent a3d02bc8-96b0-4c84-a92b-1845429070a0-S0 from > framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.754637 18007 hierarchical.cpp:344] Removed framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.754680 18007 slave.cpp:3454] Asked to shut down > framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 by > master@172.16.10.208:39629 > 22:50:20 I0202 22:50:20.754700 18007 slave.cpp:3479] Shutting down framework > a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 > 22:50:20 I0202 22:50:20.754710 18007 slave.cpp:6178] Shutting down executor > 'default' of framework a3d02bc8-96b0-4c84-a92b-1845429070a0-0000 (via HTTP) > 22:50:20 *** SIGSEGV (@0x0) received by PID 30314 (TID 0x7f5fd0b4a700) from > PID 0; stack trace: *** > 22:50:20 @ 0x7f5fcce095d2 os::Linux::chained_handler() > 22:50:20 @ 0x7f5fcce0e299 JVM_handle_linux_signal > 22:50:20 @ 0x7f5fcce01ff8 signalHandler() > 22:50:20 @ 0x7f5fdcc79670 (unknown) > 22:50:20 @ 0x7f5fdf402163 mesos::v1::scheduler::Mesos::send() > 22:50:20 @ 0x559267eb57c6 > _ZNK5mesos8internal5tests2v19scheduler23SendAcknowledgeActionP2INS_2v111FrameworkIDENS5_7AgentIDEE10gmock_ImplIFvPNS5_9scheduler5MesosERKNSA_12Event_UpdateEEE17gmock_PerformImplISC_SF_N7testing8internal12ExcessiveArgESL_SL_SL_SL_SL_SL_SL_EEvRKSt5tupleIJSC_SF_EET_T0_T1_T2_T3_T4_T5_T6_T7_T8_ > 22:50:20 @ 0x559267eb593a > _ZN5mesos8internal5tests2v19scheduler23SendAcknowledgeActionP2INS_2v111FrameworkIDENS5_7AgentIDEE10gmock_ImplIFvPNS5_9scheduler5MesosERKNSA_12Event_UpdateEEE7PerformERKSt5tupleIJSC_SF_EE > 22:50:20 @ 0x559267dc547e > _ZN7testing8internal12DoBothActionI17PromiseArgActionPILi1EPN7process7PromiseIN5mesos2v19scheduler12Event_UpdateEEEENS5_8internal5tests2v19scheduler23SendAcknowledgeActionP2INS6_11FrameworkIDENS6_7AgentIDEEEE4ImplIFvPNS7_5MesosERKS8_EE7PerformERKSt5tupleIJSN_SP_EE > 22:50:20 @ 0x559267df459b > testing::internal::FunctionMockerBase<>::UntypedPerformAction() > 22:50:20 @ 0x559269038869 > testing::internal::UntypedFunctionMockerBase::UntypedInvokeWith() > 22:50:20 @ 0x559267ecbc0a > mesos::internal::tests::scheduler::MockHTTPScheduler<>::events() > 22:50:20 @ 0x559267e4c2c3 std::_Function_handler<>::_M_invoke() > 22:50:20 @ 0x7f5fdf405e18 process::AsyncExecutorProcess::execute<>() > 22:50:20 @ 0x7f5fdf40d77d > _ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEE10CallableFnINS_8internal7PartialIZNS1_8dispatchI7NothingNS1_20AsyncExecutorProcessERKSt8functionIFvRKSt5queueIN5mesos2v19scheduler5EventESt5dequeISH_SaISH_EEEEESL_SR_RSL_EENS1_6FutureIT_EERKNS1_3PIDIT0_EEMSX_FSU_T1_T2_EOT3_OT4_EUlSt10unique_ptrINS1_7PromiseISA_EESt14default_deleteIS1B_EEOSP_OSL_S3_E_JS1E_SP_SL_St12_PlaceholderILi1EEEEEEclEOS3_ > 22:50:20 @ 0x7f5fdffc4df1 process::ProcessBase::consume() > 22:50:20 @ 0x7f5fdffd79a2 process::ProcessManager::resume() > 22:50:20 @ 0x7f5fdffdb6b6 > _ZNSt6thread11_State_implISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv > 22:50:20 @ 0x7f5fdd15883f (unknown) > 22:50:20 @ 0x7f5fdcc6f6da start_thread > 22:50:20 @ 0x7f5fdc9a9d7f (unknown) > 22:50:21 timeout: the monitored command dumped core > 22:50:21 ./mesos-ci/mesos_ci: line 151: 30312 Segmentation fault (core > dumped) $SUDO env GTEST_FILTER="$GTEST_FILTER" GLOG_v=1 PATH="$PATH" $TIMEOUT > 60m "$test_binary" --verbose --gtest_output="xml:${xml}" > 22:50:21 The test binary has crashed OR the timeout has been exceeded! > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)