Jan Schlicht created MESOS-8677:
-----------------------------------
Summary: FaultToleranceTest.ReregisterCompletedFrameworks crashes
on macOS
Key: MESOS-8677
URL: https://issues.apache.org/jira/browse/MESOS-8677
Project: Mesos
Issue Type: Bug
Components: test
Environment: macOS 10.13.3 with LLVM 6.0.0 as well as with Apple LLVM
version 9.0.0 (clang-900.0.39.2)
Reporter: Jan Schlicht
Here's a {{GLOG_v=1}} run of the test:
{noformat}
[ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks
I0314 14:30:11.240077 2290090816 cluster.cpp:172] Creating default 'local'
authorizer
I0314 14:30:11.241261 55140352 master.cpp:463] Master
025f775d-9c75-43f6-9ee6-079a605fbf01 (Jenkinss-Mac-mini.local) started on
10.0.49.4:54648
I0314 14:30:11.241287 55140352 master.cpp:465] Flags at startup: --acls=""
--agent_ping_timeout="15secs" --agent_reregister_timeout="10mins"
--allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate_agents="true" --authenticate_frameworks="true"
--authenticate_http_frameworks="true" --authenticate_http_readonly="true"
--authenticate_http_readwrite="true" --authenticators="crammd5"
--authorizers="local"
--credentials="/private/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/ZyMWb1/credentials"
--filter_gpu_resources="true" --framework_sorter="drf" --help="false"
--hostname_lookup="true" --http_authenticators="basic"
--http_framework_authenticators="basic" --initialize_driver_logging="true"
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
--max_agent_ping_timeouts="5" --max_completed_frameworks="50"
--max_completed_tasks_per_framework="1000"
--max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false"
--recovery_agent_removal_limit="100%" --registry="in_memory"
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins"
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400"
--registry_store_timeout="100secs" --registry_strict="false"
--require_agent_domain="false" --root_submissions="true" --user_sorter="drf"
--version="false" --webui_dir="/usr/local/share/mesos/webui"
--work_dir="/private/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/ZyMWb1/master"
--zk_session_timeout="10secs"
I0314 14:30:11.241439 55140352 master.cpp:514] Master only allowing
authenticated frameworks to register
I0314 14:30:11.241447 55140352 master.cpp:520] Master only allowing
authenticated agents to register
I0314 14:30:11.241452 55140352 master.cpp:526] Master only allowing
authenticated HTTP frameworks to register
I0314 14:30:11.241461 55140352 credentials.hpp:37] Loading credentials for
authentication from
'/private/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/ZyMWb1/credentials'
I0314 14:30:11.241678 55140352 master.cpp:570] Using default 'crammd5'
authenticator
I0314 14:30:11.241739 55140352 http.cpp:957] Creating default 'basic' HTTP
authenticator for realm 'mesos-master-readonly'
I0314 14:30:11.241824 55140352 http.cpp:957] Creating default 'basic' HTTP
authenticator for realm 'mesos-master-readwrite'
I0314 14:30:11.241873 55140352 http.cpp:957] Creating default 'basic' HTTP
authenticator for realm 'mesos-master-scheduler'
I0314 14:30:11.241919 55140352 master.cpp:649] Authorization enabled
I0314 14:30:11.242066 52457472 whitelist_watcher.cpp:77] No whitelist given
I0314 14:30:11.242079 51920896 hierarchical.cpp:175] Initialized hierarchical
allocator process
I0314 14:30:11.243557 52994048 master.cpp:2119] Elected as the leading master!
I0314 14:30:11.243574 52994048 master.cpp:1678] Recovering from registrar
I0314 14:30:11.243640 51920896 registrar.cpp:347] Recovering registrar
I0314 14:30:11.243852 52457472 registrar.cpp:391] Successfully fetched the
registry (0B) in 190976ns
I0314 14:30:11.243928 52457472 registrar.cpp:495] Applied 1 operations in
28606ns; attempting to update the registry
I0314 14:30:11.244163 52457472 registrar.cpp:552] Successfully updated the
registry in 194816ns
I0314 14:30:11.244222 52457472 registrar.cpp:424] Successfully recovered
registrar
I0314 14:30:11.244408 54067200 master.cpp:1792] Recovered 0 agents from the
registry (155B); allowing 10mins for agents to reregister
I0314 14:30:11.244443 52994048 hierarchical.cpp:213] Skipping recovery of
hierarchical allocator: nothing to recover
W0314 14:30:11.247259 2290090816 process.cpp:2805] Attempted to spawn already
running process [email protected]:54648
I0314 14:30:11.247681 2290090816 cluster.cpp:460] Creating default 'local'
authorizer
I0314 14:30:11.248837 55676928 slave.cpp:265] Mesos agent started on
(50)@10.0.49.4:54648
I0314 14:30:11.248865 55676928 slave.cpp:266] Flags at startup: --acls=""
--appc_simple_discovery_uri_prefix="http://"
--appc_store_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/store/appc"
--authenticate_http_executors="true" --authenticate_http_readonly="true"
--authenticate_http_readwrite="true" --authenticatee="crammd5"
--authentication_backoff_factor="1secs" --authorizer="local"
--container_disk_watch_interval="15secs" --containerizers="mesos"
--credential="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/credential"
--default_role="*" --disk_watch_interval="1mins" --docker="docker"
--docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io"
--docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
--docker_stop_timeout="0ns"
--docker_store_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/store/docker"
--docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume"
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins"
--executor_reregistration_timeout="2secs"
--executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/fetch"
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
--gc_disk_headroom="0.1" --hadoop_home="" --help="false"
--hostname_lookup="true" --http_command_executor="false"
--http_credentials="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/http_credentials"
--http_heartbeat_interval="30secs" --initialize_driver_logging="true"
--isolation="posix/cpu,posix/mem"
--jwt_secret_key="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/jwt_secret_key"
--launcher="posix"
--launcher_dir="/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build/src"
--logbufsecs="0" --logging_level="INFO"
--max_completed_executors_per_framework="150"
--oversubscribed_resources_interval="15secs" --port="5051"
--qos_correction_interval_min="0ns" --quiet="false"
--reconfiguration_policy="equal" --recover="reconnect"
--recovery_timeout="15mins" --registration_backoff_factor="10ms"
--resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]"
--runtime_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG"
--sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true"
--version="false"
--work_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_1NHm2G"
--zk_session_timeout="10secs"
I0314 14:30:11.249145 55676928 credentials.hpp:86] Loading credential for
authentication from
'/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/credential'
I0314 14:30:11.249279 55676928 slave.cpp:298] Agent using credential for:
test-principal
I0314 14:30:11.249297 55676928 credentials.hpp:37] Loading credentials for
authentication from
'/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/http_credentials'
I0314 14:30:11.249531 55676928 http.cpp:957] Creating default 'basic' HTTP
authenticator for realm 'mesos-agent-executor'
I0314 14:30:11.249590 55676928 http.cpp:978] Creating default 'jwt' HTTP
authenticator for realm 'mesos-agent-executor'
I0314 14:30:11.249687 55676928 http.cpp:957] Creating default 'basic' HTTP
authenticator for realm 'mesos-agent-readonly'
I0314 14:30:11.249727 55676928 http.cpp:978] Creating default 'jwt' HTTP
authenticator for realm 'mesos-agent-readonly'
I0314 14:30:11.249786 55676928 http.cpp:957] Creating default 'basic' HTTP
authenticator for realm 'mesos-agent-readwrite'
I0314 14:30:11.249827 55676928 http.cpp:978] Creating default 'jwt' HTTP
authenticator for realm 'mesos-agent-readwrite'
I0314 14:30:11.249979 52457472 process.cpp:3564] Handling HTTP event for
process 'master' with path: '/master/state'
I0314 14:30:11.250329 55676928 slave.cpp:615] Agent resources:
[{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0314 14:30:11.250526 54603776 http.cpp:1097] HTTP GET for /master/state from
10.0.49.4:55065
I0314 14:30:11.250535 55676928 slave.cpp:623] Agent attributes: [ ]
I0314 14:30:11.250561 55676928 slave.cpp:632] Agent hostname:
Jenkinss-Mac-mini.local
I0314 14:30:11.250732 51920896 task_status_update_manager.cpp:181] Pausing
sending task status updates
F0314 14:30:11.251163 53530624 authorizer.cpp:321] Check failed: object->task
!= nullptr || object->task_info != nullptr
*** Check failure stack trace: ***
I0314 14:30:11.251318 52457472 state.cpp:66] Recovering state from
'/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_1NHm2G/meta'
I0314 14:30:11.251448 54603776 task_status_update_manager.cpp:207] Recovering
task status update manager
I0314 14:30:11.251709 54067200 slave.cpp:7218] Finished recovery
I0314 14:30:11.252424 52994048 task_status_update_manager.cpp:181] Pausing
sending task status updates
I0314 14:30:11.252446 54603776 slave.cpp:1266] New master detected at
master@10.0.49.4:54648
I0314 14:30:11.252493 54603776 slave.cpp:1321] Detecting new master
I0314 14:30:11.262979 54067200 slave.cpp:1348] Authenticating with master
master@10.0.49.4:54648
I0314 14:30:11.263046 54067200 slave.cpp:1357] Using default CRAM-MD5
authenticatee
I0314 14:30:11.263149 55140352 authenticatee.cpp:121] Creating new client SASL
connection
@ 0x11af0ff0a google::LogMessage::Fail()
@ 0x11af0dbbc google::LogMessage::SendToLog()
@ 0x11af0e949 google::LogMessage::Flush()
@ 0x11af17bc9 google::LogMessageFatal::~LogMessageFatal()
@ 0x11af103f5 google::LogMessageFatal::~LogMessageFatal()
@ 0x1173931a2
mesos::internal::LocalAuthorizerObjectApprover::approved()
@ 0x117ccf66b
_ZN5mesos15ObjectApprovers8approvedILNS_13authorization6ActionE18EJEEEbDpRKT0_
@ 0x117ccddfe
_ZZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS5_14authentication9PrincipalEEENK4$_15clERKNS4_5OwnedINS_15ObjectApproversEEEENKUlPN4JSON12ObjectWriterEE_clESN_
@ 0x117ccc3dc
_ZZN4JSON8internal7jsonifyIZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS8_14authentication9PrincipalEEENK4$_15clERKNS7_5OwnedINS2_15ObjectApproversEEEEUlPNS_12ObjectWriterEE_vEENSt3__18functionIFvPNSR_13basic_ostreamIcNSR_11char_traitsIcEEEEEEERKT_NS0_6PreferEENKUlSX_E_clESX_
@ 0x117ccc370
_ZNSt3__128__invoke_void_return_wrapperIvE6__callIJRZN4JSON8internal7jsonifyIZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINSC_14authentication9PrincipalEEENK4$_15clERKNSB_5OwnedINS6_15ObjectApproversEEEEUlPNS3_12ObjectWriterEE_vEENS_8functionIFvPNS_13basic_ostreamIcNS_11char_traitsIcEEEEEEERKT_NS4_6PreferEEUlS10_E_S10_EEEvDpOT_
@ 0x117ccc269
_ZNSt3__110__function6__funcIZN4JSON8internal7jsonifyIZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINSB_14authentication9PrincipalEEENK4$_15clERKNSA_5OwnedINS5_15ObjectApproversEEEEUlPNS2_12ObjectWriterEE_vEENS_8functionIFvPNS_13basic_ostreamIcNS_11char_traitsIcEEEEEEERKT_NS3_6PreferEEUlSZ_E_NS_9allocatorIS16_EES10_EclEOSZ_
@ 0x108544677 std::__1::function<>::operator()()
@ 0x1173b8f34 JSON::operator<<()
@ 0x11a937c97 process::http::OK::OK()
@ 0x11a9389c5 process::http::OK::OK()
@ 0x117ccae1f
mesos::internal::master::Master::Http::state()::$_15::operator()()
@ 0x117ce51a4
_ZN5cpp176invokeIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS7_14authentication9PrincipalEEE4$_15JNS6_5OwnedINS1_15ObjectApproversEEEEEEDTclclsr3stdE7forwardIT_Efp_Espclsr3stdE7forwardIT0_Efp0_EEEOSL_DpOSM_
@ 0x117ce5162
_ZN6lambda8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS8_14authentication9PrincipalEEE4$_15JNS7_5OwnedINS2_15ObjectApproversEEEEE13invoke_expandISI_NSt3__15tupleIJSL_EEENSP_IJEEEJLm0EEEEDTclsr5cpp17E6invokeclsr3stdE7forwardIT_Efp_Espcl6expandclsr3stdE3getIXT2_EEclsr3stdE7forwardIT0_Efp0_EEclsr3stdE7forwardIT1_Efp2_EEEEOSS_OST_N5cpp1416integer_sequenceImJXspT2_EEEEOSU_
@ 0x117ce50c0
_ZNO6lambda8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS8_14authentication9PrincipalEEE4$_15JNS7_5OwnedINS2_15ObjectApproversEEEEEclIJEEEDTcl13invoke_expandclL_ZNSt3__14moveIRSI_EEONSO_16remove_referenceIT_E4typeEOSS_EdtdefpT1fEclL_ZNSP_IRNSO_5tupleIJSL_EEEEESV_SW_EdtdefpT10bound_argsEcvN5cpp1416integer_sequenceImJLm0EEEE_Eclsr3stdE16forward_as_tuplespclsr3stdE7forwardIT_Efp_EEEEDpOS13_
@ 0x117ce5054
_ZN5cpp176invokeIN6lambda8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINSA_14authentication9PrincipalEEE4$_15JNS9_5OwnedINS4_15ObjectApproversEEEEEEJEEEDTclclsr3stdE7forwardIT_Efp_Espclsr3stdE7forwardIT0_Efp0_EEEOSP_DpOSQ_
@ 0x117ce4fb3
_ZN6lambda8internal6InvokeIN7process6FutureINS2_4http8ResponseEEEEclINS0_7PartialIZNK5mesos8internal6master6Master4Http5stateERKNS4_7RequestERK6OptionINS4_14authentication9PrincipalEEE4$_15JNS2_5OwnedINSA_15ObjectApproversEEEEEEJEEES6_OT_DpOT0_
@ 0x117ce4e6d
_ZNO6lambda12CallableOnceIFN7process6FutureINS1_4http8ResponseEEEvEE10CallableFnINS_8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKNS3_7RequestERK6OptionINS3_14authentication9PrincipalEEE4$_15JNS1_5OwnedINSB_15ObjectApproversEEEEEEEclEv
@ 0x1174a87e4
_ZNO6lambda12CallableOnceIFN7process6FutureINS1_4http8ResponseEEEvEEclEv
@ 0x1174a8620
_ZZN7process8internal8DispatchINS_6FutureINS_4http8ResponseEEEEclIN6lambda12CallableOnceIFS5_vEEEEES5_RKNS_4UPIDEOT_ENKUlNSt3__110unique_ptrINS_7PromiseIS4_EENSH_14default_deleteISK_EEEEOSB_PNS_11ProcessBaseEE_clESN_SO_SQ_
@ 0x1174a821f
_ZN5cpp176invokeIZN7process8internal8DispatchINS1_6FutureINS1_4http8ResponseEEEEclIN6lambda12CallableOnceIFS7_vEEEEES7_RKNS1_4UPIDEOT_EUlNSt3__110unique_ptrINS1_7PromiseIS6_EENSJ_14default_deleteISM_EEEEOSD_PNS1_11ProcessBaseEE_JSP_SD_SS_EEEDTclclsr3stdE7forwardISH_Efp_Espclsr3stdE7forwardIT0_Efp0_EEESI_DpOSU_
@ 0x1174a8052
_ZN6lambda8internal7PartialIZN7process8internal8DispatchINS2_6FutureINS2_4http8ResponseEEEEclINS_12CallableOnceIFS8_vEEEEES8_RKNS2_4UPIDEOT_EUlNSt3__110unique_ptrINS2_7PromiseIS7_EENSJ_14default_deleteISM_EEEEOSD_PNS2_11ProcessBaseEE_JSP_SD_NSJ_12placeholders4__phILi1EEEEE13invoke_expandIST_NSJ_5tupleIJSP_SD_SW_EEENSZ_IJOSS_EEEJLm0ELm1ELm2EEEEDTclsr5cpp17E6invokeclsr3stdE7forwardISH_Efp_Espcl6expandclsr3stdE3getIXT2_EEclsr3stdE7forwardIT0_Efp0_EEclsr3stdE7forwardIT1_Efp2_EEEESI_OS13_N5cpp1416integer_sequenceImJXspT2_EEEEOS14_
@ 0x1174a7f33
_ZNO6lambda8internal7PartialIZN7process8internal8DispatchINS2_6FutureINS2_4http8ResponseEEEEclINS_12CallableOnceIFS8_vEEEEES8_RKNS2_4UPIDEOT_EUlNSt3__110unique_ptrINS2_7PromiseIS7_EENSJ_14default_deleteISM_EEEEOSD_PNS2_11ProcessBaseEE_JSP_SD_NSJ_12placeholders4__phILi1EEEEEclIJSS_EEEDTcl13invoke_expandclL_ZNSJ_4moveIRST_EEONSJ_16remove_referenceISH_E4typeESI_EdtdefpT1fEclL_ZNSZ_IRNSJ_5tupleIJSP_SD_SW_EEEEES14_SI_EdtdefpT10bound_argsEcvN5cpp1416integer_sequenceImJLm0ELm1ELm2EEEE_Eclsr3stdE16forward_as_tuplespclsr3stdE7forwardIT_Efp_EEEEDpOS1B_
@ 0x1174a7dfd
_ZN5cpp176invokeIN6lambda8internal7PartialIZN7process8internal8DispatchINS4_6FutureINS4_4http8ResponseEEEEclINS1_12CallableOnceIFSA_vEEEEESA_RKNS4_4UPIDEOT_EUlNSt3__110unique_ptrINS4_7PromiseIS9_EENSL_14default_deleteISO_EEEEOSF_PNS4_11ProcessBaseEE_JSR_SF_NSL_12placeholders4__phILi1EEEEEEJSU_EEEDTclclsr3stdE7forwardISJ_Efp_Espclsr3stdE7forwardIT0_Efp0_EEESK_DpOS10_
@ 0x1174a7dc1
_ZN6lambda8internal6InvokeIvEclINS0_7PartialIZN7process8internal8DispatchINS5_6FutureINS5_4http8ResponseEEEEclINS_12CallableOnceIFSB_vEEEEESB_RKNS5_4UPIDEOT_EUlNSt3__110unique_ptrINS5_7PromiseISA_EENSM_14default_deleteISP_EEEEOSG_PNS5_11ProcessBaseEE_JSS_SG_NSM_12placeholders4__phILi1EEEEEEJSV_EEEvSL_DpOT0_
@ 0x1174a7a46
_ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEE10CallableFnINS_8internal7PartialIZNS1_8internal8DispatchINS1_6FutureINS1_4http8ResponseEEEEclINS0_IFSE_vEEEEESE_RKNS1_4UPIDEOT_EUlNSt3__110unique_ptrINS1_7PromiseISD_EENSO_14default_deleteISR_EEEEOSI_S3_E_JSU_SI_NSO_12placeholders4__phILi1EEEEEEEclEOS3_
@ 0x11ac9a28f
_ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEEclES3_
@ 0x11ac9a104 process::ProcessBase::consume()
{noformat}