Jan Schlicht created MESOS-8677:
-----------------------------------

Summary: FaultToleranceTest.ReregisterCompletedFrameworks crashes on macOS
Key: MESOS-8677
URL: https://issues.apache.org/jira/browse/MESOS-8677
Project: Mesos
Issue Type: Bug
Components: test
Environment: macOS 10.13.3 with LLVM 6.0.0 as well as with Apple LLVM version 9.0.0 (clang-900.0.39.2)
Reporter: Jan Schlicht

Here's a {{GLOG_v=1}} run of the test: {noformat} [ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks I0314 14:30:11.240077 2290090816 cluster.cpp:172] Creating default 'local' authorizer I0314 14:30:11.241261 55140352 master.cpp:463] Master 025f775d-9c75-43f6-9ee6-079a605fbf01 (Jenkinss-Mac-mini.local) started on 10.0.49.4:54648 I0314 14:30:11.241287 55140352 master.cpp:465] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/private/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/ZyMWb1/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --root_submissions="true" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/private/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/ZyMWb1/master" --zk_session_timeout="10secs" I0314 14:30:11.241439 55140352 master.cpp:514] Master only allowing authenticated frameworks to register I0314 14:30:11.241447 55140352 master.cpp:520] Master only allowing authenticated 
agents to register I0314 14:30:11.241452 55140352 master.cpp:526] Master only allowing authenticated HTTP frameworks to register I0314 14:30:11.241461 55140352 credentials.hpp:37] Loading credentials for authentication from '/private/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/ZyMWb1/credentials' I0314 14:30:11.241678 55140352 master.cpp:570] Using default 'crammd5' authenticator I0314 14:30:11.241739 55140352 http.cpp:957] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly' I0314 14:30:11.241824 55140352 http.cpp:957] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite' I0314 14:30:11.241873 55140352 http.cpp:957] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler' I0314 14:30:11.241919 55140352 master.cpp:649] Authorization enabled I0314 14:30:11.242066 52457472 whitelist_watcher.cpp:77] No whitelist given I0314 14:30:11.242079 51920896 hierarchical.cpp:175] Initialized hierarchical allocator process I0314 14:30:11.243557 52994048 master.cpp:2119] Elected as the leading master! 
I0314 14:30:11.243574 52994048 master.cpp:1678] Recovering from registrar I0314 14:30:11.243640 51920896 registrar.cpp:347] Recovering registrar I0314 14:30:11.243852 52457472 registrar.cpp:391] Successfully fetched the registry (0B) in 190976ns I0314 14:30:11.243928 52457472 registrar.cpp:495] Applied 1 operations in 28606ns; attempting to update the registry I0314 14:30:11.244163 52457472 registrar.cpp:552] Successfully updated the registry in 194816ns I0314 14:30:11.244222 52457472 registrar.cpp:424] Successfully recovered registrar I0314 14:30:11.244408 54067200 master.cpp:1792] Recovered 0 agents from the registry (155B); allowing 10mins for agents to reregister I0314 14:30:11.244443 52994048 hierarchical.cpp:213] Skipping recovery of hierarchical allocator: nothing to recover W0314 14:30:11.247259 2290090816 process.cpp:2805] Attempted to spawn already running process files@10.0.49.4:54648 I0314 14:30:11.247681 2290090816 cluster.cpp:460] Creating default 'local' authorizer I0314 14:30:11.248837 55676928 slave.cpp:265] Mesos agent started on (50)@10.0.49.4:54648 I0314 14:30:11.248865 55676928 slave.cpp:266] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/store/appc" --authenticate_http_executors="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/credential" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
--docker_store_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_command_executor="false" --http_credentials="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/http_credentials" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --jwt_secret_key="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/jwt_secret_key" --launcher="posix" --launcher_dir="/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]" --runtime_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --version="false" --work_dir="/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_1NHm2G" --zk_session_timeout="10secs" I0314 14:30:11.249145 55676928 
credentials.hpp:86] Loading credential for authentication from '/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/credential' I0314 14:30:11.249279 55676928 slave.cpp:298] Agent using credential for: test-principal I0314 14:30:11.249297 55676928 credentials.hpp:37] Loading credentials for authentication from '/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_UqvwBG/http_credentials' I0314 14:30:11.249531 55676928 http.cpp:957] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-executor' I0314 14:30:11.249590 55676928 http.cpp:978] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-executor' I0314 14:30:11.249687 55676928 http.cpp:957] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-readonly' I0314 14:30:11.249727 55676928 http.cpp:978] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-readonly' I0314 14:30:11.249786 55676928 http.cpp:957] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-readwrite' I0314 14:30:11.249827 55676928 http.cpp:978] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-readwrite' I0314 14:30:11.249979 52457472 process.cpp:3564] Handling HTTP event for process 'master' with path: '/master/state' I0314 14:30:11.250329 55676928 slave.cpp:615] Agent resources: [{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}] I0314 14:30:11.250526 54603776 http.cpp:1097] HTTP GET for /master/state from 10.0.49.4:55065 I0314 14:30:11.250535 55676928 slave.cpp:623] Agent attributes: [ ] I0314 14:30:11.250561 55676928 slave.cpp:632] Agent hostname: Jenkinss-Mac-mini.local I0314 14:30:11.250732 51920896 task_status_update_manager.cpp:181] Pausing sending task status 
updates F0314 14:30:11.251163 53530624 authorizer.cpp:321] Check failed: object->task != nullptr || object->task_info != nullptr *** Check failure stack trace: *** I0314 14:30:11.251318 52457472 state.cpp:66] Recovering state from '/var/folders/6w/rw03zh013y38ys6cyn8qppf80000gn/T/FaultToleranceTest_ReregisterCompletedFrameworks_1NHm2G/meta' I0314 14:30:11.251448 54603776 task_status_update_manager.cpp:207] Recovering task status update manager I0314 14:30:11.251709 54067200 slave.cpp:7218] Finished recovery I0314 14:30:11.252424 52994048 task_status_update_manager.cpp:181] Pausing sending task status updates I0314 14:30:11.252446 54603776 slave.cpp:1266] New master detected at master@10.0.49.4:54648 I0314 14:30:11.252493 54603776 slave.cpp:1321] Detecting new master I0314 14:30:11.262979 54067200 slave.cpp:1348] Authenticating with master master@10.0.49.4:54648 I0314 14:30:11.263046 54067200 slave.cpp:1357] Using default CRAM-MD5 authenticatee I0314 14:30:11.263149 55140352 authenticatee.cpp:121] Creating new client SASL connection @ 0x11af0ff0a google::LogMessage::Fail() @ 0x11af0dbbc google::LogMessage::SendToLog() @ 0x11af0e949 google::LogMessage::Flush() @ 0x11af17bc9 google::LogMessageFatal::~LogMessageFatal() @ 0x11af103f5 google::LogMessageFatal::~LogMessageFatal() @ 0x1173931a2 mesos::internal::LocalAuthorizerObjectApprover::approved() @ 0x117ccf66b _ZN5mesos15ObjectApprovers8approvedILNS_13authorization6ActionE18EJEEEbDpRKT0_ @ 0x117ccddfe _ZZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS5_14authentication9PrincipalEEENK4$_15clERKNS4_5OwnedINS_15ObjectApproversEEEENKUlPN4JSON12ObjectWriterEE_clESN_ @ 0x117ccc3dc _ZZN4JSON8internal7jsonifyIZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS8_14authentication9PrincipalEEENK4$_15clERKNS7_5OwnedINS2_15ObjectApproversEEEEUlPNS_12ObjectWriterEE_vEENSt3__18functionIFvPNSR_13basic_ostreamIcNSR_11char_traitsIcEEEEEEERKT_NS0_6PreferEENKUlSX_E_clESX_ @ 
0x117ccc370 _ZNSt3__128__invoke_void_return_wrapperIvE6__callIJRZN4JSON8internal7jsonifyIZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINSC_14authentication9PrincipalEEENK4$_15clERKNSB_5OwnedINS6_15ObjectApproversEEEEUlPNS3_12ObjectWriterEE_vEENS_8functionIFvPNS_13basic_ostreamIcNS_11char_traitsIcEEEEEEERKT_NS4_6PreferEEUlS10_E_S10_EEEvDpOT_ @ 0x117ccc269 _ZNSt3__110__function6__funcIZN4JSON8internal7jsonifyIZZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINSB_14authentication9PrincipalEEENK4$_15clERKNSA_5OwnedINS5_15ObjectApproversEEEEUlPNS2_12ObjectWriterEE_vEENS_8functionIFvPNS_13basic_ostreamIcNS_11char_traitsIcEEEEEEERKT_NS3_6PreferEEUlSZ_E_NS_9allocatorIS16_EES10_EclEOSZ_ @ 0x108544677 std::__1::function<>::operator()() @ 0x1173b8f34 JSON::operator<<() @ 0x11a937c97 process::http::OK::OK() @ 0x11a9389c5 process::http::OK::OK() @ 0x117ccae1f mesos::internal::master::Master::Http::state()::$_15::operator()() @ 0x117ce51a4 _ZN5cpp176invokeIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS7_14authentication9PrincipalEEE4$_15JNS6_5OwnedINS1_15ObjectApproversEEEEEEDTclclsr3stdE7forwardIT_Efp_Espclsr3stdE7forwardIT0_Efp0_EEEOSL_DpOSM_ @ 0x117ce5162 _ZN6lambda8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS8_14authentication9PrincipalEEE4$_15JNS7_5OwnedINS2_15ObjectApproversEEEEE13invoke_expandISI_NSt3__15tupleIJSL_EEENSP_IJEEEJLm0EEEEDTclsr5cpp17E6invokeclsr3stdE7forwardIT_Efp_Espcl6expandclsr3stdE3getIXT2_EEclsr3stdE7forwardIT0_Efp0_EEclsr3stdE7forwardIT1_Efp2_EEEEOSS_OST_N5cpp1416integer_sequenceImJXspT2_EEEEOSU_ @ 0x117ce50c0 
_ZNO6lambda8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINS8_14authentication9PrincipalEEE4$_15JNS7_5OwnedINS2_15ObjectApproversEEEEEclIJEEEDTcl13invoke_expandclL_ZNSt3__14moveIRSI_EEONSO_16remove_referenceIT_E4typeEOSS_EdtdefpT1fEclL_ZNSP_IRNSO_5tupleIJSL_EEEEESV_SW_EdtdefpT10bound_argsEcvN5cpp1416integer_sequenceImJLm0EEEE_Eclsr3stdE16forward_as_tuplespclsr3stdE7forwardIT_Efp_EEEEDpOS13_ @ 0x117ce5054 _ZN5cpp176invokeIN6lambda8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKN7process4http7RequestERK6OptionINSA_14authentication9PrincipalEEE4$_15JNS9_5OwnedINS4_15ObjectApproversEEEEEEJEEEDTclclsr3stdE7forwardIT_Efp_Espclsr3stdE7forwardIT0_Efp0_EEEOSP_DpOSQ_ @ 0x117ce4fb3 _ZN6lambda8internal6InvokeIN7process6FutureINS2_4http8ResponseEEEEclINS0_7PartialIZNK5mesos8internal6master6Master4Http5stateERKNS4_7RequestERK6OptionINS4_14authentication9PrincipalEEE4$_15JNS2_5OwnedINSA_15ObjectApproversEEEEEEJEEES6_OT_DpOT0_ @ 0x117ce4e6d _ZNO6lambda12CallableOnceIFN7process6FutureINS1_4http8ResponseEEEvEE10CallableFnINS_8internal7PartialIZNK5mesos8internal6master6Master4Http5stateERKNS3_7RequestERK6OptionINS3_14authentication9PrincipalEEE4$_15JNS1_5OwnedINSB_15ObjectApproversEEEEEEEclEv @ 0x1174a87e4 _ZNO6lambda12CallableOnceIFN7process6FutureINS1_4http8ResponseEEEvEEclEv @ 0x1174a8620 _ZZN7process8internal8DispatchINS_6FutureINS_4http8ResponseEEEEclIN6lambda12CallableOnceIFS5_vEEEEES5_RKNS_4UPIDEOT_ENKUlNSt3__110unique_ptrINS_7PromiseIS4_EENSH_14default_deleteISK_EEEEOSB_PNS_11ProcessBaseEE_clESN_SO_SQ_ @ 0x1174a821f _ZN5cpp176invokeIZN7process8internal8DispatchINS1_6FutureINS1_4http8ResponseEEEEclIN6lambda12CallableOnceIFS7_vEEEEES7_RKNS1_4UPIDEOT_EUlNSt3__110unique_ptrINS1_7PromiseIS6_EENSJ_14default_deleteISM_EEEEOSD_PNS1_11ProcessBaseEE_JSP_SD_SS_EEEDTclclsr3stdE7forwardISH_Efp_Espclsr3stdE7forwardIT0_Efp0_EEESI_DpOSU_ @ 0x1174a8052 
_ZN6lambda8internal7PartialIZN7process8internal8DispatchINS2_6FutureINS2_4http8ResponseEEEEclINS_12CallableOnceIFS8_vEEEEES8_RKNS2_4UPIDEOT_EUlNSt3__110unique_ptrINS2_7PromiseIS7_EENSJ_14default_deleteISM_EEEEOSD_PNS2_11ProcessBaseEE_JSP_SD_NSJ_12placeholders4__phILi1EEEEE13invoke_expandIST_NSJ_5tupleIJSP_SD_SW_EEENSZ_IJOSS_EEEJLm0ELm1ELm2EEEEDTclsr5cpp17E6invokeclsr3stdE7forwardISH_Efp_Espcl6expandclsr3stdE3getIXT2_EEclsr3stdE7forwardIT0_Efp0_EEclsr3stdE7forwardIT1_Efp2_EEEESI_OS13_N5cpp1416integer_sequenceImJXspT2_EEEEOS14_ @ 0x1174a7f33 _ZNO6lambda8internal7PartialIZN7process8internal8DispatchINS2_6FutureINS2_4http8ResponseEEEEclINS_12CallableOnceIFS8_vEEEEES8_RKNS2_4UPIDEOT_EUlNSt3__110unique_ptrINS2_7PromiseIS7_EENSJ_14default_deleteISM_EEEEOSD_PNS2_11ProcessBaseEE_JSP_SD_NSJ_12placeholders4__phILi1EEEEEclIJSS_EEEDTcl13invoke_expandclL_ZNSJ_4moveIRST_EEONSJ_16remove_referenceISH_E4typeESI_EdtdefpT1fEclL_ZNSZ_IRNSJ_5tupleIJSP_SD_SW_EEEEES14_SI_EdtdefpT10bound_argsEcvN5cpp1416integer_sequenceImJLm0ELm1ELm2EEEE_Eclsr3stdE16forward_as_tuplespclsr3stdE7forwardIT_Efp_EEEEDpOS1B_ @ 0x1174a7dfd _ZN5cpp176invokeIN6lambda8internal7PartialIZN7process8internal8DispatchINS4_6FutureINS4_4http8ResponseEEEEclINS1_12CallableOnceIFSA_vEEEEESA_RKNS4_4UPIDEOT_EUlNSt3__110unique_ptrINS4_7PromiseIS9_EENSL_14default_deleteISO_EEEEOSF_PNS4_11ProcessBaseEE_JSR_SF_NSL_12placeholders4__phILi1EEEEEEJSU_EEEDTclclsr3stdE7forwardISJ_Efp_Espclsr3stdE7forwardIT0_Efp0_EEESK_DpOS10_ @ 0x1174a7dc1 _ZN6lambda8internal6InvokeIvEclINS0_7PartialIZN7process8internal8DispatchINS5_6FutureINS5_4http8ResponseEEEEclINS_12CallableOnceIFSB_vEEEEESB_RKNS5_4UPIDEOT_EUlNSt3__110unique_ptrINS5_7PromiseISA_EENSM_14default_deleteISP_EEEEOSG_PNS5_11ProcessBaseEE_JSS_SG_NSM_12placeholders4__phILi1EEEEEEJSV_EEEvSL_DpOT0_ @ 0x1174a7a46 
_ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEE10CallableFnINS_8internal7PartialIZNS1_8internal8DispatchINS1_6FutureINS1_4http8ResponseEEEEclINS0_IFSE_vEEEEESE_RKNS1_4UPIDEOT_EUlNSt3__110unique_ptrINS1_7PromiseISD_EENSO_14default_deleteISR_EEEEOSI_S3_E_JSU_SI_NSO_12placeholders4__phILi1EEEEEEEclEOS3_ @ 0x11ac9a28f _ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEEclES3_ @ 0x11ac9a104 process::ProcessBase::consume() {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)