Hello, I deployed Mesos 1.0.0 on CentOS (kernel 3.14.73). This is
my master config:
export MESOS_log_dir=/apps/mesos/logs/
export MESOS_ip=0.0.0.0
export MESOS_hostname=`hostname`
export MESOS_logging_level=INFO
export MESOS_quorum=2
export MESOS_work_dir=/apps/mesos/master
export MESOS_zk=zk://zk1:2181,zk2:2181,zk3:2181/oss-mesos
export MESOS_allocator=HierarchicalDRF
export MESOS_cluster=oss-mesos
export MESOS_credentials=/apps/mesos/etc/mesos/credentials.txt
export MESOS_registry=replicated_log
export MESOS_webui_dir=/apps/mesos/share/mesos/webui
export MESOS_zk_session_timeout=90secs
export MESOS_max_executors_per_slave=10
export MESOS_registry_fetch_timeout=2mins
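For context, the Mesos documentation asks that the quorum be a strict majority of the masters (quorum > number of masters / 2). A minimal sketch of that arithmetic follows; `majority_quorum` and `is_valid_quorum` are hypothetical helper names used only for illustration, not part of Mesos:

```python
def majority_quorum(num_masters: int) -> int:
    """Smallest quorum that is a strict majority of num_masters."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


def is_valid_quorum(quorum: int, num_masters: int) -> bool:
    """True if quorum is a strict majority and not larger than the cluster."""
    return num_masters / 2 < quorum <= num_masters


if __name__ == "__main__":
    # Assumption: two masters, as described in this post, with MESOS_quorum=2.
    print(majority_quorum(2), is_valid_quorum(2, 2))
```

With two masters the majority is 2, which matches MESOS_quorum=2 above. It also means both masters have to be up and able to reach each other for the replicated log to make progress.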
I started two master nodes, but the master nodes crash within a few minutes.
The log messages are:
I0719 11:50:22.673280 5376 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (287)@10.10.186.76:5050
I0719 11:50:23.154119 5381 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (504)@10.10.179.252:5050
I0719 11:50:23.154749 5376 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:23.156838 5378 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:23.563072 5382 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (289)@10.10.186.76:5050
I0719 11:50:23.883855 5376 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (507)@10.10.179.252:5050
I0719 11:50:23.884414 5380 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:23.886569 5375 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:24.163056 5379 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (291)@10.10.186.76:5050
I0719 11:50:24.425379 5378 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (510)@10.10.179.252:5050
I0719 11:50:24.425864 5379 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:24.428951 5375 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:24.935673 5379 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (293)@10.10.186.76:5050
F0719 11:50:25.262277 5381 master.cpp:1662] Recovery failed: Failed to recover
registrar: Failed to perform fetch within 2mins
*** Check failure stack trace: ***
@ 0x7fe6fa0ac37c google::LogMessage::Fail()
@ 0x7fe6fa0ac2d8 google::LogMessage::SendToLog()
@ 0x7fe6fa0abcce google::LogMessage::Flush()
@ 0x7fe6fa0aea88 google::LogMessageFatal::~LogMessageFatal()
@ 0x7fe6f900a64c mesos::internal::master::fail()
@ 0x7fe6f90deffb
_ZNSt5_BindIFPFvRKSsS1_EPKcSt12_PlaceholderILi1EEEE6__callIvJS1_EJLm0ELm1EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
@ 0x7fe6f90b98df
_ZNSt5_BindIFPFvRKSsS1_EPKcSt12_PlaceholderILi1EEEEclIJS1_EvEET0_DpOT_
@ 0x7fe6f9086783
_ZZNK7process6FutureI7NothingE8onFailedISt5_BindIFPFvRKSsS6_EPKcSt12_PlaceholderILi1EEEEvEERKS2_OT_NS2_6PreferEENUlS6_E_clES6_
@ 0x7fe6f90df0cd
_ZNSt17_Function_handlerIFvRKSsEZNK7process6FutureI7NothingE8onFailedISt5_BindIFPFvS1_S1_EPKcSt12_PlaceholderILi1EEEEvEERKS6_OT_NS6_6PreferEEUlS1_E
_E9_M_invokeERKSt9_Any_dataS1_
@ 0x4a4833 std::function<>::operator()()
@ 0x49f0eb
_ZN7process8internal3runISt8functionIFvRKSsEEJS4_EEEvRKSt6vectorIT_SaIS8_EEDpOT0_
@ 0x4997c2 process::Future<>::fail()
@ 0x7fe6f8ccfa22 process::Promise<>::fail()
@ 0x7fe6f90dc4f0 process::internal::thenf<>()
@ 0x7fe6f9120bd9
_ZNSt5_BindIFPFvRKSt8functionIFN7process6FutureI7NothingEERKN5mesos8internal8RegistryEEERKSt10shared_ptrINS1_7PromiseIS3_EEERKNS2_IS7_EEESB_SH_St12
_PlaceholderILi1EEEE6__callIvISM_EILm0ELm1ELm2EEEET_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
@ 0x7fe6f91178cd std::_Bind<>::operator()<>()
@ 0x7fe6f90fe821 std::_Function_handler<>::_M_invoke()
@ 0x7fe6f9117aff std::function<>::operator()()
@ 0x7fe6f90fe955
_ZZNK7process6FutureIN5mesos8internal8RegistryEE5onAnyIRSt8functionIFvRKS4_EEvEES8_OT_NS4_6PreferEENUlS8_E_clES8_
@ 0x7fe6f9120c85
_ZNSt17_Function_handlerIFvRKN7process6FutureIN5mesos8internal8RegistryEEEEZNKS5_5onAnyIRSt8functionIS8_EvEES7_OT_NS5_6PreferEEUlS7_E_E9_M_invokeER
KSt9_Any_dataS7_
@ 0x7fe6f9117aff std::function<>::operator()()
@ 0x7fe6f91807c4 process::internal::run<>()
@ 0x7fe6f9176ef4 process::Future<>::fail()
@ 0x7fe6f91b12de std::_Mem_fn<>::operator()<>()
I0719 11:50:25.414069 5382 replica.cpp:673] Replica in EMPTY status received a
broadcasted recover request from (513)@10.10.179.252:5050
I0719 11:50:25.414718 5376 recover.cpp:197] Received a recover response from a
replica in EMPTY status
@ 0x7fe6f91ac6c7
_ZNSt5_BindIFSt7_Mem_fnIMN7process6FutureIN5mesos8internal8RegistryEEEFbRKSsEES6_St12_PlaceholderILi1EEEE6__callIbIS8_EILm0ELm1EEEET_OSt5tupleIIDpT
0_EESt12_Index_tupleIIXspT1_EEE
I0719 11:50:25.416431 5377 recover.cpp:197] Received a recover response from a
replica in EMPTY status
I0719 11:50:25.418115 5379 http.cpp:381] HTTP GET for /master/state from
10.10.159.106:3363 with User-Agent='Mozilla/5.0 (X11; Linux x86_64)
AppleWebKit/537.36 (KHTML, like
Gecko) Chrome/48.0.2564.116 Safari/537.36'
@ 0x7fe6f91a4d23
_ZNSt5_BindIFSt7_Mem_fnIMN7process6FutureIN5mesos8internal8RegistryEEEFbRKSsEES6_St12_PlaceholderILi1EEEEclIJS8_EbEET0_DpOT_
@ 0x7fe6f919ac63
_ZZNK7process6FutureIN5mesos8internal8RegistryEE8onFailedISt5_BindIFSt7_Mem_fnIMS4_FbRKSsEES4_St12_PlaceholderILi1EEEEbEERKS4_OT_NS4_6PreferEENUlS9
_E_clES9_
@ 0x7fe6f91ac752
_ZNSt17_Function_handlerIFvRKSsEZNK7process6FutureIN5mesos8internal8RegistryEE8onFailedISt5_BindIFSt7_Mem_fnIMS8_FbS1_EES8_St12_PlaceholderILi1EEEE
bEERKS8_OT_NS8_6PreferEEUlS1_E_E9_M_invokeERKSt9_Any_dataS1_
@ 0x4a4833 std::function<>::operator()()
@ 0x49f0eb
_ZN7process8internal3runISt8functionIFvRKSsEEJS4_EEEvRKSt6vectorIT_SaIS8_EEDpOT0_
@ 0x7fe6f9176ecc process::Future<>::fail()
@ 0x7fe6f916feac process::Promise<>::fail()
The error log is:
Log file created at: 2016/07/19 11:50:25
Running on machine: oss-mesos-master-bjc-001
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0719 11:50:25.262277 5381 master.cpp:1662] Recovery failed: Failed to recover
registrar: Failed to perform fetch within 2mins
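To pull only the fatal entries out of a long master log, the line format printed in the header above ([IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg) can be matched with a regular expression. This is a rough sketch under that assumption; `GlogRecord` and `parse_glog_line` are hypothetical names, and wrapped continuation lines or stack-trace lines are simply skipped:

```python
import re
from typing import NamedTuple, Optional

# Matches "[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg".
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


class GlogRecord(NamedTuple):
    level: str   # one of I, W, E, F
    month: int
    day: int
    time: str    # hh:mm:ss.uuuuuu, kept as text
    thread: int
    file: str
    line: int
    msg: str


def parse_glog_line(text: str) -> Optional[GlogRecord]:
    """Return a GlogRecord, or None if text is not a log header line."""
    m = GLOG_LINE.match(text)
    if m is None:
        return None
    return GlogRecord(
        level=m["level"],
        month=int(m["month"]),
        day=int(m["day"]),
        time=m["time"],
        thread=int(m["thread"]),
        file=m["file"],
        line=int(m["line"]),
        msg=m["msg"],
    )


def fatal_records(lines):
    """Keep only the records logged at level F."""
    parsed = (parse_glog_line(line) for line in lines)
    return [rec for rec in parsed if rec is not None and rec.level == "F"]
```

For example, passing the lines of the master log to `fatal_records` would keep the `master.cpp:1662` entry and drop the INFO lines from `replica.cpp` and `recover.cpp`.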
Can you help me? Thanks.