Alexey Serbin created KUDU-2998:
-----------------------------------

             Summary: RebalancingDuringElectionStormTest.RoundRobin sometimes crashes
                 Key: KUDU-2998
                 URL: https://issues.apache.org/jira/browse/KUDU-2998
             Project: Kudu
          Issue Type: Bug
          Components: test
    Affects Versions: 1.10.0, 1.10.1
            Reporter: Alexey Serbin
         Attachments: rebalancer_tool-test.6.txt.xz

I saw the {{RebalancingDuringElectionStormTest.RoundRobin}} test crash in the 
DEBUG configuration with the following error:

{noformat}
F1116 06:53:57.325479 11078 quorum_util.cc:167] Check failed: 
RaftPeerPB::NON_PARTICIPANT != GetConsensusRole(peer_uuid, cstate) (3 vs. 3) 
Peer fe4321fd981c466d86cd1fe2949868dc << not a participant in current_term: 25 
leader_uuid: "422db10def4d4c95a5a5bfd2cb787aa2" committed_config { opid_index: 
77 OBSOLETE_local: false peers { permanent_uuid: 
"f6d99d2a6f5542428e5e797972a0f53e" member_type: VOTER last_known_addr { host: 
"127.25.232.67" port: 41397 } } peers { permanent_uuid: 
"4084fddb6afb4aed80b27fc4bee3de1f" member_type: VOTER last_known_addr { host: 
"127.25.232.68" port: 39941 } } peers { permanent_uuid: 
"fe4321fd981c466d86cd1fe2949868dc" member_type: VOTER last_known_addr { host: 
"127.25.232.65" port: 40533 } attrs { replace: true } } peers { permanent_uuid: 
"422db10def4d4c95a5a5bfd2cb787aa2" member_type: VOTER last_known_addr { host: 
"127.25.232.70" port: 35983 } attrs { promote: false } } } pending_config { 
opid_index: 80 OBSOLETE_local: false peers { permanent_uuid: 
"f6d99d2a6f5542428e5e797972a0f53e" member_type: VOTER last_known_addr { host: 
"127.25.232.67" port: 41397 } } peers { permanent_uuid: 
"4084fddb6afb4aed80b27fc4bee3de1f" member_type: VOTER last_known_addr { host: 
"127.25.232.68" port: 39941 } } peers { permanent_uuid: 
"422db10def4d4c95a5a5bfd2cb787aa2" member_type: VOTER last_known_addr { host: 
"127.25.232.70" port: 35983 } attrs { promote: false } } }
{noformat}
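
In the state printed above, peer fe4321fd981c466d86cd1fe2949868dc is listed in committed_config (opid_index 77, with attrs { replace: true }) but is absent from pending_config (opid_index 80). The sketch below is only an illustration of how that could trip the check. It assumes the role lookup consults the pending config whenever one is present; the structs and the GetRole and StateFromLog helpers are hypothetical stand-ins, not the actual code in quorum_util.cc.

```cpp
#include <cstdint>
#include <string>
#include <vector>

enum class Role { LEADER, FOLLOWER, NON_PARTICIPANT };

// Hypothetical, simplified stand-ins for the consensus state printed in the
// log; only the fields relevant to the failed check are modeled.
struct RaftConfig {
  int64_t opid_index;
  std::vector<std::string> voter_uuids;
};

struct ConsensusState {
  std::string leader_uuid;
  RaftConfig committed_config;
  bool has_pending_config;
  RaftConfig pending_config;
};

// Assumption: the role is resolved against the "active" config, meaning the
// pending config when one exists and the committed config otherwise.
Role GetRole(const std::string& uuid, const ConsensusState& cstate) {
  const RaftConfig& active = cstate.has_pending_config
      ? cstate.pending_config : cstate.committed_config;
  for (const std::string& voter : active.voter_uuids) {
    if (voter == uuid) {
      return uuid == cstate.leader_uuid ? Role::LEADER : Role::FOLLOWER;
    }
  }
  return Role::NON_PARTICIPANT;
}

// Rebuilds the state from the fatal log message: four voters in the
// committed config, three in the pending config.
ConsensusState StateFromLog() {
  ConsensusState cstate;
  cstate.leader_uuid = "422db10def4d4c95a5a5bfd2cb787aa2";
  cstate.committed_config.opid_index = 77;
  cstate.committed_config.voter_uuids = {
      "f6d99d2a6f5542428e5e797972a0f53e",
      "4084fddb6afb4aed80b27fc4bee3de1f",
      "fe4321fd981c466d86cd1fe2949868dc",
      "422db10def4d4c95a5a5bfd2cb787aa2"};
  cstate.has_pending_config = true;
  cstate.pending_config.opid_index = 80;
  cstate.pending_config.voter_uuids = {
      "f6d99d2a6f5542428e5e797972a0f53e",
      "4084fddb6afb4aed80b27fc4bee3de1f",
      "422db10def4d4c95a5a5bfd2cb787aa2"};
  return cstate;
}
```

Under that assumption, calling GetRole for fe4321fd981c466d86cd1fe2949868dc on StateFromLog() yields NON_PARTICIPANT, which would match the "(3 vs. 3)" in the failed check if the value 3 corresponds to NON_PARTICIPANT.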


The stack trace looked like the following:
{noformat}
    @     0x7f6598afa62d  google::LogMessage::Fail() at ??:0
    @     0x7f6598afc64c  google::LogMessage::SendToLog() at ??:0
    @     0x7f6598afa189  google::LogMessage::Flush() at ??:0
    @     0x7f6598afcfdf  google::LogMessageFatal::~LogMessageFatal() at ??:0
    @     0x7f6599a2c12c  kudu::consensus::GetParticipantRole() at ??:0
    @     0x7f659a549cae  kudu::master::CatalogManager::BuildLocationsForTablet() at ??:0
    @     0x7f6596765d8b  _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_6master15MasterServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E18_E9_M_invokeERKSt9_Any_dataS4_S5_S9_ at ??:0
    @     0x7f659a54a37b  kudu::master::CatalogManager::GetTabletLocations() at ??:0
    @     0x7f659a5de5ea  kudu::master::MasterServiceImpl::GetTabletLocations() at ??:0
    @     0x7f659675e742  _ZZN4kudu6master15MasterServiceIfC1ERK13scoped_refptrINS_12MetricEntityEERKS2_INS_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE4_clESG_SH_SJ_ at ??:0
    @     0x7f6596764a9f  _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_6master15MasterServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E4_E9_M_invokeERKSt9_Any_dataS4_S5_S9_ at ??:0
    @     0x7f659472cd16  std::function<>::operator()() at ??:0
    @     0x7f659472c547  kudu::rpc::GeneratedServiceIf::Handle() at ??:0
    @     0x7f659472f02e  kudu::rpc::ServicePool::RunThread() at ??:0
    @     0x7f65947303fd  boost::_mfi::mf0<>::operator()() at ??:0
    @     0x7f6594730224  boost::_bi::list1<>::operator()<>() at ??:0
    @     0x7f659473010b  boost::_bi::bind_t<>::operator()() at ??:0
    @     0x7f659473003a  boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
    @     0x7f6599599842  boost::function0<>::operator()() at ??:0
    @     0x7f65995965cb  kudu::Thread::SuperviseThread() at ??:0
    @     0x7f6595ca2184  start_thread at ??:0
    @     0x7f6598104ffd  clone at ??:0
{noformat}

The full log is attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)