[
https://issues.apache.org/jira/browse/KUDU-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-2030:
--------------------------------
Summary: Tablet server crashes on using deallocated Mutex object (was:
Tablet server crashes on using deallocated)
> Tablet server crashes on using deallocated Mutex object
> -------------------------------------------------------
>
> Key: KUDU-2030
> URL: https://issues.apache.org/jira/browse/KUDU-2030
> Project: Kudu
> Issue Type: Bug
> Components: tserver
> Affects Versions: 1.4.0
> Reporter: Alexey Serbin
> Labels: stability
>
> The code in {{RaftConsensus::UpdateReplica()}}
> (src/kudu/consensus/raft_consensus.cc) instantiates a {{Synchronizer}} on the
> stack and then uses the derived {{StatusCallback}} in a way that, on certain
> code paths, leads to an attempt to use the already deallocated {{Mutex}} object
> {{CountDownLatch::lock_}}. The {{CountDownLatch}} instance is a member of
> the {{Synchronizer}} object itself, so it is destroyed together with it.
> In these scenarios, the tserver crashes with the following stack trace:
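
The lifetime hazard described above can be illustrated with a minimal sketch. It is not Kudu's code: {{SharedSynchronizer}}, {{AsCallback()}}, {{WaitFor()}} and {{LateCallbackIsSafe()}} are hypothetical names, and the ordering it simulates (the waiter gives up and its stack frame unwinds before the callback fires) is only one plausible way to hit the problem, assumed here for illustration. The sketch shows one possible mitigation: the callback co-owns the mutex and condition variable through a {{std::shared_ptr}}, so a late invocation locks a still-valid mutex.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a Synchronizer-like helper; not Kudu's class.
class SharedSynchronizer {
 public:
  SharedSynchronizer() : state_(std::make_shared<State>()) {}

  // The returned callback holds its own reference to the shared state,
  // so it may safely run after this object has been destroyed.
  std::function<void(bool)> AsCallback() const {
    std::shared_ptr<State> s = state_;
    return [s](bool ok) {
      std::lock_guard<std::mutex> l(s->mu);
      s->ok = ok;
      s->done = true;
      s->cv.notify_all();
    };
  }

  // Returns true if the callback ran before the timeout expired.
  bool WaitFor(std::chrono::milliseconds timeout) const {
    std::unique_lock<std::mutex> l(state_->mu);
    return state_->cv.wait_for(l, timeout, [this] { return state_->done; });
  }

 private:
  struct State {
    std::mutex mu;
    std::condition_variable cv;
    bool done = false;
    bool ok = false;
  };
  std::shared_ptr<State> state_;
};

// Simulates the assumed ordering: the waiter times out, the stack object
// goes out of scope, and only then is the callback invoked.
bool LateCallbackIsSafe() {
  std::function<void(bool)> cb;
  bool timed_out = false;
  {
    SharedSynchronizer sync;  // stack-allocated, as in the report
    cb = sync.AsCallback();
    timed_out = !sync.WaitFor(std::chrono::milliseconds(10));
  }  // sync is destroyed here; the shared state lives on inside cb
  cb(true);  // late callback: the mutex it locks is still alive
  assert(timed_out);
  return timed_out;
}
```

With a latch embedded directly in the stack object, and a callback that holds only a raw pointer to it, the final {{cb(true)}} call in this ordering would touch a destroyed mutex, which matches the failure mode reported in this issue.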
> {noformat}
> F0605 18:22:23.583866 14144 mutex.cc:76] Check failed: rv == 0 || rv == 16 .
> Invalid argument. Owner tid: 23156096; Self tid: 144; To collect the owner
> stack trace, enable the flag --debug_mutex_collect_stacktrace
> *** Check failure stack trace: ***
> @ 0x7fab619a62fd google::LogMessage::Fail() at ??:0
> @ 0x7fab619a81bd google::LogMessage::SendToLog() at ??:0
> @ 0x7fab619a5e39 google::LogMessage::Flush() at ??:0
> @ 0x7fab619a8c5f google::LogMessageFatal::~LogMessageFatal() at ??:0
> @ 0x7fab627eb453 kudu::Mutex::TryAcquire() at ??:0
> @ 0x7fab627eb82c kudu::Mutex::Acquire() at ??:0
> @ 0x7fab6aec6b7a kudu::CountDownLatch::CountDown() at ??:0
> @ 0x7fab6aec526a kudu::CountDownLatch::CountDown() at ??:0
> @ 0x7fab69339633 kudu::Synchronizer::StatusCB() at ??:0
> @ 0x7fab69339a21 kudu::internal::RunnableAdapter<>::Run() at ??:0
> @ 0x7fab69339964 kudu::internal::InvokeHelper<>::MakeItSo() at ??:0
> @ 0x7fab693398f2 kudu::internal::Invoker<>::Run() at ??:0
> @ 0x7fab692bce26 kudu::Callback<>::Run() at ??:0
> {noformat}
> The {{pthread_mutex_trylock()}} in mutex.cc:74 returns {{EINVAL}} since the
> underlying pthread mutex handle has already been deallocated.
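
To help read the failed check in the log ({{rv == 0 || rv == 16}}, followed by "Invalid argument"): on Linux, 16 is {{EBUSY}}, and "Invalid argument" is the message for {{EINVAL}}. The sketch below is not the code from mutex.cc; {{DescribeTryLockResult()}} and {{TryLockTwiceOnValidMutex()}} are hypothetical helpers that only show how the two benign results of {{pthread_mutex_trylock()}} differ from any other return value. It deliberately never touches a destroyed mutex, since that is undefined behavior.

```cpp
#include <cerrno>
#include <cstring>
#include <pthread.h>
#include <string>

// Hypothetical helper: classifies a pthread_mutex_trylock() return value
// along the lines of the check quoted in the log. 0 means the lock was
// acquired, EBUSY means it is held; anything else is treated as fatal.
std::string DescribeTryLockResult(int rv) {
  if (rv == 0) return "acquired";
  if (rv == EBUSY) return "busy";
  return std::string("fatal: ") + std::strerror(rv);
}

// Try-locks a valid, default-initialized mutex twice from the same thread.
// The first attempt acquires it; the second returns EBUSY immediately.
bool TryLockTwiceOnValidMutex() {
  pthread_mutex_t m;
  pthread_mutex_init(&m, nullptr);
  int first = pthread_mutex_trylock(&m);
  int second = pthread_mutex_trylock(&m);
  pthread_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return first == 0 && second == EBUSY;
}
```

A valid mutex only ever yields the first two outcomes here, which is why an {{EINVAL}} result points at a mutex that is no longer valid rather than at ordinary contention.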
> To reproduce, run the
> {{ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer}} test from
> {{client-negotiation-failover-itest}}, built from revision
> {{5f8442ff67fe87b019c71a09f0556bdcb6868428}} in DEBUG configuration, with
> {{--stress-cpu-threads=8}} about 1,000 times. A batch of 1,000 runs usually
> produces 3-4 such crashes.