[
https://issues.apache.org/jira/browse/KUDU-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016582#comment-16016582
]
Alexey Serbin commented on KUDU-1034:
-------------------------------------
I ran the new test from [~mpercy]'s patch (client_timeout_fail.patch). The
current Kudu C++ client apparently retries, but the test eventually fails on a
consistency check (this happens in both DEBUG and RELEASE configurations):
{noformat}
W0518 15:41:00.559075 6311 consensus_peers.cc:357] T
fb88746eb1674bbaacbcf459dd492669 P 623bba78cbb6472dbd2c8779cf51c93d -> Peer
16df2bf0fb074c46959c06f6f2069150 (127.24.81.2:43726): Couldn't send request to
peer 16df2bf0fb074c46959c06f6f2069150 for tablet
fb88746eb1674bbaacbcf459dd492669. Status: Timed out: UpdateConsensus RPC to
127.24.81.2:43726 timed out after 0.050s (ON_OUTBOUND_QUEUE). Retrying in the
next heartbeat period. Already tried 20 times.
W0518 15:41:00.638219 6697 batcher.cc:329] Timed out: Failed to write batch of
50 ops to tablet fb88746eb1674bbaacbcf459dd492669 after 1 attempt(s): Failed to
write to server: 16df2bf0fb074c46959c06f6f2069150 (127.24.81.2:43726): Write
RPC to 127.24.81.2:43726 timed out after 0.500s (SENT)
W0518 15:41:01.059166 6311 consensus_peers.cc:357] T
fb88746eb1674bbaacbcf459dd492669 P 623bba78cbb6472dbd2c8779cf51c93d -> Peer
16df2bf0fb074c46959c06f6f2069150 (127.24.81.2:43726): Couldn't send request to
peer 16df2bf0fb074c46959c06f6f2069150 for tablet
fb88746eb1674bbaacbcf459dd492669. Status: Timed out: UpdateConsensus RPC to
127.24.81.2:43726 timed out after 0.050s (ON_OUTBOUND_QUEUE). Retrying in the
next heartbeat period. Already tried 21 times.
F0518 15:41:01.216828 6225 raft_consensus-itest.cc:454] Check failed:
workload.rows_inserted() >= rows_target (1450 vs. 1550)
*** Check failure stack trace: ***
@ 0x8a59b5 google::LogMessage::SendToLog()
@ 0x8a5e9f google::LogMessage::Flush()
@ 0x8a99f2 google::LogMessageFatal::~LogMessageFatal()
@ 0x803fb8 kudu::tserver::RaftConsensusITest_TestClientFailoverOnLeaderTimeout_Test::TestBody()
@ 0x1894aa7 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x1879022 testing::Test::Run()
@ 0x187a284 testing::TestInfo::Run()
@ 0x187aa33 testing::TestCase::Run()
@ 0x1883539 testing::internal::UnitTestImpl::RunAllTests()
@ 0x1895683 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x18830ea testing::UnitTest::Run()
@ 0x8a1dc9 main
@ 0x3ae0a1ed5d (unknown)
@ 0x801b41 (unknown)
Aborted
{noformat}
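For reference, the batcher warning in the log above reports giving up "after 1 attempt(s)" once the Write RPC times out. The following is a minimal sketch of the failover behavior the test seems to expect, not the Kudu client's actual code: AttemptResult, WriteOutcome, WriteWithFailover and the send_write callback are all hypothetical names introduced only for illustration, and the round-robin choice of the next replica is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result of a single write attempt (not Kudu's Status type).
enum class AttemptResult { kOk, kTimedOut, kError };

// Hypothetical summary of the whole write, across all attempts.
struct WriteOutcome {
  bool ok = false;      // true if some replica accepted the write
  int attempts = 0;     // number of RPCs actually sent
  std::string server;   // replica that accepted the write, if any
};

// Tries replicas in round-robin order. A timeout is treated as retriable and
// moves on to the next replica; any other error ends the write immediately.
WriteOutcome WriteWithFailover(
    const std::vector<std::string>& replicas,
    const std::function<AttemptResult(const std::string&)>& send_write,
    int max_attempts) {
  WriteOutcome out;
  if (replicas.empty()) return out;
  for (int i = 0; i < max_attempts; ++i) {
    const std::string& target =
        replicas[static_cast<std::size_t>(i) % replicas.size()];
    ++out.attempts;
    const AttemptResult r = send_write(target);
    if (r == AttemptResult::kOk) {
      out.ok = true;
      out.server = target;
      return out;
    }
    if (r == AttemptResult::kError) return out;  // non-retriable: stop here
    // kTimedOut: fall through and try the next replica.
  }
  return out;  // every attempt timed out
}
```

With three replicas where only the first one times out, this sketch would succeed on the second attempt instead of reporting a failed batch after the first.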
> Client does not fail over due to timeout
> ----------------------------------------
>
> Key: KUDU-1034
> URL: https://issues.apache.org/jira/browse/KUDU-1034
> Project: Kudu
> Issue Type: Bug
> Components: client
> Affects Versions: Feature Complete
> Reporter: Mike Percy
> Assignee: Alexey Serbin
> Priority: Critical
> Attachments: client_timeout_fail.patch,
> client_timeout_flush_hang.patch
>
>
> The client does not fail over when a request times out. Attaching a failing
> test case.
> I made the test case part of RaftConsensusITest only because it was
> convenient; it may belong elsewhere.