[ 
https://issues.apache.org/jira/browse/KUDU-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016582#comment-16016582
 ] 

Alexey Serbin commented on KUDU-1034:
-------------------------------------

Running the new test implemented in the patch from [~mpercy] 
(client_timeout_fail.patch), the current Kudu C++ client apparently retries, 
but the test eventually fails on a consistency check (this is true for both 
DEBUG and RELEASE configurations):

{noformat}
W0518 15:41:00.559075  6311 consensus_peers.cc:357] T fb88746eb1674bbaacbcf459dd492669 P 623bba78cbb6472dbd2c8779cf51c93d -> Peer 16df2bf0fb074c46959c06f6f2069150 (127.24.81.2:43726): Couldn't send request to peer 16df2bf0fb074c46959c06f6f2069150 for tablet fb88746eb1674bbaacbcf459dd492669. Status: Timed out: UpdateConsensus RPC to 127.24.81.2:43726 timed out after 0.050s (ON_OUTBOUND_QUEUE). Retrying in the next heartbeat period. Already tried 20 times.
W0518 15:41:00.638219  6697 batcher.cc:329] Timed out: Failed to write batch of 50 ops to tablet fb88746eb1674bbaacbcf459dd492669 after 1 attempt(s): Failed to write to server: 16df2bf0fb074c46959c06f6f2069150 (127.24.81.2:43726): Write RPC to 127.24.81.2:43726 timed out after 0.500s (SENT)
W0518 15:41:01.059166  6311 consensus_peers.cc:357] T fb88746eb1674bbaacbcf459dd492669 P 623bba78cbb6472dbd2c8779cf51c93d -> Peer 16df2bf0fb074c46959c06f6f2069150 (127.24.81.2:43726): Couldn't send request to peer 16df2bf0fb074c46959c06f6f2069150 for tablet fb88746eb1674bbaacbcf459dd492669. Status: Timed out: UpdateConsensus RPC to 127.24.81.2:43726 timed out after 0.050s (ON_OUTBOUND_QUEUE). Retrying in the next heartbeat period. Already tried 21 times.
F0518 15:41:01.216828  6225 raft_consensus-itest.cc:454] Check failed: workload.rows_inserted() >= rows_target (1450 vs. 1550) 
*** Check failure stack trace: ***
    @           0x8a59b5  google::LogMessage::SendToLog()
    @           0x8a5e9f  google::LogMessage::Flush()
    @           0x8a99f2  google::LogMessageFatal::~LogMessageFatal()
    @           0x803fb8  kudu::tserver::RaftConsensusITest_TestClientFailoverOnLeaderTimeout_Test::TestBody()
    @          0x1894aa7  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @          0x1879022  testing::Test::Run()
    @          0x187a284  testing::TestInfo::Run()
    @          0x187aa33  testing::TestCase::Run()
    @          0x1883539  testing::internal::UnitTestImpl::RunAllTests()
    @          0x1895683  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @          0x18830ea  testing::UnitTest::Run()
    @           0x8a1dc9  main
    @       0x3ae0a1ed5d  (unknown)
    @           0x801b41  (unknown)
Aborted
{noformat}



> Client does not fail over due to timeout
> ----------------------------------------
>
>                 Key: KUDU-1034
>                 URL: https://issues.apache.org/jira/browse/KUDU-1034
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: Feature Complete
>            Reporter: Mike Percy
>            Assignee: Alexey Serbin
>            Priority: Critical
>         Attachments: client_timeout_fail.patch, 
> client_timeout_flush_hang.patch
>
>
> The client will not fail over due to a timeout error. Attaching a failing 
> test case.
> I made the test case part of RaftConsensusITest because it was convenient; 
> maybe it should go elsewhere.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
