Mike Percy created KUDU-2734:
--------------------------------

             Summary: RemoteKsckTest.TestClusterWithLocation is flaky
                 Key: KUDU-2734
                 URL: https://issues.apache.org/jira/browse/KUDU-2734
             Project: Kudu
          Issue Type: Improvement
          Components: test
    Affects Versions: 1.9.0
            Reporter: Mike Percy


RemoteKsckTest.TestClusterWithLocation is flaky

Alexey took a look at it and here is the analysis:

In essence, due to slowness of TSAN builds, connection negotiation from kudu 
CLI to one of master servers timed out, so one of the preconditions of the test 
didn't meet.  The error output by the test was:
{code:java}
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/tools/ksck_remote-test.cc:523:
 Failure
Failed                                                                          
Bad status: Network error: failed to gather info from all masters: 1 of 3 had 
errors
{code}
The corresponding error in the master's log was:
{code:java}
W0221 12:38:27.119146 31380 negotiation.cc:313] Failed RPC negotiation. Trace:  
0221 12:38:23.949428 (+     0us) reactor.cc:583] Submitting negotiation task 
for client connection to 127.25.42.190:51799
0221 12:38:25.362220 (+1412792us) negotiation.cc:98] Waiting for socket to 
connect
0221 12:38:25.363489 (+  1269us) client_negotiation.cc:167] Beginning 
negotiation
0221 12:38:25.369976 (+  6487us) client_negotiation.cc:244] Sending NEGOTIATE 
NegotiatePB request
0221 12:38:25.431582 (+ 61606us) client_negotiation.cc:261] Received NEGOTIATE 
NegotiatePB response
0221 12:38:25.431610 (+    28us) client_negotiation.cc:355] Received NEGOTIATE 
response from server
0221 12:38:25.432659 (+  1049us) client_negotiation.cc:182] Negotiated 
authn=SASL
0221 12:38:27.051125 (+1618466us) client_negotiation.cc:483] Received 
TLS_HANDSHAKE response from server
0221 12:38:27.062085 (+ 10960us) client_negotiation.cc:471] Sending 
TLS_HANDSHAKE message to server
0221 12:38:27.062132 (+    47us) client_negotiation.cc:244] Sending 
TLS_HANDSHAKE NegotiatePB request
0221 12:38:27.064391 (+  2259us) negotiation.cc:304] Negotiation complete: 
Timed out: Client connection negotiation failed: client connection to 
127.25.42.190:51799: BlockingWrite timed out
{code}
We are seeing this on the flaky test dashboard for both TSAN and ASAN builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to