Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/23099 )
Change subject: [tests] fix flakiness in exactly_once_writes-itest
......................................................................
[tests] fix flakiness in exactly_once_writes-itest
This changelist fixes flakiness in the
TestWritesWithExactlyOnceSemanticsWithCrashyNodes scenario of
ExactlyOnceSemanticsITest. As the log snippet below shows, an unexpected
crash happened in TabletServerIntegrationTestBase::BuildAndStart()
because the --fault_crash_after_leader_request_fraction flag setting
induced the crash there. The scenario apparently wasn't meant to crash
in BuildAndStart(), so the fault injection flag is now set via SetFlag()
_after_ starting the cluster and bootstrapping the components of the
test harness.
E20250626 08:41:09.165330 27745 fault_injection.cc:59] injecting fault for
kudu (pid 27730): FLAGS_fault_crash_after_leader_request_fraction (process will
exit)
W20250626 08:41:09.188339 27092 connection.cc:537] server connection from
127.26.53.135:35457 recv error: Network error: recv error from unknown peer:
Transport endpoint is not connected (error 107)
W20250626 08:41:09.188642 27933 negotiation.cc:337] Failed RPC negotiation.
Trace:
0626 08:41:09.162641 (+ 0us) reactor.cc:625] Submitting negotiation task
for client connection to 127.26.53.135:44789 (local address 127.0.0.1:40552)
0626 08:41:09.162821 (+ 180us) negotiation.cc:107] Waiting for socket to
connect
0626 08:41:09.162841 (+ 20us) client_negotiation.cc:174] Beginning
negotiation
0626 08:41:09.162954 (+ 113us) client_negotiation.cc:252] Sending NEGOTIATE
NegotiatePB request
0626 08:41:09.186860 (+ 23906us) negotiation.cc:327] Negotiation complete:
Network error: Client connection negotiation failed: client connection to
127.26.53.135:44789: BlockingRecv error: recv error from unknown peer:
Transport endpoint is not connected (error 107)
Metrics: {"client-negotiator.queue_time_us":58}
/home/jenkins-slave/workspace/build_and_test/src/kudu/integration-tests/ts_itest-base.cc:555:
Failure
Failed
Bad status: Network error: Client connection negotiation failed: client
connection to 127.26.53.135:44789: BlockingRecv error: recv error from unknown
peer: Transport endpoint is not connected (error 107)
/home/jenkins-slave/workspace/build_and_test/src/kudu/integration-tests/exactly_once_writes-itest.cc:208:
Failure
Expected: BuildAndStart(ts_flags, master_flags) doesn't generate new fatal
failures in the current thread.
Actual: it does.
Change-Id: Ia2d9da084220fa48749fa67cfb4dca422a9601f2
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/integration-tests/exactly_once_writes-itest.cc
2 files changed, 44 insertions(+), 23 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/23099/1
--
To view, visit http://gerrit.cloudera.org:8080/23099
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia2d9da084220fa48749fa67cfb4dca422a9601f2
Gerrit-Change-Number: 23099
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <[email protected]>