[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 8: Code-Review+1 (3 comments) Overall looks good to me, but it seems the rate of flakiness for the newly added test is a bit high. http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc File src/kudu/integration-tests/raft_consensus_election-itest.cc: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@167 PS4, Line 167: TEST_F(RaftConsensusElectionITest, TestNewLeaderCantResolvePeers) { > Tried it and the test passes. Cool, thanks a lot for verifying that! http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@247 PS4, Line 247: }); : }; : NO_FATALS(wait_for_some_elections()); : NO_FATALS(cluster_->AssertNoCrashes()); : : // Stop injecting latency so we can get a stable leader. > Done Thank you! http://gerrit.cloudera.org:8080/#/c/17585/8/src/kudu/integration-tests/raft_consensus_election-itest.cc File src/kudu/integration-tests/raft_consensus_election-itest.cc: http://gerrit.cloudera.org:8080/#/c/17585/8/src/kudu/integration-tests/raft_consensus_election-itest.cc@217 PS8, Line 217: ASSERT_LT(initial_cstate.current_term() + 10, cstate.current_term()); It seems this fails rather often, even in RELEASE build: src/kudu/integration-tests/raft_consensus_election-itest.cc:218 Expected: (initial_cstate.current_term() + 10) < (cstate.current_term()), actual: 30 vs 22 Maybe, it's worth switch this scenario into manual election mode? -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 12 Jun 2021 01:55:06 + Gerrit-HasComments: Yes
[kudu-CR] [clock] change clock source selection for 'auto'
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17582 ) Change subject: [clock] change clock source selection for 'auto' .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17582/3/src/kudu/clock/hybrid_clock.cc File src/kudu/clock/hybrid_clock.cc: http://gerrit.cloudera.org:8080/#/c/17582/3/src/kudu/clock/hybrid_clock.cc@260 PS3, Line 260: uint64_t now_latest = GetPhysicalValueMicros(now) + error; > warning: The right operand of '+' is a garbage value [clang-analyzer-core.U This is a garbage warning: NowWithErrorOrDie() cannot return not setting the value for 'error'. Ignored. -- To view, visit http://gerrit.cloudera.org:8080/17582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 Gerrit-Change-Number: 17582 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 12 Jun 2021 00:27:09 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#8). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus-test-util.h M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 9 files changed, 241 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/8 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#7). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus-test-util.h M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 9 files changed, 241 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/7 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc File src/kudu/integration-tests/raft_consensus_election-itest.cc: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@167 PS4, Line 167: TEST_F(RaftConsensusElectionITest, TestNewLeaderCantResolvePeers) { > I'm curious: in the scope of this test, is it important or not to have conn Tried it and the test passes. http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@247 PS4, Line 247: if (peer.health_report().overall_health() == consensus::HealthReportPB::FAILED) { : has_failed_peer = true; : break; : } : } : } > In addition to this happy-ending, could you add a scenario when there is an Done -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 23:54:33 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#6). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus-test-util.h M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 9 files changed, 247 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/6 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [clock] change clock source selection for 'auto'
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17582 ) Change subject: [clock] change clock source selection for 'auto' .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/17582/2/src/kudu/clock/hybrid_clock.h File src/kudu/clock/hybrid_clock.h: http://gerrit.cloudera.org:8080/#/c/17582/2/src/kudu/clock/hybrid_clock.h@206 PS2, Line 206: // The 'builtin_ntp_servers' is used in case of TimeSource::BUILTIN_NTP_SYNC. > this line is no longer needed. Done http://gerrit.cloudera.org:8080/#/c/17582/2/src/kudu/clock/hybrid_clock.cc File src/kudu/clock/hybrid_clock.cc: http://gerrit.cloudera.org:8080/#/c/17582/2/src/kudu/clock/hybrid_clock.cc@260 PS2, Line 260: uint64_t now_latest = GetPhysicalValueMicros(now) + error; > warning: The right operand of '+' is a garbage value [clang-analyzer-core.U This is a garbage warning: NowWithErrorOrDie() cannot return not setting the value for 'error'. Ignored. -- To view, visit http://gerrit.cloudera.org:8080/17582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 Gerrit-Change-Number: 17582 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 22:52:14 + Gerrit-HasComments: Yes
[kudu-CR] [clock] change clock source selection for 'auto'
Hello Tidy Bot, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17582 to look at the new patch set (#3). Change subject: [clock] change clock source selection for 'auto' .. [clock] change clock source selection for 'auto' This patch changes the logic to select particular time source when the pseudo-source 'auto' used for --time_source. Before this patch, the 'system' time source would be auto-selected if a Kudu server with --time_source=auto is run in an environment where the instance detector isn't aware of a dedicated NTP server (those are usually available via the host-only interface, at least for EC2 and GCE instances). After this patch, the 'builtin' time source would be auto-selected if a Kudu server runs with --time_source=auto in environment where the instance detector isn't aware of dedicated NTP servers AND the --builtin_ntp_servers flag is set to a valid value. Otherwise, if --builtin_ntp_servers flag is set to an empty/invalid value, 'system' becomes the auto-selected time source for platforms supporting get_ntptime() API, otherwise the catch-all case is used to auto-select 'system_unsync' as the time source. I also added a new test scenario to cover the updated functionality. Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 --- M src/kudu/clock/builtin_ntp.cc M src/kudu/clock/builtin_ntp.h M src/kudu/clock/hybrid_clock-test.cc M src/kudu/clock/hybrid_clock.cc M src/kudu/clock/hybrid_clock.h M src/kudu/clock/ntp-test.cc M src/kudu/server/default_path_handlers.cc 7 files changed, 146 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/17582/3 -- To view, visit http://gerrit.cloudera.org:8080/17582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 Gerrit-Change-Number: 17582 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [clock] use GTEST SKIP in BuiltinNtpWithMiniChronydTest
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17587 ) Change subject: [clock] use GTEST_SKIP in BuiltinNtpWithMiniChronydTest .. [clock] use GTEST_SKIP in BuiltinNtpWithMiniChronydTest Once switched to a newer version of gtest where GTEST_SKIP() is present, it make sense to use GTEST_SKIP() instead of simple return to have more accurate reports on passed/failed/skipped tests. This is a follow-up to 78bd6c04ef37ca8906379a587140bc646433e28c. Change-Id: I2b6fe6d8afb14e23a9f8b7766986b43040bfd7e9 Reviewed-on: http://gerrit.cloudera.org:8080/17587 Reviewed-by: Bankim Bhavsar Tested-by: Kudu Jenkins --- M src/kudu/clock/ntp-test.cc 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Bankim Bhavsar: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/17587 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I2b6fe6d8afb14e23a9f8b7766986b43040bfd7e9 Gerrit-Change-Number: 17587 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 3: Verified+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/17584/2/src/kudu/util/net/net_util.cc File src/kudu/util/net/net_util.cc: http://gerrit.cloudera.org:8080/#/c/17584/2/src/kudu/util/net/net_util.cc@236 PS2, Line 236: " > nit: would it be helpful to report the address which the DNS resolution fai Done -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 21:16:19 + Gerrit-HasComments: Yes
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Andrew Wong has removed a vote on this change. Change subject: [util] flag to specify hosts to fail DNS resolution .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [clock] use GTEST SKIP in BuiltinNtpWithMiniChronydTest
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17587 ) Change subject: [clock] use GTEST_SKIP in BuiltinNtpWithMiniChronydTest .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17587 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2b6fe6d8afb14e23a9f8b7766986b43040bfd7e9 Gerrit-Change-Number: 17587 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 20:58:37 + Gerrit-HasComments: No
[kudu-CR] [clock] use GTEST SKIP in BuiltinNtpWithMiniChronydTest
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17587 Change subject: [clock] use GTEST_SKIP in BuiltinNtpWithMiniChronydTest .. [clock] use GTEST_SKIP in BuiltinNtpWithMiniChronydTest Once switched to a newer version of gtest where GTEST_SKIP() is present, it make sense to use GTEST_SKIP() instead of simple return to have more accurate reports on passed/failed/skipped tests. This is a follow-up to 78bd6c04ef37ca8906379a587140bc646433e28c. Change-Id: I2b6fe6d8afb14e23a9f8b7766986b43040bfd7e9 --- M src/kudu/clock/ntp-test.cc 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/87/17587/1 -- To view, visit http://gerrit.cloudera.org:8080/17587 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I2b6fe6d8afb14e23a9f8b7766986b43040bfd7e9 Gerrit-Change-Number: 17587 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 20:53:56 + Gerrit-HasComments: No
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 5: Code-Review+1 (2 comments) Overall looks good, but it would be great to add a test scenario which would exercise one other realistic scenario when the names many other tablet servers in the cluster are still resolvable, so the system replaces the failed peer with a new one for the time of the DNS resolver's malfunction for one particular kudu-tserver node. http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc File src/kudu/integration-tests/raft_consensus_election-itest.cc: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@167 PS4, Line 167: TEST_F(RaftConsensusElectionITest, TestNewLeaderCantResolvePeers) { I'm curious: in the scope of this test, is it important or not to have connections already established between Raft peers? Will anything change if adding --rpc_reopen_outbound_connections=true in this scenario? http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/integration-tests/raft_consensus_election-itest.cc@247 PS4, Line 247: // Once we stop failing DNS resolution, the peer should become healthy. : for (const auto& ts : tablet_servers_) { : ASSERT_OK(cluster_->SetFlag(cluster_->tablet_server_by_uuid(ts.first), : "fail_dns_resolution", "false")); : } : NO_FATALS(check_for_failed_peer(/*expect_failed*/false)); In addition to this happy-ending, could you add a scenario when there is an extra tablet servers to spawn a replacement replica? I guess the expected result would be having the peer with failed DNS name replaced, and it's important to make sure everything works as expected in that case. -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 19:52:17 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus-test-util.h File src/kudu/consensus/consensus-test-util.h: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus-test-util.h@341 PS4, Line 341: class NoOpTestPeerProxy : public TestPeerProxy { > warning: parameter 'peer_pb' is unused [misc-unused-parameters] Done http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus-test-util.h@342 PS4, Line 342: public: > This could be dangerous. The ownership of the raw ptr has been transferred Existing test usage of Peers has been unsafe in similar ways (e.g. creating a proxy as a raw pointer and passing ownership to the Peers). Updated this impl a bit to at least pass ASAN, and added a comment that this can be unsafe and that users should be wary. If you feel strongly that this isn't sufficient for tests, I can go about making more drastic changes. http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus_peers-test.cc File src/kudu/consensus/consensus_peers-test.cc: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus_peers-test.cc@152 PS4, Line 152: , > This piece looks problematic even before this change. Done -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 19:47:05 + Gerrit-HasComments: Yes
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Hello Alexey Serbin, Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17584 to look at the new patch set (#3). Change subject: [util] flag to specify hosts to fail DNS resolution .. [util] flag to specify hosts to fail DNS resolution This patch adds a flag to specify the specific hostports to which failures are injected by --fail_dns_resolution. I intend on using this in a subsequent patch. Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab --- M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc 2 files changed, 61 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/17584/3 -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Tidy Bot, Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#5). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus-test-util.h M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 9 files changed, 194 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/5 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus-test-util.h File src/kudu/consensus/consensus-test-util.h: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus-test-util.h@342 PS4, Line 342: *proxy = std::unique_ptr(proxy_); This could be dangerous. The ownership of the raw ptr has been transferred to the unique_ptr but the proxy_ still points to a location which can be deleted by the caller of NewProxy() holding the unique_ptr. It'd be better if proxy_ itself is a unique_ptr. http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus_peers-test.cc File src/kudu/consensus/consensus_peers-test.cc: http://gerrit.cloudera.org:8080/#/c/17585/4/src/kudu/consensus/consensus_peers-test.cc@152 PS4, Line 152: return proxy_ptr; > warning: Use of memory after it is freed [clang-analyzer-cplusplus.NewDelet This piece looks problematic even before this change. -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 19:21:18 + Gerrit-HasComments: Yes
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/17584/2/src/kudu/util/net/net_util.cc File src/kudu/util/net/net_util.cc: http://gerrit.cloudera.org:8080/#/c/17584/2/src/kudu/util/net/net_util.cc@236 PS2, Line 236: " nit: would it be helpful to report the address which the DNS resolution failure was injected for? -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 18:59:24 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/consensus_peers.cc@508 PS2, Line 508: d::move(peer_proxy) > From the test code looks like this could be null. Do we need to check for n Done http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.cc File src/kudu/consensus/peer_manager.cc: http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.cc@77 PS2, Line 77: unique_ptr peer_proxy; : WARN_NOT_OK(peer_proxy_factory_->NewProxy(peer_pb, _proxy), : Substitute("Unable to create proxy for $0", peer_pb.permanent_uuid())); > Since the proxy factory is being supplied to NewRemotePeer, this could move I wanted to not be too invasive with the API for tests originally, but turns out this was only used in one test. -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 18:30:37 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#4). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus-test-util.h M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 9 files changed, 207 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/4 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 17:45:03 + Gerrit-HasComments: No
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.h File src/kudu/consensus/peer_manager.h: http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.h@60 PS2, Line 60: // proxies also re-attempt to re-resolve their addresses. > What happens currently when the peers are not retried? Ah I left this from before I started re-resolving. Peers are retried in this patch. -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 17:43:47 + Gerrit-HasComments: Yes
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/17584/1/src/kudu/util/net/net_util.cc File src/kudu/util/net/net_util.cc: http://gerrit.cloudera.org:8080/#/c/17584/1/src/kudu/util/net/net_util.cc@228 PS1, Line 228: for (const auto& hp_str : failed_hostports) { > Nit:To ensure no white spacing issue or missing/default port, it'd be bette Done -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 17:43:21 + Gerrit-HasComments: Yes
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Hello Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17584 to look at the new patch set (#2). Change subject: [util] flag to specify hosts to fail DNS resolution .. [util] flag to specify hosts to fail DNS resolution This patch adds a flag to specify the specific hostports to which failures are injected by --fail_dns_resolution. I intend on using this in a subsequent patch. Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab --- M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc 2 files changed, 61 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/17584/2 -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Kudu Jenkins, Grant Henke, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#3). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 8 files changed, 161 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/3 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/consensus_peers.cc File src/kudu/consensus/consensus_peers.cc: http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/consensus_peers.cc@508 PS2, Line 508: peer_proxy_factory_ >From the test code looks like this could be null. Do we need to check for >not-null ptr or return an error? http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.cc File src/kudu/consensus/peer_manager.cc: http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.cc@77 PS2, Line 77: unique_ptr peer_proxy; : WARN_NOT_OK(peer_proxy_factory_->NewProxy(peer_pb, _proxy), : Substitute("Unable to create proxy for $0", peer_pb.permanent_uuid())); Since the proxy factory is being supplied to NewRemotePeer, this could move to NewRemotePeer() itself utilizing the CreateIfNeeded() function and no need to supply peer_proxy to NewRemotePeer(). -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 17:40:03 + Gerrit-HasComments: Yes
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17584/1/src/kudu/util/net/net_util.cc File src/kudu/util/net/net_util.cc: http://gerrit.cloudera.org:8080/#/c/17584/1/src/kudu/util/net/net_util.cc@228 PS1, Line 228: if (ContainsKey(failed_hostports, ToString())) { Nit:To ensure no white spacing issue or missing/default port, it'd be better to compare HostPort/Sockaddr instead of strings. -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 16:24:13 + Gerrit-HasComments: Yes
[kudu-CR] [rest] add oat++ framework to kudu
Hello Attila Bukor, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17554 to look at the new patch set (#6). Change subject: [rest] add oat++ framework to kudu .. [rest] add oat++ framework to kudu Add oat++ framework with swagger module, which is used in the follow-up commits to implement an embeddable rest server into the master servers and later on tablet servers. Change-Id: Ie1b5376297b0170a624655acf836cdedc090f6e7 --- M CMakeLists.txt A cmake_modules/FindOatpp.cmake A cmake_modules/FindOatppSwagger.cmake M thirdparty/build-definitions.sh M thirdparty/build-thirdparty.sh M thirdparty/download-thirdparty.sh M thirdparty/vars.sh 7 files changed, 143 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/54/17554/6 -- To view, visit http://gerrit.cloudera.org:8080/17554 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie1b5376297b0170a624655acf836cdedc090f6e7 Gerrit-Change-Number: 17554 Gerrit-PatchSet: 6 Gerrit-Owner: Khazar Mammadli Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] WIP [clock] change clock source selection for 'auto'
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/17582 ) Change subject: WIP [clock] change clock source selection for 'auto' .. Patch Set 2: (1 comment) The overall change/approach makes sense to me. http://gerrit.cloudera.org:8080/#/c/17582/2/src/kudu/clock/hybrid_clock.h File src/kudu/clock/hybrid_clock.h: http://gerrit.cloudera.org:8080/#/c/17582/2/src/kudu/clock/hybrid_clock.h@206 PS2, Line 206: // The 'builtin_ntp_servers' is used in case of TimeSource::BUILTIN_NTP_SYNC. this line is no longer needed. -- To view, visit http://gerrit.cloudera.org:8080/17582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 Gerrit-Change-Number: 17582 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 11 Jun 2021 12:56:27 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 2: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.h File src/kudu/consensus/peer_manager.h: http://gerrit.cloudera.org:8080/#/c/17585/2/src/kudu/consensus/peer_manager.h@60 PS2, Line 60: // TODO(awong): those peers should be retried later. What happens currently when the peers are not retried? -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 12:55:42 + Gerrit-HasComments: Yes
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/17585 ) Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 12:51:27 + Gerrit-HasComments: No
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Grant Henke has removed a vote on this change. Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/17584 ) Change subject: [util] flag to specify hosts to fail DNS resolution .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 11 Jun 2021 12:51:12 + Gerrit-HasComments: No
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17585 to look at the new patch set (#2). Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 8 files changed, 161 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/2 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [consensus] KUDU-2302: don't crash if new leader can't resolve peer
Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17585 Change subject: [consensus] KUDU-2302: don't crash if new leader can't resolve peer .. [consensus] KUDU-2302: don't crash if new leader can't resolve peer When a tablet replica is elected leader, it constructs Peer objects for each replica in the Raft config for the sake of sending RPCs to each. If, during this construction, any remote peer cannot be reached for whatever reason, this would result in a crash. Rather than crashing, this patch allows us to start Peers without a proxy, and retries constructing the proxy the next time a proxy is required. Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 --- M src/kudu/consensus/consensus_peers-test.cc M src/kudu/consensus/consensus_peers.cc M src/kudu/consensus/consensus_peers.h M src/kudu/consensus/peer_manager.cc M src/kudu/consensus/peer_manager.h M src/kudu/consensus/raft_consensus.cc M src/kudu/consensus/raft_consensus.h M src/kudu/integration-tests/raft_consensus_election-itest.cc 8 files changed, 160 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/17585/1 -- To view, visit http://gerrit.cloudera.org:8080/17585 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I22d870ecc526fa47b97f6856c3b023bc1ec029c7 Gerrit-Change-Number: 17585 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong
[kudu-CR] [util] flag to specify hosts to fail DNS resolution
Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17584 Change subject: [util] flag to specify hosts to fail DNS resolution .. [util] flag to specify hosts to fail DNS resolution This patch adds a flag to specify the specific hostports to which failures are injected by --fail_dns_resolution. I intend on using this in a subsequent patch. Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab --- M src/kudu/util/net/net_util-test.cc M src/kudu/util/net/net_util.cc 2 files changed, 53 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/84/17584/1 -- To view, visit http://gerrit.cloudera.org:8080/17584 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I25f2d3073fa0fa0a430afacf7c8cd1402e6bd6ab Gerrit-Change-Number: 17584 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong
[kudu-CR] wip
Alexey Serbin has abandoned this change. ( http://gerrit.cloudera.org:8080/17581 ) Change subject: wip .. Abandoned submitted by mistake before squashing commits -- To view, visit http://gerrit.cloudera.org:8080/17581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ib5e9c4cda6318a56252678447191ea8f265fb9fd Gerrit-Change-Number: 17581 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] WIP [clock] change clock source selection for 'auto'
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17582 to look at the new patch set (#2). Change subject: WIP [clock] change clock source selection for 'auto' .. WIP [clock] change clock source selection for 'auto' This patch changes the logic to select particular time source the pseudo-source 'auto' used for --time_source. Before this patch, the 'system' time source would be selected if a Kudu server --time_source=auto is run on a non-cloud VM or other environment where the cloud instance detector isn't aware of a dedicated NTP server (which is usually available via the host-only interface). After this patch, the 'builtin' time source would be selected if a Kudu server runs with --time_source=auto on a non-cloud VM or other environment where the cloud instance detector isn't aware of dedicated NTP servers AND the --builtin_ntp_servers flag is set to a valid value. WIP: * collect initial feedback * update/add tests to cover the newly introduced functionality Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 --- M src/kudu/clock/builtin_ntp.cc M src/kudu/clock/builtin_ntp.h M src/kudu/clock/hybrid_clock.cc M src/kudu/clock/hybrid_clock.h M src/kudu/clock/ntp-test.cc M src/kudu/server/default_path_handlers.cc 6 files changed, 83 insertions(+), 54 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/17582/2 -- To view, visit http://gerrit.cloudera.org:8080/17582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 Gerrit-Change-Number: 17582 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] WIP [clock] change clock source selection for 'auto'
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17582 Change subject: WIP [clock] change clock source selection for 'auto' .. WIP [clock] change clock source selection for 'auto' This patch changes the logic to select particular time source the pseudo-source 'auto' used for --time_source. Before this patch, the 'system' time source would be selected if a Kudu server --time_source=auto is run on a non-cloud VM or other environment where the cloud instance detector isn't aware of a dedicated NTP server (which is usually available via the host-only interface). After this patch, the 'builtin' time source would be selected if a Kudu server runs with --time_source=auto on a non-cloud VM or other environment where the cloud instance detector isn't aware of dedicated NTP servers AND the --builtin_ntp_servers flag is set to a valid value. WIP: * collect initial feedback * update/add tests to cover the newly introduced functionality Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 --- M src/kudu/clock/builtin_ntp.cc M src/kudu/clock/hybrid_clock.h M src/kudu/clock/ntp-test.cc 3 files changed, 28 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/17582/1 -- To view, visit http://gerrit.cloudera.org:8080/17582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I72cb54c488b2aa2c73acd6e6e5f6e50dd5811175 Gerrit-Change-Number: 17582 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin
[kudu-CR] wip
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17581 Change subject: wip .. wip Change-Id: Ib5e9c4cda6318a56252678447191ea8f265fb9fd --- M src/kudu/clock/builtin_ntp.cc M src/kudu/clock/builtin_ntp.h M src/kudu/clock/hybrid_clock.cc M src/kudu/clock/hybrid_clock.h M src/kudu/clock/ntp-test.cc M src/kudu/server/default_path_handlers.cc 6 files changed, 59 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/17581/1 -- To view, visit http://gerrit.cloudera.org:8080/17581 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib5e9c4cda6318a56252678447191ea8f265fb9fd Gerrit-Change-Number: 17581 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin