[kudu-CR] tools: don't open block manager when dumping UUID
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14243 ) Change subject: tools: don't open block manager when dumping UUID .. Patch Set 1: There isn't much behind this CR; I just ran the tool on a pretty loaded server and it took a while to get the UUID. -- To view, visit http://gerrit.cloudera.org:8080/14243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0acf3b08fc13c169a52d4cfe847efd62bd30cc0a Gerrit-Change-Number: 14243 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 17 Sep 2019 05:56:00 + Gerrit-HasComments: No
[kudu-CR] tools: don't open block manager when dumping UUID
Hello Adar Dembo, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/14243 to review the following change. Change subject: tools: don't open block manager when dumping UUID .. tools: don't open block manager when dumping UUID The block manager can be expensive to open, even when opening it in read-only mode. If all a user wants is the server's UUID, all we really need is to read the instance metadata files. This patch updates the `kudu fs dump uuid` tool to do just that. Change-Id: I0acf3b08fc13c169a52d4cfe847efd62bd30cc0a --- M src/kudu/fs/fs_manager.cc M src/kudu/fs/fs_manager.h M src/kudu/tools/tool_action_fs.cc 3 files changed, 32 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/14243/1 -- To view, visit http://gerrit.cloudera.org:8080/14243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0acf3b08fc13c169a52d4cfe847efd62bd30cc0a Gerrit-Change-Number: 14243 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo
[kudu-CR] [tserver] include ip:port in the tserver name
helifu has posted comments on this change. ( http://gerrit.cloudera.org:8080/14232 ) Change subject: [tserver] include ip:port in the tserver name .. Patch Set 2: > (1 comment) > > The TSAN failure is interesting. It's not related to this patch, > but it is related to one of your previous patches, Lifu. > > In ts_tablet_manager-itest.cc: > > // 6. Restart the masters. > { > for (int i = 0; i < kNumMasters; ++i) { > int idx = 0; > ASSERT_OK(cluster_->GetLeaderMasterIndex()); > master::MiniMaster* mini_master = CHECK_NOTNULL(cluster_->mini_master(idx)); > mini_master->Shutdown(); > SleepFor(MonoDelta::FromMilliseconds(kMaxElectionTime)); > ASSERT_OK(mini_master->Restart()); > // Sometimes the election fails until the node restarts. > // And the restarted node is elected leader again. > // So, it is necessary to wait for all tservers to report. > SleepFor(MonoDelta::FromMilliseconds(FLAGS_heartbeat_interval_ms)); > NO_FATALS(CheckStats(kRowsCount)); > } > } > > I think there needs to be a call to mini_master->WaitForCatalogManagerInit() > somewhere, perhaps after the Restart(), or after the waiting. The test case here only cares about the leader master, so I don't think it's necessary to make sure the restarted master should be online, that means calling "mini_master->WaitForCatalogManagerInit()" is not required. Besides, there is evidence that the restarted master had been online before failure, please look at the full test output from Line2136 to Line2168. -- To view, visit http://gerrit.cloudera.org:8080/14232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e Gerrit-Change-Number: 14232 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: helifu Gerrit-Comment-Date: Tue, 17 Sep 2019 02:40:07 + Gerrit-HasComments: No
[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing
Hannah Nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 ) Change subject: WIP KUDU-2780: create thread for auto-rebalancing .. Patch Set 3: (40 comments) http://gerrit.cloudera.org:8080/#/c/14177/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14177/3//COMMIT_MSG@8 PS3, Line 8: : Created auto-rebalancer thread in background tasks of catalog manager. : Set up framework for auto-rebalancing loop. : Loop retrieves information on tservers, tables, and tablets for rebalancing. : The number of replica moves per loop iteration is controlled by a flag. : If there are placement policy violations, the current loop iteration will only : perform replica moves to reinstate the policy. : Otherwise, the auto-rebalancer will perform moves to do inter-location(cross-location), : then intra-location(by table then by tserver) rebalancing. : If the cluster is balanced, the current rebalancing cycle completes, and the : thread will sleep for an interval, controlled by another flag. > nit: I found this a little difficult to read. Could you split it up into pa Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@58 PS3, Line 58: if (cluster_) > nit: it's a good practice to use scope braces for 'if ()' clause even it's Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@61 PS3, Line 61: else return -1; > warning: do not use 'else' after 'return' [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@68 PS3, Line 68: else return -1; > warning: do not use 'else' after 'return' [readability-else-after-return] Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@85 PS3, Line 85: int kNumMasters = 3; : int kNumTservers = 3; : int kNumTablets = 4; > nit: these are constants, right? Add const/constexpr if so. Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@96 PS3, Line 96: 10 > 10+ seconds run-time for a test makes it a 'slow' test. Add a notion of ru Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@101 PS3, Line 101: if (i==leader_idx) ASSERT_NE(0, number_of_loop_iterations(i)); : else ASSERT_EQ(0, moves_scheduled_this_round(i)); > Would this test fail if master leadership changes while the test is running Hmm, I think it would. Is it enough to just reset the leader index then restart the for loop? ie: if (leader_changed) { leader_idx = new_leader_idx; i=0; continue; } http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@104 PS3, Line 104: > nit: an extra space Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@111 PS3, Line 111: int kNumMasters = 3; : int kNumTservers = 3; : int kNumTablets = 3; > Add const/constexpr. Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@132 PS3, Line 132: auto iterations = number_of_loop_iterations(new_leader_idx); : ASSERT_LT(0, iterations); > Might it happen that the rebalancing thread hasn't yet after master leader Would it make sense to sleep for a bit to make sure the auto-rebalancer thread has been initialized and begins iterating? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@139 PS3, Line 139: kNumTservers, kNumTablets; > These are not constants, but 'kXxx' is used only for constants. Change the Done http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@214 PS3, Line 214: auto-rebalancing flag is paused > nit: how does one pause a flag? How do you allow a flag to be changed via the CLI? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@241 PS3, Line 241: ASSERT_GE(moves_scheduled_before, moves_scheduled_after); > If there shouldn't be any moves, then why ASSERT_GE(), not ASSERT_EQ()? if the flag to pause auto-rebalancing is set, the loop is still active (ie checking cluster status), just not getting or executing replica moves. It's possible that while auto-rebalancing is paused, the cluster becomes balanced. In this case, the variable "moves_scheduled_this_round" will be reset to 0. So it's possible that moves_scheduled_after < moves_scheduled_before. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@243 PS3, Line 243: FLAGS_auto_rebalancing_stop_flag = orig_flag; > I don't think this is
[kudu-CR] KUDU-2069 p4: stop replication from failed servers in maintenance mode
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14222 ) Change subject: KUDU-2069 p4: stop replication from failed servers in maintenance mode .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.cc File src/kudu/consensus/quorum_util.cc: http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.cc@456 PS2, Line 456: peer.attrs().replace() > I should have been more clear on what I said. Ah, just another observation. Probably, I should have mentioned that earlier for clarity. If I'm not mistaken, only tools like 'cluster rebalancer' and 'tablet change_config move_replica' set the 'REPLACE' attribute to make replica to be moved to some other tablet server. That corresponds to ChangeConfigType of MODIFY_PEER. The automatic re-replication in master uses ADD_PEER and REMOVE_PEER correspondingly. So, if we are about to ignore the REPLACE attribute when a tablet server is in the maintenance mode, that means a running session of the rebalancer tool of an explicit replica movement would stuck until the source tablet server is back into the regular (non-maintenance mode). Maybe it's worth double-checking what we want from the operability perspective here if ignoring the REPLACE attribute for tablet servers went into the maintenance mode: * allow for accumulation of those 'REPLACE' attributes set for replicas hosted by a tablet server in the maintenance mode (and be prepared for corresponding movements once the server is back into the normal mode) * explicitly disallow setting the 'REPLACE' attribute for replicas hosted by a tablet server in the maintenance mode Another alternative would be honoring the 'REPLACE' attribute even if a tablet server is in the maintenance mode, given that the presence of the 'REPLACE' attribute means it has been set explicitly, and it was not a result of the automatic re-replication activity. -- To view, visit http://gerrit.cloudera.org:8080/14222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db Gerrit-Change-Number: 14222 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 17 Sep 2019 00:38:48 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2069 p4: stop replication from failed servers in maintenance mode
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14222 ) Change subject: KUDU-2069 p4: stop replication from failed servers in maintenance mode .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc File src/kudu/integration-tests/maintenance_mode-itest.cc: http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@156 PS2, Line 156: TestMaintenanceModeDoesntRereplicate > Does it make sense to add a scenario to verify that it's possible to move r To clarify, my comment was about the manual movement of replicas from the node, e.g., when explicitly moving a replica using 'kudu tablet change_config move_replica'. We talked offline regarding another comment related to this question. So, if we are OK with introducing the restriction of not allowing replicas to be moved even explicitly, please ignore this comment. -- To view, visit http://gerrit.cloudera.org:8080/14222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db Gerrit-Change-Number: 14222 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 17 Sep 2019 00:07:46 + Gerrit-HasComments: Yes
[kudu-CR] [catalog manager] correct the wrong comment
honeyhexin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14233 ) Change subject: [catalog_manager] correct the wrong comment .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14233/1/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/14233/1/src/kudu/master/catalog_manager.h@141 PS1, Line 141: . > nit: while you are at it, maybe change this comma (,) into period (.) as we Done -- To view, visit http://gerrit.cloudera.org:8080/14233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Gerrit-Change-Number: 14233 Gerrit-PatchSet: 2 Gerrit-Owner: honeyhexin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: honeyhexin Gerrit-Comment-Date: Mon, 16 Sep 2019 23:38:55 + Gerrit-HasComments: Yes
[kudu-CR] [catalog manager] correct the wrong comment
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14233 ) Change subject: [catalog_manager] correct the wrong comment .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Gerrit-Change-Number: 14233 Gerrit-PatchSet: 2 Gerrit-Owner: honeyhexin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: honeyhexin Gerrit-Comment-Date: Mon, 16 Sep 2019 23:39:04 + Gerrit-HasComments: No
[kudu-CR] [catalog manager] correct the wrong comment
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14233 ) Change subject: [catalog_manager] correct the wrong comment .. [catalog_manager] correct the wrong comment Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Reviewed-on: http://gerrit.cloudera.org:8080/14233 Tested-by: Kudu Jenkins Reviewed-by: Alexey Serbin --- M src/kudu/master/catalog_manager.h 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Kudu Jenkins: Verified Alexey Serbin: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/14233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Gerrit-Change-Number: 14233 Gerrit-PatchSet: 3 Gerrit-Owner: honeyhexin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: honeyhexin
[kudu-CR] [thirdparty] add SO REUSEPORT for chronyd NTP socket
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14228 ) Change subject: [thirdparty] add SO_REUSEPORT for chronyd NTP socket .. [thirdparty] add SO_REUSEPORT for chronyd NTP socket This patch adds SO_REUSEPORT option for NTP server socket opened by chronyd. This is to allow for using the port reservation technique from external minicluster when starting chronyd test NTP servers. Change-Id: Iee26fcf93976dd7affe77254751016bcbf398620 Reviewed-on: http://gerrit.cloudera.org:8080/14228 Tested-by: Alexey Serbin Reviewed-by: Greg Solovyev Reviewed-by: Hao Hao Reviewed-by: Adar Dembo --- M thirdparty/download-thirdparty.sh A thirdparty/patches/chrony-reuseport.patch 2 files changed, 36 insertions(+), 2 deletions(-) Approvals: Alexey Serbin: Verified Greg Solovyev: Looks good to me, but someone else must approve Hao Hao: Looks good to me, but someone else must approve Adar Dembo: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/14228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iee26fcf93976dd7affe77254751016bcbf398620 Gerrit-Change-Number: 14228 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [thirdparty] add SO REUSEPORT for chronyd NTP socket
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14228 ) Change subject: [thirdparty] add SO_REUSEPORT for chronyd NTP socket .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee26fcf93976dd7affe77254751016bcbf398620 Gerrit-Change-Number: 14228 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 23:29:21 + Gerrit-HasComments: No
[kudu-CR] [thirdparty] add SO REUSEPORT for chronyd NTP socket
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14228 ) Change subject: [thirdparty] add SO_REUSEPORT for chronyd NTP socket .. Patch Set 1: > > > Could we get by with a patch to enable ephemeral port binding > > > instead? That's less invasive than this, which seems unlikely > to be > > > accepted upstream. > > > > Why do you think the ephemeral port binding change for chrony is > less invasive than adding SO_REUSEPORT (i.e. this small patch)? Do > you have some particular considerations w.r.t. SO_REUSEPORT change? > > > > From what I see in chrony's source code, changing its code to > properly support (and that's is needed if we talking about > accepting the patch upstream by chrony) binding to ephemeral port > would change many places there: interpretation of configuration > directive 'port' in configuraiton file, multiple places in code > that interpret port '0' as a special value meaning "don't open NTP > port at all", etc. By my understanding, that would be much more > invasive than this small patch. > > > > What do you think? > > I'll copy in what I wrote in Slack. > > When I wrote "invasive" I didn't mean that it'd require more or > less code change to chronyd; I was thinking about it being more > difficult to reason about chronyd and the other sockets bound on > the system. As I understand it, the purpose of SO_REUSEPORT is for > multi-threaded applications that wish to more evenly accept new > connections (or new datagrams, for UDP sockets). It's not for "port > reservation to enable testing" the way you're using it. If an > application binds with SO_REUSEPORT, another application running > with the same EUID can now accidentally bind to that address while > it's in use (if SO_REUSEPORT is fed to the second bind call). > Contrast this with ephemeral port usage, which is an age old > sockets concept and has "fire and forget" semantics: you bind to > port 0, you see what port you got, and at that point the semantics > are just like non-ephemeral port usage. > > In general, I think we should seek to minimize the number of third > party patches we carry, especially of the "upstream will never > accept this" variety. My concern is that SO_REUSEPORT in chronyd > won't be accepted because it's a little bit weird, and I think > that's less of an issue for ephemeral port usage. > > All that said, if it's going to be much more work to support > ephemeral ports, or if you think it's also nonsensical for reasons > I'm not seeing, then we can move forward with this instead. > > > Could we get by with a patch to enable ephemeral port binding > > > instead? That's less invasive than this, which seems unlikely > to be > > > accepted upstream. > > > > Why do you think the ephemeral port binding change for chrony is > less invasive than adding SO_REUSEPORT (i.e. this small patch)? Do > you have some particular considerations w.r.t. SO_REUSEPORT change? > > > > From what I see in chrony's source code, changing its code to > properly support (and that's is needed if we talking about > accepting the patch upstream by chrony) binding to ephemeral port > would change many places there: interpretation of configuration > directive 'port' in configuraiton file, multiple places in code > that interpret port '0' as a special value meaning "don't open NTP > port at all", etc. By my understanding, that would be much more > invasive than this small patch. > > > > What do you think? > > I'll copy in what I wrote in Slack. > > When I wrote "invasive" I didn't mean that it'd require more or > less code change to chronyd; I was thinking about it being more > difficult to reason about chronyd and the other sockets bound on > the system. As I understand it, the purpose of SO_REUSEPORT is for > multi-threaded applications that wish to more evenly accept new > connections (or new datagrams, for UDP sockets). It's not for "port > reservation to enable testing" the way you're using it. If an > application binds with SO_REUSEPORT, another application running > with the same EUID can now accidentally bind to that address while > it's in use (if SO_REUSEPORT is fed to the second bind call). > Contrast this with ephemeral port usage, which is an age old > sockets concept and has "fire and forget" semantics: you bind to > port 0, you see what port you got, and at that point the semantics > are just like non-ephemeral port usage. > > In general, I think we should seek to minimize the number of third > party patches we carry, especially of the "upstream will never > accept this" variety. My concern is that SO_REUSEPORT in chronyd > won't be accepted because it's a little bit weird, and I think > that's less of an issue for ephemeral port usage. > > All that said, if it's going to be much more work to support > ephemeral ports, or if
[kudu-CR] KUDU-2069 p4: stop replication from failed servers in maintenance mode
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14222 ) Change subject: KUDU-2069 p4: stop replication from failed servers in maintenance mode .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.cc File src/kudu/consensus/quorum_util.cc: http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.cc@456 PS2, Line 456: peer.attrs().replace() > Do we want to avoid replacing replicas from nodes in maintenance mode? I should have been more clear on what I said. My concern was that with this code moving replicas from a node put into the maintenance mode is impossible, even if it's done not because of automatic re-replication, but because of manual 'replica move' using kudu CLI. Andrew and I discussed that offline, and it seems the constraint of not allowing replica movement from tablet servers in the maintenance mode is inline with the intention. -- To view, visit http://gerrit.cloudera.org:8080/14222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db Gerrit-Change-Number: 14222 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 23:18:25 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2069 p4: stop replication from failed servers in maintenance mode
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14222 ) Change subject: KUDU-2069 p4: stop replication from failed servers in maintenance mode .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/ts_manager.h File src/kudu/master/ts_manager.h: http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/ts_manager.h@94 PS2, Line 94: void GetWhitelistedUuids(std::unordered_set* uuids) const; > Not sure I understand why that would be any better than generating the list If O(num tservers in maintenance mode) is relatively high, GetWhitelistedUuids will take a lot of locks. The acquisitions are one-at-a-time, but it's possible it's still computationally more expensive than ~3 TSManager lookups in ShouldAddReplica. With the change to only call GetWhitelistedUuids once per heartbeat vs. one per reported tablet, it's probably better than lookups in ShouldAddReplica _for full reports_. But what about incremental reports, which may only include a few tablets? As for the name, maybe it'd be clearer if you shifted the complexity to ShouldAddReplica by calling the method GetMaintenanceModeTServers. Then ShouldAddReplica gets to explain how/why MM affects rereplication, but since that area of code is already dealing with that concept, it's more natural for the full explanation to also live there. -- To view, visit http://gerrit.cloudera.org:8080/14222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db Gerrit-Change-Number: 14222 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 23:12:46 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2069 p1: add persistent tserver maintenance mode
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14217 ) Change subject: KUDU-2069 p1: add persistent tserver maintenance mode .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/14217/6/src/kudu/master/sys_catalog.cc File src/kudu/master/sys_catalog.cc: http://gerrit.cloudera.org:8080/#/c/14217/6/src/kudu/master/sys_catalog.cc@788 PS6, Line 788: CHECK_OK(SchemaToPB(schema_, req.mutable_schema())); : KuduPartialRow row(_); : CHECK_OK(row.SetInt8(kSysCatalogTableColType, TSERVER_STATE)); : CHECK_OK(row.SetStringNoCopy(kSysCatalogTableColId, tserver_id)); RETURN_NOT_OK for these? http://gerrit.cloudera.org:8080/#/c/14217/6/src/kudu/master/ts_manager.h File src/kudu/master/ts_manager.h: http://gerrit.cloudera.org:8080/#/c/14217/6/src/kudu/master/ts_manager.h@94 PS6, Line 94: Status SetTServerState(const std::string& ts_uuid, Doc. http://gerrit.cloudera.org:8080/#/c/14217/6/src/kudu/master/ts_manager.h@102 PS6, Line 102: Status ReloadTServerStates(SysCatalogTable* sys_catalog); Doc. http://gerrit.cloudera.org:8080/#/c/14217/6/src/kudu/master/ts_manager.h@122 PS6, Line 122: std::unordered_map ts_state_by_uuid_; We talked about this in person last week and I thought you were going to add TServerStatePB to TSDescriptor and reuse servers_by_id_ (after modifying TSDescriptor to support "persisted but not yet registered" descriptors). What changed your mind? http://gerrit.cloudera.org:8080/#/c/14217/4/src/kudu/master/ts_state.h File src/kudu/master/ts_state.h: http://gerrit.cloudera.org:8080/#/c/14217/4/src/kudu/master/ts_state.h@27 PS4, Line 27: : : : > You previously raised concerns about mixing on-disk enums with non-on-disk Yeah, I don't view the elision of certain special states from the overall set of states as a leak in the abstraction. But edge-triggered (ENTER_state) vs. level-triggered (IN_state) feels meaningful to me, which is why I advocated for different enums for RPC and persistence. -- To view, visit http://gerrit.cloudera.org:8080/14217 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib669b43b3cee171c4c7dbd54041e29c30cb9f767 Gerrit-Change-Number: 14217 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 16 Sep 2019 23:04:39 + Gerrit-HasComments: Yes
[kudu-CR] [catalog manager] correct the wrong comment
Hello Alexey Serbin, Kudu Jenkins, Andrew Wong, Adar Dembo, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14233 to look at the new patch set (#2). Change subject: [catalog_manager] correct the wrong comment .. [catalog_manager] correct the wrong comment Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 --- M src/kudu/master/catalog_manager.h 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/14233/2 -- To view, visit http://gerrit.cloudera.org:8080/14233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Gerrit-Change-Number: 14233 Gerrit-PatchSet: 2 Gerrit-Owner: honeyhexin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [catalog manager] correct the wrong comment
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14241 ) Change subject: [catalog_manager] correct the wrong comment .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8b5dd17e8bc5d7034ff078da05abd62c111bec8c Gerrit-Change-Number: 14241 Gerrit-PatchSet: 1 Gerrit-Owner: honeyhexin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 22:51:23 + Gerrit-HasComments: No
[kudu-CR] [catalog manager] correct the wrong comment
honeyhexin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14241 Change subject: [catalog_manager] correct the wrong comment .. [catalog_manager] correct the wrong comment Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Change-Id: I8b5dd17e8bc5d7034ff078da05abd62c111bec8c --- M src/kudu/master/catalog_manager.h 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/14241/1 -- To view, visit http://gerrit.cloudera.org:8080/14241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8b5dd17e8bc5d7034ff078da05abd62c111bec8c Gerrit-Change-Number: 14241 Gerrit-PatchSet: 1 Gerrit-Owner: honeyhexin
[kudu-CR] [thirdparty] add SO REUSEPORT for chronyd NTP socket
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14228 ) Change subject: [thirdparty] add SO_REUSEPORT for chronyd NTP socket .. Patch Set 1: > > Could we get by with a patch to enable ephemeral port binding > > instead? That's less invasive than this, which seems unlikely to be > > accepted upstream. > > Why do you think the ephemeral port binding change for chrony is less > invasive than adding SO_REUSEPORT (i.e. this small patch)? Do you have some > particular considerations w.r.t. SO_REUSEPORT change? > > From what I see in chrony's source code, changing its code to properly > support (and that's is needed if we talking about accepting the patch > upstream by chrony) binding to ephemeral port would change many places > there: interpretation of configuration directive 'port' in configuraiton > file, multiple places in code that interpret port '0' as a special value > meaning "don't open NTP port at all", etc. By my understanding, that would > be much more invasive than this small patch. > > What do you think? I'll copy in what I wrote in Slack. When I wrote "invasive" I didn't mean that it'd require more or less code change to chronyd; I was thinking about it being more difficult to reason about chronyd and the other sockets bound on the system. As I understand it, the purpose of SO_REUSEPORT is for multi-threaded applications that wish to more evenly accept new connections (or new datagrams, for UDP sockets). It's not for "port reservation to enable testing" the way you're using it. If an application binds with SO_REUSEPORT, another application running with the same EUID can now accidentally bind to that address while it's in use (if SO_REUSEPORT is fed to the second bind call). Contrast this with ephemeral port usage, which is an age old sockets concept and has "fire and forget" semantics: you bind to port 0, you see what port you got, and at that point the semantics are just like non-ephemeral port usage. In general, I think we should seek to minimize the number of third party patches we carry, especially of the "upstream will never accept this" variety. My concern is that SO_REUSEPORT in chronyd won't be accepted because it's a little bit weird, and I think that's less of an issue for ephemeral port usage. All that said, if it's going to be much more work to support ephemeral ports, or if you think it's also nonsensical for reasons I'm not seeing, then we can move forward with this instead. -- To view, visit http://gerrit.cloudera.org:8080/14228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iee26fcf93976dd7affe77254751016bcbf398620 Gerrit-Change-Number: 14228 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 22:50:16 + Gerrit-HasComments: No
[kudu-CR] Add get table statistics interface for cpp client
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14218 ) Change subject: Add get table statistics interface for cpp client .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14218 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic00f46fd49c258f46a178a92d142d4d93d615d82 Gerrit-Change-Number: 14218 Gerrit-PatchSet: 2 Gerrit-Owner: ZhangYao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: ZhangYao Gerrit-Comment-Date: Mon, 16 Sep 2019 18:48:53 + Gerrit-HasComments: No
[kudu-CR] Add get table statistics interface for cpp client
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14218 ) Change subject: Add get table statistics interface for cpp client .. Add get table statistics interface for cpp client Change-Id: Ic00f46fd49c258f46a178a92d142d4d93d615d82 Reviewed-on: http://gerrit.cloudera.org:8080/14218 Tested-by: Kudu Jenkins Reviewed-by: Adar Dembo --- M src/kudu/client/client-test.cc M src/kudu/client/client.cc M src/kudu/client/client.h M src/kudu/client/master_proxy_rpc.cc A src/kudu/client/table_statistics-internal.h 5 files changed, 150 insertions(+), 0 deletions(-) Approvals: Kudu Jenkins: Verified Adar Dembo: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/14218 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic00f46fd49c258f46a178a92d142d4d93d615d82 Gerrit-Change-Number: 14218 Gerrit-PatchSet: 3 Gerrit-Owner: ZhangYao Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: ZhangYao
[kudu-CR] [tserver] include ip:port in the tserver name
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14232 ) Change subject: [tserver] include ip:port in the tserver name .. Patch Set 2: (1 comment) The TSAN failure is interesting. It's not related to this patch, but it is related to one of your previous patches, Lifu. In ts_tablet_manager-itest.cc: // 6. Restart the masters. { for (int i = 0; i < kNumMasters; ++i) { int idx = 0; ASSERT_OK(cluster_->GetLeaderMasterIndex()); master::MiniMaster* mini_master = CHECK_NOTNULL(cluster_->mini_master(idx)); mini_master->Shutdown(); SleepFor(MonoDelta::FromMilliseconds(kMaxElectionTime)); ASSERT_OK(mini_master->Restart()); // Sometimes the election fails until the node restarts. // And the restarted node is elected leader again. // So, it is necessary to wait for all tservers to report. SleepFor(MonoDelta::FromMilliseconds(FLAGS_heartbeat_interval_ms)); NO_FATALS(CheckStats(kRowsCount)); } } I think there needs to be a call to mini_master->WaitForCatalogManagerInit() somewhere, perhaps after the Restart(), or after the waiting. http://gerrit.cloudera.org:8080/#/c/14232/2/src/kudu/tserver/tablet_server.cc File src/kudu/tserver/tablet_server.cc: http://gerrit.cloudera.org:8080/#/c/14232/2/src/kudu/tserver/tablet_server.cc@69 PS2, Line 69: if (kRunning != state_) { : return "TabletServer (stopped)"; : } : return strings::Substitute("TabletServer@$0", first_rpc_address().ToString()); > Maybe, this is a good opportunity to consolidate implementation of correspo +1. No reason why masters shouldn't also report this; it's useful in multi-master deployments. -- To view, visit http://gerrit.cloudera.org:8080/14232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e Gerrit-Change-Number: 14232 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 18:47:33 + Gerrit-HasComments: Yes
[kudu-CR] [catalog manager] correct the wrong comment
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14233 ) Change subject: [catalog_manager] correct the wrong comment .. Patch Set 1: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/14233/1/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/14233/1/src/kudu/master/catalog_manager.h@141 PS1, Line 141: , nit: while you are at it, maybe change this comma (,) into period (.) as well? -- To view, visit http://gerrit.cloudera.org:8080/14233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Gerrit-Change-Number: 14233 Gerrit-PatchSet: 1 Gerrit-Owner: honeyhexin Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 16:49:33 + Gerrit-HasComments: Yes
[kudu-CR] [tserver] include ip:port in the tserver name
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14232 ) Change subject: [tserver] include ip:port in the tserver name .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/14232/2/src/kudu/tserver/tablet_server.cc File src/kudu/tserver/tablet_server.cc: http://gerrit.cloudera.org:8080/#/c/14232/2/src/kudu/tserver/tablet_server.cc@69 PS2, Line 69: if (kRunning != state_) { : return "TabletServer (stopped)"; : } : return strings::Substitute("TabletServer@$0", first_rpc_address().ToString()); Maybe, this is a good opportunity to consolidate implementation of corresponding methods for both Master and TabletServer into kserver::KuduServer::ToString() ? -- To view, visit http://gerrit.cloudera.org:8080/14232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e Gerrit-Change-Number: 14232 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Mon, 16 Sep 2019 16:42:35 + Gerrit-HasComments: Yes
[kudu-CR] WIP KUDU-2780: create thread for auto-rebalancing
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 ) Change subject: WIP KUDU-2780: create thread for auto-rebalancing .. Patch Set 3: (15 comments) http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@58 PS3, Line 58: if (cluster_) nit: it's a good practice to use scope braces for 'if ()' clause even it's a single-line statement (sprawls to multi-line in this case). http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@85 PS3, Line 85: int kNumMasters = 3; : int kNumTservers = 3; : int kNumTablets = 4; nit: these are constants, right? Add const/constexpr if so. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@96 PS3, Line 96: 10 10+ seconds run-time for a test makes it a 'slow' test. Add a notion of running this test only when KUDU_ALLOW_SLOW_TESTS=1 is set; you can see how it's done in other tests. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@101 PS3, Line 101: if (i==leader_idx) ASSERT_NE(0, number_of_loop_iterations(i)); : else ASSERT_EQ(0, moves_scheduled_this_round(i)); Would this test fail if master leadership changes while the test is running? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@104 PS3, Line 104: nit: an extra space http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@111 PS3, Line 111: int kNumMasters = 3; : int kNumTservers = 3; : int kNumTablets = 3; Add const/constexpr. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@132 PS3, Line 132: auto iterations = number_of_loop_iterations(new_leader_idx); : ASSERT_LT(0, iterations); Might it happen that the rebalancing thread hasn't yet after master leader elected? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@137 PS3, Line 137: TEST_F(AutoRebalancerTest, NoReplicaMovesIfBalanced) { Could it be possible to implement these scenarios in form of unit testing of some particular methods/function of the rebalancer? That way it would give much more control over the replica placement, etc. Also, it would be easier and faster to run. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@139 PS3, Line 139: kNumTservers, kNumTablets; These are not constants, but 'kXxx' is used only for constants. Change the naming. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@214 PS3, Line 214: auto-rebalancing flag is paused nit: how does one pause a flag? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@241 PS3, Line 241: ASSERT_GE(moves_scheduled_before, moves_scheduled_after); If there shouldn't be any moves, then why ASSERT_GE(), not ASSERT_EQ()? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer-test.cc@243 PS3, Line 243: FLAGS_auto_rebalancing_stop_flag = orig_flag; I don't think this is necessary -- KuduTest does that automatically using google::FlagSaver http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h File src/kudu/master/auto_rebalancer.h: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.h@81 PS3, Line 81: // Variables for testing. : int number_of_loop_iterations; : int moves_scheduled_this_round; If this is for testing, move then under 'private' section and move those tests friends of this class. Another approach might be introducing metrics for these readings: look around for METRIC_DEFINE_gauge_ in other components for examples. http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@120 PS3, Line 120: DEFINE_uint32(auto_rebalancing_max_moves_per_iteration, 5, : "Maximum number of total replica moves to perform during one " : "iteration of the auto-rebalancing loop."); If I'm not mistaken, in the original rebalancer code constant 5 is used as a factor multiplied by maximum move operations per tserver, and that's where that 'magic' constant came from. The minimum number of operations per server was 2, so effectively this parameter was scaling along with number of nodes in the cluster, which is 'natural'. I think it would be nice to have some sort of relative constraint of the same sort for the auto-rebalancer as well, otherwise it would be necessary to tweak this constant on every
[kudu-CR] [catalog manager] correct the wrong comment
honeyhexin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14233 Change subject: [catalog_manager] correct the wrong comment .. [catalog_manager] correct the wrong comment Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 --- M src/kudu/master/catalog_manager.h 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/33/14233/1 -- To view, visit http://gerrit.cloudera.org:8080/14233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4071923dc052b6b829ddc25a8a0392ad8d1767e6 Gerrit-Change-Number: 14233 Gerrit-PatchSet: 1 Gerrit-Owner: honeyhexin
[kudu-CR] [tserver] include ip:port in the tserver name
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14232 to look at the new patch set (#2). Change subject: [tserver] include ip:port in the tserver name .. [tserver] include ip:port in the tserver name Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e --- M src/kudu/tserver/tablet_server.cc M src/kudu/tserver/tablet_server.h 2 files changed, 19 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/32/14232/2 -- To view, visit http://gerrit.cloudera.org:8080/14232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e Gerrit-Change-Number: 14232 Gerrit-PatchSet: 2 Gerrit-Owner: helifu Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [tserver] include ip:port in the tserver name
helifu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14232 Change subject: [tserver] include ip:port in the tserver name .. [tserver] include ip:port in the tserver name Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e --- M src/kudu/tserver/tablet_server.cc M src/kudu/tserver/tablet_server.h 2 files changed, 18 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/32/14232/1 -- To view, visit http://gerrit.cloudera.org:8080/14232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic6b0d8e29b6cac5a900923b881f0d7a74facbd6e Gerrit-Change-Number: 14232 Gerrit-PatchSet: 1 Gerrit-Owner: helifu