[kudu-CR] [docs] Fix code block formatting

2019-09-29 Thread Anonymous Coward (Code Review)
lingbi...@gmail.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14320


Change subject: [docs] Fix code block formatting
..

[docs] Fix code block formatting

Change-Id: Ibc54ba74ee58e1c601bf7e60a3a51c6d48659345
---
M docs/design-docs/tablet.md
1 file changed, 8 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/14320/1
--
To view, visit http://gerrit.cloudera.org:8080/14320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibc54ba74ee58e1c601bf7e60a3a51c6d48659345
Gerrit-Change-Number: 14320
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward 


[kudu-CR] [clock] add a built-in NTP client implementation

2019-09-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new patch set (#21) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/7477 )

Change subject: [clock] add a built-in NTP client implementation
..

[clock] add a built-in NTP client implementation

This adds a stripped-down implementation of built-in NTP client
without any reliance on the kernel NTP discipline.  This initial
implementation doesn't have all the necessary sanity checks for true
time that RFC-compliant client would have and doesn't implement
everything from RFC5905.  However, this implementation is good enough
for running with well behaved and properly configured reference NTP
servers such as servers from pool.ntp.org, other public-domain NTP
services, and properly configured NTP servers in local network.

This is a first step on the road to eventually have robust and
RFC-compliant Kudu's own NTP client that is able to detect and
reject misbehaving and rogue reference NTP servers.

For Kudu, this should hopefully make it easier for users to configure
NTP even if they don't have root, and also can maintain better clock
error than the system implementation, since we can prioritize low error
bounds rather than low jitter.  This patch doesn't have the knobs
to control error vs jitter yet, though.

This patch also contains tests to verify the newly introduced
functionality in scenarios when the built-in NTP client is pointed
to various combinations of properly configured NTP servers or
unintentionally misconfigured/unavailable ones.

Change-Id: Ieb9eee9f0334236f39617492bd6f01304d1a0255
---
M src/kudu/clock/CMakeLists.txt
A src/kudu/clock/builtin_ntp.cc
A src/kudu/clock/builtin_ntp.h
M src/kudu/clock/hybrid_clock.cc
A src/kudu/clock/ntp-test.cc
5 files changed, 1,662 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77/7477/21
--
To view, visit http://gerrit.cloudera.org:8080/7477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ieb9eee9f0334236f39617492bd6f01304d1a0255
Gerrit-Change-Number: 7477
Gerrit-PatchSet: 21
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] add a tool to create table

2019-09-29 Thread YangSong (Code Review)
YangSong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14306 )

Change subject: add a tool to create table
..


Patch Set 5:

(2 comments)

The test code can define a new function to avoid repeating the same code over 
and over.

http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@937
PS5, Line 937:   ParseValueOfType(column.default_value(),
> Wrap in RETURN_NOT_OK.
I find there is a problem here. I can't use  ParseValueOfType derictly, there 
are two problems:
1. the string should convert to list, like ["1"] or [1]
2. in the JSON object, the default value's type is always string, but if I 
directly convert it to list, like "[1]" as the first parameter, it only work at 
the type of int, if the type is string, we should convert "1" to "[\"1\"]". 
Here I add a piece of code:
  if (column.column_type() == "STRING" OR "BINARY") {
default_v = "[\"" + column.default_value() + "\"]";
  } else {
default_v = "[" + column.default_value() + "]";
  }
Do you think this is right? Or better advice. ConvertToKuduPartialRow has the 
same problem.


http://gerrit.cloudera.org:8080/#/c/14306/5/src/kudu/tools/tool_action_table.cc@970
PS5, Line 970: for (const auto& value : 
bound.lower_bound().bound_values()) {
 :   values += value;
 :   values += ",";
 : }
> You should be able to use a function from gutil/strings/join.h here and bel
I try to modify like this:
  string tmp = JoinKeysIterator(
  bound.lower_bound().bound_values().begin(),
  bound.lower_bound().bound_values().end(), ",");
but error reported at compile time. Maybe the RepeatedPtrField iterator is not 
support.



--
To view, visit http://gerrit.cloudera.org:8080/14306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1
Gerrit-Change-Number: 14306
Gerrit-PatchSet: 5
Gerrit-Owner: YangSong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong 
Gerrit-Comment-Date: Sun, 29 Sep 2019 10:00:20 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2069 p5: recheck tablet health when exiting maintenance mode

2019-09-29 Thread Andrew Wong (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14223

to look at the new patch set (#6).

Change subject: KUDU-2069 p5: recheck tablet health when exiting maintenance 
mode
..

KUDU-2069 p5: recheck tablet health when exiting maintenance mode

Previously, when exiting maintenance mode for a given tserver, if the
replicas of that tserver were unhealthy, there was no mechanism with
which to guarantee that the proper re-replication would happen.

Specifically, the following sequence of events was possible:
1. tablet T has replicas on tservers A, B*, C
2. A enters maintenance mode
3. A is shut down
4. enough time passes for B* to consider A as failed
5. B* notices the failure of A and reports to the master that replica A
   has failed
6. the master does nothing to schedule re-replication because A is in
   maintenance mode
7. A exits maintenance mode, but is not brought back online
8. B* never hears back from A, and never hits a health state change to
   report to the master, and so the master never "rechecks" the health
   of T
9. T is left under-replicated with only B* and C

Note: The set of tservers we need to recheck is the set that hosted a
leader of any replica on A.

This patch addresses this by requesting a full tablet report from every
tserver in the cluster upon exiting maintenance mode on any tserver.

Testing:
- this adds to the existing integration test for maintenance mode to
  exercise the new behavior
- amends an existing concurrency test to verify the correct locking
  behavior is used

Change-Id: Ic0ab3d78cbc5b1228c01592a00118f11f76e43dd
---
M src/kudu/integration-tests/maintenance_mode-itest.cc
M src/kudu/master/master_service.cc
M src/kudu/master/ts_descriptor.cc
M src/kudu/master/ts_descriptor.h
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
M src/kudu/master/ts_state-test.cc
7 files changed, 74 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/23/14223/6
--
To view, visit http://gerrit.cloudera.org:8080/14223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic0ab3d78cbc5b1228c01592a00118f11f76e43dd
Gerrit-Change-Number: 14223
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-2069 p4: stop replication from failed servers in maintenance mode

2019-09-29 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14222 )

Change subject: KUDU-2069 p4: stop replication from failed servers in 
maintenance mode
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/consensus/quorum_util-test.cc
File src/kudu/consensus/quorum_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/consensus/quorum_util-test.cc@645
PS5, Line 645:   // Even if the replica to replace is meant to be ignored on 
failure, we
 :   // should honor the replacement and try to add a replica.
 :   for (char health : { '+', '-', '?' }) {
 : RaftConfigPB config;
 : AddPeer(, "A", V, health, {{"REPLACE", true}});
 : AddPeer(, "B", V, '+');
 : AddPeer(, "C", V, '+');
 : EXPECT_TRUE(ShouldAddReplica(config, 3, { "A" }));
 :   }
 :   {
 : RaftConfigPB config;
 : AddPeer(, "A", V, '+', {{"REPLACE", true}});
 : AddPeer(, "B", V, '-');
 : AddPeer(, "C", V, '+');
 : // Ignoring failures shouldn't impede our ability to add a 
replica when the
 :
> nit: maybe, put this under for () cycle to iterate over the health status o
Done


http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/consensus/quorum_util-test.cc@675
PS5, Line 675: p
> Does it make sense to add coverage for other health statuses as well (unkno
Done


http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/consensus/quorum_util-test.cc@684
PS5, Line 684:   EXPECT_FALSE(ShouldAddReplica(config, 5, { "A", "B" }));
 :   // But when there is a failure that isn't suppose
> ditto
Done


http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/consensus/quorum_util.cc
File src/kudu/consensus/quorum_util.cc:

http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/consensus/quorum_util.cc@455
PS5, Line 455: if (VLOG_IS_ON(1) && ignore_failure_for_underreplication) {
 :   VLOG(1) << Substitute("ignoring $0 if failed", peer_uuid);
 : }
> Does it deserve VLOG(1) it's not important at all?
Done


http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/integration-tests/maintenance_mode-itest.cc
File src/kudu/integration-tests/maintenance_mode-itest.cc:

http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/integration-tests/maintenance_mode-itest.cc@269
PS5, Line 269:   ASSERT_OK(ChangeTServerState(maintenance_uuid, 
TServerStateChangePB::EXIT_MAINTENANCE_MODE));
> Thank you for adding this scenario.
Ack


http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/integration-tests/maintenance_mode-itest.cc@379
PS5, Line 379:   void SetUp() override {
 : SKIP_IF_SLOW_NOT_ALLOWED();
 : NO_FATALS(MaintenanceModeITest::SetUp());
 : NO_FATALS(SetUpCluster(5));
> Could it happen that replicas at the server fell behind WAL GC threshold si
Good point. I've separated the concern by disabling log GC and testing for 
FAILED_UNRECOVERABLE sepcifically using disk failures.


http://gerrit.cloudera.org:8080/#/c/14222/5/src/kudu/integration-tests/maintenance_mode-itest.cc@385
PS5, Line 385: SKIP_IF_SLOW_NOT_ALLOWED();
> Does it make sense to verify the number of written rows after stopping the
Done



--
To view, visit http://gerrit.cloudera.org:8080/14222
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db
Gerrit-Change-Number: 14222
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sun, 29 Sep 2019 06:18:15 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2069 p4: stop replication from failed servers in maintenance mode

2019-09-29 Thread Andrew Wong (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14222

to look at the new patch set (#6).

Change subject: KUDU-2069 p4: stop replication from failed servers in 
maintenance mode
..

KUDU-2069 p4: stop replication from failed servers in maintenance mode

When determining whether a replica needs to be added, we may now
consider a set of of UUIDs that are allowed to be in a "bad" state while
not counting towards being under-replicated.

Since the goal of maintenance mode is to cope with unexpected failures,
"healthy" movement, e.g. through tooling that uses REPLACE and PROMOTE
tagging, is still allowed to and from tservers in maintenance mode.

Testing:
- a unit test is added to exercise the new quorum logic to ignore
  certain failed UUIDs, taking into account REPLACE and PROMOTE replicas
- integration tests are added to test:
  - behavior with RF=3 through restarts of the master
  - behavior when running move_replica tooling
  - behavior with RF=5 with background failures

Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db
---
M src/kudu/consensus/quorum_util-test.cc
M src/kudu/consensus/quorum_util.cc
M src/kudu/consensus/quorum_util.h
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/maintenance_mode-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/ts_manager.cc
M src/kudu/master/ts_manager.h
8 files changed, 657 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/22/14222/6
--
To view, visit http://gerrit.cloudera.org:8080/14222
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db
Gerrit-Change-Number: 14222
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)