[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..

KUDU-1097 (patch1): test for replica health reporting

Added a test to verify that the leader tablet replica reports on the
replica health changes. The tests verifies that the health reports
are present in the Raft consensus state reported by the leader replica.
The test also verifies that the incremental tablet reports contain
appropriate information once the replica health status changes.

Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Reviewed-on: http://gerrit.cloudera.org:8080/8642
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
1 file changed, 291 insertions(+), 49 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362: reports
> Ah, I see.  That's appro
I meant that's an interesting approach.  Let me put that then for the 
documentation purposes.



--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 27 Nov 2017 06:29:41 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 2:

I'm ok with this, let's push it


--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 27 Nov 2017 06:29:17 +
Gerrit-HasComments: No


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362: reports
> I see default param as nullptr as documentation that the arg is optional bu
Ah, I see.  That's appro


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@476
PS1, Line 476:   // The scenario below assumes the leader replica does not 
change anymore.
 :   FLAGS_enable_leader_failure_detection = false;
> I'm not sure it's truly safe but maybe it's safe enough for this kind of te
I meant if the assertions for the status fail, then we need to retry the whole 
scenario with the state change, in my understanding -- putting the only code 
below would not help much in that case.  That would require to put under 
ASSERT_EVENTUALLY almost all the scenario, which could hide some bugs.

I would prefer to keep it as it is or just rewrite it, disabling leader failure 
detection in the very beginning of the test.

Let's choose the first option for a while, if you accept it's more or less safe 
for this type of test.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@489
PS1, Line 489: const HealthReportPB& report(p.health_report());
> I don't think that's true. The copy constructor will get invoked and the in
ok, we talked about that offline, and it seems the code below can clarify a bit 
on this (compile with -O0 to disable optimizations):

#include 

using namespace std;

class A {
 public:
   A()
   : a_(0) {
 cout << "A::A()" << endl;
   }

   A(const A& a) {
 a_ = a.a_;
 cout << "A::A()" << endl;
   }

 private:
  int a_;
};

int main() {
  A a;
  const A& b = a;
  const A& c(a);
  A& d(a);

  return 0;
}



--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 27 Nov 2017 06:27:38 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@489
PS1, Line 489: he scenario below assumes the leader replica doe
> I don't think that's true. The copy constructor will get invoked and the in
OK, apparently you were right :)



--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 27 Nov 2017 06:15:01 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 2: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362: eports.
> This does not make much sense because this closure is never invoked with tw
I see default param as nullptr as documentation that the arg is optional but 
I'm ok with this


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@476
PS1, Line 476: map reports;
 : NO_FATALS(get_health_reports());
> I saw that's how it's done in TsTabletManagerITest::TestReportNewLeaderOnLe
I'm not sure it's truly safe but maybe it's safe enough for this kind of 
test... I don't see how ASSERT_EVENTUALLY would be brittle but I can live with 
it


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@489
PS1, Line 489: he scenario below assumes the leader replica doe
> It's a constructor for a reference.  There is no difference here -- in both
I don't think that's true. The copy constructor will get invoked and the 
instance will be assigned to a temporary and "report" will act as a reference 
to that temporary. Actually I guess this part of C++ is a bit magical for me. 
const string& s = "blah" should act the same way.

That said, it's a test and not a very big deal.



--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 27 Nov 2017 05:58:27 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Alexey Serbin (Code Review)
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8642

to look at the new patch set (#2).

Change subject: KUDU-1097 (patch1): test for replica health reporting
..

KUDU-1097 (patch1): test for replica health reporting

Added a test to verify that the leader tablet replica reports on the
replica health changes. The tests verifies that the health reports
are present in the Raft consensus state reported by the leader replica.
The test also verifies that the incremental tablet reports contain
appropriate information once the replica health status changes.

Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
---
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
1 file changed, 291 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/8642/2
--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 1:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@124
PS1, Line 124:   Status PrepareTabletReplicas(MonoDelta timeout,
> nit: doc these functions
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362:   auto get_health_reports = [&](map* 
reports,
> nit: doc this closure
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362: reports
> nit: if this is optional, how about we default it to nullptr
This does not make much sense because this closure is never invoked with two 
parameters set to the default values.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@366
PS1, Line 366: e
> nit: usually when we use "e" when iterating it's for an entry in a map, but
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@367
PS1, Line 367: con
> nit: mind spelling out "consensus" here?
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@369
PS1, Line 369: cs.has_leader_uuid()
> nit: no need to check has_leader_uuid(), since if it's not set it will retu
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@378
PS1, Line 378: config
> nit: perhaps name this committed_config as to avoid any confusion about whi
This is just a local shortcut for the next 10 lines, so I think this should not 
bring a confusion -- it's easy to see what the 'config' is.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@392
PS1, Line 392:   auto get_committed_config_from_reports = [&](const string& 
leader_replica_uuid,
> nit: doc this closure
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@409
PS1, Line 409: ;
> typo
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@410
PS1, Line 410: const TabletReportPB& report = reports[0];
 : ASSERT_EQ(1, reports.size());
> the order of these lines should be reversed
good catch!


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@432
PS1, Line 432: EXPECT_EQ
> EXPECT_EQ will not trigger an ASSERT_EVENTUALLY failure; consider using ASS
Good catch!  Exactly: EXPECT_XX macros do not generate fatal test failures.  It 
seems I just moved those under the ASSERT_EVENTUALLY, not thinking about that.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@469
PS1, Line 469: EXPECT_EQ
> ASSERT_EQ
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@471
PS1, Line 471: EXPECT_EQ
> ASSERT_EQ
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@476
PS1, Line 476:   // The scenario below assumes the leader replica does not 
change anymore.
 :   FLAGS_enable_leader_failure_detection = false;
> This isn't set specified as a runtime flag and i'm not sure it's really saf
I saw that's how it's done in 
TsTabletManagerITest::TestReportNewLeaderOnLeaderChange above -- they call this 
after starting the cluster already.  So, I assumed it was safe enough.

I also looked into the code, and it seemed safe to change this in run-time.  
Probably, that's because it has changed recently: we don't use dedicated thread 
for that anymore, so toggling this is not error-prone anymore.

Wrapping the code below in ASSERT_EVENTUALLY would be brittle, IMO.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@485
PS1, Line 485: p
> nit: mind just spelling out "peer" here?
Done


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@489
PS1, Line 489: const HealthReportPB& report(p.health_report());
> nit: why copy constructor? why not just get a ref like:
It's a constructor for a reference.  There is no difference here -- in both 
cases it will be a reference, no copy constructor should be invoked.  It's just 
one symbol less.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@491
PS1, Line 491: EXPECT_EQ
> nit: if we wrap this in ASSERT_EVENTUALLY then we will have to change this
I think we can get away with changing the 

[kudu-CR] KUDU-1097 (patch1): test for replica health reporting

2017-11-26 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8642 )

Change subject: KUDU-1097 (patch1): test for replica health reporting
..


Patch Set 1:

(17 comments)

Looks good. Thanks for writing this test for the patch 1 functionality. I have 
some nitpicky feedback and that's about it.

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc
File src/kudu/integration-tests/ts_tablet_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@124
PS1, Line 124:   Status PrepareTabletReplicas(MonoDelta timeout,
nit: doc these functions


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362:   auto get_health_reports = [&](map* 
reports,
nit: doc this closure


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@362
PS1, Line 362: reports
nit: if this is optional, how about we default it to nullptr


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@366
PS1, Line 366: e
nit: usually when we use "e" when iterating it's for an entry in a map, but 
this container is just a vector, so we should just use "replica" or something 
similar for this variable name.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@367
PS1, Line 367: con
nit: mind spelling out "consensus" here?


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@369
PS1, Line 369: cs.has_leader_uuid()
nit: no need to check has_leader_uuid(), since if it's not set it will return 
the default instance which is an empty string and thus won't match peer_uuid()


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@378
PS1, Line 378: config
nit: perhaps name this committed_config as to avoid any confusion about which 
config it is


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@392
PS1, Line 392:   auto get_committed_config_from_reports = [&](const string& 
leader_replica_uuid,
nit: doc this closure


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@409
PS1, Line 409: ;
typo


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@410
PS1, Line 410: const TabletReportPB& report = reports[0];
 : ASSERT_EQ(1, reports.size());
the order of these lines should be reversed


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@432
PS1, Line 432: EXPECT_EQ
EXPECT_EQ will not trigger an ASSERT_EVENTUALLY failure; consider using 
ASSERT_EQ here


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@469
PS1, Line 469: EXPECT_EQ
ASSERT_EQ


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@471
PS1, Line 471: EXPECT_EQ
ASSERT_EQ


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@476
PS1, Line 476:   // The scenario below assumes the leader replica does not 
change anymore.
 :   FLAGS_enable_leader_failure_detection = false;
This isn't set specified as a runtime flag and i'm not sure it's really safe to 
change at runtime.

Something we often to do in these situations is just wrap an 
ASSERT_EVENTUALLY() around any multi-step test sections.


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@485
PS1, Line 485: p
nit: mind just spelling out "peer" here?


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@489
PS1, Line 489: const HealthReportPB& report(p.health_report());
nit: why copy constructor? why not just get a ref like:

  const HealthReportPB& report = p.health_report();


http://gerrit.cloudera.org:8080/#/c/8642/1/src/kudu/integration-tests/ts_tablet_manager-itest.cc@491
PS1, Line 491: EXPECT_EQ
nit: if we wrap this in ASSERT_EVENTUALLY then we will have to change this one 
and the one below it to ASSERT_EQ



--
To view, visit http://gerrit.cloudera.org:8080/8642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie62b49efebad9a123eec51dd302e375e46e0682d
Gerrit-Change-Number: 8642
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Mon, 27 Nov 2017 03:21:43 +
Gerrit-HasComments: Yes