[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 5: Verified+1

Overriding Jenkins, unrelated TSAN failure.


--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 30 Oct 2018 04:20:12 +
Gerrit-HasComments: No


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), 
actual: 0 vs 0

Unfortunately, the test would go on to crash (and trigger a TSAN warning)
due to the lack of proper cleanup in the event of an ASSERT failure. I've
fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound
transfer queue contains at least one element), but I couldn't find a good
way to do it. Blocking the server reactor thread has no effect on
client-side queuing. And we can't block the client reactor thread outright
because DumpRunningRpcs runs on it. Some of this is touched on in the
original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's
ham-fisted for sure, but it seems to work: without it, the test fails every
100-200 runs on my laptop, and with it I can't get it to fail at all. I also
looped it 1000 times in TSAN mode with 8 stress threads and didn't see any
failures. I don't understand the krpc subsystem very well, so if there's a
better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Reviewed-on: http://gerrit.cloudera.org:8080/11819
Reviewed-by: Alexey Serbin 
Tested-by: Adar Dembo 
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 43 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Adar Dembo: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 30 Oct 2018 03:58:02 +
Gerrit-HasComments: No


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG@14
PS4, Line 14: on
> nit: on
Done



--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 30 Oct 2018 03:41:51 +
Gerrit-HasComments: Yes


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#5).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), 
actual: 0 vs 0

Unfortunately, the test would go on to crash (and trigger a TSAN warning)
due to the lack of proper cleanup in the event of an ASSERT failure. I've
fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound
transfer queue contains at least one element), but I couldn't find a good
way to do it. Blocking the server reactor thread has no effect on
client-side queuing. And we can't block the client reactor thread outright
because DumpRunningRpcs runs on it. Some of this is touched on in the
original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's
ham-fisted for sure, but it seems to work: without it, the test fails every
100-200 runs on my laptop, and with it I can't get it to fail at all. I also
looped it 1000 times in TSAN mode with 8 stress threads and didn't see any
failures. I don't understand the krpc subsystem very well, so if there's a
better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/5
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG@14
PS4, Line 14: one
nit: on



--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 30 Oct 2018 01:23:11 +
Gerrit-HasComments: Yes


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:23:32 +
Gerrit-HasComments: No


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 4: Verified+1

Overriding Jenkins, unrelated test failure.


--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:08:26 +
Gerrit-HasComments: No


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#4).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), 
actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning)
due to the lack of proper cleanup in the event of an ASSERT failure. I've
fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound
transfer queue contains at least one element), but I couldn't find a good
way to do it. Blocking the server reactor thread has no effect on
client-side queuing. And we can't block the client reactor thread outright
because DumpRunningRpcs runs on it. Some of this is touched on in the
original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's
ham-fisted for sure, but it seems to work: without it, the test fails every
100-200 runs on my laptop, and with it I can't get it to fail at all. I also
looped it 1000 times in TSAN mode with 8 stress threads and didn't see any
failures. I don't understand the krpc subsystem very well, so if there's a
better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/4
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#3).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), 
actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning)
due to the lack of proper cleanup in the event of an ASSERT failure. I've
fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound
transfer queue contains at least one element), but I couldn't find a good
way to do it. Blocking the server reactor thread has no effect on
client-side queuing. And we can't block the client reactor thread outright
because DumpRunningRpcs runs on it. Some of this is touched on in the
original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's
ham-fisted for sure, but it seems to work: without it, the test fails every
100-200 runs on my laptop, and with it I can't get it to fail at all. I also
looped it 1000 times in TSAN mode with 8 stress threads and didn't see any
failures. I don't understand the krpc subsystem very well, so if there's a
better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/3
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#2).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), 
actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning)
due to the lack of proper cleanup in the event of an ASSERT failure. I've
fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound
transfer queue contains at least one element), but I couldn't find a good
way to do it. Blocking the server reactor thread has no effect on
client-side queuing. And we can't block the client reactor thread outright
because DumpRunningRpcs runs on it. Some of this is touched on in the
original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's
ham-fisted for sure, but it seems to work: without it, the test fails every
100-200 runs on my laptop, and with it I can't get it to fail at all. I also
looped it 1000 times in TSAN mode with 8 stress threads and didn't see any
failures. I don't understand the krpc subsystem very well, so if there's a
better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/2
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@521
PS1, Line 521: int n_calls = 1000
> nit: add constexpr ?
Done


http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@552
PS1, Line 552: // Verify that all the RPCs have finished.
 : for (const auto& controller : controllers) {
 :   ASSERT_TRUE(controller->finished());
 : }
> nit: does it make sense to move this piece out of the ASSERT_EVENTUALLY() a
Can't; the controllers are defined within the ASSERT_EVENTUALLY, and since they 
need to be replaced with each call, it'd be messy to decouple them from the 
calls themselves.



--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:38:43 +
Gerrit-HasComments: Yes


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..


Patch Set 1:

(2 comments)

Looks good, just a couple of nits.

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@521
PS1, Line 521: int n_calls = 1000
nit: add constexpr ?


http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@552
PS1, Line 552: // Verify that all the RPCs have finished.
 : for (const auto& controller : controllers) {
 :   ASSERT_TRUE(controller->finished());
 : }
nit: does it make sense to move this piece out of the ASSERT_EVENTUALLY() after 
wait at latch is done?  This should not fail if the condition of the 
ASSERT_GT() at L546 is true, right?



--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:37:28 +
Gerrit-HasComments: Yes


[kudu-CR] rpc-test: fix TestClientConnectionMetrics

2018-10-29 Thread Adar Dembo (Code Review)
Hello Todd Lipcon,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11819

to review the following change.


Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), 
actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning)
due to the lack of proper cleanup in the event of an ASSERT failure. I've
fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound
transfer queue contains at least one element), but I couldn't find a good
way to do it. Blocking the server reactor thread has no effect on
client-side queuing. And we can't block the client reactor thread outright
because DumpRunningRpcs runs on it. Some of this is touched on in the
original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's
ham-fisted for sure, but it seems to work: without it, the test fails every
100-200 runs on my laptop, and with it I can't get it to fail at all. I also
looped it 1000 times in TSAN mode with 8 stress threads and didn't see any
failures. I don't understand the krpc subsystem very well, so if there's a
better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 39 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/1
--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Todd Lipcon