[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 ) Change subject: rpc-test: fix TestClientConnectionMetrics .. Patch Set 5: Verified+1 Overriding Jenkins, unrelated TSAN failure. -- To view, visit http://gerrit.cloudera.org:8080/11819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980 Gerrit-Change-Number: 11819 Gerrit-PatchSet: 5 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 30 Oct 2018 04:20:12 + Gerrit-HasComments: No
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/11819 ) Change subject: rpc-test: fix TestClientConnectionMetrics .. Removed reviewer Kudu Jenkins with the following votes: * Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/11819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980 Gerrit-Change-Number: 11819 Gerrit-PatchSet: 5 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), actual: 0 vs 0

Unfortunately, the test would go on to crash (and trigger a TSAN warning) due to the lack of proper cleanup in the event of an ASSERT failure. I've fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound transfer queue contains at least one element), but I couldn't find a good way to do it. Blocking the server reactor thread has no effect on client-side queuing. And we can't block the client reactor thread outright because DumpRunningRpcs runs on it. Some of this is touched on in the original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's ham-fisted for sure, but it seems to work: without it, the test fails every 100-200 runs on my laptop, and with it I can't get it to fail at all. I also looped it 1000 times in TSAN mode with 8 stress threads and didn't see any failures.

I don't understand the krpc subsystem very well, so if there's a better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Reviewed-on: http://gerrit.cloudera.org:8080/11819
Reviewed-by: Alexey Serbin
Tested-by: Adar Dembo
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 43 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Adar Dembo: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Todd Lipcon
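A note on the cleanup problem the commit message describes: a failed gtest ASSERT_* returns from the enclosing function immediately, so any teardown written at the end of a test body is skipped. One common remedy is a scope guard whose destructor runs the cleanup on every exit path. The sketch below only illustrates that idea and is not the code from this patch; ScopeGuard and SimulatedTestBody are hypothetical names, and the early return stands in for a failed ASSERT.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs the stored callable when it goes out of
// scope, regardless of how the enclosing function exits.
template <typename F>
class ScopeGuard {
 public:
  explicit ScopeGuard(F f) : f_(std::move(f)) {}
  ~ScopeGuard() { f_(); }
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;

 private:
  F f_;
};

// Simulates a test body. When assertion_holds is false, the function returns
// early, the way a failed gtest ASSERT_* leaves the enclosing function.
// The log records which steps actually ran.
bool SimulatedTestBody(bool assertion_holds, std::vector<std::string>* log) {
  assert(log != nullptr);
  log->push_back("start");

  // Register the cleanup up front so that it runs on every exit path.
  auto cleanup = [log]() { log->push_back("cleanup"); };
  ScopeGuard<decltype(cleanup)> guard(cleanup);

  if (!assertion_holds) {
    log->push_back("assert-failed");
    return false;  // early exit: nothing below this line runs
  }

  log->push_back("finish");
  return true;
}
```

In both the passing and the failing path the log ends with "cleanup", which is the property a test needs if later teardown would otherwise touch state that was never released.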
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 ) Change subject: rpc-test: fix TestClientConnectionMetrics .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980 Gerrit-Change-Number: 11819 Gerrit-PatchSet: 5 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 30 Oct 2018 03:58:02 + Gerrit-HasComments: No
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..

Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG@14
PS4, Line 14: on
> nit: on
Done

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Tue, 30 Oct 2018 03:41:51 +
Gerrit-HasComments: Yes
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Hello Alexey Serbin, Andrew Wong, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#5).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), actual: 0 vs 0

Unfortunately, the test would go on to crash (and trigger a TSAN warning) due to the lack of proper cleanup in the event of an ASSERT failure. I've fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound transfer queue contains at least one element), but I couldn't find a good way to do it. Blocking the server reactor thread has no effect on client-side queuing. And we can't block the client reactor thread outright because DumpRunningRpcs runs on it. Some of this is touched on in the original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's ham-fisted for sure, but it seems to work: without it, the test fails every 100-200 runs on my laptop, and with it I can't get it to fail at all. I also looped it 1000 times in TSAN mode with 8 stress threads and didn't see any failures.

I don't understand the krpc subsystem very well, so if there's a better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 43 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/5

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..

Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11819/4//COMMIT_MSG@14
PS4, Line 14: one
nit: on

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Tue, 30 Oct 2018 01:23:11 +
Gerrit-HasComments: Yes
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 ) Change subject: rpc-test: fix TestClientConnectionMetrics .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980 Gerrit-Change-Number: 11819 Gerrit-PatchSet: 4 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 30 Oct 2018 00:23:32 + Gerrit-HasComments: No
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/11819 ) Change subject: rpc-test: fix TestClientConnectionMetrics .. Removed reviewer Kudu Jenkins with the following votes: * Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/11819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980 Gerrit-Change-Number: 11819 Gerrit-PatchSet: 4 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 ) Change subject: rpc-test: fix TestClientConnectionMetrics .. Patch Set 4: Verified+1 Overriding Jenkins, unrelated test failure. -- To view, visit http://gerrit.cloudera.org:8080/11819 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980 Gerrit-Change-Number: 11819 Gerrit-PatchSet: 4 Gerrit-Owner: Adar Dembo Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 30 Oct 2018 00:08:26 + Gerrit-HasComments: No
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#4).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning) due to the lack of proper cleanup in the event of an ASSERT failure. I've fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound transfer queue contains at least one element), but I couldn't find a good way to do it. Blocking the server reactor thread has no effect on client-side queuing. And we can't block the client reactor thread outright because DumpRunningRpcs runs on it. Some of this is touched on in the original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's ham-fisted for sure, but it seems to work: without it, the test fails every 100-200 runs on my laptop, and with it I can't get it to fail at all. I also looped it 1000 times in TSAN mode with 8 stress threads and didn't see any failures.

I don't understand the krpc subsystem very well, so if there's a better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 43 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/4

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#3).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning) due to the lack of proper cleanup in the event of an ASSERT failure. I've fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound transfer queue contains at least one element), but I couldn't find a good way to do it. Blocking the server reactor thread has no effect on client-side queuing. And we can't block the client reactor thread outright because DumpRunningRpcs runs on it. Some of this is touched on in the original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's ham-fisted for sure, but it seems to work: without it, the test fails every 100-200 runs on my laptop, and with it I can't get it to fail at all. I also looped it 1000 times in TSAN mode with 8 stress threads and didn't see any failures.

I don't understand the krpc subsystem very well, so if there's a better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 42 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/3

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11819

to look at the new patch set (#2).

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning) due to the lack of proper cleanup in the event of an ASSERT failure. I've fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound transfer queue contains at least one element), but I couldn't find a good way to do it. Blocking the server reactor thread has no effect on client-side queuing. And we can't block the client reactor thread outright because DumpRunningRpcs runs on it. Some of this is touched on in the original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's ham-fisted for sure, but it seems to work: without it, the test fails every 100-200 runs on my laptop, and with it I can't get it to fail at all. I also looped it 1000 times in TSAN mode with 8 stress threads and didn't see any failures.

I don't understand the krpc subsystem very well, so if there's a better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 39 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/2

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@521
PS1, Line 521: int n_calls = 1000
> nit: add constexpr ?
Done

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@552
PS1, Line 552: // Verify that all the RPCs have finished.
             : for (const auto& controller : controllers) {
             :   ASSERT_TRUE(controller->finished());
             : }
> nit: does it make sense to move this piece out of the ASSERT_EVENTUALLY() a
Can't; the controllers are defined within the ASSERT_EVENTUALLY, and since they need to be replaced with each call, it'd be messy to decouple them from the calls themselves.

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Mon, 29 Oct 2018 22:38:43 +
Gerrit-HasComments: Yes
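The reply above turns on per-attempt state: if the whole check is retried until it passes, every attempt needs fresh controllers, so their final verification naturally lives inside the retried block. Here is a minimal sketch of that shape, assuming a retry helper in the spirit of ASSERT_EVENTUALLY. It is not Kudu's implementation: EventuallyTrue, RunFlakyCheck, the timeout, and the backoff policy are all hypothetical, and plain integers stand in for RPC controllers.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: re-runs the whole attempt until it returns true or
// the timeout expires. Returns true if some attempt passed.
bool EventuallyTrue(const std::function<bool()>& attempt,
                    std::chrono::milliseconds timeout) {
  assert(timeout.count() >= 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  std::chrono::milliseconds backoff(1);
  for (;;) {
    if (attempt()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(backoff);
    // Assumed policy: double the sleep each time, capped at 100 ms.
    backoff = std::min(backoff * 2, std::chrono::milliseconds(100));
  }
}

// Simulates the racy observation: the first fail_first_n attempts see an
// empty queue, later ones see a non-empty queue. Returns the number of
// attempts needed, or -1 if the check never passed.
int RunFlakyCheck(int fail_first_n) {
  int attempts = 0;
  const bool ok = EventuallyTrue([&]() {
    ++attempts;
    // Per-attempt state is rebuilt here, so a retry never reuses objects
    // left over from a failed attempt.
    std::vector<int> controllers(4, attempts);
    const int observed_queue_size = attempts > fail_first_n ? 1 : 0;
    if (observed_queue_size <= 0) return false;
    // Verification that depends on this attempt's controllers stays inside.
    return std::all_of(controllers.begin(), controllers.end(),
                       [&](int c) { return c == attempts; });
  }, std::chrono::milliseconds(2000));
  return ok ? attempts : -1;
}
```

Moving the verification outside the lambda would require hoisting the controllers out as well and resetting them on each retry, which is the coupling the reply calls messy.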
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/11819 )

Change subject: rpc-test: fix TestClientConnectionMetrics
..

Patch Set 1:

(2 comments)

Looks good, just a couple of nits.

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@521
PS1, Line 521: int n_calls = 1000
nit: add constexpr ?

http://gerrit.cloudera.org:8080/#/c/11819/1/src/kudu/rpc/rpc-test.cc@552
PS1, Line 552: // Verify that all the RPCs have finished.
             : for (const auto& controller : controllers) {
             :   ASSERT_TRUE(controller->finished());
             : }
nit: does it make sense to move this piece out of the ASSERT_EVENTUALLY() after wait at latch is done? This should not fail if the condition of the ASSERT_GT() at L546 is true, right?

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:37:28 +
Gerrit-HasComments: Yes
[kudu-CR] rpc-test: fix TestClientConnectionMetrics
Hello Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/11819

to review the following change.

Change subject: rpc-test: fix TestClientConnectionMetrics
..

rpc-test: fix TestClientConnectionMetrics

Every now and then, this test would fail with:

  rpc-test.cc:542: Failure
  Expected: (dump_resp.outbound_connections(0).outbound_queue_size()) > (0), actual: 0 vs 0

Unfortunately, the test would go one to crash (and trigger a TSAN warning) due to the lack of proper cleanup in the event of an ASSERT failure. I've fixed that in this patch.

I also tried to address the root of the test flakiness (that the outbound transfer queue contains at least one element), but I couldn't find a good way to do it. Blocking the server reactor thread has no effect on client-side queuing. And we can't block the client reactor thread outright because DumpRunningRpcs runs on it. Some of this is touched on in the original code review[1] that committed the test.

Having given up, I wrapped the whole thing in an ASSERT_EVENTUALLY. It's ham-fisted for sure, but it seems to work: without it, the test fails every 100-200 runs on my laptop, and with it I can't get it to fail at all. I also looped it 1000 times in TSAN mode with 8 stress threads and didn't see any failures.

I don't understand the krpc subsystem very well, so if there's a better way, I'm all ears.

1. https://gerrit.cloudera.org/c/9343/

Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
---
M src/kudu/rpc/rpc-test.cc
1 file changed, 45 insertions(+), 39 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/11819/1

--
To view, visit http://gerrit.cloudera.org:8080/11819
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9c565b80bdca435d18787c7df0ec992728363980
Gerrit-Change-Number: 11819
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo
Gerrit-Reviewer: Todd Lipcon