[kudu-CR] webserver-stress-itest: fix flakiness
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9414 )

Change subject: webserver-stress-itest: fix flakiness

Patch Set 4: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/9414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Mon, 26 Feb 2018 20:00:18 +
Gerrit-HasComments: No
[kudu-CR] webserver-stress-itest: fix flakiness
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9414 )

Change subject: webserver-stress-itest: fix flakiness

webserver-stress-itest: fix flakiness

This fixes a source of flakiness I found on the flaky dashboard. In some
runs of this test, we'd hit the following interleaving:

- we start the master with webserver_port=0 and it picks some port (eg
  35000)
- we stop the master
- the curl threads are still running, and one of them picks port 35000 as
  the local side of its TCP connection. It then tries to connect to 35000
  and hits the dreaded "tcp loop connect" phenomenon[1] in which it
  actually connects to _itself_. Thus it just hangs there occupying the
  port
- we try to start the master again, and it fails to bind
- we now time out trying to Join() on the curl thread, which is waiting
  forever for itself to respond to an HTTP request.

The fix is to use non-ephemeral ports for the webserver as we already do
for the RPC server. I additionally added timeouts to the curl calls.

[1] http://www.rampa.sk/static/tcpLoopConnect.html

Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Reviewed-on: http://gerrit.cloudera.org:8080/9414
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo
---
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/webserver-stress-itest.cc
M src/kudu/util/curl_util.cc
M src/kudu/util/curl_util.h
4 files changed, 32 insertions(+), 2 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/9414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
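
The interleaving in the commit message above can be reproduced in isolation. What follows is a minimal Python sketch, not Kudu code and not part of this change. It assumes Linux loopback behavior, where a TCP socket that connects to its own local address completes a simultaneous open and ends up talking to itself. The function name self_connect_demo, the explicit bind() used to pin the local port (in the flaky test the kernel happened to choose that port as the ephemeral source port), and the 0.2 second timeout are all illustrative assumptions.

```python
import socket


def self_connect_demo(timeout_s=0.2):
    """Reproduce a TCP self-connect on loopback and report what happened."""
    result = {}

    # Stand-in for the curl thread's socket. Binding first pins the local
    # port so the demo is deterministic (assumption made for illustration).
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.bind(("127.0.0.1", 0))
    addr = client.getsockname()

    # Nothing is listening on addr, yet on Linux connecting to our own
    # address succeeds: the socket is now connected to itself.
    client.connect(addr)
    result["connected_to_self"] = client.getpeername() == client.getsockname()

    # Whatever we send is delivered straight back to the same socket.
    request = b"GET / HTTP/1.1\r\n\r\n"
    client.sendall(request)
    echoed = b""
    while len(echoed) < len(request):
        chunk = client.recv(len(request) - len(echoed))
        if not chunk:
            break
        echoed += chunk
    result["echo"] = echoed

    # Stand-in for the restarted server: the port is occupied, so bind fails.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(addr)
        result["server_bind_failed"] = False
    except OSError:
        result["server_bind_failed"] = True
    finally:
        server.close()

    # Without a timeout this recv() would block forever, because no peer
    # will ever send a response. With one, the caller gets control back.
    client.settimeout(timeout_s)
    try:
        client.recv(1)
        result["timed_out"] = False
    except socket.timeout:
        result["timed_out"] = True
    finally:
        client.close()

    return result


if __name__ == "__main__":
    print(self_connect_demo())
```

The two halves of the fix map onto the two failure points in the sketch: keeping the server port outside the ephemeral range means a client can never be handed it as a source port, and a client-side timeout bounds the wait if a request can never be answered.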
[kudu-CR] webserver-stress-itest: fix flakiness
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9414 )

Change subject: webserver-stress-itest: fix flakiness

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9414/3/src/kudu/integration-tests/webserver-stress-itest.cc
File src/kudu/integration-tests/webserver-stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9414/3/src/kudu/integration-tests/webserver-stress-itest.cc@61
PS3, Line 61: it's easier than adding the ability to pipe separate webserver
            : // ports to eac
> Nit: could you change this to "...separate webserver ports..."? We can pipe
Done

--
To view, visit http://gerrit.cloudera.org:8080/9414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Mon, 26 Feb 2018 19:31:22 +
Gerrit-HasComments: Yes
[kudu-CR] webserver-stress-itest: fix flakiness
Hello Tidy Bot, Mike Percy, Dan Burkert, Kudu Jenkins, Adar Dembo,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9414

to look at the new patch set (#4).

Change subject: webserver-stress-itest: fix flakiness

---
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/webserver-stress-itest.cc
M src/kudu/util/curl_util.cc
M src/kudu/util/curl_util.h
4 files changed, 32 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/9414/4

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
[kudu-CR] webserver-stress-itest: fix flakiness
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9414 )

Change subject: webserver-stress-itest: fix flakiness

Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9414/3/src/kudu/integration-tests/webserver-stress-itest.cc
File src/kudu/integration-tests/webserver-stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9414/3/src/kudu/integration-tests/webserver-stress-itest.cc@61
PS3, Line 61: it's easier than adding the ability to pipe separate ports to
            : // each server.
Nit: could you change this to "...separate webserver ports..."? We can pipe
separate RPC ports via master_rpc_ports; this would avoid any confusion
about that.

--
To view, visit http://gerrit.cloudera.org:8080/9414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Dan Burkert
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Mon, 26 Feb 2018 19:27:05 +
Gerrit-HasComments: Yes
[kudu-CR] webserver-stress-itest: fix flakiness
Hello Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9414

to look at the new patch set (#3).

Change subject: webserver-stress-itest: fix flakiness

---
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/webserver-stress-itest.cc
M src/kudu/util/curl_util.cc
M src/kudu/util/curl_util.h
4 files changed, 32 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/9414/3

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
[kudu-CR] webserver-stress-itest: fix flakiness
Hello Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9414

to look at the new patch set (#2).

Change subject: webserver-stress-itest: fix flakiness

---
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/webserver-stress-itest.cc
M src/kudu/util/curl_util.cc
M src/kudu/util/curl_util.h
4 files changed, 31 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/9414/2

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Reviewer: Tidy Bot
[kudu-CR] webserver-stress-itest: fix flakiness
Hello Mike Percy,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/9414

to review the following change.

Change subject: webserver-stress-itest: fix flakiness

---
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/webserver-stress-itest.cc
M src/kudu/util/curl_util.cc
M src/kudu/util/curl_util.h
4 files changed, 31 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/9414/1

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon
Gerrit-Reviewer: Mike Percy