Hello Mike Percy,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/9414

to review the following change.


Change subject: webserver-stress-itest: fix flakiness
......................................................................

webserver-stress-itest: fix flakiness

This fixes a source of flakiness I found on the flaky dashboard. In some runs
of this test, we'd hit the following interleaving:

- we start the master with webserver_port=0 and it picks some port (e.g. 35000)
- we stop the master
- the curl threads are still running, and one of them is assigned port 35000 as
  the local side of its TCP connection. It then tries to connect to port 35000
  and hits the dreaded "tcp loop connect" phenomenon[1], in which it actually
  connects to _itself_. It then just hangs there, occupying the port
- we try to start the master again, and it fails to bind
- we now time out trying to Join() on the curl thread, which is waiting forever
  for itself to respond to an HTTP request.
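
For illustration only, here is a minimal sketch of the self-connect
behavior described in the steps above. It is not code from this change.
SelfConnects() is a hypothetical helper, and it forces the collision by
binding the socket explicitly, whereas in the test the kernel happened to
assign the colliding port. It assumes Linux loopback behavior (TCP
simultaneous open).

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstring>

// Hypothetical helper: returns true if a socket whose local port equals the
// port it dials ends up connected to itself. Assumes Linux loopback behavior.
bool SelfConnects() {
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) return false;

  sockaddr_in local;
  std::memset(&local, 0, sizeof(local));
  local.sin_family = AF_INET;
  local.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  local.sin_port = 0;  // let the kernel pick a free local port
  socklen_t len = sizeof(local);
  if (bind(fd, reinterpret_cast<sockaddr*>(&local), sizeof(local)) != 0 ||
      getsockname(fd, reinterpret_cast<sockaddr*>(&local), &len) != 0) {
    close(fd);
    return false;
  }

  // Nothing is listening on this port. We dial our own local address, which
  // is what happens when the kernel hands a client the port it is targeting.
  bool connected =
      connect(fd, reinterpret_cast<sockaddr*>(&local), sizeof(local)) == 0;

  sockaddr_in peer;
  std::memset(&peer, 0, sizeof(peer));
  len = sizeof(peer);
  bool same_endpoint =
      connected &&
      getpeername(fd, reinterpret_cast<sockaddr*>(&peer), &len) == 0 &&
      peer.sin_port == local.sin_port &&
      peer.sin_addr.s_addr == local.sin_addr.s_addr;

  close(fd);
  return same_endpoint;
}
```

If the connect succeeds, the socket holds the port just as a listener
would, which is consistent with the master failing to bind in the
interleaving above.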

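The fix is to use non-ephemeral ports for the webserver, as we already do
for the RPC server, so that an outgoing connection can never be assigned the
webserver's port. I also added timeouts to the curl calls so that a hung
request cannot block the test forever.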
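
As a rough sketch of what "non-ephemeral" means here (not the code in this
change, whose port selection is not shown in this message), the check below
tests a port against an ephemeral range given in the "LOW HIGH" text format
of /proc/sys/net/ipv4/ip_local_port_range. PortRange, ParsePortRange and
IsEphemeral are hypothetical names, and the 32768-60999 range used in the
usage note is the common Linux default, assumed rather than taken from the
test environment.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Inclusive range of ports the kernel may assign to outgoing connections.
struct PortRange {
  uint16_t low;
  uint16_t high;
};

// Parses "LOW HIGH" text, the format used by
// /proc/sys/net/ipv4/ip_local_port_range. Returns false on malformed input.
bool ParsePortRange(const std::string& text, PortRange* out) {
  std::istringstream in(text);
  unsigned low = 0;
  unsigned high = 0;
  if (!(in >> low >> high) || low > high || high > 65535) return false;
  out->low = static_cast<uint16_t>(low);
  out->high = static_cast<uint16_t>(high);
  return true;
}

// True if the kernel could hand this port to a client socket as its local
// port, which is the precondition for the self-connect described above.
bool IsEphemeral(uint16_t port, const PortRange& range) {
  return port >= range.low && port <= range.high;
}
```

With the assumed default range, IsEphemeral(35000, range) is true, matching
the example port above, while a fixed port below 32768 is not ephemeral and
so cannot be handed to a curl thread as its local port.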

[1] http://www.rampa.sk/static/tcpLoopConnect.html

Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
---
M src/kudu/integration-tests/linked_list-test-util.h
M src/kudu/integration-tests/webserver-stress-itest.cc
M src/kudu/util/curl_util.cc
M src/kudu/util/curl_util.h
4 files changed, 31 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/9414/1
--
To view, visit http://gerrit.cloudera.org:8080/9414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If754d7f47a4c9c04bae3e9ef31acad801dd4db9b
Gerrit-Change-Number: 9414
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
