Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9011 )
Change subject: Kudu-2208 Avoid system call interruption in Subprocess ...................................................................... Patch Set 1: (10 comments) I think it's also worth testing Subprocess::Call which seems to have some bugs as well http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc File src/kudu/util/subprocess-test.cc: http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@301 PS1, Line 301: thd nit: either abbreviate to just 't' or 'thr' would be more in line with what we do elsewhere. http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@303 PS1, Line 303: subprocessThread nit: we use snake_case style in C++ http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@305 PS1, Line 305: ASSERT_OK(p.Start()); I think adding a short sleep here (eg 50ms) is a good idea to ensure that the signals being sent from the other thread have a chance to race against Start() and not just Wait() http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@310 PS1, Line 310: // create signal you aren't creating a signal here, but rather setting up a signal handler http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@317 PS1, Line 317: // send at most 10 kill signals to subprocess why at most 10? Why not loop until we see that the subprocess has exited? I think there is a getter like 'joinable()' that you could use in a while loop. Sending only 10 leaves open a possibility that we dont' cover all the potential races http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@319 PS1, Line 319: int signalCounter = 0; : bool hasError = false; : nit: snake_case http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@322 PS1, Line 322: if (signalCounter > 10 || hasError) break; why not just put this into the while condition above? http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@325 PS1, Line 325: hasError = pthread_kill(thd, SIGUSR2) == 0; do we expect any errors from this? if not, could we just ASSERT this? http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess-test.cc@330 PS1, Line 330: SleepFor(MonoDelta::FromMilliseconds(10)); why sleep in between? if our goal is to provoke races we want to send as many signals as we can as fast as possible. http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc File src/kudu/util/subprocess.cc: http://gerrit.cloudera.org:8080/#/c/9011/1/src/kudu/util/subprocess.cc@67 PS1, Line 67: // Retry on EINTR for functions like read() that return -1 on error. can we move this function to a utility header since it's used in a few places now? maybe os-util.h -- To view, visit http://gerrit.cloudera.org:8080/9011 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I66ab2ec391eced5ea1c37d4294d151fe03c7d586 Gerrit-Change-Number: 9011 Gerrit-PatchSet: 1 Gerrit-Owner: Jeffrey F. Lukman <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Fri, 12 Jan 2018 00:03:45 +0000 Gerrit-HasComments: Yes
