[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9015 to look at the new patch set (#8). Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug This patch submits a unit test that create a Subprocess thread and while it is starting and waiting, another thread sends kill signals to the Subprocess thread. Since the pthread_kill() signal is sent asynchronously, sometimes the pthread_kill() raises error signal 3. In that condition, the unit test logs the incident as an INFO. Prior to this bug fix patch, the Subprocess throws an exception because it cannot handle the kill signal in Wait() state. To fix the bug, we add RETRY_ON_EINTR() inside Subprocess::DoWait() function. Furthermore, this patch also moves the RETRY_ON_EINTR() function that occurs in many places across the code to os-util.h. Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 --- M src/kudu/consensus/log_index.cc M src/kudu/util/env_posix.cc M src/kudu/util/net/socket.cc M src/kudu/util/os-util.h M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc 6 files changed, 69 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/8 -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 8 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Jeffrey F. Lukman Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9015 to look at the new patch set (#7). Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug This patch submits a unit test that create a Subprocess thread and while it is starting and waiting, another thread sends kill signals to the Subprocess thread. Since the pthread_kill() signal is sent asynchronously, sometimes the pthread_kill() raises error signal 3. In that condition, the unit test logs the incident as an INFO. Prior to this bug fix patch, the Subprocess throws an exception because it cannot handle the kill signal in Wait() state. To fix the bug, we add RETRY_ON_EINTR() inside Subprocess::DoWait() function. Furthermore, this patch also moves the RETRY_ON_EINTR() function that occurs in many places across the code to os-util.h. Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 --- M src/kudu/consensus/log_index.cc M src/kudu/util/env_posix.cc M src/kudu/util/net/socket.cc M src/kudu/util/os-util.h M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc 6 files changed, 65 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/7 -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 7 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Jeffrey F. Lukman Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Jeffrey F. Lukman has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 ) Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h File src/kudu/util/os-util.h: http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h@33 PS5, Line 33: #define RETRY_ON_EINTR(err, expr) do { \ > need to #include in this header so that we get access to std: Done http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc File src/kudu/util/subprocess-test.cc: http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@28 PS5, Line 28: #include > per Alexey's comment please separate out the C includes from the C++ ones h Done http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@312 PS5, Line 312: t_started = true; > still think we should add a short sleep here so that we ensure that the sig Done http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@313 PS5, Line 313: SleepFor(MonoDelta::FromMilliseconds(50)); > nit: we have to use CHECK_OK here because gtest isn't thread-safe to run as Done http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@332 PS5, Line 332: ASSERT_TRUE(pthread_kill(t, SIGUSR2) == 0); > but the "user defined signal 2 error" is the bug we're trying to provoke he Done -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 6 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Jeffrey F. Lukman Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 12 Jan 2018 23:51:13 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9015 to look at the new patch set (#6). Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug This patch submits a unit test that create a Subprocess thread and while it is starting and waiting, another thread sends kill signals to the Subprocess thread. Prior to the bug fix in this patch, the Subprocess throws an exception because it cannot handle the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR() at Subprocess::DoWait() function. Furthermore, this patch also moves the RETRY_ON_EINTR() function that occurs in many places across the code to os-util.h. Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 --- M src/kudu/consensus/log_index.cc M src/kudu/util/env_posix.cc M src/kudu/util/net/socket.cc M src/kudu/util/os-util.h M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc 6 files changed, 60 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/6 -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 6 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9015 ) Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. Patch Set 5: (5 comments) http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h File src/kudu/util/os-util.h: http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/os-util.h@33 PS5, Line 33: static_assert(std::is_signed::value, \ need to #include in this header so that we get access to std::is_signed. Also whatever header has EINTR in it (not sure which one) http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc File src/kudu/util/subprocess-test.cc: http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@28 PS5, Line 28: #include per Alexey's comment please separate out the C includes from the C++ ones https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@312 PS5, Line 312: t_started = true; still think we should add a short sleep here so that we ensure that the signal-sending starts before we call p.Start http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@313 PS5, Line 313: ASSERT_OK(p.Start()); nit: we have to use CHECK_OK here because gtest isn't thread-safe to run assertions from non-main threads http://gerrit.cloudera.org:8080/#/c/9015/5/src/kudu/util/subprocess-test.cc@332 PS5, Line 332: // Add delay to avoid user defined signal 2 error but the "user defined signal 2 error" is the bug we're trying to provoke here. So if we are still seeing a crash due to it, then that means we haven't fully fixed the bug, right? -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 5 Gerrit-Owner: Jeffrey F. Lukman Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 12 Jan 2018 20:56:24 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9015 to look at the new patch set (#5). Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug This patch submits a unit test that create a Subprocess thread and while it is starting and waiting, another thread sends kill signals to the Subprocess thread. Prior to the bug fix in this patch, the Subprocess throws an exception because it cannot handle the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR() at Subprocess::DoWait() function. Furthermore, this patch also moves the RETRY_ON_EINTR() function that occurs in many places across the code to os-util.h. Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 --- M src/kudu/consensus/log_index.cc M src/kudu/util/env_posix.cc M src/kudu/util/net/socket.cc M src/kudu/util/os-util.h M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc 6 files changed, 60 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/5 -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 5 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9015 to look at the new patch set (#3). Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug This patch submits a unit test that create a Subprocess thread and while it is starting and waiting, another thread sends kill signals to the Subprocess thread. Prior to the bug fix in this patch, the Subprocess throws an exception because it cannot handle the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR() at Subprocess::DoWait() function. Furthermore, this patch also moves the RETRY_ON_EINTR() function that occurs in many places across the code to os-util.h. Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 --- M src/kudu/consensus/log_index.cc M src/kudu/util/env_posix.cc M src/kudu/util/net/socket.cc M src/kudu/util/os-util.h M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc 6 files changed, 57 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/3 -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 3 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug
Hello Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9015 to look at the new patch set (#2). Change subject: KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug .. KUDU-2208 Add unit test to detect Subprocess Interruption Handling failure and patch to fix the bug This patch submits a unit test that create a Subprocess thread and while it is starting and waiting, another thread sends kill signals to the Subprocess thread. Prior to the bug fix in this patch, the Subprocess throws an exception because it cannot handle the kill signal in Wait() state. The bug fix adds RETRY_ON_EINTR() at Subprocess::DoWait() function. Furthermore, this patch also moves the RETRY_ON_EINTR() function that occurs in many places across the code to os-util.h. Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 --- M src/kudu/consensus/log_index.cc M src/kudu/util/env_posix.cc M src/kudu/util/net/socket.cc M src/kudu/util/os-util.h M src/kudu/util/subprocess-test.cc M src/kudu/util/subprocess.cc 6 files changed, 57 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/9015/2 -- To view, visit http://gerrit.cloudera.org:8080/9015 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I148b4619a8dda8e3e95dd6ea5c6e993a9e37a333 Gerrit-Change-Number: 9015 Gerrit-PatchSet: 2 Gerrit-Owner: Jeffrey F. LukmanGerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon