[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Mike Percy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

KUDU-2335. Work around rare consensus health bug for 1.7 release

In very rare circumstances we have hit a DCHECK in quorum_util.cc in pre-commit builds stating that the leader should always have a HEALTHY health status. We have traced this to points in the replica lifecycle when the health status could be UNKNOWN.

Since we want to release 1.7.0 soon, let's work around this issue for now. We'll follow up with a "real" fix and a decent test later.

Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Reviewed-on: http://gerrit.cloudera.org:8080/9597
Reviewed-by: Alexey Serbin
Tested-by: Mike Percy
---
M src/kudu/consensus/quorum_util.cc
1 file changed, 13 insertions(+), 3 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Mike Percy: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 4
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

Patch Set 3: Verified+1

Test failed due to KUDU-2059; overriding Jenkins.

--
To view, visit http://gerrit.cloudera.org:8080/9597

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
Gerrit-Comment-Date: Tue, 13 Mar 2018 05:42:28 +
Gerrit-HasComments: No
[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Mike Percy has removed a vote on this change.

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

Removed Verified-1 by Kudu Jenkins (120)

--
To view, visit http://gerrit.cloudera.org:8080/9597

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy
[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

Patch Set 3: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/9597

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 13 Mar 2018 03:12:16 +
Gerrit-HasComments: No
[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9597 to look at the new patch set (#3).

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

KUDU-2335. Work around rare consensus health bug for 1.7 release

In very rare circumstances we have hit a DCHECK in quorum_util.cc in pre-commit builds stating that the leader should always have a HEALTHY health status. We have traced this to points in the replica lifecycle when the health status could be UNKNOWN.

Since we want to release 1.7.0 soon, let's work around this issue for now. We'll follow up with a "real" fix and a decent test later.

Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
---
M src/kudu/consensus/quorum_util.cc
1 file changed, 13 insertions(+), 3 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/9597/3

--
To view, visit http://gerrit.cloudera.org:8080/9597

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

Patch Set 1: Code-Review+2

--
To view, visit http://gerrit.cloudera.org:8080/9597

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:04:55 +
Gerrit-HasComments: No
[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release
Hello Alexey Serbin,

I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9597 to review the following change.

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

KUDU-2335. Work around rare consensus health bug for 1.7 release

In very rare circumstances we have hit a DCHECK in quorum_util.cc stating that the leader should never have a "failed" health status. If we hit this in production we should log an error and decline to evict any nodes from that configuration.

Since this is just a workaround for a very rare bug, we should also implement a fix once we get to the bottom of the issue.

Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
---
M src/kudu/consensus/quorum_util.cc
1 file changed, 8 insertions(+), 3 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/9597/1

--
To view, visit http://gerrit.cloudera.org:8080/9597

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy
Gerrit-Reviewer: Alexey Serbin
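
Illustration of the workaround described in Patch Set 1 (log an error and decline to evict when the leader's health is not HEALTHY). This is a minimal sketch and not the code from quorum_util.cc: the `Health` enum, the `Peer` struct, and the `ShouldEvict` function are hypothetical stand-ins invented for this example, and the eviction rule is deliberately simplified to "pick the first non-leader peer marked FAILED".

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not Kudu's actual types.
enum class Health { UNKNOWN, HEALTHY, FAILED };

struct Peer {
  std::string uuid;
  bool is_leader;
  Health health;
};

// Returns true and sets *uuid_to_evict if a non-leader FAILED peer is found.
// If the leader's health is anything other than HEALTHY, logs an error and
// declines to evict anyone, rather than crashing on an assertion.
bool ShouldEvict(const std::vector<Peer>& config, std::string* uuid_to_evict) {
  for (const Peer& p : config) {
    if (p.is_leader && p.health != Health::HEALTHY) {
      std::cerr << "leader " << p.uuid
                << " does not have HEALTHY status; not evicting any peer\n";
      return false;
    }
  }
  // Simplified assumption: evict the first non-leader peer marked FAILED.
  for (const Peer& p : config) {
    if (!p.is_leader && p.health == Health::FAILED) {
      *uuid_to_evict = p.uuid;
      return true;
    }
  }
  return false;
}
```

The point of the sketch is the ordering: the leader-health check runs before any eviction candidate is considered, so an unexpected UNKNOWN or FAILED leader status results in a logged error and no configuration change.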