[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Mike Percy (Code Review)
Mike Percy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

KUDU-2335. Work around rare consensus health bug for 1.7 release

In very rare circumstances we have hit a DCHECK in quorum_util.cc in
pre-commit builds stating that the leader should always have a HEALTHY
health status. We have traced this to points in the replica lifecycle
when the health status could be UNKNOWN.

Since we want to release 1.7.0 soon, let's work around this issue for
now. We'll follow up with a "real" fix and a decent test later.

Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Reviewed-on: http://gerrit.cloudera.org:8080/9597
Reviewed-by: Alexey Serbin 
Tested-by: Mike Percy 
---
M src/kudu/consensus/quorum_util.cc
1 file changed, 13 insertions(+), 3 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Mike Percy: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 4
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..


Patch Set 3: Verified+1

The test failed due to KUDU-2059; overriding the Jenkins vote.


--

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Comment-Date: Tue, 13 Mar 2018 05:42:28 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Mike Percy (Code Review)
Mike Percy has removed a vote on this change.

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..


Removed Verified-1 by Kudu Jenkins (120)
--

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 


[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..


Patch Set 3: Code-Review+2


--

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 13 Mar 2018 03:12:16 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Mike Percy (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9597

to look at the new patch set (#3).

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

KUDU-2335. Work around rare consensus health bug for 1.7 release

In very rare circumstances we have hit a DCHECK in quorum_util.cc in
pre-commit builds stating that the leader should always have a HEALTHY
health status. We have traced this to points in the replica lifecycle
when the health status could be UNKNOWN.

Since we want to release 1.7.0 soon, let's work around this issue for
now. We'll follow up with a "real" fix and a decent test later.

Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
---
M src/kudu/consensus/quorum_util.cc
1 file changed, 13 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/9597/3
--

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 3
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/9597 )

Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..


Patch Set 1: Code-Review+2


--

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Tue, 13 Mar 2018 02:04:55 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2335. Work around rare consensus health bug for 1.7 release

2018-03-12 Thread Mike Percy (Code Review)
Hello Alexey Serbin,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9597

to review the following change.


Change subject: KUDU-2335. Work around rare consensus health bug for 1.7 release
..

KUDU-2335. Work around rare consensus health bug for 1.7 release

In very rare circumstances we have hit a DCHECK in quorum_util.cc
stating that the leader should never have a "failed" health status.
If we hit this in production we should log an error and decline to evict
any nodes from that configuration.

Since this is just a workaround for a very rare bug, we should also
implement a fix once we get to the bottom of the issue.

Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
---
M src/kudu/consensus/quorum_util.cc
1 file changed, 8 insertions(+), 3 deletions(-)
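
The behavior described in the commit message above (log an error and decline to evict when the leader's health is not what the check expects) can be illustrated with a minimal sketch. This is not the actual patch and not Kudu's API: the Health enum, the Peer struct, and PickReplicaToEvict are hypothetical stand-ins invented for illustration, and the real logic in quorum_util.cc may differ.

```cpp
#include <cassert>   // only needed by callers that assert on the result
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT Kudu's real types.
enum class Health { UNKNOWN, HEALTHY, FAILED };

struct Peer {
  std::string uuid;
  Health health;
  bool is_leader;
};

// Picks a failed non-leader replica to evict. Returns true and sets
// *uuid_to_evict when a candidate is found. Instead of asserting that the
// leader is HEALTHY (which would crash a debug build), it logs an error and
// declines to evict anything when that expectation does not hold.
bool PickReplicaToEvict(const std::vector<Peer>& config,
                        std::string* uuid_to_evict) {
  for (const Peer& peer : config) {
    if (peer.is_leader && peer.health != Health::HEALTHY) {
      std::cerr << "leader " << peer.uuid
                << " is not HEALTHY; declining to evict any replica\n";
      return false;
    }
  }
  for (const Peer& peer : config) {
    if (!peer.is_leader && peer.health == Health::FAILED) {
      *uuid_to_evict = peer.uuid;
      return true;
    }
  }
  return false;  // No failed follower, so there is nothing to evict.
}
```

With a leader reported as UNKNOWN and a follower reported as FAILED, this sketch returns false and leaves the output string untouched, so an unexpected leader status costs a skipped eviction rather than a crash.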



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/9597/1
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iad67c7943a5b619ef2fa3a67c92cc033e207e197
Gerrit-Change-Number: 9597
Gerrit-PatchSet: 1
Gerrit-Owner: Mike Percy 
Gerrit-Reviewer: Alexey Serbin