[kudu-CR](branch-1.5.x) KUDU-2209. HybridClock doesn't handle changes in STA NANO flag
Jean-Daniel Cryans has posted comments on this change. ( http://gerrit.cloudera.org:8080/8699 ) Change subject: KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag .. Patch Set 1: Verified+1 Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.5.x Gerrit-MessageType: comment Gerrit-Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Gerrit-Change-Number: 8699 Gerrit-PatchSet: 1 Gerrit-Owner: Attila BukorGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Mon, 04 Dec 2017 22:48:10 + Gerrit-HasComments: No
[kudu-CR](branch-1.5.x) KUDU-2209. HybridClock doesn't handle changes in STA NANO flag
Jean-Daniel Cryans has removed a vote on this change. Change subject: KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/8699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.5.x Gerrit-MessageType: deleteVote Gerrit-Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Gerrit-Change-Number: 8699 Gerrit-PatchSet: 1 Gerrit-Owner: Attila BukorGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-1.5.x) KUDU-2209. HybridClock doesn't handle changes in STA NANO flag
Jean-Daniel Cryans has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8699 ) Change subject: KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag .. KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag Users have occasionally reported spurious crashes due to Kudu thinking that another node has a time stamp from the future. After some debugging I realized that the issue is that we currently capture the flag 'STA_NANO' from the kernel only at startup. This flag indicates whether the kernel's sub-second timestamp is in nanoseconds or microseconds. We initially assumed this was a static property of the kernel. However it turns out that this flag can get toggled at runtime by ntp in certain circumstances. Given this, it was possible for us to interpret a number of nanoseconds as if it were microseconds, resulting in a timestamp up to 1000 seconds in the future. This patch changes the SystemNtp time source to always use the 'adjtime' call to fetch the clock, and looks at the STA_NANO flag on every such call rather than only at startup. I checked the source for the ntp_gettime call that we used to use, and it turns out it was implemented in terms of the same adjtime() call, so there should be no performance penalty in this change. This patch doesn't include tests since it's based on some kernel functionality. However, I was able to test it as follows: - wrote a simple program to print the time once a second - stopped ntp, ran chrony, stopped chrony, and started ntpd -- this has the side effect of clearing STA_NANO - waited 10 minutes or so, and eventually ntp reset back to STA_NANO -- this caused my test program to start printing incorrect timestamps as it was interpreting nanosecond values from the kernel as if they were microseconds Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Reviewed-on: http://gerrit.cloudera.org:8080/8450 Reviewed-by: Alexey SerbinTested-by: Alexey Serbin Reviewed-on: http://gerrit.cloudera.org:8080/8699 Reviewed-by: Jean-Daniel Cryans Tested-by: Jean-Daniel Cryans --- M src/kudu/clock/system_ntp.cc M src/kudu/clock/system_ntp.h 2 files changed, 21 insertions(+), 53 deletions(-) Approvals: Alexey Serbin: Looks good to me, but someone else must approve Jean-Daniel Cryans: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/8699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.5.x Gerrit-MessageType: merged Gerrit-Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Gerrit-Change-Number: 8699 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR](branch-1.5.x) KUDU-2209. HybridClock doesn't handle changes in STA NANO flag
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/8699 ) Change subject: KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.5.x Gerrit-MessageType: comment Gerrit-Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Gerrit-Change-Number: 8699 Gerrit-PatchSet: 1 Gerrit-Owner: Attila BukorGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 30 Nov 2017 22:57:29 + Gerrit-HasComments: No
[kudu-CR](branch-1.5.x) KUDU-2209. HybridClock doesn't handle changes in STA NANO flag
Hello Alexey Serbin, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/8699 to review the following change. Change subject: KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag .. KUDU-2209. HybridClock doesn't handle changes in STA_NANO flag Users have occasionally reported spurious crashes due to Kudu thinking that another node has a time stamp from the future. After some debugging I realized that the issue is that we currently capture the flag 'STA_NANO' from the kernel only at startup. This flag indicates whether the kernel's sub-second timestamp is in nanoseconds or microseconds. We initially assumed this was a static property of the kernel. However it turns out that this flag can get toggled at runtime by ntp in certain circumstances. Given this, it was possible for us to interpret a number of nanoseconds as if it were microseconds, resulting in a timestamp up to 1000 seconds in the future. This patch changes the SystemNtp time source to always use the 'adjtime' call to fetch the clock, and looks at the STA_NANO flag on every such call rather than only at startup. I checked the source for the ntp_gettime call that we used to use, and it turns out it was implemented in terms of the same adjtime() call, so there should be no performance penalty in this change. This patch doesn't include tests since it's based on some kernel functionality. However, I was able to test it as follows: - wrote a simple program to print the time once a second - stopped ntp, ran chrony, stopped chrony, and started ntpd -- this has the side effect of clearing STA_NANO - waited 10 minutes or so, and eventually ntp reset back to STA_NANO -- this caused my test program to start printing incorrect timestamps as it was interpreting nanosecond values from the kernel as if they were microseconds Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Reviewed-on: http://gerrit.cloudera.org:8080/8450 Reviewed-by: Alexey SerbinTested-by: Alexey Serbin --- M src/kudu/clock/system_ntp.cc M src/kudu/clock/system_ntp.h 2 files changed, 21 insertions(+), 53 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/99/8699/1 -- To view, visit http://gerrit.cloudera.org:8080/8699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: branch-1.5.x Gerrit-MessageType: newchange Gerrit-Change-Id: Iaa9241dea1304076ed5cfaec78779d15b11293ff Gerrit-Change-Number: 8699 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Bukor Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Todd Lipcon