On Tue, Feb 17, 2015 at 03:16:18PM -0800, John Stultz wrote: > Ok, thanks for the more verbose explanation. Although this is more a > history of what you've seen rather then the crux of the change. > > To distill this down just a bit, the point is the usual mode for NTP > time_state machine looks like: > > TIME_OK -> TIME_INS -> TIME_OOP > | | > v v > TIME_DEL ------------> TIME_WAIT -(back)-> TIME_OK > > (hopefully the ascii art survives here) > > Now, from any of these states, currently if adjtimex is called w/ the > STA_PLL bit cleared (after STA_PLL was set), we reset back to TIME_OK, > effectively cancelling any transitions. (You'll have to imagine a line > from any of the states back to TIME_OK, since that's going to be too > ugly to do in ascii) > > Your patch is trying to remove the line back from TIME_OOP back to > TIME_OK. Basically stopping the ability to reset the ntp state during > a leapsecond. > > I do get that the behavior seen was strange due to a bug in the test > code which caused unexpected cancellation of state, but I'm not sure > if we should change the behavior to enforce that cancellation not be > possible. I could imagine some logic which really wants to reset the > state, which just by chance lands during a leap second, and the > application is confused since the state change didn't occur as > expected. > > So I guess I'm not seeing that the state machine is actually "broken" > in this case that you've outlined. If you can articulate better why > the OOP -> OK transition is truly invalid, I'd be interested in > hearing, but I'm not sure I want to risk a behavioral change unless > there's wide agreement.
I think the only real problem occurs when the adjtimex is called in the the TIME_OOP state with STA_PLL cleared _and_ STA_INS set. In this case the state machine is reset to TIME_OK but goes back to TIME_INS on the next second_overflow, potentially causing another false leap second to be inserted on the following midnight. The state machine is meant to only go back to TIME_INS once STA_INS is cleared and then set again - this is what the TIME_WAIT state is for. In fact, I don't see a reason why the STA_PLL -> !STA_PLL transition should ever set the time_state to TIME_OK. - When the STA_INS/STA_DEL flag is removed from the status, the state machine will end up in TIME_OK from any state. - When STA_INS/STA_DEL is set in the status, the state mchine will transition from TIME_OK to TIME_INS/TIME_DEL anyway. I think the "time_status = TIME_OK" should be just dropped. It has been added by eea83d896e318bda54be2d2770d2c5d6668d11db (ntp: NTP4 user space bits update) and it's not clear why. Roman? -- Jiri Bohac <[email protected]> SUSE Labs, SUSE CZ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

