Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7477 )

Change subject: [clock] add a built-in NTP client implementation
......................................................................


Patch Set 16:

(47 comments)

http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.h
File src/kudu/clock/builtin_ntp.h:

http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.h@45
PS13, Line 45: class BuiltInNtp : public TimeService {
> Could use some more documentation, at least for the class (i.e. what are we
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.h@73
PS13, Line 73:   bool TryReceivePacket();
> Should doc (amongst other things) what the return value means.
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.h@92
PS13, Line 92:   int64_t last_compute_mono_ = 0;
             :   int64_t last_compute_wall_ = 0;
             :   int64_t last_compute_error_ = 0;
             :   bool is_synchronized_ = false;
> Might want to encapsulate these in a struct so it's easy to make a local co
Done


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc
File src/kudu/clock/builtin_ntp.cc:

http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@286
PS9, Line 286:
             :   // The monotonic timestamp wh
> Do we need to convert this into a TODO, or a note?
A couple of notes here:

1) The recommendation about re-resolving NTP server addresses is gone (along  
with other best practices section for client) from RFC5905.  Does it mean it's 
no longer recommended?  I cannot tell, as of now.

2) From what I saw from chrony's source code, chrony does perform re-resolution 
only when replacing 'bad' servers or by explicit and implicit command request 
(e.g., from its CLI chronyc): the former about re-resolving addresses, the 
latter is about going from offline to online operation mode.

I updated the comment.


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@385
PS9, Line 385:   }
> Yeah, it looks like Todd's reimplementation of this logic remained and is n
This has been addressed in one of the recent patch series.  The idea is to use 
the same timeout parameter for both SystemNtp and BuiltinNtp.

Does it seem good enough or it's better to do that in using the mentioned 
'consolidate it' approach instead?


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@474
PS9, Line 474:
> Alexey, what do you think?
This has been addressed already.  E.g., 
https://gerrit.cloudera.org/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@557 or 
around.


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@509
PS9, Line 509:   {
             :     MutexLock l(state_lock_);
> Alexey, did you explore this comment?
Yes, I did: there was a bug in the code and now it's fixed.  In other words, 
current validation of a packet from NTP server properly verifies the mode.


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@536
PS9, Line 536:   // unclear what clock those come from
> Alexey, what do you think of this?
For this and other worth-doing non-trivial issues I opened few upstream  JIRA 
issues.  I added corresponding JIRA IDs along with the comments.


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@583
PS9, Line 583:   // 3.  The Originate Timestamp in the server reply should 
match the
> Alexey, did you work this out?
Here it's not necessary to add skew * roundtrip_delay, nope: this is just a 
reference time to tie the reported time with the local time (i.e. anchoring it 
to monotime).  Skew is added later on when assessing the clock error.


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@666
PS9, Line 666:   //    Originate Timestamp     T1   time request sent by client
> +1, Alexey could you do this?
Yes, that's a good idea especially given the fact that the original version of 
the code wasn't doing the thing it was supposed to do.

I'm planning to do so in a separate changelist, adding unit tests for the newly 
factored function.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc
File src/kudu/clock/builtin_ntp.cc:

http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@58
PS13, Line 58:               "0.ubuntu.pool.ntp.org,"
             :               "1.ubuntu.pool.ntp.org,"
             :               "2.ubuntu.pool.ntp.org,"
             :               "3.ubuntu.pool.ntp.org",
> Shouldn't we default to [0-3].pool.ntp.org? As per https://www.ntppool.org/
Yep, that seems to be less vendor-specific; good point.  Done.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@66
PS13, Line 66:
             : // As of 2019, recent version of ntpd allows setting minpoll as 
low as 3
             : // (2^3 = 8 seconds), and chronyd supports minpoll as low as to 
-6
             : // (2^-6 = 1/64 second), so 16 seconds looks like a reasonble 
default.
> Do we also want to express this in terms of powers of 2? Is that an interfa
Given that the whole premise of this built-in NTP client appeared from the fact 
that most users are frustrated with configuring NTP, and not many were able to 
do so properly, I think it makes sense to keep the plain notation for this 
setting :)


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@87
PS13, Line 87: DEFINE_int32(ntp_initial_sync_wait_secs, 60,
> Why did this move from one time service impl to another?
That's because system_clock (the former location of the definition) isn't 
linked in on macOS, but builtin_ntp requires that parameter as well and it's 
ilnked in on macOS.

The relevant comment thread is at 
https://gerrit.cloudera.org/#/c/7477/9/src/kudu/clock/builtin_ntp.cc@385


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@167
PS13, Line 167:   // NOTE: this is used as a nonce, not the actual time.
> Does this comment belong near the transmit times? I'm thinking of CreateCli
To me it looked like an attempt to clarify on the semantics of these fields -- 
that's what they are by the spec/RFC.  But I think it doesn't hurt if we move 
the NOTE  into CreateClientPacket.

Done.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@351
PS13, Line 351: currently
> Nit: already said current before, remove one of these.
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@352
PS13, Line 352:  If we
              :   // run out of addresses, we will re-resolve.
> This doesn't appear to be implemented yet; move to a TODO?
Well, it seems to be implemented already: see line 732 in PS13.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@370
PS13, Line 370: Status BuiltInNtp::CreateFromFlags(unique_ptr<TimeService>* 
ntp) {
> Since Init() is required and already returns a Status, you could defer the
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@374
PS13, Line 374: --builtin-ntp-servers
> Nit: --builtin_ntp_servers
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@382
PS13, Line 382:
> Nit: remove empty line
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@407
PS13, Line 407:   CHECK_OK(to_bind.ParseString("0.0.0.0", 0));
> Do we need to make this configurable? Do we always want to bind to every in
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@411
PS13, Line 411:   
RETURN_NOT_OK_PREPEND(socket_.SetRecvTimeout(MonoDelta::FromSeconds(0.5)),
              :                         "couldn't set socket recv timeout");
> Why is this important?
The IO loop of this implementation (see BuiltInNtp::PollThread() method) 
doesn't allow for IO multiplexing, so this short SO_RCVTIMEO timeout is set to 
avoid blocking the IO loop at the receiving phase when there is data to be sent.

I added corresponding comment into the code.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@433
PS13, Line 433:         return Status::Incomplete(Substitute(
> Why not TimedOut?
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@440
PS13, Line 440:   return Status::OK();
> Do you think we should return non-OK if state_ transitioned to kShutdown (i
Good catch!  Done.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@458
PS13, Line 458:     return Status::IllegalState("wallclock is not synchronized",
> I think the system NTP implementation returns ServiceUnavailable when not s
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@489
PS13, Line 489:   for (const auto& s : servers_) {
> Do we need to protect any server state with a lock? What if the poll thread
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@515
PS13, Line 515:     state_ = kShutdown;
> Maybe we should release state_lock_ after this? No need to hold it during t
Right, it's not WaitFor() on conditional or whatever.  Done.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@518
PS13, Line 518:     state_cond_.Broadcast();
> Signal()? We're just waking up one thread in Init.
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@528
PS13, Line 528:   }
> Should we close socket_ here, after the poll thread has exited?
Yes, we can do that -- socket_ is not used anywhere else after Init() has 
successfully completed.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@545
PS13, Line 545:     PLOG(WARNING) << "NTP recv error";
> Do we want to throttle this? Or is it rare?
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@572
PS13, Line 572:   memcpy(&resp, buf, sizeof(NtpPacket));
> Could you recvfrom() directly into the NtpPacket and avoid the copy?
Yes, but what if there is more data than needed in the socket?  I'm not sure 
how to handle that in case if pointing recvfrom directly into NtpPacket.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@609
PS13, Line 609: LI
> Where is this enforced? On L624 we reject servers whose LI is 3, not 0.
I'm not sure why it says so, probably it's a typo.  In both reference 
implementation in https://tools.ietf.org/html/rfc5905#appendix-A.5.1 and 
chrony, LI 0 is OK and means 'no warning', but LI 3 means 'not synchronized' 
and such packets should be discarded.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@625
PS13, Line 625:     VLOG(1) << Substitute("NTP server $0 is unsynchronized",
> Curious why this and L634 are VLOG and not KLOG_EVERY_N_SECS? Are they comm
I replaced most of those with VLOG() since in the regular usage scenario (not 
debugging) it's not that important to see there details in the log.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@717
PS13, Line 717:   RecordResponse(p->server, rr);
> No std::move here? Trivially copyable?
Done.

It's not trivially copyable because of SockAddr field.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@733
PS13, Line 733:       // Mark the server as offline.
> When is this ever undone? What happens when we've marked all of our servers
Ah, indeed -- it seems one line slipped through cracks before call of 
RecordResponse().  Thanks!

>From the other side, it seems I need to add a test for such a condition that 
>would catch this.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@746
PS13, Line 746:     if (now > s->next_poll_) {
> Maybe >= so that, if all of this code gets run quickly at Init() time (and
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@748
PS13, Line 748:       int delay = poll_interval + rng_.Uniform(poll_interval);
> Is the jitter mandated by a spec? If not, what's the rationale?
I think the rationale is to avoid high waves of IO if all servers are at same 
RTT distance since the trivial IO loop this implementation uses would't bode 
well with that.

I added corresponding comment.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@893
PS13, Line 893:   // This lambda builds an intersection interval given set of 
correctness
              :   // intervals and reference local time.
> Why define this as a lambda given it's only called once?
This was used multiple times in prior revision :)

I'm planning to update it along with factoring out the CombineClock() logic in 
a unit-testable method/function.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@996
PS13, Line 996:     std::lock_guard<RWMutex> l(last_compute_lock_);
> This could be deferred to L1000.
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/builtin_ntp.cc@1024
PS13, Line 1024:       state_cond_.Broadcast();
> Should be Signal, right?
Done


http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/ntp-test.cc
File src/kudu/clock/ntp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7477/9/src/kudu/clock/ntp-test.cc@48
PS9, Line 48:
> Yeah, now that we have more legit tests, perhaps we should remove this? Mor
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc
File src/kudu/clock/ntp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@89
PS13, Line 89: scenariso
> scenarios
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@101
PS13, Line 101: // TODO(aserbin): fix the described chrony's bug, at least in 
thirdparty
> Yeah do you have a sense for how much work that would be?
I looked at that.  My optimistic estimation is about one day.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@130
PS13, Line 130: TEST_F(BuiltinNtpWithMiniChronydTest, Basic) {
> This test looks quite similar to TestNtp_SystemVsBuiltinNtp.
Yep, and I removed the former.


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@131
PS13, Line 131:   const uint16_t base_port = 10123 + getpid() % 1000;
> How much work would it be to patch chrony to allow binding to an ephemeral
For a while, we use port reservations via SO_REUSEPORT


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@134
PS13, Line 134:   for (auto idx = 0; idx < 3; ++idx) {
              :     MiniChronydOptions options;
              :     options.index = idx;
              :     options.bindaddress = GetBindIpForDaemon(options.index + 1, 
kDefaultBindMode);
              :     options.port = base_port + idx * 100;
              :     unique_ptr<MiniChronyd> chrony(new MiniChronyd(options));
              :     ASSERT_OK(chrony->Start());
              :
              :     servers_endpoints.emplace_back(options.bindaddress, 
options.port);
              :     servers.emplace_back(std::move(chrony));
              :   }
> All of the tests repeat this sort of block, could you encapsulate it in a f
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@305
PS13, Line 305: User
> Do you mean "Use" here (and below on L331)?
Done


http://gerrit.cloudera.org:8080/#/c/7477/13/src/kudu/clock/ntp-test.cc@320
PS13, Line 320:   {
              :     const vector<HostPort> refs = { unsync_servers_refs[1] };
              :
              :     // Verify chronyd's client itself does not accept the set 
of of NTP sources.
              :     NO_FATALS(CheckNoNtpSource(refs));
              :
              :     BuiltInNtp c({ unsync_servers_refs[1] });
              :     NO_FATALS(CheckInitUnsync(&c));
              :     NO_FATALS(CheckWallTimeUnavailable(&c));
              :   }
> Some of this repetition could be elided via lambda.
Done



--
To view, visit http://gerrit.cloudera.org:8080/7477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieb9eee9f0334236f39617492bd6f01304d1a0255
Gerrit-Change-Number: 7477
Gerrit-PatchSet: 16
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Sun, 22 Sep 2019 23:42:50 +0000
Gerrit-HasComments: Yes

Reply via email to