Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14912 )

Change subject: [metrics] Add a new metric 'last_consult_timestamp' for tablet
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14912/1/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

http://gerrit.cloudera.org:8080/#/c/14912/1/src/kudu/tablet/tablet_metrics.cc@136
PS1, Line 136: METRIC_DEFINE_gauge_uint64(tablet, last_consult_timestamp, "Last 
Consult Timestamp",
> Does it make sense to separate this into two metrics? One for writes and on
Agreed with Grant.

We should also say something about the units. Is it milliseconds since the UNIX 
epoch?


http://gerrit.cloudera.org:8080/#/c/14912/1/src/kudu/tablet/tablet_metrics.cc@326
PS1, Line 326: // #define MAXINIT(x) x(METRIC_##x.Instantiate(entity, 0, kMax))
So this speaks to a deeper issue: every tserver has its own clock, and those 
clocks may not be well-synchronized, either individually with some NTP servers, 
or with one another.

What are your expectations w.r.t. this metric? It seems like you don't want it 
to "go backwards" if the system clock changes value. Why is that? How do you 
expect each tserver's value to correlate to other tserver values?

An alternative that might be cleaner is, rather than storing a timestamp in the 
metric, to store, as part of the tablet, a MonoTime corresponding to the last 
write or last scan. Then, you could expose a FunctionGauge for "last consult 
time" that'd be calculated as Now - time_of_last_write (or time_of_last_scan). 
I guess the MonoTime part is orthogonal; you could use time() instead. But 
there's value in using a MonoTime because it's guaranteed not to drop, so you 
don't have to answer these trickier questions.


http://gerrit.cloudera.org:8080/#/c/14912/1/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/14912/1/src/kudu/tserver/tablet_service.cc@2678
PS1, Line 2678:     // First, the number of rows/cells/bytes actually returned 
to the user.
Seems like we could merge this with the code in L2702-2706. It'd be nice to 
avoid the duplicate checks and cognitive overhead of two separate metric 
updates.



--
To view, visit http://gerrit.cloudera.org:8080/14912
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I90738ba90eaa8f78c8d721dde2eed2b723d5572d
Gerrit-Change-Number: 14912
Gerrit-PatchSet: 1
Gerrit-Owner: Yingchun Lai <405403...@qq.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 17 Dec 2019 00:05:02 +0000
Gerrit-HasComments: Yes

Reply via email to