Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8804 )

Change subject: KUDU-1704: add READ_YOUR_WRITES scan mode
......................................................................


Patch Set 12:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto
File src/kudu/common/common.proto:

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@236
PS12, Line 236: subject to the propagated timestamp bound
This wording is pretty hard to a casual reader (or even me) to understand.

I think it would be more clear to say something like: Each tablet server will 
choose a timestamp to use for a server-local snapshot scan subject to the 
following criteria: (1) It will be higher than the propagated timestamp, (2) It 
will attempt to minimize latency caused by waiting for outstanding write 
transactions to complete.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@237
PS12, Line 237: newest timestamp within
latest timestamp higher than


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@242
PS12, Line 242: the timestamp
a timestamp


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@244
PS12, Line 244: should use it
Is this something that end-users have to worry about, or can we word this so 
that it sounds like Kudu will take care of it? If the latter, how about

"The Kudu client library will use it as the propagated timestamp for subsequent 
reads" or something along those lines? However I'm not sure why we are making 
the distinction here between a normally propagated timestamp and this different 
thing.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/common/common.proto@245
PS12, Line 245: wait
nit: waiting


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

PS12:
We are missing a C++ client test in this patch. Is that excluded on purpose?


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1924
PS12, Line 1924: TestScanYourWrites
> It worries me a bit that most tests of this functionality overwhelmingly te
I agree that a deterministic test would be nice and I think it should be 
possible using two clients.

It would also be nice to have a non-deterministic test where you have two 
clients in RYW mode reading and scanning. Loop them concurrently in separate 
threads at least a certain number of rounds or amount of time, as well as until 
both of the following conditions hold, while also triggering leader elections 
to make it easier to hit anomalies:

1. neither has ever violated RYW locally
2. we have observed anomalies where if we had been using a strict serializable 
approach then certain writes would have appeared, but they don't because of the 
hidden channel (the other client).


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1924
PS12, Line 1924: TEST_F(TabletServerTest, TestScanYourWrites) {
nit: add a comment at the top of this test describing what the test does


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1927
PS12, Line 1927: e
nit: missing period


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1958
PS12, Line 1958: TEST_F(TabletServerTest, 
TestScanYourWrites_WithoutPropagatedTimestamp) {
nit: add test comment


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1960
PS12, Line 1960: perform a write
nit: use capitalization and punctuation for code comments per 
https://google.github.io/styleguide/cppguide.html#Punctuation,_Spelling_and_Grammar


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_server-test.cc@1972
PS12, Line 1972: perform a write
punctuation


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.h
File src/kudu/tserver/tablet_service.h:

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.h@155
PS12, Line 155:   // it exceeds the maximum allowed clock synchronization error 
time.
Can we add a note about the context, i.e. why this matters?

nit: Also perhaps rename to ValidateSnapshotTimestamp() ? Usually Stamp in 
Timestamp is not capitalized in the Kudu code, and this seems specific to 
snapshot timestamps.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.h@158
PS12, Line 158: if
s/if/that/


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2113
PS12, Line 2113:
add: DCHECK(s.ok()) ?


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2114
PS12, Line 2114:   // Note: if 'max_allowed_ts' is not obtained from 
clock_->GetGlobalLatest() it's guaranteed
               :   // to be higher than 'tmp_snap_timestamp'.
I'm having trouble following this:

1. What does "not obtained" mean? Is it like GetGlobalLatest() returned 
Status::Corrruption or something like that and so then we assume max_allowed_ts 
is still default-constructed?
2. Why is it guaranteed to by higher than 'tmp_snap_timestamp'? Can we explain 
what guarantees that and why?


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2133
PS12, Line 2133: that
than


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2135
PS12, Line 2135: much
far


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2164
PS12, Line 2164: RETURN_NOT_OK(ValidateTimeStamp(tmp_snap_timestamp));
Hmm, we already updated the clock on line 2137 so AFAICT the only way this 
could fail is if the propagated timestamp +1 fails this test or the clean 
timestamp fails this test. Is that the purpose of this? Should we be trying to 
check this before we update the clock? I'm actually not sure what the right 
answer here is but intuitively it seems we should try to perform as much 
validation as possible before making changes to the system.


http://gerrit.cloudera.org:8080/#/c/8804/12/src/kudu/tserver/tablet_service.cc@2170
PS12, Line 2170: VerifyNotAncientHistory
same here; is there some way to check this before updating the clock, or we 
don't really care one way or the other?



--
To view, visit http://gerrit.cloudera.org:8080/8804
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84ddb981a1a0f199d4e66f5d5097318f8c785a48
Gerrit-Change-Number: 8804
Gerrit-PatchSet: 12
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 13 Feb 2018 23:55:39 +0000
Gerrit-HasComments: Yes

Reply via email to