[kudu-CR] KUDU-1189 if not set, use timestamp from first server
Alexey Serbin has posted comments on this change. Change subject: KUDU-1189 if not set, use timestamp from first server .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/5143/3/src/kudu/integration-tests/consistency-itest.cc File src/kudu/integration-tests/consistency-itest.cc: PS3, Line 360: ServerAssignedScanTimestamp > That I haven't verified yet. Will do and report on my findings. Good idea I checked that the test 100% fails it not adding the part for scanner-internal.cc: /home/aserbin/Projects/kudu/src/kudu/integration-tests/consistency-itest.cc:458: Failure Value of: row_count Actual: 2 Expected: 1UL Which is: 1 [ FAILED ] ConsistencyITest.TestSnapshotScanTimestampReuse (513 ms) The second scenario also fails 100%: /home/aserbin/Projects/kudu/src/kudu/integration-tests/consistency-itest.cc:522: Failure Value of: cfg.has_snapshot_timestamp() Actual: false Expected: true -- To view, visit http://gerrit.cloudera.org:8080/5143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2ac708e38b8a80834f7d54eca294517cbfb06ec6 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1189 if not set, use timestamp from first server
Alexey Serbin has posted comments on this change. Change subject: KUDU-1189 if not set, use timestamp from first server .. Patch Set 3: (33 comments) http://gerrit.cloudera.org:8080/#/c/5143/3//COMMIT_MSG Commit Message: PS3, Line 7: if not set, use timestamp from first server > Maybe rephrase to: Snapshot scans: reuse snapshot timestamp when not set Done PS3, Line 9: KUDU-1189 On reads at a snapshot that touch multiple tablets, without : the user setting a timestamp, use the timestamp from the first server : for following scans > weird wrapping, I guess you were going for an "extended" title. maybe not n As I understand, that's the title for KUDU-1189 as is. I'll remove the extra spaces for the next couple of lines to update the wrapping. PS3, Line 15: > nit: extra space Done PS3, Line 15: read > s/read/scan Done PS3, Line 19: more consistency > what is "more consistency"? :) 'more consistency' is close to nonsense and too fancy, I agree :) The Java client part hasn't been updated. I'm thinking to do that in a separate changelist. http://gerrit.cloudera.org:8080/#/c/5143/3/src/kudu/integration-tests/consistency-itest.cc File src/kudu/integration-tests/consistency-itest.cc: PS3, Line 75: TimestampPropagationTest > I missed this in the previous patch, but can you rename this test to Consis Done PS3, Line 164: GetTabletIdForKeyValue > not from this patch, but maybe rephrase to GetTabletIdsForKeyIntervals? That sound like a good idea. I'll rename it here and remove the 'table_name' parameter, while the generalization for key intervals will be in a separate changelist. PS3, Line 273: TwoBatchesAndReadAtSnapshot > missed this on the previous patch. after the main test class rename can you Done PS3, Line 352: KUDU-1189 > to close this we also need to do it on the java client. is that already don That's not done yet; yes, in the TODO list. Thank you for the reminder. PS3, Line 354: In the context of the same scan : // operation, that timestamp is then used as the snapshot timestamp for the rest : // of the operations to read data from appropriate tablet servers. > nit: maybe rephrase this to: If the scan spans multiple tablets, the timest Done PS3, Line 358: The idea of the test is simple: have a scan spanned across two tablets : // where the clocks of the corresponding tablet servers are skewed. > I think we could do a better job explaining what exactly this test is doing Done PS3, Line 360: ServerAssignedScanTimestamp > does this test fail 100% without the rest of the patch? That I haven't verified yet. Will do and report on my findings. Good idea, BTW. PS3, Line 360: ServerAssignedScanTimestamp > rename to: TestSnapshotScanTimestampReuse Done PS3, Line 360: TEST_F(TimestampPropagationTest, ServerAssignedScanTimestamp) { > FWYW I think the approach we did for the previous patch where we're pushing Ah, I see. That's a great observation. Let me split this patch into two patches to have the same meaningful sequence. PS3, Line 361: maximum delta > you mean maximum error? also not really maximum since your dividing by 2 ri This is outdated comment, will need to update this one. The idea add the second scenario came later, but I didn't update this comment. PS3, Line 363: delta > s/delta/offset ? or something Done PS3, Line 384: dynamic_cast(peer->clock()) > nit: is cases such as this you could use: HybridClock* clock = DCHECK_NOTNU Done PS3, Line 398: were > s/were/was Done Line 401: // Now, perform the scan at READ_AT_SNAPSHOT where timestamp is not specified: > where _a_ timestamp Done PS3, Line 409: to > remove "to" Done PS3, Line 417: size_t > still not totally sure why you're using 'size_t' here. I thought size_t was a good type for row count since the count cannot be negative, and there might be huge numbers there (each batch can be small, but in total there might be > 2^32 rows total). If you think int is better anyway, I'll replace it with int. Let me do that in a separate changelist, though. PS3, Line 422: , > At this point (choose one: we/the test has) inserted... Done PS3, Line 440: ASSERT_LE(0, t_diff.ToMicroseconds()); > what is this trying to "sanity check"? also the comparison is slightly weir After some consideration, I decided to remove this check. Basically, I think it's not worth checking that the basic clock works as expected -- there should be a specific unit test for that. Here we operate on the assumption that the clock works as it should. PS3, Line 452: Swap the servers in the time scale: > remove Done PS3, Line 453: deltas > replace deltas with something else Done PS3, Line 464: parition > typo Done PS3, Line 465: dynamic_cast (peer->clock()); > use DCHECK_NOTNULL Done PS3, Line 467: two_deltas > why "two_deltas"? Done PS3, Line 469:
[kudu-CR] KUDU-1189 if not set, use timestamp from first server
David Ribeiro Alves has posted comments on this change. Change subject: KUDU-1189 if not set, use timestamp from first server .. Patch Set 3: (33 comments) http://gerrit.cloudera.org:8080/#/c/5143/3//COMMIT_MSG Commit Message: PS3, Line 7: if not set, use timestamp from first server Maybe rephrase to: Snapshot scans: reuse snapshot timestamp when not set (i know it's long but at least it's informative) PS3, Line 9: KUDU-1189 On reads at a snapshot that touch multiple tablets, without : the user setting a timestamp, use the timestamp from the first server : for following scans weird wrapping, I guess you were going for an "extended" title. maybe not necessary anymore? PS3, Line 15: read s/read/scan PS3, Line 15: nit: extra space also maybe: _Then_ reuse it when continuing the scan on other tablet servers. PS3, Line 19: more consistency what is "more consistency"? :) I would say that fixes a bug where we wouldn't be scanning at a snapshot at all. Is this done on the java client? http://gerrit.cloudera.org:8080/#/c/5143/3/src/kudu/integration-tests/consistency-itest.cc File src/kudu/integration-tests/consistency-itest.cc: PS3, Line 75: TimestampPropagationTest I missed this in the previous patch, but can you rename this test to ConsistencyITest? 1 - All itests have the ITest suffix 2 - This is a good platform for other stuff that is not explicitely related to timestamp propagation PS3, Line 164: GetTabletIdForKeyValue not from this patch, but maybe rephrase to GetTabletIdsForKeyIntervals? Also maybe add a GetTabletIdForKey that reuses this, takes a single key and returns a single tablet (asserting on the latter) Finally if table_name_ is a field maybe not pass it? PS3, Line 273: TwoBatchesAndReadAtSnapshot missed this on the previous patch. after the main test class rename can you rename this to: TestTimestampPropagationFromScans PS3, Line 352: KUDU-1189 to close this we also need to do it on the java client. is that already done , or on your TODO list? PS3, Line 354: In the context of the same scan : // operation, that timestamp is then used as the snapshot timestamp for the rest : // of the operations to read data from appropriate tablet servers. nit: maybe rephrase this to: If the scan spans multiple tablets, the timestamp picked when scanning the first tablet is then used when scanning following tablets. PS3, Line 358: The idea of the test is simple: have a scan spanned across two tablets : // where the clocks of the corresponding tablet servers are skewed. I think we could do a better job explaining what exactly this test is doing here. took me a few minutes to understand. PS3, Line 360: ServerAssignedScanTimestamp does this test fail 100% without the rest of the patch? PS3, Line 360: ServerAssignedScanTimestamp rename to: TestSnapshotScanTimestampReuse PS3, Line 360: TEST_F(TimestampPropagationTest, ServerAssignedScanTimestamp) { FWYW I think the approach we did for the previous patch where we're pushing the test before the fix is better here. It allows anyone to download the test, run it without changing code and then download the fix and make sure the test passes. PS3, Line 361: maximum delta you mean maximum error? also not really maximum since your dividing by 2 right? PS3, Line 363: delta s/delta/offset ? or something PS3, Line 384: dynamic_cast(peer->clock()) nit: is cases such as this you could use: HybridClock* clock = DCHECK_NOTNULL(dynamic_cast (peer->clock())); PS3, Line 398: were s/were/was Line 401: // Now, perform the scan at READ_AT_SNAPSHOT where timestamp is not specified: where _a_ timestamp PS3, Line 409: to remove "to" PS3, Line 417: size_t still not totally sure why you're using 'size_t' here. PS3, Line 422: , At this point (choose one: we/the test has) inserted... PS3, Line 440: ASSERT_LE(0, t_diff.ToMicroseconds()); what is this trying to "sanity check"? also the comparison is slightly weird, maybe change it to: ASSERT_GE(t_diff.ToMicroseconds(), 0); PS3, Line 452: Swap the servers in the time scale: remove PS3, Line 453: deltas replace deltas with something else PS3, Line 464: parition typo PS3, Line 465: dynamic_cast (peer->clock()); use DCHECK_NOTNULL PS3, Line 467: two_deltas why "two_deltas"? PS3, Line 469: UpdateClock Change/Add AdvanceClockInTablet(string tablet_id, MonoDelta delta). Or maybe even: AdvanceClockInTabletHostingKey(int key, MonoDelta delta) This would get rid of a lot of the boiler plate in this test. PS3, Line 479: snapshot scan snapshot scan's timestamp PS3, Line 498: ince the snapshot timestamp : // is taken from the second server's clock why is it taken from the second server? isn't it that you now have advanced the first server's clock past all writes? PS3, Line 507:
[kudu-CR] KUDU-1189 if not set, use timestamp from first server
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5143 to look at the new patch set (#3). Change subject: KUDU-1189 if not set, use timestamp from first server .. KUDU-1189 if not set, use timestamp from first server KUDU-1189 On reads at a snapshot that touch multiple tablets, without the user setting a timestamp, use the timestamp from the first server for following scans For a READ_AT_SNAPSHOT scan operation with no snapshot timestamp specified, store the snapshot timestamp returned from the first tablet server into the scan configuration object. Use it for the rest of read operations performed at other tablet servers in the context of the same scan operation. This brings more consistency for read operations in READ_AT_SNAPSHOT mode where the snapshot timestamp is not specified explicitly. Change-Id: I2ac708e38b8a80834f7d54eca294517cbfb06ec6 --- M src/kudu/client/client.h M src/kudu/client/scanner-internal.cc M src/kudu/integration-tests/consistency-itest.cc 3 files changed, 178 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/5143/3 -- To view, visit http://gerrit.cloudera.org:8080/5143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2ac708e38b8a80834f7d54eca294517cbfb06ec6 Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1189 if not set, use timestamp from first server
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5143 to look at the new patch set (#2). Change subject: KUDU-1189 if not set, use timestamp from first server .. KUDU-1189 if not set, use timestamp from first server KUDU-1189 On reads at a snapshot that touch multiple tablets, without the user setting a timestamp, use the timestamp from the first server for following scans For a READ_AT_SNAPSHOT scan operation with no snapshot timestamp specified, store the snapshot timestamp returned from the first tablet server into the scan configuration object. Use it for the rest of read operations performed at other tablet servers in the context of the same scan operation. This brings more consistency for read operations in READ_AT_SNAPSHOT mode where the snapshot timestamp is not specified explicitly. Change-Id: I2ac708e38b8a80834f7d54eca294517cbfb06ec6 --- M src/kudu/client/client.h M src/kudu/client/scanner-internal.cc M src/kudu/integration-tests/consistency-itest.cc 3 files changed, 178 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/5143/2 -- To view, visit http://gerrit.cloudera.org:8080/5143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2ac708e38b8a80834f7d54eca294517cbfb06ec6 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey SerbinGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: David Ribeiro Alves Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon