Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12592 )
Change subject: WIP: Expose RPC interface for diff scan ...................................................................... Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/12592/3/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: http://gerrit.cloudera.org:8080/#/c/12592/3/src/kudu/tserver/tablet_service.cc@2451 PS3, Line 2451: opts.snap_to_exclude = MvccSnapshot(Timestamp(scan_pb.snap_start_timestamp())); > Should we validate this isn't older than the AHM here? I added this validation and a test case for it. http://gerrit.cloudera.org:8080/#/c/12592/3/src/kudu/tserver/tablet_service.cc@2452 PS3, Line 2452: opts.include_deleted_rows = true; > Should this only be set if an is_deleted column is in the projection? I can't think of any use case for getting updates but not deletes, so since this is a private API it seems reasonable to force people to do what they need to get the deleted rows back too, which they can throw away if really not needed. http://gerrit.cloudera.org:8080/#/c/12592/1/src/kudu/tserver/tserver.proto File src/kudu/tserver/tserver.proto: http://gerrit.cloudera.org:8080/#/c/12592/1/src/kudu/tserver/tserver.proto@256 PS1, Line 256: // The read mode for this scan request. > Would it make sense to make this more explicit and add to the ReadMode enum We talked about this offline but I forget what we said. The two main drivers I see behind specifying a READ_DIFF mode are the following: 1. Ergonomics: A clear and usable RPC API to specify how to request a diff scan is valuable and using a different read mode would help with that in terms of self-documenting user client code as well as sanity checking and error messages. However we are able to provide a reasonable amount of validation / checking with READ_AT_SNAPSHOT so I'm not sure the current situation is very dire on the usability side of things. 2. Backwards compatibility: We don't want requests to old servers to silently return with a regular snapshot scan when a newer client specifies a diff scan. I was going to say that, as implemented, these requests to old servers will not be silently mishandled in that way because we require users to specify an IS_DELETED column in the projection for a diff scan and old servers would not recognize that type, but actually I think Kudu 1.9.0 would recognize that. So we may want to consider doing something like the READ_DIFF read mode to account for that. Alternatively, maybe we can use a feature flag although I need to look into whether we currently have support for argument-specific conditional flags such as "this is a diff scan, servers needs diff scan support to handle this request". http://gerrit.cloudera.org:8080/#/c/12592/1/src/kudu/tserver/tserver.proto@258 PS1, Line 258: optional ReadMode read_mode = 5 [default = READ_LATEST]; > Note that when rebasing on master this will collide with the new authz_toke It looks like you included it in your patch on top of this one so let's leave it there http://gerrit.cloudera.org:8080/#/c/12592/1/src/kudu/tserver/tserver.proto@261 PS1, Line 261: // specified. > We should update that this is the end timestamp in the case of a diff scan. Done -- To view, visit http://gerrit.cloudera.org:8080/12592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9602b549dc499f92b2069b297e66f366e51fb312 Gerrit-Change-Number: 12592 Gerrit-PatchSet: 3 Gerrit-Owner: Mike Percy <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Comment-Date: Tue, 19 Mar 2019 22:11:49 +0000 Gerrit-HasComments: Yes
