kennknowles commented on code in PR #30317:
URL: https://github.com/apache/beam/pull/30317#discussion_r1500851162
##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -1075,6 +1111,42 @@ message StateClearRequest {}
// A response to clear state.
message StateClearResponse {}
Review Comment:
OK my responses are evolving now that I've read the whole code change and
re-read the doc.
- I see specialized requests are broken out anyhow, so that's fine.
- Including everything needed for caching in the state key is good for raw
request caching, so re-using Get is good, though there are perhaps smarter
ways to cache that won't benefit from this.
- "append" is still a fine method for adding things to ordered list state.
It isn't important, and the name is misleading, but it is equally misleading
for bags and multimaps, since they are not ordered either, so we might as
well keep the incorrect naming here too.
- Obviously `clear` is fine
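
To illustrate the raw request caching point above, here is a minimal sketch in Python. It assumes the state key is an opaque byte string that already encodes everything identifying the request (including the sort key range), so the key plus the continuation token is enough to look up a cached response. `RawRequestCache`, `fake_runner_fetch`, and the key strings are hypothetical names for illustration only, not part of the Beam Fn API.

```python
class RawRequestCache:
    """Caches raw get responses, keyed by (state key, continuation token)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that asks the runner for data
        self._responses = {}     # (state_key, continuation_token) -> response
        self.misses = 0          # number of requests that reached the runner

    def get(self, state_key: bytes, continuation_token: bytes = b""):
        cache_key = (state_key, continuation_token)
        if cache_key not in self._responses:
            self.misses += 1
            self._responses[cache_key] = self._fetch(state_key, continuation_token)
        return self._responses[cache_key]


def fake_runner_fetch(state_key, continuation_token):
    # Stand-in for a real round trip to the runner (assumption).
    return b"data-for:" + state_key


cache = RawRequestCache(fake_runner_fetch)

# Hypothetical serialized state keys; the range is part of the key itself.
key_a = b"ordered_list|user_key|range=0:10"
key_b = b"ordered_list|user_key|range=10:20"

first = cache.get(key_a)
second = cache.get(key_a)   # served from the cache, no second fetch
other = cache.get(key_b)    # different range, so a different cache entry
```

Because the range lives in the key, two reads of different ranges never collide in the cache, and no request-specific fields need to be inspected.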
##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -1075,6 +1111,42 @@ message StateClearRequest {}
// A response to clear state.
message StateClearResponse {}
+// A message describes a sort key range [start, end).
+message OrderedListRange {
+ int64 start = 1;
+ int64 end = 2;
+}
+
+// A data entry which is tagged with a sort key.
+message OrderedListEntry {
+ int64 sort_key = 1;
+ bytes data = 2;
+}
+
+// This request will fetch an ordered list with a sort key range. If the
+// timestamp range is not specified, the runner should use [MIN_INT64,
+// MAX_INT64) by default.
+message OrderedListStateGetRequest {
+ bytes continuation_token = 1;
+ OrderedListRange range = 2;
+}
+
+// A response to the get state request for an ordered list.
+message OrderedListStateGetResponse {
+ bytes continuation_token = 1;
+ bytes data = 2;
+}
+
+// A request to update an ordered list
+message OrderedListStateUpdateRequest {
+ // when the request is processed, deletes should always happen before inserts.
Review Comment:
It would be helpful to outline the pros and cons of little decisions like
this in the design doc, and to note which option was chosen and why.
For example, one benefit of splitting the requests is that it avoids ordering
issues. With a combined request, we would have to spec that either the inserts
or the deletes happen first, even though they arrive in one request together,
which is a bit confusing. And if you want them in the other order, you still
have to make two requests, each with one field left empty.
Also note whether there is an efficiency consideration.
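
To make the ordering concern concrete, here is a small Python sketch. It models ordered list state as a list of `(sort_key, data)` tuples and a delete as a sort key range `[start, end)`. The helper names are hypothetical and this is not how any runner implements the update; it only shows that the two orderings give different results when an inserted entry falls inside the deleted range.

```python
def delete_range(entries, start, end):
    """Return the entries whose sort key falls outside [start, end)."""
    return [(k, v) for (k, v) in entries if not (start <= k < end)]


def insert_entries(entries, new_entries):
    """Return a new list with new_entries merged in, ordered by sort key."""
    return sorted(entries + new_entries, key=lambda e: e[0])


def apply_deletes_then_inserts(entries, delete, inserts):
    # The ordering the proto comment specifies for a combined request.
    return insert_entries(delete_range(entries, *delete), inserts)


def apply_inserts_then_deletes(entries, delete, inserts):
    # The opposite ordering, which a combined request cannot express.
    return delete_range(insert_entries(entries, inserts), *delete)


state = [(1, b"a"), (5, b"b"), (9, b"c")]
delete = (4, 8)            # delete sort keys in [4, 8)
inserts = [(6, b"new")]    # the inserted entry lies inside the deleted range

deletes_first = apply_deletes_then_inserts(state, delete, inserts)
inserts_first = apply_inserts_then_deletes(state, delete, inserts)

print(deletes_first)  # [(1, b'a'), (6, b'new'), (9, b'c')]
print(inserts_first)  # [(1, b'a'), (9, b'c')]
```

With separate insert and delete requests, the caller picks the order simply by the order in which the requests are sent, so nothing extra has to be specified.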