kennknowles commented on code in PR #30317:
URL: https://github.com/apache/beam/pull/30317#discussion_r1500851162


##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -1075,6 +1111,42 @@ message StateClearRequest {}
 // A response to clear state.
 message StateClearResponse {}

Review Comment:
   OK my responses are evolving now that I've read the whole code change and 
re-read the doc.
   
    - I see specialized requests are broken out anyhow, so that's fine.
    - Including everything needed for caching in the state key is good for raw 
request caching, so re-using Get is good, though there are perhaps smarter ways 
to cache that won't benefit from this.
    - "append" is still a fine method for adding things to ordered list state, 
but it isn't important. The name is misleading, as it already is for bags and 
multimaps (which are not ordered), so we might as well keep the incorrect 
naming here too.
    - Obviously `clear` is fine
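
   To make the raw request caching point concrete, here is a minimal Python 
sketch. It assumes the sort key range travels inside the state key; the names 
`OrderedListKey`, `RawGetCache`, and `fake_fetch` are hypothetical and are not 
taken from the PR or from Beam.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-in for a state key that carries the sort key range.
# Field names are illustrative only, not the actual proto fields.
@dataclass(frozen=True)
class OrderedListKey:
    transform_id: str
    user_state_id: str
    key: bytes
    start: int  # inclusive
    end: int    # exclusive


class RawGetCache:
    """Caches raw Get responses keyed by (state key, continuation token)."""

    def __init__(self, fetch: Callable[[OrderedListKey, bytes], bytes]):
        self._fetch = fetch
        self._cache: Dict[Tuple[OrderedListKey, bytes], bytes] = {}

    def get(self, state_key: OrderedListKey, token: bytes = b"") -> bytes:
        cache_key = (state_key, token)
        if cache_key not in self._cache:
            # Cache miss: forward the request and remember the response.
            self._cache[cache_key] = self._fetch(state_key, token)
        return self._cache[cache_key]


# Record every request that actually reaches the (fake) runner.
calls: List[Tuple[OrderedListKey, bytes]] = []


def fake_fetch(state_key: OrderedListKey, token: bytes) -> bytes:
    calls.append((state_key, token))
    return b"payload"


cache = RawGetCache(fake_fetch)
k = OrderedListKey("t1", "s1", b"user", 0, 10)
cache.get(k)
cache.get(k)  # same key and token: served from the cache
cache.get(OrderedListKey("t1", "s1", b"user", 0, 5))  # new range: new fetch
```

   Because the range is part of the key, identical requests hit the cache 
with no extra bookkeeping. The sketch also treats [0, 10) and [0, 5) as 
unrelated entries, which may be one example of where a smarter cache would 
not benefit from this keying.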



##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -1075,6 +1111,42 @@ message StateClearRequest {}
 // A response to clear state.
 message StateClearResponse {}
 
+// A message describes a sort key range [start, end).
+message OrderedListRange {
+  int64 start = 1;
+  int64 end = 2;
+}
+
+// A data entry which is tagged with a sort key.
+message OrderedListEntry {
+  int64 sort_key = 1;
+  bytes data = 2;
+}
+
+// This request will fetch an ordered list with a sort key range. If the
+// timestamp range is not specified, the runner should use [MIN_INT64,
+// MAX_INT64) by default.
+message OrderedListStateGetRequest {
+  bytes continuation_token = 1;
+  OrderedListRange range = 2;
+}
+
+// A response to the get state request for an ordered list.
+message OrderedListStateGetResponse {
+  bytes continuation_token = 1;
+  bytes data = 2;
+}
+
+// A request to update an ordered list
+message OrderedListStateUpdateRequest {
  // when the request is processed, deletes should always happen before inserts.

Review Comment:
   It would be helpful to outline in the design doc the pros and cons of 
little decisions like this one, and to note which option was chosen and why.
   
   For example, one benefit of splitting the requests is that it avoids 
ordering issues. With a combined request, we would have to spec that either 
the inserts or the deletes happen first, even though they arrive together in 
one request, which is a bit confusing. And if you want them in the other 
order, you still have to make two requests, each with one empty field.
   
   And note whether there is an efficiency consideration.
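
   The ordering concern can be sketched with a small in-memory model in 
Python. This is only an illustration under stated assumptions: 
`OrderedListState` and its `update` method are hypothetical stand-ins for a 
runner handling a combined update request with deletes-before-inserts 
semantics, not code from the PR.

```python
from typing import List, Tuple

Range = Tuple[int, int]    # sort key range [start, end)
Entry = Tuple[int, bytes]  # (sort_key, data)


class OrderedListState:
    """Toy model of ordered list state behind a combined update request."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def update(self, deletes: List[Range], inserts: List[Entry]) -> None:
        # Assumed spec: deletes are always applied before inserts.
        for start, end in deletes:
            self._entries = [
                e for e in self._entries if not (start <= e[0] < end)
            ]
        self._entries.extend(inserts)
        # Stable sort keeps insertion order among equal sort keys.
        self._entries.sort(key=lambda e: e[0])

    def read(self) -> List[Entry]:
        return list(self._entries)


# Delete-then-insert fits in one combined request.
s = OrderedListState()
s.update(deletes=[], inserts=[(1, b"a"), (5, b"b")])
s.update(deletes=[(0, 10)], inserts=[(3, b"c")])
delete_then_insert = s.read()

# Insert-then-delete needs two requests, each with one empty field.
t = OrderedListState()
t.update(deletes=[], inserts=[(1, b"a"), (5, b"b")])
t.update(deletes=[], inserts=[(3, b"c")])
t.update(deletes=[(0, 10)], inserts=[])
insert_then_delete = t.read()
```

   In this model the combined form saves a request only when the caller 
wants the order the spec happens to fix. The other order costs two requests 
either way, which is the trade-off worth writing down in the design doc.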



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
