shunping commented on code in PR #30317:
URL: https://github.com/apache/beam/pull/30317#discussion_r1492601187


##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -1075,6 +1111,42 @@ message StateClearRequest {}
 // A response to clear state.
 message StateClearResponse {}
 
+// A message describes a sort key range [start, end).
+message OrderedListRange {
+  int64 start = 1;
+  int64 end = 2;
+}
+
+// A data entry which is tagged with a sort key.
+message OrderedListEntry {
+  int64 sort_key = 1;
+  bytes data = 2;
+}
+
+// This request will fetch an ordered list with a sort key range. If the
+// timestamp range is not specified, the runner should use [MIN_INT64,
+// MAX_INT64) by default.
+message OrderedListStateGetRequest {
+  bytes continuation_token = 1;
+  OrderedListRange range = 2;
+}
+
+// A response to the get state request for an ordered list.
+message OrderedListStateGetResponse {
+  bytes continuation_token = 1;
+  bytes data = 2;

Review Comment:
   > how do we return multiple elements if the request was a range?
   
   We concatenate the encoded entries in a byte array and send them back in 
chunks with corresponding continuation token. 
   
   > should this be repeated OrderedListEntry instead of data?
   
   I have considered this option. Besides the efficiency reason @robertwb  
mentioned, I also find that representing the response as bytes has an advantage 
of reusing the existing code in 
https://github.com/apache/beam/blob/3693174c0421d0ff049042ca283db633431892ef/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/StateFetchingIterators.java.
 This iterator is used to parsed the returned data block (in not only 
OrderedListState but also Multimap, Bag, etc) and it supports blocks even if 
the boundary is not aligned with entries/elements. I think this is not 
achievable with OrderedListEntry 
   
   > But the coding should be specified (e.g. is this a concatenation of 
several encoded KV<sort-key, value>s
   
   That's right. It is basically the concatenation of encoded 
TimestampedValue<T>, since TimestampedValue<T> is already in use in the sdk 
interface of OrderedListState 
https://github.com/apache/beam/blob/3693174c0421d0ff049042ca283db633431892ef/sdks/java/core/src/main/java/org/apache/beam/sdk/state/OrderedListState.java#L29
   
   



##########
model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto:
##########
@@ -1075,6 +1111,42 @@ message StateClearRequest {}
 // A response to clear state.
 message StateClearResponse {}
 
+// A message describes a sort key range [start, end).
+message OrderedListRange {
+  int64 start = 1;
+  int64 end = 2;
+}
+
+// A data entry which is tagged with a sort key.
+message OrderedListEntry {
+  int64 sort_key = 1;
+  bytes data = 2;
+}
+
+// This request will fetch an ordered list with a sort key range. If the
+// timestamp range is not specified, the runner should use [MIN_INT64,
+// MAX_INT64) by default.
+message OrderedListStateGetRequest {
+  bytes continuation_token = 1;
+  OrderedListRange range = 2;
+}
+
+// A response to the get state request for an ordered list.
+message OrderedListStateGetResponse {
+  bytes continuation_token = 1;
+  bytes data = 2;

Review Comment:
   > how do we return multiple elements if the request was a range?
   
   We concatenate the encoded entries in a byte array and send them back in 
chunks with corresponding continuation token. 
   
   > should this be repeated OrderedListEntry instead of data?
   
   I have considered this option. Besides the efficiency reason @robertwb  
mentioned, I also find that representing the response as bytes has an advantage 
of reusing the existing code in 
https://github.com/apache/beam/blob/3693174c0421d0ff049042ca283db633431892ef/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/StateFetchingIterators.java.
 This iterator is used to parsed the returned data block (in not only 
OrderedListState but also Multimap, Bag, etc) and it supports blocks even if 
the boundary is not aligned with entries/elements. I think this is not 
achievable with OrderedListEntry. 
   
   > But the coding should be specified (e.g. is this a concatenation of 
several encoded KV<sort-key, value>s
   
   That's right. It is basically the concatenation of encoded 
TimestampedValue<T>, since TimestampedValue<T> is already in use in the sdk 
interface of OrderedListState 
https://github.com/apache/beam/blob/3693174c0421d0ff049042ca283db633431892ef/sdks/java/core/src/main/java/org/apache/beam/sdk/state/OrderedListState.java#L29
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to