Re: [PR] Add OrderedListState support for SparkRunner [beam]

via GitHub Wed, 25 Dec 2024 06:16:16 -0800


twosom commented on code in PR #33212:
URL: https://github.com/apache/beam/pull/33212#discussion_r1897383414



##########
runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkStateInternals.java:
##########
@@ -622,4 +627,82 @@ public Boolean read() {
       };
     }
   }
+
+  private final class SparkOrderedListState<T> extends 
AbstractState<List<TimestampedValue<T>>>
+      implements OrderedListState<T> {
+
+    private SparkOrderedListState(StateNamespace namespace, String id, 
Coder<T> coder) {
+      super(namespace, id, 
ListCoder.of(TimestampedValue.TimestampedValueCoder.of(coder)));
+    }
+
+    private SortedMap<Instant, TimestampedValue<T>> readAsMap() {
+      final List<TimestampedValue<T>> listValues =

Review Comment:
   @kennknowles 
   
   Hello. I've been reviewing the Spark documentation over the weekend.
   
   Unfortunately, as far as I can tell, Spark doesn't have the native state API 
that we're looking for. Moreover, since the current SparkStateInternals is a 
class implemented independently of the Spark API, it's also difficult to apply 
RDD-based APIs.
   
   The performance requirements you mentioned seem to be well satisfied by 
`WindmillOrderedList`. Given that the current implementation cannot guarantee 
the same performance, I completely understand if the PR doesn't get merged.
   
   However, regardless of whether this PR gets merged or not, I will continue 
researching to meet the requirements you've outlined. I plan to explore better 
implementation approaches, taking inspiration from concepts like pendingAdds 
and pendingDeletes as used in `WindmillOrderedList`.
   
   Belated Merry Christmas.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Add OrderedListState support for SparkRunner [beam]

Reply via email to