Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/flink/pull/4963#discussion_r159618800
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/state/heap/HeapListState.java
---
@@ -120,4 +121,16 @@ public void add(V value) {
         a.addAll(b);
         return a;
     }
+
+    @Override
+    public void update(List<V> values) throws Exception {
+        clear();
+
+        if (values != null && !values.isEmpty()) {
+            final N namespace = currentNamespace;
+            final StateTable<K, N, ArrayList<V>> map = stateTable;
+
+            map.put(namespace, new ArrayList<>(values));
--- End diff --
Correct. I wanted to enforce `ArrayList` for the internal state wherever
possible, because users never interacted with the list directly. We don't
necessarily have to keep that; we could relax it to `List` to avoid extra copies.
That should be a ground rule in all of Flink's runtime code: no extra work just
to work around the current code.
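A minimal sketch of what relaxing the type could look like. This is not Flink's `HeapListState`: the class `ListStateSketch`, its `HashMap`-backed table (standing in for `StateTable`), and the method names `updateWithCopy` / `updateWithoutCopy` are all hypothetical and exist only to contrast the two variants.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a heap-backed list state; NOT Flink's HeapListState.
class ListStateSketch<N, V> {

    // Assumption: a plain HashMap replaces Flink's StateTable, and the value
    // type is relaxed from ArrayList<V> to List<V>.
    private final Map<N, List<V>> table = new HashMap<>();
    private N currentNamespace;

    void setCurrentNamespace(N namespace) {
        this.currentNamespace = namespace;
    }

    void clear() {
        table.remove(currentNamespace);
    }

    // Variant from the diff: always copies the input into a fresh ArrayList.
    void updateWithCopy(List<V> values) {
        clear();
        if (values != null && !values.isEmpty()) {
            table.put(currentNamespace, new ArrayList<>(values));
        }
    }

    // Relaxed variant: stores the caller's list as-is, so no copy is made.
    void updateWithoutCopy(List<V> values) {
        clear();
        if (values != null && !values.isEmpty()) {
            table.put(currentNamespace, values);
        }
    }

    // Returns the stored list for the current namespace, or null if absent.
    List<V> get() {
        return table.get(currentNamespace);
    }
}
```

One trade-off to keep in mind: without the copy, the state holds the caller's list instance, so any later mutation of that list by the caller is visible in the state. Whether that is acceptable depends on who is allowed to call `update`.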
Background: I initially typed this strongly to `ArrayList` to make it clear that
we want a compact and efficient list implementation. I have seen too many times
that `LinkedList` (which is much slower in most cases) is used as the default
list. I blame this on university education, which teaches that the theoretical
complexity of `LinkedList` is lower but fails to take processor architecture
into account.
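To make the processor-architecture point concrete, here is a rough sketch that traverses an `ArrayList` and a `LinkedList` holding the same elements. The class `ListTraversalSketch` and its helpers are hypothetical, and the timing is a naive `System.nanoTime()` measurement rather than a proper benchmark (no warm-up, no JMH), so the printed numbers are only indicative.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration; not taken from Flink.
class ListTraversalSketch {

    // Fills the given list with the integers 0..n-1 and returns it.
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Sums all elements by iterating; the logical work is the same for both
    // list types, only the memory layout that is traversed differs.
    static long sum(List<Integer> list) {
        long total = 0;
        for (int v : list) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> arrayList = fill(new ArrayList<>(), n);
        List<Integer> linkedList = fill(new LinkedList<>(), n);

        long t0 = System.nanoTime();
        long arraySum = sum(arrayList);
        long t1 = System.nanoTime();
        long linkedSum = sum(linkedList);
        long t2 = System.nanoTime();

        // Both sums are identical; only the elapsed times may differ.
        System.out.println("ArrayList  sum=" + arraySum + " ns=" + (t1 - t0));
        System.out.println("LinkedList sum=" + linkedSum + " ns=" + (t2 - t1));
    }
}
```

`ArrayList` keeps its element references in one contiguous array, while `LinkedList` allocates a separate node per element and follows a pointer for each step, which tends to be less cache-friendly. Both traversals are O(n), so the difference does not show up in the asymptotic analysis.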
---