GitHub user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3218
Good catch on that problem.
I would suggest two changes:
1. Let's forbid `remove()` in all cases, also when the state is
non-queryable. It seems inconsistent that the heap-state allows modifications
via the iterable, while the RocksDB state does not.
2. It would be nice to get rid of the locking (or the code branch) in the
`add()` method:
- A simple approach is to override `add()` in the queryable state
version of the list and synchronize there.
- A more advanced idea: We may get rid of the locking on the ArrayList
altogether by implementing our own specialized list: Since the list only ever
grows (a clear call removes the list as a whole from the map in the heap state),
creating a serialized copy simply means taking all elements from the list up to
its size that are not null (to support `null` we can use a null-marker
element). Taking the elements from the list needs to be conservative (take the
lower of the size and the number of non-null entries) to compensate for
visibility issues across the threads.
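
To make the second idea more concrete, here is a rough sketch of what such a list could look like. It is not code from the PR: `AppendOnlyList`, `NULL_MARKER`, and `snapshot()` are hypothetical names, and the sketch assumes a single writer thread plus a `volatile` size field (my choice, not something proposed above) to publish appended elements without a lock. The iterator is read-only, so `remove()` is rejected as in point 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical append-only list: one writer thread, any number of reader threads. */
final class AppendOnlyList<T> implements Iterable<T> {

    /** Stored in place of a user-supplied null, so a null slot always means "not visible yet". */
    private static final Object NULL_MARKER = new Object();

    private Object[] elements = new Object[8];

    /** Assumption: volatile publishes the slot write to readers; no lock is taken. */
    private volatile int size;

    /** Must only be called from the single writer thread. */
    void add(T value) {
        int s = size;
        if (s == elements.length) {
            // Grow by copying; older arrays stay valid for readers that still hold them.
            elements = Arrays.copyOf(elements, s * 2);
        }
        elements[s] = (value == null) ? NULL_MARKER : value;
        size = s + 1; // volatile write, after the slot has been filled
    }

    /** Safe to call from a reader thread; returns a copy of the entries visible so far. */
    List<T> snapshot() {
        int n = size;              // volatile read first
        Object[] array = elements; // may be the same or a newer array
        // Conservative bound: the lower of the size and the array length ...
        int limit = Math.min(n, array.length);
        List<T> copy = new ArrayList<>(limit);
        for (int i = 0; i < limit; i++) {
            Object e = array[i];
            if (e == null) {
                break; // ... and stop at the first slot that is not visible yet
            }
            @SuppressWarnings("unchecked")
            T value = (e == NULL_MARKER) ? null : (T) e;
            copy.add(value);
        }
        return copy;
    }

    /** Read-only iteration over a snapshot: remove() throws UnsupportedOperationException. */
    @Override
    public Iterator<T> iterator() {
        return Collections.unmodifiableList(snapshot()).iterator();
    }
}
```

A queryable-state reader would call `snapshot()` and serialize the returned copy, while the task thread keeps calling `add()` without synchronization. Whether a `volatile` write per `add()` is acceptable on the hot path would need to be measured.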