GitHub user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/1988#issuecomment-221289448
I started looking into it, but man this is one big change...
I have some first remarks about API and internals:
What's the reason for the introduction of `PartitionedState`? The Javadoc
for `State` already says that it is the base class for partitioned state and
that it is only usable on a `KeyedStream`.
The signatures of `KeyGroupedStateBackend` and `PartitionedStateBackend` are
exactly the same. `AbstractStateBackend` has both methods,
`createPartitionedStateBackend` and `createKeyGroupStateBackend`. Users of an
`AbstractStateBackend` should only ever call the latter, while the former is
reserved for internal use by the default implementation of
`KeyGroupedStateBackend`, which is `GenericKeyGroupStateBackend`. Also,
`AbstractStreamOperator` has the new method `getKeyGroupStateBackend` that
should be used by operators such as the `WindowOperator` to deal with
partitioned state. Now, where am I going with this? What I think is that the
`AbstractStateBackend` should only have a method
`createPartitionedStateBackend` that is externally visible. This would be used
by the `AbstractStreamOperator` to create a state backend, and users of the
interface, e.g. `WindowOperator`, would also deal just with
`PartitionedStateBackend`, which they get from
`AbstractStreamOperator.getPartitionedStateBackend`. The fact that there are
these key groups should not be visible to
users of a state backend. Internally, state backends would use the
`GenericKeyGroupStateBackend`; they could provide an interface to it for
creating non-key-grouped backends.
Above, "exactly the same" is not 100% correct, since the snapshot/restore
methods differ slightly, but I think this could be worked around. Also, I found
it quite hard to express what I actually mean, but I hope you get my point.
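
To make the proposed layering a bit more concrete, here is a rough,
self-contained Java sketch. It is not the code from this PR: only the names
`PartitionedStateBackend`, `GenericKeyGroupStateBackend`, `AbstractStateBackend`
and `createPartitionedStateBackend` come from the discussion above, and their
signatures here are heavily simplified assumptions. `HeapStateBackend`,
`HeapPartitionedStateBackend`, `createNonKeyGroupedBackend`, `getValue` and
`putValue` are hypothetical names made up for illustration, and
snapshot/restore is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StateBackendSketch {

    /** The only abstraction an operator such as WindowOperator would see. */
    interface PartitionedStateBackend<K> {
        void setCurrentKey(K key);
        Integer getValue(String stateName);          // hypothetical accessor
        void putValue(String stateName, int value);  // hypothetical accessor
    }

    /** Hypothetical non-key-grouped backend: plain per-key state on the heap. */
    static final class HeapPartitionedStateBackend<K> implements PartitionedStateBackend<K> {
        private final Map<K, Map<String, Integer>> state = new HashMap<>();
        private K currentKey;

        @Override public void setCurrentKey(K key) { currentKey = key; }

        @Override public Integer getValue(String stateName) {
            Map<String, Integer> perKey = state.get(currentKey);
            return perKey == null ? null : perKey.get(stateName);
        }

        @Override public void putValue(String stateName, int value) {
            state.computeIfAbsent(currentKey, k -> new HashMap<>()).put(stateName, value);
        }
    }

    /** Internal detail: routes each key to the backend of its key group. */
    static final class GenericKeyGroupStateBackend<K> implements PartitionedStateBackend<K> {
        private final List<PartitionedStateBackend<K>> keyGroups;
        private PartitionedStateBackend<K> current;

        GenericKeyGroupStateBackend(List<PartitionedStateBackend<K>> keyGroups) {
            this.keyGroups = keyGroups;
        }

        @Override public void setCurrentKey(K key) {
            // Assumed assignment rule: key hash modulo the number of key groups.
            int group = Math.floorMod(key.hashCode(), keyGroups.size());
            current = keyGroups.get(group);
            current.setCurrentKey(key);
        }

        @Override public Integer getValue(String stateName) { return current.getValue(stateName); }

        @Override public void putValue(String stateName, int value) { current.putValue(stateName, value); }
    }

    /** Only createPartitionedStateBackend is externally visible. */
    abstract static class AbstractStateBackend {
        public final <K> PartitionedStateBackend<K> createPartitionedStateBackend(int numKeyGroups) {
            List<PartitionedStateBackend<K>> groups = new ArrayList<>();
            for (int i = 0; i < numKeyGroups; i++) {
                groups.add(this.<K>createNonKeyGroupedBackend());
            }
            return new GenericKeyGroupStateBackend<>(groups);
        }

        /** Internal hook a concrete backend implements (hypothetical name). */
        protected abstract <K> PartitionedStateBackend<K> createNonKeyGroupedBackend();
    }

    static final class HeapStateBackend extends AbstractStateBackend {
        @Override protected <K> PartitionedStateBackend<K> createNonKeyGroupedBackend() {
            return new HeapPartitionedStateBackend<>();
        }
    }

    public static void main(String[] args) {
        // The caller never sees key groups, only a PartitionedStateBackend.
        PartitionedStateBackend<String> backend =
                new HeapStateBackend().createPartitionedStateBackend(4);
        backend.setCurrentKey("a");
        backend.putValue("count", 1);
        backend.setCurrentKey("b");
        backend.putValue("count", 2);
        backend.setCurrentKey("a");
        System.out.println(backend.getValue("count")); // prints 1
    }
}
```

The point of the sketch is only that the key-group wrapper can implement the
same interface as the backends it wraps, so callers would not need a separate
`KeyGroupedStateBackend` type.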