Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1988#issuecomment-221289448
  
    I started looking into it, but man this is one big change... 😃 
    
    I have some first remarks about API and internals:
    
    What's the reason for the introduction of `PartitionedState`? The Javadoc 
for `State` already says that it is the base class for partitioned state and 
that it is only usable on a `KeyedStream`.
    
    The signatures of `KeyGroupedStateBackend` and `PartitionedStateBackend` 
are exactly the same. `AbstractStateBackend` has both 
`createPartitionedStateBackend` and `createKeyGroupStateBackend`. Users of an 
`AbstractStateBackend` should only ever call the latter, while the former is 
reserved for internal use by the default implementation of 
`KeyGroupedStateBackend`, which is `GenericKeyGroupStateBackend`. Also, 
`AbstractStreamOperator` has the new method `getKeyGroupStateBackend` that 
should be used by operators such as the `WindowOperator` to deal with 
partitioned state. Now, where am I going with this? What I think is that the 
`AbstractStateBackend` should only have a method 
`createPartitionedStateBackend` that is externally visible. This would be used 
by the `AbstractStreamOperator` to create a state backend, and users of the 
interface, such as `WindowOperator`, would also deal only with 
`PartitionedStateBackend`, which they get from 
`AbstractStreamOperator.getPartitionedStateBackend`. The fact that there are 
these key groups should not be visible to users of a state backend. 
Internally, state backends would use the `GenericKeyGroupStateBackend`, and 
they could provide an interface to it for creating non-key-grouped backends.
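
    To make the proposed layering a bit more concrete, here is a rough sketch. 
It is only an illustration under stated assumptions: every signature is 
simplified and hypothetical (the `numKeyGroups` parameter, `put`/`get`, 
`createPlainBackend`, `PlainPartitionedStateBackend` and 
`HeapStateBackendSketch` do not come from the PR), and snapshot/restore is 
left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified stand-ins for the types discussed in this comment.
// None of these signatures are taken from the pull request.

/** The only state interface that operators such as WindowOperator would see. */
interface PartitionedStateBackend<K> {
    void setCurrentKey(K key);
    void put(String stateName, Object value);
    Object get(String stateName);
}

/** A plain backend with no notion of key groups (hypothetical name). */
class PlainPartitionedStateBackend<K> implements PartitionedStateBackend<K> {
    private final Map<K, Map<String, Object>> stateByKey = new HashMap<>();
    private K currentKey;

    @Override public void setCurrentKey(K key) { currentKey = key; }

    @Override public void put(String stateName, Object value) {
        stateByKey.computeIfAbsent(currentKey, k -> new HashMap<>()).put(stateName, value);
    }

    @Override public Object get(String stateName) {
        Map<String, Object> perKey = stateByKey.get(currentKey);
        return perKey == null ? null : perKey.get(stateName);
    }
}

/** Internal: routes each key to the plain backend of its key group. */
class GenericKeyGroupStateBackend<K> implements PartitionedStateBackend<K> {
    private final List<PartitionedStateBackend<K>> groups = new ArrayList<>();
    private PartitionedStateBackend<K> current;

    GenericKeyGroupStateBackend(int numKeyGroups, Supplier<PartitionedStateBackend<K>> factory) {
        for (int i = 0; i < numKeyGroups; i++) {
            groups.add(factory.get());
        }
    }

    @Override public void setCurrentKey(K key) {
        // Assumed assignment scheme: hash of the key modulo the number of groups.
        int keyGroup = Math.floorMod(key.hashCode(), groups.size());
        current = groups.get(keyGroup);
        current.setCurrentKey(key);
    }

    @Override public void put(String stateName, Object value) { current.put(stateName, value); }

    @Override public Object get(String stateName) { return current.get(stateName); }
}

abstract class AbstractStateBackend {
    /** The single externally visible factory method; key groups stay hidden. */
    public <K> PartitionedStateBackend<K> createPartitionedStateBackend(int numKeyGroups) {
        return new GenericKeyGroupStateBackend<K>(numKeyGroups, () -> this.<K>createPlainBackend());
    }

    /** Internal hook that concrete backends implement (hypothetical name). */
    protected abstract <K> PartitionedStateBackend<K> createPlainBackend();
}

/** A concrete backend only has to supply the non-key-grouped variant. */
class HeapStateBackendSketch extends AbstractStateBackend {
    @Override protected <K> PartitionedStateBackend<K> createPlainBackend() {
        return new PlainPartitionedStateBackend<>();
    }
}
```

    With something like this, an operator would call 
`createPartitionedStateBackend` once, then `setCurrentKey` per element, and 
never learn that `GenericKeyGroupStateBackend` sits in between.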
    
    Above, "exactly the same" is not 100% correct, since the snapshot/restore 
methods differ slightly, but I think this could be worked around. Also, I found 
it quite hard to express what I actually mean, but I hope you get my point. 😅 


