Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1831#issuecomment-200933436
  
    To elaborate on this: state currently works well if you stick to the 
(admittedly somewhat hidden) rules. That is, you should only access state when 
a key is available.
    
    If no key is available, the behavior changes in unexpected ways depending 
on which state backend is used and on the capabilities of the key serializer. 
For example, consider accessing `ValueState` in `open()`. For mem/fs state: 
`ValueState.value()` works and returns the default value, while 
`ValueState.update()` throws an NPE. For RocksDB state: neither method works 
if the key serializer cannot handle null values. If it can, both methods 
operate on the state for the `null` key.
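
    As a rough illustration of the mem/fs case, here is a toy model in plain 
Java. It is not Flink code: `ToyHeapValueState` and `setCurrentKey` are 
hypothetical names, and the sketch only assumes that the backend tracks a 
"current key" that is unset while `open()` runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of a heap-backed keyed value state. NOT Flink's implementation. */
class ToyHeapValueState<K, V> {
    private final Map<K, V> table = new HashMap<>();
    private final V defaultValue;
    // Assumption: null while no element is being processed, e.g. during open().
    private K currentKey;

    ToyHeapValueState(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    /** Called by the (hypothetical) runtime before each element is processed. */
    void setCurrentKey(K key) {
        this.currentKey = key;
    }

    /** Reading without a key silently falls back to the default value. */
    V value() {
        if (currentKey == null) {
            return defaultValue;
        }
        V v = table.get(currentKey);
        return v != null ? v : defaultValue;
    }

    /** Writing without a key fails with a NullPointerException. */
    void update(V value) {
        Objects.requireNonNull(currentKey, "no current key");
        table.put(currentKey, value);
    }

    public static void main(String[] args) {
        ToyHeapValueState<String, Long> count = new ToyHeapValueState<>(0L);
        System.out.println("value() without a key: " + count.value());
        try {
            count.update(1L);
        } catch (NullPointerException e) {
            System.out.println("update() without a key failed: " + e.getMessage());
        }
        count.setCurrentKey("a");
        count.update(1L);
        System.out.println("value() for key a: " + count.value());
    }
}
```

    The asymmetry is the point: a read appears to succeed while a write 
fails, so a bug in `open()` can go unnoticed until the first update.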
    
    For these reasons I would like to change the semantics of state so that 
the user always has to call `getState` (or a similar method) and the 
returned accessor object is documented to be valid only for the duration of the 
processing method. Right now, the user can wreak all kinds of havoc by 
down-casting the returned State object. The current system is very simple: 
it works if the user keeps to the rules, and it is fast. If we 
want to make it more restrictive, we will of course lose some performance.
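
    To make the trade-off concrete, here is a minimal sketch of what a 
stricter accessor could look like. Everything in it is hypothetical 
(`ScopedStateSketch`, `Handle`, and this `processElement` signature are not 
Flink APIs), and it goes one step further than merely documenting the 
lifetime by enforcing it at runtime.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch of a state accessor scoped to one processing call. */
class ScopedStateSketch {

    /** Accessor bound to a single key; usable only while its call is running. */
    static final class Handle {
        private final Map<String, Long> table;
        private final String key;
        private boolean valid = true;

        Handle(Map<String, Long> table, String key) {
            this.table = table;
            this.key = key;
        }

        long value() {
            check();
            return table.getOrDefault(key, 0L);
        }

        void update(long v) {
            check();
            table.put(key, v);
        }

        // The extra check on every access is part of the performance cost.
        private void check() {
            if (!valid) {
                throw new IllegalStateException("state handle used outside its processing call");
            }
        }
    }

    private final Map<String, Long> table = new HashMap<>();

    /** Runs the user logic with a fresh handle for this key, then invalidates it. */
    void processElement(String key, Consumer<Handle> userLogic) {
        Handle handle = new Handle(table, key);
        try {
            userLogic.accept(handle);
        } finally {
            handle.valid = false;
        }
    }
}
```

    In this sketch a handle is always bound to a key, so the keyless access 
from `open()` cannot occur, and a handle that leaks out of the processing 
call fails loudly. The price is one allocation per element and one validity 
check per access.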

