Amit is correct. keyBy() ensures that all records with the same key are processed by the same parallel instance of a function. This is different from "a parallel instance only sees records of one key": a single instance usually processes records for many keys.
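To make the distinction concrete, here is a minimal plain-Java sketch (no Flink dependency). The class `KeyBySketch` and the methods `instanceFor` and `assign` are hypothetical names made up for this illustration, and the `hashCode() % parallelism` rule is only a simplified stand-in: as far as I know, Flink itself assigns keys via key groups and a murmur hash, so the concrete key-to-instance mapping in a real job will differ. The property it demonstrates is the same, though.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not Flink code.
final class KeyBySketch {

    // Simplified stand-in for the key-to-instance assignment (assumption:
    // plain hash modulo parallelism instead of Flink's key groups).
    static int instanceFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Collects, per parallel instance, the distinct keys it receives.
    static Map<Integer, List<String>> assign(List<String> keys, int parallelism) {
        Map<Integer, List<String>> keysPerInstance = new TreeMap<>();
        for (String key : keys) {
            List<String> seen = keysPerInstance.computeIfAbsent(
                    instanceFor(key, parallelism), i -> new ArrayList<>());
            if (!seen.contains(key)) {
                seen.add(key);
            }
        }
        return keysPerInstance;
    }

    public static void main(String[] args) {
        // Five distinct keys, two parallel instances: every key always maps
        // to the same instance, but each instance sees several keys.
        List<String> keys = List.of("a", "b", "c", "d", "e", "a", "c");
        assign(keys, 2).forEach((instance, seen) ->
                System.out.println("instance " + instance + " sees keys " + seen));
        // prints:
        // instance 0 sees keys [b, d]
        // instance 1 sees keys [a, c, e]
    }
}
```

With more distinct keys than parallel instances, some instance necessarily receives more than one key, which is why the "one key per instance" reading cannot hold in general.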
I had a look at the docs [1]. I agree that "Logically partitions a stream into disjoint partitions, each partition containing elements of the same key." can easily be interpreted the way you did.
I've pushed a commit to clarify the description. The docs should be updated soon.

Best, Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/#datastream-transformations

2018-04-05 6:21 GMT+02:00 Amit Jain <aj201...@gmail.com>:

> Hi,
>
> The keyBy operation partitions the data on the given key and makes sure the
> same slot will get all future data belonging to the same key. In the default
> implementation, it can also map a subset of the keys in your DataStream to
> the same slot.
>
> Assuming the number of keys equals the number of running slots, you may
> specify a custom keyBy operation to achieve the same.
>
> Could you describe your use case?
>
> --
> Thanks
> Amit
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/