Hello,

Regarding the new BATCH mode of the DataStream API, I see that the
documentation states that some operators will process all data for a given key
before moving on to the next one. However, I don't see how Flink is supposed to
know whether the input will provide all data for a given key sequentially. In
the DataSet API, an (undocumented?) feature is using SplitDataProperties
(https://ci.apache.org/projects/flink/flink-docs-release-1.12/api/java/org/apache/flink/api/java/io/SplitDataProperties.html)
to declare grouping/partitioning/sorting properties of the input splits, so
that if the data is pre-sorted (e.g. when reading from a database), some
operations can be optimized. Will the DataStream API get something similar?

Regards,
Alexis.
