Hi Xiaowei,
Thanks for sharing this proposal. How would fault tolerance work with the
BatchFunction? Since the batch function seems to manage its own buffer,
users would also have to make sure that in-flight elements which are
buffered but not yet processed are checkpointed, wouldn't they?
Could you not do the same thing today with a FlatMap function that stores
incoming elements and only computes and collects a result when a certain
threshold is reached?
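
To make the question concrete, here is a rough sketch of what I have in mind. It is plain Java and deliberately does not use Flink's interfaces: ThresholdBatcher, snapshotState and restoreState are hypothetical names, the java.util.function.Consumer stands in for a collector, and summing the batch is just a placeholder computation. The point is only that the buffered elements are exactly the state a checkpoint would have to cover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a FlatMap-style function that batches its input.
// This is NOT Flink's API; it only illustrates the buffering idea.
class ThresholdBatcher {
    private final int threshold;
    // In-flight elements: received but not yet processed.
    private final List<Integer> buffer = new ArrayList<>();

    ThresholdBatcher(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Buffers one element. Once the threshold is reached, computes one result
    // for the whole batch (here: the sum), emits it, and clears the buffer.
    void flatMap(int value, Consumer<Integer> out) {
        buffer.add(value);
        if (buffer.size() >= threshold) {
            int sum = 0;
            for (int v : buffer) {
                sum += v;
            }
            out.accept(sum);
            buffer.clear();
        }
    }

    // Returns a copy of the in-flight elements. A checkpoint would have to
    // persist this, otherwise the elements are lost on failure.
    List<Integer> snapshotState() {
        return new ArrayList<>(buffer);
    }

    // Replaces the buffer with a previously taken snapshot.
    void restoreState(List<Integer> snapshot) {
        buffer.clear();
        buffer.addAll(snapshot);
    }
}
```

With a threshold of 3, feeding 1 and 2 emits nothing; if the function fails at that point, those two elements only survive if the snapshot was taken and restored before the third element arrives.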
On 20.10.2016 09:50, Xiaowei Jiang wrote:
Very often, it's more efficient to process a batch of records at once
instead of processing them one by one. We can use a window to achieve this
functionality. However, a window stores all records in state, which can
be costly. It would be desirable to have an efficient implementation of a
batch operator.
Xiaowei Jiang created FLINK-4854:
Summary: Efficient Batch Operator in Streaming
Key: FLINK-4854
URL: https://issues.apache.org/jira/browse/FLINK-4854
Project: Flink
Issue Type: Improvement