Stephan Ewen created FLINK-19162:
------------------------------------

             Summary: Allow Split Reader based sources to reuse record batches
                 Key: FLINK-19162
                 URL: https://issues.apache.org/jira/browse/FLINK-19162
             Project: Flink
          Issue Type: Sub-task
          Components: Connectors / Common
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 1.12.0

The Split Readers hand over a batch of records at a time from the I/O thread (fetching and decoding) to the main operator processing thread. These structures can be memory intensive and expensive to create, so performance benefits greatly from reusing them. This is especially true for high-performance format readers like ORC and Parquet.

While previous sources (where I/O ran in the main thread) could reuse objects trivially, the new Split Reader API (with multiple threads) needs an explicit {{recycle()}} hook to allow returning/reusing these objects.
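
To illustrate the idea, here is a minimal, self-contained sketch of how a recycle hook could let an I/O thread and a processing thread share a fixed set of batches. It is not Flink's actual API: the names {{RecordBatch}}, {{BatchPool}}, and {{processAll}} are hypothetical, and the batch is assumed to be a plain {{long[]}} standing in for a heavier structure such as a columnar vector batch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BatchRecyclingSketch {

    /** Hypothetical reusable batch; stands in for an expensive decoded structure. */
    static final class RecordBatch {
        final long[] values;
        int size;
        private final BlockingQueue<RecordBatch> owner;

        RecordBatch(int capacity, BlockingQueue<RecordBatch> owner) {
            this.values = new long[capacity];
            this.owner = owner;
        }

        /** Called by the consumer once it is done; returns the batch to the pool. */
        void recycle() {
            size = 0;
            owner.add(this);
        }
    }

    /** Hypothetical fixed-size pool; all batches are allocated exactly once. */
    static final class BatchPool {
        private final BlockingQueue<RecordBatch> free;

        BatchPool(int numBatches, int capacity) {
            free = new ArrayBlockingQueue<>(numBatches);
            for (int i = 0; i < numBatches; i++) {
                free.add(new RecordBatch(capacity, free));
            }
        }

        /** Blocks until a recycled batch is available. */
        RecordBatch take() throws InterruptedException {
            return free.take();
        }
    }

    /** Fetches numBatches batches on an I/O thread and sums them on the caller's thread. */
    static long processAll(int numBatches, int batchSize, int poolSize)
            throws InterruptedException {
        BatchPool pool = new BatchPool(poolSize, batchSize);
        BlockingQueue<RecordBatch> handover = new ArrayBlockingQueue<>(poolSize);

        Thread fetcher = new Thread(() -> {
            try {
                for (int b = 0; b < numBatches; b++) {
                    RecordBatch batch = pool.take(); // waits for a recycled batch
                    for (int i = 0; i < batchSize; i++) {
                        batch.values[i] = (long) b * batchSize + i; // fake "decoding"
                    }
                    batch.size = batchSize;
                    handover.put(batch);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "io-fetcher");
        fetcher.start();

        long sum = 0;
        for (int b = 0; b < numBatches; b++) {
            RecordBatch batch = handover.take();
            for (int i = 0; i < batch.size; i++) {
                sum += batch.values[i];
            }
            batch.recycle(); // explicit hook: the batch is safe to overwrite again
        }
        fetcher.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum = " + processAll(10, 4, 2));
    }
}
```

In this sketch the consumer must not touch a batch after calling {{recycle()}}, since the I/O thread may overwrite it immediately. The fixed pool size also acts as back-pressure: the fetcher blocks in {{take()}} until the processing thread has returned a batch.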