zhijiang created FLINK-9913:
-------------------------------
Summary: Improve output serialization only once in RecordWriter
Key: FLINK-9913
URL: https://issues.apache.org/jira/browse/FLINK-9913
Project: Flink
Issue Type: Improvement
Components: Network
Affects Versions: 1.6.0
Reporter: zhijiang
Assignee: zhijiang
Fix For: 1.6.0
Currently the {{RecordWriter}} emits each output record to multiple channels
chosen by the {{ChannelSelector}}, or broadcasts it to all channels directly.
Each channel has its own {{RecordSerializer}}, which means a record is
serialized as many times as there are selected channels.
Data serialization is an expensive operation, so we can get a good benefit
from serializing each record only once.
I would suggest the following changes to achieve this:
# Only one {{RecordSerializer}} is created in {{RecordWriter}} and shared by
all the channels.
# Each output record is serialized into an intermediate data buffer only once,
regardless of how many channels are selected.
# The intermediate serialization result is copied into the {{BufferBuilder}}
of each selected channel.
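
The three steps above could be sketched roughly as follows. This is only an illustration of the serialize-once-then-copy idea, not Flink's actual code: {{SerializeOnceSketch}}, {{ChannelBuffer}}, {{serialize}} and {{emit}} are hypothetical names, the length-prefixed string encoding is an assumption made for the example, and {{ChannelBuffer}} merely stands in for a per-channel {{BufferBuilder}}.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch; none of these names are part of Flink's API.
final class SerializeOnceSketch {

    /** Stand-in for a per-channel BufferBuilder (assumption for illustration). */
    static final class ChannelBuffer {
        private final ByteBuffer target;

        ChannelBuffer(int capacity) {
            this.target = ByteBuffer.allocate(capacity);
        }

        /** Copies all remaining bytes of source; throws BufferOverflowException if they do not fit. */
        void append(ByteBuffer source) {
            target.put(source);
        }

        int writtenBytes() {
            return target.position();
        }
    }

    /** Counts serializations so the "only once" property can be checked. */
    static int serializeCalls = 0;

    /** Step 2: serialize the record once into an intermediate buffer (length-prefixed UTF-8). */
    static ByteBuffer serialize(String record) {
        serializeCalls++;
        byte[] payload = record.getBytes(StandardCharsets.UTF_8);
        ByteBuffer intermediate = ByteBuffer.allocate(Integer.BYTES + payload.length);
        intermediate.putInt(payload.length);
        intermediate.put(payload);
        intermediate.flip(); // switch from writing to reading
        return intermediate;
    }

    /** Steps 1 and 3: one shared serialization path, then a byte copy per selected channel. */
    static void emit(String record, int[] selectedChannels, List<ChannelBuffer> channels) {
        ByteBuffer serialized = serialize(record);
        for (int channel : selectedChannels) {
            // duplicate() shares the bytes but has its own position/limit,
            // so every channel reads the full serialized record.
            channels.get(channel).append(serialized.duplicate());
        }
    }
}
```

With this shape, the per-channel cost is reduced to a memory copy, while the serialization cost is paid once per record whether it goes to one channel or is broadcast to all of them.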