zhijiangW commented on a change in pull request #7713: [FLINK-10995][network]
Copy intermediate serialization results only once for broadcast mode
URL: https://github.com/apache/flink/pull/7713#discussion_r263749333
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/writer/BroadcastRecordWriter.java
##########
@@ -44,4 +52,67 @@ public BroadcastRecordWriter(
public void emit(T record) throws IOException, InterruptedException {
broadcastEmit(record);
}
+
+ @Override
+ public void broadcastEmit(T record) throws IOException,
InterruptedException {
Review comment:
Yes, you are right. We do not need hardcode the size of latency marker. :)
I have submitted the codes in a separate commit. I am not sure whether this
change is suitable based on our comments, so I only refactored the main
processes and ignored the tests with annotations. After your review and
confirm, I would refactor the commits and tests.
I think it is better to explain some thoughts below:
In order to reuse the main process of
`RecordWriter#copyFromSerializerToTargetChannel`, the `randomTriggered` field
is still maintained in `BroadcastRecordWriter`, but I think the logic is easy
to understand than the first version.
After abstracting the `RecordWriter` to separate two implementations, the
changes seem more complex, because there are many abstract methods involved in
`BufferBuilder` and `ChannelSelector` which are the main differences between
two implementations.
I am not sure you could accept the way of magic 0 in
`BroadcastRecordWriter#broadcastEmit` for reusing the common codes. If we do
not pass the target channel in broadcast operation, we might need to realize
all the related processes in `BroadcastRecordWriter`.
I think my first version seems more easy to follow if we only change the
process of `BroadcastRecordWriter#randomEmit` as now. Maybe you have other good
ways. :)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services