zhijiangW commented on a change in pull request #7713: [FLINK-10995][network]
Copy intermediate serialization results only once for broadcast mode
URL: https://github.com/apache/flink/pull/7713#discussion_r263288320
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/writer/BroadcastRecordWriter.java
##########
@@ -44,4 +52,67 @@ public BroadcastRecordWriter(
public void emit(T record) throws IOException, InterruptedException {
broadcastEmit(record);
}
+
+ @Override
+ public void broadcastEmit(T record) throws IOException,
InterruptedException {
Review comment:
Thanks for thinking of one more possible solution for this issue. I actually
like the idea of sharing random `BufferBuilder` in all partitions by setting
the offset position of `BufferConsumer`. In the before I also thought of
another similar way of always sharing `BufferBuilder` no matter with broadcast
switch, then every `BufferConsumer` maintains the different skip list to filter
the data not belonging to itself. But it seems so complicated and might bring
extra overheads. Your suggestion make it easy to only skip fixed length at the
beginning of `BufferConsumer`.
I am not so worried about the latency higher as you mentioned because we
also have flushing mechanism to work around. But I am just confused of another
issue, that means we strongly couple the `randomEmit` with `LatencyMarker`
together. The`randomEmit` should be a common method provided for operators, but
the implementation seems not extendable, even though it might never have other
usages for `randomEmit` in future.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services