gaoyunhaii commented on pull request #6: URL: https://github.com/apache/flink-ml/pull/6#issuecomment-945594305
Many thanks @lindong28 and @becketqin for the review!

> Thanks for the PR @gaoyunhaii. It is not clear why FLIP-176 needs this BroadcastOutput. Can you explain it? And is there any existing end-to-end test that shows how to support an iteration job using this BroadcastOutput?

As @becketqin has pointed out, this utility is required mainly because we need to broadcast the epoch events regardless of which partitioner the shuffle uses. Its usage is shown in the unit tests and in the following PR.

> If so is it the case, is this broadcast output only needed in case of the shuffle? Is my understanding correct?

Flink has two kinds of outputs for an operator:

1. `ChainedOutput`: the output to another operator in the same operator chain.
2. `RecordWriterOutput`: the output to another operator in another task.

If an operator has multiple outputs, they are wrapped in a `BroadcastOutput`.

Since only `RecordWriterOutput` goes through a partitioner, logically that is the only kind we need to handle. However, `ChainedOutput` still requires handling the `OutputTag`: if the output is defined with an output tag, the events must also be emitted with that tag, otherwise they would be dropped. Thus we need to handle both types of outputs.
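To make the two cases above concrete, here is a minimal plain-Java sketch. It is only a model: `EventOutput`, `NetworkOutputModel`, and `ChainedOutputModel` are hypothetical names made up for illustration, not Flink classes and not the code in this PR. It assumes a cross-task output normally routes each record to one channel chosen by a partitioner, and that a chained output may carry an optional tag.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the idea; none of these types are Flink classes. */
final class EpochBroadcastSketch {

    /** Hypothetical stand-in for one operator output that can receive an epoch event. */
    interface EventOutput {
        void broadcastEvent(String event, List<String> log);
    }

    /** Models a cross-task output with several channels. */
    static final class NetworkOutputModel implements EventOutput {
        private final int numChannels;

        NetworkOutputModel(int numChannels) {
            this.numChannels = numChannels;
        }

        /** Normal record path: a (modelled) partitioner picks exactly one channel. */
        void emitRecord(int key, List<String> log) {
            int channel = Math.floorMod(key, numChannels);
            log.add("record " + key + " -> channel " + channel);
        }

        /** Event path: bypass the partitioner and write to every channel. */
        @Override
        public void broadcastEvent(String event, List<String> log) {
            for (int channel = 0; channel < numChannels; channel++) {
                log.add(event + " -> channel " + channel);
            }
        }
    }

    /** Models an output to the next operator in the same chain. */
    static final class ChainedOutputModel implements EventOutput {
        private final String outputTag; // null models the main (untagged) output

        ChainedOutputModel(String outputTag) {
            this.outputTag = outputTag;
        }

        /** No partitioner here, but the event must carry the same tag as the output. */
        @Override
        public void broadcastEvent(String event, List<String> log) {
            String target = (outputTag == null) ? "main output" : "tag " + outputTag;
            log.add(event + " -> chained (" + target + ")");
        }
    }

    /** Emits one record and then one epoch event to every output of an operator. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        NetworkOutputModel network = new NetworkOutputModel(2);
        List<EventOutput> outputs = new ArrayList<>();
        outputs.add(network);
        outputs.add(new ChainedOutputModel(null));
        outputs.add(new ChainedOutputModel("side"));

        network.emitRecord(5, log);
        for (EventOutput output : outputs) {
            output.broadcastEvent("epoch-1", log);
        }
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this model the record reaches a single channel, while the epoch event reaches both channels of the cross-task output and both chained outputs, each with its own tag.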
