BIOINSu commented on PR #24128: URL: https://github.com/apache/flink/pull/24128#issuecomment-1905451090
@libenchao Thanks for the review comments. In fact, the "Deal with changelog messages" is necessary. In the previous version of the code, since consuming the changelog stream will add a `ChangelogNormalize` node after the Source node, the shuffle method between the two nodes is Hash Shuffle. Therefore, i hope to rely on such an association relationship to handle changelog data. However, since setting source parallelism and processing changelog data should be two independent features, in the new version of the code, I added `PartitionTransformation` after the `SourceTransformationWrapper` to ensure that the source and downstream nodes are Hash Shuffle after adjusting the parallelism. Since `PartitionTransformation` does not essentially generate actual physical nodes and selects the first `PartitionTransformation` corresponding partitioner as the real partitioner when there are multiple consecutive `PartitionTransformation`, this implementation will not affect the existing logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
